-
Notifications
You must be signed in to change notification settings - Fork 181
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve http output handling #966
Improve http output handling #966
Conversation
1bfc70c
to
261eb9d
Compare
aee575b
to
261eb9d
Compare
* Add MaxConcurrentRequests configuration per output in order to limit the number of requests/connections. * Refactor HTTP auth headers handling, eliminate mutex on that code path. * Extract common configuration for http, refactor NewClient to avoid adding one more parameter. * Refactor default configuration definition in order to avoid typos with repetitive Output name prefix and avoid repetitive use of defaults Signed-off-by: Aleksandr Maus <aleksandr.maus@elastic.co>
261eb9d
to
0f9a5ea
Compare
didn't realize that the merge commits are not allowed in this repo, redid it with rebase |
Signed-off-by: Aleksandr Maus <aleksandr.maus@elastic.co>
Signed-off-by: Aleksandr Maus <aleksandr.maus@elastic.co>
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: aleksmaus, Issif The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
LGTM label has been added. Git tree hash: 058531bc27dec5657f2c1f1dcfc330c397394126
|
What type of PR is this?
/kind bug
/kind cleanup
Any specific area of the project related to this PR?
/area config
/area outputs
What this PR does / why we need it:
Multiple improvements of http outputs handling.
Addressing the following areas:
For example the Elasticsearch output is limited to one request at a time when basic auth is used, but will create unlimited number of requests at a time when the custom headers are used to pass basic auth or api key headers: https://github.com/falcosecurity/falcosidekick/blob/master/outputs/elasticsearch.go#L60. Which leads to unpredictable runtime characteristics in terms of connections use etc.
falcosidekick/outputs/client.go
Line 11 in a36af89
config.go
defaults settings is prone to mistakes (looking at previous PRs) where the same prefix has to be retyped every time and some defaults are set twice, for examplefalcosidekick/config.go
Line 400 in a36af89
and
falcosidekick/config.go
Line 409 in a36af89
It also required multiple lines of default definitions when needed to add the new setting per output.
Here is the list of changes addressing the points listed above:
Add
MaxConcurrentRequests
configuration per output in order to limitthe number of requests/connections. Decided to go with 1 as default, because it's already one at time in multiple cases due to the use of the mutex for the auth headers. This is item number 2 listed in the "short term" sections Improve Outputs throughput handling #963.
I'm using semaphore in order to limit the number of requests. I left one TODO item there, that eventually we want to make http requests and semaphore cancellable to make the http client implementation more robust.
Refactor HTTP auth headers handling, addressing this FIXME
falcosidekick/outputs/client.go
Line 11 in a36af89
This eliminates the unexpected limit on the number of requests in some cases and no limit in another cases with the same output.
Extract common configuration for http, refactor NewClient to avoid
adding one more parameter for
MaxConcurrentRequests
.Now the outputs cat update requests properties including headers before the request is executed, also allows to use already existing SetBasicAuth method implemented with Go http.Headers.
Refactor default configuration definition in order to avoid typos with
repetitive Output name prefix and avoid repetitive use of defaults. This item is not strictly necessary, but I think makes it easier to manage and add new outputs defaults and avoid errors. Some of the configurations like AWS, GCP a non-output configuration are still left inlined for now, but could be extracted later if needed.
Example of the throughput improvement for Elasticsearch Output:
As I mentioned above due to the current use of mutex lock on basic auth header, the requests are executed one after another, one request at a time if the user uses username and password for auth: because of this
https://github.com/falcosecurity/falcosidekick/blob/master/outputs/elasticsearch.go#L60
So current through numbers with Elasticsearch cloud instance are about 4 docs/sec
Now with addition of
MaxConcurrentRequests
configuration, the user can adjust the number of the requestsNow if we combine this PR with with http client reuse we get about 10 doc/sec with default 1 request at a time
This throughput could improved further with the addition of batching, which I'll have shortly.
Which issue(s) this PR fixes:
Related to # #963
First steps.
Special notes for your reviewer: