-
Notifications
You must be signed in to change notification settings - Fork 4.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Filebeat async publisher support #782
Conversation
b1fc9bd
to
8d69148
Compare
8d69148
to
088976f
Compare
@@ -307,6 +307,15 @@ filebeat: | |||
------------------------------------------------------------------------------------- | |||
|
|||
|
|||
===== publish_async | |||
|
|||
Experimental! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think there was an asciidoc tag for that? @dedemorton ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know of one, but I'll ask those who are wiser in the ways of asciidoc. @debadair or @palecur Do you know if there is a tag that we could use here for experimental config options? Do we have a convention for documenting experimental options?
@urso By experimental, I assume we mean that customers can try the option, but their mileage may vary (and there's no guarantee that the option will be supported in future versions). If so, maybe we need a sentence that basically conveys "use at your own risk."
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See the sources from here: https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-pipeline.html The experimental[]
string seems to generate a warning. Not sure if that works well in this context, though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
By experimental, I assume we mean that customers can try the option, but their mileage may vary
Jup. It's a trade-off between throughput and memory usage when using load-balancing. In future versions I'd consider to remove the flag and enable async mode whenever load-balancing is enabled in output plugins.
LGTM |
// collect collects finished bulk-Events in order and forward processed batches | ||
// to registrar. Reports to registrar are guaranteed to be in same order | ||
// as bulk-Events have been received by the spooler | ||
func (p *asyncLogPublisher) collect() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any idea how we could test if this works as expected?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
by having all filebeat system tests use both modes. I think we can easily craft unit test for sync and async mode.
LGTM |
IMHO, depends on when we release this. If we put it in 1.2, then it should be default off and maybe even undocumented (although I don't particularly like having undocumented options). But if we put it in 2.0-beta1, then we could even make it the default. That's exactly what the betas are for, after all, and we'd lose feedback if we hide the option. @urso is there a particular reason not to be confident about this? |
@tsg Good point. 👍 |
If something goes wrong, we've got a very bad memory leak. But from my manual testings all fine. Will add some unit tests to check the filebeat publisher modes working as expected. There is no actual reason to have this feature experimental. But users might be unhappy about increased CPU/memory usage if enabled by default. Will need to run some more tests with recent changes in publisher pipeline, as these might have reduced overall memory usage for async mode. |
088976f
to
0702590
Compare
if ctx.sync { | ||
func (c *client) getClient(opts []ClientOption) (Context, eventPublisher) { | ||
ctx := makeContext(opts) | ||
if ctx.Sync { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This code makes it look like, Async is the default (which I think it isn't). The code is correct, just my first impression.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The code you are looking at is in libbeat. Yes in libbeat the default is Async. It is filebeat passing the Sync flag enforing the publisher to do a blocking sync call.
LGTM |
- Add new config option to have filebeat publisher pipeline run in async mode. This can benefit load-balancing performance, as the publisher pipeline will be kept more busy with new bulk-events to be published. - Add test filebeat publisher sync/async mode do process all events + keep correct order when forwarding finished events to registrar. - Start exposing publisher Context and message type. We might see some more fields exposed in future, to make it more easy to hook into publisher pipeline in libbeat.
c34fa38
to
a6c2dc1
Compare
Just read about this on the blog:
Maybe this isn't the right place to ask, but how would I recreate such a test with the timings mentioned. |
No fully fledged test framework in place yet. For filebeat testing so far we've got a set of config files with different output options like console, file and logstash output. When using console output we do pipe the output to /dev/null. Logstash itself is configured with beats input only and null output. As input we're using NASA HTTP logs Timings are collected as described in this post. Difference is I'm using topbeat and govarbeat to collect CPU/memory/throughput stats to elasticsearch/file/console for example. |
Thanks I will give that a try. I guess |
yes, using /dev/null to get an idea how fast we can get, as outputs always generate some kind of back presssure. |
Add new config option to have filebeat publisher pipeline run in async mode.
This can benefit load-balancing performance, as the publisher pipeline will
be kept more busy with new bulk-events to be published.