-
Notifications
You must be signed in to change notification settings - Fork 3.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
1.5 performance regression fixes #2869
Conversation
here's what the generated filter function looks like across versions for the following config:
1.4.2 @filter_func = lambda do |event, &block|
extra_events = []
@logger.debug? && @logger.debug("filter received", :event => event.to_hash)
if ((event["[foo]"] == 1))
newevents = []
extra_events.each do |event|
@filter_clone_2.filter(event) do |newevent|
newevents << newevent
end
end
extra_events += newevents
@filter_clone_2.filter(event) do |newevent|
extra_events << newevent
end
if event.cancelled?
extra_events.each(&block)
return
end
end
if ((event["[bar]"] == 1))
newevents = []
extra_events.each do |event|
@filter_clone_3.filter(event) do |newevent|
newevents << newevent
end
end
extra_events += newevents
@filter_clone_3.filter(event) do |newevent|
extra_events << newevent
end
if event.cancelled?
extra_events.each(&block)
return
end
end
newevents = []
extra_events.each do |event|
@filter_clone_4.filter(event) do |newevent|
newevents << newevent
end
end
extra_events += newevents
@filter_clone_4.filter(event) do |newevent|
extra_events << newevent
end
if event.cancelled?
extra_events.each(&block)
return
end
extra_events.each(&block)
end 1.5.0-rc2 @filter_func = lambda do |event, &block|
events = [event]
@logger.debug? && @logger.debug("filter received", :event => event.to_hash)
events = events.flat_map do |event|
events = [event]
if ((event["[foo]"] == 1)) # if [foo] == 1
events = events.flat_map do |event|
next [] if event.cancelled?
new_events = []
@filter_clone_2.filter(event){|new_event| new_events << new_event}
event.cancelled? ? new_events : new_events.unshift(event)
end
end
events
end
events = events.flat_map do |event|
events = [event]
if ((event["[bar]"] == 1)) # if [bar] == 1
events = events.flat_map do |event|
next [] if event.cancelled?
new_events = []
@filter_clone_3.filter(event){|new_event| new_events << new_event}
event.cancelled? ? new_events : new_events.unshift(event)
end
end
events
end
events = events.flat_map do |event|
next [] if event.cancelled?
new_events = []
@filter_clone_4.filter(event){|new_event| new_events << new_event}
event.cancelled? ? new_events : new_events.unshift(event)
end
events.flatten.each{|e| block.call(e) }
end 1.5.0 fix/perf_regression def filter_func(event)
events = [event]
@logger.debug? && @logger.debug("filter received", :event => event.to_hash)
if ((event["[foo]"] == 1)) # if [foo] == 1
events = @filter_clone_2.multi_filter(events)
end
if ((event["[bar]"] == 1)) # if [bar] == 1
events = @filter_clone_3.multi_filter(events)
end
events = @filter_clone_4.multi_filter(events)
events
end the new generated code is a lot more straightforward and easier to understand. note the def multi_filter(events)
result = []
events.each do |event|
result << event
filter(event){|new_event| result << new_event}
end
result
end also the cancelled event handling has been moved up in the filter_func(event).each { |e| @filter_to_output.push(e) unless e.cancelled? } eventually (post 1.5) I'd like to get rid of the |
here's what the generated output function looks like across versions for the following config:
1.4.2@output_func = lambda do |event, &block|
@logger.debug? && @logger.debug("output received", :event => event.to_hash)
if ((event["[foo]"] == 1))
@output_stdout_2.handle(event)
end
if ((event["[bar]"] == 1))
@output_stdout_3.handle(event)
end
@output_stdout_4.handle(event)
end 1.5.0-rc2note the totally useless multiple events handling introduced here. @output_func = lambda do |event, &block|
events = [event]
@logger.debug? && @logger.debug("output received", :event => event.to_hash)
events = events.flat_map do |event|
events = [event]
if ((event["[foo]"] == 1)) # if [foo] == 1
@output_stdout_2.handle(event)
end
events
end
events = events.flat_map do |event|
events = [event]
if ((event["[bar]"] == 1)) # if [bar] == 1
@output_stdout_3.handle(event)
end
events
end
@output_stdout_4.handle(event)
end 1.5.0 fix/perf_regressiondef output_func(event)
@logger.debug? && @logger.debug("output received", :event => event.to_hash)
if ((event["[foo]"] == 1)) # if [foo] == 1
@output_stdout_2.handle(event)
end
if ((event["[bar]"] == 1)) # if [bar] == 1
@output_stdout_3.handle(event)
end
@output_stdout_4.handle(event)
end |
end | ||
result | ||
end | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add a comment on this? I am not sure what it exactly does?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
totally! on it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
I did not run the test and we don't have enough code harness for the pipeline.. |
see inline comment for |
@colinsurprenant ty for correctly answering my questions, I have no doubt it is faster than what we have before with the removal of the lambda. But lets get more eyes into it :) |
@@ -16,6 +16,13 @@ | |||
class LogStash::ShutdownEvent; end | |||
class LogStash::FlushEvent; end | |||
|
|||
module LogStash |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd add an extra module level here to add a "subject" example:
module LogStash::Signal
FLUSH = LogStash::FlushEvent.new
SHUTDOWN = LogStash::ShutdownEvent.new
end
Then:
@input_to_filter.push(LogStash::Signal::SHUTDOWN)
or
@input_to_filter.push(Signal::SHUTDOWN)
since we're already inside the namespace.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hum..around 10 plugins use it https://github.com/search?p=1&q=user%3Alogstash-plugins+LogStash%3A%3ASHUTDOWN&ref=searchresults&type=Code&utf8=%E2%9C%93 (I couldn't find a more accurate search for this)
Alright, lets postpone this to > 1.5
wow, I love this refactor @colinsurprenant! |
If I understood the changes correctly, I like it! definitely a clearer filter function is generated, and removing the need to create new shutdown event objects each time is nice. output is a step or two shorter! beyond what @ph and @jsvd mention, I don't see anything that jumps out at me. I have not tested it though. go go performant colin! |
I just updated the performance numbers in #2870 to reflect all latests commits. |
+1 to merge. nice work @colinsurprenant |
LGTM |
LGTM tests are running |
rework output function init removed double logger.debug use event constant fix LogStash::SHUTDOWN to actually be the shutdown signal event push in the pipeline fix outputs conditionals handling code gen methods not lambdas, remove intermediate arrays, base multi_filter to simplify code gen remove block from filter_func force_encoding at compile time commented multi_filter method typo DRY & cleanups optimizations: string constants, timestamp initialization, useless logger instance specs for signal events Filters::Base interface and multi_filter method closes #2870
688cf28
to
1d7d14c
Compare
Merged sucessfully into 1.5 master! |
rework output function init removed double logger.debug use event constant fix LogStash::SHUTDOWN to actually be the shutdown signal event push in the pipeline fix outputs conditionals handling code gen methods not lambdas, remove intermediate arrays, base multi_filter to simplify code gen remove block from filter_func force_encoding at compile time commented multi_filter method typo DRY & cleanups optimizations: string constants, timestamp initialization, useless logger instance specs for signal events Filters::Base interface and multi_filter method closes #2870 Fixes #2869
performance fixes to the pipeline and filters and output code generation
this is also the
fix/perf_regression
share branch