Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP] refactor input plugin and pipeline shutdown #3211

Closed

Conversation

colinsurprenant
Copy link
Contributor

  • as proposed in [Design] plugins shutdown sequence refactor proposal #3210
  • includes code cleanups in the files I touched
  • most important/relevant changes are in logstash/pipeline.rb and logstash/plugin.rb
  • thes changes works with the "core" plugins referenced below (the unreferenced "core" plugins do not require any changes)

@colinsurprenant
Copy link
Contributor Author

some of the design goals, excluding better semantic and simplification are:

  • thread safety: the methods involving shutdown stop, stop? and close must be thread safe because they will typically be called from the pipeline thread but the plugin run in its own thread. One example is with stop and stop?. It's not correct to simply update an instance variable of the plugin object to signal a stop. In JRuby, while instance variable updates are thread safe, they are not volatile. This means that updates to the instance variable from one thread on one core might not be visible from another thread on another core. For this a mutex or an atomic container is required to ensure updates are volatile.
  • performance: a typical input plugin will involve a read loop. One way of checking for a stop signal is to just check at each loop cycle like this:
while !stop?
  ... read ...
end

I wanted to make sure the stop? method was not too costly to call at each loop cycle. I compared the speed of different approaches for implementing a volatile signal: explicit mutex lock/unlock, mutex synchronize block and AtomicBoolean versus a bare instance variable read.

We can see that volatile reads are obviously slower than a bare instance variable read. But we can also see that we can achieve ~5.5M iterations per seconds. This is about 2 order of magnitudes faster than the pipeline throughput so the impact on the pipeline of calling stop? is negligible, any actual IO read, codec decoding and event creation will be a lot slower than calling stop?.

@jsvd
Copy link
Member

jsvd commented May 22, 2015

To make all plugins consistent, the bookkeeping method should remain "teardown" in the inputs/base.rb. The job of teardown is to cleanup after the plugin is stopped (like close).

However, to reduce the amount of code a user has to write if he wants to add behaviour to teardown, all ensure blocks on plugin workers should call close, that would then log the "closing" message and call teardown on the plugin.

So, instead of this proposal of:

control +c -> (pipeline) calls stop on inputs -> each input sets stop? == true -> input exits #run -> input worker calls plugin.close

could be:

control +c -> (pipeline) calls stop on inputs -> each input sets stop? == true -> input exits #run -> input worker calls plugin.close -> plugin calls teardown on self

This way a user doesn't have to remind him/herself to call super, and all plugins use teardown for cleanup.

@jsvd
Copy link
Member

jsvd commented May 22, 2015

Also, I'd suggest renaming .stop? to .stopping? as it's usually more consistent with predicate method naming

def teardown
# nothing by default
finished
# if you override stop, don't forgt to call super
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/forgt/forget

def to_s
return "#{self.class.name}: #{@params}"
# if you override close, don't forgt to call super
def close
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@colinsurprenant I know we've had this conversation but can this method be named teardown so all cleanup methods are consistently named across plugin types (filters and outputs still use teardown)?
My two reasons for making this teardown are:

  1. less changes needed: otherwise we need to refactor all plugins that define teardown to use close instead
  2. arguably a better metaphor: close might make sense in plugins such as tcp/udp where you cleanup by closing a socket, but in others the plugin isn't actually closing anything. Therefore teardown (pronunciation) is more suitable, used commonly in test suites to convey the "cleanup" action.

@jsvd
Copy link
Member

jsvd commented Aug 7, 2015

Another step to do: remove all calls to finished on all plugins. Currently this needs to be done on:

/tmp/plugins % grep "  finished" . -r
./logstash-input-file/lib/logstash/inputs/file.rb:    finished
./logstash-input-ganglia/lib/logstash/inputs/ganglia.rb:    finished
./logstash-input-gelf/lib/logstash/inputs/gelf.rb:    finished
./logstash-input-gemfire/lib/inputs/gemfire.rb:    finished
./logstash-input-generator/lib/logstash/inputs/generator.rb:    finished
./logstash-input-imap/lib/logstash/inputs/imap.rb:    finished
./logstash-input-kafka/lib/logstash/inputs/kafka.rb:    finished
./logstash-input-pipe/lib/logstash/inputs/pipe.rb:    finished
./logstash-input-rabbitmq/lib/logstash/inputs/rabbitmq/bunny.rb:      finished
./logstash-input-rabbitmq/lib/logstash/inputs/rabbitmq/march_hare.rb:      finished
./logstash-input-sqs/lib/logstash/inputs/sqs.rb:    finished
./logstash-input-stdin/lib/logstash/inputs/stdin.rb:    finished
./logstash-input-stdin/lib/logstash/inputs/stdin.rb:    finished
./logstash-input-syslog/lib/logstash/inputs/syslog.rb:    finished
./logstash-input-varnishlog/lib/logstash/inputs/varnishlog.rb:    finished
./logstash-output-cloudwatch/lib/logstash/outputs/cloudwatch.rb:      finished
./logstash-output-file/lib/logstash/outputs/file.rb:    finished
./logstash-output-gemfire/lib/logstash/outputs/gemfire.rb:    finished
./logstash-output-jira/lib/logstash/outputs/jira.rb:      finished
./logstash-output-jms/lib/logstash/outputs/jms.rb:  finished
./logstash-output-kafka/lib/logstash/outputs/kafka.rb:      finished
./logstash-output-loggly/lib/logstash/outputs/loggly.rb:      finished
./logstash-output-lumberjack/lib/logstash/outputs/lumberjack.rb:      finished
./logstash-output-nagios_nsca/lib/logstash/outputs/nagios_nsca.rb:      finished
./logstash-output-pipe/lib/logstash/outputs/pipe.rb:    finished
./logstash-output-rabbitmq/lib/logstash/outputs/rabbitmq.rb:    finished
./logstash-output-redmine/lib/logstash/outputs/redmine.rb:      finished
./logstash-output-s3/lib/logstash/outputs/s3.rb:    finished
./logstash-output-sqs/lib/logstash/outputs/sqs.rb:    finished
./logstash-output-stdout/lib/logstash/outputs/stdout.rb:      finished
./logstash-output-udp/lib/logstash/outputs/udp.rb:      finished

@suyograo
Copy link
Contributor

Closing this in lieu of #3895

@suyograo suyograo closed this Sep 15, 2015
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants