
Normalize the exception behaviour for inputs, outputs and filters #2386

Closed
wants to merge 7 commits

Conversation

@jsvd (Member) commented Jan 21, 2015

Right now, an exception in an input plugin will cause it to restart indefinitely, while an exception in an output plugin halts Logstash immediately.

[EDITED] Taking @jordansissel's comments into account, this pull request makes all plugins behave the same way:
If a plugin raises a StandardError, the worker catches it and retries the method; any other Exception makes Logstash crash.

This might not be the behaviour we ultimately want, so this PR is more of a WIP aiming at consistent behaviour across plugins.
It also adds pipeline tests for this behaviour.
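Roughly, the intended behaviour can be sketched as below (purely illustrative; the method names and the failure are placeholders, not the actual pipeline code):

```ruby
# Illustrative sketch only, not the actual pipeline code: the worker
# retries its unit of work on StandardError; any other Exception
# propagates and crashes the process.
$attempts = 0

def run_once
  $attempts += 1
  raise IOError, "transient failure" if $attempts < 3  # IOError < StandardError
  puts "work completed after #{$attempts} attempts"
end

def worker_loop
  run_once
rescue StandardError => e
  warn "worker error, retrying: #{e.message}"
  retry                      # re-run the method body on StandardError
end

worker_loop
```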

Issues related to this problem:
#2152
#2130
#1250
#1373

fixes #2477

@@ -114,4 +130,64 @@ class TestPipeline < LogStash::Pipeline
end
end
end

context "when filters raise exceptions" do
Contributor commented:

when filters plugins raise exceptions ?

Member Author replied:

Good catch!

@wiibaa (Contributor) commented Jan 26, 2015

Also related to #1373

@jordansissel (Contributor)

First off, thanks for helping start this discussion more visibly outside our little team :)

> If the plugin raises an exception, the teardown for all plugins will be called and the exception message will be printed.

I'm not in favor of this behavior, for two reasons. The first is the historical behavior and the reasoning behind it. The second pertains to clustering and the future.

First, history. Logstash depends on tons of external libraries of varying stability (both in the libraries themselves and in their integration with Logstash), which means uncaught exceptions would cause Logstash to terminate. My original idea was that you could write simpler plugins and worry less about intermittent failures (via exceptions) by simply restarting the plugin when it failed. This was done most effectively for the input plugins (because they have an obvious "start" behavior), and less so for filters and outputs. This "restart it when it fails" approach is common in process managers like daemontools, upstart, systemd, and runit; it is also a central goal of Erlang's supervisor behaviors.

Second, the future! If we terminate Logstash on any uncaught error, how does this impact things when Logstash is a cluster? I don't think the cluster should terminate because one node is having trouble, right? And what of multitenant cluster deployments, where one user could accidentally push a bad configuration (or plugin!) that causes errors in that user's log flow - we don't want to interrupt flows belonging to other tenants.

My practice in operations for the past few years has always been to run applications in a kind of "do_thing while true" loop, restarting a thing whenever it exits (on error or otherwise, always restart it!). In Logstash's case, this matches my desire to always restart components on failure. For permanent failures, an event could be poisonous; we need to detect these situations and direct such events to some kind of store for later use (dead letter queue, etc.), but the default should be to restart the failed component and only direct any poisonous input (events) to a safe place after several restart attempts have failed.
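A minimal sketch of this "restart with a limit, then dead-letter" idea (the names, the retry limit, and the dead-letter handling are assumptions for illustration, not Logstash code):

```ruby
# Illustrative only: restart a failing component on StandardError, and give
# up (dead-lettering the offending input) after a few failed restarts.
MAX_RESTARTS = 3

def supervise(component)
  restarts = 0
  begin
    component.call
  rescue StandardError => e
    restarts += 1
    if restarts <= MAX_RESTARTS
      warn "restarting after #{e.class}: #{e.message} (attempt #{restarts})"
      retry
    else
      warn "giving up; routing the offending input to a dead letter store"
      # dead_letter_queue << event   # hypothetical dead letter queue
    end
  end
end

# Example: a component that always fails, so it exhausts its restarts.
supervise(-> { raise "boom" })
```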

@jsvd jsvd force-pushed the fix/output_thread_lock branch from eb8bbb0 to 1cfa954 Compare February 2, 2015 16:37
@jsvd jsvd self-assigned this Feb 2, 2015
@jsvd (Member Author) commented Feb 2, 2015

test this please

@@ -302,4 +295,8 @@ def flush_filters_to_output!(options = {})
end
end # flush_filters_to_output!

private
def print_exception_information(exception)
@logger.error("Restarting worker: #{exception} => #{exception.backtrace}")
Contributor suggested:

@logger.error("Exception information", :exception => exception, :backtrace => exception.backtrace)

@lexelby commented Feb 10, 2015

Great stuff here. Thanks, all!

@jsvd (Member Author) commented Feb 26, 2015

TODO: include the concern raised in #1588

@suyograo suyograo modified the milestones: v1.6.0, 1.5.1 Feb 26, 2015
jsvd added 7 commits March 17, 2015 13:22
input, filter and output workers should retry on a StandardError.
This PR adds that behaviour and tests.

It also changes Thread.abort_on_exception to false so that the pipeline.run
method catches exceptions from the other threads through `thread.join`.
Otherwise an Exception on a thread immediately exits the process and Logstash terminates.
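For context, a minimal sketch of how `Thread#join` surfaces a worker thread's exception when `Thread.abort_on_exception` is left false (illustrative only, not the pipeline code):

```ruby
# Illustrative only: with abort_on_exception false (the default), an
# exception raised inside a thread is re-raised in the caller when the
# thread is joined, so supervising code can rescue it and react.
Thread.abort_on_exception = false

worker = Thread.new { raise StandardError, "boom from worker" }

begin
  worker.join                    # the worker's exception is re-raised here
rescue StandardError => e
  puts "caught in supervisor: #{e.message}"
end
```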
@jsvd jsvd force-pushed the fix/output_thread_lock branch from ef560ab to 7ee652d Compare March 17, 2015 20:50
break if event.is_a?(LogStash::ShutdownEvent)
output(event)
end # while true

rescue => e
Contributor suggested:

rescue StandardError => e
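For reference, a bare `rescue => e` in Ruby already catches only StandardError and its subclasses, so the suggested change is about making that intent explicit; a tiny runnable illustration:

```ruby
# A bare `rescue => e` and `rescue StandardError => e` behave the same:
# both catch StandardError subclasses, but not other Exceptions.
def risky
  raise ArgumentError, "bad input"   # ArgumentError < StandardError
end

begin
  risky
rescue => e                          # equivalent to `rescue StandardError => e`
  puts "caught #{e.class}: #{e.message}"
end
```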

@joekiller (Contributor)

I'm going to try to test this as I believe I'm having a problem similar to that described in #2130.

@tbragin tbragin removed this from the v1.5.1 milestone Jun 10, 2015
@guyboertje (Contributor)

See #4064

@guyboertje (Contributor)

Please continue the discussion in #4127

@suyograo (Contributor)

super old stuff. closing

@suyograo suyograo closed this May 13, 2017

Successfully merging this pull request may close these issues.

Shutdown Semantics: Exception handling in the pipeline
10 participants