
Avoid infinite reconnection retries #9

Closed
wants to merge 1 commit into from

Conversation

@purbon purbon commented Jun 23, 2015

Add a configuration option to set the number of times the output will try to reconnect after a failed connection. If reconnect_times is not defined, the former behaviour is kept: retry the connection forever.

Version 1.1.0 bump

-    retry
+    @reconnect_tries += 1
+    retry if retry_connection?
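A minimal sketch of the bounded-retry pattern this diff proposes. The `reconnect_times` option, `@reconnect_tries` counter, and `retry_connection?` predicate mirror the names in the diff; the `Sender` class and its failing `deliver` method are invented here purely for illustration, not the plugin's actual code:

```ruby
# Hypothetical illustration of the PR's bounded-retry logic.
class Sender
  def initialize(reconnect_times: nil)
    @reconnect_times = reconnect_times # nil => retry forever (former behaviour)
    @reconnect_tries = 0
  end

  # True while we are still allowed to retry.
  def retry_connection?
    @reconnect_times.nil? || @reconnect_tries < @reconnect_times
  end

  def send_event(event)
    deliver(event)
  rescue IOError
    @reconnect_tries += 1
    retry if retry_connection?
    # Retries exhausted: the event is silently dropped here,
    # which is exactly the concern raised in the review.
    nil
  end

  private

  # Stand-in for the real network write; always fails in this sketch.
  def deliver(_event)
    raise IOError, "connection down"
  end
end
```

With `reconnect_times: 3` the initial attempt plus retries stop once the counter reaches the limit; with the default `nil` the rescue clause retries indefinitely, matching the plugin's previous behaviour.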
Contributor

Doesn't this make it lose the event if redis is down?

Author

Yes, it does, but without modifying the pipeline logic I see no way to handle this. Since this is an optional feature, users can decide whether they want a finite or an infinite number of retries.

@colinsurprenant
Contributor

In this state I agree with @jordansissel that this proposed change is worrisome. First, there is no explicit mention of the fact that the event has been dropped. Second, I am not sure which is the "lesser evil": a stalled pipeline, or all input events being dropped to the logs when the reconnect_times option is set.

I think we should approach this from two angles:

  • the pipeline stall problem is not actually caused by a stalled output, but by a poor shutdown signalling implementation (mine) and the lack of persistence (in progress).
  • logstash outputs have been following this idea of continuous retries, for better or worse, and we should make sure we have a consistent strategy before changing this semantic.

@purbon
Author

purbon commented Jul 7, 2015

Hi,
I agree with you too. This PR introduces what was, for me, the most straightforward solution to this problem for now, but I agree the right solution is to fix the pipeline shutdown semantics.

Can you think of any solution that would help users facing this situation now?


@colinsurprenant
Contributor

@purbon as I said in my previous comment, my proposed solutions are: a) properly fix the shutdown signalling, which should happen for 1.6 per elastic/logstash#3451; b) move on with persistence; and c) discuss a "global" retry strategy for output plugins...

@purbon
Author

purbon commented Jul 8, 2015

Yes, this makes sense.


@purbon
Author

purbon commented Jul 8, 2015

Closing this, as this is not the intended solution and a proper shutdown sequence is going to be implemented soon. We can always reopen if necessary.

@purbon purbon closed this Jul 8, 2015