Fix restart attempts of restart stanza in delay mode. #5737

Merged
merged 1 commit into from
Jun 5, 2019

Contributor

@fwkz fwkz commented May 21, 2019

restart {
  interval = "1m"
  attempts = 2
  delay    = "5s"
  mode     = "delay"
}

Imagine a constantly failing job with a restart policy in delay mode: starting from the 2nd interval, every subsequent interval gets one extra restart attempt. The root cause is that the attempts counter is reset (line no. 178) just after it was incremented, so the failure that opens a new interval is never counted.

                                        +
            1st INTERVAL                |          2nd INTERVAL
            all good                    |          one extra restart
                                        |
                                        |
     initial      1st       2nd         |     initial      1st        2nd       3rd
     attempt      restart   restart     |     attempt      restart    restart   EXTRA restart
                                        |
        +           +        +          |         +           +          +          +
        |           |        |          |         |           |          |          |
        |           |        |          |         |           |          |          |
        |           |        |          |         |           |          |          |
        |           |        |          |         |           |          |          |
 +-+----+-----+-----+----+---+---+----------------+-----+-----+-----+----+------+---+----->
   |    t     |          |       |      |       t+60s   |           |           |
   +          +          +       |      |               |           +           +
count=0    count=1    count=2    |      |               |        count=1     count=2
   +          +          +       +      +               +           +           +
                              count=3              count=4
                            stop and wait          but wait! It is
                            for the new            the new interval
                            interval               so reset the counter!
                                                   counter=0
                                                   (line no. 178)

If we increment the counter after checking whether we have entered a new interval, we should be fine.

Number of restarts during 2nd interval is off by one.

@cgbaker
Contributor

cgbaker commented May 22, 2019

Wonderful, @fwkz. 0.9.2 is frozen; will merge immediately afterwards.

@fwkz
Contributor Author

fwkz commented May 28, 2019

> 0.9.2 is frozen, will merge immediately afterwards.

Sure, I'm glad I could help. 💪

@notnoop notnoop merged commit d9ac7c2 into hashicorp:master Jun 5, 2019

github-actions bot commented Feb 9, 2023

I'm going to lock this pull request because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active contributions.
If you have found a problem that seems related to this change, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Feb 9, 2023