Is your feature request related to a problem? Please describe.
When many separate machines are using watchtower to poll for the same image updates, it's possible to run into a thundering herd type of problem, where many machines are updating their services all at the same time. This can be especially true for environments where the same watchtower configuration has been deployed simultaneously using a mass-configuration management system (Ansible, Chef, SaltStack, Puppet, etc).
Describe the solution you'd like
I believe the problem could be alleviated by introducing a new configuration option to splay the timing of watchtower's polling interval and schedule option values. The argument to the splay option could be a number of seconds giving the upper bound of a random delay added to the interval or schedule timing. This random delay would be recalculated each time the interval or schedule is triggered.
Examples:
Poll every 5 to 6 minutes:
watchtower --interval 300 --splay 60
Schedule once every day between 12:00 and 12:10:
watchtower --schedule "0 12 * * *" --splay 600
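To make the intended behaviour concrete, here is a minimal sketch of how the delay could be computed. It is written in Python purely for illustration and is not watchtower code; the function name `splayed_delay` and its parameters are hypothetical, and `--splay` itself is only the option proposed in this issue. It assumes the random delay is drawn uniformly between 0 and the splay value.

```python
import random
from typing import Optional


def splayed_delay(base_seconds: float, splay_seconds: float,
                  rng: Optional[random.Random] = None) -> float:
    """Return the wait before the next poll: base plus a random 0..splay delay.

    Hypothetical helper illustrating the proposal; not part of watchtower.
    """
    if base_seconds < 0 or splay_seconds < 0:
        raise ValueError("base_seconds and splay_seconds must be non-negative")
    source = rng if rng is not None else random
    # Draw a fresh delay on every call so each trigger gets its own offset.
    return base_seconds + source.uniform(0, splay_seconds)


# Interval case: --interval 300 --splay 60 waits between 300 and 360 seconds.
next_wait = splayed_delay(300, 60)

# Schedule case: after the cron trigger fires, wait an extra 0 to 600 seconds.
extra_wait = splayed_delay(0, 600)
```

Because the delay is redrawn on every trigger, machines that were configured at the same moment should drift apart instead of staying synchronised.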
Describe alternatives you've considered
An alternative could be to add a feature to watchtower that uses a network service like etcd, where watchtower would wait for a "lock" to be opened before it performs its poll. This may be a more robust approach for some cases but would likely be more complex.
Additional context
No response
Hi there! 👋🏼 As you're new to this repo, we'd like to suggest that you read our code of conduct as well as our contribution guidelines. Thanks a bunch for opening your first issue! 🙏
m4wh6k changed the title from "Support time splay / randomization" to "Support polling time splay / randomization" on Aug 1, 2023