locksmithd will reboot outside of reboot windows, if no semaphore was acquired before #407
Comments
Hi, thanks for backporting this issue. Here are a few steps to reproduce the behavior locally:
Write an Ignition config to set up a local
{
  "ignition": { "version": "2.0.0" },
  "systemd": {
    "units": [
      {
        "name": "update-engine.service",
        "enable": false
      },
      {
        "name": "etcd-member.service",
        "dropins": [{
          "name": "environment.conf",
          "contents": "[Unit]\n[Service]\nEnvironment=ETCD_ADVERTISE_CLIENT_URLS=http://127.0.0.1:2379\nEnvironment=ETCD_LISTEN_CLIENT_URLS=http://127.0.0.1:2379"
        }]
      },
      {
        "name": "locksmithd.service",
        "enable": true,
        "dropins": [{
          "name": "environment.conf",
          "contents": "[Unit]\nAfter=etcd-member.service\nRequires=etcd-member.service\n[Service]\nEnvironment=LOCKSMITHD_ENDPOINT=http://localhost:2379\nEnvironment=LOCKSMITHD_REBOOT_WINDOW_START=03:00\nEnvironment=LOCKSMITHD_REBOOT_WINDOW_LENGTH=2m"
        }]
      }
    ]
  },
  "storage": {
    "files": [
      {
        "filesystem": "root",
        "path": "/etc/coreos/update.conf",
        "contents": { "source": "data:,REBOOT_STRATEGY=etcd-lock%0A" },
        "mode": 420
      }
    ]
  }
}
N.B: Notice the
Once on the instance, remove any available token to "emulate" the issue:
Looking quickly at the code, we could add a "sleep" inside this loop: Using the same logic as here:
Thanks for looking into this! Yes, in case you fail to get the semaphore, you can probably sleep until the update period starts again, followed by a reset of the backoff interval (if
Hi @tobgu, would you be interested in doing the implementation yourself then? :)
@tormath1 I want to work on this issue
@aniruddha2000 If you want to pick it up, I'm happy with that. I would expect the code change to be pretty minimal based on the above suggestion, but verifying it in any meaningful way would probably be the biggest effort for me.
closes flatcar/Flatcar#407 Signed-off-by: Mathieu Tortuyaux <mathieu@kinvolk.io>
Description
This ticket mirrors coreos/bugs#1886 which has been left to fade away.
Impact
Unexpected reboots and updates of machines outside of designated reboot windows.
Expected behavior
Once locksmithd gets the semaphore it should check again that we're still within the update window. If not, no update should take place and the semaphore should be released again.
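The expected behavior can be sketched as a small helper. This is a minimal sketch under stated assumptions, not the fix that was merged: the `locker` interface, `memLock`, `acquireForReboot`, and `errOutsideWindow` are all hypothetical names, and the window check is passed in as a plain function so the sketch does not depend on how locksmithd represents reboot windows.

```go
package main

import "errors"

// locker is a HYPOTHETICAL stand-in for the etcd-backed semaphore.
type locker interface {
	Lock() error
	Unlock() error
}

// errOutsideWindow signals that the semaphore was obtained too late.
var errOutsideWindow = errors.New("reboot window closed while waiting for the semaphore")

// acquireForReboot takes the semaphore and then checks the reboot window
// again. If the window has closed in the meantime, the semaphore is released
// and errOutsideWindow is returned so that the caller does not reboot.
func acquireForReboot(l locker, inWindow func() bool) error {
	if err := l.Lock(); err != nil {
		return err
	}
	if inWindow() {
		return nil // still inside the window; the caller holds the semaphore
	}
	if err := l.Unlock(); err != nil {
		return err
	}
	return errOutsideWindow
}

// memLock is an in-memory fake, included only to exercise the sketch.
type memLock struct{ held bool }

func (m *memLock) Lock() error   { m.held = true; return nil }
func (m *memLock) Unlock() error { m.held = false; return nil }
```

The point of the second check is that acquiring the semaphore may take arbitrarily long, so a window that was open when the wait began may no longer be open when the lock is finally granted.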