The referenced line (kured/cmd/kured/main.go, line 697 in ad1e9b8) is only executed during the reboot time window in the rebootAsRequired goroutine.
Outside reboot time window:
k logs kured-knvc5 -f
time="2023-02-07T09:45:58Z" level=info msg="Binding node-id command flag to environment variable: KURED_NODE_ID"
time="2023-02-07T09:45:58Z" level=info msg="Kubernetes Reboot Daemon: c7b7d6a"
time="2023-02-07T09:45:58Z" level=info msg="Node ID: node1"
time="2023-02-07T09:45:58Z" level=info msg="Lock Annotation: kube-system/kured:weave.works/kured-node-lock"
time="2023-02-07T09:45:58Z" level=info msg="Lock TTL not set, lock will remain until being released"
time="2023-02-07T09:45:58Z" level=info msg="Lock release delay set, lock release will be delayed by: 30m0s"
time="2023-02-07T09:45:58Z" level=info msg="PreferNoSchedule taint: "
time="2023-02-07T09:45:58Z" level=info msg="Blocking Pod Selectors: []"
time="2023-02-07T09:45:58Z" level=info msg="Reboot schedule: ---MonTueWedThuFri--- between 02:30 and 06:00 Europe/Berlin"
time="2023-02-07T09:45:58Z" level=info msg="Reboot check command: [test -f /var/run/reboot-required] every 1m0s"
time="2023-02-07T09:45:58Z" level=info msg="Reboot command: [/bin/systemctl reboot]"
time="2023-02-07T09:45:58Z" level=info msg="Will annotate nodes during kured reboot operations"
<no further logs>
Because maintainRebootRequiredMetric runs in a separate goroutine, the metric is updated anyway and indicates that a reboot is in fact required:
k exec -it kured-knvc5 -- sh
/ # wget -qO- 127.0.0.1:8080/metrics | grep kured
# HELP kured_reboot_required OS requires reboot due to software updates.
# TYPE kured_reboot_required gauge
kured_reboot_required{node="node1"} 1
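To check this metric from a script rather than by eye, the text output can be scanned for the sample line. The following is a minimal sketch, not part of kured: the function name rebootRequiredFromMetrics is hypothetical, and it assumes the plain Prometheus text format shown above (metric name, optional label set, then the value).

```go
package main

import (
	"bufio"
	"strconv"
	"strings"
)

// rebootRequiredFromMetrics scans Prometheus text-format output for the
// kured_reboot_required sample and reports whether its value is non-zero.
// found is false when no such sample is present.
// Hypothetical helper for illustration; it is not part of kured.
func rebootRequiredFromMetrics(body string) (required, found bool) {
	const name = "kured_reboot_required"
	sc := bufio.NewScanner(strings.NewReader(body))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if !strings.HasPrefix(line, name) {
			continue // also skips "# HELP" and "# TYPE" comment lines
		}
		rest := strings.TrimPrefix(line, name)
		// Reject longer metric names that merely share the prefix.
		if rest == "" || (rest[0] != '{' && rest[0] != ' ' && rest[0] != '\t') {
			continue
		}
		// Drop the optional {label="..."} set; the value follows it.
		if i := strings.LastIndex(rest, "}"); i >= 0 {
			rest = rest[i+1:]
		}
		fields := strings.Fields(rest)
		if len(fields) == 0 {
			continue
		}
		v, err := strconv.ParseFloat(fields[0], 64)
		if err != nil {
			continue
		}
		return v != 0, true
	}
	return false, false
}
```

Fed the three lines of output above, the function reports that a reboot is required. A full implementation would more likely use a proper Prometheus text-format parser.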
When kured is inside the reboot time window, a reboot is required, and the lock can be acquired, the drain and reboot follow quickly:
k exec -it kured-pxkjr -- sh
/ # wget -qO- 127.0.0.1:8080/metrics | grep kured
# HELP kured_reboot_required OS requires reboot due to software updates.
# TYPE kured_reboot_required gauge
kured_reboot_required{node="node1"} 1
k logs kured-pxkjr -f
time="2023-02-07T10:20:21Z" level=info msg="Binding node-id command flag to environment variable: KURED_NODE_ID"
time="2023-02-07T10:20:21Z" level=info msg="Kubernetes Reboot Daemon: c7b7d6a"
time="2023-02-07T10:20:21Z" level=info msg="Node ID: node1"
time="2023-02-07T10:20:21Z" level=info msg="Lock Annotation: kube-system/kured:weave.works/kured-node-lock"
time="2023-02-07T10:20:21Z" level=info msg="Lock TTL not set, lock will remain until being released"
time="2023-02-07T10:20:21Z" level=info msg="Lock release delay set, lock release will be delayed by: 30m0s"
time="2023-02-07T10:20:21Z" level=info msg="PreferNoSchedule taint: "
time="2023-02-07T10:20:21Z" level=info msg="Blocking Pod Selectors: []"
time="2023-02-07T10:20:21Z" level=info msg="Reboot schedule: ---MonTueWedThuFri--- between 10:00 and 13:00 Europe/Berlin"
time="2023-02-07T10:20:21Z" level=info msg="Reboot check command: [test -f /var/run/reboot-required] every 1m0s"
time="2023-02-07T10:20:21Z" level=info msg="Reboot command: [/bin/systemctl reboot]"
time="2023-02-07T10:20:21Z" level=info msg="Will annotate nodes during kured reboot operations"
time="2023-02-07T10:22:56Z" level=info msg="Reboot required"
time="2023-02-07T10:22:56Z" level=info msg="Adding node node1 annotation: weave.works/kured-reboot-in-progress=2023-02-07T10:22:56Z"
time="2023-02-07T10:22:56Z" level=info msg="Adding node node1 annotation: weave.works/kured-most-recent-reboot-needed=2023-02-07T10:22:56Z"
time="2023-02-07T10:22:56Z" level=info msg="Acquired reboot lock"
time="2023-02-07T10:22:56Z" level=info msg="Draining node node1"
...
time="2023-02-07T10:23:32Z" level=info msg="Running command: [/usr/bin/nsenter -m/proc/1/ns/mnt -- /bin/systemctl reboot] for node: node1"
time="2023-02-07T10:23:32Z" level=info msg="Waiting for reboot"
In my opinion, it defeats the purpose to log "Reboot required" just before the reboot.
I think it would be better to check the sentinel and log "Reboot required" in a separate goroutine that also runs outside the reboot time window, as is already done with the maintainRebootRequiredMetric goroutine. Then there would also be no mismatch between the two.
What do you think?
This issue was automatically considered stale due to lack of activity. Please update it and/or join our slack channels to promote it, before it automatically closes (in 7 days).
Without this patch, the metric could say "reboot is required"
while the rebootAsRequired tick had not run yet (for example,
with a long period).
This is a problem, as it leads to false expectations: "Why
did the system not reboot, while the metrics indicate a reboot
was required?"
This solves it by inlining the metrics management within the
rebootAsRequired goroutine.
Closes: kubereboot#725
Signed-off-by: Jean-Philippe Evrard <open-source@a.spamming.party>
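The idea described in this commit message can be illustrated with a small sketch. It is not the actual patch: tickDeps and tick are hypothetical names, the hooks stand in for kured's sentinel check, schedule check and reboot logic, and the order of the steps is an assumption. What it shows is that the metric is set from the same check result that drives the reboot decision, so the two cannot report different states.

```go
package main

// tickDeps bundles the hooks one iteration needs. All names here are
// hypothetical and chosen for illustration; none of them come from kured.
type tickDeps struct {
	rebootRequired func() bool         // e.g. runs the sentinel check command
	inWindow       func() bool         // true inside the reboot schedule
	reboot         func()              // acquire lock, drain, reboot
	setMetric      func(value float64) // e.g. the Set method of a gauge
}

// tick performs one iteration of the loop. The metric is updated from the
// same check result that the reboot decision uses, within the same tick.
func tick(d tickDeps) (required bool) {
	required = d.rebootRequired()
	if required {
		d.setMetric(1)
	} else {
		d.setMetric(0)
	}
	if required && d.inWindow() {
		d.reboot()
	}
	return required
}
```

With this shape the metric only changes when a tick actually runs, which is the consistency the commit message asks for. The trade-off is that the metric is refreshed no more often than the tick period.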
Hello,
As far as I know, it is currently not possible to find out from the logs in advance that a reboot is required.
If I understand the code correctly, kured/cmd/kured/main.go (line 697 in ad1e9b8) is only executed during the reboot time window in the rebootAsRequired goroutine.