-
Notifications
You must be signed in to change notification settings - Fork 57
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support events.preStop #1168
Support events.preStop #1168
Conversation
I also verified that user-provided preStop events can't interfere with the async-storage-sidecar preStop event:
|
93baf21
to
77575bf
Compare
Codecov ReportPatch coverage:
Additional details and impacted files@@ Coverage Diff @@
## main #1168 +/- ##
==========================================
+ Coverage 52.53% 52.91% +0.38%
==========================================
Files 82 84 +2
Lines 7460 7544 +84
==========================================
+ Hits 3919 3992 +73
- Misses 3260 3269 +9
- Partials 281 283 +2
☔ View full report in Codecov by Sentry. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Generally looks good, I'll test it soon
pkg/library/lifecycle/prestop.go
Outdated
const redirectPreStopOutputFmt = `{ | ||
%s | ||
} 1>/proc/1/fd/1 2>/proc/1/fd/2 | ||
` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you add a doc comment on the intention behind this?
Also, I'm concerned whether this format will suppress any messages on failure -- if a prestop command fails, normally the output is stored in a FailedPreStopHook
event. If output is redirected to stdout, it might actually make it harder to figure out what went wrong in case of failure (if the event no longer contains this information)
I'm thinking about this because normally with workspaces, pods are removed and not stopped, so pod logs will disappear (this output would only be visible if you were tailing the logs as the workspace stopped)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you add a doc comment on the intention behind this?
Will add some documentation on the intention of the redirection (though it seems this is more of a docker feature than kubernetes standard).
Also, I'm concerned whether this format will suppress any messages on failure -- if a prestop command fails, normally the output is stored in a
FailedPreStopHook
event. If output is redirected to stdout, it might actually make it harder to figure out what went wrong in case of failure (if the event no longer contains this information)
I'm having trouble figuring out how to actually get terminated pod events to see the FailedPreStopHook
event. From my understanding, once a pod terminates, there's no way to actually get information or events from it (without some type of logging system installed on the cluster).
For example, the following pod will create a FailedPreStopHook
event, but you can't really catch it:
apiVersion: v1
kind: Pod
metadata:
name: lifecycle-demo
spec:
containers:
- name: lifecycle-demo-container
image: nginx
lifecycle:
postStart:
exec:
command: ["/bin/sh","-c","ls"]
preStop:
exec:
command: ["/bin/sh","-c","badcommand"]
In my testing, I could only observe the FailedPreStopHook
if I made the postStart command fail: the postStart event would fail, generating a FailedPostStartHook
event, then the preStop hook would run, generating a FailedPreStopHook
event, and then Kubernetes would restart the pod (eventually reaching a CrashLoopBackoff state).
I'm thinking about this because normally with workspaces, pods are removed and not stopped, so pod logs will disappear (this output would only be visible if you were tailing the logs as the workspace stopped)
This is true.
I think I need to experiment with a cluster logging/monitoring tool to see how the output redirection behaves, because it seems that under normal circumstances, it's impossible to see pod logs or events after pod termination
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if a prestop command fails, normally the output is stored in a
FailedPreStopHook
event.
Oddly enough, I don't see failing postStart (or preStop) command output in the event, though the Kubernetes documentation mentions that it should appear.
I can confirm however that redirecting the command output will still result in a FailedPreStopHook
warning event (though this wasn't your actual concern):
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 13s default-scheduler Successfully assigned devworkspace-controller/lifecycle-demo to minikube
Normal Pulled 12s kubelet Successfully pulled image "nginx" in 422.082173ms (422.088026ms including waiting)
Normal Pulling 10s (x2 over 12s) kubelet Pulling image "nginx"
Normal Created 10s (x2 over 11s) kubelet Created container lifecycle-demo-container
Normal Started 10s (x2 over 11s) kubelet Started container lifecycle-demo-container
Warning FailedPostStartHook 10s (x2 over 11s) kubelet PostStartHook failed
Normal Killing 10s (x2 over 11s) kubelet FailedPostStartHook
Warning FailedPreStopHook 10s (x2 over 11s) kubelet PreStopHook failed
Normal Pulled 10s kubelet Successfully pulled image "nginx" in 464.96827ms (464.973595ms including waiting)
Warning BackOff 8s (x2 over 9s) kubelet Back-off restarting failed container lifecycle-demo-container in pod lifecycle-demo_devworkspace-controller(2067b4a6-9f64-4148-a2a5-d6483a315e4b
The above events came from the following pod spec:
apiVersion: v1
kind: Pod
metadata:
name: lifecycle-demo
spec:
containers:
- name: lifecycle-demo-container
image: nginx
lifecycle:
postStart:
exec:
command: ["/bin/sh","-c","{ badcommand } 1>/proc/1/fd/1 2>/proc/1/fd/2"]
preStop:
exec:
command: ["/bin/sh","-c","{ badcommand } 1>/proc/1/fd/1 2>/proc/1/fd/2"]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, the concern is that redirecting output means any logs from a failed hook don't get added to the event.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry I forgot to mention: even if the output isn't redirected, I don't see it in the event when doing kubectl describe
.
The following pod spec, which has a failing, non-redirected postStart command:
apiVersion: v1
kind: Pod
metadata:
name: lifecycle-demo
spec:
containers:
- name: lifecycle-demo-container
image: nginx
lifecycle:
postStart:
exec:
command: ["/bin/sh","-c","badcommand"]
preStop:
exec:
command: ["/bin/sh","-c","{ badcommand } 1>/proc/1/fd/1 2>/proc/1/fd/2"]
Does not get its command output in the FailedPostStartHook event:
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 5s default-scheduler Successfully assigned devworkspace-controller/lifecycle-demo to minikube
Normal Pulled 4s kubelet Successfully pulled image "nginx" in 589.651197ms (589.662446ms including waiting)
Normal Pulling 3s (x2 over 5s) kubelet Pulling image "nginx"
Normal Created 3s (x2 over 4s) kubelet Created container lifecycle-demo-container
Normal Started 3s (x2 over 4s) kubelet Started container lifecycle-demo-container
Warning FailedPostStartHook 3s (x2 over 4s) kubelet PostStartHook failed
Normal Killing 3s (x2 over 4s) kubelet FailedPostStartHook
Warning FailedPreStopHook 3s (x2 over 4s) kubelet PreStopHook failed
Normal Pulled 3s kubelet Successfully pulled image "nginx" in 460.949193ms (461.095788ms including waiting)
Warning BackOff 1s (x2 over 2s) kubelet Back-off restarting failed container lifecycle-demo-container in pod lifecycle-demo_devworkspace-controller(3ba486ed-64aa-48e8-a803-65c77c10b347)
This is on Minikube v1.30.1 using Kubernetes v1.26.3 on Docker 23.0.2.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The output is contained in the message field of the event:
- Without the redirect:
message: |- Exec lifecycle hook ([/bin/sh -c { echo "hello" ls -al exit 1 } ]) for Container "web-terminal" in Pod "workspace8f91680d98264907-54c9d766c4-pbzcw_dw(b52066eb-3bbf-4562-bc27-396267720503)" failed - error: command '/bin/sh -c { echo "hello" ls -al exit 1 } ' exited with 1: , message: "hello\ntotal 16\ndrwxrwxr-x 1 root root 4096 May 23 18:40 .\ndrwxr-xr-x 1 root root 4096 May 23 18:40 ..\ndrwxrwxr-x 1 root root 4096 May 23 18:40 .config\n-rw-rw---- 1 root root 533 May 23 18:40 .viminfo\n"
- With the redirect:
message: |- Exec lifecycle hook ([/bin/sh -c { echo "hello" ls -al exit 1 } 1>/proc/1/fd/1 2>/proc/1/fd/2 ]) for Container "web-terminal" in Pod "workspace8f91680d98264907-c8c5f7c46-nczvc_dw(60ac330d-edcd-4c08-a566-3700173f847b)" failed - error: command '/bin/sh -c { echo "hello" ls -al exit 1 } 1>/proc/1/fd/1 2>/proc/1/fd/2 ' exited with 1: , message: ""
without the redirect, the message ends with
hello
total 16
drwxrwxr-x 1 root root 4096 May 23 18:40 .
drwxr-xr-x 1 root root 4096 May 23 18:40 ..
drwxrwxr-x 1 root root 4096 May 23 18:40 .config
-rw-rw---- 1 root root 533 May 23 18:40 .viminfo
You can view this via
kubectl get ev --field-selector reason==FailedPreStopHook -o yaml
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for the example and clarification @amisevsk. To me, this is enough of a reason to not use the 1>/proc/1/fd/1 2>/proc/1/fd/2
redirect.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: amisevsk, AObuchow The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
Fix devfile#1139 Signed-off-by: Andrew Obuchowicz <aobuchow@redhat.com>
Signed-off-by: Andrew Obuchowicz <aobuchow@redhat.com>
Signed-off-by: Andrew Obuchowicz <aobuchow@redhat.com>
Signed-off-by: Andrew Obuchowicz <aobuchow@redhat.com>
a81c3d7
to
52719a0
Compare
New changes are detected. LGTM label has been removed. |
/retest |
1 similar comment
/retest |
/test v8-images |
/test v8-devworkspace-operator-e2e |
1 similar comment
/test v8-devworkspace-operator-e2e |
/retest |
3 similar comments
/retest |
/retest |
/retest |
What does this PR do?
Adds support for
events.preStop
in devworkspaces. Only exec type commands are supported.Multiple preStop commands may be executed, following a similar format to postStart events, the difference being that the output of the commands are redirected to
/proc/1/fd/1
(for stdout) and/proc/1/fd/2
for stderr. This allows reading the output of preStop commands in the kubernetes logs after pod termination.The webhook warning against the usage of
events.preStop
was removed, as well as the mention ofevents.preStop
in the unsupported devfile API doc.What issues does this PR fix or reference?
Fix #1139
Is it tested? How?
Some automated tests (based on the postStart test cases) were added.
To manually test:
events.preStop
Open another terminal and monitor the logs of the workspace pod:
kubectl logs -f workspace623a8f20aa6442f3-6dccd76799-mqnt5 -n $NAMESPACE
In another terminal, delete the DevWorkspace:
kubectl delete dw pre-stop-test -n $NAMESPACE
In the terminal that you're monitoring the logs, you should see the output of all preStop event commands:
PR Checklist
/test v8-devworkspace-operator-e2e, v8-che-happy-path
to trigger)v8-devworkspace-operator-e2e
: DevWorkspace e2e testv8-che-happy-path
: Happy path for verification integration with Che