remove shutdown-manager liveness probe #4967
The probe can currently cause problems: when it fails, the shutdown-manager container is restarted on its own, which leaves the envoy container stuck in a "DRAINING" state indefinitely.

Not having the probe is less bad overall because envoy pods are less likely to get stuck in "DRAINING". The worst case without it is that shutdown-manager is truly unresponsive during a pod termination, in which case the envoy container will simply terminate without first draining active connections.

Updates projectcontour#4851.

Signed-off-by: Steve Kriss <krisss@vmware.com>
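As a rough illustration of what the change amounts to at the manifest level, the sketch below drops the `livenessProbe` field from the shutdown-manager container in a pod spec represented as a plain dictionary. The helper name `remove_liveness_probe` is hypothetical, and the probe's path and port are assumed values for illustration only; the actual PR edits the project's manifests and generated resources directly rather than patching them like this.

```python
import copy


def remove_liveness_probe(pod_spec: dict, container_name: str) -> dict:
    """Return a copy of pod_spec with livenessProbe dropped from one container."""
    updated = copy.deepcopy(pod_spec)
    for container in updated.get("containers", []):
        if container.get("name") == container_name:
            # pop with a default so containers without a probe are left alone
            container.pop("livenessProbe", None)
    return updated


# Illustrative pod spec; the probe path and port are assumed values.
pod_spec = {
    "containers": [
        {
            "name": "shutdown-manager",
            "livenessProbe": {
                "httpGet": {"path": "/healthz", "port": 8090},
            },
        },
        {"name": "envoy"},
    ]
}

patched = remove_liveness_probe(pod_spec, "shutdown-manager")
print(patched["containers"][0])
```

With no liveness probe on the container, the kubelet has no probe failure on which to restart shutdown-manager by itself, which is the trigger the description identifies for envoy getting stuck in "DRAINING".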
Codecov Report
Additional details and impacted files

@@            Coverage Diff             @@
##             main    #4967      +/-   ##
==========================================
- Coverage   77.63%   77.60%   -0.04%
==========================================
  Files         140      140
  Lines       16885    16871      -14
==========================================
- Hits        13109    13093      -16
- Misses       3519     3521       +2
  Partials      257      257
This change is to mitigate a problem where when the liveness probe fails, the shutdown-manager container is restarted by itself.
This ultimately has the unintended effect of causing the envoy container to be stuck indefinitely in a "DRAINING" state and not serving traffic.

Overall, not having the liveness probe on the shutdown-manager container is less bad because envoy pods are less likely to get stuck in "DRAINING", and the worst case without it is that shutdown-manager is truly unresponsive during a pod termination, in which case the envoy container will simply terminate without first draining active connections.
Nit on the wording: it might be good to state explicitly that the termination/restart of the container in the no-liveness-probe case will ensure the envoy pod is back in the set of ready envoys that traffic is load balanced to.
Signed-off-by: Steve Kriss <krisss@vmware.com>
The probe can currently cause problems when it fails by causing the shutdown-manager container to be restarted by itself, which then results in the envoy container getting stuck in a "DRAINING" state indefinitely. Not having the probe is less bad overall because envoy pods are less likely to get stuck in "DRAINING", and the worst case without it is that shutdown-manager is truly unresponsive during a pod termination, in which case the envoy container will simply terminate without first draining active connections. Updates projectcontour#4851. Signed-off-by: Steve Kriss <krisss@vmware.com> Signed-off-by: yy <yang.yang@daocloud.io>