A feature was recently introduced to solve the 'pod deleted' issue, which is caused by the kube-controller-manager garbage collection controller. However, there is still a risk that the finalizer can't be removed (or is not removed in time) in some cases, for example:
manually deleting workflows while the controller is not running
workflow-controller restarts due to health checks
API rate limiting
Here are some suggestions to avoid that situation:
use a workflow finalizer
use a cron job to periodically remove the finalizer from finished pods
However, neither method can handle API rate limiting, so an API priority mechanism may need to be introduced. There may also be other cases and solutions.
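To make the cron idea a bit more concrete, here is a rough sketch of the selection logic only. It is not the controller's actual code: `Pod` is a hypothetical trimmed-down stand-in for the Kubernetes pod object, `sweep` is a made-up helper, and the finalizer name `workflows.argoproj.io/status` is my assumption about what the feature uses. A real job would also have to confirm that the workflow-controller has already recorded the pod's result before stripping the finalizer, which is not modeled here.

```go
package main

import "fmt"

// statusFinalizer is an ASSUMED finalizer name, used for illustration only.
const statusFinalizer = "workflows.argoproj.io/status"

// Pod is a hypothetical, minimal stand-in for a Kubernetes pod.
type Pod struct {
	Name       string
	Phase      string // "Pending", "Running", "Succeeded" or "Failed"
	Finalizers []string
}

// isFinished reports whether the pod has reached a terminal phase.
func isFinished(p Pod) bool {
	return p.Phase == "Succeeded" || p.Phase == "Failed"
}

// withoutFinalizer returns a copy of finalizers with name dropped,
// and whether name was present at all.
func withoutFinalizer(finalizers []string, name string) ([]string, bool) {
	kept := make([]string, 0, len(finalizers))
	removed := false
	for _, f := range finalizers {
		if f == name {
			removed = true
			continue
		}
		kept = append(kept, f)
	}
	return kept, removed
}

// sweep strips statusFinalizer from every finished pod in place and
// returns the names of the pods it changed. Running pods and other
// finalizers are left untouched.
func sweep(pods []Pod) []string {
	var cleaned []string
	for i := range pods {
		if !isFinished(pods[i]) {
			continue
		}
		kept, removed := withoutFinalizer(pods[i].Finalizers, statusFinalizer)
		if removed {
			pods[i].Finalizers = kept
			cleaned = append(cleaned, pods[i].Name)
		}
	}
	return cleaned
}

func main() {
	pods := []Pod{
		{Name: "still-running", Phase: "Running", Finalizers: []string{statusFinalizer}},
		{Name: "finished", Phase: "Succeeded", Finalizers: []string{statusFinalizer}},
	}
	fmt.Println(sweep(pods))
}
```

In a real job each change would be a patch against the API server, so the sweep itself is subject to the same rate limiting mentioned above.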
There may also be other options that do not use a finalizer. Since this is caused by the GC controller, we could change it to sort by finished time. Alternatively, we could have the wait container not exit right away but sleep for a while; once the workflow-controller has captured the exit status of the main container, we kill the wait container.
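As an illustration of the "sort by finished time" idea, here is a minimal sketch of how a GC pass could choose its victims. This is not the kube-controller-manager code: `TerminatedPod`, `FinishedAt` and `pickForDeletion` are hypothetical names, and the threshold stands in for whatever limit on terminated pods the GC controller is configured with.

```go
package main

import (
	"fmt"
	"sort"
)

// TerminatedPod is a hypothetical, minimal view of a finished pod.
type TerminatedPod struct {
	Name       string
	FinishedAt int64 // finish time as Unix seconds (assumed to be available)
}

// pickForDeletion returns the names of the pods to delete so that at most
// threshold terminated pods remain. Pods that finished earliest go first.
// The input slice is not modified.
func pickForDeletion(pods []TerminatedPod, threshold int) []string {
	if threshold < 0 {
		threshold = 0
	}
	excess := len(pods) - threshold
	if excess <= 0 {
		return nil
	}
	sorted := make([]TerminatedPod, len(pods))
	copy(sorted, pods)
	sort.SliceStable(sorted, func(i, j int) bool {
		return sorted[i].FinishedAt < sorted[j].FinishedAt
	})
	names := make([]string, 0, excess)
	for _, p := range sorted[:excess] {
		names = append(names, p.Name)
	}
	return names
}

func main() {
	pods := []TerminatedPod{
		{Name: "just-finished", FinishedAt: 300},
		{Name: "finished-long-ago", FinishedAt: 100},
	}
	fmt.Println(pickForDeletion(pods, 1))
}
```

With this ordering, a pod that has only just finished is the last candidate for deletion, which would give the workflow-controller more time to read its status.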
imliuda changed the title from "Follow up work of #12413 and related issues, ensure pod finalizer get removed" to "Follow up work of https://github.com/argoproj/argo-workflows/pull/12413 and related issues, ensure pod finalizer get removed" on Apr 18, 2024
imliuda changed the title from "Follow up work of https://github.com/argoproj/argo-workflows/pull/12413 and related issues, ensure pod finalizer get removed" to "Follow up work of #12413 and related issues, ensure pod finalizer get removed" on Apr 18, 2024