CronJobs sometimes do not discard outdated jobs #64873
Comments
/sig apps
@bartebor This looks like you might be running etcd without quorum read? Before version 1.9 you have to start the API server(s) with the --etcd-quorum-read flag.
@mortent I think this is exactly what happened. We did not have quorum reads enabled.
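For reference, a minimal sketch of enabling quorum reads on a pre-1.9 cluster; the flag is the standard kube-apiserver option, but the surrounding invocation and etcd endpoints are illustrative only:

```sh
# Sketch: enable quorum reads on each kube-apiserver in a pre-1.9 cluster.
# In 1.9 and later --etcd-quorum-read defaults to true, so no change is needed there.
kube-apiserver \
  --etcd-servers=https://etcd-0:2379,https://etcd-1:2379,https://etcd-2:2379 \
  --etcd-quorum-read=true
  # ...remaining apiserver flags omitted...
```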
/close
@mortent: you can't close an active issue unless you authored it or you are assigned to it. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
/assign mortent
@kow3ns: GitHub didn't allow me to assign the following users: mortent. Note that only kubernetes members and repo collaborators can be assigned. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
/close
Is this a BUG REPORT or FEATURE REQUEST?:
/kind bug
What happened:
I have a simple cron job scheduled every minute with successfulJobsHistoryLimit and failedJobsHistoryLimit set to 1. A job sleeps 10s and successfully exits. After some time I see jobs which should already be deleted, but they are not and hang forever. Some jobs are deleted, others are not. Each remaining job has its deletionTimestamp set and the "orphan" finalizer present. Logs show that the controller tries to delete a job every 10s, but the apiserver silently ignores this because deletionTimestamp is already set.

What you expected to happen:
I expect to see only one successful job left, as specified via successfulJobsHistoryLimit.

How to reproduce it (as minimally and precisely as possible):
Create a simple cron job (scheduled every minute; sleeps 10s) and wait. Unfortunately it seems to be environment-dependent, so you may or may not see this behaviour.
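For what it's worth, a minimal sketch of such a reproducer (the name and image are illustrative; batch/v1beta1 was the CronJob API version at the time):

```sh
# Sketch of a reproducer: a CronJob that runs every minute, sleeps 10s,
# and keeps only one successful and one failed job in its history.
cat <<'EOF' | kubectl apply -f -
apiVersion: batch/v1beta1
kind: CronJob
metadata:
  name: sleep-demo            # illustrative name
spec:
  schedule: "* * * * *"       # every minute
  successfulJobsHistoryLimit: 1
  failedJobsHistoryLimit: 1
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: OnFailure
          containers:
          - name: sleep
            image: busybox
            command: ["sleep", "10"]
EOF

# Watch the job history; more than one completed job lingering with a
# deletionTimestamp suggests the behaviour described above.
kubectl get jobs -w
```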
Anything else we need to know?:
I have some findings which could be useful:
Removing the "orphan" finalizer via kubectl edit deletes the job.
Each stuck job has its deletionTimestamp set and the "orphan" finalizer present. If the DELETE stalls for some reason (maybe because of more than one apiserver in the cluster) and the GET finishes earlier, it returns an object with no deletionTimestamp and no finalizers present, which makes the garbage collector abandon its work (there is a "the orphan finalizer is already removed from object" message in kube-controller-manager's log).

I am not sure if this is related to #56348.
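To make the above concrete, a couple of hedged commands for inspecting a stuck job and for the manual workaround (the job name is a placeholder; clearing the finalizers by patch has the same effect as the kubectl edit mentioned above):

```sh
# Inspect a stuck job: shows its deletionTimestamp and finalizers (name is illustrative).
kubectl get job <stuck-job-name> \
  -o jsonpath='{.metadata.deletionTimestamp}{"\n"}{.metadata.finalizers}{"\n"}'

# Workaround sketch: dropping the finalizers lets the pending deletion complete,
# equivalent to removing them with kubectl edit.
kubectl patch job <stuck-job-name> --type=merge -p '{"metadata":{"finalizers":null}}'
```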
This could affect not only cronjobs. We had a similar situation with replication controllers before, but we made some changes in our software to mitigate this and I am unable to verify that now.
Environment:
Kubernetes version (use kubectl version):