[BUG] koord-manager failed to renew lease #2096
Comments
@b43646 Do you mean the koord-manager restarts when the PodGroup is submitted?
@saintube It seems the pod restarted three days after the PodGroup was submitted, because the koord-manager failed to renew the lease.
@b43646 This might be a common issue with the cluster environment, where the koord-manager's requests to the apiserver time out. Is there any additional clue showing that the koord-manager is working abnormally, so we can investigate?
@saintube The network is fluctuating and unstable. Does the koord-manager have a retry mechanism when renewing the lease?
@b43646 Yes. The koord-manager uses the leader election mechanism of controller-runtime, so you can see that the manager is still working after losing the lease twice.
What happened:
After running for 3 days, the koord-manager pod was found to have restarted.
What you expected to happen:
The koord-manager pod should keep running stably without any abnormal restarts.
How to reproduce it (as minimally and precisely as possible):
The description of koord-manager-74866c758d-pjjhw is as follows:
The relevant log information is as follows:
The lease information is as follows:
Anything else we need to know?:
Environment:
kubectl version