[Regression] MaxRetryTimeout should be respected while scaling machineDeployment #213

Closed
2 tasks done
himanshu-kun opened this issue May 5, 2023 · 2 comments · Fixed by #216
Labels:
- kind/bug: Bug
- priority/critical: Needs to be resolved soon, because it impacts users negatively
- status/closed: Issue is closed (either delivered or triaged)

himanshu-kun commented May 5, 2023

What happened:
Currently the retry deadline of 1 minute is not respected, so the CA's mcm implementation never gives up and keeps trying to scale the machineDeployment as requested by the CA's core logic.
This leads to the CA never removing the ToBeDeletedTaint from the nodes, and, due to an upstream bug, those nodes are considered upcoming nodes.

As a result, the pods stay in the Pending state.

What you expected to happen:
The CA's mcm implementation should respect MaxRetryTimeout.

How to reproduce it (as minimally and precisely as possible):

Anything else we need to know:

This occurred because of a regression introduced in #160, where retryDeadline is updated on every machineDeployment update failure, leading to an infinite deadline. Find the code here. Also, since the machineDeployment is never re-fetched, the update always fails on the apiserver.

#160 has already been patched and released on branches as far as rel-v1.21, so these patch branches need to be updated as well.

The following should be part of the PR that fixes this issue:

Environment:

@himanshu-kun himanshu-kun added the kind/bug Bug label May 5, 2023
himanshu-kun (author) commented:

/assign @rishabh-11 @himanshu-kun

himanshu-kun (author) commented:

/priority critical

3 participants