🐛 make clusterctl REST client throttling configurable #8411
Conversation
This seems like a good idea to me - one question about the use case though - as long as you don't mind sharing 🙂
Is there a reason there are so many additional CRDs installed before `clusterctl init`?
Sure thing. In CAPZ we're considering adopting Azure Service Operator (ASO) to create/update/delete individual resources in Azure (CAPZ proposal), and ASO defines one CRD per Azure resource type. And right now there's not a great way to install only the ASO CRDs we need.
Thanks for explaining - definitely seems reasonable. The only question is whether to default the burst higher or make it configurable with a flag / config file. I think clusterctl maintainers and reviewers will have clearer opinions on that.
Should this be classified as a bug fix and considered for cherry-pick?
we could do both? I think a higher default is sane, given a user installing several infra providers could quickly run into these limits (and that the current value is lower than the kubectl default, kubernetes/kubernetes#105520)
I think it's a good idea to cherry-pick for sure. @nojnhuh could you add a command line flag making this configurable?
Yep, will do. Should I do the same for the rest config's `QPS`?
Sounds good to me.
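For context, these are the two client-side throttling knobs on client-go's `rest.Config`. A minimal sketch of raising them, assuming an ordinary kubeconfig-based setup; the QPS value is illustrative, while the Burst value mirrors the kubectl fix referenced above:

```go
package main

import (
	"k8s.io/client-go/rest"
	"k8s.io/client-go/tools/clientcmd"
)

// buildThrottledConfig loads a kubeconfig and raises client-go's client-side
// rate limits. Burst = 300 matches the kubectl change in
// kubernetes/kubernetes#105520; QPS = 50 is an illustrative value.
func buildThrottledConfig(kubeconfigPath string) (*rest.Config, error) {
	cfg, err := clientcmd.BuildConfigFromFlags("", kubeconfigPath)
	if err != nil {
		return nil, err
	}
	cfg.QPS = 50    // sustained requests per second before throttling kicks in
	cfg.Burst = 300 // short-term burst allowance above QPS
	return cfg, nil
}
```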
Had to add most of the plumbing to expose these as CLI flags, which kinda blew up this PR. Definitely took a hit to my keyboard's Cmd, C, and V keys, but it works at least.
/retitle 🐛 make clusterctl REST client throttling configurable
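A rough sketch of the kind of flag plumbing being described here; the flag names, defaults, and wiring are illustrative assumptions, not necessarily what this PR uses:

```go
package main

import "github.com/spf13/cobra"

// Hypothetical flag variables; the PR's actual names may differ.
var (
	kubeAPIQPS   float32
	kubeAPIBurst int
)

// addThrottlingFlags registers REST client throttling flags on a command so
// their values can be threaded down into the rest.Config clusterctl builds.
func addThrottlingFlags(cmd *cobra.Command) {
	cmd.PersistentFlags().Float32Var(&kubeAPIQPS, "kube-api-qps", 50,
		"Maximum queries per second to the management cluster API server")
	cmd.PersistentFlags().IntVar(&kubeAPIBurst, "kube-api-burst", 300,
		"Maximum burst of queries allowed to the management cluster API server")
}
```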
@killianmuldoon Thanks for taking a look so far! Could you help assign folks you think might be good to review this?
/cc @Jont828
lgtm
Is there anywhere in the book these options should be documented?
I don't see anything like generated documentation for every flag on every command, so maybe the built-in `--help` is enough here?
Increasing the defaults to match upstream (kubectl) sounds reasonable.
The overall approach looks good. I will test it later this week.
> I don't see anything like generated documentation for every flag on every command, so maybe the built-in --help is enough here?

Correct. We do not have any explicit documentation for the flags of all the commands. The flag descriptions included in the built-in `--help` are enough.
/assign @ykakarap
Friendly ping on this if you wouldn't mind taking a look.
Test flake already has an issue open: #8478
I wonder if an alternative to the current implementation in this PR is to use the lazy restmapper that CR introduced as an experimental feature (and we exposed as an experimental feature on our controllers). The lazy restmapper only discovers CRDs on demand, when they are actually used. This way we could just reduce the number of requests instead of changing the throttling. xref:
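For reference, a minimal sketch of opting into that experimental lazy restmapper, assuming the controller-runtime v0.14-era `apiutil.WithExperimentalLazyMapper` option (per the discussion below, v0.15 makes this behavior the default, so no option would be needed there):

```go
package main

import (
	"k8s.io/client-go/rest"
	"sigs.k8s.io/controller-runtime/pkg/client"
	"sigs.k8s.io/controller-runtime/pkg/client/apiutil"
)

// newLazyClient builds a controller-runtime client whose restmapper discovers
// API group/versions on first use instead of enumerating every CRD up front,
// reducing the number of discovery requests on CRD-heavy clusters.
func newLazyClient(cfg *rest.Config) (client.Client, error) {
	mapper, err := apiutil.NewDynamicRESTMapper(cfg, apiutil.WithExperimentalLazyMapper)
	if err != nil {
		return nil, err
	}
	return client.New(cfg, client.Options{Mapper: mapper})
}
```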
@sbueringer Do you think it might make sense to integrate the lazy restmapper separately so users can take advantage of the increased throttling thresholds without having to opt in to experimental functionality?
I'm fine with it, but I would like to leave the decision to clusterctl maintainers. I just thought it's better to make it unnecessary to make throttling configurable by optimizing the calls. While the lazy restmapper is experimental, we've been using it for ~1-2 months in all our e2e test jobs (1.4 + main IIRC) without any problems. It also looks like it would help us avoid having to make changes in a lot of places to make it configurable. To be honest, it feels like something that shouldn't need to be configurable given what clusterctl is doing (deploying providers).
^^ @vincepri WDYT?
New changes are detected. LGTM label has been removed.
Thought experiment: Let's assume the lazy restmapper becomes the default/only restmapper in CR v0.15 (to which we will bump in 2 weeks). Would we still want the flags additionally?
Context: I'm asking because we're about to make the lazy restmapper the default restmapper: kubernetes-sigs/controller-runtime#2296
Given that it's becoming the default in two weeks, the lazy restmapper makes more and more sense...
We'll have it as soon as #8007 is merged, so we could try how it changes the situation for clusterctl.
@sbueringer Do you expect that as soon as controller-runtime is bumped, CAPI will automatically start using the lazy restmapper and this issue I was originally running into might go away entirely? Or will there be additional work to take advantage of the lazy restmapper?
Looks like after bumping controller-runtime (#8007) the clusterctl client should default to using the lazy restmapper. It would be better to perform some tests (similar to how you got the initial metrics of 5-6m) to see if it addresses the need.
Thanks @ykakarap, I'll retest once the controller-runtime version bump merges. AFAICT kubectl doesn't make these values configurable, so I think we can reasonably leave the values in clusterctl hardcoded to match kubectl, at least for the short to medium term. If increasing the default burst does continue to make things significantly faster with the lazy restmapper, I'd be in favor of making that change by itself and making the values configurable, if need be, in a separate PR. I'll follow up here once I've done that testing with controller-runtime 0.15.
Sounds good. Thank you :)
Sounds good. As of controller-runtime v0.15 the lazy restmapper will be the only restmapper, so it will be used out of the box.
PR needs rebase.
@nojnhuh Just fyi, CR bump has been merged now if you want to test again :)
Just tried this out and it's definitely lots better than before!
/close
@nojnhuh: Closed this PR.
What this PR does / why we need it:
When running commands like `clusterctl init` on a cluster with many CRDs installed, the API discovery process frequently gets throttled client-side, causing operations to take significantly longer. On a fresh kind cluster, I was seeing `clusterctl init` take about 30s, but when ~100 CRDs are installed it took 5-6m. In this blog post from Upbound (maintainers of Crossplane), they mention they ran into the same problem with `kubectl` and contributed a fix upstream to bump the `rest.Config`'s `Burst` from 100 to 300. With the same change applied here, I see the same `clusterctl init` command go from taking 5-6m to under 1m, with minimal log messages indicating client-side throttling is being applied.

Which issue(s) this PR fixes (optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged):
Fixes #
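As an aside on the mechanics above: full API discovery issues roughly one request per API group/version, so ~100 CRDs spread across many groups can exhaust a burst of 100 (the prior value mentioned above) almost immediately. A small sketch that surfaces those numbers, using only standard client-go discovery calls:

```go
package main

import (
	"fmt"

	"k8s.io/client-go/discovery"
	"k8s.io/client-go/rest"
)

// countDiscovery performs full discovery and reports how many group/version
// resource lists were fetched; each list is a separate API request that
// counts against the client's QPS/Burst budget.
func countDiscovery(cfg *rest.Config) error {
	dc, err := discovery.NewDiscoveryClientForConfig(cfg)
	if err != nil {
		return err
	}
	groups, resourceLists, err := dc.ServerGroupsAndResources()
	if err != nil {
		return err
	}
	fmt.Printf("%d API groups, %d group/version resource lists fetched\n",
		len(groups), len(resourceLists))
	return nil
}
```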