increase controller concurrency and cpu requests #2862

Merged: 1 commit, Aug 24, 2023

mhenriks
Member

What this PR does / why we need it:

This commit raises the CPU request for all our installed components (cdi-deployment, cdi-apiserver, cdi-uploadproxy, cdi-operator) from 10m (1% of a core) to 100m (10% of a core).
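
To make the millicore arithmetic concrete, here is a minimal sketch in plain Go. `parseMilliCPU` and `percentOfCore` are hypothetical helpers written for illustration only; they are not part of CDI or Kubernetes, and the parser is a simplified stand-in that understands only whole cores and the `m` suffix.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseMilliCPU converts a CPU quantity such as "100m" or "2" into millicores.
// Hypothetical helper: it handles only plain integers and the "m" suffix,
// which is a small subset of what Kubernetes quantities allow.
func parseMilliCPU(q string) (int64, error) {
	if strings.HasSuffix(q, "m") {
		return strconv.ParseInt(strings.TrimSuffix(q, "m"), 10, 64)
	}
	cores, err := strconv.ParseInt(q, 10, 64)
	if err != nil {
		return 0, err
	}
	return cores * 1000, nil
}

// percentOfCore expresses millicores as a percentage of one core
// (1000 millicores = 100%).
func percentOfCore(milli int64) float64 {
	return float64(milli) / 10.0
}

func main() {
	for _, q := range []string{"10m", "100m"} {
		milli, err := parseMilliCPU(q)
		if err != nil {
			panic(err)
		}
		fmt.Printf("%s = %d millicores = %.0f%% of a core\n", q, milli, percentOfCore(milli))
	}
}
```

Running it prints `10m = 10 millicores = 1% of a core` followed by `100m = 100 millicores = 10% of a core`, matching the figures quoted above.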

It also makes all controllers handle 3 concurrent requests.

Without this change, it is pretty easy to create a large number of concurrent clone operations and get token timeout errors. Upping resource requests and concurrency addresses the issue in a very direct way.

I experimented with a bunch of other approaches, including creating a controller just for refreshing tokens, but this is the lowest-touch (code-wise) and most effective solution.

Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):
Fixes # https://bugzilla.redhat.com/show_bug.cgi?id=2216038

Special notes for your reviewer:

Release note:

Increase deployment CPU requests to 100m. Configure controllers to handle concurrent requests.

@kubevirt-bot added labels on Aug 22, 2023: release-note; dco-signoff: yes; size/M
This commit raises the CPU request for all our installed components
(cdi-deployment, cdi-apiserver, cdi-uploadproxy, cdi-operator)
from 10m (1% of a core) to 100m (10% of a core).
The main driver of this is BZ: 2216038.
Without this change, it is pretty easy to create a large number of
concurrent clone operations and get token timeout errors.
Upping resource requests and concurrency addresses the issue
in a very direct way.

Signed-off-by: Michael Henriksen <mhenriks@redhat.com>
@awels
Member

awels commented Aug 22, 2023

/lgtm

@kubevirt-bot added the lgtm label on Aug 22, 2023
delta := time.Since(syncState.dv.ObjectMeta.CreationTimestamp.Time)
log.V(3).Info("Adding extended DataVolume token took", "delta", delta)
}
syncState.dv = syncState.dvMutated.DeepCopy()
Collaborator

Everything looks good but I'm wondering about this section, does this introduce new behavior? Should we test it?

Member Author

In my testing this appeared to reduce updateStatus failures/retries. The point is that once the dv has been updated, the "baseline" and mutated DV are the same.

Collaborator

@akalenyu akalenyu Aug 23, 2023

Weird, syncUpdate is usually the last thing we do in the loop. How come something else runs updates after it?
EDIT:
nvm, I see we call updateStatus after it. Makes sense.
But updateStatus seems to work on a fresh copy of the DV from the cache, which should have the updates.

Member Author

Good point! But the fact is that the "fresh copy" from the cache is often stale (the informer hasn't received the update yet). It would be better if updateStatus actually used the copy returned by update in that case. I swear that adding this assignment was reducing errors/retries. Seems unlikely, but maybe the additional time spent on the assignment gives the informer more time to update.

Anyway, worst case this assignment has no actual effect. I can remove it if you'd like, but it does seem more correct to me that "dv" and "dvMutated" should be equal at that point. The fact that neither value is referenced again is kind of beside the point. That may change in the future.
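
The baseline/mutated idea discussed here can be sketched with standard-library Go. Everything below is hypothetical and simplified: `dataVolume`, `deepCopy`, `needsUpdate`, and the `update` callback are stand-ins invented for illustration, not the real CDI types or client calls. The sketch only shows why re-baselining after a successful update leaves the two copies equal.

```go
package main

import (
	"fmt"
	"reflect"
)

// dataVolume is a hypothetical, minimal stand-in for the real DataVolume type.
type dataVolume struct {
	Name        string
	Annotations map[string]string
}

// deepCopy returns an independent copy so later edits do not alias the original.
func (d *dataVolume) deepCopy() *dataVolume {
	out := &dataVolume{Name: d.Name}
	if d.Annotations != nil {
		out.Annotations = make(map[string]string, len(d.Annotations))
		for k, v := range d.Annotations {
			out.Annotations[k] = v
		}
	}
	return out
}

// syncState mirrors the dv/dvMutated pair from the snippet under review.
type syncState struct {
	dv        *dataVolume // baseline: what we believe has been persisted
	dvMutated *dataVolume // working copy modified during the sync
}

// needsUpdate reports whether the working copy differs from the baseline.
func (s *syncState) needsUpdate() bool {
	return !reflect.DeepEqual(s.dv, s.dvMutated)
}

// syncUpdate persists the working copy through the supplied update function
// (an assumed stand-in for a client update call) and then re-baselines,
// so that dv and dvMutated are equal again afterwards.
func (s *syncState) syncUpdate(update func(*dataVolume) error) error {
	if !s.needsUpdate() {
		return nil
	}
	if err := update(s.dvMutated); err != nil {
		return err
	}
	s.dv = s.dvMutated.deepCopy()
	return nil
}

func main() {
	base := &dataVolume{Name: "example-dv", Annotations: map[string]string{}}
	state := &syncState{dv: base, dvMutated: base.deepCopy()}

	// Simulate the sync adding an extended token annotation (made-up key).
	state.dvMutated.Annotations["example/token"] = "extended"
	fmt.Println("needs update before sync:", state.needsUpdate())

	if err := state.syncUpdate(func(*dataVolume) error { return nil }); err != nil {
		panic(err)
	}
	fmt.Println("needs update after sync:", state.needsUpdate())
}
```

This prints `true` before the sync and `false` after it. Without the re-baseline assignment, any later comparison of the two copies in the same pass would still report a difference that has already been written.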

Collaborator

@akalenyu akalenyu Aug 23, 2023

> That may change in the future

That is my concern; if we ever have to look at the dv/dvMutated diff to compute some status transition,
this may throw us off.

Member Author

I don't think we should spend a lot of time wondering about hypotheticals. But what are you suggesting we do here @akalenyu?

Collaborator

Yeah, I guess that's not very likely. I am fine with keeping it.

@akalenyu
Collaborator

/approve

@kubevirt-bot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: akalenyu

@kubevirt-bot added the approved label on Aug 23, 2023
@kubevirt-bot merged commit cc8dbc3 into kubevirt:main on Aug 24, 2023
18 checks passed
@mhenriks
Member Author

/cherry-pick release-v1.57

@kubevirt-bot
Contributor

@mhenriks: new pull request created: #2867
