Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Work around common snapshot controller issue in tests #2976

Merged
merged 1 commit into from
Nov 22, 2023

Conversation

akalenyu
Copy link
Collaborator

@akalenyu akalenyu commented Nov 9, 2023

What this PR does / why we need it:
This issue prevents the common csi snapshot controller from removing it's PVC finalizer:
kubernetes-csi/external-snapshotter#957

This causes us some pain in testing:
https://search.ci.kubevirt.io/?search=dataimportcron_test.go%3A138&maxAge=336h&context=1&type=bug%2Bissue%2Bjunit&name=&excludeName=&maxMatches=5&maxBytes=20971520&groupBy=job
so let's try working around it.

Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):
Fixes #

Special notes for your reviewer:

Release note:

NONE

@kubevirt-bot kubevirt-bot added release-note-none Denotes a PR that doesn't merit a release note. dco-signoff: yes Indicates the PR's author has DCO signed all their commits. labels Nov 9, 2023
@akalenyu
Copy link
Collaborator Author

akalenyu commented Nov 9, 2023

/cc @awels @jpeimer
we've talked about this before, it started to get worse so I took a stab at it..

// work around https://github.com/kubernetes-csi/external-snapshotter/issues/957
// it does converge after the resync period of snapshot controller (15mins)
cc.AddAnnotation(snapshot, "workaround", "triggersync")
err = f.CrClient.Update(context.TODO(), snapshot)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So you are essentially just re-triggering a reconcile so it cleans up, what happens if that gets the same error, maybe retry a few times instead of just once? I don't know if that is overkill or not.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah good call I am not sure. It fails on

E1109 05:15:43.755196       1 snapshot_controller.go:191] error check and remove PVC finalizer for snapshot [datasource-test-63c30e708034]: snapshot controller failed to update datasource-test-63c30e708034 on API server: Operation cannot be fulfilled on persistentvolumeclaims "datasource-test-63c30e708034": the object has been modified; please apply your changes to the latest version and try again

Copy link
Collaborator Author

@akalenyu akalenyu Nov 9, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we won't need to retry more since 30 seconds have passed so it had plenty of time to catch up

@akalenyu akalenyu force-pushed the wa-csi-snapshot-finalizer-bug branch from a8207c1 to 6b5abc8 Compare November 12, 2023 09:38
This issue prevents the common csi snapshot controller from removing it's
PVC finalizer:
kubernetes-csi/external-snapshotter#957

This causes us some pain in testing:
https://search.ci.kubevirt.io/?search=dataimportcron_test.go%3A138&maxAge=336h&context=1&type=bug%2Bissue%2Bjunit&name=&excludeName=&maxMatches=5&maxBytes=20971520&groupBy=job
so let's try working around it.

Signed-off-by: Alex Kalenyuk <akalenyu@redhat.com>
@akalenyu akalenyu force-pushed the wa-csi-snapshot-finalizer-bug branch from 6b5abc8 to db07eff Compare November 14, 2023 10:06
@awels
Copy link
Member

awels commented Nov 17, 2023

/test pull-containerized-data-importer-e2e-hpp-latest

@alromeros
Copy link
Collaborator

Seems that we can't do any better here, looks good. Maybe use a shorter timeout in the second waitPVCDeleted? Just a suggestion though.
/lgtm

@kubevirt-bot kubevirt-bot added the lgtm Indicates that a PR is ready to be merged. label Nov 20, 2023
@mhenriks
Copy link
Member

/approve

@kubevirt-bot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: mhenriks

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@kubevirt-bot kubevirt-bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Nov 22, 2023
@kubevirt-bot kubevirt-bot merged commit 8c34b57 into kubevirt:main Nov 22, 2023
21 checks passed
@akalenyu
Copy link
Collaborator Author

/cherrypick release-v1.58

@kubevirt-bot
Copy link
Contributor

@akalenyu: new pull request created: #2996

In response to this:

/cherrypick release-v1.58

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@akalenyu
Copy link
Collaborator Author

/cherrypick release-v1.57

@kubevirt-bot
Copy link
Contributor

@akalenyu: new pull request created: #2997

In response to this:

/cherrypick release-v1.57

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. dco-signoff: yes Indicates the PR's author has DCO signed all their commits. lgtm Indicates that a PR is ready to be merged. release-note-none Denotes a PR that doesn't merit a release note. size/S
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants