Bug 1993934: Merge upstream v4.2.0 #56

Merged
openshift-merge-robot merged 67 commits into openshift:master
Aug 20, 2021

Conversation

bertinatto
Member

@bertinatto bertinatto commented Aug 17, 2021

pohly and others added 30 commits May 5, 2021 14:47
kubernetes-csi/csi-release-tools@6616a6b5 Merge kubernetes-csi/csi-release-tools#146 from pohly/kubernetes-1.21
kubernetes-csi/csi-release-tools@510fb0f9 prow.sh: support Kubernetes 1.21
kubernetes-csi/csi-release-tools@c63c61b3 prow.sh: add CSI_PROW_DEPLOYMENT_SUFFIX
kubernetes-csi/csi-release-tools@51ac11c3 Merge kubernetes-csi/csi-release-tools#144 from pohly/pull-jobs
kubernetes-csi/csi-release-tools@dd54c926 pull-test.sh: test importing csi-release-tools into other repo
kubernetes-csi/csi-release-tools@7d2643a5 Merge kubernetes-csi/csi-release-tools#143 from pohly/path-setup
kubernetes-csi/csi-release-tools@6880b0c8 prow.sh: avoid creating paths unless really running tests
kubernetes-csi/csi-release-tools@bc0504ad Merge kubernetes-csi/csi-release-tools#140 from jsafrane/remove-unused-k8s-libs
kubernetes-csi/csi-release-tools@5b1de1ad go-get-kubernetes.sh: remove unused k8s libs
kubernetes-csi/csi-release-tools@49b42693 Merge kubernetes-csi/csi-release-tools#120 from pohly/add-kubernetes-release
kubernetes-csi/csi-release-tools@f7e7ee49 docs: steps for adding testing against new Kubernetes release

git-subtree-dir: release-tools
git-subtree-split: 6616a6b
…esnapshots-request

Update volumesnapshots request to list across all namespaces
kubernetes-csi/csi-release-tools@f3255906 Merge kubernetes-csi/csi-release-tools#149 from pohly/cluster-logs
kubernetes-csi/csi-release-tools@4b03b308 Merge kubernetes-csi/csi-release-tools#155 from pohly/owners
kubernetes-csi/csi-release-tools@a6453c86 owners: introduce aliases
kubernetes-csi/csi-release-tools@ad83def4 Merge kubernetes-csi/csi-release-tools#153 from pohly/fix-image-builds
kubernetes-csi/csi-release-tools@55617801 build.make: fix image publishng
kubernetes-csi/csi-release-tools@29bd39b3 Merge kubernetes-csi/csi-release-tools#152 from pohly/bump-csi-test
kubernetes-csi/csi-release-tools@bc427931 prow.sh: use csi-test v4.2.0
kubernetes-csi/csi-release-tools@b546baaf Merge kubernetes-csi/csi-release-tools#150 from mauriciopoppe/windows-multiarch-args
kubernetes-csi/csi-release-tools@bfbb6f35 add parameter base_image and addon_image to BUILD_PARAMETERS
kubernetes-csi/csi-release-tools@2d61d3bc Merge kubernetes-csi/csi-release-tools#151 from humblec/cm
kubernetes-csi/csi-release-tools@48e71f06 Replace `which` command ( non standard)  with `command -v` builtin
kubernetes-csi/csi-release-tools@feb20e26 prow.sh: collect cluster logs
kubernetes-csi/csi-release-tools@7b96bea3 Merge kubernetes-csi/csi-release-tools#148 from dobsonj/add-checkpathcmd-to-prow
kubernetes-csi/csi-release-tools@2d2e03b7 prow.sh: enable -csi.checkpathcmd option in csi-sanity
kubernetes-csi/csi-release-tools@09d41512 Merge kubernetes-csi/csi-release-tools#147 from pohly/mock-testing
kubernetes-csi/csi-release-tools@74cfbc97 prow.sh: support mock tests
kubernetes-csi/csi-release-tools@4a3f1103 prow.sh: remove obsolete test suppression

git-subtree-dir: release-tools
git-subtree-split: f325590
Two new timeout values ( retryIntervalStart & retryIntervalMax )
have been added to set the ratelimiter for volumesnapshotcontent queue.

Fix# kubernetes-csi#463

Signed-off-by: Humble Chirammal <hchiramm@redhat.com>

```release-note
 The `retry-interval-start` and `retry-interval-max` arguments are added to the csi-snapshotter sidecar. They control the retry interval of failed volume snapshot creation and deletion by setting the rate limiter for the volumesnapshotcontent queue.
```

Signed-off-by: Humble Chirammal <hchiramm@redhat.com>
Signed-off-by: Humble Chirammal <hchiramm@redhat.com>
correct snapshot controller installation doc
Add ability to customize VolumeSnapshotContent workqueue
… retryIntervalMax`

This patch adds two new parameters `retryIntervalStart & retryIntervalMax`
which can be configured to adjust the ratelimiters of snapshotqueue and contentqueue
in the controller.

Signed-off-by: Humble Chirammal <hchiramm@redhat.com>

```release-note
 The `retry-interval-start` and `retry-interval-max` arguments are added to the common-controller.
  They control the retry interval of failed volume snapshot creation and deletion
  by setting the rate limiter for the snapshot and content queues.
```

Signed-off-by: Humble Chirammal <hchiramm@redhat.com>
Allow tuning common-controller Ratelimiter with `retryIntervalStart & retryIntervalMax`
Signed-off-by: Grant Griffiths <ggriffiths@purestorage.com>
…ngcreated_returncontent

Return VolumeSnapshotContent from various functions instead of nil
…space

Add VS namespace to VSC printed columns
…config

Add command line arguments to configure leader election options
update setup-csi-snapshotter.yaml csi-snapshotter image
kubernetes-csi/csi-release-tools@c0a4fb1d Merge kubernetes-csi/csi-release-tools#164 from anubha-v-ardhan/patch-1
kubernetes-csi/csi-release-tools@9c6a6c08 Master to main cleanup
kubernetes-csi/csi-release-tools@682c686a Merge kubernetes-csi/csi-release-tools#162 from pohly/pod-name-via-shell-command
kubernetes-csi/csi-release-tools@36a29f5c Merge kubernetes-csi/csi-release-tools#163 from pohly/remove-bazel
kubernetes-csi/csi-release-tools@68e43ca7 prow.sh: remove Bazel build support
kubernetes-csi/csi-release-tools@c5f59c5a prow.sh: allow shell commands in CSI_PROW_SANITY_POD
kubernetes-csi/csi-release-tools@71c810ab Merge kubernetes-csi/csi-release-tools#161 from pohly/mock-test-fixes
kubernetes-csi/csi-release-tools@9e438f8e prow.sh: fix mock testing
kubernetes-csi/csi-release-tools@d7146c79 Merge kubernetes-csi/csi-release-tools#160 from pohly/kind-update
kubernetes-csi/csi-release-tools@4b6aa609 prow.sh: update to KinD v0.11.0
kubernetes-csi/csi-release-tools@7cdc76f3 Merge kubernetes-csi/csi-release-tools#159 from pohly/fix-deployment-selection
kubernetes-csi/csi-release-tools@ef8bd33b prow.sh: more flexible CSI_PROW_DEPLOYMENT, part II
kubernetes-csi/csi-release-tools@204bc89c Merge kubernetes-csi/csi-release-tools#158 from pohly/fix-deployment-selection
kubernetes-csi/csi-release-tools@61538bb7 prow.sh: more flexible CSI_PROW_DEPLOYMENT
kubernetes-csi/csi-release-tools@2b0e6db9 Merge kubernetes-csi/csi-release-tools#157 from humblec/csi-release
kubernetes-csi/csi-release-tools@a2fcd6de Adding myself to csi reviewers group

git-subtree-dir: release-tools
git-subtree-split: c0a4fb1
@openshift-ci

openshift-ci bot commented Aug 17, 2021

@bertinatto: An error was encountered querying GitHub for users with public email (wduan@redhat.com) for bug 1993934 on the Bugzilla server at https://bugzilla.redhat.com. No known errors were detected, please see the full error message for details.

Full error message. non-200 OK status code: 403 Forbidden body: "{\n \"documentation_url\": \"https://docs.github.com/en/free-pro-team@latest/rest/overview/resources-in-the-rest-api#secondary-rate-limits\",\n \"message\": \"You have exceeded a secondary rate limit. Please wait a few minutes before you try again.\"\n}\n"

Please contact an administrator to resolve this issue, then request a bug refresh with /bugzilla refresh.

In response to this:

/bugzilla refresh

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@bertinatto
Member Author

/bugzilla refresh

@openshift-ci openshift-ci bot added the bugzilla/valid-bug Indicates that a referenced Bugzilla bug is valid for the branch this PR is targeting. label Aug 17, 2021
@openshift-ci

openshift-ci bot commented Aug 17, 2021

@bertinatto: This pull request references Bugzilla bug 1993934, which is valid.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target release (4.9.0) matches configured target release for branch (4.9.0)
  • bug is in the state POST, which is one of the valid states (NEW, ASSIGNED, ON_DEV, POST, POST)

No GitHub users were found matching the public email listed for the QA contact in Bugzilla (wduan@redhat.com), skipping review request.

In response to this:

/bugzilla refresh

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@openshift-ci openshift-ci bot removed the bugzilla/invalid-bug Indicates that a referenced Bugzilla bug is invalid for the branch this PR is targeting. label Aug 17, 2021
@jsafrane

/verify-owners

@jsafrane

/test e2e-gcp-csi
/test e2e-aws-csi
I don't like the snapshot failures

@jsafrane

/test e2e-gcp-csi
/test e2e-aws-csi
just to be sure...

@bertinatto
Member Author

/test all

@bertinatto
Member Author

/test e2e-gcp-csi

@jsafrane

/test e2e-gcp-csi
/test e2e-aws-csi

@jsafrane

Something is wrong. In run https://prow.ci.openshift.org/view/gs/origin-ci-test/pr-logs/pull/openshift_csi-external-snapshotter/56/pull-ci-openshift-csi-external-snapshotter-master-e2e-gcp-csi/1428026187141091328

  • A test wants to create a snapshot at 16:51:34.497:
Aug 18 16:51:34.497: INFO: Waiting up to 5m0s for VolumeSnapshot snapshot-9qr8s to become ready

@jsafrane

Reproduced on a cluster built from this PR. Just run snapshot tests 5x in parallel:

for i in `seq 5`; do ( TEST_CSI_DRIVER_FILES=manifest.yaml ./openshift-tests run openshift/csi  --run=VolumeSnapshotDataSource  |& tee log$i ) & done

Some tests timed out. Looking at the snapshot-controller container, all its snapshotWorker goroutines get stuck at:

goroutine 113 [semacquire, 13 minutes]:
sync.runtime_SemacquireMutex(0xc0004f59ec, 0x0, 0x1)
        /usr/lib/golang/src/runtime/sema.go:71 +0x47
sync.(*Mutex).lockSlow(0xc0004f59e8)
        /usr/lib/golang/src/sync/mutex.go:138 +0x105
sync.(*Mutex).Lock(...)
        /usr/lib/golang/src/sync/mutex.go:81
github.com/kubernetes-csi/external-snapshotter/v4/pkg/metrics.(*operationMetricsManager).OperationStart(0xc0004f59e0, 0x19aba21, 0xe, 0xc000703e60, 0x24, 0xc0004b6050, 0xf, 0x19a600c, 0x7, 0x0, ...)
        /go/src/github.com/kubernetes-csi/external-snapshotter/pkg/metrics/metrics.go:186 +0x217
github.com/kubernetes-csi/external-snapshotter/v4/pkg/common-controller.(*csiSnapshotCommonController).processSnapshotWithDeletionTimestamp(0xc0005980f0, 0xc000732f00, 0xc000732f00, 0x0)
        /go/src/github.com/kubernetes-csi/external-snapshotter/pkg/common-controller/snapshot_controller.go:252 +0x3cf
github.com/kubernetes-csi/external-snapshotter/v4/pkg/common-controller.(*csiSnapshotCommonController).syncSnapshot(0xc0005980f0, 0xc000732f00, 0x19723a0, 0xc000732f00)
        /go/src/github.com/kubernetes-csi/external-snapshotter/pkg/common-controller/snapshot_controller.go:205 +0x1175
github.com/kubernetes-csi/external-snapshotter/v4/pkg/common-controller.(*csiSnapshotCommonController).updateSnapshot(0xc0005980f0, 0xc000732f00, 0x0, 0x0)
        /go/src/github.com/kubernetes-csi/external-snapshotter/pkg/common-controller/snapshot_controller_base.go:374 +0x326
github.com/kubernetes-csi/external-snapshotter/v4/pkg/common-controller.(*csiSnapshotCommonController).syncSnapshotByKey(0xc0005980f0, 0xc000703f50, 0x24, 0xc00044e000, 0x1)
        /go/src/github.com/kubernetes-csi/external-snapshotter/pkg/common-controller/snapshot_controller_base.go:230 +0xed7
github.com/kubernetes-csi/external-snapshotter/v4/pkg/common-controller.(*csiSnapshotCommonController).snapshotWorker(0xc0005980f0)
        /go/src/github.com/kubernetes-csi/external-snapshotter/pkg/common-controller/snapshot_controller_base.go:195 +0xf8
k8s.io/apimachinery/pkg/util/wait.BackoffUntil.func1(0xc0003c8000)
        /go/src/github.com/kubernetes-csi/external-snapshotter/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:155 +0x5f
k8s.io/apimachinery/pkg/util/wait.BackoffUntil(0xc0003c8000, 0x1b91080, 0xc00024c030, 0x1, 0xc00044e060)
        /go/src/github.com/kubernetes-csi/external-snapshotter/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:156 +0x9b
k8s.io/apimachinery/pkg/util/wait.JitterUntil(0xc0003c8000, 0x0, 0x0, 0x1, 0xc00044e060)
        /go/src/github.com/kubernetes-csi/external-snapshotter/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:133 +0x98
k8s.io/apimachinery/pkg/util/wait.Until(0xc0003c8000, 0x0, 0xc00044e060)
        /go/src/github.com/kubernetes-csi/external-snapshotter/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:90 +0x4d
created by github.com/kubernetes-csi/external-snapshotter/v4/pkg/common-controller.(*csiSnapshotCommonController).Run
        /go/src/github.com/kubernetes-csi/external-snapshotter/pkg/common-controller/snapshot_controller_base.go:146 +0x305
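
The source does not say how this dump was collected; sending SIGQUIT to a Go process is one common way to get one. As a rough sketch of producing a dump in the same format from inside a program, the standard `runtime/pprof` goroutine profile can be written with debug level 2. The helper name `dumpGoroutines` and the blocked worker are made up for the illustration:

```go
package main

import (
	"bytes"
	"fmt"
	"runtime/pprof"
	"sync"
	"time"
)

// dumpGoroutines returns the stacks of all goroutines. With debug level 2 the
// output uses the "goroutine N [state]:" layout that the Go runtime prints
// for an unrecovered panic.
func dumpGoroutines() string {
	var buf bytes.Buffer
	if err := pprof.Lookup("goroutine").WriteTo(&buf, 2); err != nil {
		return "failed to dump goroutines: " + err.Error()
	}
	return buf.String()
}

func main() {
	var mu sync.Mutex
	mu.Lock() // held for the rest of the program so the worker below blocks

	go func() {
		mu.Lock() // blocks here; this goroutine shows up as waiting in the dump
		defer mu.Unlock()
	}()

	// Give the worker time to reach the blocked Lock call.
	time.Sleep(100 * time.Millisecond)
	fmt.Print(dumpGoroutines())
}
```

The exact wait-state label of the blocked goroutine depends on the Go version, so the sketch does not rely on a specific string such as `semacquire`.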

@jsafrane

jsafrane commented Aug 19, 2021

I am testing a fix in #57: metrics.go tries to acquire a mutex twice in the same goroutine, and Go mutexes are not recursive.

@jsafrane

jsafrane commented Aug 19, 2021

Reported upstream in kubernetes-csi#580. We may need to introduce kubernetes-csi#581 here. I hope upstream is going to release a new version soon.

RecordMetrics() grabs a mutex and calls recordCancelMetric(), which wants to
grab the same mutex. Go mutexes are not recursive, so recordCancelMetric
blocks forever.

recordCancelMetric should not grab the mutex; it can rely on the
caller already holding it.
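
The pattern described in this commit message can be sketched as follows. This is a simplified illustration, not the actual metrics.go code: the `manager` type, its field, and the `Locked` suffix on the helper are assumptions made for the example. It uses `TryLock` (available since Go 1.18) only to show that a second `Lock` on a held mutex would block, without actually hanging the program:

```go
package main

import (
	"fmt"
	"sync"
)

// manager is a hypothetical stand-in for a metrics manager guarded by one mutex.
type manager struct {
	mu        sync.Mutex
	cancelled int
}

// RecordMetrics is the public entry point; it takes the lock exactly once.
func (m *manager) RecordMetrics(cancel bool) {
	m.mu.Lock()
	defer m.mu.Unlock()
	if cancel {
		m.recordCancelLocked()
	}
}

// recordCancelLocked must be called with m.mu already held. It does not lock
// again, because sync.Mutex is not reentrant and a second Lock would block
// forever.
func (m *manager) recordCancelLocked() {
	m.cancelled++
}

// relockWouldBlock reports whether locking an already held mutex again would
// block. TryLock is used so the demonstration itself cannot deadlock.
func relockWouldBlock() bool {
	var mu sync.Mutex
	mu.Lock()
	defer mu.Unlock()
	if mu.TryLock() {
		mu.Unlock()
		return false
	}
	return true
}

func main() {
	m := &manager{}
	m.RecordMetrics(true)
	fmt.Println("cancelled:", m.cancelled)
	fmt.Println("second Lock would block:", relockWouldBlock())
}
```

Keeping the locking in the exported method and leaving the internal helper lock-free is one conventional way to avoid this class of self-deadlock.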
@openshift-ci

openshift-ci bot commented Aug 19, 2021

The following users are mentioned in OWNERS file(s) but are untrusted for the following reasons. One way to make the user trusted is to add them as members of the openshift org. You can then trigger verification by writing /verify-owners in a comment.

  • kubernetes-csi-reviewers
    • User is not a member of the org. User is not a collaborator. Satisfy at least one of these conditions to make the user trusted.
  • kubernetes-csi-approvers
    • User is not a member of the org. User is not a collaborator. Satisfy at least one of these conditions to make the user trusted.

@bertinatto
Member Author

Reported upstream in kubernetes-csi#580. We may need to introduce kubernetes-csi#581 here. I hope upstream is going to release a new version soon.

Done. Let's run the CSI jobs a few times before merging to make sure the flake is gone.

@bertinatto
Member Author

/test all

@openshift-ci

openshift-ci bot commented Aug 20, 2021

@bertinatto: This pull request references Bugzilla bug 1993934, which is valid.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target release (4.9.0) matches configured target release for branch (4.9.0)
  • bug is in the state POST, which is one of the valid states (NEW, ASSIGNED, ON_DEV, POST, POST)

No GitHub users were found matching the public email listed for the QA contact in Bugzilla (wduan@redhat.com), skipping review request.

In response to this:

Bug 1993934: Merge upstream v4.2.0

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@gnufied
Member

gnufied commented Aug 20, 2021

/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Aug 20, 2021
@gnufied gnufied removed the do-not-merge/invalid-owners-file Indicates that a PR should not merge because it has an invalid OWNERS file in it. label Aug 20, 2021
@openshift-merge-robot openshift-merge-robot merged commit 52ab893 into openshift:master Aug 20, 2021
@openshift-ci

openshift-ci bot commented Aug 20, 2021

@bertinatto: Some pull requests linked via external trackers have merged:

The following pull requests linked via external trackers have not merged:

These pull requests must merge or be unlinked from the Bugzilla bug in order for it to move to the next state. Once unlinked, request a bug refresh with /bugzilla refresh.

Bugzilla bug 1993934 has not been moved to the MODIFIED state.

In response to this:

Bug 1993934: Merge upstream v4.2.0

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@openshift-bot

[ART PR BUILD NOTIFIER]

This PR has been included in build ose-csi-snapshot-validation-webhook-container-v4.9.0-202311250023.p0.g52ab893.assembly.stream for distgit csi-snapshot-validation-webhook.
All builds following this will include this PR.

Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. bugzilla/severity-medium Referenced Bugzilla bug's severity is medium for the branch this PR is targeting. bugzilla/valid-bug Indicates that a referenced Bugzilla bug is valid for the branch this PR is targeting. lgtm Indicates that a PR is ready to be merged.