Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

rbd: use ioctx locks for key rotation #4734

Merged
merged 3 commits into from
Jul 30, 2024
Merged

Conversation

black-dragon74
Copy link
Member

@black-dragon74 black-dragon74 commented Jul 26, 2024

Describe what this PR does

This PR introduces rados ioctx locks for key rotation operations which will help with synchronizing
operations when dealing with ROX/RWX volumes.

@black-dragon74 black-dragon74 self-assigned this Jul 26, 2024
@mergify mergify bot added the component/rbd Issues related to RBD label Jul 26, 2024
@black-dragon74 black-dragon74 force-pushed the rwx-test branch 2 times, most recently from dc19aae to 21ccbda Compare July 26, 2024 07:44
Copy link
Member

@nixpanic nixpanic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This code looks very much like what is used in CephFS. Can you move it to a shared helper package under internal/util/lock/...? You can have a

package lock

interface Lock {
    LockExclusive(ctx) error
    Unlock(ctx) error
}

type lock struct {
    lockName, lockCookie, lockDesc string
    ioctx *rados.Ioctx
}

func NewLock(ioctx, volumeID, duration) Lock {
    return &lock{
        ioctx: ioctx,
        ...
    }
}

func (lck *Lock) LockExclusive(ctx context.Context, ioctx *rados.Ioctx, volumeID string) error {
    return lck.ioctx.LockExclusive(...)
}

func (lck *Lock) Unlock(ctx) error {
   ...
}

internal/rbd/encryption.go Outdated Show resolved Hide resolved
internal/rbd/encryption.go Outdated Show resolved Hide resolved
internal/rbd/encryption.go Outdated Show resolved Hide resolved
@black-dragon74 black-dragon74 force-pushed the rwx-test branch 2 times, most recently from 7ca683d to f1ea33a Compare July 29, 2024 07:40
Copy link
Collaborator

@Madhu-1 Madhu-1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

small nit

internal/util/lock/lock.go Outdated Show resolved Hide resolved
Copy link
Member

@nixpanic nixpanic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good! Just a small change needed for a little simpler lock API.

internal/util/lock/lock.go Outdated Show resolved Hide resolved
@nixpanic
Copy link
Member

@Mergifyio queue

Copy link
Contributor

mergify bot commented Jul 30, 2024

queue

🛑 The pull request has been removed from the queue default

The queue conditions cannot be satisfied due to failing checks.

You can take a look at Queue: Embarked in merge queue check runs for more details.

In case of a failure due to a flaky test, you should first retrigger the CI.
Then, re-embark the pull request into the merge queue by posting the comment
@mergifyio refresh on the pull request.

that abstracts rados Ioctx Lock/Unlocks

Signed-off-by: Niraj Yadav <niryadav@redhat.com>
Signed-off-by: Niraj Yadav <niryadav@redhat.com>
Signed-off-by: Niraj Yadav <niryadav@redhat.com>
@mergify mergify bot added the ok-to-test Label to trigger E2E tests label Jul 30, 2024
@ceph-csi-bot
Copy link
Collaborator

/test ci/centos/k8s-e2e-external-storage/1.29

@ceph-csi-bot
Copy link
Collaborator

/test ci/centos/k8s-e2e-external-storage/1.27

@ceph-csi-bot
Copy link
Collaborator

/test ci/centos/mini-e2e-helm/k8s-1.29

@ceph-csi-bot
Copy link
Collaborator

/test ci/centos/mini-e2e-helm/k8s-1.27

@ceph-csi-bot
Copy link
Collaborator

/test ci/centos/mini-e2e-helm/k8s-1.28

@ceph-csi-bot
Copy link
Collaborator

/test ci/centos/mini-e2e/k8s-1.28

@ceph-csi-bot ceph-csi-bot removed the ok-to-test Label to trigger E2E tests label Jul 30, 2024
@black-dragon74
Copy link
Member Author

@Mergifyio queue

Copy link
Contributor

mergify bot commented Jul 30, 2024

queue

🛑 The pull request has been removed from the queue default

The queue conditions cannot be satisfied due to failing checks.

You can take a look at Queue: Embarked in merge queue check runs for more details.

In case of a failure due to a flaky test, you should first retrigger the CI.
Then, re-embark the pull request into the merge queue by posting the comment
@mergifyio refresh on the pull request.

@Madhu-1
Copy link
Collaborator

Madhu-1 commented Jul 30, 2024

Warning: A secret was passed to "sh" using Groovy String interpolation, which is insecure.
Affected argument(s) used the following variable(s): [CREDS_USER]
See https://jenkins.io/redirect/groovy-string-interpolation for details.

  • ssh -o UserKnownHostsFile=/dev/null -o StrictHostKeyChecking=no root@n27-38-30.pool.ci.centos.org 'podman login --authfile=~/.podman-auth.json --username=**** --password=**** registry-.apps.ocp.cloud.ci.centos.org'
    Error: authenticating creds for "registry-
    .apps.ocp.cloud.ci.centos.org": pinging container registry registry-****.apps.ocp.cloud.ci.centos.org: Get "https://registry-****.apps.ocp.cloud.ci.centos.org/v2/": tls: failed to verify certificate: x509: certificate signed by unknown authority
    script returned exit code 125

@nixpanic CI is failing with above error, any idea about it?

@nixpanic
Copy link
Member

I think there was some certificate rotation on the OpenShift service that provides the routing. This gives a similar error for the Jenkins Pod:

curl -v https://jenkins-ceph-csi.apps.ocp.cloud.ci.centos.org/

The TLS certificate was updated today:

$ curl -v -k https://registry-ceph-csi.apps.ocp.cloud.ci.centos.org
* Host registry-ceph-csi.apps.ocp.cloud.ci.centos.org:443 was resolved.
* IPv6: (none)
* IPv4: 54.174.203.175, 54.145.63.179
*   Trying 54.174.203.175:443...
* Connected to registry-ceph-csi.apps.ocp.cloud.ci.centos.org (54.174.203.175) port 443
...
* Server certificate:
*  subject: CN=api.ocp.cloud.ci.centos.org
*  start date: Jul 30 08:42:03 2024 GMT
...

@nixpanic
Copy link
Member

@Mergifyio requeue

Copy link
Contributor

mergify bot commented Jul 30, 2024

requeue

✅ The queue state of this pull request has been cleaned. It can be re-embarked automatically

@nixpanic
Copy link
Member

@Mergifyio queue

Copy link
Contributor

mergify bot commented Jul 30, 2024

queue

🛑 The pull request has been removed from the queue default

The queue conditions cannot be satisfied due to failing checks.

You can take a look at Queue: Embarked in merge queue check runs for more details.

In case of a failure due to a flaky test, you should first retrigger the CI.
Then, re-embark the pull request into the merge queue by posting the comment
@mergifyio refresh on the pull request.

@nixpanic nixpanic added the ok-to-test Label to trigger E2E tests label Jul 30, 2024
@ceph-csi-bot
Copy link
Collaborator

/test ci/centos/upgrade-tests-cephfs

@ceph-csi-bot
Copy link
Collaborator

/test ci/centos/upgrade-tests-rbd

@ceph-csi-bot
Copy link
Collaborator

/test ci/centos/k8s-e2e-external-storage/1.27

@ceph-csi-bot
Copy link
Collaborator

/test ci/centos/mini-e2e-helm/k8s-1.27

@ceph-csi-bot
Copy link
Collaborator

/test ci/centos/k8s-e2e-external-storage/1.28

@ceph-csi-bot
Copy link
Collaborator

/test ci/centos/mini-e2e/k8s-1.27

@ceph-csi-bot
Copy link
Collaborator

/test ci/centos/mini-e2e-helm/k8s-1.28

@ceph-csi-bot
Copy link
Collaborator

/test ci/centos/k8s-e2e-external-storage/1.29

@ceph-csi-bot
Copy link
Collaborator

/test ci/centos/k8s-e2e-external-storage/1.30

@ceph-csi-bot
Copy link
Collaborator

/test ci/centos/mini-e2e/k8s-1.28

@ceph-csi-bot
Copy link
Collaborator

/test ci/centos/mini-e2e-helm/k8s-1.29

@ceph-csi-bot
Copy link
Collaborator

/test ci/centos/mini-e2e-helm/k8s-1.30

@ceph-csi-bot
Copy link
Collaborator

/test ci/centos/mini-e2e/k8s-1.29

@ceph-csi-bot
Copy link
Collaborator

/test ci/centos/mini-e2e/k8s-1.30

@ceph-csi-bot ceph-csi-bot removed the ok-to-test Label to trigger E2E tests label Jul 30, 2024
@nixpanic
Copy link
Member

@Mergifyio requeue

Copy link
Contributor

mergify bot commented Jul 30, 2024

requeue

✅ The queue state of this pull request has been cleaned. It can be re-embarked automatically

@mergify mergify bot merged commit 439a03f into ceph:devel Jul 30, 2024
52 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
component/rbd Issues related to RBD
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants