Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix when rbdVol is nil, panic occurred #4002

Closed
wants to merge 1 commit into from
Closed

Conversation

huangzynn
Copy link

image
When rbdVol is nil, panic occurred
add if rbdVol != nil

@mergify mergify bot added the bug Something isn't working label Jul 18, 2023
@nixpanic
Copy link
Member

Do you have steps to reproduce this problem? It would be good to have a test case, or at least be able to validate the fix.

@@ -1078,7 +1078,9 @@ func (cs *ControllerServer) CreateSnapshot(

// Fetch source volume information
rbdVol, err := GenVolFromVolID(ctx, req.GetSourceVolumeId(), cr, req.GetSecrets())
defer rbdVol.Destroy()
if rbdVol != nil {
defer rbdVol.Destroy()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this could be moved after if err != nil, no need to check if rbdVol != nil in that case.

Copy link
Contributor

@Rakshith-R Rakshith-R left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@huangzynn
Copy link
Author

Such as:
image
image
image

@nixpanic
Copy link
Member

Hi @huangzynn , in order to fix the commitlint issue, please make these changes:

  1. squash the commits into one
  2. change the subject of the commit to rbd: move CreateSnapshot defer rbdVol.Destroy() after err != nil
  3. force push the single commit to your GitHub branch

Many thanks!

@nixpanic
Copy link
Member

Do you have steps to reproduce this problem? It would be good to have a test case, or at least be able to validate the fix.

So, by reading the screenshots, is it correct to say that the problem can be triggered when creating a VolumeSnapshot of a non-existing PVC?

@huangzynn
Copy link
Author

Do you have steps to reproduce this problem? It would be good to have a test case, or at least be able to validate the fix.

So, by reading the screenshots, is it correct to say that the problem can be triggered when creating a VolumeSnapshot of a non-existing PVC?
yes

@@ -1092,6 +1091,7 @@ func (cs *ControllerServer) CreateSnapshot(

return nil, err
}
defer rbdVol.Destroy()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can cause a cleanup problem as rbdVol is returned even in case of error. IMO moving this to line 1081 with a check for nil is okay.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's what I thought from the beginning

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rbdVol should never be returned in a case of an error. Either err != nil or rbdVol != nil. The additional if-statement should be useless, and in effect is the same as having the defer below the error handling. It adds a bit of complexity, and makes me wonder if rbdVol == nil should not be handled further down in the function as well?

@huangzynn huangzynn force-pushed the devel branch 2 times, most recently from 4bfae38 to f61add6 Compare July 19, 2023 08:43
@Madhu-1
Copy link
Collaborator

Madhu-1 commented Jul 19, 2023

@huangzynn as suggested here https://github.com/ceph/ceph-csi/blob/b1d1d3bf75b8becb427ec494dd6449f7b5bce115/docs/development-guide.md#commit-messages you need to update the commit to fix the commitlint CI problem. @nixpanic posted a sample for you above. you are missing rbd: prefix in commit message. job output here https://github.com/ceph/ceph-csi/actions/runs/5596968399/jobs/10234777959?pr=4002

add if rbdVol != nil defer rbdVol.Destroy()

Signed-off-by: huangzy <huangzynn@outlook.com>
@nixpanic
Copy link
Member

It seems not trivial to reproduce this, when I try, I get the following

Status:
  Error:
    Message:     Failed to create snapshot content with error snapshot controller failed to update invalid-pvc-pr-4002 on API server: cannot get claim from snapshot
    Time:        2023-07-19T10:55:43Z
  Ready To Use:  false
Events:
  Type     Reason                         Age    From                 Message
  ----     ------                         ----   ----                 -------
  Warning  SnapshotContentCreationFailed  2m39s  snapshot-controller  Failed to create snapshot content with error snapshot controller failed to update invalid-pvc-pr-4002 on API server: cannot get claim from snapshot

Maybe the PCV needs to exist to pass the snapshot-controller check, and get deleted before Ceph-CSI acts on it?

@huangzynn
Copy link
Author

huangzynn commented Jul 20, 2023

It seems not trivial to reproduce this, when I try, I get the following

Status:
  Error:
    Message:     Failed to create snapshot content with error snapshot controller failed to update invalid-pvc-pr-4002 on API server: cannot get claim from snapshot
    Time:        2023-07-19T10:55:43Z
  Ready To Use:  false
Events:
  Type     Reason                         Age    From                 Message
  ----     ------                         ----   ----                 -------
  Warning  SnapshotContentCreationFailed  2m39s  snapshot-controller  Failed to create snapshot content with error snapshot controller failed to update invalid-pvc-pr-4002 on API server: cannot get claim from snapshot

Maybe the PCV needs to exist to pass the snapshot-controller check, and get deleted before Ceph-CSI acts on it?

What is your snapshot controller version?
My snapshot-controller is v6.2.2

@nixpanic
Copy link
Member

My snapshot-controller is v6.2.2

My cluster has v6.2.1... not sure if there is a big difference between those.

@humblec
Copy link
Collaborator

humblec commented Jul 20, 2023

@huangzynn can you please provide the below details from this cluster.

  • The exact yaml of volume snapshot object and #kubectl get pvc output from the namespace
  • The snapshot controller and external snapshotter version
  • Kubernetes cluster and Ceph CSI driver version

@huangzynn
Copy link
Author

  • kubectl get pvc

image
image
csi-snapshotter:v6.2.2
snapshot-controller:v6.2.2
cephcsi:v3.6.2
Kubernetes:v1.21.5

@github-actions
Copy link

This pull request has been automatically marked as stale because it has not had recent activity. It will be closed in two weeks if no further activity occurs. Thank you for your contributions.

@github-actions
Copy link

github-actions bot commented Sep 4, 2023

This pull request has been automatically closed due to inactivity. Please re-open if these changes are still required.

@github-actions github-actions bot closed this Sep 4, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working component/rbd Issues related to RBD stale
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants