Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Define DeleteVolume behavior with snapshots #347

Merged

Conversation

Akrog
Copy link
Contributor

@Akrog Akrog commented Jan 25, 2019

Spec is not clear on what should happen when on a DeleteVolume call when
the volume has snapshots.

This patch clarifies the situation by explicitly mentioning that the
operation should complete and the snapshots should still be operational.

Closes: #346

spec.md Outdated
@@ -1131,6 +1131,8 @@ This RPC will be called by the CO to deprovision a volume.
This operation MUST be idempotent.
If a volume corresponding to the specified `volume_id` does not exist or the artifacts associated with the volume do not exist anymore, the Plugin MUST reply `0 OK`.

CSI plugins MUST treat volumes independent from their snapshots. Controller Plugin MUST support deleting a volume without affecting its existing snapshots, which MUST still be fully operational and acceptable as sources for new volumes as well as appear on `ListSnapshot` calls.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this could be too onerous for some plugins to support, especially if they're not maintaining their own state store independent of any internal store provided by the "real" backend.

I'd rather let the plugin decide how to do this:

  1. Fail the volume-delete RPC because it's not allowed there are snapshots.
  2. Honor the volume-delete RPC and maintain the existing snapshots (because they're still usable).
  3. Honor the volume-delete RPC and remove all linked snapshots.

If there's an ongoing snapshot-related operation in progress (e.g. creating a snapshot from said volume, or creating a volume from a snapshot that's linked to the volume-to-be-removed) then we have error codes a plugin can use to indicate a conflict with an ongoing operation (ABORTED).

If (1) is the case then there's already a documented error code for delete's that fail because a resource is in-use (FAILED_PRECONDITION).

If the plugin is able to execute (3) without errors, then it's up to the CO to periodically reconcile its internal state w/ that of the plugin in order to account for snapshots that may have disappeared.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since a plugin's capabilities w/ respect to (2) or (3) aren't discoverable via some CSI capability, a CO w/ CSI snapshot support should probably be periodically scanning/reconciling snapshots anyway.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we want to support the 3 options, which sounds great as we'd be more inclusive with all types of backends and plugins, I think we should add some kind of capability reporting to specify each plugin behavior.

Without knowing expected behavior it'll get messy for COs and users, since each plugin would behave differently and the CO wouldn't know what to expect, and users would have to go to the plugin's documentation to know if deleting a volume with snapshots is supported or not.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that we should let the SP decide but this is a good suggestion. So how about /s/MUST/SHOULD/ in this change to make it a recommendation rather then a requirement?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm updating the patch to change it to SHOULD and also make it contemplate @jdef 's first scenario. This way plugins can either support deleting volumes independently or fail. We can leave support for cascade deletion of snapshots as a new feature and leave this patch to just clarify the current expected behavior.

spec.md Outdated
@@ -1131,6 +1131,8 @@ This RPC will be called by the CO to deprovision a volume.
This operation MUST be idempotent.
If a volume corresponding to the specified `volume_id` does not exist or the artifacts associated with the volume do not exist anymore, the Plugin MUST reply `0 OK`.

CSI plugins MUST treat volumes independent from their snapshots. Controller Plugin MUST support deleting a volume without affecting its existing snapshots, which MUST still be fully operational and acceptable as sources for new volumes as well as appear on `ListSnapshot` calls.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that we should let the SP decide but this is a good suggestion. So how about /s/MUST/SHOULD/ in this change to make it a recommendation rather then a requirement?

@Akrog Akrog force-pushed the clarify-delete-volume branch from 86e779e to b291631 Compare February 27, 2019 12:58
spec.md Outdated
@@ -1131,6 +1131,8 @@ This RPC will be called by the CO to deprovision a volume.
This operation MUST be idempotent.
If a volume corresponding to the specified `volume_id` does not exist or the artifacts associated with the volume do not exist anymore, the Plugin MUST reply `0 OK`.

CSI plugins SHOULD treat volumes independent from their snapshots. If the Controller Plugin supports deleting a volume without affecting its existing snapshots, then these snapshots MUST still be fully operational and acceptable as sources for new volumes as well as appear on `ListSnapshot` calls.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the revised language, though I recommend s/MUST/SHOULD/ in the 2nd sentence. We're saying this behavior is optional, but recommended - so the use of MUST only makes it more confusing: is this required or optional behavior?

nit: one sentence per line, please.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I forgot about the one line per sentence, sorry.

The reason for using MUST in the second sentence is that it is a conditional sentence, and I believe it to be correct.

If a driver does support deleting a volume independently of the snapshots then it MUST (not should, this is a hard requirement) allow using those snapshots normally.

If I change the MUST to SHOULD, the specs end up saying that it's OK to have a driver that allows deleting a volume with snapshots and leave its snapshots in a state that they cannot be normally used. Which is not what we want to say.

I will add another sentence explaining the case of the controller not supporting deleting volumes independently of their snapshots to make it explicit that both options are possible, and that's why we had the SHOULD in the first sentence.

Thanks for the review.

Spec is not clear on what should happen when on a DeleteVolume call when
the volume has snapshots.

This patch clarifies the situation by explicitly mentioning that the
operation should complete and the snapshots should still be operational.

Closes: container-storage-interface#346
@Akrog Akrog force-pushed the clarify-delete-volume branch from b291631 to a249266 Compare March 4, 2019 10:15
@jdef
Copy link
Member

jdef commented Mar 7, 2019

Thanks for the clarifying language. I wonder if this should be considered a breaking change?

This could be construed as more restrictive behavior vs. what was called out in the original specification.

@Akrog
Copy link
Contributor Author

Akrog commented Mar 7, 2019

I think it's not really a breaking change, it's mostly a clarification with a preferred behavior.

Existing v1.0 plugins will have implemented the delete operation in one of those two ways, and it doesn't matter which one they have, because they will conform to the spec regardless.

Copy link
Contributor

@julian-hj julian-hj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@jdef
Copy link
Member

jdef commented Mar 7, 2019

Existing v1.0 plugins will have implemented the delete operation in one of those two ways, and it doesn't matter which one they have, because they will conform to the spec regardless.

ok, sounds good

@saad-ali
Copy link
Member

/lgtm
/approve

@zhucan
Copy link

zhucan commented Apr 25, 2019

maybe, whether volumes independent from their snapshots, we can set a flag to volumesnapshotclass?

@zhucan
Copy link

zhucan commented Apr 25, 2019

when deleting volumes , we can judge wether volumes independent from their snapshots from the flag?

@Akrog
Copy link
Contributor Author

Akrog commented Apr 25, 2019

maybe, whether volumes independent from their snapshots, we can set a flag to volumesnapshotclass?
when deleting volumes , we can judge wether volumes independent from their snapshots from the flag?

From CSI's perspective there's no such thing as the VolumeSnapshotClass. That is a CO (k8s) specific detail.

From the CSI perspective what we discussed was that a CSI plugins could report whether the plugin supports this capability or not. But decided to let drivers decide and be the CO's decision.

It should be possible to use a flag in the VolumeSnapshotClass to tell k8s whether a volume with snapshots can be deleted or not, but adding this feature should be brought up with the Kubernetes community.

@zhucan
Copy link

zhucan commented Apr 25, 2019 via email

@Akrog
Copy link
Contributor Author

Akrog commented Apr 25, 2019

Then your storage must succeed for independent volumes and return FAILED_PRECONDITION error for the non-independent ones.

@zhucan
Copy link

zhucan commented Apr 25, 2019 via email

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Undefined behavior for DeleteVolume with snapshots
6 participants