Proposal: Provide publish context during ControllerUnpublishVolume #496
Comments
The controller-unpublish call does not relay state on purpose. Consider that the CO is braindead and loses context. It should still be possible to unpublish.
I'm sympathetic to the partitioning problem, and think that the CO should do more to help plugins deal with such cases.
-1 on this proposal.
On Fri, Nov 5, 2021, 4:36 PM Sumukh Shivaprakash wrote:

Problem Overview:
ControllerPublishVolume has a contract to return FAILED_PRECONDITION with a status indicating the node ID on which the volume is currently published. This lets the SP detect stale ControllerPublishVolume API calls from the CO and take appropriate action.
However, during controller unpublish, consider the following ordering of events:
1. The CO issues ControllerUnpublishVolume(node1) to the Plugin.
2. That TCP message is delayed indefinitely on the network.
3. The node on which the CO is running experiences a network partition and is no longer part of the federated cluster. As a result, the CO is now running on a different node and decides to place the volume on node2. NOTE that the old TCP connection from the CO to the Plugin is still alive.
4. The CO issues ControllerPublishVolume(node2) to the Plugin.
5. The CO then decides to place the volume on node1 again, so it issues ControllerUnpublishVolume(node2) followed by ControllerPublishVolume(node1).
6. Finally, the delayed message from step 2 arrives, carrying ControllerUnpublishVolume(node1).
Expectation:
The Plugin must be able to identify the stale API call that arrives in step 6 and ignore it.
Proposal:
If ControllerUnpublishVolume included the original publish context returned by ControllerPublishVolume, the Plugin/SP could sequence these calls and detect staleness.
|
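To make the ambiguity concrete, here is a toy, illustrative plugin-side sketch of how ControllerUnpublishVolume looks today, using the Go bindings generated from the CSI proto (the attachment bookkeeping and package are invented for this example, not taken from any real driver). With only volume_id and node_id to go on, the delayed call from step 1 and a legitimate unpublish of the fresh attachment from step 5 are indistinguishable, so both tear down the current attachment on node1.

```go
package sketch

import (
	"context"
	"sync"

	csi "github.com/container-storage-interface/spec/lib/go/csi"
)

// attachment is what a toy plugin keeps per published volume.
type attachment struct {
	nodeID         string
	publishContext map[string]string // returned by ControllerPublishVolume, but never echoed back today
}

type toyController struct {
	mu          sync.Mutex
	attachments map[string]attachment // volume_id -> attachment
}

// ControllerUnpublishVolume as the spec stands: the request carries only
// volume_id (and optionally node_id), so a delayed, stale unpublish for node1
// looks identical to a legitimate, current unpublish for node1.
func (c *toyController) ControllerUnpublishVolume(
	ctx context.Context, req *csi.ControllerUnpublishVolumeRequest,
) (*csi.ControllerUnpublishVolumeResponse, error) {
	c.mu.Lock()
	defer c.mu.Unlock()

	a, ok := c.attachments[req.GetVolumeId()]
	if ok && (req.GetNodeId() == "" || req.GetNodeId() == a.nodeID) {
		// Stale or not, the current attachment is torn down.
		delete(c.attachments, req.GetVolumeId())
	}
	return &csi.ControllerUnpublishVolumeResponse{}, nil
}
```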
Isn't the CO storing/plumbing around the volume_id and volume context already? I didn't get how that is different from the publish context. If the CO is braindead and loses all its context, how would it be able to controller-unpublish anyway? |
The only required field for controller-unpublish is volume_id.
|
Isn't the assumption that the volume_id is also lost, though? The CO would still have to go through the flow of idempotent calls: CreateVolume (which gives back the volume_id), then ControllerPublishVolume (which, being idempotent, gives back the publish context again), and finally ControllerUnpublishVolume(publish_context)? |
Unless you are referring to manual intervention by a human who is trying to patch this up and wants to unpublish the volume - I see your point then. |
Correct. A human should be able to provide the volume ID.
|
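For reference, the idempotent recovery flow described in the exchange above, seen from the CO side, might look roughly like the sketch below. It uses the Go bindings generated from the CSI proto; the endpoint, volume name, node ID, and capability are placeholders, and the final comment marks where this issue's proposed publish_context would be echoed back (it is not part of the current spec).

```go
package main

import (
	"context"
	"log"

	csi "github.com/container-storage-interface/spec/lib/go/csi"
	"google.golang.org/grpc"
)

func main() {
	// Placeholder controller endpoint for the SP.
	conn, err := grpc.Dial("unix:///run/csi/controller.sock", grpc.WithInsecure())
	if err != nil {
		log.Fatal(err)
	}
	defer conn.Close()
	ctrl := csi.NewControllerClient(conn)
	ctx := context.Background()

	volCap := &csi.VolumeCapability{
		AccessType: &csi.VolumeCapability_Mount{Mount: &csi.VolumeCapability_MountVolume{}},
		AccessMode: &csi.VolumeCapability_AccessMode{
			Mode: csi.VolumeCapability_AccessMode_SINGLE_NODE_WRITER,
		},
	}

	// 1. CreateVolume is idempotent: repeating it with the same name returns
	//    the existing volume and, with it, the volume_id.
	cv, err := ctrl.CreateVolume(ctx, &csi.CreateVolumeRequest{
		Name:               "recovered-volume", // placeholder name
		VolumeCapabilities: []*csi.VolumeCapability{volCap},
	})
	if err != nil {
		log.Fatal(err)
	}
	volID := cv.GetVolume().GetVolumeId()

	// 2. ControllerPublishVolume is also idempotent: republishing to the same
	//    node hands back the same publish_context.
	pub, err := ctrl.ControllerPublishVolume(ctx, &csi.ControllerPublishVolumeRequest{
		VolumeId:         volID,
		NodeId:           "node1",
		VolumeCapability: volCap,
	})
	if err != nil {
		log.Fatal(err)
	}
	publishContext := pub.GetPublishContext()
	_ = publishContext // today there is nowhere to send this back

	// 3. ControllerUnpublishVolume currently carries only volume_id/node_id;
	//    under this issue's proposal, publishContext could be echoed here so
	//    the plugin can sequence calls and detect staleness.
	if _, err := ctrl.ControllerUnpublishVolume(ctx, &csi.ControllerUnpublishVolumeRequest{
		VolumeId: volID,
		NodeId:   "node1",
	}); err != nil {
		log.Fatal(err)
	}
}
```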
That makes sense. Any suggestions on how this can be tackled, or pointers to other approaches, are appreciated! Going back to your original response, what did you mean by "the CO must do more"? A CO doing anything more wouldn't be CSI-conformant and would hence lose the flexibility of plugging in different SPs, right? |
CO must do more:
There's another CSI proposal that deals with partitioning and attempts to implement a fencing strategy within CSI - I commented there as well. The CO bears more responsibility (for complexity) in such cases so that plugin implementations (and the CSI spec) remain simpler.
It sounds like you're trying to address an important problem with a minimal spec change, and in general I'm supportive of such surgical approaches. However, in this case, the proposal violates a principle that we established early on re: a braindead CO and volume cleanup operations.
|
Yes, for the APIs in CSI that I cared about (create, controller publish/unpublish, node stage/unstage, node publish/unpublish, delete), it seems like we can deal with a network partition in every case except controller unpublish, hence the question here. I can take a look at the other proposal to see how the CO could do more to handle this situation - can you point me to the issue number?
The way I think of a CO: in spite of all its complexity and however many intermediate states it persists, a stale API call that a network partition delivers to a plugin during unpublish cannot be distinguished, because the spec only allows for a volume_id. I can think of passing in an optional "etag": if it is absent, that is treated as the safety net / human-intervention path, and the unpublish is honored. However, if the value is set to something (which will be the case in the stale API call), the call is ignored when the etag check fails. It is similar to database optimistic concurrency. Would this address the drawback you pointed out about the principles of CSI in my original proposal? |
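As an illustration of that optimistic-concurrency idea, here is a hypothetical sketch (not part of the CSI spec; the attachment record, etag bookkeeping, and function names are invented for this example) of how a plugin might apply the rule: an empty etag is always honored, a mismatched etag marks the call as stale and is ignored.

```go
package sketch

import "sync"

// etagAttachment is hypothetical bookkeeping: the plugin mints an opaque etag
// at publish time and would hand it to the CO alongside the publish context.
type etagAttachment struct {
	nodeID string
	etag   string
}

var (
	mu          sync.Mutex
	attachments = map[string]etagAttachment{} // volume_id -> attachment
)

// unpublishWithEtag applies the rule described above:
//   - empty etag: safety-net / human-intervention path, always honored;
//   - non-empty etag: honored only when it matches the current attachment,
//     otherwise the call is treated as stale and ignored.
func unpublishWithEtag(volumeID, etag string) {
	mu.Lock()
	defer mu.Unlock()

	a, ok := attachments[volumeID]
	if !ok {
		return // nothing attached: idempotent no-op
	}
	if etag != "" && etag != a.etag {
		return // stale caller: ignore, leave the current attachment alone
	}
	delete(attachments, volumeID)
}
```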
@jdef - gentle ping! |
Not sure why I'm pinged here. AFAICT this issue/ticket has been settled.
|
Wondering what your take is on the approach above (the optional "etag"). |
I think I misread the original proposal, mistakenly confusing volume-context for publish-context. That said, "etag" sounds like additional state, and it seems like a "special publish volume context". Let's avoid special-case things like this.

So let's say that publish-context was added as an always-OPTIONAL field for controller-unpublish-volume:

```
message ControllerUnpublishVolumeRequest {
  ...
  map<string,string> publish_context = x; // OPTIONAL, always
}
```

... and the new RPC semantics become:

- When publish-context is empty: the plugin behaves as it does today and unpublishes based on volume_id (and node_id, if set) alone.
- When publish-context is non-empty: the plugin MAY compare it against the publish context of the volume's current attachment and ignore the call as stale if it does not match.

At first glance, this seems to be backwards compatible, and the new behavior is completely opt-in from a plugin perspective.

I'm curious - did you actually run into this situation in a live cluster, or is this proposal born of hypotheticals? |
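For illustration, a plugin-side check under those opt-in semantics might look like the following. This is hypothetical Go: the incoming publish_context on the unpublish request is the proposed addition and does not exist in the current spec or generated bindings, and the attachment record is invented for the example.

```go
package sketch

// currentAttachment is what a plugin already tracks (or can look up from its
// backend) for a published volume: the node and the publish_context it
// returned from ControllerPublishVolume.
type currentAttachment struct {
	nodeID         string
	publishContext map[string]string
}

// isStaleUnpublish implements the proposed, opt-in semantics:
//   - an empty incoming publish_context preserves today's behavior (never stale);
//   - a non-empty one is compared against the current attachment's context,
//     and a mismatch marks the call as stale so the plugin can ignore it.
func isStaleUnpublish(incoming map[string]string, cur *currentAttachment) bool {
	if len(incoming) == 0 || cur == nil {
		return false // old-style call, or nothing attached: handle as today
	}
	if len(incoming) != len(cur.publishContext) {
		return true
	}
	for k, v := range incoming {
		if cur.publishContext[k] != v {
			return true
		}
	}
	return false
}
```

An SP could just as well compare a single generation-style key instead of the whole map; how much of the context to compare would be up to the plugin.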
Yes, given its optional behavior it would be a non-breaking change for existing plugins/COs. Is the expectation that the feature proposer sends out a PR, or would one of the maintainers take care of incorporating this in the next release?
We did not hit this in a live cluster, but we have a mock CO and CSI simulator that exercised this code path and tripped an assertion. |
Kindly submit a PR. If this is your first, you probably need to sign the CLA; otherwise the PR cannot be accepted.
|