CSI: move node unmount to server-driven RPCs #7596

Merged: 4 commits, Apr 2, 2020

Conversation

tgross (Member) commented on Apr 2, 2020

Fixes #7180

If a volume-claiming alloc stops and the CSI Node plugin that serves that alloc's volumes is missing, there's no way for the allocrunner hook to send the NodeUnpublish and NodeUnstage RPCs. Added complications:

  • The CSI spec requires that any ControllerUnpublish RPC happens after those two RPCs.
  • We don't want to stop volume-claiming jobs on Nomad client restart or plugin update/stop because that would be a big change from existing Nomad job behavior (and generally unpleasant for operators).
  • Nomad's implementation of volume staging makes the mountpoint path specific not just to the plugin ID but also to the volume ID. That means we're not subject to the same "hang forever" condition that k8s demonstrates in CSI: shutdown consuming tasks when node plugins shutdown #7180 (comment).
  • If the Node plugin is restored before the task gives up on unmounting the volume, everything works. But this is a short window.
  • But if the plugin is down and a task using it stops and then the plugin comes back later, we'd want that state tracked so we remember the mount point and can unpublish.

This changeset addresses the issue with a redesign of the client-side CSI handling. Rather than unmounting in the alloc runner hook, the alloc runner hook will simply exit. When the server gets the Node.UpdateAlloc for the terminal allocation that had a volume claim, it creates a volume claim GC job. This job makes client RPCs to a new node plugin RPC endpoint, and only once that succeeds does it move on to making the client RPCs to the controller plugin. If the node plugin is unavailable, the GC job fails and is requeued.
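
As a rough sketch of that ordering (the names volumeClaimReap, nodeDetachVolume, controllerDetachVolume, and releaseClaim below are illustrative stand-ins, not the identifiers in this changeset):

```go
package gc

import "errors"

// Minimal stand-in types for illustration; the real code uses Nomad's
// structs.CSIVolume and structs.Allocation.
type volume struct {
	ID                 string
	ControllerRequired bool
}

type allocation struct{ ID string }

// Hypothetical RPC wrappers; in the real changeset these are client RPCs
// made from the server's volume claim GC job.
func nodeDetachVolume(v *volume, a *allocation) error {
	return errors.New("node plugin unavailable")
}
func controllerDetachVolume(v *volume, a *allocation) error { return nil }
func releaseClaim(v *volume, a *allocation) error           { return nil }

// volumeClaimReap sketches the ordering described above: node unmount
// first, then controller unpublish, then release the claim. Any error
// bubbles up so the GC eval fails and gets requeued for a later retry.
func volumeClaimReap(v *volume, a *allocation) error {
	// 1. Ask the client that ran the alloc to NodeUnpublish/NodeUnstage.
	//    If the node plugin is down, this fails and the whole GC retries.
	if err := nodeDetachVolume(v, a); err != nil {
		return err
	}

	// 2. Only after the node-side unmount succeeds do we call
	//    ControllerUnpublish, per the CSI spec's ordering requirement.
	if v.ControllerRequired {
		if err := controllerDetachVolume(v, a); err != nil {
			return err
		}
	}

	// 3. Finally, drop the claim in server state.
	return releaseClaim(v, a)
}
```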

We don't have to track any state about what the client has done, just that the allocs are terminal. When we do the "heartyeet" work in #2185, most of the plumbing will already be in place. And this gives us the opportunity to later add a Nomad command for operators to forcibly drop claims via the CLI.

There's a bit of work done here (probably easiest to review commit-by-commit):

  • added the client-side RPC endpoint (see the sketch after this list)
  • moved all the client unmount operations into that handler
  • reworked the GC job significantly so that the logic could be exercised in tests without making a bunch of RPCs
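
For reviewers, the shape of the new client endpoint is roughly the sketch below; the type and method names (CSI, NodeDetachVolume, unpublishVolume, unstageVolume) and the request fields are assumptions for illustration, not necessarily what's in the diff:

```go
package csiclient

import "fmt"

// CSI is a stand-in for the client's CSI RPC endpoint; the real endpoint
// would reach the node plugin client via the client's plugin manager.
type CSI struct{}

// NodeDetachVolumeRequest is an illustrative request shape; these field
// names are assumptions, not necessarily what the diff adds.
type NodeDetachVolumeRequest struct {
	PluginID string
	VolumeID string
	AllocID  string
}

// NodeDetachVolume sketches the new client RPC endpoint: it runs the same
// NodeUnpublish/NodeUnstage sequence the alloc runner hook used to run,
// but driven by the server's volume claim GC job instead of the client.
func (c *CSI) NodeDetachVolume(req *NodeDetachVolumeRequest, resp *struct{}) error {
	// Remove the per-alloc bind mount for this volume first.
	if err := c.unpublishVolume(req); err != nil {
		return err
	}
	// Then tear down the staging mount, if the plugin requires staging.
	return c.unstageVolume(req)
}

// The helpers below stand in for calls to the node plugin's
// NodeUnpublishVolume and NodeUnstageVolume RPCs.
func (c *CSI) unpublishVolume(req *NodeDetachVolumeRequest) error {
	return fmt.Errorf("NodeUnpublishVolume not implemented for %q", req.VolumeID)
}

func (c *CSI) unstageVolume(req *NodeDetachVolumeRequest) error {
	return fmt.Errorf("NodeUnstageVolume not implemented for %q", req.VolumeID)
}
```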

tgross added this to the 0.11.0 milestone on Apr 2, 2020
tgross requested a review from langmartin on April 2, 2020 02:54
tgross added a commit that referenced this pull request on Apr 2, 2020
tgross (Member, Author) commented on Apr 2, 2020

I just ran an end-to-end test of this where I deployed plugins, deployed jobs that need those plugins, removed the plugins, and then stopped the volume-claiming jobs. The jobs stopped normally, but the GC job failed and was requeued as expected; this also prevented a nomad system gc from completing its work. When I restored the plugins, the GCs completed and the claims were released as expected.

tgross (Member, Author) commented on Apr 2, 2020

Interesting failure mode that stops a system GC. I had a volume claim that wasn't released for some reason (TBD), but it looks like the node plugin work had been completed. So when the system GC fired, it hit an error. We need to make sure the node client endpoint handles that case gracefully.

    2020-04-02T14:33:06.100Z [DEBUG] core.sched: forced job GC
    2020-04-02T14:33:06.100Z [DEBUG] core.sched: job GC found eligible objects: jobs=2 evals=9 allocs=0
    2020-04-02T14:33:06.115Z [ERROR] worker: error invoking scheduler: error="failed to process evaluation: 2 errors occurred:
        * rpc error: node detach volume: rpc error: code = Internal desc = Could not unmount "/csi/per-alloc/1cda822e-bed8-0b90-92e6-24bbbf
Unmounting arguments: /csi/per-alloc/1cda822e-bed8-0b90-92e6-24bbbfdfe647/vol-0634a0d38d55012f9/rw-file-system-single-node-writer
Output: umount: /csi/per-alloc/1cda822e-bed8-0b90-92e6-24bbbfdfe647/vol-0634a0d38d55012f9/rw-file-system-single-node-writer: no mount point
        * rpc error: node detach volume: rpc error: code = Internal desc = Could not unmount "/csi/per-alloc/1cda822e-bed8-0b90-92e6-24bbbf
Unmounting arguments: /csi/per-alloc/1cda822e-bed8-0b90-92e6-24bbbfdfe647/vol-0634a0d38d55012f9/rw-file-system-single-node-writer
Output: umount: /csi/per-alloc/1cda822e-bed8-0b90-92e6-24bbbfdfe647/vol-0634a0d38d55012f9/rw-file-system-single-node-writer: no mount point
"

tgross force-pushed the f-csi-unmount-nodes-from-server branch from 8cd461f to a9e0d16 on April 2, 2020 18:31
client/csi_endpoint.go (review comment, resolved)
langmartin (Contributor) left a comment

Ok! I think it looks good

github-actions (bot) commented
I'm going to lock this pull request because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active contributions.
If you have found a problem that seems related to this change, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

github-actions bot locked as resolved and limited conversation to collaborators on Jan 11, 2023