docs: backport #12466 into release-1.1.x (#12717)
tgross authored Apr 20, 2022
1 parent c714639 commit 0ed1750
Showing 2 changed files with 23 additions and 15 deletions.
25 changes: 16 additions & 9 deletions website/content/docs/internals/plugins/csi.mdx
@@ -55,6 +55,11 @@
that perform both the controller and node roles in the same
instance. Not every plugin provider has or needs a controller; that's
specific to the provider implementation.

+Plugins mount and unmount volumes but are not in the data path once
+the volume is mounted for a task. Plugin tasks are needed when tasks
+using their volumes stop, so plugins should be left running on a Nomad
+client until all tasks using their volumes are stopped.

You should always run node plugins as Nomad `system` jobs and use the
`-ignore-system` flag on the `nomad node drain` command to ensure that the
node plugins are still running while the node is being drained. Use
@@ -65,7 +70,8 @@
region. Controller plugins can be run as `service` jobs.
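
As a concrete example of the drain recommendation above, a node can be drained while leaving `system` jobs (and therefore the node plugins) running; the node ID is a placeholder:

```shell
nomad node drain -enable -ignore-system <node-id>
```
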
Nomad exposes a Unix domain socket named `csi.sock` inside each CSI
plugin task, and communicates over the gRPC protocol expected by the
CSI specification. The `mount_dir` field tells Nomad where the plugin
-expects to find the socket file.
+expects to find the socket file. The path to this socket is exposed in
+the container as the `CSI_ENDPOINT` environment variable.
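
For illustration only, here is a minimal sketch of a node plugin job; the job name, image, plugin ID, datacenter, and `mount_dir` value are placeholder assumptions rather than values taken from this commit:

```hcl
job "csi-node-plugin" {
  datacenters = ["dc1"]  # placeholder datacenter
  type        = "system" # node plugins are typically run as system jobs

  group "node" {
    task "plugin" {
      driver = "docker"

      config {
        image      = "example/csi-driver:1.0.0" # placeholder plugin image
        privileged = true                       # node plugins generally need privileged mode to mount volumes
      }

      csi_plugin {
        id        = "example-csi" # placeholder plugin ID
        type      = "node"        # "controller", "node", or "monolith"
        mount_dir = "/csi"        # directory where the plugin expects to find csi.sock
      }
    }
  }
}
```

The plugin process itself can discover the socket path at runtime by reading the `CSI_ENDPOINT` environment variable that Nomad sets.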

### Plugin Lifecycle and State

@@ -94,7 +100,7 @@
server and waits for a response; the allocation's tasks won't start
until the volume has been claimed and is ready.
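
For context, the claiming side is declared in the job that uses the volume. A minimal sketch, assuming a CSI volume with the placeholder ID `example-volume` has already been registered with Nomad:

```hcl
job "example-app" {
  datacenters = ["dc1"] # placeholder datacenter

  group "app" {
    # Claim the registered CSI volume for this group's allocations.
    volume "data" {
      type      = "csi"
      source    = "example-volume" # placeholder volume ID
      read_only = false
    }

    task "web" {
      driver = "docker"

      config {
        image   = "busybox:1.34"
        command = "sleep"
        args    = ["3600"]
      }

      # The staged volume is bind-mounted into the task at this path.
      volume_mount {
        volume      = "data"
        destination = "/srv/data" # placeholder mount path
      }
    }
  }
}
```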

If the volume's plugin requires a controller, the server will send an
-RPC to the Nomad client where that controller is running. The Nomad
+RPC to any Nomad client where that controller is running. The Nomad
client will forward this request over the controller plugin's gRPC
socket. The controller plugin will make the requested volume available
to the node that needs it.
@@ -110,13 +116,14 @@
client, and the node plugin mounts the volume to a staging area in
the Nomad data directory. Nomad will bind-mount this staged directory
into each task that mounts the volume.

-This cycle is reversed when a task that claims a volume becomes terminal. The
-client will send an "unpublish" RPC to the server, which will send "detach"
-RPCs to the node plugin. The node plugin unmounts the bind-mount from the
-allocation and unmounts the volume from the plugin (if it's not in use by
-another task). The server will then send "unpublish" RPCs to the controller
-plugin (if any), and decrement the claim count for the volume. At this point
-the volume’s claim capacity has been freed up for scheduling.
+This cycle is reversed when a task that claims a volume becomes
+terminal. The client frees the volume locally by making "unpublish"
+RPCs to the node plugin. The node plugin unmounts the bind-mount from
+the allocation and unmounts the volume from the plugin (if it's not in
+use by another task). The client will then send an "unpublish" RPC to
+the server, which will forward it to the controller plugin (if
+any), and decrement the claim count for the volume. At this point the
+volume’s claim capacity has been freed up for scheduling.
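
To confirm that a volume's claims have been released, the volume status command lists the allocations still using it; the volume ID is a placeholder:

```shell
nomad volume status <volume-id>
```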

[csi-spec]: https://github.com/container-storage-interface/spec
[csi-drivers-list]: https://kubernetes-csi.github.io/docs/drivers.html
13 changes: 7 additions & 6 deletions website/content/docs/job-specification/csi_plugin.mdx
@@ -51,17 +51,18 @@
option.

## Recommendations for Deploying CSI Plugins

-CSI plugins run as Nomad jobs but after mounting the volume are not in the
-data path for the volume. Jobs that mount volumes write and read directly to
+CSI plugins run as Nomad tasks, but after mounting the volume are not in the
+data path for the volume. Tasks that mount volumes write and read directly to
the volume via a bind-mount and there is no communication between the job and
the CSI plugin. But when an allocation that mounts a volume stops, Nomad will
need to communicate with the plugin on that allocation's node to unmount the
volume. This has implications for how to deploy CSI plugins:

-* During node drains, jobs that claim volumes must be moved before the `node`
-  or `monolith` plugin for those volumes. You should run `node` or `monolith`
-  plugins as [`system`][system] jobs and use the `-ignore-system` flag on
-  `nomad node drain` to ensure that the plugins are running while the node is
+* If you are stopping jobs on a node, you must stop tasks that claim
+  volumes before stopping the `node` or `monolith` plugin for those
+  volumes. You should run `node` or `monolith` plugins as
+  [`system`][system] jobs and use the `-ignore-system` flag on `nomad
+  node drain` to ensure that the plugins are running while the node is
being drained.

* Only one plugin instance of a given plugin ID and type (controller or node)
