CSI volume creation and provisioning drop mount capabilities #10644

Closed
sundbry opened this issue May 23, 2021 · 2 comments · Fixed by #10643

sundbry commented May 23, 2021

Nomad version

1.1.0

Operating system and Environment details

Issue

CSI VolumeCapability.MountVolume capabilities are not included in CSI CreateVolume and ControllerPublishVolume requests. When a CSI driver then receives the corresponding NodeStageVolume request (which does include the VolumeCapability.MountVolume data), it may reject the staging request because the volume was not provisioned with matching capabilities for the specified filesystem.

This may not be a problem for some CSI implementations. However, it violates the specification, because the mount/filesystem capabilities specified during CreateVolume should be a superset of those used during ControllerPublishVolume/NodeStageVolume:

message CreateVolumeRequest {
  ...
  // The capabilities that the provisioned volume MUST have. SP MUST
  // provision a volume that will satisfy ALL of the capabilities
  // specified in this list. Otherwise SP MUST return the appropriate
  // gRPC error code.
  // The Plugin MUST assume that the CO MAY use the provisioned volume
  // with ANY of the capabilities specified in this list.
  // For example, a CO MAY specify two volume capabilities: one with
  // access mode SINGLE_NODE_WRITER and another with access mode
  // MULTI_NODE_READER_ONLY. In this case, the SP MUST verify that the
  // provisioned volume can be used in either mode.
  // This also enables the CO to do early validation: If ANY of the
  // specified volume capabilities are not supported by the SP, the call
  // MUST return the appropriate gRPC error code.
  // This field is REQUIRED.
  repeated VolumeCapability volume_capabilities = 3;
}
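The superset requirement can be sketched as a small check. This is only an illustration under simplifying assumptions: `MountCapability`, `VolumeCapability`, and `is_covered` are hypothetical stand-ins written for this example (they are not the CSI protobuf classes or Nomad code), and "covered" is reduced to an exact match against one of the capabilities sent at creation time.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass(frozen=True)
class MountCapability:
    """Hypothetical stand-in for VolumeCapability.MountVolume."""
    fs_type: str = ""
    mount_flags: Tuple[str, ...] = ()


@dataclass(frozen=True)
class VolumeCapability:
    """Hypothetical stand-in for VolumeCapability (mount access type only)."""
    access_mode: str
    mount: MountCapability = field(default_factory=MountCapability)


def is_covered(requested: VolumeCapability, created: List[VolumeCapability]) -> bool:
    """Return True if `requested` matches a capability declared at CreateVolume.

    Simplification: an exact match is required; a real plugin may be more lenient.
    """
    return requested in created


# CreateVolume as sent by Nomad 1.1.0: the mount block is empty.
created_without_fs = [VolumeCapability("SINGLE_NODE_WRITER")]
# CreateVolume with the mount options carried through.
created_with_fs = [VolumeCapability("SINGLE_NODE_WRITER", MountCapability(fs_type="btrfs"))]
# NodeStageVolume: the mount block carries fs_type "btrfs".
staged = VolumeCapability("SINGLE_NODE_WRITER", MountCapability(fs_type="btrfs"))

print(is_covered(staged, created_without_fs))  # False
print(is_covered(staged, created_with_fs))     # True
```

With the empty mount block, the capability used at staging time is not among those the volume was created with, which is the mismatch described above.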

Reproduction steps

Run a CSI daemon with verbose enough logging to print RPCs received. Create a volume using terraform-provider-nomad v1.4.15:

resource "nomad_external_volume" "example_volume_0" {
  type            = "csi"
  plugin_id       = "ember-csi"
  volume_id       = "example[0]"
  name            = "example[0]"
  capacity_min    = "10GiB"
  capacity_max    = "20GiB"

  capability {
    access_mode     = "single-node-writer"
    attachment_mode = "file-system"
  }

  mount_options {
    fs_type = "btrfs"
  }
}

Run a job with the referenced volume:

job "example" {
    ...
    volume "data" {
      type = "csi"
      source = "example"
      attachment_mode = "file-system"
      access_mode = "single-node-writer"
      per_alloc = true
      mount_options {
        fs_type = "btrfs"
      }
    }
}
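Note that the job's `source = "example"` resolves to the volume `example[0]` created above because `per_alloc = true` appends an `[n]` suffix, where `n` is the allocation index. A minimal sketch of that naming rule follows; the helper name `per_alloc_volume_id` is made up for illustration and is not a Nomad function.

```python
def per_alloc_volume_id(source: str, alloc_index: int) -> str:
    """Build the per-allocation volume ID: the source plus an "[n]" suffix."""
    return f"{source}[{alloc_index}]"


# The first allocation (index 0) of the job uses the volume created earlier.
print(per_alloc_volume_id("example", 0))  # example[0]
```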

Nomad Server logs (if appropriate)

2021-05-23T05:38:39.525Z [ERROR] client.alloc_runner: prerun failed: alloc_id=77067299-1e5b-4e4e-516c-1d67e3da65e7 error="pre-run hook "csi_hook" failed: volume "bd6289a8-6ac0-4a96-a05a-526e1602ea5b" is already staged to "/var/nomad/client/csi/monolith/ember-csi/staging/example[0]/rw-file-system-single-node-writer" but with incompatible capabilities for this request: rpc error: code = AlreadyExists desc = Volume already published in that path with different capabilities"

Ember CSI Server logs

2021-05-23 05:36:33 default INFO ember_csi.common [req-5ec1c8e4-d986-4bac-8740-8b0d417d18c3] => GRPC CreateVolume example[0]
2021-05-23 05:36:33 default DEBUG ember_csi.common [req-5ec1c8e4-d986-4bac-8740-8b0d417d18c3] With params:
        name: "example[0]"
        capacity_range {
          required_bytes: 10737418240
          limit_bytes: 21474836480
        }
        volume_capabilities {
          mount {
          }
          access_mode {
            mode: SINGLE_NODE_WRITER
          }
        }
        accessibility_requirements {
        } dolog /usr/lib/python3.8/site-packages/ember_csi/common.py:127
2021-05-23 05:36:33 default INFO ember_csi.common [req-5ec1c8e4-d986-4bac-8740-8b0d417d18c3] <= GRPC CreateVolume (id = bd6289a8-6ac0-4a96-a05a-526e1602ea5b) served in 0s
2021-05-23 05:36:33 default DEBUG ember_csi.common [req-5ec1c8e4-d986-4bac-8740-8b0d417d18c3] Returns:
                volume {
                  capacity_bytes: 10737418240
                  volume_id: "bd6289a8-6ac0-4a96-a05a-526e1602ea5b"
                } dolog /usr/lib/python3.8/site-packages/ember_csi/common.py:150
2021-05-23 05:36:39 default INFO ember_csi.common [req-9b2caa0d-dc2c-44fd-882f-97023df9ac6b] => GRPC ControllerPublishVolume bd6289a8-6ac0-4a96-a05a-526e1602ea5b
2021-05-23 05:36:39 default DEBUG ember_csi.common [req-9b2caa0d-dc2c-44fd-882f-97023df9ac6b] With params:
        volume_id: "bd6289a8-6ac0-4a96-a05a-526e1602ea5b"
        node_id: "ember-csi.io.localhost"
        volume_capability {
          mount {
          }
          access_mode {
            mode: SINGLE_NODE_WRITER
          }
        } dolog /usr/lib/python3.8/site-packages/ember_csi/common.py:127
2021-05-23 05:36:39 default INFO ember_csi.common [req-9b2caa0d-dc2c-44fd-882f-97023df9ac6b] <= GRPC ControllerPublishVolume served in 0s
2021-05-23 05:36:39 default DEBUG ember_csi.common [req-9b2caa0d-dc2c-44fd-882f-97023df9ac6b] Returns:
        nothing dolog /usr/lib/python3.8/site-packages/ember_csi/common.py:150
2021-05-23 05:36:39 default INFO ember_csi.common [req-bef10ff8-3ec0-4469-a0f5-674f8150b8d4] => GRPC NodeStageVolume bd6289a8-6ac0-4a96-a05a-526e1602ea5b
2021-05-23 05:36:39 default DEBUG ember_csi.common [req-bef10ff8-3ec0-4469-a0f5-674f8150b8d4] With params:
        volume_id: "bd6289a8-6ac0-4a96-a05a-526e1602ea5b"
        staging_target_path: "/var/nomad/client/csi/monolith/ember-csi/staging/example[0]/rw-file-system-single-node-writer"
        volume_capability {
          mount {
            fs_type: "btrfs"
          }
          access_mode {
            mode: SINGLE_NODE_WRITER
          }
        } dolog /usr/lib/python3.8/site-packages/ember_csi/common.py:127
2021-05-23 05:36:39 default ERROR ember_csi.common [req-bef10ff8-3ec0-4469-a0f5-674f8150b8d4] !! GRPC NodeStageVolume failed in 0s with ALREADY_EXISTS (b'Volume already published in that path with different capabilities'): Exception
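For context, the CSI spec lets a plugin answer NodeStageVolume with ALREADY_EXISTS when the volume is already staged at the given path with an incompatible capability, while an identical repeat request must succeed. The sketch below illustrates that rule only; it is not Ember CSI's code, and `StagingTable`, `AlreadyExistsError`, and the `(access_mode, fs_type)` tuple are hypothetical simplifications.

```python
from typing import Dict, Tuple

# Hypothetical simplified capability: (access_mode, fs_type).
Capability = Tuple[str, str]


class AlreadyExistsError(Exception):
    """Stands in for the gRPC ALREADY_EXISTS status code."""


class StagingTable:
    """Remembers the capability each (volume_id, staging path) was staged with."""

    def __init__(self) -> None:
        self._staged: Dict[Tuple[str, str], Capability] = {}

    def stage(self, volume_id: str, path: str, capability: Capability) -> str:
        key = (volume_id, path)
        existing = self._staged.get(key)
        if existing is None:
            # First request for this volume and path: record it.
            self._staged[key] = capability
            return "staged"
        if existing == capability:
            # Repeating an identical request is an idempotent success.
            return "already staged"
        # Same volume and path, different capability: refuse the request.
        raise AlreadyExistsError(
            "Volume already published in that path with different capabilities"
        )
```

A plugin applying a rule like this would accept a repeated request with the same capability and refuse one whose `fs_type` differs from what it already recorded for that path.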

tgross commented May 24, 2021

I've merged your fix in #10643 and this will ship in the next patch release of Nomad.

tgross added this to the 1.1.1 milestone May 24, 2021

github-actions bot locked as resolved and limited conversation to collaborators Oct 19, 2022