CSI: implement support for topology #12129

Merged (9 commits) on Mar 1, 2022

tgross (Member) commented Feb 24, 2022

Fixes #7669
Fixes #10891
Fixes #11778

Allows the user to create CSI volumes within specific locations using opaque tags ("segments") specific to the storage provider. Updates the feasibility checker for CSI volumes so that allocations are only placed on nodes that meet the topology requirements.


Smoke test with an AWS EBS volume created in zone us-east-1a but where all the node plugins are running in us-east-1b:

volume.hcl
id        = "ebs-vol[0]"
name      = "357cb1a0-2a11-400e-8f88-545d9d172039" # CSIVolumeName tag, must be idempotent
type      = "csi"
plugin_id = "aws-ebs0"

capacity_min = "10GiB"
capacity_max = "20GiB"

capability {
  access_mode     = "single-node-writer"
  attachment_mode = "file-system"
}

capability {
  access_mode     = "single-node-writer"
  attachment_mode = "block-device"
}

parameters {
  type = "gp2"
}

topology_request {
  required {
    topology {
      segments {
        "topology.ebs.csi.aws.com/zone" = "us-east-1a"
      }
    }
  }
}
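For readers who think in terms of data rather than HCL, the same request can be pictured as a small nested structure. This is a minimal sketch only: the names `Topology`, `TopologyRequest`, `Required`, `Preferred`, and `ebsZoneRequest` are illustrative assumptions modeled on the block above and on the required/preferred distinction mentioned in review; they are not a confirmed copy of Nomad's API structs or its JSON encoding.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Topology is a hypothetical stand-in for one set of provider-specific
// segments, e.g. a single availability zone.
type Topology struct {
	Segments map[string]string
}

// TopologyRequest is a hypothetical mirror of the topology_request block.
// Preferred is omitted from the encoded output when it is empty.
type TopologyRequest struct {
	Required  []Topology
	Preferred []Topology `json:",omitempty"`
}

// ebsZoneRequest builds a request that requires a single EBS zone, using
// the segment key shown in volume.hcl.
func ebsZoneRequest(zone string) TopologyRequest {
	return TopologyRequest{
		Required: []Topology{
			{Segments: map[string]string{"topology.ebs.csi.aws.com/zone": zone}},
		},
	}
}

func main() {
	out, err := json.MarshalIndent(ebsZoneRequest("us-east-1a"), "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

The segment keys and values stay opaque to the caller: the sketch passes them through as plain strings, which is the property the description above relies on.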
nomad plugin status -verbose
$ nomad plugin status -verbose aws-ebs0
ID                   = aws-ebs0
Provider             = ebs.csi.aws.com
Version              = v1.5.1
Controllers Healthy  = 2
Controllers Expected = 2
Nodes Healthy        = 2
Nodes Expected       = 2

Controller Capabilities
  CREATE_DELETE_VOLUME
  CONTROLLER_ATTACH_DETACH
  LIST_VOLUMES
  GET_CAPACITY
  CREATE_DELETE_SNAPSHOT
  CREATE_LIST_SNAPSHOTS
  CLONE_VOLUME
  ATTACH_READONLY
  EXPAND_VOLUME
  LIST_VOLUMES_PUBLISHED_NODES
  VOLUME_CONDITION
  GET_VOLUME

Node Capabilities
  VOLUME_ACCESSIBILITY_CONSTRAINTS
  STAGE_UNSTAGE_VOLUME
  GET_VOLUME_STATS
  EXPAND_VOLUME
  VOLUME_CONDITION

Accessible Topologies
Node ID   Accessible Topology
69592911  topology.ebs.csi.aws.com/zone=us-east-1b
ad4bd49a  topology.ebs.csi.aws.com/zone=us-east-1b

Allocations
ID                                    Eval ID                               Node ID                               Node Name         Task Group  Version  Desired  Status   Created                    Modified
44742635-11c2-59e1-1f6f-0ad038467e09  b631e682-6bfa-48ae-a44a-0af23bea65fe  69592911-b587-b65d-8172-d657611fff2d  ip-172-31-95-75   controller  0        run      running  2022-02-25T14:45:41-05:00  2022-02-25T14:46:07-05:00
6efa16ba-c958-0ebd-57d6-f8c7b7673de1  b631e682-6bfa-48ae-a44a-0af23bea65fe  ad4bd49a-b648-cc53-03de-f0ab959134c6  ip-172-31-93-121  controller  0        run      running  2022-02-25T14:45:41-05:00  2022-02-25T14:46:02-05:00
087b518c-e07c-9c15-cefb-a5bc086e4294  f374bcb9-baa1-0800-39e3-cf53fec13793  69592911-b587-b65d-8172-d657611fff2d  ip-172-31-95-75   nodes       0        run      running  2022-02-25T14:45:46-05:00  2022-02-25T14:46:06-05:00
3ab67dc4-8557-6303-6509-e8afde76493b  f374bcb9-baa1-0800-39e3-cf53fec13793  ad4bd49a-b648-cc53-03de-f0ab959134c6  ip-172-31-93-121  nodes       0        run      running  2022-02-25T14:45:46-05:00  2022-02-25T14:46:02-05:00

nomad volume status ebs-vol[0]
$ nomad volume status 'ebs-vol[0]'
ID                   = ebs-vol[0]
Name                 = 357cb1a0-2a11-400e-8f88-545d9d172039
External ID          = vol-01147e1b775482e98
Plugin ID            = aws-ebs0
Provider             = ebs.csi.aws.com
Version              = v1.5.1
Schedulable          = true
Controllers Healthy  = 2
Controllers Expected = 2
Nodes Healthy        = 2
Nodes Expected       = 2
Access Mode          = <none>
Attachment Mode      = <none>
Mount Options        = <none>
Namespace            = default

Topologies
Topology  Segments
00        topology.ebs.csi.aws.com/zone=us-east-1a

Allocations
No allocations placed
$ nomad job run ./use-ebs-volume.nomad
==> 2022-02-25T14:49:19-05:00: Monitoring evaluation "dac24a4e"
    2022-02-25T14:49:19-05:00: Evaluation triggered by job "use-ebs-volume"
    2022-02-25T14:49:19-05:00: Evaluation within deployment: "339f31af"
    2022-02-25T14:49:19-05:00: Evaluation status changed: "pending" -> "complete"
==> 2022-02-25T14:49:19-05:00: Evaluation "dac24a4e" finished with status "complete" but failed to place all allocations:
    2022-02-25T14:49:19-05:00: Task Group "group" (failed to place 1 allocation):
      * No nodes are available in datacenter "dc2"
      * Constraint "did not meet topology requirement": 2 nodes excluded by filter
    2022-02-25T14:49:19-05:00: Evaluation "1bbd1a67" waiting for additional capacity to place remainder
==> 2022-02-25T14:49:19-05:00: Monitoring deployment "339f31af"
  ⠇ Deployment "339f31af" in progress...

    2022-02-25T14:49:19-05:00
    ID          = 339f31af
    Job ID      = use-ebs-volume
    Job Version = 0
    Status      = running
    Description = Deployment is running

    Deployed
    Task Group  Desired  Placed  Healthy  Unhealthy  Progress Deadline
    group       1        0       0        0          N/A^C

And then switching to a volume in the right AZ works just fine, as expected.

tgross added this to the 1.3.0 milestone on Feb 24, 2022
tgross self-assigned this on Feb 24, 2022
tgross marked this pull request as ready for review on February 25, 2022 21:23
tgross (Member, Author) commented Feb 25, 2022

I'm not sure what's going on in https://app.circleci.com/pipelines/github/hashicorp/nomad/20207/workflows/694b6571-3edf-44f1-bcff-5ad97409f63b/jobs/207214 but that looks like boltdd stuff and unrelated to this PR. Will investigate on Monday.

tgross (Member, Author) commented Feb 28, 2022

I've opened #12143 to work around the test issue for now, and re-ran the tests here successfully.

lgfa29 (Contributor) left a comment

LGTM, nice to see this implemented, and the sample CLI outputs were very helpful.

Just missing a CHANGELOG entry 👍

// requisite topologies."
if len(vol.Topologies) > 0 {
	var ok bool
	for _, requiredTopo := range vol.Topologies {

Contributor

Just checking my understanding of this feature: at this point the volume is already registered, so we don't need to worry about preferred vs. required. The plugin that handled the registration resolved the supported topologies and we stored them in the state store, so any match here would match the user's request.

tgross (Member, Author) commented Mar 1, 2022

Exactly!

(Which is really good for us, because otherwise we'd have had to implement a new scoring iterator too 😀 )
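The matching rule discussed in this thread can be sketched as follows. This is an illustration under stated assumptions, not the code from this PR: `Topology`, `Volume`, `Node`, `nodeMeetsTopology`, and `countExcluded` are hypothetical names, and comparing segment maps for exact equality is an assumption about how a match is decided. What it shows is that once the volume's topologies are stored, the check is a pass/fail filter: a node qualifies if it matches any one of them, so nothing needs to be ranked at scheduling time.

```go
package main

import "fmt"

// Topology is a hypothetical stand-in for one set of opaque segments.
type Topology struct {
	Segments map[string]string
}

// equal reports whether both topologies carry exactly the same segments.
// Exact equality is an assumption made for this sketch.
func (t Topology) equal(o Topology) bool {
	if len(t.Segments) != len(o.Segments) {
		return false
	}
	for k, v := range t.Segments {
		if ov, ok := o.Segments[k]; !ok || ov != v {
			return false
		}
	}
	return true
}

// Volume holds the topologies stored for the volume (hypothetical type).
type Volume struct {
	Topologies []Topology
}

// Node holds the topology reported for the node, if any (hypothetical type).
type Node struct {
	ID       string
	Topology *Topology
}

// nodeMeetsTopology passes a node when the volume has no topologies at all,
// or when the node's topology equals any one of the volume's topologies.
func nodeMeetsTopology(n Node, vol Volume) bool {
	if len(vol.Topologies) == 0 {
		return true
	}
	if n.Topology == nil {
		return false
	}
	for _, requiredTopo := range vol.Topologies {
		if requiredTopo.equal(*n.Topology) {
			return true
		}
	}
	return false
}

// countExcluded returns how many nodes the filter would reject.
func countExcluded(nodes []Node, vol Volume) int {
	excluded := 0
	for _, n := range nodes {
		if !nodeMeetsTopology(n, vol) {
			excluded++
		}
	}
	return excluded
}

// zone builds a single-segment EBS zone topology.
func zone(z string) *Topology {
	return &Topology{Segments: map[string]string{"topology.ebs.csi.aws.com/zone": z}}
}

func main() {
	// Node IDs and zones taken from the smoke test in the PR description.
	nodes := []Node{
		{ID: "69592911", Topology: zone("us-east-1b")},
		{ID: "ad4bd49a", Topology: zone("us-east-1b")},
	}
	wrongAZ := Volume{Topologies: []Topology{*zone("us-east-1a")}}
	rightAZ := Volume{Topologies: []Topology{*zone("us-east-1b")}}

	fmt.Println(countExcluded(nodes, wrongAZ)) // prints 2
	fmt.Println(countExcluded(nodes, rightAZ)) // prints 0
}
```

With the smoke-test data, both nodes are excluded for the us-east-1a volume, which lines up with the "2 nodes excluded by filter" message in the evaluation output, while a volume in us-east-1b excludes none.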

volume-0.json (outdated review thread, resolved)
shoenig (Member) left a comment

LGTM! just trivial things

command/volume_register_csi.go (review thread resolved)
e2e/csi/csi.go (outdated review thread, resolved)
command/plugin_status_csi.go (outdated review thread, resolved)
github-actions commented

I'm going to lock this pull request because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active contributions.
If you have found a problem that seems related to this change, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

github-actions bot locked this as resolved and limited conversation to collaborators on Oct 27, 2022