0.11 csi plugin, volume, job cleanup problems #7743

Closed
Gurpartap opened this issue Apr 18, 2020 · 19 comments

@Gurpartap

Gurpartap commented Apr 18, 2020

stopping and purging the csi node and controller jobs does not remove the plugin.

restarting the client nomad agents makes nomad plugin status come back empty.

any stopped jobs that attempted to attach to these csi volumes (and failed, in my case) are also not removed from the ui after GC:

curl --request PUT http://127.0.0.1:4646/v1/system/gc
curl --request PUT http://127.0.0.1:4646/v1/system/reconcile/summaries
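
(for reference, the nomad CLI should expose the same endpoints as the curl calls above:)

nomad system gc
nomad system reconcile summaries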

[Screenshot 2020-04-18 at 05 37 35]


restarting the nomad server shows this in the logs (at this point the plugin doesn't exist – it was purged, and the clients were restarted in an attempt to clean up the zombie csi plugins):

2020-04-17T23:38:28.817Z [ERROR] nomad.fsm: DeleteJob failed: error="deleting job from plugin: plugin missing: csi-linode-node <nil>"
2020-04-17T23:38:28.817Z [ERROR] nomad.fsm: deregistering job failed: error="deleting job from plugin: plugin missing: csi-linode-node <nil>"
2020-04-17T23:38:28.819Z [ERROR] nomad.fsm: DeleteJob failed: error="deleting job from plugin: plugin missing: csi-linode-node <nil>"
2020-04-17T23:38:28.820Z [ERROR] nomad.fsm: deregistering job failed: error="deleting job from plugin: plugin missing: csi-linode-node <nil>"
2020-04-17T23:38:28.821Z [ERROR] nomad.fsm: DeleteJob failed: error="deleting job from plugin: plugin missing: csi-linode-node <nil>"
2020-04-17T23:38:28.821Z [ERROR] nomad.fsm: deregistering job failed: error="deleting job from plugin: plugin missing: csi-linode-node <nil>"
2020-04-17T23:38:29.101Z [ERROR] nomad.fsm: DeleteJob failed: error="deleting job from plugin: plugin missing: csi-linode-node <nil>"
2020-04-17T23:38:29.101Z [ERROR] nomad.fsm: deregistering job failed: error="deleting job from plugin: plugin missing: csi-linode-node <nil>"
2020-04-17T23:38:29.336Z [ERROR] nomad.fsm: DeleteJob failed: error="deleting job from plugin: plugin missing: csi-linode-node <nil>"
2020-04-17T23:38:29.336Z [ERROR] nomad.fsm: deregistering job failed: error="deleting job from plugin: plugin missing: csi-linode-node <nil>"
2020-04-17T23:38:29.345Z [ERROR] nomad.fsm: DeleteJob failed: error="deleting job from plugin: plugin missing: csi-linode-node <nil>"
2020-04-17T23:38:29.346Z [ERROR] nomad.fsm: deregistering job failed: error="deleting job from plugin: plugin missing: csi-linode-node <nil>"
2020-04-17T23:38:29.355Z [ERROR] nomad.fsm: DeleteJob failed: error="deleting job from plugin: plugin missing: csi-linode-node <nil>"
2020-04-17T23:38:29.356Z [ERROR] nomad.fsm: deregistering job failed: error="deleting job from plugin: plugin missing: csi-linode-node <nil>"
2020-04-17T23:38:29.368Z [ERROR] nomad.fsm: DeleteJob failed: error="deleting job from plugin: plugin missing: csi-linode-node <nil>"
2020-04-17T23:38:29.368Z [ERROR] nomad.fsm: deregistering job failed: error="deleting job from plugin: plugin missing: csi-linode-node <nil>"
2020-04-17T23:38:29.397Z [ERROR] nomad.fsm: CSIVolumeDeregister failed: error="volume not found: mydata"
2020-04-17T23:38:29.443Z [ERROR] nomad.fsm: CSIVolumeDeregister failed: error="volume not found: mydata"
2020-04-17T23:38:29.659Z [ERROR] nomad.fsm: CSIVolumeDeregister failed: error="volume not found: mysql"
2020-04-17T23:38:29.659Z [ERROR] nomad.fsm: CSIVolumeDeregister failed: error="volume not found: mydata"

csi-linode-node in these logs was my plugin id; however, the mysql volume was only ever created on the plugin with id csi-linode.

i don't yet have exact steps to reproduce, but i'm sure you'll easily hit these problems with the Linode CSI driver, which doesn't appear to fully work with nomad yet. try for yourself:


$ bat csi-linode-nodes.hcl
job "csi-linode-nodes" {
  datacenters = ["dc1"]
  type        = "system"

  group "nodes" {
    task "plugin" {
      driver = "docker"

      config {
        image = "linode/linode-blockstorage-csi-driver:v0.1.4"

        args = [
          "--endpoint=unix:///csi/csi.sock",
          "--token=YOUR_LINODE_TOKEN_HERE",
          "--url=https://api.linode.com/v4",
          "--node=${attr.unique.hostname}",
          # "--bs-prefix=nomad",
          "--v=2",
        ]

        privileged = true
      }

      csi_plugin {
        id        = "csi-linode"
        type      = "node"
        mount_dir = "/csi"
      }
    }
  }
}

$ bat csi-linode-controller.hcl
job "csi-linode-controller" {
  datacenters = ["dc1"]
  type        = "service"

  group "controllers" {
    count = 1

    update {
      max_parallel = 0
    }

    task "plugin" {
      driver = "docker"

      config {
        image = "linode/linode-blockstorage-csi-driver:v0.1.4"

        args = [
          "--endpoint=unix:///csi/csi.sock",
          "--token= YOUR_LINODE_TOKEN_HERE",
          "--url=https://api.linode.com/v4",
          "--node=${attr.unique.hostname}",
          # "--bs-prefix=nomad",
          "--v=2",
        ]
      }

      csi_plugin {
        id        = "csi-linode"
        type      = "controller"
        mount_dir = "/csi"
      }
    }
  }
}

create your volume at https://cloud.linode.com/volumes


$ bat volume.hcl
type            = "csi"
id              = "mysql"
name            = "mysql"
external_id     = "12345" # Your Linode Volume ID (integer value; not the label)
access_mode     = "single-node-writer"
attachment_mode = "file-system"
plugin_id       = "csi-linode"
//mount_options {
//  fs_type     = "ext4"
//  mount_flags = ["rw"]
//}

$ bat mysql.nomad
job "mysql-server" {
  datacenters = ["dc1"]
  type        = "service"

  group "mysql-server" {
    count = 1

    volume "mysql" {
      type      = "csi"
      read_only = false
      source    = "mysql"
    }

    restart {
      attempts = 10
      interval = "5m"
      delay    = "25s"
      mode     = "delay"
    }

    task "mysql-server" {
      driver = "docker"

      volume_mount {
        volume      = "mysql"
        destination = "/srv"
        read_only   = false
      }

      env = {
        "MYSQL_ROOT_PASSWORD" = "password"
      }

      config {
        image = "hashicorp/mysql-portworx-demo:latest"
        args = ["--datadir", "/srv/mysql"]

        port_map {
          db = 3306
        }
      }

      resources {
        cpu    = 500
        memory = 500

        network {
          port "db" {
            static = 3306
          }
        }
      }

      service {
        name = "mysql-server"
        port = "db"

        check {
          type     = "tcp"
          interval = "10s"
          timeout  = "2s"
        }
      }
    }
  }
}
@Gurpartap
Author

Gurpartap commented Apr 18, 2020

[Screenshot 2020-04-18 at 06 04 32]

@tgross
Member

tgross commented Apr 20, 2020

Hi @Gurpartap!

This error is showing that the plugin got registered with the "-node" suffix somehow:

deleting job from plugin: plugin missing: csi-linode-node <nil>"

If you never used csi-linode-node as a plugin ID, there are a few possible areas where this could be a bug, I think. I'll look into it further and report back.

In the meantime, if you want to clean up your cluster you could probably start the plugins on the client again and then deregister the volume cleanly. You'll always want to deregister volumes for a plugin before shutting down the last plugin allocation.
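
For illustration, a clean teardown in that order would look roughly like this (using the job and volume names from this thread; this is only a sketch, not something I've run against your cluster):

nomad job stop -purge mysql-server           # stop consumers of the volume first
nomad volume deregister mysql                # deregister volumes while the plugin is still running
nomad job stop -purge csi-linode-controller
nomad job stop -purge csi-linode-nodes       # stop the plugin allocations last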

@Gurpartap
Author

Hi @Gurpartap!

hi @tgross

This error is showing that the plugin got registered with the "-node" suffix somehow

i used csi-linode-node as the plugin id for a while. it's still lingering there.

if you want to clean up your cluster you could probably start the plugins on the client again and then deregister the volume cleanly.

i remember deregistering the volumes before stopping the csi nodes and controller, though. i also tried restarting the plugin with the same name to reattempt deregistering, and it didn't seem to work.

let me give it another go.

@Gurpartap
Author

So with csi_plugin { id = "csi-linode-node" … }:

> nomad job run csi-linode-nodes.nomad
> nomad job run csi-linode-controller.nomad

> nomad volume deregister mydata
Error deregistering volume: Unexpected response code: 500 (rpc error: volume not found: mydata)

> nomad volume deregister mysql
Error deregistering volume: Unexpected response code: 500 (rpc error: volume not found: mysql)

> nomad volume register volume.hcl 
Error registering volume: Unexpected response code: 500 (rpc error: rpc error: validate volume: rpc error: code = NotFound desc = Volume with id 1785357245 not found)

i don't know where this "Volume with id 1785357245" is coming from. my volume id is completely different.

sounds like another plugin issue, possibly something to do with linode driver integration https://github.com/linode/linode-blockstorage-csi-driver/blob/master/pkg/linode-bs/controllerserver.go#L184-L207

@tgross
Member

tgross commented Apr 21, 2020

You might be running into what we fixed in #7754. That's going into a 0.11.1 release of Nomad that should be shipping shortly.

@tgross
Member

tgross commented Apr 22, 2020

Hi @Gurpartap, we shipped 0.11.1 today. If you get a chance, give that a try and see if it clears the problem.

@Gurpartap
Author

Gurpartap commented Apr 23, 2020

hi @tgross. cleanup issues are still there. on the same dirty cluster:

  • started nodes+controller with plugin id csi-linode-node
  • volume register mysql-volume.hcl
  • attempted to allocate volume to mysql-server task (it didn't work)
  • stop -purge mysql-server
  • volume deregister mysql
  • stop -purge nodes+controller

repeated the same with csi-linode as plugin id, just in case.
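
(roughly, that sequence as commands, using the file names from my specs above:)

nomad job run csi-linode-nodes.nomad         # the node plugin job
nomad job run csi-linode-controller.nomad    # the controller plugin job
nomad volume register mysql-volume.hcl
nomad job run mysql-server.nomad             # the allocation failed to attach the volume
nomad job stop -purge mysql-server
nomad volume deregister mysql
nomad job stop -purge csi-linode-nodes
nomad job stop -purge csi-linode-controller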

still seeing this in logs when restarting one of the nomad servers. in fact, it has added one more log entry.

2020-04-23T03:07:20.437Z [ERROR] nomad.fsm: CSIVolumeDeregister failed: error="volume not found: mydata"
2020-04-23T03:07:20.437Z [ERROR] nomad.fsm: CSIVolumeDeregister failed: error="volume not found: mysql"
2020-04-23T03:07:20.437Z [ERROR] nomad.fsm: CSIVolumeDeregister failed: error="volume not found: mydata"
2020-04-23T03:07:20.438Z [ERROR] nomad.fsm: CSIVolumeDeregister failed: error="volume not found: mysql"
2020-04-23T03:07:20.634Z [ERROR] nomad.fsm: CSIVolumeDeregister failed: error="volume not found: mysql"

other issues observed:

  1. when attempting to allocate a seemingly successfully registered volume to a task, this shows up:
Event: Setup Failure
Description: failed to setup alloc: pre-run hook "csi_hook" failed: rpc error: code = Unknown desc = Invalid Linode Volume key: "64612"

the error seems to be originating from…

https://github.com/linode/linode-blockstorage-csi-driver/blob/98c7c76d88f5c5c931ddc4389e2ff6149508e375/pkg/linode-bs/nodeserver.go#L162-L179

…which requires (&csi.NodeStageVolumeRequest{}).VolumeID to have a hyphen: https://github.com/linode/linode-blockstorage-csi-driver/blob/e312ce00cf75ed243c4e545d73403c395ff1489d/pkg/common/idhelpers.go#L81-L84

this kind of seems like a question for the linode driver team.

  2. after purging the csi-linode-node plugin id and starting with plugin id csi-linode (perhaps not the perfect way to reproduce this), registering the volume gave this error:
nomad volume register mysql-volume.hcl
Error registering volume: Unexpected response code: 500 (rpc error: validate volume: plugin csi-linode for type csi-controller not found)

both node and controller plugins were running.

  3. registering the volume threw volume id validation errors:
Error registering volume: Unexpected response code: 500 (rpc error: validate volume: rpc error: code = NotFound desc = Volume with id 3989712264 not found)

it took a plugin restart before it magically began accepting the same register command.


it sounds like i'm making it increasingly difficult for nomad to comb through this state hell.

i'll be resetting the cluster data and trying afresh later.

@Gurpartap
Author

Gurpartap commented Apr 23, 2020

ignoring the cleanup issues, there has been some progress with getting csi working on linode:

  • plugin type=monolith is what worked instead of type=node.
  • plugin type=controller was not required.
  • figured out that the volume id must have something after the hyphen, so i tried appending the volume label and it appears to have worked. this suffix is apparently also used for path discovery, so the linode csi driver expects external_id = 64612-testvolume.

this worked all the way through on the first attempt. the stars must have aligned perfectly, because subsequent attempts are failing, possibly due to race conditions. i've noticed:

  • nomad did not detach volume from host when mysql-server job was purge stopped
  • rerunning mysql-server gave some errors about the volume already being mounted, even when i forced mysql-server to run on the host it was mounted on.
  • so i manually detached the volume from the host
  • re-ran mysql-server; it didn't seem to wait for the volume to get attached and immediately errored out with the following. (the volume did get attached to the host after a while, but the job did not discover it and reschedule)

Event: Setup Failure
Description: failed to setup alloc: pre-run hook "csi_hook" failed: rpc error: code = Internal desc = Unable to find device path out of attempted paths: [/dev/disk/by-id/linode-testvolume /dev/disk/by-id/scsi-0Linode_Volume_testvolume]

edit:

  • mysql-server job seems to have attempted a retry on the same host but errored out. testvolume was in fact attached to this client node by nomad.
Event: Setup Failure
Description: failed to setup alloc: pre-run hook "csi_hook" failed: claim volumes: rpc error: controller publish: attach volume: controller attach volume: rpc error: code = AlreadyExists desc = Volume with id 64612 already attached to node 1984206

@tgross
Member

tgross commented Apr 23, 2020

ignoring the cleanup issues, there has been some progress with getting csi working on linode:

  • plugin type=monolith is what worked instead of type=node.
  • plugin type=controller was not required.
  • figured that the volume id must have something after the hyphen. so i tried the volume label and it appears to have worked. this suffix is apparently also used for path discovery. so linode csi driver expects external_id = 64612-testvolume.

This is great. I'm trying to put together a collection of guidelines for various plugins so that we can publish example Nomad jobspecs (maybe for distribution with the plugin upstream if we can convince the authors to do so?). The work you're doing here will help out other Linode + Nomad users immensely, so thank you!


re-ran mysql-server and it didn't seem to have waited for volume to get attached and immediately errored out the following

When a task gets placed, we block on sending a publish RPC to the controller plugin, and only once that returns do we move on to the stage/publish. I'm wondering if we've missed something in the workflow specific to working with monolithic plugins.


nomad did not detach volume from host when mysql-server job was purge stopped

One of the things we're not doing well currently is dealing with a partial completion of volume detachment (e.g. the controller unpublish times out). In #7782 I'm introducing a checkpointing mechanism that should fix that situation.


mysql-server job seems to have attempted a retry on the same host but errored out. testvolume was in fact attached to this client node by nomad.
...
failed to setup alloc: pre-run hook "csi_hook" failed: claim volumes: rpc error: controller publish: attach volume: controller attach volume: rpc error: code = AlreadyExists desc = Volume with id 64612 already attached to node 1984206

The ControllerPublishVolume calls are supposed to be idempotent, but the interesting thing there is that the meaning of the error code AlreadyExists isn't as obvious as I'd have thought:

Indicates that a volume corresponding to the specified volume_id has already been published at the node corresponding to the specified node_id but is incompatible with the specified volume_capability or readonly flag.

Do you have the plugin's alloc logs from that time period? That might help debug what's going on.
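
If it helps, something like the following should pull them (the alloc IDs come from nomad plugin status, and "plugin" is the task name used in your jobspecs above):

nomad plugin status csi-linode              # lists the plugin's allocations
nomad alloc logs <alloc-id> plugin          # stdout of the plugin task
nomad alloc logs -stderr <alloc-id> plugin  # stderr of the plugin task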

@Gurpartap
Author

Gurpartap commented Apr 23, 2020

This is great. I'm trying to put together a collection of guidelines for various plugins so that we can publish example Nomad jobspecs (maybe for distribution with the plugin upstream if we can convince the authors to do so?). The work you're doing here will help out other Linode + Nomad users immensely, so thank you!

👍

Do you have the plugin's alloc logs from that time period? That might help debug what's going on.

i can attempt to reproduce the logs.

however, i discovered that the linode/linode-blockstorage-csi-driver:latest image is not, in fact, the latest. it was published a year ago. see https://hub.docker.com/r/linode/linode-blockstorage-csi-driver/tags

so, i'm switching to v0.1.4, and reattempting as follows:

$ bat csi-linode.nomad
job "csi-linode" {
  datacenters = ["dc1"]
  type        = "system"

  group "monolith" {
    task "plugin" {
      driver = "docker"

      config {
        image = "linode/linode-blockstorage-csi-driver:v0.1.4"
        args = [
          "--endpoint=unix:///csi/csi.sock",
          "--token=YOUR_TOKEN_HERE",
          "--url=https://api.linode.com/v4",
          "--node=${attr.unique.hostname}",
          # "--bs-prefix=nomad",
          "--v=2",
        ]
        privileged = true
      }

      csi_plugin {
        id        = "csi-linode"
        type      = "monolith"
        mount_dir = "/csi"
      }
    }
  }
}

$ bat volume.hcl
type            = "csi"
id              = "mysql"
name            = "mysql"
external_id     = "64612-testvolume"
access_mode     = "single-node-writer"
attachment_mode = "file-system"
plugin_id       = "csi-linode"
// mount_options {
//   fs_type     = "ext4"
//   mount_flags = ["rw"]
// }

$ bat mysql-server.nomad
job "mysql-server" {
  datacenters = ["dc1"]
  type        = "service"

  group "mysql-server" {
    volume "mysql" {
      type      = "csi"
      read_only = false
      source    = "mysql"
    }

    restart {
      attempts = 3
      delay    = "10s"
      interval = "1m"
      mode     = "delay"
    }

    task "mysql-server" {
      driver = "docker"

      volume_mount {
        volume      = "mysql"
        destination = "/srv"
        read_only   = false
      }

      env = {
        "MYSQL_ROOT_PASSWORD" = "password"
      }

      config {
        image = "hashicorp/mysql-portworx-demo:latest"
        args  = ["--datadir", "/srv/mysql"]

        port_map {
          db = 3306
        }
      }

      resources {
        cpu    = 400
        memory = 400

        network {
          port "db" {
            static = 3306
          }
        }
      }

      service {
        name = "mysql-server"
        port = "db"

        check {
          type     = "tcp"
          interval = "10s"
          timeout  = "2s"
        }
      }
    }
  }
}
> nomad job stop -purge csi-linode-node
No job(s) with prefix or id "csi-linode-node" found

> nomad job stop -purge csi-linode
No job(s) with prefix or id "csi-linode" found

> nomad plugin status
Container Storage Interface
ID        Provider                 Controllers Healthy/Expected  Nodes Healthy/Expected
csi-lino  linodebs.csi.linode.com  0/6                           0/7
csi-lino  linodebs.csi.linode.com  0/2                           0/7

> nomad run csi-linode.nomad
==> Monitoring evaluation "ca241869"
    Evaluation triggered by job "csi-linode"
    Allocation "f9cbaa71" created: node "62c6c739", group "monolith"
    Allocation "feb60e89" created: node "61e818d7", group "monolith"
    Allocation "177fe21f" created: node "6658fece", group "monolith"
    Allocation "c24f100c" created: node "f8de40d9", group "monolith"
    Allocation "e5589ab0" created: node "a1a74bfa", group "monolith"
    Evaluation status changed: "pending" -> "complete"
==> Evaluation "ca241869" finished with status "complete"

> nomad plugin status
Container Storage Interface
ID        Provider                 Controllers Healthy/Expected  Nodes Healthy/Expected
csi-lino  linodebs.csi.linode.com  5/6                           5/7
csi-lino  linodebs.csi.linode.com  0/2                           0/7

> nomad plugin status csi-linode
ID                   = csi-linode
Provider             = linodebs.csi.linode.com
Version              = v0.1.1-0-g7f047dd-dirty
Controllers Healthy  = 5
Controllers Expected = 6
Nodes Healthy        = 5
Nodes Expected       = 7

Allocations
ID        Node ID   Task Group  Version  Desired  Status   Created   Modified
f9cbaa71  62c6c739  monolith    0        run      running  1m5s ago  59s ago
feb60e89  61e818d7  monolith    0        run      running  1m5s ago  59s ago
e5589ab0  a1a74bfa  monolith    0        run      running  1m5s ago  54s ago
c24f100c  f8de40d9  monolith    0        run      running  1m5s ago  59s ago
177fe21f  6658fece  monolith    0        run      running  1m5s ago  59s ago

> nomad volume register volume.hcl
Error registering volume: Unexpected response code: 500 (rpc error: rpc error: validate volume: Volume validation failed, message: ValidateVolumeCapabilities is currently unimplemented for CSI v1.0.0)

this seems to be affected by the cleanup issue. notice the plugin status version (v0.1.1-0-g7f047dd-dirty), which differs from the version i get when i redeploy the plugin with csi_plugin { id = "take3-csi-linode" } as follows:

> nomad job run csi-linode.nomad
==> Monitoring evaluation "e147759d"
    Evaluation triggered by job "csi-linode"
    Allocation "4ff8c1ad" created: node "a1a74bfa", group "monolith"
    Allocation "a125ee0e" created: node "61e818d7", group "monolith"
    Allocation "cbe299de" created: node "6658fece", group "monolith"
    Allocation "0319ef75" created: node "62c6c739", group "monolith"
    Allocation "32f36774" created: node "f8de40d9", group "monolith"
    Evaluation status changed: "pending" -> "complete"
==> Evaluation "e147759d" finished with status "complete"

> nomad plugin status 
Container Storage Interface
ID        Provider                 Controllers Healthy/Expected  Nodes Healthy/Expected
csi-lino  linodebs.csi.linode.com  0/6                           0/7
csi-lino  linodebs.csi.linode.com  0/2                           0/7
take3-cs  linodebs.csi.linode.com  5/5                           5/5

> nomad plugin status take3
ID                   = take3-csi-linode
Provider             = linodebs.csi.linode.com
Version              = v0.1.3-11-g4ff5069-dirty
Controllers Healthy  = 5
Controllers Expected = 5
Nodes Healthy        = 5
Nodes Expected       = 5

Allocations
ID        Node ID   Task Group  Version  Desired  Status   Created    Modified
32f36774  f8de40d9  monolith    0        run      running  8m47s ago  8m41s ago
cbe299de  6658fece  monolith    0        run      running  8m47s ago  8m41s ago
a125ee0e  61e818d7  monolith    0        run      running  8m47s ago  8m41s ago
0319ef75  62c6c739  monolith    0        run      running  8m47s ago  8m41s ago
4ff8c1ad  a1a74bfa  monolith    0        run      running  8m47s ago  8m39s ago

looks like the fresh plugin registration picked up a recent version (v0.1.3-11-g4ff5069-dirty). perhaps this is the internal version reported by linode's v0.1.4 driver image. continuing…

> nomad volume register 

no error this time. but it is using a different plugin id! or is it actually using the take3 id behind the scenes and incorrectly showing csi-linode here?

> nomad volume status
Container Storage Interface
ID     Name   Plugin ID   Schedulable  Access Mode
mysql  mysql  csi-linode  false        single-node-writer

> nomad volume status mysql
ID                   = mysql
Name                 = mysql
External ID          = 64612-testvolume
Plugin ID            = csi-linode
Provider             = linodebs.csi.linode.com
Version              = v0.1.1-0-g7f047dd-dirty
Schedulable          = false
Controllers Healthy  = 0
Controllers Expected = 6
Nodes Healthy        = 0
Nodes Expected       = 7
Access Mode          = single-node-writer
Attachment Mode      = file-system
Mount Options        = <none>
Namespace            = default

Allocations
No allocations placed

> nomad job run mysql-server.nomad
==> Monitoring evaluation "ba3c4873"
    Evaluation triggered by job "mysql-server"
    Evaluation within deployment: "6fb3e5e2"
    Evaluation status changed: "pending" -> "complete"
==> Evaluation "ba3c4873" finished with status "complete" but failed to place all allocations:
    Task Group "mysql-server" (failed to place 1 allocation):
      * Constraint "CSI plugin csi-linode is unhealthy on client f8de40d9-d5cf-aa83-4937-2d1b31de02ef": 1 nodes excluded by filter
      * Constraint "CSI plugin csi-linode is unhealthy on client 6658fece-2465-be3b-087c-9723aa83b5a8": 1 nodes excluded by filter
      * Constraint "CSI plugin csi-linode is unhealthy on client e18c8ff9-6598-dde9-3c9c-774340987486": 1 nodes excluded by filter
      * Constraint "CSI plugin csi-linode is unhealthy on client fd39878a-0ac0-cf8a-f82c-c966d290eb9c": 1 nodes excluded by filter
      * Constraint "CSI plugin csi-linode is unhealthy on client 62c6c739-7245-6121-4571-9e18baac5025": 1 nodes excluded by filter
      * Constraint "CSI plugin csi-linode is unhealthy on client a1a74bfa-7096-a7e7-079c-83ba889214a3": 1 nodes excluded by filter
      * Constraint "CSI plugin csi-linode is unhealthy on client 61e818d7-b123-06f7-e1dc-f7f2b75cf9ff": 1 nodes excluded by filter
    Evaluation "b82337bc" waiting for additional capacity to place remainder

🧟‍♂️

@Gurpartap
Author

i missed changing the plugin id in volume.hcl to take3-csi-linode. nomad should not have allowed registering the volume, but i attribute this to the cleanup issue.

switching to plugin_id=take3-csi-linode in volume.hcl takes us back to the volume registration error also reported before:

Error registering volume: Unexpected response code: 500 (rpc error: validate volume: Volume validation failed, message: ValidateVolumeCapabilities is currently unimplemented for CSI v1.0.0)


there are several issues here, each interfering with the other. it is difficult to focus on a particular one. what can i do to get rid of the zombie plugins and volumes? registering and deregistering with the same names/ids has not helped so far.

@tgross
Member

tgross commented Apr 29, 2020

however, i discovered that the linode/linode-blockstorage-csi-driver:latest image is not, in fact, the latest. it was published a year ago. see https://hub.docker.com/r/linode/linode-blockstorage-csi-driver/tags
so, i'm switching to v0.1.4, and reattempting as follows:

It looks like their CI system pushed tags for 0.1.5 and 0.1.6 in the last 24h as well. You might want to try staying as current as possible, as the plugin is probably fairly experimental.

this seems to be affected by the cleanup issue. notice the plugin status version (v0.1.1-0-g7f047dd-dirty), which differs from the version i get when i redeploy the plugin with csi_plugin { id=take3-csi-linode } as follows:

We should be updating the plugin version based on fingerprinting, so if there are different versions for the same plugin ID at the same time, they may be overwriting each other. That's not a problem when plugins replace each other properly, but with the "zombie" plugin issue it'll show up.


The error message you're receiving:

Error registering volume: Unexpected response code: 500 (rpc error: validate volume: Volume validation failed, message: ValidateVolumeCapabilities is currently unimplemented for CSI v1.0.0)

is coming from the plugin in controllerserver.go#L325. ValidateVolumeCapabilities is a required RPC in the CSI spec, but from reading their code, they're not setting the Confirmed field, so that's being treated as an error by Nomad.

The Nomad client code (ref plugin/csi/client.go#L321):

	if resp.Confirmed == nil {
		if resp.Message != "" {
			return fmt.Errorf("Volume validation failed, message: %s", resp.Message)
		}

		return fmt.Errorf("Volume validation failed")
	}

We may need to call GetConfirmed() there instead of Confirmed but I'm fairly certain the results are the same either way. I'll see if I can verify that, but I may need to open an issue with the good folks over at Linode if so. Even if they just set a dummy value that would be better. (It's unclear to me how k8s treats this.)


there are several issues here, each interfering with the other. it is difficult to focus on a particular one. what can i do to get rid of the zombie plugins and volumes? registering and deregistering with the same names/ids has not helped so far.

Sorry about that. I have #7825 and #7817 open to try to improve the situation there. If you can't wipe the state store of your test cluster and start over then keep an eye on the master branch of Nomad over the next week as some improvements land.

In the meantime as a workaround I would strongly suggest trying to use a different plugin ID for the new version you're using. If you use non-overlapping IDs for everything so that the "zombie" plugins and volumes have different IDs, then there shouldn't be any interference with the actual operation... just log noise.
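
As a sketch of that workaround (the "csi-linode-take4" ID below is just a hypothetical fresh ID; set it in the csi_plugin block of the plugin job and in the plugin_id of the volume spec before running these):

nomad job run csi-linode.nomad          # csi_plugin { id = "csi-linode-take4" ... }
nomad volume register volume.hcl        # plugin_id = "csi-linode-take4"
nomad plugin status csi-linode-take4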

@tgross
Member

tgross commented Apr 29, 2020

After a pass through the k8s code and the spec, I'm seeing that we are incorrectly validating this response. The spec says:

Confirmed indicates to the CO the set of capabilities that the
plugin has validated. This field SHALL only be set to a non-empty
value for successful validation responses.

Which means that if the plugin has validated the capabilities, we should check that they match what we expect; but if the plugin doesn't validate them, that's not actually an error condition. It just means the plugin doesn't care to give us a response. I might not have written the spec that way, but it's definitely a bug in Nomad. Should be a straightforward fix.

@tgross
Member

tgross commented Apr 29, 2020

I've opened #7831 with a patch to fix the validation.

@tgross
Member

tgross commented Apr 30, 2020

I'm working on PR #7844, which should clear up the plugin cleanup. I need to check a few more things out, but I'm making good progress on it.

The master branch now has the patch for the volume validation. I don't have a Linode setup handy at the moment but if you don't get a chance to try it, I'll take a crack at getting it tested myself.

@tgross
Member

tgross commented May 5, 2020

#7844 has been merged as well, so that should fix some of the plugin cleanup issues. #7825 is next on my list so that'll finish this issue up.

@tgross tgross self-assigned this May 5, 2020
@tgross
Member

tgross commented May 11, 2020

I've wrapped up #7825 for periodic plugin and volume claim GC, and that'll ship in 0.11.2.

@tgross
Member

tgross commented May 13, 2020

Closed via #7831, #7844, and #7825, all shipping in 0.11.2 shortly.

@github-actions

github-actions bot commented Nov 7, 2022

I'm going to lock this issue because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active issues.
If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Nov 7, 2022