csi-rbdplugin not picking up mapOptions properly #3076

Closed
jpmartin2 opened this issue May 3, 2022 · 6 comments · Fixed by #3080

@jpmartin2

jpmartin2 commented May 3, 2022

Describe the bug

I have a volume that is failing to mount. I believe this is because network encryption is enabled on the cluster and the rbdplugin is not passing mapOptions through to the rbd command (which should set ms_mode=secure).

Environment details

  • Image/version of Ceph CSI driver : 3.6.1
  • Helm chart version : N/A (ceph cluster setup with rook v1.9.2)
  • Kernel version : 5.16.0 (from the Debian 11 backports repo)
  • Mounter used for mounting PVC (for CephFS it's fuse or kernel; for RBD it's krbd or rbd-nbd) : krbd
  • Kubernetes cluster version : 1.22
  • Ceph cluster version : 17.2

Steps to reproduce

Steps to reproduce the behavior:

  1. Performed a fresh install of k3s on a Debian 11 system with kernel 5.16.0 from bullseye-backports
  2. Installed the Rook Helm chart v1.9.2
  3. Created the Ceph cluster, block pool, and StorageClass using the following manifests:
apiVersion: ceph.rook.io/v1
kind: CephCluster
metadata:
  name: cluster
  namespace: rook-ceph
spec:
  cephVersion:
    allowUnsupported: false
    image: quay.io/ceph/ceph:v17.2.0
  cleanupPolicy:
    allowUninstallWithVolumes: false
    confirmation: ""
    sanitizeDisks:
      dataSource: zero
      iteration: 1
      method: quick
  continueUpgradeAfterChecksEvenIfNotHealthy: false
  crashCollector:
    disable: false
  dashboard:
    enabled: true
    ssl: true
  dataDirHostPath: /var/lib/rook
  disruptionManagement:
    manageMachineDisruptionBudgets: false
    managePodBudgets: true
    osdMaintenanceTimeout: 30
    pgHealthCheckTimeout: 0
  healthCheck:
    daemonHealth:
      mon:
        disabled: false
        interval: 45s
      osd:
        disabled: false
        interval: 60s
      status:
        disabled: false
        interval: 60s
    livenessProbe:
      mon:
        disabled: false
      osd:
        disabled: false
      status:
        disabled: false
    startupProbe:
      mon:
        disabled: false
      osd:
        disabled: false
      status:
        disabled: false
  mgr:
    allowMultiplePerNode: true
    count: 1
    modules:
      - enabled: true
        name: pg_autoscaler
  mon:
    allowMultiplePerNode: true
    count: 1
  monitoring:
    enabled: false
  network:
    connections:
      compression:
        enabled: false
      encryption:
        enabled: true
  priorityClassNames:
    mgr: system-cluster-critical
    mon: system-node-critical
    osd: system-node-critical
  removeOSDsIfOutAndSafeToRemove: false
  skipUpgradeChecks: false
  storage:
    nodes:
      - devices:
          - name: /dev/disk/by-id/<disk 1>
          - name: /dev/disk/by-id/<disk 2>
        name: nas
    useAllDevices: false
    useAllNodes: false
  waitTimeoutForHealthyOSDInMinutes: 10
---
apiVersion: ceph.rook.io/v1
kind: CephBlockPool
metadata:
  name: ceph-rbd-pool
  namespace: rook-ceph
spec:
  failureDomain: osd
  replicated:
    requireSafeReplicaSize: false
    size: 2
---
apiVersion: storage.k8s.io/v1
kind: StorageClass
allowVolumeExpansion: true
metadata:
  name: ceph-block
  namespace: default
parameters:
  clusterID: rook-ceph
  csi.storage.k8s.io/controller-expand-secret-name: rook-csi-rbd-provisioner
  csi.storage.k8s.io/controller-expand-secret-namespace: rook-ceph
  csi.storage.k8s.io/fstype: ext4
  csi.storage.k8s.io/node-stage-secret-name: rook-csi-rbd-node
  csi.storage.k8s.io/node-stage-secret-namespace: rook-ceph
  csi.storage.k8s.io/provisioner-secret-name: rook-csi-rbd-provisioner
  csi.storage.k8s.io/provisioner-secret-namespace: rook-ceph
  mapOptions: ms_mode=secure
  imageFeatures: layering
  imageFormat: "2"
  pool: ceph-rbd-pool
provisioner: rook-ceph.rbd.csi.ceph.com
volumeBindingMode: WaitForFirstConsumer
reclaimPolicy: Retain
  4. Attempted to use dynamic provisioning to create and mount a volume for a StatefulSet, e.g.:
apiVersion: v1
kind: Namespace
metadata:
  name: gitlab
  namespace: gitlab
---
apiVersion: v1
kind: Service
metadata:
  name: gitlab-headless
  namespace: gitlab
spec:
  clusterIP: None
  externalIPs: []
  ports:
    - port: 8443
  selector:
    cdk8s.statefulset: gitlab-stateful-set-c803a0a9
  type: ClusterIP
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: gitlab
  namespace: gitlab
spec:
  podManagementPolicy: OrderedReady
  replicas: 1
  selector:
    matchLabels:
      cdk8s.statefulset: gitlab-stateful-set-c803a0a9
  serviceName: gitlab-headless
  template:
    metadata:
      labels:
        cdk8s.statefulset: gitlab-stateful-set-c803a0a9
    spec:
      containers:
        - env: []
          image: docker.io/gitlab/gitlab-ce
          imagePullPolicy: Always
          name: main
          ports: []
          securityContext:
            privileged: false
            readOnlyRootFilesystem: false
            runAsNonRoot: false
          volumeMounts:
            - mountPath: /var/log/gitlab
              name: gitlab-persistence
              subPath: logs
            - mountPath: /var/opt/gitlab
              name: gitlab-persistence
              subPath: data
      hostAliases: []
      initContainers: []
      securityContext:
        fsGroupChangePolicy: Always
        runAsNonRoot: false
        sysctls: []
  updateStrategy:
    rollingUpdate:
      partition: 0
    type: RollingUpdate
  volumeClaimTemplates:
    - metadata:
        name: gitlab-persistence
      spec:
        accessModes:
          - ReadWriteOnce
        resources:
          requests:
            storage: 50Gi
        storageClassName: ceph-block

Actual results

The PersistentVolume was created, and has mapOptions set as expected in VolumeAttributes:

$ kubectl describe pv 
Name:            pvc-8b9d8010-3367-46ac-984d-f99d178f944c
Labels:          <none>
Annotations:     pv.kubernetes.io/provisioned-by: rook-ceph.rbd.csi.ceph.com
Finalizers:      [kubernetes.io/pv-protection]
StorageClass:    ceph-block
Status:          Bound
Claim:           gitlab/gitlab-persistence-gitlab-0
Reclaim Policy:  Retain
Access Modes:    RWO
VolumeMode:      Filesystem
Capacity:        50Gi
Node Affinity:   <none>
Message:         
Source:
    Type:              CSI (a Container Storage Interface (CSI) volume source)
    Driver:            rook-ceph.rbd.csi.ceph.com
    FSType:            ext4
    VolumeHandle:      0001-0009-rook-ceph-0000000000000001-15b866e8-ca95-11ec-bbe3-22919942180f
    ReadOnly:          false
    VolumeAttributes:      clusterID=rook-ceph
                           imageFeatures=layering
                           imageFormat=2
                           imageName=csi-vol-15b866e8-ca95-11ec-bbe3-22919942180f
                           journalPool=ceph-rbd-pool
                           mapOptions=ms_mode=secure
                           pool=ceph-rbd-pool
                           storage.kubernetes.io/csiProvisionerIdentity=1651550006515-8081-rook-ceph.rbd.csi.ceph.com
Events:                <none>

The PersistentVolumeClaim is created and bound to that PersistentVolume:

$ kubectl describe pvc -n gitlab 
Name:          gitlab-persistence-gitlab-0
Namespace:     gitlab
StorageClass:  ceph-block
Status:        Bound
Volume:        pvc-8b9d8010-3367-46ac-984d-f99d178f944c
Labels:        cdk8s.statefulset=gitlab-stateful-set-c803a0a9
Annotations:   pv.kubernetes.io/bind-completed: yes
               pv.kubernetes.io/bound-by-controller: yes
               volume.beta.kubernetes.io/storage-provisioner: rook-ceph.rbd.csi.ceph.com
               volume.kubernetes.io/selected-node: nas
Finalizers:    [kubernetes.io/pvc-protection]
Capacity:      50Gi
Access Modes:  RWO
VolumeMode:    Filesystem
Used By:       gitlab-0
Events:
  Type    Reason                 Age                From                                                                                                       Message
  ----    ------                 ----               ----                                                                                                       -------
  Normal  WaitForFirstConsumer   33m                persistentvolume-controller                                                                                waiting for first consumer to be created before binding
  Normal  ExternalProvisioning   33m (x2 over 33m)  persistentvolume-controller                                                                                waiting for a volume to be created, either by external provisioner "rook-ceph.rbd.csi.ceph.com" or manually created by system administrator
  Normal  Provisioning           33m                rook-ceph.rbd.csi.ceph.com_csi-rbdplugin-provisioner-d749f95f4-jjc8x_449ebe0c-8544-4d1a-81c8-fb11bf609617  External provisioner is provisioning volume for claim "gitlab/gitlab-persistence-gitlab-0"
  Normal  ProvisioningSucceeded  33m                rook-ceph.rbd.csi.ceph.com_csi-rbdplugin-provisioner-d749f95f4-jjc8x_449ebe0c-8544-4d1a-81c8-fb11bf609617  Successfully provisioned volume pvc-8b9d8010-3367-46ac-984d-f99d178f944c

However, the volume fails to mount to the pod:

$ kubectl describe pods -n gitlab
Name:           gitlab-0
Namespace:      gitlab
Priority:       0
Node:           nas/192.168.1.175
Start Time:     Mon, 02 May 2022 23:57:01 -0400
Labels:         cdk8s.statefulset=gitlab-stateful-set-c803a0a9
                controller-revision-hash=gitlab-6f64dd98c6
                statefulset.kubernetes.io/pod-name=gitlab-0
Annotations:    <none>
Status:         Pending
IP:             
IPs:            <none>
Controlled By:  StatefulSet/gitlab
Containers:
  main:
    Container ID:   
    Image:          docker.io/gitlab/gitlab-ce
    Image ID:       
    Port:           <none>
    Host Port:      <none>
    State:          Waiting
      Reason:       ContainerCreating
    Ready:          False
    Restart Count:  0
    Environment:    <none>
    Mounts:
      /var/log/gitlab from gitlab-persistence (rw,path="logs")
      /var/opt/gitlab from gitlab-persistence (rw,path="data")
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-rpjwv (ro)
Conditions:
  Type              Status
  Initialized       True 
  Ready             False 
  ContainersReady   False 
  PodScheduled      True 
Volumes:
  gitlab-persistence:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  gitlab-persistence-gitlab-0
    ReadOnly:   false
  kube-api-access-rpjwv:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   BestEffort
Node-Selectors:              <none>
Tolerations:                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason                  Age                   From                     Message
  ----     ------                  ----                  ----                     -------
  Normal   Scheduled               34m                   default-scheduler        Successfully assigned gitlab/gitlab-0 to nas
  Normal   SuccessfulAttachVolume  34m                   attachdetach-controller  AttachVolume.Attach succeeded for volume "pvc-8b9d8010-3367-46ac-984d-f99d178f944c"
  Warning  FailedMount             12m (x10 over 32m)    kubelet                  Unable to attach or mount volumes: unmounted volumes=[gitlab-persistence], unattached volumes=[gitlab-persistence kube-api-access-rpjwv]: timed out waiting for the condition
  Warning  FailedMount             3m59s (x23 over 34m)  kubelet                  MountVolume.MountDevice failed for volume "pvc-8b9d8010-3367-46ac-984d-f99d178f944c" : rpc error: code = Internal desc = rbd: map failed with error an error (exit status 2) occurred while running rbd args: [--id csi-rbd-node -m 10.43.214.152:3300 --keyfile=***stripped*** map ceph-rbd-pool/csi-vol-15b866e8-ca95-11ec-bbe3-22919942180f --device-type krbd --options noudev], rbd error output: rbd: failed to get mon address (possible ms_mode mismatch)
rbd: map failed: (2) No such file or directory

Notably, the mapOptions are not being passed through to the rbd call here, which I believe is required because the cluster is configured with network encryption enabled.

Expected behavior

I expected the rbdplugin to pass the mapOptions through to the rbd map command and successfully mount the image.

Logs

RBD plugin logs:

$ kubectl logs -n rook-ceph csi-rbdplugin-s6tzt csi-rbdplugin 
E0503 03:53:25.631272 2218477 cephcsi.go:196] Failed to get the PID limit, can not reconfigure: could not find a cgroup for 'pids'
W0503 03:57:07.621582 2218477 rbd_attach.go:469] ID: 10 Req-ID: 0001-0009-rook-ceph-0000000000000001-15b866e8-ca95-11ec-bbe3-22919942180f rbd: map error an error (exit status 2) occurred while running rbd args: [--id csi-rbd-node -m 10.43.214.152:3300 --keyfile=***stripped*** map ceph-rbd-pool/csi-vol-15b866e8-ca95-11ec-bbe3-22919942180f --device-type krbd --options noudev], rbd output: rbd: failed to get mon address (possible ms_mode mismatch)
rbd: map failed: (2) No such file or directory
E0503 03:57:07.621772 2218477 utils.go:200] ID: 10 Req-ID: 0001-0009-rook-ceph-0000000000000001-15b866e8-ca95-11ec-bbe3-22919942180f GRPC error: rpc error: code = Internal desc = rbd: map failed with error an error (exit status 2) occurred while running rbd args: [--id csi-rbd-node -m 10.43.214.152:3300 --keyfile=***stripped*** map ceph-rbd-pool/csi-vol-15b866e8-ca95-11ec-bbe3-22919942180f --device-type krbd --options noudev], rbd error output: rbd: failed to get mon address (possible ms_mode mismatch)
rbd: map failed: (2) No such file or directory
W0503 03:57:08.630294 2218477 rbd_attach.go:469] ID: 13 Req-ID: 0001-0009-rook-ceph-0000000000000001-15b866e8-ca95-11ec-bbe3-22919942180f rbd: map error an error (exit status 2) occurred while running rbd args: [--id csi-rbd-node -m 10.43.214.152:3300 --keyfile=***stripped*** map ceph-rbd-pool/csi-vol-15b866e8-ca95-11ec-bbe3-22919942180f --device-type krbd --options noudev], rbd output: rbd: failed to get mon address (possible ms_mode mismatch)
rbd: map failed: (2) No such file or directory
E0503 03:57:08.630547 2218477 utils.go:200] ID: 13 Req-ID: 0001-0009-rook-ceph-0000000000000001-15b866e8-ca95-11ec-bbe3-22919942180f GRPC error: rpc error: code = Internal desc = rbd: map failed with error an error (exit status 2) occurred while running rbd args: [--id csi-rbd-node -m 10.43.214.152:3300 --keyfile=***stripped*** map ceph-rbd-pool/csi-vol-15b866e8-ca95-11ec-bbe3-22919942180f --device-type krbd --options noudev], rbd error output: rbd: failed to get mon address (possible ms_mode mismatch)
rbd: map failed: (2) No such file or directory

The map errors repeat regularly as Kubernetes keeps retrying the mount.
I don't think the PID limit error (E0503 03:53:25.631272 2218477 cephcsi.go:196] Failed to get the PID limit, can not reconfigure: could not find a cgroup for 'pids') has anything to do with this, though I also had a hard time finding any information about what is going on there.

Driver Registrar logs:

$ kubectl logs -n rook-ceph csi-rbdplugin-s6tzt driver-registrar
I0503 03:53:25.407436 2218293 main.go:166] Version: v2.5.0
I0503 03:53:25.407535 2218293 main.go:167] Running node-driver-registrar in mode=registration
I0503 03:53:26.418518 2218293 node_register.go:53] Starting Registration Server at: /registration/rook-ceph.rbd.csi.ceph.com-reg.sock
I0503 03:53:26.418778 2218293 node_register.go:62] Registration Server started at: /registration/rook-ceph.rbd.csi.ceph.com-reg.sock
I0503 03:53:26.418956 2218293 node_register.go:92] Skipping HTTP server because endpoint is set to: ""
I0503 03:53:27.125753 2218293 main.go:102] Received GetInfo call: &InfoRequest{}
I0503 03:53:27.128184 2218293 main.go:109] "Kubelet registration probe created" path="/var/lib/kubelet/plugins/rook-ceph.rbd.csi.ceph.com/registration"
I0503 03:53:30.554232 2218293 main.go:120] Received NotifyRegistrationStatus call: &RegistrationStatus{PluginRegistered:true,Error:,}

I couldn't find any dmesg logs that seemed at all related to the issue, and I also couldn't find any other logs that might indicate why the options weren't being populated.

Additional context

I also tried this with a few different configurations, including various combinations of:

  • A kubeadm-created cluster (also single-node) with Kubernetes v1.23
  • Rook v1.9.1
  • Ceph v16.2.7
  • Stock Debian 11 kernel 5.10
    (though I didn't have the foresight to record the specific version of the ceph-csi plugin image in some of the other cases)

To confirm that not passing the mapOptions was the issue, I connected to the rbdplugin pod:

$ kubectl exec -it -n rook-ceph csi-rbdplugin-s6tzt -c csi-rbdplugin -- bash

Snagged a keyfile (the keyfile only exists briefly while a command runs, hence the retry loop):

# cp /tmp/csi/keys/* .; while [ $? -ne 0 ]; do cp /tmp/csi/keys/* .; done

Then I ran the rbd map command that had failed in the logs above, but with the missing ms_mode=secure option added:

# rbd --id csi-rbd-node -m 10.43.214.152:3300 --keyfile=keyfile-4081675166 map ceph-rbd-pool/csi-vol-15b866e8-ca95-11ec-bbe3-22919942180f --device-type krbd --options noudev --options ms_mode=secure
/dev/rbd0

This succeeded, confirming that the missing map option was the cause.

In case it's relevant or helpful (since this was set up with Rook), here are more details on the rbdplugin pod:

$ kubectl describe pods -n rook-ceph csi-rbdplugin-s6tzt 
Name:                 csi-rbdplugin-s6tzt
Namespace:            rook-ceph
Priority:             2000001000
Priority Class Name:  system-node-critical
Node:                 <node info>
Start Time:           Mon, 02 May 2022 23:53:23 -0400
Labels:               app=csi-rbdplugin
                      contains=csi-rbdplugin-metrics
                      controller-revision-hash=5f8bfb554f
                      pod-template-generation=1
Annotations:          <none>
Status:               Running
IP:                   <node ip>
IPs:
  IP:           <node ip>
Controlled By:  DaemonSet/csi-rbdplugin
Containers:
  driver-registrar:
    Container ID:  containerd://64c84fef3216e44049146ea5e813b927aca8292446777fd09ef79a234167421f
    Image:         k8s.gcr.io/sig-storage/csi-node-driver-registrar:v2.5.0
    Image ID:      k8s.gcr.io/sig-storage/csi-node-driver-registrar@sha256:4fd21f36075b44d1a423dfb262ad79202ce54e95f5cbc4622a6c1c38ab287ad6
    Port:          <none>
    Host Port:     <none>
    Args:
      --v=0
      --csi-address=/csi/csi.sock
      --kubelet-registration-path=/var/lib/kubelet/plugins/rook-ceph.rbd.csi.ceph.com/csi.sock
    State:          Running
      Started:      Mon, 02 May 2022 23:53:25 -0400
    Ready:          True
    Restart Count:  0
    Limits:
      cpu:     100m
      memory:  256Mi
    Requests:
      cpu:     50m
      memory:  128Mi
    Environment:
      KUBE_NODE_NAME:   (v1:spec.nodeName)
    Mounts:
      /csi from plugin-dir (rw)
      /registration from registration-dir (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-r4wkm (ro)
  csi-rbdplugin:
    Container ID:  containerd://f455aca1e583ad053b6e5538dd11611267b3ff4e15fbc66cfe25436bfe3b7eca
    Image:         quay.io/cephcsi/cephcsi:v3.6.1
    Image ID:      quay.io/cephcsi/cephcsi@sha256:6dae8527e43965d13714cf7b3e51d52b9d970a9ac01f10f00dd324d326ec6aea
    Port:          <none>
    Host Port:     <none>
    Args:
      --nodeid=$(NODE_ID)
      --endpoint=$(CSI_ENDPOINT)
      --v=0
      --type=rbd
      --nodeserver=true
      --drivername=rook-ceph.rbd.csi.ceph.com
      --pidlimit=-1
      --metricsport=9090
      --metricspath=/metrics
      --enablegrpcmetrics=false
      --stagingpath=/var/lib/kubelet/plugins/kubernetes.io/csi/pv/
    State:          Running
      Started:      Mon, 02 May 2022 23:53:25 -0400
    Ready:          True
    Restart Count:  0
    Limits:
      cpu:     500m
      memory:  1Gi
    Requests:
      cpu:     250m
      memory:  512Mi
    Environment:
      POD_IP:          (v1:status.podIP)
      NODE_ID:         (v1:spec.nodeName)
      POD_NAMESPACE:  rook-ceph (v1:metadata.namespace)
      CSI_ENDPOINT:   unix:///csi/csi.sock
    Mounts:
      /csi from plugin-dir (rw)
      /dev from host-dev (rw)
      /etc/ceph-csi-config/ from ceph-csi-configs (rw)
      /lib/modules from lib-modules (ro)
      /run/mount from host-run-mount (rw)
      /run/secrets/tokens from oidc-token (ro)
      /sys from host-sys (rw)
      /tmp/csi/keys from keys-tmp-dir (rw)
      /var/lib/kubelet/plugins from plugin-mount-dir (rw)
      /var/lib/kubelet/pods from pods-mount-dir (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-r4wkm (ro)
  liveness-prometheus:
    Container ID:  containerd://562d9c7338f8e92a0b592655364179b13eaca179e93f7bb197690ebdaffa3555
    Image:         quay.io/cephcsi/cephcsi:v3.6.1
    Image ID:      quay.io/cephcsi/cephcsi@sha256:6dae8527e43965d13714cf7b3e51d52b9d970a9ac01f10f00dd324d326ec6aea
    Port:          <none>
    Host Port:     <none>
    Args:
      --type=liveness
      --endpoint=$(CSI_ENDPOINT)
      --metricsport=9080
      --metricspath=/metrics
      --polltime=60s
      --timeout=3s
    State:          Running
      Started:      Mon, 02 May 2022 23:53:25 -0400
    Ready:          True
    Restart Count:  0
    Limits:
      cpu:     100m
      memory:  256Mi
    Requests:
      cpu:     50m
      memory:  128Mi
    Environment:
      CSI_ENDPOINT:  unix:///csi/csi.sock
      POD_IP:         (v1:status.podIP)
    Mounts:
      /csi from plugin-dir (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-r4wkm (ro)
Conditions:
  Type              Status
  Initialized       True 
  Ready             True 
  ContainersReady   True 
  PodScheduled      True 
Volumes:
  plugin-dir:
    Type:          HostPath (bare host directory volume)
    Path:          /var/lib/kubelet/plugins/rook-ceph.rbd.csi.ceph.com
    HostPathType:  DirectoryOrCreate
  plugin-mount-dir:
    Type:          HostPath (bare host directory volume)
    Path:          /var/lib/kubelet/plugins
    HostPathType:  Directory
  registration-dir:
    Type:          HostPath (bare host directory volume)
    Path:          /var/lib/kubelet/plugins_registry/
    HostPathType:  Directory
  pods-mount-dir:
    Type:          HostPath (bare host directory volume)
    Path:          /var/lib/kubelet/pods
    HostPathType:  Directory
  host-dev:
    Type:          HostPath (bare host directory volume)
    Path:          /dev
    HostPathType:  
  host-sys:
    Type:          HostPath (bare host directory volume)
    Path:          /sys
    HostPathType:  
  lib-modules:
    Type:          HostPath (bare host directory volume)
    Path:          /lib/modules
    HostPathType:  
  ceph-csi-configs:
    Type:               Projected (a volume that contains injected data from multiple sources)
    ConfigMapName:      rook-ceph-csi-config
    ConfigMapOptional:  <nil>
    ConfigMapName:      rook-ceph-csi-mapping-config
    ConfigMapOptional:  <nil>
  keys-tmp-dir:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:     Memory
    SizeLimit:  <unset>
  host-run-mount:
    Type:          HostPath (bare host directory volume)
    Path:          /run/mount
    HostPathType:  
  oidc-token:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3600
  kube-api-access-r4wkm:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   Burstable
Node-Selectors:              <none>
Tolerations:                 node.kubernetes.io/disk-pressure:NoSchedule op=Exists
                             node.kubernetes.io/memory-pressure:NoSchedule op=Exists
                             node.kubernetes.io/network-unavailable:NoSchedule op=Exists
                             node.kubernetes.io/not-ready:NoExecute op=Exists
                             node.kubernetes.io/pid-pressure:NoSchedule op=Exists
                             node.kubernetes.io/unreachable:NoExecute op=Exists
                             node.kubernetes.io/unschedulable:NoSchedule op=Exists
Events:
  Type     Reason       Age   From               Message
  ----     ------       ----  ----               -------
  Normal   Scheduled    60m   default-scheduler  Successfully assigned rook-ceph/csi-rbdplugin-s6tzt to nas
  Warning  FailedMount  60m   kubelet            MountVolume.SetUp failed for volume "ceph-csi-configs" : failed to sync configmap cache: timed out waiting for the condition
  Normal   Pulled       60m   kubelet            Container image "k8s.gcr.io/sig-storage/csi-node-driver-registrar:v2.5.0" already present on machine
  Normal   Created      60m   kubelet            Created container driver-registrar
  Normal   Started      60m   kubelet            Started container driver-registrar
  Normal   Pulled       60m   kubelet            Container image "quay.io/cephcsi/cephcsi:v3.6.1" already present on machine
  Normal   Created      60m   kubelet            Created container csi-rbdplugin
  Normal   Started      60m   kubelet            Started container csi-rbdplugin
  Normal   Pulled       60m   kubelet            Container image "quay.io/cephcsi/cephcsi:v3.6.1" already present on machine
  Normal   Created      60m   kubelet            Created container liveness-prometheus
  Normal   Started      60m   kubelet            Started container liveness-prometheus
@humblec
Collaborator

humblec commented May 4, 2022

Cc @pkalever

Madhu-1 added a commit to Madhu-1/ceph-csi that referenced this issue May 4, 2022
For the default mounter, the mounter option is not set in the storageclass, and since it is not present there it is also not set in the volume context. Because of this, the mapOptions get discarded. If the mounter is not set, assume it is the rbd mounter.

Note: if the mounter is not set in the storageclass, we could set it in the volume context explicitly; doing this check in the node server instead keeps backward compatibility for existing volumes, and the check is minimal, so we are not altering the volume context.

fixes: ceph#3076

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
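In rough terms, the change described in that commit message amounts to something like the following Go sketch (illustrative only, not the actual ceph-csi code; the constant and function names here are assumptions):

// Illustrative sketch of the fix described above, not the real ceph-csi code.
package main

import "fmt"

// Assumed name for the default (krbd) mounter.
const rbdDefaultMounter = "rbd"

// resolveMounter picks the mounter for a volume being staged. Before the fix,
// an empty "mounter" in the volume context meant the mapOptions
// (e.g. ms_mode=secure) were discarded; defaulting to the rbd mounter
// keeps them.
func resolveMounter(volumeContext map[string]string) string {
    mounter := volumeContext["mounter"]
    if mounter == "" {
        mounter = rbdDefaultMounter
    }
    return mounter
}

func main() {
    // Volume context as provisioned from a StorageClass without "mounter".
    ctx := map[string]string{"mapOptions": "ms_mode=secure"}
    fmt.Println(resolveMounter(ctx)) // prints "rbd"
}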
@Cytrian

Cytrian commented May 4, 2022

I just had a similar error when setting up a K8s on a newer Flatcar server, because that one now uses
the cgroupv2 layout.
So,

// $ cat /sys/fs/cgroup/pids + *.scope + /pids.max.
is not correct, since there is no directory /sys/fs/cgroup/pids any more.
It's now /sys/fs/cgroup + *.scope + /pids.max

@Madhu-1
Collaborator

Madhu-1 commented May 4, 2022

I just had a similar error when setting up a K8s on a newer Flatcar server, because that one now uses the cgroupv2 layout. So,

// $ cat /sys/fs/cgroup/pids + *.scope + /pids.max.

is not correct, since there is no directory /sys/fs/cgroup/pids any more.
It's now /sys/fs/cgroup + *.scope + /pids.max

@Cytrian would you like to send a PR to fix this one?

@jpmartin2
Author

Ah, based on the PR for the mount issue, I can see that a workaround for now is to explicitly set the mounter parameter. I was able to get up and running with that!

@Madhu-1
Collaborator

Madhu-1 commented May 5, 2022

Ah, based on the PR for the mount issue, I can see that a workaround for now is to explicitly set the mounter parameter. I was able to get up and running with that!

Yes that will be the workaround for now 👍
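For reference, here is a sketch of what that workaround could look like in the StorageClass from this report, assuming mounter: rbd is the value the node server treats as the default krbd mounter (only the mounter line is new; everything else matches the manifest above):

apiVersion: storage.k8s.io/v1
kind: StorageClass
allowVolumeExpansion: true
metadata:
  name: ceph-block
  namespace: default
parameters:
  clusterID: rook-ceph
  csi.storage.k8s.io/controller-expand-secret-name: rook-csi-rbd-provisioner
  csi.storage.k8s.io/controller-expand-secret-namespace: rook-ceph
  csi.storage.k8s.io/fstype: ext4
  csi.storage.k8s.io/node-stage-secret-name: rook-csi-rbd-node
  csi.storage.k8s.io/node-stage-secret-namespace: rook-ceph
  csi.storage.k8s.io/provisioner-secret-name: rook-csi-rbd-provisioner
  csi.storage.k8s.io/provisioner-secret-namespace: rook-ceph
  mapOptions: ms_mode=secure
  # Workaround: set the mounter explicitly so mapOptions are picked up.
  mounter: rbd
  imageFeatures: layering
  imageFormat: "2"
  pool: ceph-rbd-pool
provisioner: rook-ceph.rbd.csi.ceph.com
volumeBindingMode: WaitForFirstConsumer
reclaimPolicy: Retain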

@lbogdan

lbogdan commented May 5, 2022

is not correct, since there is no directory /sys/fs/cgroup/pids any more.
It's now /sys/fs/cgroup + *.scope + /pids.max

This isn't as simple as just changing the path; it should first detect whether we're using cgroups v1 or v2 (e.g. like here: https://github.com/containers/common/blob/a2ec40df56de42ebce03c1198495a79c2873b06e/pkg/cgroups/cgroups_supported.go#L28-L39 ), and use the correct path.
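A minimal Go sketch of that kind of detection, roughly what the linked containers/common helper does (this is not ceph-csi code; it simply checks the filesystem magic of /sys/fs/cgroup):

package main

import (
    "fmt"

    "golang.org/x/sys/unix"
)

// isCgroup2UnifiedMode reports whether /sys/fs/cgroup is the cgroup v2
// unified hierarchy, by comparing its filesystem magic number.
func isCgroup2UnifiedMode() (bool, error) {
    var st unix.Statfs_t
    if err := unix.Statfs("/sys/fs/cgroup", &st); err != nil {
        return false, err
    }
    return st.Type == unix.CGROUP2_SUPER_MAGIC, nil
}

func main() {
    v2, err := isCgroup2UnifiedMode()
    if err != nil {
        panic(err)
    }
    if v2 {
        fmt.Println("cgroup v2: pids.max is under /sys/fs/cgroup + *.scope + /pids.max")
    } else {
        fmt.Println("cgroup v1: pids.max is under /sys/fs/cgroup/pids + *.scope + /pids.max")
    }
}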

Madhu-1 added a commit to Madhu-1/ceph-csi that referenced this issue May 9, 2022
mergify bot closed this as completed in #3080 May 9, 2022