csi-rbdplugin not picking up mapOptions properly #3076

Closed
jpmartin2 opened this issue May 3, 2022 · 6 comments · Fixed by #3080

@jpmartin2

jpmartin2 commented May 3, 2022

Describe the bug

I have a volume that is failing to mount. I believe this is because network encryption is enabled on the cluster and the rbdplugin is not passing mapOptions through to the rbd command (which should set ms_mode=secure).

Environment details

  • Image/version of Ceph CSI driver : 3.6.1
  • Helm chart version : N/A (ceph cluster setup with rook v1.9.2)
  • Kernel version : 5.16.0 (from the Debian 11 backports repo)
  • Mounter used for mounting PVC (for CephFS it's fuse or kernel; for RBD it's krbd or rbd-nbd) : krbd
  • Kubernetes cluster version : 1.22
  • Ceph cluster version : 17.2

Steps to reproduce

Steps to reproduce the behavior:

  1. Performed a fresh install of k3s on a Debian 11 system with kernel 5.16.0 from bullseye-backports
  2. Installed the Rook Helm chart v1.9.2
  3. Created the Ceph cluster, block pool, and StorageClass using the following manifests:
apiVersion: ceph.rook.io/v1
kind: CephCluster
metadata:
  name: cluster
  namespace: rook-ceph
spec:
  cephVersion:
    allowUnsupported: false
    image: quay.io/ceph/ceph:v17.2.0
  cleanupPolicy:
    allowUninstallWithVolumes: false
    confirmation: ""
    sanitizeDisks:
      dataSource: zero
      iteration: 1
      method: quick
  continueUpgradeAfterChecksEvenIfNotHealthy: false
  crashCollector:
    disable: false
  dashboard:
    enabled: true
    ssl: true
  dataDirHostPath: /var/lib/rook
  disruptionManagement:
    manageMachineDisruptionBudgets: false
    managePodBudgets: true
    osdMaintenanceTimeout: 30
    pgHealthCheckTimeout: 0
  healthCheck:
    daemonHealth:
      mon:
        disabled: false
        interval: 45s
      osd:
        disabled: false
        interval: 60s
      status:
        disabled: false
        interval: 60s
    livenessProbe:
      mon:
        disabled: false
      osd:
        disabled: false
      status:
        disabled: false
    startupProbe:
      mon:
        disabled: false
      osd:
        disabled: false
      status:
        disabled: false
  mgr:
    allowMultiplePerNode: true
    count: 1
    modules:
      - enabled: true
        name: pg_autoscaler
  mon:
    allowMultiplePerNode: true
    count: 1
  monitoring:
    enabled: false
  network:
    connections:
      compression:
        enabled: false
      encryption:
        enabled: true
  priorityClassNames:
    mgr: system-cluster-critical
    mon: system-node-critical
    osd: system-node-critical
  removeOSDsIfOutAndSafeToRemove: false
  skipUpgradeChecks: false
  storage:
    nodes:
      - devices:
          - name: /dev/disk/by-id/<disk 1>
          - name: /dev/disk/by-id/<disk 2>
        name: nas
    useAllDevices: false
    useAllNodes: false
  waitTimeoutForHealthyOSDInMinutes: 10
---
apiVersion: ceph.rook.io/v1
kind: CephBlockPool
metadata:
  name: ceph-rbd-pool
  namespace: rook-ceph
spec:
  failureDomain: osd
  replicated:
    requireSafeReplicaSize: false
    size: 2
---
apiVersion: storage.k8s.io/v1
kind: StorageClass
allowVolumeExpansion: true
metadata:
  name: ceph-block
  namespace: default
parameters:
  clusterID: rook-ceph
  csi.storage.k8s.io/controller-expand-secret-name: rook-csi-rbd-provisioner
  csi.storage.k8s.io/controller-expand-secret-namespace: rook-ceph
  csi.storage.k8s.io/fstype: ext4
  csi.storage.k8s.io/node-stage-secret-name: rook-csi-rbd-node
  csi.storage.k8s.io/node-stage-secret-namespace: rook-ceph
  csi.storage.k8s.io/provisioner-secret-name: rook-csi-rbd-provisioner
  csi.storage.k8s.io/provisioner-secret-namespace: rook-ceph
  mapOptions: ms_mode=secure
  imageFeatures: layering
  imageFormat: "2"
  pool: ceph-rbd-pool
provisioner: rook-ceph.rbd.csi.ceph.com
volumeBindingMode: WaitForFirstConsumer
reclaimPolicy: Retain
  4. Attempted to use dynamic provisioning to create and mount a volume for a StatefulSet, e.g.:
apiVersion: v1
kind: Namespace
metadata:
  name: gitlab
  namespace: gitlab
---
apiVersion: v1
kind: Service
metadata:
  name: gitlab-headless
  namespace: gitlab
spec:
  clusterIP: None
  externalIPs: []
  ports:
    - port: 8443
  selector:
    cdk8s.statefulset: gitlab-stateful-set-c803a0a9
  type: ClusterIP
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: gitlab
  namespace: gitlab
spec:
  podManagementPolicy: OrderedReady
  replicas: 1
  selector:
    matchLabels:
      cdk8s.statefulset: gitlab-stateful-set-c803a0a9
  serviceName: gitlab-headless
  template:
    metadata:
      labels:
        cdk8s.statefulset: gitlab-stateful-set-c803a0a9
    spec:
      containers:
        - env: []
          image: docker.io/gitlab/gitlab-ce
          imagePullPolicy: Always
          name: main
          ports: []
          securityContext:
            privileged: false
            readOnlyRootFilesystem: false
            runAsNonRoot: false
          volumeMounts:
            - mountPath: /var/log/gitlab
              name: gitlab-persistence
              subPath: logs
            - mountPath: /var/opt/gitlab
              name: gitlab-persistence
              subPath: data
      hostAliases: []
      initContainers: []
      securityContext:
        fsGroupChangePolicy: Always
        runAsNonRoot: false
        sysctls: []
  updateStrategy:
    rollingUpdate:
      partition: 0
    type: RollingUpdate
  volumeClaimTemplates:
    - metadata:
        name: gitlab-persistence
      spec:
        accessModes:
          - ReadWriteOnce
        resources:
          requests:
            storage: 50Gi
        storageClassName: ceph-block

Actual results

The PersistentVolume was created, and has mapOptions set as expected in VolumeAttributes:

$ kubectl describe pv 
Name:            pvc-8b9d8010-3367-46ac-984d-f99d178f944c
Labels:          <none>
Annotations:     pv.kubernetes.io/provisioned-by: rook-ceph.rbd.csi.ceph.com
Finalizers:      [kubernetes.io/pv-protection]
StorageClass:    ceph-block
Status:          Bound
Claim:           gitlab/gitlab-persistence-gitlab-0
Reclaim Policy:  Retain
Access Modes:    RWO
VolumeMode:      Filesystem
Capacity:        50Gi
Node Affinity:   <none>
Message:         
Source:
    Type:              CSI (a Container Storage Interface (CSI) volume source)
    Driver:            rook-ceph.rbd.csi.ceph.com
    FSType:            ext4
    VolumeHandle:      0001-0009-rook-ceph-0000000000000001-15b866e8-ca95-11ec-bbe3-22919942180f
    ReadOnly:          false
    VolumeAttributes:      clusterID=rook-ceph
                           imageFeatures=layering
                           imageFormat=2
                           imageName=csi-vol-15b866e8-ca95-11ec-bbe3-22919942180f
                           journalPool=ceph-rbd-pool
                           mapOptions=ms_mode=secure
                           pool=ceph-rbd-pool
                           storage.kubernetes.io/csiProvisionerIdentity=1651550006515-8081-rook-ceph.rbd.csi.ceph.com
Events:                <none>

The PersistentVolumeClaim is created and bound to that PersistentVolume:

$ kubectl describe pvc -n gitlab 
Name:          gitlab-persistence-gitlab-0
Namespace:     gitlab
StorageClass:  ceph-block
Status:        Bound
Volume:        pvc-8b9d8010-3367-46ac-984d-f99d178f944c
Labels:        cdk8s.statefulset=gitlab-stateful-set-c803a0a9
Annotations:   pv.kubernetes.io/bind-completed: yes
               pv.kubernetes.io/bound-by-controller: yes
               volume.beta.kubernetes.io/storage-provisioner: rook-ceph.rbd.csi.ceph.com
               volume.kubernetes.io/selected-node: nas
Finalizers:    [kubernetes.io/pvc-protection]
Capacity:      50Gi
Access Modes:  RWO
VolumeMode:    Filesystem
Used By:       gitlab-0
Events:
  Type    Reason                 Age                From                                                                                                       Message
  ----    ------                 ----               ----                                                                                                       -------
  Normal  WaitForFirstConsumer   33m                persistentvolume-controller                                                                                waiting for first consumer to be created before binding
  Normal  ExternalProvisioning   33m (x2 over 33m)  persistentvolume-controller                                                                                waiting for a volume to be created, either by external provisioner "rook-ceph.rbd.csi.ceph.com" or manually created by system administrator
  Normal  Provisioning           33m                rook-ceph.rbd.csi.ceph.com_csi-rbdplugin-provisioner-d749f95f4-jjc8x_449ebe0c-8544-4d1a-81c8-fb11bf609617  External provisioner is provisioning volume for claim "gitlab/gitlab-persistence-gitlab-0"
  Normal  ProvisioningSucceeded  33m                rook-ceph.rbd.csi.ceph.com_csi-rbdplugin-provisioner-d749f95f4-jjc8x_449ebe0c-8544-4d1a-81c8-fb11bf609617  Successfully provisioned volume pvc-8b9d8010-3367-46ac-984d-f99d178f944c

However, the volume fails to mount to the pod:

$ kubectl describe pods -n gitlab
Name:           gitlab-0
Namespace:      gitlab
Priority:       0
Node:           nas/192.168.1.175
Start Time:     Mon, 02 May 2022 23:57:01 -0400
Labels:         cdk8s.statefulset=gitlab-stateful-set-c803a0a9
                controller-revision-hash=gitlab-6f64dd98c6
                statefulset.kubernetes.io/pod-name=gitlab-0
Annotations:    <none>
Status:         Pending
IP:             
IPs:            <none>
Controlled By:  StatefulSet/gitlab
Containers:
  main:
    Container ID:   
    Image:          docker.io/gitlab/gitlab-ce
    Image ID:       
    Port:           <none>
    Host Port:      <none>
    State:          Waiting
      Reason:       ContainerCreating
    Ready:          False
    Restart Count:  0
    Environment:    <none>
    Mounts:
      /var/log/gitlab from gitlab-persistence (rw,path="logs")
      /var/opt/gitlab from gitlab-persistence (rw,path="data")
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-rpjwv (ro)
Conditions:
  Type              Status
  Initialized       True 
  Ready             False 
  ContainersReady   False 
  PodScheduled      True 
Volumes:
  gitlab-persistence:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  gitlab-persistence-gitlab-0
    ReadOnly:   false
  kube-api-access-rpjwv:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   BestEffort
Node-Selectors:              <none>
Tolerations:                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason                  Age                   From                     Message
  ----     ------                  ----                  ----                     -------
  Normal   Scheduled               34m                   default-scheduler        Successfully assigned gitlab/gitlab-0 to nas
  Normal   SuccessfulAttachVolume  34m                   attachdetach-controller  AttachVolume.Attach succeeded for volume "pvc-8b9d8010-3367-46ac-984d-f99d178f944c"
  Warning  FailedMount             12m (x10 over 32m)    kubelet                  Unable to attach or mount volumes: unmounted volumes=[gitlab-persistence], unattached volumes=[gitlab-persistence kube-api-access-rpjwv]: timed out waiting for the condition
  Warning  FailedMount             3m59s (x23 over 34m)  kubelet                  MountVolume.MountDevice failed for volume "pvc-8b9d8010-3367-46ac-984d-f99d178f944c" : rpc error: code = Internal desc = rbd: map failed with error an error (exit status 2) occurred while running rbd args: [--id csi-rbd-node -m 10.43.214.152:3300 --keyfile=***stripped*** map ceph-rbd-pool/csi-vol-15b866e8-ca95-11ec-bbe3-22919942180f --device-type krbd --options noudev], rbd error output: rbd: failed to get mon address (possible ms_mode mismatch)
rbd: map failed: (2) No such file or directory

Notably, the mapOptions are not being passed through to the rbd call here, which I believe is required because the cluster is configured with network encryption enabled.

Expected behavior

I expected the rbdplugin to pass the mapOptions through to the rbd map command and successfully mount the image.

Logs

RBD plugin logs:

$ kubectl logs -n rook-ceph csi-rbdplugin-s6tzt csi-rbdplugin 
E0503 03:53:25.631272 2218477 cephcsi.go:196] Failed to get the PID limit, can not reconfigure: could not find a cgroup for 'pids'
W0503 03:57:07.621582 2218477 rbd_attach.go:469] ID: 10 Req-ID: 0001-0009-rook-ceph-0000000000000001-15b866e8-ca95-11ec-bbe3-22919942180f rbd: map error an error (exit status 2) occurred while running rbd args: [--id csi-rbd-node -m 10.43.214.152:3300 --keyfile=***stripped*** map ceph-rbd-pool/csi-vol-15b866e8-ca95-11ec-bbe3-22919942180f --device-type krbd --options noudev], rbd output: rbd: failed to get mon address (possible ms_mode mismatch)
rbd: map failed: (2) No such file or directory
E0503 03:57:07.621772 2218477 utils.go:200] ID: 10 Req-ID: 0001-0009-rook-ceph-0000000000000001-15b866e8-ca95-11ec-bbe3-22919942180f GRPC error: rpc error: code = Internal desc = rbd: map failed with error an error (exit status 2) occurred while running rbd args: [--id csi-rbd-node -m 10.43.214.152:3300 --keyfile=***stripped*** map ceph-rbd-pool/csi-vol-15b866e8-ca95-11ec-bbe3-22919942180f --device-type krbd --options noudev], rbd error output: rbd: failed to get mon address (possible ms_mode mismatch)
rbd: map failed: (2) No such file or directory
W0503 03:57:08.630294 2218477 rbd_attach.go:469] ID: 13 Req-ID: 0001-0009-rook-ceph-0000000000000001-15b866e8-ca95-11ec-bbe3-22919942180f rbd: map error an error (exit status 2) occurred while running rbd args: [--id csi-rbd-node -m 10.43.214.152:3300 --keyfile=***stripped*** map ceph-rbd-pool/csi-vol-15b866e8-ca95-11ec-bbe3-22919942180f --device-type krbd --options noudev], rbd output: rbd: failed to get mon address (possible ms_mode mismatch)
rbd: map failed: (2) No such file or directory
E0503 03:57:08.630547 2218477 utils.go:200] ID: 13 Req-ID: 0001-0009-rook-ceph-0000000000000001-15b866e8-ca95-11ec-bbe3-22919942180f GRPC error: rpc error: code = Internal desc = rbd: map failed with error an error (exit status 2) occurred while running rbd args: [--id csi-rbd-node -m 10.43.214.152:3300 --keyfile=***stripped*** map ceph-rbd-pool/csi-vol-15b866e8-ca95-11ec-bbe3-22919942180f --device-type krbd --options noudev], rbd error output: rbd: failed to get mon address (possible ms_mode mismatch)
rbd: map failed: (2) No such file or directory

The map errors repeat regularly as Kubernetes keeps retrying the mount.
I don't think the PID limit error (E0503 03:53:25.631272 2218477 cephcsi.go:196] Failed to get the PID limit, can not reconfigure: could not find a cgroup for 'pids') has anything to do with this, though I also had a hard time finding any information about what is going on there.

Driver Registrar logs:

$ kubectl logs -n rook-ceph csi-rbdplugin-s6tzt driver-registrar
I0503 03:53:25.407436 2218293 main.go:166] Version: v2.5.0
I0503 03:53:25.407535 2218293 main.go:167] Running node-driver-registrar in mode=registration
I0503 03:53:26.418518 2218293 node_register.go:53] Starting Registration Server at: /registration/rook-ceph.rbd.csi.ceph.com-reg.sock
I0503 03:53:26.418778 2218293 node_register.go:62] Registration Server started at: /registration/rook-ceph.rbd.csi.ceph.com-reg.sock
I0503 03:53:26.418956 2218293 node_register.go:92] Skipping HTTP server because endpoint is set to: ""
I0503 03:53:27.125753 2218293 main.go:102] Received GetInfo call: &InfoRequest{}
I0503 03:53:27.128184 2218293 main.go:109] "Kubelet registration probe created" path="/var/lib/kubelet/plugins/rook-ceph.rbd.csi.ceph.com/registration"
I0503 03:53:30.554232 2218293 main.go:120] Received NotifyRegistrationStatus call: &RegistrationStatus{PluginRegistered:true,Error:,}

I couldn't find any dmesg logs that seemed at all related to the issue, and I also couldn't find any other logs that might indicate why the options weren't being populated.

Additional context

I also tried this with a few different configurations, including various combinations of:

  • A kubeadm-created cluster (also single-node) with Kubernetes v1.23
  • Rook v1.9.1
  • Ceph v16.2.7
  • Stock Debian 11 kernel 5.10
    (though I didn't have the foresight to record the specific version of the ceph-csi plugin image in some of the other cases)

To confirm that not passing the mapOptions was the issue, I connected to the rbdplugin pod:

$ kubectl exec -it -n rook-ceph csi-rbdplugin-s6tzt -c csi-rbdplugin -- bash

Snagged a keyfile (the keyfile only exists briefly while a command runs, hence the retry loop):

# cp /tmp/csi/keys/* .; while [ $? -ne 0 ]; do cp /tmp/csi/keys/* .; done

Then I ran the rbd map command that had failed in the logs above, but with the missing ms_mode=secure option added:

# rbd --id csi-rbd-node -m 10.43.214.152:3300 --keyfile=keyfile-4081675166 map ceph-rbd-pool/csi-vol-15b866e8-ca95-11ec-bbe3-22919942180f --device-type krbd --options noudev --options ms_mode=secure
/dev/rbd0

This succeeded, confirming that the missing map option was the cause.

In case it's relevant or helpful (since this was set up with Rook), here are more details on the rbdplugin pod:

$ kubectl describe pods -n rook-ceph csi-rbdplugin-s6tzt 
Name:                 csi-rbdplugin-s6tzt
Namespace:            rook-ceph
Priority:             2000001000
Priority Class Name:  system-node-critical
Node:                 <node info>
Start Time:           Mon, 02 May 2022 23:53:23 -0400
Labels:               app=csi-rbdplugin
                      contains=csi-rbdplugin-metrics
                      controller-revision-hash=5f8bfb554f
                      pod-template-generation=1
Annotations:          <none>
Status:               Running
IP:                   <node ip>
IPs:
  IP:           <node ip>
Controlled By:  DaemonSet/csi-rbdplugin
Containers:
  driver-registrar:
    Container ID:  containerd://64c84fef3216e44049146ea5e813b927aca8292446777fd09ef79a234167421f
    Image:         k8s.gcr.io/sig-storage/csi-node-driver-registrar:v2.5.0
    Image ID:      k8s.gcr.io/sig-storage/csi-node-driver-registrar@sha256:4fd21f36075b44d1a423dfb262ad79202ce54e95f5cbc4622a6c1c38ab287ad6
    Port:          <none>
    Host Port:     <none>
    Args:
      --v=0
      --csi-address=/csi/csi.sock
      --kubelet-registration-path=/var/lib/kubelet/plugins/rook-ceph.rbd.csi.ceph.com/csi.sock
    State:          Running
      Started:      Mon, 02 May 2022 23:53:25 -0400
    Ready:          True
    Restart Count:  0
    Limits:
      cpu:     100m
      memory:  256Mi
    Requests:
      cpu:     50m
      memory:  128Mi
    Environment:
      KUBE_NODE_NAME:   (v1:spec.nodeName)
    Mounts:
      /csi from plugin-dir (rw)
      /registration from registration-dir (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-r4wkm (ro)
  csi-rbdplugin:
    Container ID:  containerd://f455aca1e583ad053b6e5538dd11611267b3ff4e15fbc66cfe25436bfe3b7eca
    Image:         quay.io/cephcsi/cephcsi:v3.6.1
    Image ID:      quay.io/cephcsi/cephcsi@sha256:6dae8527e43965d13714cf7b3e51d52b9d970a9ac01f10f00dd324d326ec6aea
    Port:          <none>
    Host Port:     <none>
    Args:
      --nodeid=$(NODE_ID)
      --endpoint=$(CSI_ENDPOINT)
      --v=0
      --type=rbd
      --nodeserver=true
      --drivername=rook-ceph.rbd.csi.ceph.com
      --pidlimit=-1
      --metricsport=9090
      --metricspath=/metrics
      --enablegrpcmetrics=false
      --stagingpath=/var/lib/kubelet/plugins/kubernetes.io/csi/pv/
    State:          Running
      Started:      Mon, 02 May 2022 23:53:25 -0400
    Ready:          True
    Restart Count:  0
    Limits:
      cpu:     500m
      memory:  1Gi
    Requests:
      cpu:     250m
      memory:  512Mi
    Environment:
      POD_IP:          (v1:status.podIP)
      NODE_ID:         (v1:spec.nodeName)
      POD_NAMESPACE:  rook-ceph (v1:metadata.namespace)
      CSI_ENDPOINT:   unix:///csi/csi.sock
    Mounts:
      /csi from plugin-dir (rw)
      /dev from host-dev (rw)
      /etc/ceph-csi-config/ from ceph-csi-configs (rw)
      /lib/modules from lib-modules (ro)
      /run/mount from host-run-mount (rw)
      /run/secrets/tokens from oidc-token (ro)
      /sys from host-sys (rw)
      /tmp/csi/keys from keys-tmp-dir (rw)
      /var/lib/kubelet/plugins from plugin-mount-dir (rw)
      /var/lib/kubelet/pods from pods-mount-dir (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-r4wkm (ro)
  liveness-prometheus:
    Container ID:  containerd://562d9c7338f8e92a0b592655364179b13eaca179e93f7bb197690ebdaffa3555
    Image:         quay.io/cephcsi/cephcsi:v3.6.1
    Image ID:      quay.io/cephcsi/cephcsi@sha256:6dae8527e43965d13714cf7b3e51d52b9d970a9ac01f10f00dd324d326ec6aea
    Port:          <none>
    Host Port:     <none>
    Args:
      --type=liveness
      --endpoint=$(CSI_ENDPOINT)
      --metricsport=9080
      --metricspath=/metrics
      --polltime=60s
      --timeout=3s
    State:          Running
      Started:      Mon, 02 May 2022 23:53:25 -0400
    Ready:          True
    Restart Count:  0
    Limits:
      cpu:     100m
      memory:  256Mi
    Requests:
      cpu:     50m
      memory:  128Mi
    Environment:
      CSI_ENDPOINT:  unix:///csi/csi.sock
      POD_IP:         (v1:status.podIP)
    Mounts:
      /csi from plugin-dir (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-r4wkm (ro)
Conditions:
  Type              Status
  Initialized       True 
  Ready             True 
  ContainersReady   True 
  PodScheduled      True 
Volumes:
  plugin-dir:
    Type:          HostPath (bare host directory volume)
    Path:          /var/lib/kubelet/plugins/rook-ceph.rbd.csi.ceph.com
    HostPathType:  DirectoryOrCreate
  plugin-mount-dir:
    Type:          HostPath (bare host directory volume)
    Path:          /var/lib/kubelet/plugins
    HostPathType:  Directory
  registration-dir:
    Type:          HostPath (bare host directory volume)
    Path:          /var/lib/kubelet/plugins_registry/
    HostPathType:  Directory
  pods-mount-dir:
    Type:          HostPath (bare host directory volume)
    Path:          /var/lib/kubelet/pods
    HostPathType:  Directory
  host-dev:
    Type:          HostPath (bare host directory volume)
    Path:          /dev
    HostPathType:  
  host-sys:
    Type:          HostPath (bare host directory volume)
    Path:          /sys
    HostPathType:  
  lib-modules:
    Type:          HostPath (bare host directory volume)
    Path:          /lib/modules
    HostPathType:  
  ceph-csi-configs:
    Type:               Projected (a volume that contains injected data from multiple sources)
    ConfigMapName:      rook-ceph-csi-config
    ConfigMapOptional:  <nil>
    ConfigMapName:      rook-ceph-csi-mapping-config
    ConfigMapOptional:  <nil>
  keys-tmp-dir:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:     Memory
    SizeLimit:  <unset>
  host-run-mount:
    Type:          HostPath (bare host directory volume)
    Path:          /run/mount
    HostPathType:  
  oidc-token:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3600
  kube-api-access-r4wkm:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   Burstable
Node-Selectors:              <none>
Tolerations:                 node.kubernetes.io/disk-pressure:NoSchedule op=Exists
                             node.kubernetes.io/memory-pressure:NoSchedule op=Exists
                             node.kubernetes.io/network-unavailable:NoSchedule op=Exists
                             node.kubernetes.io/not-ready:NoExecute op=Exists
                             node.kubernetes.io/pid-pressure:NoSchedule op=Exists
                             node.kubernetes.io/unreachable:NoExecute op=Exists
                             node.kubernetes.io/unschedulable:NoSchedule op=Exists
Events:
  Type     Reason       Age   From               Message
  ----     ------       ----  ----               -------
  Normal   Scheduled    60m   default-scheduler  Successfully assigned rook-ceph/csi-rbdplugin-s6tzt to nas
  Warning  FailedMount  60m   kubelet            MountVolume.SetUp failed for volume "ceph-csi-configs" : failed to sync configmap cache: timed out waiting for the condition
  Normal   Pulled       60m   kubelet            Container image "k8s.gcr.io/sig-storage/csi-node-driver-registrar:v2.5.0" already present on machine
  Normal   Created      60m   kubelet            Created container driver-registrar
  Normal   Started      60m   kubelet            Started container driver-registrar
  Normal   Pulled       60m   kubelet            Container image "quay.io/cephcsi/cephcsi:v3.6.1" already present on machine
  Normal   Created      60m   kubelet            Created container csi-rbdplugin
  Normal   Started      60m   kubelet            Started container csi-rbdplugin
  Normal   Pulled       60m   kubelet            Container image "quay.io/cephcsi/cephcsi:v3.6.1" already present on machine
  Normal   Created      60m   kubelet            Created container liveness-prometheus
  Normal   Started      60m   kubelet            Started container liveness-prometheus
@humblec
Collaborator

humblec commented May 4, 2022

Cc @pkalever

Madhu-1 added a commit to Madhu-1/ceph-csi that referenced this issue May 4, 2022
For the default mounter, the mounter option is not set in the storageclass, and since it is not present there it is also not set in the volume context. Because of this, the mapOptions get discarded. If the mounter is not set, assume it is the rbd mounter.

Note: if the mounter is not set in the storageclass, we could set it in the volume context explicitly; doing this check in the node server instead keeps backward compatibility for existing volumes, and the check is minimal, so we are not altering the volume context.

fixes: ceph#3076

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
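In rough terms, the change described in that commit message amounts to something like the following Go sketch (illustrative only, not the actual ceph-csi code; the constant and function names here are assumptions):

// Illustrative sketch of the fix described above, not the real ceph-csi code.
package main

import "fmt"

// Assumed name for the default (krbd) mounter.
const rbdDefaultMounter = "rbd"

// resolveMounter picks the mounter for a volume being staged. Before the fix,
// an empty "mounter" in the volume context meant the mapOptions
// (e.g. ms_mode=secure) were discarded; defaulting to the rbd mounter
// keeps them.
func resolveMounter(volumeContext map[string]string) string {
    mounter := volumeContext["mounter"]
    if mounter == "" {
        mounter = rbdDefaultMounter
    }
    return mounter
}

func main() {
    // Volume context as provisioned from a StorageClass without "mounter".
    ctx := map[string]string{"mapOptions": "ms_mode=secure"}
    fmt.Println(resolveMounter(ctx)) // prints "rbd"
}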
@Cytrian

Cytrian commented May 4, 2022

I just had a similar error when setting up a K8s on a newer Flatcar server, because that one now uses
the cgroupv2 layout.
So,

// $ cat /sys/fs/cgroup/pids + *.scope + /pids.max.
is not correct, since there is no directory /sys/fs/cgroup/pids any more.
It's now /sys/fs/cgroup + *.scope + /pids.max

@Madhu-1
Collaborator

Madhu-1 commented May 4, 2022

I just had a similar error when setting up a K8s on a newer Flatcar server, because that one now uses the cgroupv2 layout. So,

// $ cat /sys/fs/cgroup/pids + *.scope + /pids.max.

is not correct, since there is no directory /sys/fs/cgroup/pids any more.
It's now /sys/fs/cgroup + *.scope + /pids.max

@Cytrian would you like to send a PR to fix this one?

@jpmartin2
Author

Ah, based on the PR for the mount issue, I can see that a workaround for now is to explicitly set the mounter parameter. I was able to get up and running with that!

@Madhu-1
Collaborator

Madhu-1 commented May 5, 2022

Ah, based on the PR for the mount issue, I can see that a workaround for now is to explicitly set the mounter parameter. I was able to get up and running with that!

Yes that will be the workaround for now 👍
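For reference, here is a sketch of what that workaround could look like in the StorageClass from this report, assuming mounter: rbd is the value the node server treats as the default krbd mounter (only the mounter line is new; everything else matches the manifest above):

apiVersion: storage.k8s.io/v1
kind: StorageClass
allowVolumeExpansion: true
metadata:
  name: ceph-block
  namespace: default
parameters:
  clusterID: rook-ceph
  csi.storage.k8s.io/controller-expand-secret-name: rook-csi-rbd-provisioner
  csi.storage.k8s.io/controller-expand-secret-namespace: rook-ceph
  csi.storage.k8s.io/fstype: ext4
  csi.storage.k8s.io/node-stage-secret-name: rook-csi-rbd-node
  csi.storage.k8s.io/node-stage-secret-namespace: rook-ceph
  csi.storage.k8s.io/provisioner-secret-name: rook-csi-rbd-provisioner
  csi.storage.k8s.io/provisioner-secret-namespace: rook-ceph
  mapOptions: ms_mode=secure
  # Workaround: set the mounter explicitly so mapOptions are picked up.
  mounter: rbd
  imageFeatures: layering
  imageFormat: "2"
  pool: ceph-rbd-pool
provisioner: rook-ceph.rbd.csi.ceph.com
volumeBindingMode: WaitForFirstConsumer
reclaimPolicy: Retain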

@lbogdan

lbogdan commented May 5, 2022

is not correct, since there is no directory /sys/fs/cgroup/pids any more.
It's now /sys/fs/cgroup + *.scope + /pids.max

This isn't as simple as just changing the path; it should first detect whether we're using cgroups v1 or v2 (e.g. like here: https://github.com/containers/common/blob/a2ec40df56de42ebce03c1198495a79c2873b06e/pkg/cgroups/cgroups_supported.go#L28-L39 ), and use the correct path.
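A minimal Go sketch of that kind of detection, roughly what the linked containers/common helper does (this is not ceph-csi code; it simply checks the filesystem magic of /sys/fs/cgroup):

package main

import (
    "fmt"

    "golang.org/x/sys/unix"
)

// isCgroup2UnifiedMode reports whether /sys/fs/cgroup is the cgroup v2
// unified hierarchy, by comparing its filesystem magic number.
func isCgroup2UnifiedMode() (bool, error) {
    var st unix.Statfs_t
    if err := unix.Statfs("/sys/fs/cgroup", &st); err != nil {
        return false, err
    }
    return st.Type == unix.CGROUP2_SUPER_MAGIC, nil
}

func main() {
    v2, err := isCgroup2UnifiedMode()
    if err != nil {
        panic(err)
    }
    if v2 {
        fmt.Println("cgroup v2: pids.max is under /sys/fs/cgroup + *.scope + /pids.max")
    } else {
        fmt.Println("cgroup v1: pids.max is under /sys/fs/cgroup/pids + *.scope + /pids.max")
    }
}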

Madhu-1 added a commit to Madhu-1/ceph-csi that referenced this issue May 9, 2022
mergify bot closed this as completed in #3080 May 9, 2022