
XFS: Superblock has unknown read-only compatible features (0x4) enabled #966

Closed
chandr20 opened this issue Apr 21, 2020 · 68 comments · Fixed by #1257
Labels: bug (Something isn't working) · component/rbd (Issues related to RBD) · Priority-0 (highest priority issue)

@chandr20

Describe the bug

Mounting an XFS-formatted RBD PVC fails on the node with "wrong fs type, bad option, bad superblock"; dmesg shows "XFS (rbd0): Superblock has unknown read-only compatible features (0x4) enabled".

Environment details

  • Image/version of Ceph CSI driver v2.1.0
  • helm chart version 1.2.5
  • Kubernetes cluster version 1.16.7
  • Logs

kern.alert <1> Apr 21 09:28:18 xxxx-2-05-worker-0 kernel: XFS (rbd0): Superblock has unknown read-only compatible features (0x4) enabled.
kern.warning <4> Apr 21 09:28:18 xxxx-2-05-worker-0 kernel: XFS (rbd0): Attempted to mount read-only compatible filesystem read-write.
kern.warning <4> Apr 21 09:28:18 xxxx-2-05-worker-0 kernel: XFS (rbd0): Filesystem can only be safely mounted read only.
kern.warning <4> Apr 21 09:28:18 xxxx-2-05-worker-0 kernel: XFS (rbd0): SB validate failed with error -22.
daemon.info <30> Apr 21 09:28:18 xxxx-2-05-worker-0 kubelet: E0421 09:28:18.051150 5822 csi_attacher.go:329] kubernetes.io/csi: attacher.MountDevice failed: rpc error: code = Internal desc = mount failed: exit status 32
daemon.info <30> Apr 21 09:28:18 xxxx-2-05-worker-0 kubelet: Mounting command: mount
daemon.info <30> Apr 21 09:28:18 xxxx-2-05-worker-0 kubelet: Mounting arguments: -t xfs -o _netdev,defaults /dev/rbd0 /var/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-a886c5d8-c6f7-4c5f-b9e9-e00acdb2b940/globalmount/0001-0009-rook-ceph-0000000000000001-60ac60d1-83b2-11ea-824a-8ae73a759cf9
daemon.info <30> Apr 21 09:28:18 xxxx-2-05-worker-0 kubelet: Output: mount: /var/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-a886c5d8-c6f7-4c5f-b9e9-e00acdb2b940/globalmount/0001-0009-rook-ceph-0000000000000001-60ac60d1-83b2-11ea-824a-8ae73a759cf9: wrong fs type, bad option, bad superblock on /dev/rbd0, missing codepage or helper program, or other error.
daemon.info <30> Apr 21 09:28:18 xxxx-2-05-worker-0 kubelet: I0421 09:28:18.051177 5822 controlbuf.go:508] transport: loopyWriter.run returning. connection error: desc = "transport is closing"

Steps to reproduce

Steps to reproduce the behavior:

  1. Setup details: '...'
  2. Deployment to trigger the issue '....'
  3. See error

Actual results

Mounting the XFS-formatted RBD volume on the node fails with exit status 32; the kernel logs "Superblock has unknown read-only compatible features (0x4) enabled" and refuses to mount the filesystem read-write.

Expected behavior

The PVC is staged and mounted without errors.


@Madhu-1
Collaborator

Madhu-1 commented Apr 21, 2020

please provide steps to reproduce it.

@Madhu-1
Collaborator

Madhu-1 commented Apr 21, 2020

what is the kernel version?

@chandr20
Author

3.10.0-1062.18.1.el7.x86_64

@chandr20
Author

steps to reproduce

  1. installed k8s 1.16.7
  2. used rook helm chart 1.2.5
  3. replaced the image
    #image: quay.io/cephcsi/cephcsi:v2.1.0 (replaced 2.0.0 with 2.1.0)
  4. deployed ceph
    ceph/ceph:v14.2.7
  5. installed an app with a PVC and faced the above-mentioned issue
    the same issue is not seen when using cephcsi:v2.0.0

@Madhu-1
Collaborator

Madhu-1 commented Apr 21, 2020

I don't see any issue:

I0421 13:19:33.965733   23860 cephcsi.go:117] Driver version: v2.1.0 and Git version: b38f2c5c310fa4dfea2bc97a2e067c7a47aca188
I0421 13:19:33.966108   23860 cephcsi.go:144] Initial PID limit is set to 3146
I0421 13:19:33.966160   23860 cephcsi.go:153] Reconfigured PID limit to -1 (max)
I0421 13:19:33.966169   23860 cephcsi.go:172] Starting driver type: rbd with name: rook-ceph.rbd.csi.ceph.com
I0421 13:19:33.981185   23860 mount_linux.go:173] Cannot run systemd-run, assuming non-systemd OS
I0421 13:19:33.981205   23860 mount_linux.go:174] systemd-run failed with: exit status 1
I0421 13:19:33.981214   23860 mount_linux.go:175] systemd-run output: System has not been booted with systemd as init system (PID 1). Can't operate.
Failed to create bus connection: Host is down
W0421 13:19:33.981230   23860 driver.go:166] EnableGRPCMetrics is deprecated
I0421 13:19:33.981849   23860 server.go:116] Listening for connections on address: &net.UnixAddr{Name:"//csi/csi.sock", Net:"unix"}
I0421 13:19:34.038349   23860 utils.go:159] ID: 1 GRPC call: /csi.v1.Identity/GetPluginInfo
I0421 13:19:34.038404   23860 utils.go:160] ID: 1 GRPC request: {}
I0421 13:19:34.039040   23860 identityserver-default.go:37] ID: 1 Using default GetPluginInfo
I0421 13:19:34.039057   23860 utils.go:165] ID: 1 GRPC response: {"name":"rook-ceph.rbd.csi.ceph.com","vendor_version":"v2.1.0"}
I0421 13:19:35.756598   23860 utils.go:159] ID: 2 GRPC call: /csi.v1.Node/NodeGetInfo
I0421 13:19:35.756739   23860 utils.go:160] ID: 2 GRPC request: {}
I0421 13:19:35.776447   23860 nodeserver-default.go:58] ID: 2 Using default NodeGetInfo
I0421 13:19:35.776492   23860 utils.go:165] ID: 2 GRPC response: {"accessible_topology":{},"node_id":"minikube"}
I0421 13:20:34.136477   23860 utils.go:159] ID: 3 GRPC call: /csi.v1.Identity/Probe
I0421 13:20:34.137160   23860 utils.go:160] ID: 3 GRPC request: {}
I0421 13:20:34.140125   23860 utils.go:165] ID: 3 GRPC response: {}
I0421 13:20:49.584401   23860 utils.go:159] ID: 4 GRPC call: /csi.v1.Node/NodeGetCapabilities
I0421 13:20:49.584439   23860 utils.go:160] ID: 4 GRPC request: {}
I0421 13:20:49.585285   23860 utils.go:165] ID: 4 GRPC response: {"capabilities":[{"Type":{"Rpc":{"type":1}}},{"Type":{"Rpc":{"type":2}}},{"Type":{"Rpc":{"type":3}}}]}
I0421 13:20:49.611743   23860 utils.go:159] ID: 5 Req-ID: 0001-0009-rook-ceph-0000000000000001-de4a708e-83d2-11ea-a2e3-0242ac110004 GRPC call: /csi.v1.Node/NodeStageVolume
I0421 13:20:49.611786   23860 utils.go:160] ID: 5 Req-ID: 0001-0009-rook-ceph-0000000000000001-de4a708e-83d2-11ea-a2e3-0242ac110004 GRPC request: {"secrets":"***stripped***","staging_target_path":"/var/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-706b3361-4655-41b8-b04d-df54713d2158/globalmount","volume_capability":{"AccessType":{"Mount":{"fs_type":"xfs"}},"access_mode":{"mode":1}},"volume_context":{"clusterID":"rook-ceph","imageFeatures":"layering","imageFormat":"2","journalPool":"replicapool","pool":"replicapool","storage.kubernetes.io/csiProvisionerIdentity":"1587475187924-8081-rook-ceph.rbd.csi.ceph.com"},"volume_id":"0001-0009-rook-ceph-0000000000000001-de4a708e-83d2-11ea-a2e3-0242ac110004"}
I0421 13:20:49.616194   23860 rbd_util.go:585] ID: 5 Req-ID: 0001-0009-rook-ceph-0000000000000001-de4a708e-83d2-11ea-a2e3-0242ac110004 setting disableInUseChecks on rbd volume to: false
I0421 13:20:49.777044   23860 rbd_util.go:212] ID: 5 Req-ID: 0001-0009-rook-ceph-0000000000000001-de4a708e-83d2-11ea-a2e3-0242ac110004 rbd: status csi-vol-de4a708e-83d2-11ea-a2e3-0242ac110004 using mon 10.96.154.124:6789, pool replicapool
W0421 13:20:49.811901   23860 rbd_util.go:234] ID: 5 Req-ID: 0001-0009-rook-ceph-0000000000000001-de4a708e-83d2-11ea-a2e3-0242ac110004 rbd: no watchers on csi-vol-de4a708e-83d2-11ea-a2e3-0242ac110004
I0421 13:20:49.811920   23860 rbd_attach.go:208] ID: 5 Req-ID: 0001-0009-rook-ceph-0000000000000001-de4a708e-83d2-11ea-a2e3-0242ac110004 rbd: map mon 10.96.154.124:6789
I0421 13:20:49.856629   23860 nodeserver.go:211] ID: 5 Req-ID: 0001-0009-rook-ceph-0000000000000001-de4a708e-83d2-11ea-a2e3-0242ac110004 rbd image: 0001-0009-rook-ceph-0000000000000001-de4a708e-83d2-11ea-a2e3-0242ac110004/replicapool was successfully mapped at /dev/rbd0
I0421 13:20:49.856706   23860 mount_linux.go:405] Attempting to determine if disk "/dev/rbd0" is formatted using blkid with args: ([-p -s TYPE -s PTTYPE -o export /dev/rbd0])
I0421 13:20:49.874328   23860 mount_linux.go:408] Output: "", err: exit status 2
I0421 13:20:51.358968   23860 mount_linux.go:405] Attempting to determine if disk "/dev/rbd0" is formatted using blkid with args: ([-p -s TYPE -s PTTYPE -o export /dev/rbd0])
I0421 13:20:51.403073   23860 mount_linux.go:408] Output: "DEVNAME=/dev/rbd0\nTYPE=xfs\n", err: <nil>
I0421 13:20:51.403116   23860 mount_linux.go:298] Checking for issues with fsck on disk: /dev/rbd0
I0421 13:20:51.419592   23860 mount_linux.go:394] Attempting to mount disk /dev/rbd0 in xfs format at /var/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-706b3361-4655-41b8-b04d-df54713d2158/globalmount/0001-0009-rook-ceph-0000000000000001-de4a708e-83d2-11ea-a2e3-0242ac110004
I0421 13:20:51.419767   23860 mount_linux.go:146] Mounting cmd (mount) with arguments (-t xfs -o _netdev,defaults /dev/rbd0 /var/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-706b3361-4655-41b8-b04d-df54713d2158/globalmount/0001-0009-rook-ceph-0000000000000001-de4a708e-83d2-11ea-a2e3-0242ac110004)
I0421 13:20:51.460573   23860 nodeserver.go:187] ID: 5 Req-ID: 0001-0009-rook-ceph-0000000000000001-de4a708e-83d2-11ea-a2e3-0242ac110004 rbd: successfully mounted volume 0001-0009-rook-ceph-0000000000000001-de4a708e-83d2-11ea-a2e3-0242ac110004 to stagingTargetPath /var/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-706b3361-4655-41b8-b04d-df54713d2158/globalmount/0001-0009-rook-ceph-0000000000000001-de4a708e-83d2-11ea-a2e3-0242ac110004
I0421 13:20:51.460618   23860 utils.go:165] ID: 5 Req-ID: 0001-0009-rook-ceph-0000000000000001-de4a708e-83d2-11ea-a2e3-0242ac110004 GRPC response: {}
I0421 13:20:51.468232   23860 utils.go:159] ID: 6 GRPC call: /csi.v1.Node/NodeGetCapabilities
I0421 13:20:51.468250   23860 utils.go:160] ID: 6 GRPC request: {}
I0421 13:20:51.468566   23860 utils.go:165] ID: 6 GRPC response: {"capabilities":[{"Type":{"Rpc":{"type":1}}},{"Type":{"Rpc":{"type":2}}},{"Type":{"Rpc":{"type":3}}}]}
I0421 13:20:51.472995   23860 utils.go:159] ID: 7 Req-ID: 0001-0009-rook-ceph-0000000000000001-de4a708e-83d2-11ea-a2e3-0242ac110004 GRPC call: /csi.v1.Node/NodePublishVolume
I0421 13:20:51.473028   23860 utils.go:160] ID: 7 Req-ID: 0001-0009-rook-ceph-0000000000000001-de4a708e-83d2-11ea-a2e3-0242ac110004 GRPC request: {"staging_target_path":"/var/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-706b3361-4655-41b8-b04d-df54713d2158/globalmount","target_path":"/var/lib/kubelet/pods/42017236-7894-46a5-9000-f6487426f603/volumes/kubernetes.io~csi/pvc-706b3361-4655-41b8-b04d-df54713d2158/mount","volume_capability":{"AccessType":{"Mount":{"fs_type":"xfs"}},"access_mode":{"mode":1}},"volume_context":{"clusterID":"rook-ceph","imageFeatures":"layering","imageFormat":"2","journalPool":"replicapool","pool":"replicapool","storage.kubernetes.io/csiProvisionerIdentity":"1587475187924-8081-rook-ceph.rbd.csi.ceph.com"},"volume_id":"0001-0009-rook-ceph-0000000000000001-de4a708e-83d2-11ea-a2e3-0242ac110004"}
I0421 13:20:51.474280   23860 nodeserver.go:434] ID: 7 Req-ID: 0001-0009-rook-ceph-0000000000000001-de4a708e-83d2-11ea-a2e3-0242ac110004 target /var/lib/kubelet/pods/42017236-7894-46a5-9000-f6487426f603/volumes/kubernetes.io~csi/pvc-706b3361-4655-41b8-b04d-df54713d2158/mount
isBlock false
fstype xfs
stagingPath /var/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-706b3361-4655-41b8-b04d-df54713d2158/globalmount/0001-0009-rook-ceph-0000000000000001-de4a708e-83d2-11ea-a2e3-0242ac110004
readonly false
mountflags [bind _netdev]
I0421 13:20:51.479667   23860 mount_linux.go:173] Cannot run systemd-run, assuming non-systemd OS
I0421 13:20:51.479687   23860 mount_linux.go:174] systemd-run failed with: exit status 1
I0421 13:20:51.479696   23860 mount_linux.go:175] systemd-run output: System has not been booted with systemd as init system (PID 1). Can't operate.
Failed to create bus connection: Host is down
I0421 13:20:51.479709   23860 mount_linux.go:146] Mounting cmd (mount) with arguments (-t xfs -o bind,_netdev /var/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-706b3361-4655-41b8-b04d-df54713d2158/globalmount/0001-0009-rook-ceph-0000000000000001-de4a708e-83d2-11ea-a2e3-0242ac110004 /var/lib/kubelet/pods/42017236-7894-46a5-9000-f6487426f603/volumes/kubernetes.io~csi/pvc-706b3361-4655-41b8-b04d-df54713d2158/mount)
I0421 13:20:51.481883   23860 mount_linux.go:146] Mounting cmd (mount) with arguments (-t xfs -o bind,remount,_netdev /var/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-706b3361-4655-41b8-b04d-df54713d2158/globalmount/0001-0009-rook-ceph-0000000000000001-de4a708e-83d2-11ea-a2e3-0242ac110004 /var/lib/kubelet/pods/42017236-7894-46a5-9000-f6487426f603/volumes/kubernetes.io~csi/pvc-706b3361-4655-41b8-b04d-df54713d2158/mount)
I0421 13:20:51.488616   23860 nodeserver.go:345] ID: 7 Req-ID: 0001-0009-rook-ceph-0000000000000001-de4a708e-83d2-11ea-a2e3-0242ac110004 rbd: successfully mounted stagingPath /var/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-706b3361-4655-41b8-b04d-df54713d2158/globalmount/0001-0009-rook-ceph-0000000000000001-de4a708e-83d2-11ea-a2e3-0242ac110004 to targetPath /var/lib/kubelet/pods/42017236-7894-46a5-9000-f6487426f603/volumes/kubernetes.io~csi/pvc-706b3361-4655-41b8-b04d-df54713d2158/mount
I0421 13:20:51.488675   23860 utils.go:165] ID: 7 Req-ID: 0001-0009-rook-ceph-0000000000000001-de4a708e-83d2-11ea-a2e3-0242ac110004 GRPC response: {}
I0421 13:20:57.024952   23860 utils.go:159] ID: 8 GRPC call: /csi.v1.Node/NodeGetCapabilities
I0421 13:20:57.024991   23860 utils.go:160] ID: 8 GRPC request: {}
I0421 13:20:57.026729   23860 utils.go:165] ID: 8 GRPC response: {"capabilities":[{"Type":{"Rpc":{"type":1}}},{"Type":{"Rpc":{"type":2}}},{"Type":{"Rpc":{"type":3}}}]}
I0421 13:20:57.062705   23860 utils.go:159] ID: 9 GRPC call: /csi.v1.Node/NodeGetVolumeStats
I0421 13:20:57.062836   23860 utils.go:160] ID: 9 GRPC request: {"volume_id":"0001-0009-rook-ceph-0000000000000001-de4a708e-83d2-11ea-a2e3-0242ac110004","volume_path":"/var/lib/kubelet/pods/42017236-7894-46a5-9000-f6487426f603/volumes/kubernetes.io~csi/pvc-706b3361-4655-41b8-b04d-df54713d2158/mount"}
I0421 13:20:57.100511   23860 mount_linux.go:173] Cannot run systemd-run, assuming non-systemd OS
I0421 13:20:57.100588   23860 mount_linux.go:174] systemd-run failed with: exit status 1
I0421 13:20:57.100603   23860 mount_linux.go:175] systemd-run output: System has not been booted with systemd as init system (PID 1). Can't operate.
Failed to create bus connection: Host is down
I0421 13:20:57.100908   23860 utils.go:165] ID: 9 GRPC response: {"usage":[{"available":1019609088,"total":1061158912,"unit":1,"used":41549824},{"available":524285,"total":524288,"unit":2,"used":3}]}

@Madhu-1
Collaborator

Madhu-1 commented Apr 22, 2020

closing this as not a CSI issue, feel free to reopen if the issue persists

@Madhu-1 Madhu-1 closed this as completed Apr 22, 2020
@chandr20
Author

The moment I run the following mkfs operation manually:

mkfs.xfs /dev/rbd0 -f

meta-data=/dev/rbd0              isize=512    agcount=16, agsize=4096000 blks
         =                       sectsz=512   attr=2, projid32bit=1
         =                       crc=1        finobt=0, sparse=0
data     =                       bsize=4096   blocks=65536000, imaxpct=25
         =                       sunit=1024   swidth=1024 blks
naming   =version 2              bsize=4096   ascii-ci=0 ftype=1
log      =internal log           bsize=4096   blocks=32000, version=2
         =                       sectsz=512   sunit=8 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0

The mount operation is successful

└─ceph--76d4bce9--fcf7--45c0--8483--9d17cf44ea36-osd--data--c7294b13--dadc--4524--8678--10697150f51f
253:0 0 199G 0 lvm
rbd0 252:0 0 250G 0 disk /var/lib/kubelet/pods/fbbaf794-4df8-4748-952e-a83de1e3c5cd/volumes/kubernetes.io~csi/pvc-69b96a62-87fc-4a77-8a57-8a351

@chandr20
Author

@Madhu-1 could you please reopen the issue?

@Madhu-1
Collaborator

Madhu-1 commented Apr 22, 2020

@chandr20 I am not seeing this issue, but reopening it again. If you can give exact steps to reproduce it, I am happy to try again.

@Madhu-1 Madhu-1 reopened this Apr 22, 2020
@nixpanic nixpanic added question Further information is requested need test case labels Apr 22, 2020
@chandr20
Author

steps to reproduce

kernel (3.10.0-1062.18.1.el7.x86_64)

  1. installed k8s 1.16.7
  2. used rook helm chart 1.2.5
  3. replaced the image
    #image: quay.io/cephcsi/cephcsi:v2.1.0 (replaced 2.0.0 with 2.1.0)
  4. deployed ceph:v14.2.7
  5. installed an app with a PVC and faced the above-mentioned issue
    the issue occurs only when using cephcsi:v2.1.0 and is not seen when using cephcsi:v2.0.0

@iExalt

iExalt commented Apr 23, 2020

Having the same issue with
K8s: 1.18.0
Rook operator: 1.3.1
Ceph: 14.2.8

Steps:

  1. Create xfs rbd pvc
  2. Snapshot rbd pvc
  3. Create volume from snapshot
  4. Try to mount cloned volume with a pod
  Normal   Scheduled               12s                default-scheduler             Successfully assigned runescrape-staging/timescale-testing-6ffc65b887-9tvn2 to clem-cloud.clem.com
  Normal   SuccessfulAttachVolume  11s                attachdetach-controller       AttachVolume.Attach succeeded for volume "pvc-d42e5d41-8dbd-4719-aeb7-c8070c1592cc"
  Warning  FailedMount             1s (x3 over 3s)    kubelet, clem-cloud.clem.com  MountVolume.MountDevice failed for volume "pvc-d42e5d41-8dbd-4719-aeb7-c8070c1592cc" : rpc error: code = Internal desc = mount failed: exit status 32
Mounting command: mount
Mounting arguments: -t xfs -o _netdev,defaults /dev/rbd4 /var/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-d42e5d41-8dbd-4719-aeb7-c8070c1592cc/globalmount/0001-0009-rook-ceph-0000000000000002-b18fec4c-85a0-11ea-a08b-067448b0f89f
Output: mount: /var/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-d42e5d41-8dbd-4719-aeb7-c8070c1592cc/globalmount/0001-0009-rook-ceph-0000000000000002-b18fec4c-85a0-11ea-a08b-067448b0f89f: wrong fs type, bad option, bad superblock on /dev/rbd4, missing codepage or helper program, or other error.

xfs_repair -L /dev/rbd5 and xfs_admin -U generate /dev/rbd5 allow the PVC to be mounted, at the cost of the snapshotted data. Actually, there seems to be a separate issue with snapshots where the data isn't copied(?)

@Madhu-1
Collaborator

Madhu-1 commented Apr 24, 2020

@iExalt please provide the details below:

  • What is the kernel version here?
  • Which cephcsi version are you using?
  • Is it happening for cloned PVCs or for normal PVCs as well?

@Madhu-1
Collaborator

Madhu-1 commented Apr 24, 2020

@dillaman @humblec is it something related to the kernel version? I could not reproduce it on a 4.19 kernel.

@humblec
Collaborator

humblec commented Apr 24, 2020

At a glance, rather than a kernel issue, my suspicion is that this happens when we mount volumes that carry the same filesystem UUID. That can happen in a setup with a snapshot/clone: it looks like a volume created from a snapshot will have the same UUID as the source volume, so you can only mount one of them at a time.
That's why you are able to work around this issue with xfs_admin -U generate /dev/rbd5, which actually regenerates the UUID for the cloned volume!

I could be wrong in this theory though.

@humblec
Collaborator

humblec commented Apr 24, 2020


This theory can be validated by taking a look at the parent and cloned volume UUIDs. If both are the same before regenerating the UUID, it could be the same root cause.

@Madhu-1 @iExalt @chandr20

@Madhu-1
Collaborator

Madhu-1 commented Apr 24, 2020

> At a glance, rather than a kernel issue, my suspicion is that this happens when we mount volumes that carry the same filesystem UUID. That can happen in a setup with a snapshot/clone.

@humblec this is happening even without a snapshot, as mentioned in #966 (comment)

> It looks like a volume created from a snapshot will have the same UUID as the source volume, so you can only mount one of them at a time.

the same UUID is not possible, we always ensure the generated UUID is unique; even the mount path is always unique, isn't it, as it is (PVC + volumeID)

> That's why you are able to work around this issue with xfs_admin -U generate /dev/rbd5, which actually regenerates the UUID for the cloned volume!
>
> I could be wrong in this theory though.

the one mentioned in #966 (comment) does not require the xfs_admin command

@humblec
Collaborator

humblec commented Apr 24, 2020

@Madhu-1 I am referring to the filesystem UUID, not the volumeID we create :)

@iExalt in your comment #966 (comment), the error is on /dev/rbd4 but the repair or regeneration happens on /dev/rbd5. Why is that so? Is /dev/rbd5 also in use by another pod?

@humblec
Collaborator

humblec commented Apr 24, 2020

Again, if this is because of a UUID clash, the -o nouuid option with the mount command could help, or at least work around the issue.
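
For illustration, a minimal sketch (hypothetical helper name, not the actual ceph-csi code) of how the node plugin could append nouuid when the volume is formatted as XFS:

```go
package main

import "fmt"

// addNoUUIDForXFS appends the "nouuid" mount option for XFS volumes.
// "nouuid" tells XFS to skip its filesystem-UUID uniqueness check, so a
// clone that still carries the UUID of its parent image can be mounted
// alongside it.
func addNoUUIDForXFS(fsType string, mountOptions []string) []string {
	if fsType != "xfs" {
		return mountOptions
	}
	for _, opt := range mountOptions {
		if opt == "nouuid" {
			return mountOptions // already requested by the caller
		}
	}
	return append(mountOptions, "nouuid")
}

func main() {
	fmt.Println(addNoUUIDForXFS("xfs", []string{"_netdev", "defaults"}))
	// prints: [_netdev defaults nouuid]
}
```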

@humblec humblec added the bug Something isn't working label Apr 24, 2020
@iExalt

iExalt commented Apr 24, 2020

@humblec I tried it again (and ran the commands on) another pvc, which was given /dev/rbd5. Sorry about the confusion.

@iExalt

iExalt commented Apr 24, 2020

@Madhu-1 I believe I'm using kernel 5.4, will check tomorrow. I'm using kernel 5.3.0-42-generic with Ceph CSI 2.1.0.
I've only seen this issue with cloned volumes.

@humblec
Collaborator

humblec commented Apr 24, 2020

> @humblec I tried it again (and ran the commands on) another pvc, which was given /dev/rbd5. Sorry about the confusion.

@iExalt no worries :) By any chance, could you gather the source volume and cloned volume UUIDs when this issue occurs?

@iExalt

iExalt commented Apr 24, 2020

@humblec Will do tomorrow

@Madhu-1
Collaborator

Madhu-1 commented Apr 24, 2020

Just to add, nothing has changed between v2.0.1 and v2.1.0:

```go
if existingFormat == "" /* && !staticVol */ {
	args := []string{}
	if fsType == "ext4" {
		args = []string{"-m0", "-Enodiscard,lazy_itable_init=1,lazy_journal_init=1", devicePath}
	} else if fsType == "xfs" {
		args = []string{"-K", devicePath}
	}
	if len(args) > 0 {
		cmdOut, cmdErr := diskMounter.Exec.Command("mkfs."+fsType, args...).CombinedOutput()
		if cmdErr != nil {
			klog.Errorf(util.Log(ctx, "failed to run mkfs error: %v, output: %v"), cmdErr, cmdOut)
			return cmdErr
		}
	}
}
```

and

```go
if existingFormat == "" && !staticVol {
	args := []string{}
	if fsType == "ext4" {
		args = []string{"-m0", "-Enodiscard,lazy_itable_init=1,lazy_journal_init=1", devicePath}
	} else if fsType == "xfs" {
		args = []string{"-K", devicePath}
	}
	if len(args) > 0 {
		cmdOut, cmdErr := diskMounter.Exec.Command("mkfs."+fsType, args...).CombinedOutput()
		if cmdErr != nil {
			klog.Errorf(util.Log(ctx, "failed to run mkfs error: %v, output: %v"), cmdErr, cmdOut)
			return cmdErr
		}
	}
}
```

@iExalt

iExalt commented Apr 24, 2020

I actually upgraded from v2.0.0 hoping that it would fix another error upon mounting a cloned PVC - interesting to know that nothing has changed though.

@iExalt

iExalt commented Apr 24, 2020

@humblec The filesystem UUID is indeed the same, so it makes sense that regenerating the UUID would fix the mount. Having said that, even though the PVC mounts successfully, it doesn't have my cloned data on it, which concerns me.

/dev/rbd1 is the original PVC
/dev/rbd6 is the cloned PVC

root@clem-cloud:~# xfs_admin -u /dev/rbd1
UUID = e57983d6-8717-4471-8834-626924389c0d
root@clem-cloud:~# xfs_admin -u /dev/rbd6
UUID = e57983d6-8717-4471-8834-626924389c0d

How would I try the -o nouuid mount option? I looked around in the CephCSI docs for adding mount options, but I'm not sure which spec to change or the name of the field.

mergify bot pushed a commit that referenced this issue May 5, 2020
The problem happens when multiple PVCs with the
same UUID are attached/mounted on a node. This
can happen after creating a PVC from a snapshot,
or cloning a PVC.

make nouuid as the default mount option if
the format type is xfs to avoid mounting
issues.

updates: #966

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
mergify bot pushed a commit that referenced this issue May 5, 2020
The problem happens when multiple PVCs with the
same UUID are attached/mounted on a node. This
can happen after creating a PVC from a snapshot,
or cloning a PVC.

make nouuid as the default mount option if
the format type is xfs to avoid mounting
issues.

updates: #966

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
(cherry picked from commit 22a86c5)
humblec pushed a commit that referenced this issue May 5, 2020
The problem happens when multiple PVCs with the
same UUID are attached/mounted on a node. This
can happen after creating a PVC from a snapshot,
or cloning a PVC.

make nouuid as the default mount option if
the format type is xfs to avoid mounting
issues.

updates: #966

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
(cherry picked from commit 22a86c5)
@chandr20
Author

chandr20 commented May 5, 2020

@humblec @Madhu-1 @dillaman @nixpanic @iExalt
Thanks for the support, the issue is fixed in cephcsi-2.1.1.

@humblec
Collaborator

humblec commented May 5, 2020

@chandr20 Thanks for the confirmation! Also appreciate you getting back with the result!

@nixpanic
Member

nixpanic commented May 6, 2020

Our current e2e cannot test this, as it requires a provisioner with CentOS-8 as its base while running attachers on CentOS-7. Removing the "need test case" label.

@nixpanic nixpanic added this to the release-3.0.0 milestone Jul 1, 2020
@nixpanic
Member

nixpanic commented Jul 1, 2020

While considering an approach to address this, I wonder if there is a reason why the provisioner does not do the full provisioning (and create the filesystem). Could you tell me why that is, @humblec or @Madhu-1? Would it not be cleaner to have a separation between provisioning and consumption? Now there is partial provisioning done, and the final steps are done in the NodeStageVolume procedure (creating a filesystem when it does not exist).

@dillaman

dillaman commented Jul 1, 2020

> Now there is partial provisioning done, and the final steps are done in the NodeStageVolume procedure (creating a filesystem when it does not exist).

Hmm -- would it be any better to have the controller map and mkfs the FS upon creation? The controller is a bottleneck right now as it is (since there is only 1 active), and it too could be running a different kernel version than the worker nodes, so it wouldn't address this issue, right?

@nixpanic
Member

nixpanic commented Jul 2, 2020

> Hmm -- would it be any better to have the controller map and mkfs the FS upon creation? The controller is a bottleneck right now as it is (since there is only 1 active), and it too could be running a different kernel version than the worker nodes, so it wouldn't address this issue, right?

It would not address the issue. The mkfs.xfs that is used for the formatting comes from the base Ceph container and can well be a different version than what is available on the host running the containers. The question was more about the correctness of the CSI-spec implementation.

@dillaman

dillaman commented Jul 2, 2020

> The question was more about the correctness of the CSI-spec implementation.

Is there a specific part of the spec that you think ceph-csi is out of compliance with?

@Madhu-1
Collaborator

Madhu-1 commented Jul 2, 2020

> While considering an approach to address this, I wonder if there is a reason why the provisioner does not do the full provisioning (and create the filesystem). Could you tell me why that is, @humblec or @Madhu-1? Would it not be cleaner to have a separation between provisioning and consumption? Now there is partial provisioning done, and the final steps are done in the NodeStageVolume procedure (creating a filesystem when it does not exist).

the provisioner's job is just to create the RBD images; it is the node plugin's job to make sure the image is available at the given path on the node where the PVC needs to be mounted.

is it possible to check whether mkfs supports reflink or not when formatting? If supported, use it; else don't use it.

@Madhu-1
Collaborator

Madhu-1 commented Jul 20, 2020

If we don't fix this issue before 3.0.0, we will hit it again. Either we need to change our base image back to nautilus, or fix this issue in code.

@Madhu-1 Madhu-1 added the Priority-0 highest priority issue label Jul 20, 2020
@nixpanic
Member

> is it possible to check whether mkfs supports reflink or not when formatting? If supported, use it; else don't use it.

This would make the PVC unavailable in case it gets attached to a node that does not support reflink. It would be unpredictable and difficult for users to understand why a PVC can not be used on some nodes, but may work on others.

We can run mkfs.xfs and check for reflink= in its stderr. In case it is there, pass -m reflink=0 on the cmdline before formatting. This should ideally be a config option, as some workloads can benefit from reflinking.

Shall we add an option to the StorageClass, like this:

---
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
   name: csi-rbd-sc
provisioner: rbd.csi.ceph.com
parameters:
   clusterID: <cluster-id>
   pool: rbd
   reflink: disabled|enabled|auto

Currently reflink is only available for XFS, and disabling by default is the most suitable option:

  • disabled: detect support, and pass -m reflink=0
  • enabled: force enabling, always pass -m reflink=1
  • auto: do nothing, do not pass -m reflink= and make this default in some next Ceph-CSI version
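
For illustration, a minimal sketch of the detection described above (function names are illustrative, not the actual ceph-csi code):

```go
package main

import (
	"fmt"
	"os/exec"
	"strings"
)

// mkfsXFSSupportsReflink runs "mkfs.xfs" without arguments; the binary
// prints its usage text (which mentions "reflink=" on recent versions) and
// exits non-zero, so the error from running it is expected and ignored.
func mkfsXFSSupportsReflink() bool {
	out, _ := exec.Command("mkfs.xfs").CombinedOutput()
	return strings.Contains(string(out), "reflink=")
}

// xfsMkfsArgs builds the mkfs.xfs argument list for a device, forcing
// reflink off when the installed mkfs.xfs would otherwise enable it.
func xfsMkfsArgs(devicePath string) []string {
	args := []string{"-K"} // ceph-csi already passes -K (no discard) for XFS
	if mkfsXFSSupportsReflink() {
		args = append(args, "-m", "reflink=0")
	}
	return append(args, devicePath)
}

func main() {
	fmt.Println(xfsMkfsArgs("/dev/rbd0"))
}
```

The result of the support check could be cached so it only runs once per nodeserver.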

@nixpanic
Member

After a chat with @Madhu-1 on the ceph-csi slack, we decided to always disable support for reflink. #1256 has been reported to make it possible to enable reflink in the future.

nixpanic added a commit to nixpanic/ceph-csi that referenced this issue Jul 21, 2020
Current versions of the mkfs.xfs binary enable reflink support by
default. This causes problems on systems where the kernel does not
support this feature. When the kernel does not support the feature, but
the filesystem has it enabled, the following error is logged in `dmesg`:

    XFS: Superblock has unknown read-only compatible features (0x4) enabled

Introduce a check to see if mkfs.xfs supports the `-m reflink=` option.
In case it does, pass `-m reflink=0` while creating the filesystem.

The check is executed once during the first XFS filesystem creation. The
result of the check is cached until the nodeserver restarts.

Fixes: ceph#966
Signed-off-by: Niels de Vos <ndevos@redhat.com>
@nixpanic nixpanic added the component/rbd Issues related to RBD label Jul 21, 2020
@mergify mergify bot closed this as completed in #1257 Jul 24, 2020
mergify bot pushed a commit that referenced this issue Jul 24, 2020
Current versions of the mkfs.xfs binary enable reflink support by
default. This causes problems on systems where the kernel does not
support this feature. When the kernel does not support the feature, but
the filesystem has it enabled, the following error is logged in `dmesg`:

    XFS: Superblock has unknown read-only compatible features (0x4) enabled

Introduce a check to see if mkfs.xfs supports the `-m reflink=` option.
In case it does, pass `-m reflink=0` while creating the filesystem.

The check is executed once during the first XFS filesystem creation. The
result of the check is cached until the nodeserver restarts.

Fixes: #966
Signed-off-by: Niels de Vos <ndevos@redhat.com>