
Unable to create pvc using cephfs #3730

Closed
franitel opened this issue Mar 29, 2023 · 12 comments
Labels
component/cephfs (Issues related to CephFS), component/deployment (Helm chart, kubernetes templates and configuration Issues/PRs), question (Further information is requested), wontfix (This will not be worked on)

Comments

@franitel

Describe the bug

We have a CephFS cluster and have deployed the ceph-csi-cephfs chart in our Kubernetes cluster (v1.19.9) with the following values:

USER-SUPPLIED VALUES:
csiConfig:
- cephFS:
    subvolumeGroup: csi
  clusterID: 8fxxxxxxxxxxxxxxxxxxxxxxxa0
  monitors:
  - 172.22.14.201:6789
  - 172.22.14.202:6789
  - 172.22.14.203:6789
provisioner:
  replicaCount: 1
secret:
  adminID: admin
  adminKey: AQCccccccccccccccccccccc==
  create: true
storageClass:
  allowVolumeExpansion: true
  clusterID: 8fxxxxxxxxxxxxxxxxxxxxxxxa0
  create: true
  fsName: k8sfs
  name: csi-cephfs-sc
  reclaimPolicy: Delete

We have verified that we can mount the volume on the Kubernetes nodes using the following command:

mount -v -t ceph -o name=admin,secretfile=ceph.keyring 172.22.14.203:6789:/k8scephfs /srv/cephfs3/

but when we try to deploy a PVC, it stays in Pending status:

[screenshot: PVC stuck in Pending]

This is the Ceph status:

[screenshots: ceph status output]

Here you can see the logs (ceph.audit.log) when we try to deploy the PVC:

[screenshot: ceph.audit.log entries]


Environment details

  • Image/version of Ceph CSI driver : quay.io/cephcsi/cephcsi:v3.8.0 (we first tested v3.4.0 with the same result)

  • Helm chart version : ceph-csi/ceph-csi-cephfs 3.8.0

  • Kernel version : Linux vm-k8s-test-worker-1 4.15.0-109-generic #110-Ubuntu SMP Tue Jun 23 02:39:32 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux

  • Mounter used for mounting PVC (for CephFS it's fuse or kernel; for RBD it's krbd or rbd-nbd) :

  • Kubernetes cluster version :

  • Ceph cluster version :

Steps to reproduce

Steps to reproduce the behavior:

  1. Deploy the chart with the previous values
  2. Deploy a PVC and a StatefulSet
  3. See the error: the PVC stays in Pending status and never binds

Actual results

When we try to deploy a PVC, it gets stuck in Pending status.

Expected behavior

The pod runs with the PVC mounted inside it.

Logs

  • csi-rbdplugin/csi-cephfsplugin and driver-registrar container logs from
    plugin pod from the node where the mount is failing.
I0329 10:59:56.394232       1 utils.go:195] ID: 26 Req-ID: pvc-7edca0a0-f18f-4de1-9ba9-73bf239ab424 GRPC call: /csi.v1.Controller/CreateVolume
I0329 10:59:56.394554       1 utils.go:206] ID: 26 Req-ID: pvc-7edca0a0-f18f-4de1-9ba9-73bf239ab424 GRPC request: {"capacity_range":{"required_bytes":1073741824},"name":"pvc-7edca0a0-f18f-4de1-9ba9-73bf239ab424","parameters":{"clusterID":"8f7b6e44-5b7a-48f7-83c5-dd83fb0b7ea0","csi.storage.k8s.io/pv/name":"pvc-7edca0a0-f18f-4de1-9ba9-73bf239ab424","csi.storage.k8s.io/pvc/name":"www-web-0","csi.storage.k8s.io/pvc/namespace":"storage","fsName":"k8s_fs"},"secrets":"***stripped***","volume_capabilities":[{"AccessType":{"Mount":{}},"access_mode":{"mode":7}}]}
E0329 10:59:56.394667       1 controllerserver.go:269] ID: 26 Req-ID: pvc-7edca0a0-f18f-4de1-9ba9-73bf239ab424 an operation with the given Volume ID pvc-7edca0a0-f18f-4de1-9ba9-73bf239ab424 already exists
E0329 10:59:56.394697       1 utils.go:210] ID: 26 Req-ID: pvc-7edca0a0-f18f-4de1-9ba9-73bf239ab424 GRPC error: rpc error: code = Aborted desc = an operation with the given Volume ID pvc-7edca0a0-f18f-4de1-9ba9-73bf239ab424 already exists
I0329 11:00:28.774657       1 utils.go:195] ID: 27 GRPC call: /csi.v1.Identity/Probe
I0329 11:00:28.774902       1 utils.go:206] ID: 27 GRPC request: {}
I0329 11:00:28.774974       1 utils.go:212] ID: 27 GRPC response: {}
I0329 11:01:28.749331       1 utils.go:195] ID: 28 GRPC call: /csi.v1.Identity/Probe
I0329 11:01:28.751994       1 utils.go:206] ID: 28 GRPC request: {}
I0329 11:01:28.752024       1 utils.go:212] ID: 28 GRPC response: {}
I0329 11:02:04.408556       1 utils.go:195] ID: 29 Req-ID: pvc-7edca0a0-f18f-4de1-9ba9-73bf239ab424 GRPC call: /csi.v1.Controller/CreateVolume
I0329 11:02:04.408948       1 utils.go:206] ID: 29 Req-ID: pvc-7edca0a0-f18f-4de1-9ba9-73bf239ab424 GRPC request: {"capacity_range":{"required_bytes":1073741824},"name":"pvc-7edca0a0-f18f-4de1-9ba9-73bf239ab424","parameters":{"clusterID":"8f7b6e44-5b7a-48f7-83c5-dd83fb0b7ea0","csi.storage.k8s.io/pv/name":"pvc-7edca0a0-f18f-4de1-9ba9-73bf239ab424","csi.storage.k8s.io/pvc/name":"www-web-0","csi.storage.k8s.io/pvc/namespace":"storage","fsName":"k8s_fs"},"secrets":"***stripped***","volume_capabilities":[{"AccessType":{"Mount":{}},"access_mode":{"mode":7}}]}
E0329 11:02:04.409167       1 controllerserver.go:269] ID: 29 Req-ID: pvc-7edca0a0-f18f-4de1-9ba9-73bf239ab424 an operation with the given Volume ID pvc-7edca0a0-f18f-4de1-9ba9-73bf239ab424 already exists
E0329 11:02:04.409292       1 utils.go:210] ID: 29 Req-ID: pvc-7edca0a0-f18f-4de1-9ba9-73bf239ab424 GRPC error: rpc error: code = Aborted desc = an operation with the given Volume ID pvc-7edca0a0-f18f-4de1-9ba9-73bf239ab424 already exists
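The `Aborted ... already exists` errors above are a symptom rather than the root cause: the first CreateVolume attempt timed out (the provisioner's `DeadlineExceeded`) but is still tracked in the driver's in-flight operation list, so every retry for the same PVC is rejected. A rough Python sketch of that guard pattern (hypothetical names; ceph-csi's actual implementation is in Go and differs in detail):

```python
# Sketch of the "operation already exists" guard CSI drivers use to make
# CreateVolume idempotent. Names are hypothetical, not ceph-csi's code.
class InFlightError(Exception):
    pass

class VolumeLocks:
    def __init__(self):
        self._in_flight = set()

    def try_acquire(self, volume_id: str) -> None:
        # Reject a concurrent/retried request for the same volume while the
        # first one is still running -- this is the "Aborted" in the logs.
        if volume_id in self._in_flight:
            raise InFlightError(
                f"an operation with the given Volume ID {volume_id} already exists"
            )
        self._in_flight.add(volume_id)

    def release(self, volume_id: str) -> None:
        self._in_flight.discard(volume_id)

locks = VolumeLocks()
locks.try_acquire("pvc-7edca0a0")      # first CreateVolume starts, then hangs
try:
    locks.try_acquire("pvc-7edca0a0")  # the provisioner's retry is rejected
except InFlightError as e:
    print(e)
locks.release("pvc-7edca0a0")          # only after the first call finishes
locks.try_acquire("pvc-7edca0a0")      # can a retry proceed
```

In other words, the question is why the first call hangs long enough to hit the deadline, not why the retries abort.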

If the issue is in PVC creation, deletion, cloning please attach complete logs
of below containers.

  • csi-provisioner and csi-rbdplugin/csi-cephfsplugin container logs from the
    provisioner pod.
I0329 10:54:48.792549       1 event.go:285] Event(v1.ObjectReference{Kind:"PersistentVolumeClaim", Namespace:"storage", Name:"www-web-0", UID:"7edca0a0-f18f-4de1-9ba9-73bf239ab424", APIVersion:"v1", ResourceVersion:"128546450", FieldPath:""}): type: 'Normal' reason: 'Provisioning' External provisioner is provisioning volume for claim "storage/www-web-0"
W0329 10:57:48.792937       1 controller.go:934] Retrying syncing claim "7edca0a0-f18f-4de1-9ba9-73bf239ab424", failure 0
E0329 10:57:48.793184       1 controller.go:957] error syncing claim "7edca0a0-f18f-4de1-9ba9-73bf239ab424": failed to provision volume with StorageClass "csi-cephfs-sc": rpc error: code = DeadlineExceeded desc = context deadline exceeded
I0329 10:57:48.793844       1 event.go:285] Event(v1.ObjectReference{Kind:"PersistentVolumeClaim", Namespace:"storage", Name:"www-web-0", UID:"7edca0a0-f18f-4de1-9ba9-73bf239ab424", APIVersion:"v1", ResourceVersion:"128546450", FieldPath:""}): type: 'Warning' reason: 'ProvisioningFailed' failed to provision volume with StorageClass "csi-cephfs-sc": rpc error: code = DeadlineExceeded desc = context deadline exceeded
I0329 10:57:49.293732       1 controller.go:1337] provision "storage/www-web-0" class "csi-cephfs-sc": started
I0329 10:57:49.294592       1 event.go:285] Event(v1.ObjectReference{Kind:"PersistentVolumeClaim", Namespace:"storage", Name:"www-web-0", UID:"7edca0a0-f18f-4de1-9ba9-73bf239ab424", APIVersion:"v1", ResourceVersion:"128546450", FieldPath:""}): type: 'Normal' reason: 'Provisioning' External provisioner is provisioning volume for claim "storage/www-web-0"
W0329 10:57:49.303371       1 controller.go:934] Retrying syncing claim "7edca0a0-f18f-4de1-9ba9-73bf239ab424", failure 1
E0329 10:57:49.303401       1 controller.go:957] error syncing claim "7edca0a0-f18f-4de1-9ba9-73bf239ab424": failed to provision volume with StorageClass "csi-cephfs-sc": rpc error: code = Aborted desc = an operation with the given Volume ID pvc-7edca0a0-f18f-4de1-9ba9-73bf239ab424 already exists
I0329 10:57:49.303416       1 event.go:285] Event(v1.ObjectReference{Kind:"PersistentVolumeClaim", Namespace:"storage", Name:"www-web-0", UID:"7edca0a0-f18f-4de1-9ba9-73bf239ab424", APIVersion:"v1", ResourceVersion:"128546450", FieldPath:""}): type: 'Warning' reason: 'ProvisioningFailed' failed to provision volume with StorageClass "csi-cephfs-sc": rpc error: code = Aborted desc = an operation with the given Volume ID pvc-7edca0a0-f18f-4de1-9ba9-73bf239ab424 already exists
I0329 10:57:50.303664       1 controller.go:1337] provision "storage/www-web-0" class "csi-cephfs-sc": started
I0329 10:57:50.303858       1 event.go:285] Event(v1.ObjectReference{Kind:"PersistentVolumeClaim", Namespace:"storage", Name:"www-web-0", UID:"7edca0a0-f18f-4de1-9ba9-73bf239ab424", APIVersion:"v1", ResourceVersion:"128546450", FieldPath:""}): type: 'Normal' reason: 'Provisioning' External provisioner is provisioning volume for claim "storage/www-web-0"




W0329 10:57:50.322911       1 controller.go:934] Retrying syncing claim "7edca0a0-f18f-4de1-9ba9-73bf239ab424", failure 2
E0329 10:57:50.323268       1 controller.go:957] error syncing claim "7edca0a0-f18f-4de1-9ba9-73bf239ab424": failed to provision volume with StorageClass "csi-cephfs-sc": rpc error: code = Aborted desc = an operation with the given Volume ID pvc-7edca0a0-f18f-4de1-9ba9-73bf239ab424 already exists
I0329 10:57:50.323380       1 event.go:285] Event(v1.ObjectReference{Kind:"PersistentVolumeClaim", Namespace:"storage", Name:"www-web-0", UID:"7edca0a0-f18f-4de1-9ba9-73bf239ab424", APIVersion:"v1", ResourceVersion:"128546450", FieldPath:""}): type: 'Warning' reason: 'ProvisioningFailed' failed to provision volume with StorageClass "csi-cephfs-sc": rpc error: code = Aborted desc = an operation with the given Volume ID pvc-7edca0a0-f18f-4de1-9ba9-73bf239ab424 already exists

If you need any other data, please just tell me.

Francisco Rodriguez

@Madhu-1
Collaborator

Madhu-1 commented Mar 29, 2023

@franitel please check whether it is a network connectivity issue; https://rook.io/docs/rook/latest/Troubleshooting/ceph-csi-common-issues/#ceph-health can help you with the debugging steps.

@franitel
Author

Hi Madhu,
I'm going to check.
Thanks for your quick response!!

@ppodevlabs

@franitel please check whether it is a network connectivity issue; https://rook.io/docs/rook/latest/Troubleshooting/ceph-csi-common-issues/#ceph-health can help you with the debugging steps.

We have checked this before and it seems to be working fine:

root@ceph-mon-test1:/home/pedrop# ceph health detail
HEALTH_OK
❯ k exec -it -n storage ceph-csi-driver-ceph-csi-cephfs-provisioner-7f84fc97fb-j9rwh -c csi-cephfsplugin -- /bin/bash
[root@vm-k8s-test-worker-2 /]# curl 172.22.14.201:3300 2>/dev/null
ceph v2

@Madhu-1
Collaborator

Madhu-1 commented Mar 29, 2023

monitors:

  • 172.22.14.201:6789
  • 172.22.14.202:6789
  • 172.22.14.203:6789

Did you check for the 6789 port as well?

@ppodevlabs

monitors:

  • 172.22.14.201:6789
  • 172.22.14.202:6789
  • 172.22.14.203:6789

Did you check for the 6789 port as well?

yes

root@vm-k8s-test-worker-2:/home/pedrop# nc -zv 172.22.14.201 6789
Connection to 172.22.14.201 6789 port [tcp/*] succeeded!
❯ k exec -it -n storage ceph-csi-driver-ceph-csi-cephfs-provisioner-7f84fc97fb-j9rwh -c csi-cephfsplugin -- /bin/bash
[root@vm-k8s-test-worker-2 /]# curl 172.22.14.201:6789 2>/dev/null
[root@vm-k8s-test-worker-2 /]#
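The `nc`/`curl` probes above can also be scripted for all three monitors at once; a minimal Python sketch (the monitor IPs are the ones from this issue's csiConfig; the helper name is hypothetical):

```python
import socket

def monitor_reachable(host: str, port: int, timeout: float = 2.0) -> bool:
    """Scripted equivalent of `nc -zv host port`: True if a TCP connection
    to host:port opens within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Monitor endpoints from the csiConfig in this issue; run this from inside
# the provisioner pod to test reachability from the CSI driver's netns.
for mon in ("172.22.14.201", "172.22.14.202", "172.22.14.203"):
    print(mon, "port 6789 reachable:", monitor_reachable(mon, 6789))
```

Note this only proves the TCP path is open, as the checks above do; it says nothing about cephx authentication or MDS health.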

@Madhu-1
Collaborator

Madhu-1 commented Mar 29, 2023

@ppodevlabs are you able to execute ceph commands from the container? See https://www.mrajanna.com/troubleshooting-cephcsi/; it might help you.

@ppodevlabs

ppodevlabs commented Mar 29, 2023

@Madhu-1 I've managed to execute commands from the plugin container, but I have to specify the user/keyring. Could it be that the provisioner is not fetching the configuration properly?

k exec -it -n storage ceph-csi-driver-ceph-csi-cephfs-provisioner-9584bc97-j6vgq -c csi-cephfsplugin -- /bin/bash
[root@vm-k8s-test-worker-2 /]# ceph status --user=admin --key=my_key
  cluster:
    id:     8f7b6e44-5b7a-48f7-83c5-dd83fb0b7ea0
    health: HEALTH_OK

  services:
    mon: 3 daemons, quorum ceph-mon-test2,ceph-mon-test3,ceph-mon-test1 (age 6h)
    mgr: ceph-mon-test3(active, since 5d), standbys: ceph-mon-test2, ceph-mon-test1
    mds: k8s_fs:1 {0=ceph-mon-test1=up:active} 2 up:standby
    osd: 6 osds: 6 up (since 5d), 6 in (since 8d)
    rgw: 3 daemons active (radosgw.ceph-mon-test1, radosgw.ceph-mon-test2, radosgw.ceph-mon-test3)

  task status:

  data:
    pools:   21 pools, 337 pgs
    objects: 564 objects, 914 MiB
    usage:   8.7 GiB used, 51 GiB / 60 GiB avail
    pgs:     337 active+clean

The keyring file in /etc/ceph/keyring is empty.

@ppodevlabs

@ppodevlabs are you able to execute ceph commands from the container? See https://www.mrajanna.com/troubleshooting-cephcsi/; it might help you.

We just did a test creating static PVCs following https://github.com/ceph/ceph-csi/blob/devel/docs/static-pvc.md#cephfs-static-pvc. We created the volume and volume group from the plugin container within the provisioner pod, and we can mount static volumes into pods... so I do not think it is a network issue.

@Madhu-1
Collaborator

Madhu-1 commented Mar 30, 2023

@ppodevlabs I am not sure what the problem is with your setup. One thing: you need to pass the monitor, user, and key mentioned in the StorageClass and ConfigMap when you execute commands from the provisioner pod, because those are not available there by default.

@nixpanic nixpanic added component/cephfs Issues related to CephFS component/deployment Helm chart, kubernetes templates and configuration Issues/PRs labels Mar 31, 2023
@nixpanic
Member

Kubernetes v1.19.9 is rather old and unmaintained. We do not test recent Ceph-CSI versions against that version anymore; possibly something broke and recent kubernetes-csi sidecars are not compatible with the old version?

@nixpanic nixpanic added the question Further information is requested label Mar 31, 2023
@github-actions

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in a week if no further activity occurs. Thank you for your contributions.

@github-actions github-actions bot added the wontfix This will not be worked on label Apr 30, 2023
@github-actions

github-actions bot commented May 7, 2023

This issue has been automatically closed due to inactivity. Please re-open if this still requires investigation.

@github-actions github-actions bot closed this as not planned Won't fix, can't repro, duplicate, stale May 7, 2023