CephFS mount syntax not updated for Quincy #3309
Comments
AFAIK it should not be an issue; we are using cephcsi with Quincy, and we don't have this issue reported by anyone. @ygg-drop couple of questions:
Someone has to make the first report 😉 maybe this use-case is not very popular?
Yes.
Yes, I get the same error:
Yes, same error:
When I try to mount using the Quincy
EDIT: I just tested with
I have exactly the same problem with a basic k8s installation (1.25) and Ceph installation (17.2.3).
@mchangir can you please help here? Not sure why mounting fails; cephcsi uses Ceph 17.2 as the base image, but the mount still looks like it is failing on the 17.2.3 cluster. Note: we have not seen this issue in Rook Ceph clusters.
@Informize can you please provide the dmesg output on the node?
@Informize can you also run the mount command in verbose mode? |
@ygg-drop I am wondering whether compatibility between the userspace package (e.g. ceph-common) binaries and the kernel (5.15.41-0-lts) causes the issue here?
The mount syntax changes have been kept backward compatible. The old syntax should work with newer kernels.
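For reference, a minimal sketch of the pre-Quincy (old) syntax being referred to; the monitor addresses, user name, secret, and filesystem name below are illustrative placeholders, not values from this issue:

```sh
# Old-style device string: <mon1>,<mon2>:/<path>, with the client name, key and
# filesystem passed as mount options (all values here are made up).
mount -t ceph 192.168.1.10:6789,192.168.1.11:6789:/ /mnt/cephfs \
  -o name=admin,secret=<base64-key>,mds_namespace=myfs
```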
@ygg-drop dmesg and running the mount helper with the verbose flag would help debug what's going on.
OS on ceph nodes and k8s control/worker nodes are all:
Version of cephcsi that works:
and mount command:
With ceph-csi 3.7
And mount command in verbose mode:
Do you see no output after this, or does the command hang? And does the mount go through (grep ceph /proc/mounts)? Anything in dmesg? The zeroes in the fsid might be the issue here.
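For anyone following along, the checks being asked for boil down to something like this (standard Linux commands, nothing ceph-csi specific):

```sh
# Did the CephFS mount actually go through?
grep ceph /proc/mounts

# Any ceph/libceph errors in the kernel log around the time of the mount?
dmesg | grep -iE 'ceph|libceph' | tail -n 50
```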
@vshankar
and
pvc/pv's:
pod:
That's the cephfs mount then, isn't it?
@vshankar yes indeed, with the mount command I can manually mount it on the worker node.
OK. Is the same command run by the ceph-csi plugin? Can you enable mount helper debugging when it's being run by ceph-csi?
I wonder what's the
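As a sketch of what running the mount helper verbosely and raising ceph-csi's log level could look like; the addresses are placeholders, and the --v log-level flag is the klog-style verbosity argument typically set on the cephcsi container in the reference manifests, so treat the exact flag and container names as assumptions for your deployment:

```sh
# Manually: -v is passed through to the mount.ceph helper for verbose output.
mount -v -t ceph 192.168.1.10:6789:/ /mnt/cephfs -o name=admin,secret=<key>

# Via ceph-csi: raise the log verbosity of the nodeplugin container, e.g.
# add/raise "--v=5" in the csi-cephfsplugin container args, restart it,
# then watch the container logs while the mount is attempted.
```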
Sorry, can you point out where to look for how to enable debugging in ceph-csi?
/proc/mounts has the record of the cephfs mount, as mentioned in this comment: #3309 (comment)
I presume that this comment lists the contents of |
It will mount manually, but not from a pod. Debug flags are set, as you can see in this comment: #3309 (comment)
@Informize Without debug logs from a failed mount instance, it's hard to tell what's going on.
This is all the debug information that I have. Is there a way to get extra debug information? I see there is also another GitHub issue related to this one: #3390
Do you have dmesg logs when the mount fails from the pod? |
@Informize I don't see any mount failure in your case. Do you think the mount is failing? Can you please provide the cephfsplugin container logs?
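In case it helps, a sketch of how those logs are usually pulled; the namespace and pod name are placeholders, and csi-cephfsplugin is the container name used in the reference ceph-csi manifests (adjust to your deployment):

```sh
# Find the nodeplugin pod scheduled on the affected node, then dump its logs.
kubectl -n ceph-csi get pods -o wide | grep cephfsplugin
kubectl -n ceph-csi logs <csi-cephfsplugin-pod> -c csi-cephfsplugin --tail=200
```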
Here is the mount failing:
Happens on Rook 1.10.2, external Ceph installed via cephadm in version 16.2.10; Kubernetes nodes are on Arch Linux, kernel 5.19.5-arch1-1. No really relevant dmesg errors:
Same error when trying to mount manually (in the container; Arch Linux removed the Ceph library from its repository a few days ago), and the volume doesn't get mounted:
I confirm that running Rook 1.9.12 (Rook 1.10.X supports cephcsi >=3.6.0, which does NOT solve the issue) and forcing cephcsi:3.5.1 solves the issue.
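For other Rook users wanting to try the same workaround, a hedged sketch of pinning the cephcsi image through the operator settings; the ConfigMap name, namespace, and key follow the usual Rook defaults, so verify them against your installation:

```sh
# Point the Rook operator at cephcsi v3.5.1; it will redeploy the CSI pods.
kubectl -n rook-ceph patch configmap rook-ceph-operator-config --type merge \
  -p '{"data":{"ROOK_CSI_CEPH_IMAGE":"quay.io/cephcsi/cephcsi:v3.5.1"}}'
```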
Can someone provide me the setup details/reproducer? I would like to reproduce it locally and see what is wrong.
This issue has been automatically marked as stale because it has not had recent activity. It will be closed in a week if no further activity occurs. Thank you for your contributions. |
This issue has been automatically closed due to inactivity. Please re-open if this still requires investigation. |
This seems to be an active issue, where the only workaround is downgrading cephcsi. There is also an open PR for it. Should it be reopened? As a data point, it affects me too:
Changing the CSI ConfigMap fixes it: from:
to:
There is a PR (ceph/ceph#48873) in Ceph to fix a mount issue. Are you referring to that one or to some other fix?
Describe the bug
Apparently there was a significant change in the mount.ceph syntax between Ceph Pacific and Quincy. However, the Ceph-CSI code does not seem to have been updated to support the new syntax.

I use Nomad 1.3.1 and I am trying to use Ceph-CSI to provide CephFS-based volumes to Nomad jobs. I tried version 3.6.2 of Ceph-CSI (which is already based on Quincy) to mount a CephFS volume from a cluster running Ceph 17.2.0.
I use Nomad instead of Kubernetes, but I don't think this fact affects this bug.
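For context, a sketch of the syntax change being described; the monitors, fsid, user, and secret below are illustrative placeholders:

```sh
# Pre-Quincy style: monitor list in the device string, fs selected via options.
mount -t ceph 192.168.1.10:6789,192.168.1.11:6789:/ /mnt/cephfs \
  -o name=admin,secret=<key>,mds_namespace=nomadfs

# Quincy-style device string: <user>@<fsid>.<fsname>=<path>, monitors as an option.
mount -t ceph admin@xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx.nomadfs=/ /mnt/cephfs \
  -o mon_addr=192.168.1.10:6789/192.168.1.11:6789,secret=<key>
```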
Environment details
Mounter used for mounting PVC (for cephfs it's fuse or kernel; for rbd it's krbd or rbd-nbd): kernel

Steps to reproduce
Steps to reproduce the behavior:
1. Create a CephFS filesystem nomadfs and an admin user.
2. nomad job run ceph-csi-plugin-controller.nomad
3. nomad job run ceph-csi-plugin-nodes.nomad
4. Register the volume defined in sample-fs-volume.hcl by running: nomad volume register sample-fs-volume.hcl
5. Run mysql-fs.nomad, which tries to use the volume created in the previous step, using: nomad job run mysql-fs.nomad
6. Check the ceph-mysql-fs job allocation logs.

ceph-csi-plugin-controller.nomad:
ceph-csi-plugin-nodes.nomad:
sample-fs-volume.hcl:
mysql-fs.nomad:
Actual results
Ceph-CSI node plugin failed to mount CephFS.
Expected behavior
The Ceph-CSI node plugin should successfully mount CephFS using the new mount.ceph syntax.

Logs
nomad alloc status events:

I suspect the unable to get monitor info from DNS SRV error happens because the mount.ceph helper in 17.x no longer recognizes monitor IPs passed this way and falls back to using DNS SRV records.

Additional context