rpc error: code = InvalidArgument desc = failed to get connection: connecting failed: rados: ret=-1, Operation not permitted #4563

Open
wangchao732 opened this issue Apr 17, 2024 · 45 comments
Labels: component/deployment (Helm chart, kubernetes templates and configuration Issues/PRs), dependency/ceph (depends on core Ceph functionality), question (Further information is requested)

Comments

@wangchao732

cephcsi: v3.11.0
csi-provisioner: v4.0.0
kubernetes : v1.22.12

[csi-cephfs-secret]
adminID : client.admin(base64)
adminKey: xxx(base64)

ceph fs ls
name: cephfs, metadata pool: store_metadata, data pools: [store-file ]

kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: csi-cephfs-sc
  labels:
    app: ceph-csi-cephfs
    app.kubernetes.io/managed-by: Helm
    chart: ceph-csi-cephfs-3-canary
    heritage: Helm
    release: ceph-csi-cephfs
  annotations:
    kubesphere.io/creator: admin
    meta.helm.sh/release-name: ceph-csi-cephfs
    meta.helm.sh/release-namespace: ceph-csi-cephfs
    storageclass.kubesphere.io/allow-clone: 'true'
    storageclass.kubesphere.io/allow-snapshot: 'true'
provisioner: cephfs.csi.ceph.com
parameters:
  clusterID: xxx
  csi.storage.k8s.io/controller-expand-secret-name: csi-cephfs-secret
  csi.storage.k8s.io/controller-expand-secret-namespace: ceph-csi-cephfs
  csi.storage.k8s.io/node-stage-secret-name: csi-cephfs-secret
  csi.storage.k8s.io/node-stage-secret-namespace: ceph-csi-cephfs
  csi.storage.k8s.io/provisioner-secret-name: csi-cephfs-secret
  csi.storage.k8s.io/provisioner-secret-namespace: ceph-csi-cephfs
  fsName: cephfs
  pool: store-file
reclaimPolicy: Delete
allowVolumeExpansion: true
volumeBindingMode: Immediate

ConfigMap

apiVersion: v1
data:
  config.json: '[{"clusterID": "80a8efd7-8ed5-4e53-bc5b-xxxx","monitors":
    ["192.168.13.180:6789","192.168.13.181:6789","192.168.13.182:6789"]}]'
kind: ConfigMap
metadata:
  annotations:
    meta.helm.sh/release-name: ceph-csi-cephfs
    meta.helm.sh/release-namespace: ceph-csi-cephfs
  creationTimestamp: "2024-04-17T02:58:01Z"
  labels:
    app: ceph-csi-cephfs
    app.kubernetes.io/managed-by: Helm
    chart: ceph-csi-cephfs-3.11.0
    component: provisioner
    heritage: Helm
    release: ceph-csi-cephfs
  name: ceph-csi-config
  namespace: ceph-csi-cephfs
  resourceVersion: "100022091"
  selfLink: /api/v1/namespaces/ceph-csi-cephfs/configmaps/ceph-csi-config
  uid: 40ae7717-c85a-44eb-b0a1-3652b3d4dfe0

Error when creating the PVC:
Name:"bytebase-pvc", UID:"91dc9df3-e611-44e5-8191-b18841edabf1", APIVersion:"v1", ResourceVersion:"100120810", FieldPath:""}): type: 'Warning' reason: 'ProvisioningFailed' failed to provision volume with StorageClass "csi-cephfs-sc": rpc error: code = InvalidArgument desc = failed to get connection: connecting failed: rados: ret=-1, Operation not permitted

@Madhu-1
Collaborator

Madhu-1 commented Apr 17, 2024

@wangchao732 does the Ceph user you created have the required access as described in https://github.com/ceph/ceph-csi/blob/devel/docs/capabilities.md#cephfs?

@wangchao732
Author

@wangchao732 does the Ceph user you created have the required access as described in https://github.com/ceph/ceph-csi/blob/devel/docs/capabilities.md#cephfs?

Thanks, but I get this error message:

Error EINVAL: mds capability parse failed, stopped at 'fsname=cephfs path=/volumes, allow rws fsname=cephfs path=/volumes/csi' of 'allow r fsname=cephfs path=/volumes, allow rws fsname=cephfs path=/volumes/csi'

@Madhu-1
Collaborator

Madhu-1 commented Apr 17, 2024

Have you created the csi subvolumegroup in the filesystem? If not, please create it and then try to create the user.
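A minimal sketch of that step, assuming the filesystem is named cephfs (as in the ceph fs ls output above) and the group is named csi. The run helper and the DRY_RUN variable are illustrative additions, not part of Ceph; by default the commands are only printed, and DRY_RUN=0 would actually execute them.

```shell
#!/bin/sh
# Sketch only: FS_NAME and GROUP are assumptions taken from this thread.
FS_NAME="${FS_NAME:-cephfs}"
GROUP="${GROUP:-csi}"
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: print the command, and run it only when DRY_RUN=0.
run() {
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    fi
}

create_csi_group() {
    # Create the subvolumegroup, then list the groups to confirm it exists.
    run ceph fs subvolumegroup create "$FS_NAME" "$GROUP"
    run ceph fs subvolumegroup ls "$FS_NAME"
}

create_csi_group
```

Note that ceph fs subvolume ls lists subvolumes, not groups, so the group listing is the more direct check here.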

@wangchao732
Author

Have you created the CSI subvolumegroup in the filesystem? If not, please create it and then try to create the user.

After I ran ceph auth, it seems to have been created by default, because I can see the directory:
mount -t ceph 192.168.13.180:6789:/ /tmpdata -o name=admin,secret=xxx
pwd /tmpdata/volumes/csi

@wangchao732
Author

ceph fs subvolume ls cephfs
[
    {
        "name": "volumes"
    }
]

@wangchao732
Author

ceph fs subvolume info cephfs volume1 csi
{
    "atime": "2024-04-17 20:25:14",
    "bytes_pcent": "0.00",
    "bytes_quota": 50000000000,
    "bytes_used": 0,
    "created_at": "2024-04-17 20:25:14",
    "ctime": "2024-04-17 20:25:14",
    "data_pool": "store-file",
    "features": [
        "snapshot-clone",
        "snapshot-autoprotect",
        "snapshot-retention"
    ],
    "gid": 0,
    "mode": 16877,
    "mon_addrs": [
        "192.168.13.180:6789",
        "192.168.13.181:6789",
        "192.168.13.182:6789"
    ],
    "mtime": "2024-04-17 20:25:14",
    "path": "/volumes/csi/volume1/5362bce4-2dc0-44b5-99d2-07aaf023b052",
    "pool_namespace": "",
    "state": "complete",
    "type": "subvolume",
    "uid": 0
}
[root@Bj13-Ceph01-Dev ceph]# ceph auth get-or-create client.$USER mgr "allow rw" osd "allow rw tag cephfs metadata=$FS_NAME, allow rw tag cephfs data=$FS_NAME" mds "allow r fsname=$FS_NAME path=/volumes, allow rws fsname=$FS_NAME path=/volumes/$SUB_VOL" mon "allow r fsname=$FS_NAME"
Error EINVAL: mds capability parse failed, stopped at 'fsname=cephfs path=/volumes, allow rws fsname=cephfs path=/volumes/csi' of 'allow r fsname=cephfs path=/volumes, allow rws fsname=cephfs path=/volumes/csi'

@wangchao732
Author

The RBD driver (ceph-csi-rbd) has the same issue:

failed to provision volume with StorageClass "csi-rbd-sc": rpc error: code = Internal desc = failed to get connection: connecting failed: rados: ret=-1, Operation not permitted

@wangchao732
Author

I0418 10:12:53.258063 1 utils.go:198] ID: 30 GRPC call: /csi.v1.Identity/Probe

I0418 10:12:53.258141 1 utils.go:199] ID: 30 GRPC request: {}

I0418 10:12:53.258175 1 utils.go:205] ID: 30 GRPC response: {}

I0418 10:13:46.083975 1 utils.go:198] ID: 31 Req-ID: pvc-051787a2-cdad-4e98-9562-01b43ce55a8e GRPC call: /csi.v1.Controller/CreateVolume

I0418 10:13:46.085127 1 utils.go:199] ID: 31 Req-ID: pvc-051787a2-cdad-4e98-9562-01b43ce55a8e GRPC request: {"capacity_range":{"required_bytes":10737418240},"name":"pvc-051787a2-cdad-4e98-9562-01b43ce55a8e","parameters":{"clusterID":"80a8efd7-8ed5-4e53-bc5b-f91c56300e99","csi.storage.k8s.io/pv/name":"pvc-051787a2-cdad-4e98-9562-01b43ce55a8e","csi.storage.k8s.io/pvc/name":"bytebase-pvc","csi.storage.k8s.io/pvc/namespace":"bytebase","imageFeatures":"layering","pool":"k8s-store"},"secrets":"stripped","volume_capabilities":[{"AccessType":{"Mount":{"fs_type":"ext4","mount_flags":["discard"]}},"access_mode":{"mode":1}}]}

I0418 10:13:46.085678 1 rbd_util.go:1315] ID: 31 Req-ID: pvc-051787a2-cdad-4e98-9562-01b43ce55a8e setting disableInUseChecks: false image features: [layering] mounter: rbd

E0418 10:13:46.106773 1 controllerserver.go:232] ID: 31 Req-ID: pvc-051787a2-cdad-4e98-9562-01b43ce55a8e failed to connect to volume : failed to get connection: connecting failed: rados: ret=-1, Operation not permitted

E0418 10:13:46.106850 1 utils.go:203] ID: 31 Req-ID: pvc-051787a2-cdad-4e98-9562-01b43ce55a8e GRPC error: rpc error: code = Internal desc = failed to get connection: connecting failed: rados: ret=-1, Operation not permitted

@Madhu-1
Collaborator

Madhu-1 commented Apr 18, 2024

@wangchao732 can you perform rados operations with the above Ceph users, for example rados ls?

@wangchao732
Author

have you created the csi subvolumegroup in the filesystem? if not please create it and then try to create the user

I don't know what's going on, no matter how I try I get the same error.
cephcsi: v3.11.0
csi-provisioner: v4.0.0
kubernetes : v1.22.12
ceph: 14.2.22

2024-04-18 18:27:52.867588
[INF]
from='client.? 192.168.13.180:0/1251224851' entity='client.admin' cmd='[{"prefix": "auth get-or-create", "entity": "client.csi-rbd", "caps": ["mon", "profile rbd", "osd", "profile rbd pool=k8s-store", "mgr", "profile rbd"]}]': finished

2024-04-18 18:27:52.861771
[INF]
from='client.? 192.168.13.180:0/1251224851' entity='client.admin' cmd=[{"prefix": "auth get-or-create", "entity": "client.csi-rbd", "caps": ["mon", "profile rbd", "osd", "profile rbd pool=k8s-store", "mgr", "profile rbd"]}]: dispatch

2024-04-18 18:27:01.821352
[INF]
from='client.? 192.168.13.180:0/3439531549' entity='client.admin' cmd='[{"prefix": "auth rm", "entity": "client.csi-rbd"}]': finished

ceph osd lspools
1 k8s-store
2 .rgw.root
3 default.rgw.control
4 default.rgw.meta
5 default.rgw.log
6 store-file
7 store_metadata
8 default.rgw.buckets.index

@wangchao732
Author

@wangchao732 can you do rados operations with the above Ceph users? rados ls etc.?

rados lspools
k8s-store
.rgw.root
default.rgw.control
default.rgw.meta
default.rgw.log
store-file
store_metadata
default.rgw.buckets.index

@wangchao732
Author

rados ls -p k8s-store
rbd_directory
Bj13-Ceph01-Dev.BCLD.COM
rbd_info
rbd_header.121a144662a3e
rbd_id.data-logs
rbd_object_map.121a144662a3e

@wangchao732
Author

Executed in a k8s cluster:

rados ls -p k8s-store
2024-04-18 19:22:42.161 7f306b5b29c0 -1 auth: unable to find a keyring on /etc/ceph/ceph.client.admin.keyring,/etc/ceph/ceph.keyring,/etc/ceph/keyring,/etc/ceph/keyring.bin,: (2) No such file or directory
2024-04-18 19:22:42.161 7f306b5b29c0 -1 AuthRegistry(0x55f3b1d9e288) no keyring found at /etc/ceph/ceph.client.admin.keyring,/etc/ceph/ceph.keyring,/etc/ceph/keyring,/etc/ceph/keyring.bin,, disabling cephx
2024-04-18 19:22:42.163 7f306b5b29c0 -1 auth: unable to find a keyring on /etc/ceph/ceph.client.admin.keyring,/etc/ceph/ceph.keyring,/etc/ceph/keyring,/etc/ceph/keyring.bin,: (2) No such file or directory
2024-04-18 19:22:42.164 7f306b5b29c0 -1 AuthRegistry(0x7ffe5cc837b8) no keyring found at /etc/ceph/ceph.client.admin.keyring,/etc/ceph/ceph.keyring,/etc/ceph/keyring,/etc/ceph/keyring.bin,, disabling cephx
failed to fetch mon config (--no-mon-config to skip)

@Madhu-1
Collaborator

Madhu-1 commented Apr 18, 2024

Please pass --key, --user, and -m when running the command from the Kubernetes cluster.

@wangchao732
Author

rados ls -p k8s-store --keyring /etc/ceph/ceph.client.csi-rbd.keyring --name client.csi-rbd
rbd_directory
Bj13-Ceph01-Dev.BCLD.COM
rbd_info
rbd_header.121a144662a3e
rbd_id.data-logs
rbd_object_map.121a144662a3e

@Madhu-1
Collaborator

Madhu-1 commented Apr 18, 2024

Can you also check whether you are able to do a write operation in the pool you are planning to use? See https://docs.ceph.com/en/latest/man/8/rados/#examples
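A write check along those lines could look like the sketch below. The pool, user, keyring path, and monitor address are the ones that appear in this thread; the object name csi-write-test, the run helper, and the DRY_RUN variable are made up for illustration. By default the commands are only printed, and DRY_RUN=0 would execute them against the cluster.

```shell
#!/bin/sh
# Sketch only: all defaults below are assumptions based on values in this thread.
POOL="${POOL:-k8s-store}"
CEPH_ID="${CEPH_ID:-csi-rbd}"
KEYRING="${KEYRING:-/etc/ceph/ceph.client.csi-rbd.keyring}"
MON="${MON:-192.168.13.180:6789}"
OBJ="csi-write-test"   # hypothetical object name
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: print the command, and run it only when DRY_RUN=0.
run() {
    echo "+ $*"
    [ "$DRY_RUN" != "0" ] || "$@"
}

# Run a rados subcommand with explicit monitor, user, and keyring.
rados_as_user() {
    run rados -m "$MON" --id "$CEPH_ID" --keyring "$KEYRING" -p "$POOL" "$@"
}

write_check() {
    tmp="$(mktemp)"
    echo "hello from csi write test" > "$tmp"
    rados_as_user put "$OBJ" "$tmp"       # write an object
    rados_as_user get "$OBJ" "$tmp.out"   # read it back
    rados_as_user rm "$OBJ"               # remove the test object again
    rm -f "$tmp" "$tmp.out"
}

write_check
```

Passing the credentials explicitly avoids depending on whatever keyring happens to be present under /etc/ceph on the machine where the check runs.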

@wangchao732
Author

The test was successful.
rados -p k8s-store put testfile test.json --keyring /etc/ceph/ceph.client.csi-rbd.keyring --name client.csi-rbd
[root@bj11-bcld-k8s01 k8s]# rados ls -p k8s-store --keyring /etc/ceph/ceph.client.csi-rbd.keyring --name client.csi-rbd
rbd_directory
Bj13-Ceph01-Dev.BCLD.COM
rbd_info
testfile
rbd_header.121a144662a3e
rbd_id.data-logs
rbd_object_map.121a144662a3e

kind: Secret
apiVersion: v1
metadata:
  name: csi-rbd-secret
  namespace: ceph-csi-rbd
  annotations:
    kubesphere.io/creator: admin
data:
  userID: Y3NpLXJiZAo=
  userKey: QVFDbzlTQm1SR1pnTXhBQTFiYXhsNjVyYTZENjRSUVErbUVuZmc9PQo=
type: Opaque

@wangchao732
Author

wangchao732 commented Apr 18, 2024

[root@bj11-bcld-k8s01 k8s]# cat /etc/ceph/ceph.client.csi-rbd.keyring
[client.csi-rbd]
key = AQCo9SBmRGZgMxAA1baxl65ra6D64RQQ+mEnfg==
caps mgr = "profile rbd"
caps mon = "profile rbd"
caps osd = "profile rbd pool=k8s-store"
[root@bj11-bcld-k8s01 k8s]# echo "AQCo9SBmRGZgMxAA1baxl65ra6D64RQQ+mEnfg=="|base64
QVFDbzlTQm1SR1pnTXhBQTFiYXhsNjVyYTZENjRSUVErbUVuZmc9PQo=
I also tried using the base64 encoding of client.csi-rbd as the userID; same error.

@wangchao732
Author

csi-rbdplugin log:

I0418 12:04:35.713123 1 utils.go:198] ID: 47 Req-ID: pvc-2c763594-8d4f-407c-bbfd-752dcaecb1c9 GRPC call: /csi.v1.Controller/CreateVolume

I0418 12:04:35.713486 1 utils.go:199] ID: 47 Req-ID: pvc-2c763594-8d4f-407c-bbfd-752dcaecb1c9 GRPC request: {"capacity_range":{"required_bytes":10737418240},"name":"pvc-2c763594-8d4f-407c-bbfd-752dcaecb1c9","parameters":{"clusterID":"80a8efd7-8ed5-4e53-bc5b-f91c56300e99","csi.storage.k8s.io/pv/name":"pvc-2c763594-8d4f-407c-bbfd-752dcaecb1c9","csi.storage.k8s.io/pvc/name":"bytebase-pvc","csi.storage.k8s.io/pvc/namespace":"bytebase","imageFeatures":"layering","pool":"k8s-store"},"secrets":"stripped","volume_capabilities":[{"AccessType":{"Mount":{"fs_type":"ext4","mount_flags":["discard"]}},"access_mode":{"mode":1}}]}

I0418 12:04:35.713771 1 rbd_util.go:1315] ID: 47 Req-ID: pvc-2c763594-8d4f-407c-bbfd-752dcaecb1c9 setting disableInUseChecks: false image features: [layering] mounter: rbd

E0418 12:04:35.750268 1 controllerserver.go:232] ID: 47 Req-ID: pvc-2c763594-8d4f-407c-bbfd-752dcaecb1c9 failed to connect to volume : failed to get connection: connecting failed: rados: ret=-1, Operation not permitted

E0418 12:04:35.750355 1 utils.go:203] ID: 47 Req-ID: pvc-2c763594-8d4f-407c-bbfd-752dcaecb1c9 GRPC error: rpc error: code = Internal desc = failed to get connection: connecting failed: rados: ret=-1, Operation not permitted

@nixpanic added the question, component/deployment, and dependency/ceph labels on Apr 19, 2024
@nixpanic
Member

Is there a way you can check the Ceph MON/OSD logs for rejected connection requests? Maybe the problem is not with the credentials, but with the network configuration of the pods/nodes?

You can also try the manual commands from within the csi-rbdplugin container of a csi-rbdplugin-provisioner pod.

@wangchao732
Author

Is there a way to check the Ceph MON/OSD logs for denied connection requests? Maybe the problem isn't with the credentials, but with the network configuration of the pod/node?

Oh, I found an error in ceph-mon.log, even though the client.csi-rbd entry exists.

2024-04-19 17:25:30.090 7f6b31027700 0 cephx server client.csi-rbd : couldn't find entity name: client.csi-rbd

client.csi-rbd
key: AQCo9SBmRGZgMxAA1baxl65ra6D64RQQ+mEnfg==
caps: [mgr] profile rbd
caps: [mon] profile rbd
caps: [osd] profile rbd pool=k8s-store

ceph auth list | grep client.csi-rbd
installed auth entries:

client.csi-rbd
client.csi-rbd-node
client.csi-rbd-provisioner

@Madhu-1
Collaborator

Madhu-1 commented Apr 19, 2024

adminID : client.admin(base64)
adminKey: xxx(base64)

Sorry, I missed it: the adminID should contain only the base64 encoding of admin, without the client. prefix.
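For illustration, the ID value for the Secret could be produced as in the sketch below. This is not the project's documented procedure: b64 and ceph_id are made-up helper names, and printf is used instead of echo so that no trailing newline ends up inside the encoded value.

```shell
#!/bin/sh
# Hypothetical helper: base64-encode a value exactly as given (no trailing newline).
b64() {
    printf '%s' "$1" | base64
}

# Hypothetical helper: strip an optional "client." prefix from a Ceph entity name.
ceph_id() {
    printf '%s' "${1#client.}"
}

b64 "$(ceph_id client.admin)"   # prints YWRtaW4=
b64 "$(ceph_id csi-rbd)"        # prints Y3NpLXJiZA==
```

The same helper can be applied to the key returned by ceph auth get-key for the chosen user.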

@wangchao732
Author

Inside the container:

sh-4.4# ceph
[errno 1] RADOS permission error (error connecting to the cluster)
sh-4.4# rados ls - -p k8s-store
2024-04-19T09:45:32.968+0000 7f7f2dba2700 -1 monclient(hunting): handle_auth_bad_method server allowed_methods [2] but i only support [2]
failed to fetch mon config (--no-mon-config to skip)
sh-4.4# ls
bin csi dev etc home lib lib64 lost+found media mnt opt proc root run sbin srv sys tmp usr var
sh-4.4# cd /etc/ceph
sh-4.4# ls
ceph.conf keyring
sh-4.4# cat keyring
sh-4.4# cat ceph.conf
[global]
fsid = 80a8efd7-8ed5-4e53-bc5b-f91c56300e99
mon initial members = 192.168.13.180,192.168.13.181,192.168.13.182
mon host = 192.168.13.180,192.168.13.181,192.168.13.182
mon addr = 192.168.13.180:6789,192.168.13.181:6789,192.168.13.182:6789
auth_cluster_required = cephx
auth_service_required = cephx
auth_client_required = cephx

@wangchao732
Author

Yes, the base64 value has been changed so that it no longer includes the client. prefix, but the issue still occurs.

@Madhu-1
Collaborator

Madhu-1 commented Apr 19, 2024

You need to provide --id and keyring.

@wangchao732
Author

Where is it configured?

@Madhu-1
Collaborator

Madhu-1 commented Apr 19, 2024

rados ls - -p k8s-store

rados ls --p=k8s-store -m=192.168.13.180:6789 --user=csi-rbd --key=AQCo9SBmRGZgMxAA1baxl65ra6D64RQQ+mEnfg==

@wangchao732
Author

wangchao732 commented Apr 19, 2024

sh-4.4# rados ls -p=k8s-store --user=csi-rbd --key=AQCo9SBmRGZgMxAA1baxl65ra6D64RQQ+mEnfg== --no-mon-config
2024-04-19T10:31:18.839+0000 7f5767fff700 -1 monclient(hunting): handle_auth_bad_method server allowed_methods [2] but i only support [2]
2024-04-19T10:31:18.839+0000 7f576fb47700 -1 monclient(hunting): handle_auth_bad_method server allowed_methods [2] but i only support [2]
couldn't connect to cluster: (1) Operation not permitted
sh-4.4# rados ls -c /etc/ceph/ceph.conf -p=k8s-store --user=csi-rbd --key=AQCo9SBmRGZgMxAA1baxl65ra6D64RQQ+mEnfg== --no-mon-config
2024-04-19T10:31:44.122+0000 7f9fc9ca5700 -1 monclient(hunting): handle_auth_bad_method server allowed_methods [2] but i only support [2]
couldn't connect to cluster: (1) Operation not permitted
sh-4.4#

cephx server client.csi-rbd: unexpected key: req.key=a128c0dd1c3cd379 expected_key=c3c8ea50fbc36e10
2024-04-19 18:35:09.605 7f6b31027700 0 cephx server client.csi-rbd
: couldn't find entity name: client.csi-rbd

@Madhu-1
Collaborator

Madhu-1 commented Apr 19, 2024

couldn't connect to cluster: (1) Operation not permitted

This is the actual problem. Please check with the Ceph community about it; I am not able to help with debugging this issue, as nothing seems to be wrong with CSI.

@wangchao732
Author

cephx server client.csi-rbd: unexpected key: req.key=a128c0dd1c3cd379 expected_key=c3c8ea50fbc36e10
The log says the key does not match, but when I compare the keys myself they are identical.

@Madhu-1
Collaborator

Madhu-1 commented Apr 19, 2024

@wangchao732 I think you don't need to pass the encoded key. Can you pass the key without base64 encoding? It should be the key you get from the ceph auth ls output.

@wangchao732
Author

@wangchao732 i think you dont need to pass the encrypted key, can you pass key without base64 encoding? it should be the key you will get it from the ceph auth ls output

I am currently using KubeSphere, and the configuration file requires base64:
[root@bj11-bcld-k8s01 opt]# kubectl apply -f ceph-sc-rbd.yml
Warning: resource secrets/csi-rbd-secret is missing the kubectl.kubernetes.io/last-applied-configuration annotation which is required by kubectl apply. kubectl apply should only be used on resources created declaratively by either kubectl create --save-config or kubectl apply. The missing annotation will be patched automatically.
The request is invalid: patch: Invalid value: "map[data:map[userID:csi-rbd userKey:AQAoQiJmp0avAhAAOLii0I8sa3zIAmR0fNdQjg==] metadata:map[annotations:map[kubectl.kubernetes.io/last-applied-configuration:{"apiVersion":"v1","data":{"userID":"csi-rbd","userKey":"AQAoQiJmp0avAhAAOLii0I8sa3zIAmR0fNdQjg=="},"kind":"Secret","metadata":{"annotations":{"kubesphere.io/creator":"admin"},"name":"csi-rbd-secret","namespace":"ceph-csi-rbd"},"type":"Opaque"}\n]]]": error decoding from json: illegal base64 data at input byte 3
[root@bj11-bcld-k8s01 opt]# cat ceph-sc-rbd.yml
kind: Secret
apiVersion: v1
metadata:
  name: csi-rbd-secret
  namespace: ceph-csi-rbd
  annotations:
    kubesphere.io/creator: admin
data:
  userID: csi-rbd
  userKey: AQAoQiJmp0avAhAAOLii0I8sa3zIAmR0fNdQjg==
type: Opaque

@wangchao732
Author

rados ls -p=k8s-store --user=csi-rbd --key=AQCo9SBmRGZgMxAA1baxl65ra6D64RQQ+mEnfg== --no-mon-config

That doesn't seem to be the problem; the command reports the same error when executed on the Ceph cluster itself:

[root@Bj13-Ceph01-Dev ceph]# rados ls -p=k8s-store --user=csi-rbd --key=AQCo9SBmRGZgMxAA1baxl65ra6D64RQQ+mEnfg== --no-mon-config
2024-04-19 18:54:07.940 7fa8130169c0 -1 auth: unable to find a keyring on /etc/ceph/ceph.client.csi-rbd.keyring,/etc/ceph/ceph.keyring,/etc/ceph/keyring,/etc/ceph/keyring.bin,: (2) No such file or directory
2024-04-19 18:54:07.941 7fa8130169c0 -1 auth: unable to find a keyring on /etc/ceph/ceph.client.csi-rbd.keyring,/etc/ceph/ceph.keyring,/etc/ceph/keyring,/etc/ceph/keyring.bin,: (2) No such file or directory
2024-04-19 18:54:07.941 7fa8130169c0 -1 auth: unable to find a keyring on /etc/ceph/ceph.client.csi-rbd.keyring,/etc/ceph/ceph.keyring,/etc/ceph/keyring,/etc/ceph/keyring.bin,: (2) No such file or directory
2024-04-19 18:54:07.942 7fa8028ad700 -1 monclient(hunting): handle_auth_bad_method server allowed_methods [2] but i only support [2]
couldn't connect to cluster: (1) Operation not permitted

@wangchao732
Author

The keyring is now configured together with ceph.conf, and the connection is successful:

sh-4.4# rados ls - -p k8s-store --keyring /etc/ceph/keyring --name client.csi-rbd
rbd_directory
Bj13-Ceph01-Dev.BCLD.COM
rbd_info
testfile
rbd_header.121a144662a3e
rbd_id.data-logs
rbd_object_map.121a144662a3e
sh-4.4# cat /etc/ceph/keyring
[client.csi-rbd]
key = AQAoQiJmp0avAhAAOLii0I8sa3zIAmR0fNdQjg==
caps mgr = "profile rbd"
caps mon = "profile rbd"
caps osd = "profile rbd pool=k8s-store"

@wangchao732
Author

@Madhu-1 Can you help me take a look?

@Madhu-1
Collaborator

Madhu-1 commented Apr 24, 2024

The above looks good. Can you try to create the PVC now, and also paste the kubectl YAML output of the cephfs and rbd secrets?

@wangchao732
Author

the above looks good, can you try to create PVC now, and also paste the kubectl yaml output of cephfs and rbd secrets?

The problem remains, failed to get connection: connecting failed: rados: ret=-1, Operation not permitted.
#kubectl get secret csi-rbd-secret -n ceph-csi-rbd -o yaml
apiVersion: v1
data:
  userID: Y3NpLXJiZAo=
  userKey: QVFBL3RTaG03WGtySHhBQWV4M2xVdkFGMEtkeE1aQkMxZEd1SWc9PQo=
kind: Secret
metadata:
  annotations:
    kubesphere.io/creator: admin
  creationTimestamp: "2024-04-18T06:41:33Z"
  name: csi-rbd-secret
  namespace: ceph-csi-rbd
  resourceVersion: "104329994"
  selfLink: /api/v1/namespaces/ceph-csi-rbd/secrets/csi-rbd-secret
  uid: 1a66d008-b05c-4c33-8d8e-da69b79b8115
type: Opaque

#kubectl get configmap ceph-config -n ceph-csi-rbd -o yaml
apiVersion: v1
data:
  ceph.conf: |
    [global]
    fsid = 80a8efd7-8ed5-4e53-bc5b-f91c56300e99
    mon_initial_members = 192.168.13.180,192.168.13.181,192.168.13.182
    mon_host = 192.168.13.180,192.168.13.181,192.168.13.182
    mon_addr = 192.168.13.180:6789,192.168.13.181:6789,192.168.13.182:6789
    auth_cluster_required = cephx
    auth_service_required = cephx
    auth_client_required = cephx
  keyring: "[client.csi-rbd]\n\tkey = AQAoQiJmp0avAhAAOLii0I8sa3zIAmR0fNdQjg==\n\tcaps
    mgr = \"profile rbd\"\n\tcaps mon = \"profile rbd\"\n\tcaps osd = \"profile rbd
    pool=k8s-store\"\n"
kind: ConfigMap
metadata:
  annotations:
    meta.helm.sh/release-name: ceph-csi-rbd
    meta.helm.sh/release-namespace: ceph-csi-rbd
  creationTimestamp: "2024-04-18T06:17:24Z"
  labels:
    app: ceph-csi-rbd
    app.kubernetes.io/managed-by: Helm
    chart: ceph-csi-rbd-3.11.0
    component: nodeplugin
    heritage: Helm
    release: ceph-csi-rbd
  name: ceph-config
  namespace: ceph-csi-rbd
  resourceVersion: "101316266"
  selfLink: /api/v1/namespaces/ceph-csi-rbd/configmaps/ceph-config
  uid: ab2689c9-0b42-4f11-8c23-93c11a10e61c

# kubectl get configmap ceph-csi-config -n ceph-csi-rbd -o yaml
apiVersion: v1
data:
  cluster-mapping.json: '[]'
  config.json: '[{"clusterID": "80a8efd7-8ed5-4e53-bc5b-f91c56300e99","monitors":
    ["192.168.13.180:6789","192.168.13.181:6789","192.168.13.182:6789"]}]'
kind: ConfigMap
metadata:
  annotations:
    meta.helm.sh/release-name: ceph-csi-rbd
    meta.helm.sh/release-namespace: ceph-csi-rbd
  creationTimestamp: "2024-04-18T06:17:24Z"
  labels:
    app: ceph-csi-rbd
    app.kubernetes.io/managed-by: Helm
    chart: ceph-csi-rbd-3.11.0
    component: nodeplugin
    heritage: Helm
    release: ceph-csi-rbd
  name: ceph-csi-config
  namespace: ceph-csi-rbd
  resourceVersion: "100574540"
  selfLink: /api/v1/namespaces/ceph-csi-rbd/configmaps/ceph-csi-config
  uid: 6cb85b3a-a885-4160-8eb5-acbd2c4baf0d
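
A quick consistency check between the two ConfigMaps can rule out a mismatch in cluster ID or monitor addresses. The sketch below is only an illustration: `check_cluster_entry` is a hypothetical helper, the values are copied from the output above, and it assumes the `clusterID` in `config.json` is meant to equal the Ceph `fsid` (as it does here), which is a common convention rather than a requirement.

```python
import json

# Values copied from the ceph-config and ceph-csi-config ConfigMaps above.
CEPH_CONF_FSID = "80a8efd7-8ed5-4e53-bc5b-f91c56300e99"
CEPH_CONF_MON_HOST = "192.168.13.180,192.168.13.181,192.168.13.182"
CONFIG_JSON = (
    '[{"clusterID": "80a8efd7-8ed5-4e53-bc5b-f91c56300e99","monitors": '
    '["192.168.13.180:6789","192.168.13.181:6789","192.168.13.182:6789"]}]'
)


def check_cluster_entry(config_json, fsid, mon_host):
    """Hypothetical helper: list mismatches between config.json and ceph.conf."""
    problems = []
    entries = json.loads(config_json)
    # Assumption: the clusterID is expected to be the Ceph fsid.
    matching = [e for e in entries if e.get("clusterID") == fsid]
    if not matching:
        problems.append(f"no config.json entry with clusterID {fsid}")
        return problems
    expected_hosts = {h.strip() for h in mon_host.split(",")}
    # Drop the ":port" suffix before comparing monitor addresses.
    actual_hosts = {m.rsplit(":", 1)[0] for m in matching[0].get("monitors", [])}
    if expected_hosts != actual_hosts:
        problems.append(f"monitor mismatch: {sorted(expected_hosts ^ actual_hosts)}")
    return problems


print(check_cluster_entry(CONFIG_JSON, CEPH_CONF_FSID, CEPH_CONF_MON_HOST))  # []
```

An empty list means the cluster ID and monitor list agree, so for the values posted here the two ConfigMaps are consistent with each other.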

@Madhu-1
Collaborator

Madhu-1 commented Apr 24, 2024

sh-4.4# cat /etc/ceph/keyring
[client.csi-rbd]
key = AQAoQiJmp0avAhAAOLii0I8sa3zIAmR0fNdQjg==
caps mgr = "profile rbd"
caps mon = "profile rbd"
caps osd = "profile rbd pool=k8s-store"

AQAoQiJmp0avAhAAOLii0I8sa3zIAmR0fNdQjg== is the key in the keyring file above, but when I decode the secret I get a different one.

$ echo QVFBL3RTaG03WGtySHhBQWV4M2xVdkFGMEtkeE1aQkMxZEd1SWc9PQo= | base64 -d
AQA/tShm7XkrHxAAex3lUvAF0KdxMZBC1dGuIg==

$ echo Y3NpLXJiZAo= | base64 -d
csi-rbd

Is this the right key for the csi-rbd user?
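
The same comparison can be done in a way that also shows invisible characters, which `echo ... | base64 -d` hides in a terminal. The sketch below is a minimal illustration, not part of ceph-csi: `decode_secret` is a hypothetical helper and the values are copied from the Secret and keyring posted above. Both values in the posted Secret decode with a trailing newline; the thread does not establish whether that is related to the error.

```python
import base64

# Key from /etc/ceph/keyring and data from csi-rbd-secret, as posted above.
KEYRING_KEY = "AQAoQiJmp0avAhAAOLii0I8sa3zIAmR0fNdQjg=="
SECRET_DATA = {
    "userID": "Y3NpLXJiZAo=",
    "userKey": "QVFBL3RTaG03WGtySHhBQWV4M2xVdkFGMEtkeE1aQkMxZEd1SWc9PQo=",
}


def decode_secret(data):
    """Hypothetical helper: base64-decode every value of a Secret's data map."""
    return {name: base64.b64decode(value).decode("utf-8") for name, value in data.items()}


decoded = decode_secret(SECRET_DATA)
for name, value in decoded.items():
    # repr() makes trailing whitespace such as "\n" visible.
    print(name, repr(value))
# userID 'csi-rbd\n'
# userKey 'AQA/tShm7XkrHxAAex3lUvAF0KdxMZBC1dGuIg==\n'

# Compare the key itself, ignoring surrounding whitespace.
key_matches = decoded["userKey"].strip() == KEYRING_KEY
# Flag values that carry leading or trailing whitespace.
has_extra_whitespace = {name: value != value.strip() for name, value in decoded.items()}
print(key_matches)           # False
print(has_extra_whitespace)  # {'userID': True, 'userKey': True}
```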

@wangchao732
Author

On Kubernetes 1.22.2, Secret values must be base64-encoded when the Secret is created; otherwise creation fails with: error decoding from json: illegal base64 data at input byte 3.
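
For reference, values under a Secret's `data` field do have to be base64-encoded (plain text goes under `stringData` instead). A minimal sketch of the encoding step follows; `encode_secret_value` is a hypothetical helper, and the point is only that the encoder is given exactly the bytes intended, since a newline in the input changes the result.

```python
import base64


def encode_secret_value(raw):
    """Hypothetical helper: base64-encode a string exactly as given."""
    return base64.b64encode(raw.encode("utf-8")).decode("ascii")


print(encode_secret_value("csi-rbd"))    # Y3NpLXJiZA==
print(encode_secret_value("csi-rbd\n"))  # Y3NpLXJiZAo=
```

The second form, with the newline included, is the `userID` value that appears in the Secret posted earlier in this thread.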

@wangchao732
Author

sh-4.4# cat /etc/ceph/keyring
[client.csi-rbd]
key = AQAoQiJmp0avAhAAOLii0I8sa3zIAmR0fNdQjg==
caps mgr = "profile rbd"
caps mon = "profile rbd"
caps osd = "profile rbd pool=k8s-store"

AQAoQiJmp0avAhAAOLii0I8sa3zIAmR0fNdQjg== is the key in the keyring file above, but when I decode the secret I get a different one.

$ echo QVFBL3RTaG03WGtySHhBQWV4M2xVdkFGMEtkeE1aQkMxZEd1SWc9PQo= | base64 -d
AQA/tShm7XkrHxAAex3lUvAF0KdxMZBC1dGuIg==

$ echo Y3NpLXJiZAo= | base64 -d
csi-rbd

Is this the right key for the csi-rbd user?

Yes, I rebuilt it later.

@wangchao732
Author

image

@Madhu-1
Collaborator

Madhu-1 commented Apr 24, 2024

caps mgr = "allow rw"
caps mon = "profile rbd"
caps osd = "profile rbd pool=k8s-store"

Can you grant permission on the mgr, or use the csi-rbd-provisioner user in the secret, and see if it works?

@wangchao732
Author

image
image

caps mgr = "allow rw"
caps mon = "profile rbd"
caps osd = "profile rbd pool=k8s-store"

Can you grant permission on the mgr, or use the csi-rbd-provisioner user in the secret, and see if it works?

The problem remains.
image

@Madhu-1
Collaborator

Madhu-1 commented Apr 25, 2024

That key is expected to work, since it works on another cluster. I am out of ideas, sorry.

@wangchao732
Author

That key is expected to work, since it works on another cluster. I am out of ideas, sorry.

It's okay, I'm puzzled too. My guess is that something goes wrong when the userID/userKey are read to connect to the cluster through the Ceph API, but I don't have any proof for now.
