User "system:serviceaccount:ceph-csi-rbd:ceph-csi-rbd-provisioner" cannot list resource "nodes" #4800
For now, I've worked around it by adding the necessary permissions to the clusterrole. Once I did that, the provisioner finished its startup phase, saw the PVC, provisioned the PV, and life is good.
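For reference, the workaround amounts to adding rules like the following to the provisioner's ClusterRole. This is a sketch: the ClusterRole name and the exact set of existing rules depend on the helm release, so treat the names here as assumptions.

```yaml
# Hypothetical fragment: extra rules merged into the provisioner's ClusterRole
# so it can read nodes and csinodes. The metadata.name is an assumed default;
# check the actual ClusterRole created by your helm release.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: ceph-csi-rbd-provisioner
rules:
  - apiGroups: [""]
    resources: ["nodes"]
    verbs: ["get", "list", "watch"]
  - apiGroups: ["storage.k8s.io"]
    resources: ["csinodes"]
    verbs: ["get", "list", "watch"]
```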
Hmm. With the above workaround, the pod is started and apparently running. But I see this event on the pod that mounted the new PVC:
Other than that event, I don't see the word "immediate" anywhere in the pod, PVC, or PV. I don't really know what it means.
@Infinoid hi, this will be fixed by #4798; the issue is covered in #4790 (comment)
Thanks. Your discussion on #4790 was after it was closed, so I missed it. I think you're right, this is a duplicate.
Describe the bug
After upgrading from 3.11.0 to 3.12.1 (using helm), the csi-provisioner log is getting constant permission errors. I updated both rbd and cephfs, but it's only happening in the rbd provisioner.
Environment details
Steps to reproduce
Steps to reproduce the behavior:
Actual results
rbd volume not provisioned, pod never starts.
Expected behavior
Provisioner provisions without error, same as 3.11.0.
Logs
The `csi-provisioner` container logs repeat these messages at some interval:

I see lots of errors in the `csi-snapshotter` logs too, but I think that's unrelated. (I don't have any VSCs defined.) I see nothing of note in other ceph-csi-rbd logs.
In the application's namespace, I see events like:
Additional context
Terraform installation & configuration
This is how I installed and configured it. To update, I only changed the "version" line.
What the previous (successful) version looks like
I tested provisioning and removal using the bitnami/mariadb helm chart. I am using the same chart, configured the same way, for both the old and new versions of ceph-csi-rbd. Here's what the v3.11.0 `csi-provisioner` log looks like:

In comparison, v3.12.1 gets the lease, fails to create its watches, and keeps retrying forever; it doesn't even notice the PVC.
RBAC
In the installation manifest, permission to read nodes and csinodes is always granted.
In the helm chart, permission is only granted when the domainLabels list is non-empty, but it's now empty by default (#4776). But the provisioner is still trying to read node/csinode stuff, and apparently can't finish its setup phase without it. So that seems to be why it's failing now.
At this point, I'm feeling a little lost. I feel like an enabled feature with an empty configuration should do nothing. But this is doing a little too much nothing 😁. Should the provisioner have permission to look at nodes, regardless of whether domain labels are defined in the helm chart?
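One way to confirm the RBAC gap directly is with `kubectl auth can-i`, impersonating the service account named in the error message. This requires access to a live cluster; the service account string below is taken from the error in this issue's title.

```shell
# Ask the API server whether the provisioner's service account
# is allowed to list nodes. Prints "yes" or "no".
kubectl auth can-i list nodes \
  --as=system:serviceaccount:ceph-csi-rbd:ceph-csi-rbd-provisioner
```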
I saw the discussion of command-line arguments in #4777 and #4790. Was that intended to fix this issue? I checked, and my provisioner is indeed being passed the `--immediate-topology=false` flag.