
Way to specify posix userid for efs access points #393

Closed
hussainsaify opened this issue Mar 26, 2021 · 18 comments
Labels
lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed.

Comments

@hussainsaify

@kbasv I have installed the efs-csi-driver for dynamically creating volumes and access points. However, each time a PVC is created, it creates an access point with a different POSIX user ID, incremented from the GID range specified in the storage class. Is there a way to specify the POSIX user ID so that the access point is always created with that exact ID?

K8s version: 1.19 (EKS)
EFS CSI driver: 1.2.0
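
For context, a minimal sketch of the kind of StorageClass in question (the file system ID and range values are placeholders); the driver allocates a new POSIX uid/gid for each dynamically created access point from the configured range:

kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: efs-sc
provisioner: efs.csi.aws.com
parameters:
  provisioningMode: efs-ap            # dynamic provisioning via EFS access points
  fileSystemId: fs-0123456789abcdef0  # placeholder file system ID
  directoryPerms: "700"
  gidRangeStart: "1000"               # each new access point gets the next free uid/gid from this range
  gidRangeEnd: "2000"
  basePath: "/dynamic_provisioning"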

@kbasv

kbasv commented Mar 27, 2021

@hussainsaify No, there isn't a way in dynamic provisioning to have access points provisioned with a specific POSIX ID.
If you need access points created with a specific POSIX ID, you will have to switch to static provisioning.
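
For anyone landing here, a minimal sketch of that static approach, assuming an access point has already been created with the desired POSIX identity (the file system and access point IDs are placeholders):

apiVersion: v1
kind: PersistentVolume
metadata:
  name: efs-pv
spec:
  capacity:
    storage: 5Gi                    # required by the API, not enforced by EFS
  volumeMode: Filesystem
  accessModes:
    - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  storageClassName: ""
  csi:
    driver: efs.csi.aws.com
    # the access point carries the fixed POSIX uid/gid
    volumeHandle: fs-0123456789abcdef0::fsap-0123456789abcdef0

A PVC with storageClassName: "" and volumeName: efs-pv can then bind to it, and the access point's uid/gid applies to everything written through that mount.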

@hussainsaify
Author

@kbasv Thanks for the response. It would be great to add a feature for specifying the user ID when creating access points, because in our current use case each of our microservices accesses a common folder structure and we build the uid/gid into our Docker images, which may make dynamic provisioning non-viable for us at this point.

@Yukesh4791

We have the same use case as @hussainsaify mentioned, and the POSIX GID also gets incremented with each dynamically provisioned EFS access point. It would be good to have a feature to set a fixed GID for each dynamically provisioned EFS access point.
Thanks!

@tfrancisci

I too would like to have multiple efs access points with the same posix ids. In our scenario, we have a stateful set where each pod has 2 efs mounts. One is shared amongst all pods in the stateful set (Shared directory) and the other is unique for each pod in the stateful set. As all pods run as the same user, it would make sense that all of those mounts have the same uid/gid.

We were really hoping that this dynamic provisioning would allow us to do that without having to connect to each efs access point separately to update the permissions.

Is there a technical reason this is not supported, or is it just a 'not yet' thing?

@tfrancisci

I noticed #434 after I wrote my above comment so I see that this is being worked on.

However, I have been digging further and did see some comments that the range of allowed uids should indeed be a part of the SC, but specifying the UID for any given PV should be done (optionally) at the PVC level.

With the proposed changes in the above PR, any volume dynamically created with that SC would get the same uid/gid. This means that you would need a different SC for each application that could potentially be running as a different user.

If this uid were taken from the PVC spec, then each pod could specify its own uid/gid. This uid/gid would then be applied to the new Access Point and the correct permissions would be applied to the created directory.

As I have not dug deeply into the controller code yet, I'm not sure whether it has access to the spec or annotations of the PVC when creating the PV, so I don't know if this is possible...

@wochanda

Thanks for the request here. I'm interested in hearing more about your use cases and other things you've tried. The intention of dynamic provisioning is to create new, empty, private volumes for applications to use. Logically, each PV/PVC represents a single volume or data set. When two pods need to share a volume, rather than having two PVs with the same configuration (path and identity), the best practice is to reference the PVC in both pod configurations.
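
A minimal sketch of that pattern, with a single ReadWriteMany claim mounted by two pods (names, StorageClass, and image are placeholders):

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: shared-data
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: efs-sc           # the dynamic-provisioning StorageClass
  resources:
    requests:
      storage: 5Gi
---
apiVersion: v1
kind: Pod
metadata:
  name: app-a                        # a second pod (app-b) would reference the same claimName
spec:
  containers:
    - name: app
      image: public.ecr.aws/amazonlinux/amazonlinux:2
      command: ["sleep", "infinity"]
      volumeMounts:
        - name: shared
          mountPath: /data
  volumes:
    - name: shared
      persistentVolumeClaim:
        claimName: shared-data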

Further, if you already have a file system with specific data in a path, and you want to access that data with a particular identity, the best practice would be to create a static PV that references the filesystemid/accesspointid that has the appropriate path/identity, then your application can claim it.

Could you help me understand if there are use cases that the above approaches do not satisfy?

@tfrancisci

Hi Will, thanks for your interest in possible use-cases. I'm still working my way through some of this so it's entirely possible I have some things wrong, please correct me if this is the case or if there are preferred methods to get what I need. This could get long :)

There are a few scenarios that I alluded to above, so I will attempt to split them up further down. First an overview of what we are currently doing and trying to do.

What are we currently doing?

Right now, we are using Static provisioning, either path based or Access Point based.

  • If we do path based provisioning, we manually (Well, with Terraform) create the PVs that point to a specific path in the EFS. We then use an extra process to mount the pod as root to change the FS permissions.
  • If we do the Access Point based provisioning then we no longer need to do the song and dance of mounting the volumes to fix them, but we do need to create all the AP's, with advance knowledge of the uid that will be used by the application.

What are we trying to achieve?

We were hoping that with dynamic provisioning we would be able to avoid the steps of having to manually provision the PVs. That's the intent, right? This way, there is less overhead on the K8s administration team before new applications are deployed.
However, without being able to specify the uid/gid, this is a non-starter for any application that likes to fiddle with ownership; the scenarios below should help explain why.

Scenario 1 - Per-pod volumes.

In this scenario, consider a single pod that needs persistent storage. This pod may be from some upstream vendor where we have no control over what the pod does. It may be that this pod decides to check the uid/gid of the persistent volume and then run chown commands on them to keep them in check. When the new uid of the chown command does not match the uid in the Access Point, it fails and can stop the pod from starting.

This is apparently the case in some other issues I have read recently, particularly here: #300
It's great that some upstream projects are able to change this chown behaviour, but we cannot rely on all of them to do that.

If we were able to specify the uid/gid, then the chown commands have no effect (as the before/after uid is the same now) and they do not fail. Of course for applications that do not care about the ownership on these persistent volumes, there is no problem with the dynamically allocated uid.

Some have suggested hacks like creating another user in the pod (based on the uid of the PV) and running the app as that, but it sounds hacky.
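
For completeness, that workaround amounts to something like the following, where 1001 stands for whatever uid/gid the driver happened to assign to the access point; it is only known after provisioning, which is part of what makes it feel hacky (claim name and image are placeholders):

apiVersion: v1
kind: Pod
metadata:
  name: app
spec:
  securityContext:
    runAsUser: 1001      # must match the uid the access point was created with
    runAsGroup: 1001     # must match the gid the access point was created with
  containers:
    - name: app
      image: public.ecr.aws/amazonlinux/amazonlinux:2
      command: ["sleep", "infinity"]
      volumeMounts:
        - name: data
          mountPath: /data
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: app-data   # placeholder claim bound to the dynamically provisioned PV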

Scenario 2 - Pods with multiple volumes

In this scenario, there may be a pod with multiple persistent volumes backed by multiple dynamically provisioned Access Points. One may be a private volume just for that pod, another may be a shared volume that is used by multiple pods (We will move on to that in a moment). In this scenario, each of the mounted volumes will have a different uid. We run into the same problems here as we do in Scenario 1, except now you cannot even use the hack like running as a new user with the same uid as the PV.

Scenario 3 - Multiple pods using a single PV as a shared volume.

This is not really any different to scenario 1, as it still just presents as a mismatch between the PV uid and the container uid. The reason I am including it is to clarify that earlier, I was not talking about sharing volumes between pods by having separate PVs mapping back to the same EFS AP. I was referring to pods that reference the same PVC that uses ReadWriteMany. The same uid mismatch problems will still occur though.

Scenario 4 - Multiple applications using dynamically provisioned PVs on a single EFS FS

I have seen reference to the use case where the K8s administrators will provision a single EFS FS and then use Dynamic Provisioning to create APs & PVs for multiple applications.
In my above examples, I was referring to a single application needing a single uid. If #434 is implemented, then that will work just fine when all pods need their PVs to be configured with the same uid.
However, in the scenario where multiple applications request dynamically created PVs, they would all get PVs with the same UID. This would not work where one application (say, Jira) needs 2001 and another application (say, Confluence) needs 2002.
To get around this problem you would need to define multiple SCs (possibly with separate backing EFS file systems), one for each application. This again turns into extra overhead for the K8s administrators.
This scenario is the reason for me asking about moving the uid definition to the PVC instead of the SC. This way the application could define the uid it requires.
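
To illustrate that overhead, assuming the uid/gid StorageClass parameters proposed in #434, the workaround would look roughly like one class per application (file system ID and uid/gid values are placeholders):

kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: efs-jira
provisioner: efs.csi.aws.com
parameters:
  provisioningMode: efs-ap
  fileSystemId: fs-0123456789abcdef0   # placeholder
  directoryPerms: "700"
  uid: "2001"                          # every PV from this class gets uid/gid 2001
  gid: "2001"
---
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: efs-confluence
provisioner: efs.csi.aws.com
parameters:
  provisioningMode: efs-ap
  fileSystemId: fs-0123456789abcdef0   # placeholder
  directoryPerms: "700"
  uid: "2002"                          # every PV from this class gets uid/gid 2002
  gid: "2002"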

A bit long, but I hope that helped shed some light on my thinking here.

@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Aug 9, 2021
@aboyanov

I too would like to have efs access points and the ability to specify the uid/gid for each.

My current use case is that we'd like to backup our cluster using Velero and some of our applications are using EFS.
When trying to restore, Velero tries to chown the data and currently fails with "chown: operation not permitted".
If I were able to specify the owner of the filesystem dynamically, I could set the user/group that the Velero init container runs as, which would save me from doing manual steps or workarounds.

@wochanda

wochanda commented Aug 16, 2021

Thanks all for the comments on what you're trying to do. I have an alternate proposal on how to solve this: instead of hardcoding the UID/GID of each PV to a value specified at the storage class level, we can add a 'trustPodIdentity' option to the storage class. When enabled, instead of creating access points with a set UID/GID, the driver omits these values in the access point, causing EFS to trust the UID/GID sent over the wire from the client. In cases where the container is opinionated about its identity (which seems to be the case here), that identity will be used for any read/write/chown/etc. operations, allowing these applications to work out of the box. The benefit of this approach compared to the existing PR is 1/ the SC owner/creator doesn't have to know the application's UID/GID ahead of time and 2/ the SC can support multiple applications that want to assume different UID/GIDs.

Now, in order for this to work we'll also have to create the access point's root directory with open (777) permissions, because we won't know the UID/GID of the container ahead of time, but any container that does a chown/chmod on startup will be able to constrain this. This model also doesn't have the same level of security isolation as the default model with a unique UID/GID per PV; snippet from the EFS documentation:

Security Model for Access Point Root Directories
When a root directory override is in effect, Amazon EFS behaves like a Linux NFS server with the no_subtree_check option enabled.

In the NFS protocol, servers generate file handles that are used by clients as unique references when accessing files. EFS securely generates file handles that are unpredictable and specific to an EFS file system. When a root directory override is in place, EFS doesn't disclose file handles for files outside the specified root directory. However, in some cases a user might get a file handle for a file outside of their access point by using an out-of-band mechanism. For example, they might do so if they have access to a second access point. If they do this, they can perform read and write operations on the file.

File ownership and access permissions are always enforced, for access to files within and outside of a user's access point root directory.
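
As a sketch only, the proposed option might look like this in a StorageClass; note that trustPodIdentity is hypothetical here and not an existing driver parameter (file system ID is a placeholder):

kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: efs-trust-pod-identity
provisioner: efs.csi.aws.com
parameters:
  provisioningMode: efs-ap
  fileSystemId: fs-0123456789abcdef0   # placeholder
  directoryPerms: "777"                # root directory left open, since the pod identity is unknown ahead of time
  trustPodIdentity: "true"             # hypothetical flag: omit uid/gid on the access point and trust the client's identity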

@k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Sep 15, 2021
@k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Reopen this issue or PR with /reopen
  • Mark this issue or PR as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close

@k8s-ci-robot
Contributor

@k8s-triage-robot: Closing this issue.


Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@holmesb
Contributor

holmesb commented Nov 22, 2021

I'm surprised you can't specify a uid\gid for each EFS volume in Kubernetes. Seems a glaring omission when best practice is to run containers as non-root. Each app is unlikely to share a uid\gid. Agree that it should occur in the PVC, not the SC.

Pinning the uid\gid to a static\predictable one, such that chown works, is an improvement. But it would be best to do away with initcontainers\chown and get volume permissions right in the first place.

@tavin

tavin commented Jun 16, 2022

So it seems that we can specify uid and gid in the StorageClass now. Thanks @nicolas-geniteau et al. for the improvement.

Is it not technically possible to specify uid and gid in the PersistentVolumeClaim instead of the StorageClass?

@jonathanrainer
Contributor

@tavin Sadly, no, it isn't. If you look at the API documentation for PersistentVolumeClaims (https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.24/#persistentvolumeclaim-v1-core), they don't carry a huge amount of information. StorageClasses have the nice property that the parameters field is just a collection of key/value pairs, so you can put arbitrary data into them and interpret it afterwards. PVCs don't have anything like this, so unfortunately there's no way to communicate the kind of details you want to the driver via the PVC.

@tavin

tavin commented Jun 16, 2022

@jonathanrainer Could the driver be programmed to look at some custom annotations on the PVC?

@jonathanrainer
Contributor

@tavin So unfortunately not. While it's a very good idea in principle, it would require changes to the external-provisioner, the tool that watches the Kubernetes objects that get created (PVCs) and creates PVs on the back of that. While it has recently been supplemented with passing through some metadata, there are a lot of opinions on whether that should be extended any further: kubernetes-csi/external-provisioner#86

So in the short term, the StorageClass is really the only mechanism for passing that information through.
