Host mount not removed and can be reused if multiple CSI-SMB PV/PVCs use the same network address #353
Comments
Currently one PV on one node only has one SMB mount, shared by multiple pods, so it's one mount per PV instead of one mount per pod; this reduces the number of SMB mounts on the node.
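To illustrate (a hypothetical node-side listing; the server path and pod directories are made up): with two pods sharing one PVC backed by a single PV, the node carries one cifs globalmount for the PV, and each pod's entry is a bind mount of that single globalmount rather than a separate SMB mount.

$ mount | grep cifs
//example-server/share on /var/lib/kubelet/plugins/kubernetes.io/csi/pv/shared-pv/globalmount type cifs (rw,...)
//example-server/share on /var/lib/kubelet/pods/<pod-uid-1>/volumes/kubernetes.io~csi/shared-pv/mount type cifs (rw,...)
//example-server/share on /var/lib/kubelet/pods/<pod-uid-2>/volumes/kubernetes.io~csi/shared-pv/mount type cifs (rw,...)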
Hi @andyzhangx, Thank you for getting back to me so quickly. Unfortunately, I'm working on a project where this workaround won't be possible. My project involves a web interface where different users are able to log in and independently specify different network storage configurations (address, username and password). The system will then create separate PVs and start a pod that outputs data to each separate PV. Sharing a PV between pods would require that end-users coordinate to use only a single network share, with a shared username and password, which doesn't fit the user interface design. I think that a Kubernetes clustered environment should support more than one user simultaneously, but independently, using the same network host address with different credentials. Would it be possible for the CSI-SMB driver to support more than one PV referencing the same network host address? Thank you for your help, Best regards,
I think you could set up multiple PVs with different settings, e.g. network share, username
Hi @andyzhangx, I agree that users could all specify their own configuration that uses different settings relative to each other. The difficulty is how the users can reliably avoid specifying a configuration that is already in use. This is especially true when new users try a common, well-known network share to make sure the product is working for them. If my project needs to enforce this uniqueness, it will need to present error messages to the user like: "You cannot use this network share configuration because it is in use by another user on the system." This is a confusing error message, especially for a shared cluster environment where user segregation is normally assured. Would it be possible for the CSI-SMB driver to support more than one PV referencing the same network host address, share name and username combination? Thank you for your help, Best regards,
The CSI-SMB driver supports multiple PVs referencing the same network host address, share name and username combination.
The bug reproduction steps above show that OS mount points are not correctly unmounted if multiple PVs are created that reference the same network host address, share name and username combination and then one of these PVs is deleted. Have I provided enough information in my steps above for you to reproduce this bug? Thank you for your help, Best regards,
@snazzysoftware could you provide node driver logs from that agent node, following https://github.com/kubernetes-csi/csi-driver-smb/blob/master/docs/csi-debug.md#case2-volume-mountunmount-failed? If one PV is not used by any pod on the node, it should be unmounted when the last pod is terminated. There would be
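For reference, collecting those node driver logs per the linked guide would look roughly like this (the node pod name and label shown here are placeholders and differ per cluster):

$ kubectl get pods -n kube-system -l app=csi-smb-node -o wide                 # find the csi-smb-node pod on the affected node
$ kubectl logs csi-smb-node-xxxxx -c smb -n kube-system > csi-smb-node.log    # dump the smb container log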
Hi @andyzhangx , I've followed the bug reproduction steps above and captured timings and logs. Please see the following test transcript that shows the commands I ran and the main timestamps. Please also find attached a zip of all the logs that include a warning from the NodeUnpublishVolume step (in
Test Transcript
Install the long-lived deployment:
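(The manifest file names below are assumptions; the actual definitions are in the attached yaml_files.zip.)

$ kubectl apply -f long-lived-secret.yaml       # Secret with username 'testuser' / password 'correctpassword'
$ kubectl apply -f long-lived-pv.yaml           # PV 'long-lived-pv' pointing at //10.44.131.76/testshare
$ kubectl apply -f long-lived-pvc.yaml          # PVC 'long-lived-pvc' bound to that PV
$ kubectl apply -f long-lived-deployment.yaml   # Deployment 'long-lived-deployment' mounting the PVC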
Capture logs 1:
Check the host OS mounts:
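On the agent node this corresponds to:

$ mount | grep cifs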
Install the short-lived deployment:
Capture logs 2:
Check the host OS mounts:
Uninstall the short-lived deployment:
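The delete sequence used here is the one quoted later in the thread:

$ kubectl delete deployment short-lived-deployment
$ kubectl delete pvc short-lived-pvc     # this delete hung
$ kubectl delete pv short-lived-pv
$ kubectl delete secret short-lived-secret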
Capture logs 3:
Check the host OS mounts, unexpected result 1 (host OS mount point for short-lived PV is not unmounted):
Reinstall the short-lived deployment with incorrect password:
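A minimal sketch of this step, assuming the secret uses the driver's username/password keys and the manifest file names assumed above:

$ kubectl delete secret short-lived-secret
$ kubectl create secret generic short-lived-secret \
    --from-literal=username=testuser \
    --from-literal=password=wrongpassword    # deliberately incorrect
$ kubectl apply -f short-lived-pv.yaml -f short-lived-pvc.yaml -f short-lived-deployment.yaml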
Capture logs 4:
Unexpected result 2 (pod has access to network share with incorrect password in secret):
Please let me know if you require any more information. Thank you very much for your help. Best regards,
I did not find
$ kubectl delete deployment short-lived-deployment
$ kubectl delete pvc short-lived-pvc (hung)
$ kubectl delete pv short-lived-pv
$ kubectl delete secret short-lived-secret
Can you delete only the deployment and check whether the short-lived cifs mount is still there?
$ kubectl delete deployment short-lived-deployment
Hi @andyzhangx, I've run the bug reproduction steps above with your suggested modification to only delete the short-lived deployment. Please see the following test transcript that shows the commands I ran and the main timestamps. Please also find attached a zip of all the logs.
Test Transcript
Install the long-lived deployment:
Capture logs 1:
Check the host OS mounts:
Install the short-lived deployment:
Capture logs 2:
Check the host OS mounts:
Modified procedure to only uninstall the short-lived deployment:
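Per the suggestion above, only the deployment was removed in this run:

$ kubectl delete deployment short-lived-deployment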
Capture logs 3:
Check the host OS mounts, unexpected result 1 (host OS mount point for short-lived PV is not unmounted):
Reinstall the short-lived deployment with incorrect password:
Capture logs 4:
Unexpected result 2 (pod has access to network share with incorrect password in secret):
Please let me know if you require any more information. Thank you very much for your help. Best regards,
Hi @snazzysoftware, when you delete the long-lived deployment, does the cifs mount of the long-lived PV still exist? Also, I am not sure whether it's related, but can you remove
nvm, I got the repro on
I got the root cause after checking kubelet logs, during the unmount:
# mount | grep cifs | uniq | grep smb-server
//smb-server.default.svc.cluster.local/share on /var/lib/kubelet/plugins/kubernetes.io/csi/pv/pv-smb/globalmount type cifs (rw,relatime,vers=3.0,cache=strict,username=USERNAME,uid=0,noforceuid,gid=0,noforcegid,addr=10.0.14.5,file_mode=0777,dir_mode=0777,soft,nounix,serverino,mapposix,rsize=4194304,wsize=4194304,bsize=1048576,echo_interval=60,actimeo=1)
//smb-server.default.svc.cluster.local/share on /var/lib/kubelet/plugins/kubernetes.io/csi/pv/pv-smb2/globalmount type cifs (rw,relatime,vers=3.0,cache=strict,username=USERNAME,uid=0,noforceuid,gid=0,noforcegid,addr=10.0.14.5,file_mode=0777,dir_mode=0777,soft,nounix,serverino,mapposix,rsize=4194304,wsize=4194304,bsize=1048576,echo_interval=60,actimeo=1)
I think we may need to filter out if two references are in the same directory tree in k8s upstream.
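To illustrate why the staging mount is left behind (command and output are shown for context only; the actual check lives in upstream kubelet mount-reference handling): both PVs' globalmount paths are backed by the same SMB source, so a reference lookup keyed on the source device still sees the other PV's globalmount as an outstanding reference and the unmount is skipped.

$ findmnt -rn -o TARGET -S //smb-server.default.svc.cluster.local/share
/var/lib/kubelet/plugins/kubernetes.io/csi/pv/pv-smb/globalmount
/var/lib/kubelet/plugins/kubernetes.io/csi/pv/pv-smb2/globalmount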
The related code is here; not sure whether there is a good way to fix this issue:
The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs. This bot triages issues and PRs according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Mark this issue or PR as fresh with /remove-lifecycle stale
- Mark this issue or PR as rotten with /lifecycle rotten
- Close this issue or PR with /close
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community. /lifecycle stale
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs. This bot triages issues and PRs according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Mark this issue or PR as fresh with /remove-lifecycle rotten
- Close this issue or PR with /close
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community. /lifecycle rotten
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs. This bot triages issues according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Reopen this issue with /reopen
- Mark this issue as fresh with /remove-lifecycle rotten
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community. /close
@k8s-triage-robot: Closing this issue. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
What happened:
The mount point on the underlying Kubernetes host for a CSI-SMB PV/PVC is not unmounted if one or more other PV/PVCs remain that are also configured with the same shared folder network address. The host mount point that remains is reused by the CSI-SMB driver for future CSI-SMB PV/PVCs with the same name, avoiding the need to provide correct credentials to gain access to the shared folder.
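For example (an illustrative node-side check using the PV names from the reproduction below): after the short-lived PV/PVC are deleted, the stale globalmount remains on the host, and a later PV created with the same name silently reuses it.

$ mount | grep cifs | grep short-lived-pv
//10.44.131.76/testshare on /var/lib/kubelet/plugins/kubernetes.io/csi/pv/short-lived-pv/globalmount type cifs (rw,...)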
What you expected to happen:
I would expect that PVs used by separate pods and configured to connect to the same shared drive would be mounted and unmounted without affecting each other.
How to reproduce it:
See the attached diagram.png for an overview of the deployment timelines for this issue.
The failure scenario involves:
All the Kubernetes definition files used in the test steps below are included in the attached yaml_files.zip file.
The steps assume availability of an SMB network share with address '//10.44.131.76/testshare', username 'testuser' and password 'correctpassword'.
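(An optional sanity check before applying the manifests, assuming smbclient is available on a test machine; not part of the original steps:)

$ smbclient //10.44.131.76/testshare -U testuser -c 'ls'    # prompts for 'correctpassword' and lists the share contents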
//10.44.131.76/testshare on /var/lib/kubelet/plugins/kubernetes.io/csi/pv/long-lived-pv/globalmount type cifs (rw,relatime,vers=3.0,cache=strict,username=testuser,domain=RGH_NAS,uid=0,noforceuid,gid=0,noforcegid,addr=10.44.131.76,file_mode=0777,dir_mode=0777,seal,soft,nounix,serverino,mapposix,rsize=1048576,wsize=1048576,echo_interval=60,actimeo=1)
//10.44.131.76/testshare on /var/lib/kubelet/pods/96edf4c3-50a2-403a-8cfa-de829eead8ea/volumes/kubernetes.io~csi/long-lived-pv/mount type cifs (rw,relatime,vers=3.0,cache=strict,username=testuser,domain=RGH_NAS,uid=0,noforceuid,gid=0,noforcegid,addr=10.44.131.76,file_mode=0777,dir_mode=0777,seal,soft,nounix,serverino,mapposix,rsize=1048576,wsize=1048576,echo_interval=60,actimeo=1)
long-lived-deployment-547474ff76-7kx9p 1/1 Running 0 3m20s
...
test.txt
//10.44.131.76/testshare on /var/lib/kubelet/plugins/kubernetes.io/csi/pv/long-lived-pv/globalmount type cifs (rw,relatime,vers=3.0,cache=strict,username=testuser,domain=RGH_NAS,uid=0,noforceuid,gid=0,noforcegid,addr=10.44.131.76,file_mode=0777,dir_mode=0777,seal,soft,nounix,serverino,mapposix,rsize=1048576,wsize=1048576,echo_interval=60,actimeo=1)
//10.44.131.76/testshare on /var/lib/kubelet/pods/96edf4c3-50a2-403a-8cfa-de829eead8ea/volumes/kubernetes.io~csi/long-lived-pv/mount type cifs (rw,relatime,vers=3.0,cache=strict,username=testuser,domain=RGH_NAS,uid=0,noforceuid,gid=0,noforcegid,addr=10.44.131.76,file_mode=0777,dir_mode=0777,seal,soft,nounix,serverino,mapposix,rsize=1048576,wsize=1048576,echo_interval=60,actimeo=1)
//10.44.131.76/testshare on /var/lib/kubelet/plugins/kubernetes.io/csi/pv/short-lived-pv/globalmount type cifs (rw,relatime,vers=3.0,cache=strict,username=testuser,domain=RGH_NAS,uid=0,noforceuid,gid=0,noforcegid,addr=10.44.131.76,file_mode=0777,dir_mode=0777,seal,soft,nounix,serverino,mapposix,rsize=1048576,wsize=1048576,echo_interval=60,actimeo=1)
//10.44.131.76/testshare on /var/lib/kubelet/pods/19951d18-9951-4d13-9568-ccc3a89cd30d/volumes/kubernetes.io~csi/short-lived-pv/mount type cifs (rw,relatime,vers=3.0,cache=strict,username=testuser,domain=RGH_NAS,uid=0,noforceuid,gid=0,noforcegid,addr=10.44.131.76,file_mode=0777,dir_mode=0777,seal,soft,nounix,serverino,mapposix,rsize=1048576,wsize=1048576,echo_interval=60,actimeo=1)
short-lived-deployment-6d48cdc984-gwx4b 1/1 Running 0 85s
...
test.txt
//10.44.131.76/testshare on /var/lib/kubelet/plugins/kubernetes.io/csi/pv/long-lived-pv/globalmount type cifs (rw,relatime,vers=3.0,cache=strict,username=testuser,domain=RGH_NAS,uid=0,noforceuid,gid=0,noforcegid,addr=10.44.131.76,file_mode=0777,dir_mode=0777,seal,soft,nounix,serverino,mapposix,rsize=1048576,wsize=1048576,echo_interval=60,actimeo=1)
//10.44.131.76/testshare on /var/lib/kubelet/pods/96edf4c3-50a2-403a-8cfa-de829eead8ea/volumes/kubernetes.io~csi/long-lived-pv/mount type cifs (rw,relatime,vers=3.0,cache=strict,username=testuser,domain=RGH_NAS,uid=0,noforceuid,gid=0,noforcegid,addr=10.44.131.76,file_mode=0777,dir_mode=0777,seal,soft,nounix,serverino,mapposix,rsize=1048576,wsize=1048576,echo_interval=60,actimeo=1)
//10.44.131.76/testshare on /var/lib/kubelet/plugins/kubernetes.io/csi/pv/short-lived-pv/globalmount type cifs (rw,relatime,vers=3.0,cache=strict,username=testuser,domain=RGH_NAS,uid=0,noforceuid,gid=0,noforcegid,addr=10.44.131.76,file_mode=0777,dir_mode=0777,seal,soft,nounix,serverino,mapposix,rsize=1048576,wsize=1048576,echo_interval=60,actimeo=1)
short-lived-deployment-6d48cdc984-8s2vm 1/1 Running 0 32s
...
test.txt
Anything else we need to know?:
Environment:
kubernetes-csi/csi-driver-smb v1.2.0:
https://github.com/kubernetes-csi/csi-driver-smb/releases/tag/v1.2.0
'kubectl version' output:
Client Version: version.Info{Major:"1", Minor:"21", GitVersion:"v1.21.2+k3s1", GitCommit:"5a67e8dc473f8945e8e181f6f0b0dbbc387f6fca", GitTreeState:"clean", BuildDate:"2021-06-21T20:52:44Z", GoVersion:"go1.16.4", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"21", GitVersion:"v1.21.2+k3s1", GitCommit:"5a67e8dc473f8945e8e181f6f0b0dbbc387f6fca", GitTreeState:"clean", BuildDate:"2021-06-21T20:52:44Z", GoVersion:"go1.16.4", Compiler:"gc", Platform:"linux/amd64"}
Using the K3s distribution of Kubernetes.
'cat /etc/os-release' output:
NAME="CentOS Linux"
VERSION="7 (Core)"
ID="centos"
ID_LIKE="rhel fedora"
VERSION_ID="7"
PRETTY_NAME="CentOS Linux 7 (Core)"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:centos:centos:7"
HOME_URL="https://www.centos.org/"
BUG_REPORT_URL="https://bugs.centos.org/"
CENTOS_MANTISBT_PROJECT="CentOS-7"
CENTOS_MANTISBT_PROJECT_VERSION="7"
REDHAT_SUPPORT_PRODUCT="centos"
REDHAT_SUPPORT_PRODUCT_VERSION="7"
'uname -a' output:
Linux ussd-tst-bacn05.edgeos.illumina.com 3.10.0-1127.el7.x86_64 #1 SMP Tue Mar 31 23:36:51 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux