when csi-plugin needs to exit and restart for an upgrade or a panic, pods receive the error 'Transport endpoint is not connected' #91
Comments
Please attach the plugin logs.
What I mean is that ceph-csi may need a feature to remount previously mounted paths.
The mount path is given by kubelet. If the pod is deleted, the mountpoint will be gone too.
If we need to upgrade ceph-csi now, we have to taint all nodes and drain all pods that use the ceph-csi plugin. If the ceph-csi plugin supported remounting the last mounted path, it could support rolling updates.
I'm hitting the same issue. Is there any solution for it? Do we need to monitor the plugin and drain the node whenever it restarts, panics, or is updated?
Yes, drain the node before updating. It is not the best solution, but it gives you some protection.
@rootfs do we need to change the code to fix this issue? If not, can we close this one?
@Madhu-1 would you add an upgrade process to the README? For cephfs mounts, drain the node before upgrading. I believe this process applies to other FUSE mount drivers too.
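For reference, a drain-before-upgrade procedure typically looks like the following; the node name is a placeholder, and the exact `kubectl drain` flags depend on your Kubernetes version (older releases use `--delete-local-data`):

```shell
# Stop new pods from being scheduled on the node
kubectl cordon <node-name>

# Evict pods (including those using cephfs PVs) so their mounts are
# recreated elsewhere; DaemonSet pods such as the csi plugin itself
# are skipped
kubectl drain <node-name> --ignore-daemonsets --delete-local-data

# ... upgrade the ceph-csi plugin on the node ...

# Allow scheduling again
kubectl uncordon <node-name>
```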
@rootfs as mentioned in #217, if the csi plugin exits unexpectedly, pods using a cephfs PV cannot recover automatically until the pod is killed and rescheduled. I think this is a problem. Maybe the csi plugin can do more to remount the old path, so that when the plugin exits and restarts, the pod recovers automatically and the old mount path is usable again.
issue ceph#217

**Goal:** when the csi plugin exits unexpectedly, pods using a cephfs PV cannot recover automatically, because the mount relation is lost until the pod is killed and rescheduled to another node. I think this is a problem. Maybe the csi plugin can do more to remount the old path, so that when the plugin exits and restarts, the pod recovers automatically and the old mount path is usable again.

**Non-goal:** making the pod exit and restart when the csi plugin pod exits and the mount point is lost. If the pod does not exit, it will get the error **transport endpoint is not connected**.

**Implementation logic** (a sketch in Go follows below):

csi-plugin start:
1. load all MountCacheEntry records from the node-local dir
2. check whether volID still exists in the cluster; if not, ignore the entry, otherwise continue
3. check whether stagingPath exists; if yes, remount it
4. check whether all targetPaths exist; if yes, bind mount them onto the staging path

NodeServer:
1. NodeStageVolume: add a MountCacheEntry to the local dir, including the readonly attribute and the ceph secret
2. NodePublishVolume: add the pod's bind mount path to the MountCacheEntry and persist it to the local dir
3. NodeUnpublishVolume: remove the pod's bind mount path from the MountCacheEntry and persist it to the local dir
4. NodeUnstageVolume: remove the MountCacheEntry from the local dir
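A minimal Go sketch of that mount-cache idea. All names here (`MountCacheEntry`, `mountCacheDir`, `persistEntry`, `remountAll`) are hypothetical, and the bare `ceph-fuse` / `mount --bind` invocations stand in for the real mount logic; this is not the actual ceph-csi implementation:

```go
package main

import (
	"encoding/json"
	"fmt"
	"os"
	"os/exec"
	"path/filepath"
)

// MountCacheEntry records one staged volume and its per-pod bind
// mounts so the plugin can restore them after an unexpected restart.
// The field set is hypothetical.
type MountCacheEntry struct {
	VolID       string   `json:"volID"`
	StagingPath string   `json:"stagingPath"`
	TargetPaths []string `json:"targetPaths"`
	ReadOnly    bool     `json:"readOnly"`
}

// mountCacheDir is a hypothetical node-local directory that survives
// plugin restarts (it lives on the host, not in the plugin container).
const mountCacheDir = "/var/lib/kubelet/plugins/cephfs.csi.ceph.com/mount-cache"

// persistEntry writes one entry to the cache; it would be called from
// NodeStageVolume and NodePublishVolume.
func persistEntry(e MountCacheEntry) error {
	if err := os.MkdirAll(mountCacheDir, 0700); err != nil {
		return err
	}
	data, err := json.Marshal(e)
	if err != nil {
		return err
	}
	return os.WriteFile(filepath.Join(mountCacheDir, e.VolID+".json"), data, 0600)
}

// remountAll runs at plugin startup: reload every cached entry, skip
// volumes that no longer exist, remount the staging path, then restore
// the pods' bind mounts.
func remountAll(volumeExists func(volID string) bool) error {
	files, err := filepath.Glob(filepath.Join(mountCacheDir, "*.json"))
	if err != nil {
		return err
	}
	for _, f := range files {
		data, err := os.ReadFile(f)
		if err != nil {
			return err
		}
		var e MountCacheEntry
		if err := json.Unmarshal(data, &e); err != nil {
			return err
		}
		// 1. drop entries whose volume is gone from the cluster
		if !volumeExists(e.VolID) {
			os.Remove(f)
			continue
		}
		// 2. remount the staging path (a real call needs monitor
		// addresses and the credentials cached with the entry)
		if err := exec.Command("ceph-fuse", e.StagingPath).Run(); err != nil {
			return fmt.Errorf("restage %s: %w", e.VolID, err)
		}
		// 3. restore each pod's bind mount onto the staging path
		// (a read-only entry would need an extra remount,ro step)
		for _, t := range e.TargetPaths {
			if err := exec.Command("mount", "--bind", e.StagingPath, t).Run(); err != nil {
				return fmt.Errorf("bind mount %s: %w", t, err)
			}
		}
	}
	return nil
}

func main() {
	// on startup, restore mounts; a stub existence check stands in
	// for a real query against the ceph cluster
	if err := remountAll(func(string) bool { return true }); err != nil {
		fmt.Fprintln(os.Stderr, err)
	}
}
```

The key design point is that the cache directory must be a host path, so the records outlive the plugin container and are still there when the new plugin pod starts.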
sync devel branch with upstream
When csi-plugin exits and restarts because of an upgrade or a panic, pods receive the error 'Transport endpoint is not connected'. Does cephfs-csi plan to support remounting the previously mounted path when csi-plugin starts?