chore(seaweedfs-csi-driver): delete unnecessary stage #151

Merged
merged 1 commit into seaweedfs:master on Jan 12, 2024

Conversation

duanhongyi
Contributor

NodeStageVolume is an additional step for block devices: it is used to partition and format the disk and mount it on a node-global directory. The weed mount uses FUSE, similar to NFS, not a block device, so NodeStageVolume is not needed.

At the same time, this change also prepares for future use of requiresRepublish, which would provide a mechanism to restore the pod's mount when the node server errors, without needing to restart the pod.

Ceph NFS does not use NodeStageVolume (non-block devices):
https://github.com/ceph/ceph-csi/blob/devel/internal/nfs/nodeserver/nodeserver.go#L142

CephFS uses NodeStageVolume (block device):
https://github.com/ceph/ceph-csi/blob/devel/internal/cephfs/nodeserver.go#L158
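
For illustration only, here is a minimal sketch of the single-step flow this PR moves toward, in the style of a CSI node server; the NodeServer type and the mountWeedFuse helper are hypothetical stand-ins, not the driver's actual API. NodePublishVolume mounts the FUSE filesystem straight onto the pod's target path, so no staging step is involved.

```go
package driver

import (
	"context"
	"errors"
	"os"

	"github.com/container-storage-interface/spec/lib/go/csi"
	"google.golang.org/grpc/codes"
	"google.golang.org/grpc/status"
)

// NodeServer is a stand-in for the driver's node service.
type NodeServer struct{}

// mountWeedFuse is a hypothetical helper; the real driver runs `weed mount`
// under the hood. Its body is omitted in this sketch.
func mountWeedFuse(ctx context.Context, volumeID, path string, volCtx map[string]string) error {
	return errors.New("not implemented in this sketch")
}

// NodePublishVolume mounts the FUSE filesystem directly at the pod's target
// path: no NodeStageVolume, no bind mount.
func (ns *NodeServer) NodePublishVolume(ctx context.Context, req *csi.NodePublishVolumeRequest) (*csi.NodePublishVolumeResponse, error) {
	target := req.GetTargetPath()
	if target == "" {
		return nil, status.Error(codes.InvalidArgument, "target path missing in request")
	}
	// Make sure the target directory exists before mounting onto it.
	if err := os.MkdirAll(target, 0750); err != nil {
		return nil, status.Error(codes.Internal, err.Error())
	}
	// One FUSE mount per published volume.
	if err := mountWeedFuse(ctx, req.GetVolumeId(), target, req.GetVolumeContext()); err != nil {
		return nil, status.Error(codes.Internal, err.Error())
	}
	return &csi.NodePublishVolumeResponse{}, nil
}
```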

@chrislusf
Contributor

Thanks and good to learn!

@chrislusf chrislusf merged commit 44283c0 into seaweedfs:master Jan 12, 2024
1 of 4 checks passed
@kvaster
Contributor

kvaster commented Jan 16, 2024

And what about caches? I.e. if some volume is mounted in several pods on the same node? Before this PR only one mount process was used per volume on a node, but now each mount will use its own process.

@kvaster
Contributor

kvaster commented Jan 16, 2024

Also, requiresRepublish is not a mechanism for restoring the pod's mount.
See: https://kubernetes-csi.github.io/docs/csi-driver-object.html
Especially: "CSI drivers should only atomically update the contents of the volume. Mount point change will not be seen by a running container."

The problem is that we really need to propagate the new mount point into the running container, and currently there is no mechanism for this.
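
As an illustration of the limitation (a minimal sketch, not code from this repo or a proposal): during a periodic NodePublishVolume triggered by requiresRepublish, a driver can at least detect that a FUSE mount has died, because a stat on such a mount typically fails with ENOTCONN; but even after remounting the same path, a container that is already running keeps its old view of the mount point.

```go
package driver

import (
	"errors"
	"os"
	"syscall"
)

// isStaleFuseMount reports whether path looks like a FUSE mount whose backing
// process has gone away: stat on such a mount typically fails with ENOTCONN
// ("transport endpoint is not connected"). Detecting this during a periodic
// NodePublishVolume is possible, but a fresh mount created at the same path
// is not propagated into containers that are already running.
func isStaleFuseMount(path string) bool {
	_, err := os.Stat(path)
	return errors.Is(err, syscall.ENOTCONN)
}
```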

@kvaster
Contributor

kvaster commented Jan 16, 2024

@duanhongyi, and it's definitely wrong to assume that NodeStageVolume is used just for block devices.

NodeStageVolume – Mounts a persistent volume to a global path on a worker node.
NodePublishVolume – Mounts a persistent volume from the global path on a worker node to a pod.

From here: https://arslan.io/2018/06/21/how-to-write-a-container-storage-interface-csi-plugin/

NodeStageVolume: This method is called by the CO to temporarily mount the volume to a staging path. Usually this staging path is a global directory on the node. In Kubernetes, after it's mounted to the global directory, you mount it into the pod directory (via NodePublishVolume). The reason that mounting is a two step operation is because Kubernetes allows you to use a single volume by multiple pods. This is allowed when the storage system supports it (say NFS) or if all pods run on the same node. One thing to note is that you also need to format the volume if it's not formatted already. Keep that in mind.

NodePublishVolume: This method is called to mount the volume from staging to target path. Usually what you do here is a bind mount. A bind mount allows you to mount a path to a different path (instead of mounting a device to a path). In Kubernetes, this allows us for example to use the mounted volume from the staging path (i.e global directory) to the target path (pod directory). Here, formatting is not needed because we already did it in NodeStageVolume.

As a result, NodeStageVolume is a wonderful mechanism for ReadWriteMany volumes of any type: it allows a mount to be shared many times on the same node.
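
A minimal sketch of that two-step flow, assuming Linux and reusing the hypothetical NodeServer/mountWeedFuse stand-ins from the sketch above (again, not the driver's actual code): NodeStageVolume creates one FUSE mount per volume per node at the staging path, and NodePublishVolume bind-mounts it into each pod's target path, so many pods on the node share a single mount process.

```go
package driver

import (
	"context"
	"errors"
	"os"
	"syscall"

	"github.com/container-storage-interface/spec/lib/go/csi"
	"google.golang.org/grpc/codes"
	"google.golang.org/grpc/status"
)

// Hypothetical stand-ins, as in the earlier sketch.
type NodeServer struct{}

func mountWeedFuse(ctx context.Context, volumeID, path string, volCtx map[string]string) error {
	return errors.New("not implemented in this sketch")
}

// NodeStageVolume: one FUSE mount per volume per node, at the staging path.
func (ns *NodeServer) NodeStageVolume(ctx context.Context, req *csi.NodeStageVolumeRequest) (*csi.NodeStageVolumeResponse, error) {
	staging := req.GetStagingTargetPath()
	if err := os.MkdirAll(staging, 0750); err != nil {
		return nil, status.Error(codes.Internal, err.Error())
	}
	if err := mountWeedFuse(ctx, req.GetVolumeId(), staging, req.GetVolumeContext()); err != nil {
		return nil, status.Error(codes.Internal, err.Error())
	}
	return &csi.NodeStageVolumeResponse{}, nil
}

// NodePublishVolume: each pod gets a cheap bind mount of the shared staging mount.
func (ns *NodeServer) NodePublishVolume(ctx context.Context, req *csi.NodePublishVolumeRequest) (*csi.NodePublishVolumeResponse, error) {
	staging, target := req.GetStagingTargetPath(), req.GetTargetPath()
	if err := os.MkdirAll(target, 0750); err != nil {
		return nil, status.Error(codes.Internal, err.Error())
	}
	if err := syscall.Mount(staging, target, "", syscall.MS_BIND, ""); err != nil {
		return nil, status.Error(codes.Internal, err.Error())
	}
	return &csi.NodePublishVolumeResponse{}, nil
}
```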

@chrislusf
Contributor

@kvaster, it seems @duanhongyi is in a different time zone. Should we revert this PR?

@chrislusf
Contributor

#153 came after this PR.

To revert this PR, #153 also needs to be reverted.

@kvaster
Contributor

kvaster commented Jan 16, 2024

@chrislusf, my suggestion is to wait for @duanhongyi's answer. My personal position is to respect the opinion and work of others, and it would be wonderful if we all agreed on the way to go.

@kvaster
Contributor

kvaster commented Jan 18, 2024

@duanhongyi, any comments? If there are no comments, my suggestion is to revert these two PRs today.

@duanhongyi
Contributor Author

Sorry, I've been quite busy lately and haven't had time for discussions. The original intention of this PR was to simplify the CSI mounting process. In my opinion, it is unnecessary to share a mount point among multiple pods on one node; instead, it increases the complexity of the code. Perhaps, as you said, cache sharing is possible, but I think it is not worth the added complexity.

@kvaster @chrislusf

@kvaster
Contributor

kvaster commented Jan 18, 2024

@duanhongyi, thanks for your answer! This is all about resources, not only cache. For us it saves a lot of resources (including network) when the mount process is shared across many small pods on the node. I don't think this brings too much complexity, but it does bring flexibility and options without any other downside (except slightly more complex code). Also, this PR does not solve any of the problems mentioned, i.e. remounting the volume in case of an upgrade or process crash.

@duanhongyi
Contributor Author

> @duanhongyi, thanks for your answer! This is all about resources, not only cache. For us it saves a lot of resources (including network) when the mount process is shared across many small pods on the node. I don't think this brings too much complexity, but it does bring flexibility and options without any other downside (except slightly more complex code). Also, this PR does not solve any of the problems mentioned, i.e. remounting the volume in case of an upgrade or process crash.

Okay, you've convinced me. The last two PRs can be rolled back.

@chrislusf
Contributor

ok. Reverted last two PRs.
