doc(storage-network): support RWX volumes #911

Merged
15 changes: 12 additions & 3 deletions content/docs/1.7.0/advanced-resources/deploy/storage-network.md
@@ -42,7 +42,16 @@ Set the setting [Storage Network](../../../references/settings#storage-network).
>
> Longhorn is not aware of such updates, so they cause malfunctions and errors. Instead, create a new NetworkAttachmentDefinition custom resource and update the setting to reference it.

## History
[Original Feature Request](https://github.com/longhorn/longhorn/issues/2285)
### Setting Storage Network For RWX Volumes

Available since v1.3.0
Configure the setting [Storage Network For RWX Volume Enabled](../../../references/settings#storage-network-for-rwx-volume-enabled).

# Limitation

When an RWX volume is created with the storage network, the NFS mount point connection must be re-established when the CSI plugin pod restarts. Longhorn provides the [Automatically Delete Workload Pod when The Volume Is Detached Unexpectedly](../../../references/settings#automatically-delete-workload-pod-when-the-volume-is-detached-unexpectedly) setting, which automatically deletes RWX volume workload pods when the CSI plugin pod restarts. However, the workload pod's NFS mount point could become unresponsive when the setting is disabled or the pod is not managed by a controller. In such cases, you must manually restart the CSI plugin pod.

For more information, see [Storage Network Support for Read-Write-Many (RWX) Volumes](../../../deploy/important-notes/#storage-network-support-for-read-write-many-rwx-volumes) in Important Notes.

# History
- [Original Feature Request (since v1.3.0)](https://github.com/longhorn/longhorn/issues/2285)
- [[FEATURE] Support storage network for RWX volumes (since v1.7.0)](https://github.com/longhorn/longhorn/issues/8184)
16 changes: 16 additions & 0 deletions content/docs/1.7.0/deploy/important-notes/index.md
@@ -246,3 +246,19 @@ The attribute `backendStoreDriver`, which is defined in the parameters of Storag
### Updating the Linux Kernel on Longhorn Nodes

Host machines with Linux kernel 5.15 may unexpectedly reboot when volume-related IO errors occur. Update the Linux kernel on Longhorn nodes to version 5.19 or later to prevent such issues. For more information, see [Prerequisites](../../v2-data-engine/prerequisites/).

### Storage Network Support for Read-Write-Many (RWX) Volumes

Starting with Longhorn v1.7.0, the [storage network](../../advanced-resources/deploy/storage-network/) supports RWX volumes. However, the network's reliance on Multus results in a significant restriction.

Multus networks operate within the Kubernetes network namespace, so Longhorn can mount NFS endpoints only within the CSI plugin pod container network namespace. Consequently, NFS mount connections to the Share Manager pod become unresponsive when the CSI plugin pod restarts. This occurs because the namespace in which the connection was established is no longer available.

Longhorn circumvents this restriction by providing the following settings:
- [Storage Network For RWX Volume Enabled](../../references/settings#storage-network-for-rwx-volume-enabled): When this setting is disabled, the storage network applies only to RWO volumes. The NFS client for RWX volumes is mounted over the cluster network in the host network namespace. This means that restarting the CSI plugin pod does not affect the NFS mount connections.
- [Automatically Delete Workload Pod when The Volume Is Detached Unexpectedly](../../references/settings#automatically-delete-workload-pod-when-the-volume-is-detached-unexpectedly): When the RWX volumes are created over the storage network, this setting actively deletes RWX volume workload pods when the CSI plugin pod restarts. This allows the pods to be remounted and prevents dangling mount entries.
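
The way these two settings interact with a CSI plugin pod restart can be summarized as a small decision function. The following is a minimal sketch in Python that only models the behavior described above. The function name, its parameters, and the returned strings are hypothetical and are not part of Longhorn; the sketch also assumes that a storage network is already configured.

```python
def rwx_mount_after_csi_restart(
    storage_network_for_rwx_enabled: bool,
    auto_delete_workload_pod: bool,
    pod_has_controller: bool,
) -> str:
    """Model what happens to an RWX workload pod's NFS mount
    when the CSI plugin pod restarts (illustrative only)."""
    if not storage_network_for_rwx_enabled:
        # The NFS client is mounted over the cluster network in the host
        # network namespace, so the CSI plugin pod restart has no effect.
        return "unaffected"
    if auto_delete_workload_pod and pod_has_controller:
        # Longhorn deletes the workload pod, its controller recreates it,
        # and the volume is remounted.
        return "pod recreated and remounted"
    # The mount was established in the CSI plugin pod's network namespace,
    # which no longer exists, and nothing recreates the workload pod.
    return "mount may become unresponsive"
```

For example, `rwx_mount_after_csi_restart(True, True, False)` falls through to the last case, because Longhorn never deletes pods that are not managed by a controller.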

You can upgrade clusters with pre-existing RWX volume workloads to Longhorn v1.7.0. During and after the upgrade, the workload pods are not interrupted because the NFS share connection uses the cluster IP, which remains valid in the host network namespace.

To apply the storage network to existing RWX volumes, you must detach the volumes, enable the [Storage Network For RWX Volume Enabled](../../references/settings#storage-network-for-rwx-volume-enabled) setting, and then reattach the volumes.
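
The ordering matters here: every attached RWX volume is detached first, the setting is enabled, and only then are the volumes reattached. The following Python sketch illustrates that ordering as a plain list of steps. The `RwxVolume` type, the `plan_migration` function, and the step strings are hypothetical names used for illustration; the sketch does not call any Longhorn or Kubernetes API.

```python
from dataclasses import dataclass


@dataclass
class RwxVolume:
    name: str       # hypothetical volume name
    attached: bool  # whether the volume is currently attached


def plan_migration(volumes: list[RwxVolume]) -> list[str]:
    """Return the ordered steps for moving existing RWX volumes
    onto the storage network (illustrative only)."""
    attached = [v.name for v in volumes if v.attached]
    # 1. Detach every RWX volume that is currently attached.
    steps = [f"detach {name}" for name in attached]
    # 2. Enable the setting only after all RWX volumes are detached.
    steps.append("enable Storage Network For RWX Volume Enabled")
    # 3. Reattach the volumes that were detached in step 1.
    steps += [f"reattach {name}" for name in attached]
    return steps


if __name__ == "__main__":
    for step in plan_migration([RwxVolume("vol-a", True), RwxVolume("vol-b", False)]):
        print(step)
```

Volumes that are already detached need no action before the setting is changed, so they do not appear in the plan.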

For more information, see [Issue #8184](https://github.com/longhorn/longhorn/issues/8184).
19 changes: 18 additions & 1 deletion content/docs/1.7.0/references/settings.md
@@ -151,7 +151,7 @@ If disabled, Longhorn will not delete the workload pod that is managed by a cont

> **Note:** This setting does not apply to the following cases:
> - The workload pods don't have a controller; Longhorn never deletes them.
> - The volumes used by workloads are RWX, because the Longhorn share manager, which provides the RWX NFS service, has its own resilience mechanism to ensure availability until the volume gets reattached without relying on the pod lifecycle to trigger volume reattachment. For details, see [here](../../nodes-and-volumes/volumes/rwx-volumes).
> - Workload pods with *cluster network* RWX volumes. The setting does not apply to such pods because the Longhorn Share Manager, which provides the RWX NFS service, has its own resilience mechanism. This mechanism ensures availability until the volume is reattached without relying on the pod lifecycle to trigger volume reattachment. The setting does apply, however, to workload pods with *storage network* RWX volumes. For more information, see [ReadWriteMany (RWX) Volume](../../nodes-and-volumes/volumes/rwx-volumes) and [Storage Network](../../advanced-resources/deploy/storage-network#limitation).

#### Automatic Salvage

@@ -346,6 +346,7 @@ This information will help us gain insights how Longhorn is being used, which wi
- Snapshot Data Integrity
- Snapshot DataIntegrity Immediate Check After Snapshot Creation
- Storage Minimal Available Percentage
- Storage Network For RWX Volume Enabled
- Storage Over Provisioning Percentage
- Storage Reserved Percentage For Default Disk
- Support Bundle Failed History Limit
@@ -911,10 +912,26 @@ See [Kubernetes Cluster Autoscaler Support](../../high-availability/k8s-cluster-

The storage network uses Multus NetworkAttachmentDefinition to segregate the in-cluster data traffic from the default Kubernetes cluster network.

By default, this setting applies only to RWO (Read-Write-Once) volumes. For RWX (Read-Write-Many) volumes, see the [Storage Network For RWX Volume Enabled](#storage-network-for-rwx-volume-enabled) setting.

> **Warning:** Change this setting only after all Longhorn volumes are detached, because some pods that run Longhorn system components are recreated to apply the setting. When all volumes are detached, Longhorn attempts to restart all Instance Manager and Backing Image Manager pods immediately. When volumes are in use, Longhorn components are not restarted, and you must reconfigure the setting after detaching the remaining volumes. Alternatively, you can wait for the setting change to be reconciled, which happens hourly.

See [Storage Network](../../advanced-resources/deploy/storage-network) for details.

#### Storage Network For RWX Volume Enabled

> Default: `false`

This setting allows Longhorn to use the storage network for RWX volumes.

> **Warning:**
> Change this setting only after all Longhorn RWX volumes are detached, because some pods that run Longhorn components are recreated to apply the setting. When all RWX volumes are detached, Longhorn attempts to restart all CSI plugin pods immediately. When volumes are in use, pods that run Longhorn components are not restarted, so the setting must be reconfigured after the remaining volumes are detached. If you are unable to manually reconfigure the setting, you can wait for it to be applied, because settings are synchronized hourly.
>
> The RWX volumes are mounted with the storage network within the CSI plugin pod container network namespace. As a result, restarting the CSI plugin pod may lead to unresponsive RWX volume mounts. When this occurs, you must restart the workload pod to re-establish the mount connection. Alternatively, you can enable the [Automatically Delete Workload Pod when The Volume Is Detached Unexpectedly](#automatically-delete-workload-pod-when-the-volume-is-detached-unexpectedly) setting.

For more information, see [Storage Network](../../advanced-resources/deploy/storage-network).
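
As one possible way to change this setting from the command line, the following Python sketch builds a `kubectl patch` command string. It rests on several assumptions that are not confirmed by this page: that Longhorn is installed in the `longhorn-system` namespace, that settings are exposed as `settings.longhorn.io` resources with a top-level string `value` field, and that the resource name matches the anchor `storage-network-for-rwx-volume-enabled`. Verify these against your cluster before running the generated command.

```python
import json
import shlex

# Assumptions: default install namespace and a resource name derived
# from this setting's documentation anchor.
NAMESPACE = "longhorn-system"
SETTING_NAME = "storage-network-for-rwx-volume-enabled"


def patch_command(enabled: bool) -> str:
    """Build a kubectl merge-patch command that sets the setting value.

    The command is only returned as a string; nothing is executed.
    """
    # Assumption: the Setting resource stores its value as a string.
    body = json.dumps({"value": "true" if enabled else "false"})
    args = [
        "kubectl", "-n", NAMESPACE,
        "patch", "settings.longhorn.io", SETTING_NAME,
        "--type", "merge",
        "-p", body,
    ]
    # shlex.join quotes the JSON body so it is safe to paste into a shell.
    return shlex.join(args)


if __name__ == "__main__":
    print(patch_command(True))
```

As the warning above notes, the change takes effect immediately only when all RWX volumes are detached.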


#### Remove Snapshots During Filesystem Trim

> Example: `false`