From 0a12de00900051bfcefbdb9204fd6c22ab887307 Mon Sep 17 00:00:00 2001
From: Chin-Ya Huang
Date: Mon, 3 Jun 2024 14:40:59 +0800
Subject: [PATCH] doc(storage-network): support RWX volumes

longhorn/longhorn-8184

Signed-off-by: Chin-Ya Huang
---
 .../deploy/storage-network.md                 | 15 ++++++++++++---
 .../1.7.0/deploy/important-notes/index.md     | 16 ++++++++++++++++
 content/docs/1.7.0/references/settings.md     | 19 ++++++++++++++++++-
 3 files changed, 46 insertions(+), 4 deletions(-)

diff --git a/content/docs/1.7.0/advanced-resources/deploy/storage-network.md b/content/docs/1.7.0/advanced-resources/deploy/storage-network.md
index 175f96131..7281debe5 100644
--- a/content/docs/1.7.0/advanced-resources/deploy/storage-network.md
+++ b/content/docs/1.7.0/advanced-resources/deploy/storage-network.md
@@ -42,7 +42,16 @@ Set the setting [Storage Network](../../../references/settings#storage-network).
 >
 > Longhorn is not aware of the updates. Hence this will cause malfunctioning and error. Instead, you can create a new NetworkAttachmentDefinition custom resource and update it to the setting.
 
-## History
-[Original Feature Request](https://github.com/longhorn/longhorn/issues/2285)
+### Setting Storage Network For RWX Volumes
 
-Available since v1.3.0
+Configure the setting [Storage Network For RWX Volume Enabled](../../../references/settings#storage-network-for-rwx-volume-enabled).
+
+# Limitation
+
+When an RWX volume is created with the storage network, the NFS mount point connection must be re-established when the CSI plugin pod restarts. Longhorn provides the [Automatically Delete Workload Pod when The Volume Is Detached Unexpectedly](../../../references/settings#automatically-delete-workload-pod-when-the-volume-is-detached-unexpectedly) setting, which automatically deletes RWX volume workload pods when the CSI plugin pod restarts. However, the workload pod's NFS mount point could become unresponsive when the setting is disabled or the pod is not managed by a controller. 
In such cases, you must manually restart the CSI plugin pod.
+
+For more information, see [Storage Network Support for Read-Write-Many (RWX) Volumes](../../../deploy/important-notes/#storage-network-support-for-read-write-many-rwx-volumes) in Important Notes.
+
+# History
+- [Original Feature Request (since v1.3.0)](https://github.com/longhorn/longhorn/issues/2285)
+- [[FEATURE] Support storage network for RWX volumes (since v1.7.0)](https://github.com/longhorn/longhorn/issues/8184)
diff --git a/content/docs/1.7.0/deploy/important-notes/index.md b/content/docs/1.7.0/deploy/important-notes/index.md
index 0ebb9d689..8730c360b 100644
--- a/content/docs/1.7.0/deploy/important-notes/index.md
+++ b/content/docs/1.7.0/deploy/important-notes/index.md
@@ -246,3 +246,19 @@ The attribute `backendStoreDriver`, which is defined in the parameters of Storag
 ### Updating the Linux Kernel on Longhorn Nodes
 
 Host machines with Linux kernel 5.15 may unexpectedly reboot when volume-related IO errors occur. Update the Linux kernel on Longhorn nodes to version 5.19 or later to prevent such issues. For more information, see [Prerequisites](../../v2-data-engine/prerequisites/).
+
+### Storage Network Support for Read-Write-Many (RWX) Volumes
+
+Starting with Longhorn v1.7.0, the [storage network](../../advanced-resources/deploy/storage-network/) supports RWX volumes. However, the network's reliance on Multus results in a significant restriction.
+
+Multus networks operate within the Kubernetes network namespace, so Longhorn can mount NFS endpoints only within the CSI plugin pod container network namespace. Consequently, NFS mount connections to the Share Manager pod become unresponsive when the CSI plugin pod restarts. This occurs because the namespace in which the connection was established is no longer available.
+
+Longhorn circumvents this restriction by providing the following settings:
+- [Storage Network For RWX Volume Enabled](../../references/settings#storage-network-for-rwx-volume-enabled): When this setting is disabled, the storage network applies only to RWO volumes. The NFS client for RWX volumes is mounted over the cluster network in the host network namespace. This means that restarting the CSI plugin pod does not affect the NFS mount connections.
+- [Automatically Delete Workload Pod when The Volume Is Detached Unexpectedly](../../references/settings#automatically-delete-workload-pod-when-the-volume-is-detached-unexpectedly): When RWX volumes are created over the storage network, this setting actively deletes RWX volume workload pods when the CSI plugin pod restarts. This allows the pods to be remounted and prevents dangling mount entries.
+
+You can upgrade clusters with pre-existing RWX volume workloads to Longhorn v1.7.0. During and after the upgrade, the workload pods are not expected to be interrupted because the NFS share connection uses the cluster IP, which remains valid in the host network namespace.
+
+To apply the storage network to existing RWX volumes, you must detach the volumes, enable the [Storage Network For RWX Volume Enabled](../../references/settings#storage-network-for-rwx-volume-enabled) setting, and then reattach the volumes.
+
+For more information, see [Issue #8184](https://github.com/longhorn/longhorn/issues/8184).
diff --git a/content/docs/1.7.0/references/settings.md b/content/docs/1.7.0/references/settings.md
index 7cce23399..d77fde94c 100644
--- a/content/docs/1.7.0/references/settings.md
+++ b/content/docs/1.7.0/references/settings.md
@@ -151,7 +151,7 @@ If disabled, Longhorn will not delete the workload pod that is managed by a cont
 
 > **Note:** This setting doesn't apply to below cases.
 > - The workload pods don't have a controller; Longhorn never deletes them.
-> - The volumes used by workloads are RWX, because the Longhorn share manager, which provides the RWX NFS service, has its own resilience mechanism to ensure availability until the volume gets reattached without relying on the pod lifecycle to trigger volume reattachment. For details, see [here](../../nodes-and-volumes/volumes/rwx-volumes).
+> - Workload pods with *cluster network* RWX volumes. The setting does not apply to such pods because the Longhorn Share Manager, which provides the RWX NFS service, has its own resilience mechanism. This mechanism ensures availability until the volume is reattached without relying on the pod lifecycle to trigger volume reattachment. The setting does apply, however, to workload pods with *storage network* RWX volumes. For more information, see [ReadWriteMany (RWX) Volume](../../nodes-and-volumes/volumes/rwx-volumes) and [Storage Network](../../advanced-resources/deploy/storage-network#limitation).
 
 #### Automatic Salvage
 
@@ -346,6 +346,7 @@ This information will help us gain insights how Longhorn is being used, which wi
  - Snapshot Data Integrity
  - Snapshot DataIntegrity Immediate Check After Snapshot Creation
  - Storage Minimal Available Percentage
+ - Storage Network For RWX Volume Enabled
  - Storage Over Provisioning Percentage
  - Storage Reserved Percentage For Default Disk
  - Support Bundle Failed History Limit
@@ -911,10 +912,26 @@ See [Kubernetes Cluster Autoscaler Support](../../high-availability/k8s-cluster-
 
 The storage network uses Multus NetworkAttachmentDefinition to segregate the in-cluster data traffic from the default Kubernetes cluster network.
 
+By default, this setting applies only to RWO (Read-Write-Once) volumes. For RWX (Read-Write-Many) volumes, see the [Storage Network For RWX Volume Enabled](#storage-network-for-rwx-volume-enabled) setting.
+
 > **Warning:** This setting should change after all Longhorn volumes are detached because some pods that run Longhorn system components are recreated to apply the setting. When all volumes are detached, Longhorn attempts to restart all Instance Manager and Backing Image Manager pods immediately. When volumes are in use, Longhorn components are not restarted, and you need to reconfigure the settings after detaching the remaining volumes; otherwise, you can wait for the setting change to be reconciled in an hour.
 
 See [Storage Network](../../advanced-resources/deploy/storage-network) for details.
 
+#### Storage Network For RWX Volume Enabled
+
+> Default: `false`
+
+This setting allows Longhorn to use the storage network for RWX volumes.
+
+> **Warning:**
+> This setting should be changed only after all Longhorn RWX volumes are detached because some pods that run Longhorn components are recreated to apply the setting. When all RWX volumes are detached, Longhorn attempts to restart all CSI plugin pods immediately. When volumes are in use, pods that run Longhorn components are not restarted, so the settings must be reconfigured after the remaining volumes are detached. If you are unable to manually reconfigure the settings, you can opt to wait because settings are synchronized hourly.
+>
+> The RWX volumes are mounted with the storage network within the CSI plugin pod container network namespace. As a result, restarting the CSI plugin pod may lead to unresponsive RWX volume mounts. When this occurs, you must restart the workload pod to re-establish the mount connection. Alternatively, you can enable the [Automatically Delete Workload Pod when The Volume Is Detached Unexpectedly](#automatically-delete-workload-pod-when-the-volume-is-detached-unexpectedly) setting.
+
+For more information, see [Storage Network](../../advanced-resources/deploy/storage-network).
+
+
 #### Remove Snapshots During Filesystem Trim
 
 > Example: `false`