Basic Linstor deployment #20

Draft · luissimas wants to merge 5 commits into main

Conversation

luissimas

Adds a Linstor playbook with the bare minimum for a development setup. This is needed to support the development of the Linstor integration in Incus, tracked in lxc/incus#564. The idea of this PR is both to make this automation available for developers working on the feature and to create a common place for discussing the integration of Linstor into incus-deploy. I wouldn't consider this production-ready, and I left some notes on things that I think could be improved.

For a production-ready setup we'd probably also want to set up SSL to encrypt both controller<->satellite and incus<->controller traffic. It's also worth noting that I'm using linstor physical-storage create-device-pool to create the underlying storage setup (VGs or zPools). While this makes it easy to support both LVM and ZFS in the playbook with no extra logic, it does not expose many options for configuring the underlying storage. Ideally we'd create the VGs and zPools manually, which would give the user the ability to configure them as needed through extra variables in the playbook.
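For illustration, a manual setup could look roughly like the sketch below, using the community.general LVM modules plus a plain linstor storage-pool create call. This is only a sketch of the idea, not code from this PR: the vg_linstor/thinpool names, the 100%FREE sizing and the controller-side registration loop are assumptions.

# Sketch only: create the VG and thin pool ourselves, then just register the
# result with Linstor instead of delegating to physical-storage create-device-pool.
- name: Create volume group on the Linstor disks
  community.general.lvg:
    vg: vg_linstor
    pvs: "{{ linstor_disks | map('regex_replace', '^', '/dev/disk/by-id/') | list }}"

- name: Create a thin pool spanning the whole VG
  community.general.lvol:
    vg: vg_linstor
    thinpool: thinpool
    size: 100%FREE

# Runs on the controller, mirroring how the playbook already issues linstor
# commands for every node.
- name: Register each node's thin pool as a Linstor storage pool
  command: >-
    linstor storage-pool create lvmthin
    {{ item }} {{ linstor_pool_name }} vg_linstor/thinpool
  loop: "{{ ansible_play_hosts }}"
  when: "'controller' in linstor_roles"

This would let extra variables (VG name, thin pool sizing, extent size, and so on) flow straight into the LVM/ZFS layer instead of being limited to what create-device-pool exposes.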

The resulting Linstor deployment is the following:

root@server01:~# linstor node list
╭─────────────────────────────────────────────────────────────╮
┊ Node     ┊ NodeType  ┊ Addresses                   ┊ State  ┊
╞═════════════════════════════════════════════════════════════╡
┊ server01 ┊ SATELLITE ┊ 10.172.117.141:3366 (PLAIN) ┊ Online ┊
┊ server02 ┊ SATELLITE ┊ 10.172.117.171:3366 (PLAIN) ┊ Online ┊
┊ server03 ┊ SATELLITE ┊ 10.172.117.123:3366 (PLAIN) ┊ Online ┊
╰─────────────────────────────────────────────────────────────╯
root@server01:~# linstor sp list
╭────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
┊ StoragePool          ┊ Node     ┊ Driver   ┊ PoolName                            ┊ FreeCapacity ┊ TotalCapacity ┊ CanSnapshots ┊ State ┊ SharedName                    ┊
╞════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════╡
┊ DfltDisklessStorPool ┊ server01 ┊ DISKLESS ┊                                     ┊              ┊               ┊ False        ┊ Ok    ┊ server01;DfltDisklessStorPool ┊
┊ DfltDisklessStorPool ┊ server02 ┊ DISKLESS ┊                                     ┊              ┊               ┊ False        ┊ Ok    ┊ server02;DfltDisklessStorPool ┊
┊ DfltDisklessStorPool ┊ server03 ┊ DISKLESS ┊                                     ┊              ┊               ┊ False        ┊ Ok    ┊ server03;DfltDisklessStorPool ┊
┊ incus                ┊ server01 ┊ LVM_THIN ┊ linstor_linstor-incus/linstor-incus ┊    39.91 GiB ┊     39.91 GiB ┊ True         ┊ Ok    ┊ server01;incus                ┊
┊ incus                ┊ server02 ┊ LVM_THIN ┊ linstor_linstor-incus/linstor-incus ┊    39.91 GiB ┊     39.91 GiB ┊ True         ┊ Ok    ┊ server02;incus                ┊
┊ incus                ┊ server03 ┊ LVM_THIN ┊ linstor_linstor-incus/linstor-incus ┊    39.91 GiB ┊     39.91 GiB ┊ True         ┊ Ok    ┊ server03;incus                ┊
╰────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯

We can then create a Resource Group that will eventually be consumed by Incus and spawn volumes from it. In this example I'm specifying --place-count 2, which means that Linstor will create two physical replicas and one diskless replica to reach quorum (a TieBreaker).

root@server01:~# linstor rg create incus-volumes --storage-pool incus --place-count 2
...
root@server01:~# linstor rg spawn incus-volumes vol1 10G
root@server01:~# linstor resource list
╭────────────────────────────────────────────────────────────────────────────────────────────╮
┊ ResourceName ┊ Node     ┊ Layers       ┊ Usage  ┊ Conns ┊      State ┊ CreatedOn           ┊
╞════════════════════════════════════════════════════════════════════════════════════════════╡
┊ vol1         ┊ server01 ┊ DRBD,STORAGE ┊ Unused ┊ Ok    ┊   UpToDate ┊ 2025-01-18 17:59:31 ┊
┊ vol1         ┊ server02 ┊ DRBD,STORAGE ┊ Unused ┊ Ok    ┊   UpToDate ┊ 2025-01-18 17:59:31 ┊
┊ vol1         ┊ server03 ┊ DRBD,STORAGE ┊ Unused ┊ Ok    ┊ TieBreaker ┊ 2025-01-18 17:59:29 ┊
╰────────────────────────────────────────────────────────────────────────────────────────────╯
root@server01:~# linstor volume list
╭────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
┊ Resource ┊ Node     ┊ StoragePool          ┊ VolNr ┊ MinorNr ┊ DeviceName    ┊ Allocated ┊ InUse  ┊      State ┊
╞════════════════════════════════════════════════════════════════════════════════════════════════════════════════╡
┊ vol1     ┊ server01 ┊ incus                ┊     0 ┊    1000 ┊ /dev/drbd1000 ┊  2.05 MiB ┊ Unused ┊   UpToDate ┊
┊ vol1     ┊ server02 ┊ incus                ┊     0 ┊    1000 ┊ /dev/drbd1000 ┊  2.05 MiB ┊ Unused ┊   UpToDate ┊
┊ vol1     ┊ server03 ┊ DfltDisklessStorPool ┊     0 ┊    1000 ┊ /dev/drbd1000 ┊           ┊ Unused ┊ TieBreaker ┊
╰────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯

Testing the setup

I've tested the deployment with the following setup. For testing purposes, I reduced the total number of servers to 3, removed Ceph from the deployment and assigned the two Ceph OSD disks on each machine to Linstor instead.

terraform/terraform.tfvars

# Incus variables
incus_remote       = "local"    # Name of the Incus remote to deploy on (see `incus remote list`)
incus_storage_pool = "default"  # Name of the storage pool to use for the VMs and volumes
incus_network      = "incusbr0" # Name of the network to use for the VMs

# OVN uplink configuration
ovn_uplink_ipv4_address = "172.31.254.1/24"
ovn_uplink_ipv6_address = "fd00:1e4d:637d:1234::1/64"

instance_names = ["server01", "server02", "server03"]

ansible/hosts.yaml

all:
  vars:
    ceph_fsid: "e2850e1f-7aab-472e-b6b1-824e19a75071"
    ceph_rbd_cache: "2048Mi"
    ceph_rbd_cache_max: "1792Mi"
    ceph_rbd_cache_target: "1536Mi"

    incus_name: "baremetal"
    incus_release: "stable"

    lvmcluster_name: "baremetal"

    ovn_name: "baremetal"
    ovn_az_name: "zone1"
    ovn_release: "ppa"

    linstor_pool_name: "incus"
    linstor_pool_driver: "lvmthin"
  children:
    baremetal:
      vars:
        ansible_connection: incus
        ansible_incus_remote: local
        ansible_user: root
        ansible_become: no
        ansible_incus_project: dev-incus-deploy

        incus_init:
          network:
            LOCAL:
              type: macvlan
              local_config:
                parent: enp5s0
              description: Directly attach to host networking
            UPLINK:
              type: physical
              config:
                ipv4.gateway: "172.31.254.1/24"
                ipv6.gateway: "fd00:1e4d:637d:1234::1/64"
                ipv4.ovn.ranges: "172.31.254.10-172.31.254.254"
                dns.nameservers: "1.1.1.1,1.0.0.1"
              local_config:
                parent: enp6s0
              description: Physical network for OVN routers
            default:
              type: ovn
              config:
                network: UPLINK
              default: true
              description: Initial OVN network
          storage:
            local:
              driver: zfs
              local_config:
                source: "/dev/disk/by-id/nvme-QEMU_NVMe_Ctrl_incus_disk3"
              description: Local storage pool
            # remote:
            #   driver: ceph
            #   local_config:
            #     source: "incus_{{ incus_name }}"
            #   description: Distributed storage pool (cluster-wide)
            shared:
              driver: lvmcluster
              local_config:
                lvm.vg_name: "vg0"
                source: "vg0"
              default: true
              description: Shared storage pool (cluster-wide)

        incus_roles:
          - cluster
          - ui

        lvmcluster_metadata_size: 100m
        lvmcluster_vgs:
          vg0: "/dev/disk/by-id/nvme-QEMU_NVMe_Ctrl_incus_disk4"

        ovn_roles:
          - host
      hosts:
        server01:
          linstor_disks:
            - nvme-QEMU_NVMe_Ctrl_incus_disk1
            - nvme-QEMU_NVMe_Ctrl_incus_disk2
          linstor_roles:
            - controller
            - satellite

          ovn_roles:
            - central
            - host
        server02:
          linstor_disks:
            - nvme-QEMU_NVMe_Ctrl_incus_disk1
            - nvme-QEMU_NVMe_Ctrl_incus_disk2
          linstor_roles:
            - satellite

          ovn_roles:
            - central
            - host
        server03:
          linstor_disks:
            - nvme-QEMU_NVMe_Ctrl_incus_disk1
            - nvme-QEMU_NVMe_Ctrl_incus_disk2
          linstor_roles:
            - satellite

          ovn_roles:
            - central
            - host

Exposes an instance_names variable to allow users to more easily change
the number of instances for the deployment.

Signed-off-by: Luís Simas <luissimas@protonmail.com>
Adds a new playbook for deploying linstor. The playbook installs the
needed packages for Linstor and the underlying storage utilities. A
storage pool is created on each node using the storage driver and disks
specified in the Ansible inventory.

Signed-off-by: Luís Simas <luissimas@protonmail.com>
Disables secureboot on instances to allow loading the DRBD kernel
modules when deploying Linstor.

Signed-off-by: Luís Simas <luissimas@protonmail.com>
Adds a new disk for instances to be consumed by Linstor.

Signed-off-by: Luís Simas <luissimas@protonmail.com>
Adds values to the Linstor configuration variables in the sample
inventory.

Signed-off-by: Luís Simas <luissimas@protonmail.com>
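
As a side note on the secure-boot commit: the out-of-tree DRBD modules are unsigned, so the VMs need secure boot disabled before they can load them. The effective Incus setting is the instance config key below, shown here as plain YAML for illustration; the actual change in this PR is made through the Terraform instance definitions.

# Equivalent Incus VM configuration (illustrative, as shown by incus config show);
# the PR sets this through Terraform.
config:
  security.secureboot: "false"
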
@winiciusallan left a comment

Nice job! Just some questions

@@ -0,0 +1,147 @@
---
- name: Linstor - Add package repository
  hosts: all

We could use a group declaration in hosts.yaml; that way we avoid checking whether the host is in linstor_roles, which makes the tasks much simpler. The docs have examples here[1]. Above the children group in hosts.yaml there would be a Linstor hosts declaration; a rough sketch of the idea follows below.

[1] https://docs.ansible.com/ansible/latest/inventory_guide/intro_inventory.html#hosts-in-multiple-groups
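
For illustration, such a group-based layout could look roughly like this; the group names are made up for the example, and plays would then target them directly (for example hosts: linstor_satellite) instead of filtering on linstor_roles:

# Hypothetical inventory sketch using groups instead of per-host linstor_roles.
all:
  children:
    linstor_controller:
      hosts:
        server01:
    linstor_satellite:
      hosts:
        server01:
        server02:
        server03: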

Member

We're not currently using groups, so it's probably best to just stick with the linstor_roles behavior for consistency for now.

There is some planned work to re-shuffle things quite a bit and use Ansible roles and other constructs, so that will come in later.


- name: Parse storage pools
  set_fact:
    satellites_without_storage_pools: >-

Is it necessary to check whether a satellite has no storage pool? Should nothing happen if the host already has a storage pool?

Author

The idea is to make the creation of the storage pool idempotent and ensure that we don't try to create it again when running the playbook multiple times. To be honest, I don't remember whether linstor physical-storage create-device-pool already takes care of that for us; I'll check tomorrow and come back with the results.

With that said, I think the current approach is quite naive and doesn't take factors like pre-existing storage pools into account. I basically replicated the logic used for adding the satellite nodes, but that is a simpler problem to solve.
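
For reference, the shape of that check is roughly the following. This is a simplified sketch rather than the exact tasks from the playbook: the substring match against the storage-pool listing, the loop over ansible_play_hosts and taking only the first disk are all simplifications.

# Sketch of the idempotency check: list the existing storage pools on the
# controller, then only create device pools for nodes that don't have one yet.
- name: List existing storage pools
  command: linstor storage-pool list
  register: linstor_sp_list
  changed_when: false
  when: "'controller' in linstor_roles"

- name: Create device pool on satellites that are missing it
  command: >-
    linstor physical-storage create-device-pool
    --storage-pool {{ linstor_pool_name }}
    --pool-name linstor-{{ linstor_pool_name }}
    {{ linstor_pool_driver }} {{ item }}
    /dev/disk/by-id/{{ hostvars[item].linstor_disks | first }}
  loop: "{{ ansible_play_hosts }}"
  when: >-
    'controller' in linstor_roles and
    item ~ ';' ~ linstor_pool_name not in linstor_sp_list.stdout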

Author

I tried removing the check and running the playbook twice, and we indeed got an error. So the check is needed to make the storage pool creation idempotent.

TASK [Create storage pool] ******************************************************************************************************************************************************************************************
skipping: [server03] => (item=server01)
skipping: [server03] => (item=server02)
skipping: [server03] => (item=server03)
skipping: [server03]
skipping: [server02] => (item=server01)
skipping: [server02] => (item=server02)
skipping: [server02] => (item=server03)
skipping: [server02]
failed: [server01] (item=server01) => {"ansible_loop_var": "item", "changed": false, "cmd": "linstor physical-storage create-device-pool --storage-pool incus --pool-name linstor-incus zfsthin server01 /dev/disk/by-id/nvme-QEMU_NVMe_Ctrl_incus_disk5", "delta": "0:00:00.276529", "end": "2025-01-22 14:25:53.877272", "item": "server01", "msg": "non-zero return code", "rc": 10, "start": "2025-01-22 14:25:53.600743", "stderr": "", "stderr_lines": [], "stdout": "\u001b[1;31mERROR:\n\u001b[0m    (Node: 'server01') Zpool name already used.", "stdout_lines": ["\u001b[1;31mERROR:", "\u001b[0m    (Node: 'server01') Zpool name already used."]}
failed: [server01] (item=server02) => {"ansible_loop_var": "item", "changed": false, "cmd": "linstor physical-storage create-device-pool --storage-pool incus --pool-name linstor-incus zfsthin server02 /dev/disk/by-id/nvme-QEMU_NVMe_Ctrl_incus_disk5", "delta": "0:00:00.273696", "end": "2025-01-22 14:25:54.892492", "item": "server02", "msg": "non-zero return code", "rc": 10, "start": "2025-01-22 14:25:54.618796", "stderr": "", "stderr_lines": [], "stdout": "\u001b[1;31mERROR:\n\u001b[0m    (Node: 'server02') Zpool name already used.", "stdout_lines": ["\u001b[1;31mERROR:", "\u001b[0m    (Node: 'server02') Zpool name already used."]}
failed: [server01] (item=server03) => {"ansible_loop_var": "item", "changed": false, "cmd": "linstor physical-storage create-device-pool --storage-pool incus --pool-name linstor-incus zfsthin server03 /dev/disk/by-id/nvme-QEMU_NVMe_Ctrl_incus_disk5", "delta": "0:00:00.298110", "end": "2025-01-22 14:25:55.994142", "item": "server03", "msg": "non-zero return code", "rc": 10, "start": "2025-01-22 14:25:55.696032", "stderr": "", "stderr_lines": [], "stdout": "\u001b[1;31mERROR:\n\u001b[0m    (Node: 'server03') Zpool name already used.", "stdout_lines": ["\u001b[1;31mERROR:", "\u001b[0m    (Node: 'server03') Zpool name already used."]}

NO MORE HOSTS LEFT **************************************************************************************************************************************************************************************************

PLAY RECAP **********************************************************************************************************************************************************************************************************
server01                   : ok=18   changed=0    unreachable=0    failed=1    skipped=6    rescued=0    ignored=0
server02                   : ok=12   changed=0    unreachable=0    failed=0    skipped=13   rescued=0    ignored=0
server03                   : ok=12   changed=0    unreachable=0    failed=0    skipped=13   rescued=0    ignored=0
