This repository has been archived by the owner on Mar 9, 2021. It is now read-only.

Added initial creator for loopback based files #18

Open · wants to merge 8 commits into base: master

Changes from 7 commits
25 changes: 25 additions & 0 deletions blockstore/README.md
@@ -0,0 +1,25 @@
# Overview

This repo contains the files necessary to use files stored on Gluster volumes as
loop-mounted, XFS-formatted persistent volumes in Kubernetes and OpenShift. It
consists of three main items:

1. `flex-volume`
This is a flex volume plugin that allows loop-mounting Gluster files as XFS
based mounts into containers.
2. `creator`
This is a script that can be run on a Gluster server to pre-create the
files and format them with XFS.
3. `pv-recycler-pod`
This is a pod that runs in the cluster and watches for PVs that get released.
It deletes the file backing the loop device, recreates a fresh file for PV
reuse, and marks the PV as available.

Additionally, the `test` directory contains tests that help validate the scripts
and future changes to them.

---
# License

This code is licensed under Apache v2.0 with the exception of JQ which is
covered by the MIT license (see [COPYING_JQ](glfs-subvol/COPYING_JQ)).
76 changes: 76 additions & 0 deletions blockstore/creator/README.md
@@ -0,0 +1,76 @@
# Creating sub-volume block device PVs

The script in this directory is used to pre-create files within a Gluster volume
that will be used as backing storage for PVs. Each file is formatted with an XFS
filesystem and is used as a block device on the target node. The script is
designed to bulk-create the files as well as generate a yaml file that can be
passed to `kubectl` to create the actual PV entries.

## Sub-volume structure

The script creates a top-level directory named `blockstore` within the provided
Gluster volume so that these files live in their own namespace inside the volume.

The script uses a 2-level directory structure, with each level having a two
hex-digit name; within the leaf directory it creates a file named with the full
4 hex-digit index. This permits up to 65536 total PVs to be created from a
single volume while keeping the number of entries in each directory manageable.

The script refers to these subdirs via a numeric index (0 - 65535), which is
mapped to a directory name by converting the index to a 4-digit hex number and
dividing it into path components. For example, index 20000 maps to:
`20000 == 0x4e20 ==> /4e/20`,
within which a (sparse) file named `4e20` is created and formatted as an XFS
filesystem.
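
A minimal shell sketch of this index-to-path mapping (mirroring the
`tohexpath`/`tohexname` helpers in `creator.sh`):
```sh
#!/bin/bash
# Map a numeric index (0-65535) to the 2-level hex directory and file name.
index=20000
printf -v subdir '%02x/%02x' $((index / 256)) $((index % 256))    # -> 4e/20
printf -v blockfile '%02x%02x' $((index / 256)) $((index % 256))  # -> 4e20
echo "directory: ${subdir}  file: ${blockfile}"
```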

## Usage

The `creator.sh` script needs to be run from one of the Gluster server nodes
because it makes modifications to the underlying volume configuration using the
`gluster` command.

NOTE: As of now the script does not change any Gluster configuration, but the
limitation is retained because it may do so in the future (at which point this
note will be removed).

The following walkthrough takes an existing, empty Gluster volume named
`testvol` and pre-creates 1000 files for use as PVs, each sized to hold
1GiB of data. The Gluster servers are 192.168.173.[15-17].

Start by mounting the volume on any server:
```sh
$ sudo mkdir /mnt/data
$ sudo mount -t glusterfs 192.168.173.15:/testvol /mnt/data
```

Run the creator script:
```sh
$ sudo ./creator.sh 192.168.173.15:192.168.173.16:192.168.173.17 testvol /mnt/data 1 0 999
```

The script will:
* Create a top-level directory named `blockstore`
* Create directories `00/00` through `03/e7` (under `/mnt/data/blockstore`)
* Create a 1GiB sparse file, named `0000` through `03e7`, in each directory above
* Write the yaml PV descriptions for the volumes into `/mnt/data/blockstore/pvs-0-999.yml`
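
To spot-check the layout (a sketch using the first index; the files are sparse,
so their apparent size and actual disk usage differ):
```sh
$ ls -lh /mnt/data/blockstore/00/00/0000
$ du -h --apparent-size /mnt/data/blockstore/00/00/0000   # reported size: ~1GiB
$ du -h /mnt/data/blockstore/00/00/0000                   # actual blocks allocated (much smaller)
```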

The yaml file can then be applied to create the corresponding PVs:
```sh
$ kubectl apply -f pvs-0-999.yml
```
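
To verify the PVs were registered (a sketch; the `supervol` label and the
`supervol-uuid` file are both created by `creator.sh`):
```sh
$ kubectl get pv -l supervol="$(cat /mnt/data/blockstore/supervol-uuid)"
```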

The Gluster volume may be unmounted:
```sh
$ sudo umount /mnt/data
$ sudo rmdir /mnt/data
```

## Note on gluster-block-subvol-sc.yml

This is a convenience file. It is intended for an OpenShift or Kubernetes
environment where it is desired that gluster-block-subvol be made the default
storage class. To make gluster-block-subvol the default storage class, assuming
the PVs have already been created, use:
```sh
$ kubectl apply -f gluster-block-subvol-sc.yml
```
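
Alternatively, if the StorageClass already exists without the annotation, it can
be marked as the default afterwards (a sketch using the standard Kubernetes
default-class annotation):
```sh
$ kubectl patch storageclass gluster-block-subvol \
    -p '{"metadata": {"annotations": {"storageclass.kubernetes.io/is-default-class": "true"}}}'
```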
142 changes: 142 additions & 0 deletions blockstore/creator/creator.sh
@@ -0,0 +1,142 @@
#! /bin/bash
# vim: set ts=4 sw=4 et :

# Copyright 2018 Red Hat, Inc. and/or its affiliates.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

blockstore_base="blockstore"

function usage() {
    echo "Usage: $0 <server1:server2:...> <volume> <base_path> <quota_in_GB> <start> <end>"
    echo "    0 <= start <= end <= 65535"
}

function tohexpath() {
    local -i l1=$1/256
    local -i l2=$1%256
    printf '%02x/%02x' "$l1" "$l2"
}

function tohexname() {
    local -i l1=$1/256
    local -i l2=$1%256
    printf '%02x%02x' "$l1" "$l2"
}

function mkPvTemplate() {
    local servers=$1
    local volume=$2
    local subdir=$3
    local blockfile=$4
    local capacity=$5
    local uuid=$6

    local pv_name
    pv_name=$(echo "gluster-block-${uuid}-${subdir}-${blockfile}" | tr '/' '-')
    cat - << EOT
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: "$pv_name"
  labels:
    cluster: "$(echo "$servers" | tr ':' '-')"
    volume: "$volume"
    subdir: "$(echo "${blockstore_base}/${subdir}" | tr '/' '-')"
    supervol: "$uuid"
spec:
  capacity:
    storage: $capacity
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: gluster-block-subvol
  flexVolume:
    driver: "rht/glfs-block-subvol"
    options:
      cluster: "$servers"
      volume: "$volume"
      dir: "${blockstore_base}/${subdir}"
      file: "$blockfile"
EOT
}


servers=$1
volume_name=$2
base_path_in=$3
volsize_gb=$4
declare -i i_start=$5
declare -i i_end=$6

declare -i i=$i_start

if [ $# -ne 6 ]; then usage; exit 1; fi
if [ "$i" -lt 0 ]; then usage; exit 1; fi
if [ "$i" -gt "$i_end" ]; then usage; exit 1; fi
if [ "$i_end" -gt 65535 ]; then usage; exit 1; fi

base_path="${base_path_in}/${blockstore_base}"
if [ ! -d "${base_path}" ]; then
    if ! mkdir "${base_path}"; then
        echo "Unable to create $base_path"
        exit 2
    fi
fi

if [ ! -f "${base_path}/supervol-uuid" ]; then
    uuidgen -r > "${base_path}/supervol-uuid"
fi
supervol_uuid=$(cat "${base_path}/supervol-uuid")

if [ -f "${base_path}/pvs-${i_start}-${i_end}.yml" ]; then
    rm "${base_path}/pvs-${i_start}-${i_end}.yml"
fi

while [ "$i" -le "$i_end" ]; do
    subdir=$(tohexpath "$i")
    dir="${base_path}/${subdir}"
    echo "creating: ${dir} (${i}/${i_end})"
    if ! mkdir -p "$dir"; then
        echo "Unable to create $dir"
        exit 2
    fi
    blockfile=$(tohexname "$i")
    blockfqpath="${base_path}/${subdir}/${blockfile}"
    # The file should not already exist; do not touch existing devices here!
    if [ -f "$blockfqpath" ]; then
        echo "Found an existing device file at ${blockfile}; skipping device creation"
        ((++i))
        continue
    fi
    if ! touch "$blockfqpath"; then
        echo "Unable to create file ${blockfile}"
        exit 2
    fi
    # Create a sparse file of the required volume size
    if ! dd bs=1 count=1 if=/dev/zero of="${blockfqpath}" seek="$((volsize_gb * 1024 * 1024 * 1024))" status=none; then
        echo "Error in dd to ${blockfile}"
        exit 2
    fi
    # Format the file with XFS
    if ! mkfs.xfs -q "${blockfqpath}"; then
        echo "mkfs.xfs failed for ${blockfqpath}"
        exit 2
    fi

    mkPvTemplate "$servers" "$volume_name" "$subdir" "$blockfile" "${volsize_gb}Gi" "$supervol_uuid" >> "${base_path}/pvs-${i_start}-${i_end}.yml"
    ((++i))
done

exit 0
9 changes: 9 additions & 0 deletions blockstore/creator/gluster-block-subvol-sc.yml
@@ -0,0 +1,9 @@
---

kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: gluster-block-subvol
  annotations:
    storageclass.kubernetes.io/is-default-class: "true"
Member:
Probably don't want this. If a user wants this as the default storageclass, they can patch it once it's been added. Having it automatically set itself as cluster default raises the possibility of accidentally having 2 that are set as default. In that case, neither acts as default and default provisioning breaks.

Contributor Author:
This yml is a helper/reminder for anyone who wants to add this as the default, right? Normally we do not need to add this as a storage class, as we use the Flexvol scheme. Isn't that right?

Assuming so, I would assume we leave this as is, for interested users to set this as a default. If my assumptions are wrong, then I would need to change it.

Member:
Yes good point. Since we use static provisioning, there is no need for the StorageClass object unless one wants it to become the default class. (not related to flex)

provisioner: none
84 changes: 84 additions & 0 deletions blockstore/flex-volume/README.md
@@ -0,0 +1,84 @@
# Installation of flex volume plugin

This is a flex volume plugin that needs to be installed on each Kubernetes node.
Included in this directory is an ansible playbook (`install_plugin.yml`) that
performs the install. This playbook:
* Creates the directory for the plugin:
`/usr/libexec/kubernetes/kubelet-plugins/volume/exec/rht~glfs-block-subvol`
* Copies the plugin script `glfs-block-subvol` into that directory.

Upon first install, it may be necessary to restart kubelet for it to find the
plugin.
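
A minimal sketch of running the playbook (the inventory file and `nodes` host
group are assumptions, not part of this repo; the kubelet service name may
differ per distribution):
```sh
# Run the install playbook against all Kubernetes nodes in the inventory.
ansible-playbook -i inventory install_plugin.yml

# On first install, restart the kubelet on each node so it discovers the plugin.
ansible nodes -i inventory -b -m service -a "name=kubelet state=restarted"
```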

# Usage
To use the plugin, include the following as a volume description.
```yaml
flexVolume:
  driver: "rht/glfs-block-subvol"
  options:
    cluster: 192.168.173.15:192.168.173.16:192.168.173.17
    volume: "testvol"
    dir: "00/01"
    file: "0001"
```
The required options for the driver are:
* `cluster`: A colon separated list of the Gluster nodes in the cluster. The
first will be used as the primary for mounting, and the rest will be listed as
backup volume servers.
* `volume`: This is the name of the large Gluster volume that is being
subdivided.
* `dir`: This is the path, from the root of the volume, to the subdirectory
containing the file that is loop-mounted to provide the volume.
* `file`: This is the name of the file within `dir` that is loop-mounted as an
XFS filesystem to back the volume for the claim.

The above example would use 192.168.173.15:/testvol/00/01/0001 to hold the PV
contents.
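
For manual testing, the plugin can also be invoked directly using the standard
flex volume call convention (a sketch; the target mount directory and option
values below are placeholders):
```sh
cd /usr/libexec/kubernetes/kubelet-plugins/volume/exec/rht~glfs-block-subvol

./glfs-block-subvol init
./glfs-block-subvol mount /mnt/test-target \
  '{"cluster":"192.168.173.15:192.168.173.16:192.168.173.17","volume":"testvol","dir":"00/01","file":"0001"}'
./glfs-block-subvol unmount /mnt/test-target
```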

# Diagnostics/debugging
The `glfs-block-subvol` script has logging for all of its actions to help
diagnose problems with the plugin. The logging settings are at the top of the
script file:
```sh
# if DEBUG, log everything to a file as we do it
DEBUG=1
DEBUGFILE='/tmp/glfs-block-subvol.out'
```
When `DEBUG` is `1`, all calls and actions taken by the plugin are logged to
`DEBUGFILE`. The following is an example of the log file:
```
[1520361740.373690279] > init
[1520361740.373690279] < 0 {"status": "Success", "capabilities": {"attach": false, "selinuxRelabel": false}}
[1520361740.405577771] > mount /mnt/pods/00/volumes/vol1 {"cluster":"127.0.0.1:127.0.0.1","dir":"blockstore/00/00","volume":"patchy","file":"0000"}
[1520361740.405577771] volserver 127.0.0.1
[1520361740.405577771] backupservers 127.0.0.1
[1520361740.405577771] Using lockfile: /var/lock/glfs-block-subvol/127.0.0.1-patchy.lock
[1520361740.405577771] ! mount -t glusterfs -o backup-volfile-servers=127.0.0.1 127.0.0.1:/patchy /mnt/script-dir/mnt/blockstore/127.0.0.1-patchy
[1520361740.405577771] ! mount /mnt/script-dir/mnt/blockstore/127.0.0.1-patchy/blockstore/00/00/0000 /mnt/pods/00/volumes/vol1 -t xfs -o loop,discard
[1520361740.405577771] < 0 {"status": "Success", "message": "volserver=127.0.0.1 backup=127.0.0.1 volume=patchy mountpoint=/mnt/script-dir/mnt/blockstore/127.0.0.1-patchy bindto=/mnt/pods/00/volumes/vol1"}
[1520361740.849832326] > unmount /mnt/pods/00/volumes/vol1
[1520361740.849832326] ldevice=/dev/loop0
[1520361740.849832326] ldevicefile=/mnt/script-dir/mnt/blockstore/127.0.0.1-patchy/blockstore/00/00/0000
[1520361740.849832326] gdevicedir=/mnt/script-dir/mnt/blockstore/127.0.0.1-patchy
[1520361740.849832326] mntsuffix=127.0.0.1-patchy
[1520361740.849832326] ! umount /mnt/pods/00/volumes/vol1
[1520361740.849832326] Using lockfile: /var/lock/glfs-block-subvol/127.0.0.1-patchy.lock
[1520361740.849832326] /mnt/script-dir/mnt/blockstore/127.0.0.1-patchy has 0 loop mounted files
[1520361740.849832326] We were last user of /mnt/script-dir/mnt/blockstore/127.0.0.1-patchy; unmounting it.
[1520361740.849832326] ! umount /mnt/script-dir/mnt/blockstore/127.0.0.1-patchy
[1520361740.849832326] ! rmdir /mnt/script-dir/mnt/blockstore/127.0.0.1-patchy
[1520361740.849832326] < 0 {"status": "Success", "message": "Unmounting from /mnt/pods/00/volumes/vol1"}
```

In the log file, each line begins with a timestamp, and the timestamp remains
constant for the length of the execution of the script. The purpose is to allow
multiple, overlapping invocations to be teased apart. The second (optional)
field is a single character.
* Lines with ">" are logs of the script's invocation arguments.
* Lines with "<" are the script's output back to the driver.
* Lines with "!" are external command invocations made by the script.
* Lines without one of these characters are free-form diagnostic messages.
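
Since the timestamp stays fixed for the life of one invocation, a single
invocation can be isolated from an interleaved log with a simple filter, for
example:
```sh
# Show only the lines belonging to the invocation stamped 1520361740.405577771
grep -F '[1520361740.405577771]' /tmp/glfs-block-subvol.out
```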

In the event that the logging generates too much output, it can be disabled by
setting `DEBUG` to `0`. However, when changing this value, be careful to update
the script in an atomic fashion if the node is currently in use.
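
One way to do this atomically (a sketch; `mv` within the same filesystem
replaces the file in a single step, so the kubelet never sees a partially
written script):
```sh
plugin=/usr/libexec/kubernetes/kubelet-plugins/volume/exec/rht~glfs-block-subvol/glfs-block-subvol

# Edit a copy, then atomically swap it into place.
sed 's/^DEBUG=1/DEBUG=0/' "$plugin" > "${plugin}.new"
chmod 755 "${plugin}.new"
mv "${plugin}.new" "$plugin"
```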