This repository has been archived by the owner on Mar 9, 2021. It is now read-only.
Added initial creator for loopback based files #18
Open: ShyamsundarR wants to merge 8 commits into gluster:master from ShyamsundarR:lodev-subvols
Commits
03d3c1d Added initial creator for loopback based files (ShyamsundarR)
bb0e887 Added loopback device based block flexvol provider (ShyamsundarR)
4a7088a Moved block based flex vol work to its own directory (ShyamsundarR)
75d03e1 blockstore: added recycler shell script (ShyamsundarR)
ecba25e Added Dockerfile to create block recycler image (ShyamsundarR)
36fa814 Addressed review comments and fixed failures during testing (ShyamsundarR)
169d53b Addressed various TODOs and added documentation (ShyamsundarR)
2871dfe Tested and updated recycler scripts (ShyamsundarR)
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,25 @@ | ||
# Overview | ||
|
||
This repo contains files necessary to use files in Gluster volumes as loop | ||
mounted XFS formatted persistent volumes in Kubernetes and OpenShift. It | ||
consists of three main items: | ||
|
||
1. `flex-volume` | ||
This is a flex volume plugin to allow loop mounting Gluster files as XFS | ||
based mounts into containers. | ||
2. `creator` | ||
This is a script that can be run on a Gluster server to pre-create the | ||
files and format them with XFS. | ||
3. `pv-recycler-pod` | ||
This is a pod that is run in the cluster to watch for PVs that get released. | ||
It deletes the files used as a loop device, recreates a fresh file for PV | ||
reuse and marks the PV as available. | ||
|
||
In addition, the `test` directory contains tests that help validate the scripts | ||
and future changes to them. | ||
|
||
--- | ||
# License | ||
|
||
This code is licensed under Apache v2.0 with the exception of JQ which is | ||
covered by the MIT license (see [COPYING_JQ](glfs-subvol/COPYING_JQ)). |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,76 @@ | ||
# Creating sub-volume block device PVs | ||
|
||
The script in this directory is used to pre-create files within a Gluster volume | ||
that will be used as storage for PVs. The files are formatted as an XFS | ||
filesystem and are used as block devices on the target nodes. The script is | ||
designed to bulk-create the files and to generate a YAML file that can be | ||
passed to `kubectl` to create the actual PV entries. | ||
|
||
## Sub-volume structure | ||
|
||
The script creates a higher level directory named `blockstore`, within the | ||
provided Gluster volume, to separate the namespace within the Gluster volume. | ||
|
||
The script uses a two-level directory structure in which each level has a | ||
two-hex-digit name, and within the innermost directory it creates a file with a | ||
four-hex-digit name. This permits up to 65536 total PVs to be created from a | ||
single volume while also keeping individual file size manageable. | ||
|
||
The script refers to these subdirs via a numeric index (0 - 65535) which is then | ||
mapped to a directory name by converting to a 4-digit hex number and dividing | ||
into path components. For example, index 20000 maps to the directory | ||
`20000 == 0x4e20 ==> /4e/20`, | ||
within which a (sparse) file named `4e20` is created and formatted as an | ||
XFS filesystem. | ||
|
||
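The index-to-path mapping described above can be sketched in a few lines of shell. This is an illustrative sketch rather than the script's own code; the function name `index_to_blockfile` is hypothetical.

```sh
#!/bin/sh
# Hypothetical helper: map a PV index (0-65535) to the relative path of its
# backing file, for example 20000 -> 4e/20/4e20.
index_to_blockfile() {
    idx=$1
    if [ "$idx" -lt 0 ] || [ "$idx" -gt 65535 ]; then
        echo "index out of range: $idx" >&2
        return 1
    fi
    # The high byte and low byte become the two directory levels; the file
    # name is the full four-digit hex value.
    printf '%02x/%02x/%04x\n' "$((idx / 256))" "$((idx % 256))" "$idx"
}

index_to_blockfile 20000   # prints 4e/20/4e20
index_to_blockfile 999     # prints 03/e7/03e7
```

The paths are relative to the `blockstore` directory inside the Gluster volume.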
## Usage | ||
|
||
The `creator.sh` script needs to be run from one of the Gluster server nodes | ||
because it makes modifications to the underlying volume configuration using the | ||
`gluster` command. | ||
|
||
NOTE: As of now the script does not change any gluster configuration, but the | ||
limitation is retained, as it may in the future (at which point this note | ||
will be removed) | ||
|
||
The following walkthrough will take an existing, empty Gluster volume named | ||
`testvol` and pre-create 1000 files for use as PVs, with each designed | ||
to hold 1GiB of data. The Gluster servers are 192.168.173.[15-17]. | ||
|
||
Start by mounting the volume on any server: | ||
```sh | ||
$ sudo mkdir /mnt/data | ||
$ sudo mount -tglusterfs 192.168.173.15:/testvol /mnt/data | ||
``` | ||
|
||
Run the creator script: | ||
```sh | ||
$ sudo ./creator.sh 192.168.173.15:192.168.173.16:192.168.173.17 testvol /mnt/data 1 0 999 | ||
``` | ||
|
||
The script will: | ||
* Create a top-level directory named `blockstore` | ||
* Create directories `/00/00` through `/03/e7` (via `/mnt/data/blockstore`) | ||
* Create a sparse file of size 1GiB in each directory above named 0000 through 03e7 | ||
* Write the yaml PV description for the volumes into `/mnt/data/blockstore/pvs-0-999.yml` | ||
|
||
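To see what a sparse backing file looks like before it is formatted, the following sketch creates one and compares its apparent size with the space actually allocated. It uses `truncate` for brevity, which is an assumption of this example (`creator.sh` itself uses `dd`); the function name and the temporary path are illustrative only.

```sh
#!/bin/sh
# Hypothetical helper: create a sparse file of the given size in GiB.
make_sparse_file() {
    path=$1
    size_gib=$2
    truncate -s "$((size_gib * 1024 * 1024 * 1024))" "$path"
}

tmpfile=$(mktemp)
make_sparse_file "$tmpfile" 1
# Show the apparent size in bytes and the number of blocks allocated on disk.
stat -c '%s bytes apparent, %b blocks allocated' "$tmpfile"
rm -f "$tmpfile"
```

The `stat -c` format used here is the GNU coreutils form, so the sketch assumes a Linux host such as a Gluster server node.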
The yaml file can then be applied to create the corresponding PVs: | ||
```sh | ||
$ kubectl apply -f /mnt/data/blockstore/pvs-0-999.yml | ||
``` | ||
|
||
The Gluster volume may be unmounted: | ||
```sh | ||
$ sudo umount /mnt/data | ||
$ sudo rmdir /mnt/data | ||
``` | ||
|
||
## Note on gluster-block-subvol-sc.yml | ||
|
||
This is a convenience file. It is intended for an OpenShift or Kubernetes | ||
environment in which gluster-block-subvol should be the default storage class. | ||
To make gluster-block-subvol the default storage class, assuming that the PVs | ||
have been created, use: | ||
```sh | ||
$ kubectl apply -f gluster-block-subvol-sc.yml | ||
``` |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,142 @@ | ||
#! /bin/bash | ||
# vim: set ts=4 sw=4 et : | ||
|
||
# Copyright 2018 Red Hat, Inc. and/or its affiliates. | ||
# | ||
# Licensed under the Apache License, Version 2.0 (the "License"); | ||
# you may not use this file except in compliance with the License. | ||
# You may obtain a copy of the License at | ||
# | ||
# http://www.apache.org/licenses/LICENSE-2.0 | ||
# | ||
# Unless required by applicable law or agreed to in writing, software | ||
# distributed under the License is distributed on an "AS IS" BASIS, | ||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
# See the License for the specific language governing permissions and | ||
# limitations under the License. | ||
|
||
blockstore_base="blockstore" | ||
|
||
function usage() { | ||
    echo "Usage: $0 <server1:server2:...> <volume> <base_path> <quota_in_GB> <start> <end>" | ||
    echo " 0 <= start <= end <= 65535" | ||
} | ||
|
||
function tohexpath() { | ||
    local -i l1=$1/256 | ||
    local -i l2=$1%256 | ||
    printf '%02x/%02x' "$l1" "$l2" | ||
} | ||
|
||
function tohexname() { | ||
    local -i l1=$1/256 | ||
    local -i l2=$1%256 | ||
    printf '%02x%02x' "$l1" "$l2" | ||
} | ||
|
||
function mkPvTemplate() { | ||
    local servers=$1 | ||
    local volume=$2 | ||
    local subdir=$3 | ||
    local blockfile=$4 | ||
    local capacity=$5 | ||
    local uuid=$6 | ||
|
||
    local pv_name | ||
    pv_name=$(echo "gluster-block-${uuid}-${subdir}-${blockfile}" | tr '/' '-') | ||
    cat - << EOT | ||
--- | ||
apiVersion: v1 | ||
kind: PersistentVolume | ||
metadata: | ||
  name: "$pv_name" | ||
  labels: | ||
    cluster: "$(echo "$servers" | tr ':' '-')" | ||
    volume: "$volume" | ||
    subdir: "$(echo "${blockstore_base}/${subdir}" | tr '/' '-')" | ||
    supervol: "$uuid" | ||
spec: | ||
  capacity: | ||
    storage: $capacity | ||
  accessModes: | ||
    - ReadWriteOnce | ||
  persistentVolumeReclaimPolicy: Retain | ||
  storageClassName: gluster-block-subvol | ||
  flexVolume: | ||
    driver: "rht/glfs-block-subvol" | ||
    options: | ||
      cluster: "$servers" | ||
      volume: "$volume" | ||
      dir: "${blockstore_base}/${subdir}" | ||
      file: "$blockfile" | ||
EOT | ||
} | ||
|
||
|
||
servers=$1 | ||
volume_name=$2 | ||
base_path_in=$3 | ||
volsize_gb=$4 | ||
declare -i i_start=$5 | ||
declare -i i_end=$6 | ||
|
||
declare -i i=$i_start | ||
|
||
if [ $# -ne 6 ]; then usage; exit 1; fi | ||
if [ "$i" -lt 0 ]; then usage; exit 1; fi | ||
if [ "$i" -gt "$i_end" ]; then usage; exit 1; fi | ||
if [ "$i_end" -gt 65535 ]; then usage; exit 1; fi | ||
|
||
base_path="${base_path_in}/${blockstore_base}" | ||
if [ ! -d "${base_path}" ]; then | ||
    if ! mkdir "${base_path}"; then | ||
        echo "Unable to create $base_path" | ||
        exit 2 | ||
    fi | ||
fi | ||
|
||
if [ ! -f "${base_path}/supervol-uuid" ]; then | ||
    uuidgen -r > "${base_path}/supervol-uuid" | ||
fi | ||
supervol_uuid=$(cat "${base_path}/supervol-uuid") | ||
|
||
if [ -f "${base_path}/pvs-${i_start}-${i_end}.yml" ]; then | ||
    rm "${base_path}/pvs-${i_start}-${i_end}.yml" | ||
fi | ||
|
||
while [ "$i" -le "$i_end" ]; do | ||
    subdir=$(tohexpath "$i") | ||
    dir="${base_path}/${subdir}" | ||
    echo "creating: ${dir} (${i}/${i_end})" | ||
    if ! mkdir -p "$dir"; then | ||
        echo "Unable to create $dir" | ||
        exit 2 | ||
    fi | ||
    blockfile=$(tohexname "$i") | ||
    blockfqpath="${base_path}/${subdir}/${blockfile}" | ||
    # Never overwrite an existing file; it may back a device that is in use. | ||
    if [ -f "$blockfqpath" ]; then | ||
        echo "Found an existing device file at ${blockfqpath}; skipping device creation" | ||
        ((++i)) | ||
        continue | ||
    fi | ||
    if ! touch "$blockfqpath"; then | ||
        echo "Unable to create file ${blockfile}" | ||
        exit 2 | ||
    fi | ||
    # Create a sparse file of required volume size | ||
    if ! dd bs=1 count=1 if=/dev/zero of="${blockfqpath}" seek="$((volsize_gb * 1024 * 1024 *1024))" status=none; then | ||
        echo "Error in dd to ${blockfile}" | ||
        exit 2 | ||
    fi | ||
    # Format the file with XFS | ||
    if ! mkfs.xfs -q "${blockfqpath}"; then | ||
        echo "mkfs.xfs failed for ${blockfqpath}" | ||
        exit 2 | ||
    fi | ||
|
||
    mkPvTemplate "$servers" "$volume_name" "$subdir" "$blockfile" "${volsize_gb}Gi" "$supervol_uuid" >> "${base_path}/pvs-${i_start}-${i_end}.yml" | ||
    ((++i)) | ||
done | ||
|
||
exit 0 |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,9 @@ | ||
--- | ||
|
||
kind: StorageClass | ||
apiVersion: storage.k8s.io/v1 | ||
metadata: | ||
  name: gluster-block-subvol | ||
  annotations: | ||
    storageclass.kubernetes.io/is-default-class: "true" | ||
provisioner: none |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,84 @@ | ||
# Installation of flex volume plugin | ||
|
||
This is a flex volume plugin that needs to be installed on each Kubernetes node. | ||
Included in this directory is an ansible playbook (`install_plugin.yml`) that | ||
performs the install. This playbook: | ||
* Creates the directory for the plugin: | ||
`/usr/libexec/kubernetes/kubelet-plugins/volume/exec/rht~glfs-block-subvol` | ||
* Copies the plugin script `glfs-block-subvol` to that directory. | ||
|
||
Upon first install, it may be necessary to restart kubelet for it to find the | ||
plugin. | ||
|
||
# Usage | ||
To use the plugin, include the following as a volume description. | ||
```yaml | ||
flexVolume: | ||
  driver: "rht/glfs-block-subvol" | ||
  options: | ||
    cluster: 192.168.173.15:192.168.173.16:192.168.173.17 | ||
    volume: "testvol" | ||
    dir: "00/01" | ||
    file: "0001" | ||
``` | ||
The required options for the driver are: | ||
* `cluster`: A colon separated list of the Gluster nodes in the cluster. The | ||
first will be used as the primary for mounting, and the rest will be listed as | ||
backup volume servers. | ||
* `volume`: This is the name of the large Gluster volume that is being | ||
subdivided. | ||
* `dir`: This is the path from the root of the volume to the subdirectory that | ||
contains the file to be loop mounted as the volume. | ||
* `file`: This is the name of the file within `dir` that is loop mounted as an | ||
XFS file system to serve as the volume for the claim. | ||
|
||
The above example would use 192.168.173.15:/testvol/00/01/0001 to hold the PV | ||
contents. | ||
|
||
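The `cluster` option is split into a primary server and a list of backup servers. A minimal sketch of such a split using POSIX parameter expansion follows. It is an assumption about how the split can be done, not a copy of the plugin's code, and the variable names simply mirror those that appear in the plugin's debug log.

```sh
#!/bin/sh
# Hypothetical helper: split a colon-separated server list into the primary
# server and the remaining backup servers.
split_cluster() {
    cluster=$1
    volserver=${cluster%%:*}                  # text before the first colon
    case $cluster in
        *:*) backupservers=${cluster#*:} ;;   # everything after the first colon
        *)   backupservers="" ;;              # single server, no backups
    esac
    echo "volserver=$volserver backupservers=$backupservers"
}

split_cluster "192.168.173.15:192.168.173.16:192.168.173.17"
```

The backup list corresponds to the value passed to the `backup-volfile-servers` mount option in the `mount` line of the example log under Diagnostics/debugging.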
# Diagnostics/debugging | ||
The `glfs-block-subvol` script has logging for all of its actions to help | ||
diagnose problems with the plugin. The logging settings are at the top of the | ||
script file: | ||
```sh | ||
# if DEBUG, log everything to a file as we do it | ||
DEBUG=1 | ||
DEBUGFILE='/tmp/glfs-block-subvol.out' | ||
``` | ||
When `DEBUG` is `1`, all calls and actions taken by the plugin are logged to | ||
`DEBUGFILE`. The following is an example of the log file: | ||
``` | ||
[1520361740.373690279] > init | ||
[1520361740.373690279] < 0 {"status": "Success", "capabilities": {"attach": false, "selinuxRelabel": false}} | ||
[1520361740.405577771] > mount /mnt/pods/00/volumes/vol1 {"cluster":"127.0.0.1:127.0.0.1","dir":"blockstore/00/00","volume":"patchy","file":"0000"} | ||
[1520361740.405577771] volserver 127.0.0.1 | ||
[1520361740.405577771] backupservers 127.0.0.1 | ||
[1520361740.405577771] Using lockfile: /var/lock/glfs-block-subvol/127.0.0.1-patchy.lock | ||
[1520361740.405577771] ! mount -t glusterfs -o backup-volfile-servers=127.0.0.1 127.0.0.1:/patchy /mnt/script-dir/mnt/blockstore/127.0.0.1-patchy | ||
[1520361740.405577771] ! mount /mnt/script-dir/mnt/blockstore/127.0.0.1-patchy/blockstore/00/00/0000 /mnt/pods/00/volumes/vol1 -t xfs -o loop,discard | ||
[1520361740.405577771] < 0 {"status": "Success", "message": "volserver=127.0.0.1 backup=127.0.0.1 volume=patchy mountpoint=/mnt/script-dir/mnt/blockstore/127.0.0.1-patchy bindto=/mnt/pods/00/volumes/vol1"} | ||
[1520361740.849832326] > unmount /mnt/pods/00/volumes/vol1 | ||
[1520361740.849832326] ldevice=/dev/loop0 | ||
[1520361740.849832326] ldevicefile=/mnt/script-dir/mnt/blockstore/127.0.0.1-patchy/blockstore/00/00/0000 | ||
[1520361740.849832326] gdevicedir=/mnt/script-dir/mnt/blockstore/127.0.0.1-patchy | ||
[1520361740.849832326] mntsuffix=127.0.0.1-patchy | ||
[1520361740.849832326] ! umount /mnt/pods/00/volumes/vol1 | ||
[1520361740.849832326] Using lockfile: /var/lock/glfs-block-subvol/127.0.0.1-patchy.lock | ||
[1520361740.849832326] /mnt/script-dir/mnt/blockstore/127.0.0.1-patchy has 0 loop mounted files | ||
[1520361740.849832326] We were last user of /mnt/script-dir/mnt/blockstore/127.0.0.1-patchy; unmounting it. | ||
[1520361740.849832326] ! umount /mnt/script-dir/mnt/blockstore/127.0.0.1-patchy | ||
[1520361740.849832326] ! rmdir /mnt/script-dir/mnt/blockstore/127.0.0.1-patchy | ||
[1520361740.849832326] < 0 {"status": "Success", "message": "Unmounting from /mnt/pods/00/volumes/vol1"} | ||
``` | ||
|
||
In the log file, each line begins with a timestamp, and the timestamp remains | ||
constant for the length of the execution of the script. The purpose is to allow | ||
multiple, overlapping invocations to be teased apart. The second (optional) | ||
field is a single character. | ||
* Lines with ">" are logs of the script's invocation arguments. | ||
* Lines with "<" are the script's output back to the driver. | ||
* Lines with "!" are external command invocations made by the script. | ||
* Lines without one of these characters are free-form diagnostic messages. | ||
|
||
In the event that the logging generates too much output, it can be disabled by | ||
setting `DEBUG` to `0`. However, when changing this value, be careful to update | ||
the script in an atomic fashion if the node is currently in use. |
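Because every line of one invocation carries the same timestamp, standard text tools are enough to pull a single invocation, or only its external commands, out of the log. The helper names below are hypothetical; the log path in the usage comments is the `DEBUGFILE` default shown above.

```sh
#!/bin/sh
# Hypothetical helper: print all log lines that belong to the invocation
# with the given timestamp.
log_invocation() {
    grep -F "[$1]" "$2"
}

# Hypothetical helper: print only the external commands ("!" lines).
log_commands() {
    grep -E '^\[[0-9.]+\] ! ' "$1"
}

# Example usage on a node where the plugin has written its log:
#   log_invocation 1520361740.849832326 /tmp/glfs-block-subvol.out
#   log_commands /tmp/glfs-block-subvol.out
```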
Probably don't want this. If a user wants this as the default storageclass, they can patch it once it's been added. Having it automatically set itself as cluster default raises the possibility of accidentally having 2 that are set as default. In that case, neither acts as default and default provisioning breaks.
This yml is a helper/reminder for anyone who wants to add this as the default, right? Normally we do not need to add this as a storage class, as we use the Flexvol scheme. Isn't that right?
Assuming so, I would assume we leave this as is, for interested users to set this as a default. If my assumptions are wrong, then I would need to change it.
Yes good point. Since we use static provisioning, there is no need for the StorageClass object unless one wants it to become the default class. (not related to flex)
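The patch that the first comment alludes to could look like the following sketch. The annotation key is the one used in `gluster-block-subvol-sc.yml`; the helper name is hypothetical, and the `kubectl patch` invocation is shown only as a comment because it needs access to a live cluster.

```sh
#!/bin/sh
# Hypothetical helper: build the JSON patch that sets or clears the
# default-class annotation on a StorageClass ("true" or "false").
default_class_patch() {
    printf '{"metadata":{"annotations":{"storageclass.kubernetes.io/is-default-class":"%s"}}}\n' "$1"
}

default_class_patch true
# Applying it to the class from this PR (requires cluster access):
#   kubectl patch storageclass gluster-block-subvol -p "$(default_class_patch true)"
```

Before marking a class as default, it is worth checking with `kubectl get storageclass` that no other class already carries the annotation, which is the failure mode the comment describes.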