Refactor Storage Calculator to be more robust #2947

Merged
merged 5 commits into from
Dec 20, 2021
45 changes: 45 additions & 0 deletions services/storage-calculator/README.md
@@ -0,0 +1,45 @@
# Storage Calculator

This service is responsible for:

* Recording the used size of any Persistent-Volume (PV)
* Recording the used size of any database (`data` and `index`)

The result of the size calculations is sent to the Lagoon API.

The storage is measured in `KB`. Note that the API field is named `bytesUsed`, but the value it holds is in kilobytes.
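
Because the values are reported in kilobytes, it can help to convert them before showing them to people. The following is a minimal sketch, not part of the service: `kb_to_human` is a hypothetical helper, and it assumes 1 KB = 1024 bytes, in line with the script's MariaDB query, which divides by 1024.

```shell
#!/bin/bash

# Hypothetical helper (not part of the storage calculator): convert a
# kilobyte count, as stored by the API, into a human-readable string.
# Assumes steps of 1024 between units.
kb_to_human() {
  local kb=$1
  awk -v kb="$kb" 'BEGIN {
    split("KB MB GB TB", units, " ")
    i = 1
    # Divide down until the value fits the current unit.
    while (kb >= 1024 && i < 4) { kb /= 1024; i++ }
    printf "%.2f %s\n", kb, units[i]
  }'
}

kb_to_human 1536   # prints "1.50 MB"
```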

If you want to retrieve the storage for a given project, you can use GraphQL:

```graphql
query {
projectByName(name: "PROJECT") {
id
name
gitUrl
productionEnvironment
environments {
name
deployType
environmentType
storages {
claim: persistentStorageClaim
kb: bytesUsed
updated
}
}
openshift {
id
name
}
}
}
```
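
To send a query like this from a shell, it has to be wrapped in a JSON body with its quotes escaped. The sketch below shows one way to do that. `build_payload` and `TOKEN` are hypothetical names used for illustration only. The endpoint default and the `Authorization: bearer` header are taken from `calculate-storage.sh`.

```shell
#!/bin/bash

# Hypothetical helper: wrap a GraphQL query in a JSON request body.
build_payload() {
  local query=$1
  # Escape backslashes first, then double quotes, so the query is a valid JSON string.
  query=${query//\\/\\\\}
  query=${query//\"/\\\"}
  # Fold newlines into spaces; GraphQL treats both as insignificant whitespace.
  query=${query//$'\n'/ }
  printf '{"query": "%s"}' "$query"
}

build_payload 'query { projectByName(name: "PROJECT") { id name } }'
echo

# Sending it (assumes a valid token in the placeholder variable TOKEN):
# curl -s -XPOST -H 'Content-Type: application/json' \
#   -H "Authorization: bearer $TOKEN" \
#   "${GRAPHQL_ENDPOINT:-api:3000/graphql}" \
#   -d "$(build_payload "$QUERY")"
```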

## Configuration

These are the environment variables that can influence the behaviour of the storage calculator:

* `PROJECT_REGEX` - defaults to `.+` (everything). Use it to include only certain projects by name. It takes precedence over the Lagoon API `storageCalc` value for a given project.
* `LAGOON_STORAGE_LABEL_NAMESPACE` - if set (to any value), the namespace is labeled with the current storage, using labels of the form `lagoon/storage-<name>`. This makes the information visible through the Kubernetes API.
* `LAGOON_STORAGE_IGNORE_REGEX` - optionally skips mounting and measuring any `PV` whose name matches. This is useful, for example, to avoid measuring `RWO` PVs such as `solr` and `redis`. An example value is `solr|redis|elasticsearch`.
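
Both regex variables are matched with bash's `[[ ... =~ ... ]]` operator, which is unanchored: a pattern matches if it occurs anywhere in the name. The sketch below illustrates that behaviour. `should_skip_pvc` is a hypothetical helper that mirrors the check in `calculate-storage.sh`, and the PVC names are made up.

```shell
#!/bin/bash

# Hypothetical helper mirroring the script's `[[ $PVC =~ $REGEX ]]` check.
# Returns success (0) when the PVC should be skipped.
should_skip_pvc() {
  local pvc=$1
  local ignore_regex=${2:-}
  # An unset or empty regex means nothing is ignored.
  [[ -n "$ignore_regex" && "$pvc" =~ $ignore_regex ]]
}

# Example PVC names (made up for illustration).
for pvc in nginx solr redis-data; do
  if should_skip_pvc "$pvc" 'solr|redis|elasticsearch'; then
    echo "$pvc: skip"
  else
    echo "$pvc: measure"
  fi
done
```

Note that `redis-data` is skipped as well, because the match is unanchored. To match whole names only, a value such as `^(solr|redis|elasticsearch)$` could be used instead.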
244 changes: 125 additions & 119 deletions services/storage-calculator/calculate-storage.sh
@@ -1,20 +1,21 @@
#!/bin/bash

# Send a GraphQL query to Lagoon API.
# Accpets query as first param. Usage:
#
# Accepts query as first param. Usage:
# ALL_ENVIRONMENTS=$(apiQuery "query {
# allProjects {
# name
# }
# }")
apiQuery() {
local api_token=$(./create_jwt.py)
local authz_header="Authorization: bearer $api_token"
local API_TOKEN=$(./create_jwt.py)
local AUTH_HEADER="Authorization: bearer $API_TOKEN"

# Convert GraphQL file into single line (but with still \n existing), turn \n into \\n, esapee the Quotes
# Convert GraphQL file into single line (but with still \n existing)
# turn `\n` into `\\n` and escape the quotes.
local query=$(echo $1 | sed 's/"/\\"/g' | sed 's/\\n/\\\\n/g' | awk -F'\n' '{if(NR == 1) {printf $0} else {printf "\\n"$0}}')
local result=$(curl -s -XPOST -H 'Content-Type: application/json' -H "$authz_header" "${GRAPHQL_ENDPOINT:-api:3000/graphql}" -d "{\"query\": \"$query\"}")

local result=$(curl -s -XPOST -H 'Content-Type: application/json' -H "$AUTH_HEADER" "${GRAPHQL_ENDPOINT:-api:3000/graphql}" -d "{\"query\": \"$query\"}")
echo "$result"
}

@@ -36,127 +37,132 @@ ALL_ENVIRONMENTS=$(apiQuery 'query {
}
}')

echo "$ALL_ENVIRONMENTS" | jq -c '.data.environments[] | select((.environments | length) >= 1)' | while read PROJECT ; do
PROJECT_NAME=$(echo "$PROJECT" | jq -r '.name')

echo "$ALL_ENVIRONMENTS" | jq -c '.data.environments[] | select((.environments|length)>=1)' | while read project
do
PROJECT_NAME=$(echo "$project" | jq -r '.name')
# Match the Project name to the Project Regex
if [[ $PROJECT_NAME =~ $PROJECT_REGEX ]]; then
STORAGE_CALC=$(echo "$project" | jq -r '.storageCalc')
echo "Handling project: $PROJECT_NAME"
# loop through each environment of the current project
echo "$project" | jq -c '.environments[]' | while read environment
do
OPENSHIFT_URL=$(echo "$environment" | jq -r '.openshift.consoleUrl')
OPENSHIFT_TOKEN=$(echo "$environment" | jq -r '.openshift.token // empty')
ENVIRONMENT_OPENSHIFT_PROJECTNAME=$(echo "$environment" | jq -r '.openshiftProjectName')
ENVIRONMENT_NAME=$(echo "$environment" | jq -r '.name')
ENVIRONMENT_ID=$(echo "$environment" | jq -r '.id')

echo "$OPENSHIFT_URL - $PROJECT_NAME: handling development environment $ENVIRONMENT_NAME"

if [[ $STORAGE_CALC != "1" ]]; then
echo "$OPENSHIFT_URL - $PROJECT_NAME - $ENVIRONMENT_NAME: storage calculation disabled, skipping"

apiQuery "mutation {
addOrUpdateEnvironmentStorage(input:{environment:${ENVIRONMENT_ID}, persistentStorageClaim:\"storage-calc-disabled\", bytesUsed:0}) {
id
}
}"

continue

fi

OC="oc --insecure-skip-tls-verify --token=$OPENSHIFT_TOKEN --server=$OPENSHIFT_URL -n $ENVIRONMENT_OPENSHIFT_PROJECTNAME"

# Skip if namespace doesn't exist.
if ! ${OC} get ns ${ENVIRONMENT_OPENSHIFT_PROJECTNAME} >/dev/null 2>&1 ; then
echo "$OPENSHIFT_URL - $PROJECT_NAME - $ENVIRONMENT_NAME: no valid namespace found"
continue
fi

echo "$OPENSHIFT_URL - $PROJECT_NAME - $ENVIRONMENT_NAME: creating storage-calc pod"

# Cleanup any existing storage-calc deployments
${OC} delete deployment/storage-calc >/dev/null 2>&1

# Start storage-calc deployment
deployment_template=$(${OC} create --dry-run=true -o yaml deployment storage-calc --image imagecache.amazeeio.cloud/amazeeio/alpine-mysql-client)
deployment=$(echo "$deployment_template" | yq '.spec.template.spec.containers[0].command = ["sh", "-c", "while sleep 3600; do :; done"]')
echo "$deployment" | ${OC} create -f -
${OC} rollout pause deployment/storage-calc

# Copy environment variable from lagoon-env configmap.
${OC} set env --from=configmap/lagoon-env deployment/storage-calc

PVCS=($(${OC} get pvc -o name | sed 's/persistentvolumeclaim\///'))

for PVC in "${PVCS[@]}"
do
echo "$OPENSHIFT_URL - $PROJECT_NAME - $ENVIRONMENT_NAME: mounting ${PVC} into storage-calc"
${OC} set volume deployment/storage-calc --add --name=${PVC} --type=persistentVolumeClaim --claim-name=${PVC} --mount-path=/storage/${PVC}
done

${OC} rollout resume deployment/storage-calc
echo "$OPENSHIFT_URL - $PROJECT_NAME - $ENVIRONMENT_NAME: redeploying storage-calc to mount volumes"
${OC} rollout status deployment/storage-calc --watch

POD=$(${OC} get pods -l app=storage-calc -o json | jq -r '[.items[] | select(.metadata.deletionTimestamp == null) | select(.status.phase == "Running")] | first | .metadata.name // empty')

if [[ ! $POD ]]; then
echo "No running pod found for storage-calc"
# Clean up any failed deployments.
${OC} delete deployment/storage-calc >/dev/null 2>&1
exit 1
fi

echo "$OPENSHIFT_URL - $PROJECT_NAME - $ENVIRONMENT_NAME: loading storage information"

if [[ ! ${#PVCS[@]} -gt 0 ]]; then
echo "$OPENSHIFT_URL - $PROJECT_NAME - $ENVIRONMENT_NAME: no PVCs found writing API with 0 bytes"

apiQuery "mutation {
addOrUpdateEnvironmentStorage(input:{environment:${ENVIRONMENT_ID}, persistentStorageClaim:\"none\", bytesUsed:0}) {
id
}
}"
# Guard statement, to ensure the project regex is respected.
if ! [[ $PROJECT_NAME =~ $PROJECT_REGEX ]] ; then
echo "Project: $PROJECT_NAME [skip, does not match regex: ${PROJECT_REGEX}]"
continue
fi

else
for PVC in "${PVCS[@]}"
do
STORAGE_BYTES=$(${OC} exec ${POD} -- sh -c "du -s /storage/${PVC} | cut -f1")
# STORAGE_BYTES=$(echo "${DF}" | grep /storage/${PVC} | awk '{ print $4 }')
echo "$OPENSHIFT_URL - $PROJECT_NAME - $ENVIRONMENT_NAME: ${PVC} uses ${STORAGE_BYTES} kilobytes"

apiQuery "mutation {
addOrUpdateEnvironmentStorage(input:{environment:${ENVIRONMENT_ID}, persistentStorageClaim:\"${PVC}\", bytesUsed:${STORAGE_BYTES}}) {
id
}
}"

# Update namespace labels
if [ ! -z "$LAGOON_STORAGE_LABEL_NAMESPACE" ]; then
${OC} label namespace $ENVIRONMENT_OPENSHIFT_PROJECTNAME lagoon/storage-${PVC}=${STORAGE_BYTES} --overwrite
fi

done
STORAGE_CALC=$(echo "$PROJECT" | jq -r '.storageCalc')
echo "Project: $PROJECT_NAME [storage calc: ${STORAGE_CALC}]"

# Loop through the environments.
echo "$PROJECT" | jq -c '.environments[]' | while read environment ; do
OPENSHIFT_URL=$(echo "$environment" | jq -r '.openshift.consoleUrl')
OPENSHIFT_TOKEN=$(echo "$environment" | jq -r '.openshift.token // empty')
ENVIRONMENT_OPENSHIFT_PROJECTNAME=$(echo "$environment" | jq -r '.openshiftProjectName')
ENVIRONMENT_NAME=$(echo "$environment" | jq -r '.name')
ENVIRONMENT_ID=$(echo "$environment" | jq -r '.id')

echo " > $OPENSHIFT_URL - $PROJECT_NAME: environment $ENVIRONMENT_NAME"

if [[ $STORAGE_CALC != "1" ]] ; then
echo " > $OPENSHIFT_URL - $PROJECT_NAME - $ENVIRONMENT_NAME: storage calculation disabled, skipping"
apiQuery "mutation {
addOrUpdateEnvironmentStorage(input:{environment:${ENVIRONMENT_ID}, persistentStorageClaim:\"storage-calc-disabled\", bytesUsed:0}) {
id
}
}"
continue
fi

OC="oc --insecure-skip-tls-verify --token=$OPENSHIFT_TOKEN --server=$OPENSHIFT_URL -n $ENVIRONMENT_OPENSHIFT_PROJECTNAME"

# Skip if namespace doesn't exist.
NAMESPACE=$(${OC} get namespace ${ENVIRONMENT_OPENSHIFT_PROJECTNAME} --ignore-not-found=true);
if ! [[ "$NAMESPACE" ]] ; then
echo " > $OPENSHIFT_URL - $PROJECT_NAME - $ENVIRONMENT_NAME: no valid namespace found"
continue
fi

echo "$OPENSHIFT_URL - $PROJECT_NAME - $ENVIRONMENT_NAME: creating storage-calc pod"

# Cleanup any existing storage-calc deployments.
${OC} delete deployment/storage-calc --ignore-not-found=true

# Start storage-calc deployment.
deployment_template=$(${OC} create --dry-run=true -o yaml deployment storage-calc --image imagecache.amazeeio.cloud/amazeeio/alpine-mysql-client)
deployment=$(echo "$deployment_template" | yq '.spec.template.spec.containers[0].command = ["sh", "-c", "while sleep 3600; do :; done"]')
echo "$deployment" | ${OC} create -f -
${OC} rollout pause deployment/storage-calc

# Copy environment variable from lagoon-env configmap.
${OC} set env --from=configmap/lagoon-env deployment/storage-calc

# Loop through all PVCs, and attempt to attach them, so long as they are not in the ignore list.
PVCS=($(${OC} get pvc -o name | sed 's/persistentvolumeclaim\///'))
for PVC in "${PVCS[@]}" ; do
if [ ! -z "$LAGOON_STORAGE_IGNORE_REGEX" ] ; then
if [[ $PVC =~ $LAGOON_STORAGE_IGNORE_REGEX ]]; then
echo "> PVC: ${PVC} [skip mounting, it matches the skip regex: ${LAGOON_STORAGE_IGNORE_REGEX}]"
continue
fi
fi
echo "> PVC: ${PVC} [mounting ${PVC} into storage-calc]"
${OC} set volume deployment/storage-calc --add --name=${PVC} --type=persistentVolumeClaim --claim-name=${PVC} --mount-path=/storage/${PVC}
done

if mariadb_size=$(${OC} exec ${POD} -- sh -c "if [ \"\$MARIADB_HOST\" ]; then mysql -N -s -h \$MARIADB_HOST -u\$MARIADB_USERNAME -p\$MARIADB_PASSWORD -P\$MARIADB_PORT -e 'SELECT ROUND(SUM(data_length + index_length) / 1024, 0) FROM information_schema.tables'; else exit 1; fi") && [ "$mariadb_size" ]; then
echo "$OPENSHIFT_URL - $PROJECT_NAME - $ENVIRONMENT_NAME: Database uses ${mariadb_size} kilobytes"

${OC} rollout resume deployment/storage-calc
echo "$OPENSHIFT_URL - $PROJECT_NAME - $ENVIRONMENT_NAME: redeploying storage-calc to mount volumes"
${OC} rollout status deployment/storage-calc --watch --timeout=30s

POD=$(${OC} get pods -l app=storage-calc -o json | jq -r '[.items[] | select(.metadata.deletionTimestamp == null) | select(.status.phase == "Running")] | first | .metadata.name // empty')

if [[ ! $POD ]] ; then
echo "No running pod found for storage-calc"
${OC} delete deployment/storage-calc --ignore-not-found=true
continue
fi

echo "$OPENSHIFT_URL - $PROJECT_NAME - $ENVIRONMENT_NAME: loading storage information"

if [[ ! ${#PVCS[@]} -gt 0 ]] ; then
echo "$OPENSHIFT_URL - $PROJECT_NAME - $ENVIRONMENT_NAME: no PVCs found writing API with 0 bytes"
apiQuery "mutation {
addOrUpdateEnvironmentStorage(input:{environment:${ENVIRONMENT_ID}, persistentStorageClaim:\"none\", bytesUsed:0}) {
id
}
}"

else
# Loop through all PVCs, and calculate their usage, so long as they are not in the ignore list.
for PVC in "${PVCS[@]}" ; do
if [ ! -z "$LAGOON_STORAGE_IGNORE_REGEX" ] ; then
if [[ $PVC =~ $LAGOON_STORAGE_IGNORE_REGEX ]]; then
continue
fi
fi
STORAGE_BYTES=$(${OC} exec ${POD} -- sh -c "du -s /storage/${PVC} | cut -f1")
echo "$OPENSHIFT_URL - $PROJECT_NAME - $ENVIRONMENT_NAME: ${PVC} uses ${STORAGE_BYTES} kilobytes"
apiQuery "mutation {
addOrUpdateEnvironmentStorage(input:{environment:${ENVIRONMENT_ID}, persistentStorageClaim:\"mariadb\", bytesUsed:${mariadb_size}}) {
addOrUpdateEnvironmentStorage(input:{environment:${ENVIRONMENT_ID}, persistentStorageClaim:\"${PVC}\", bytesUsed:${STORAGE_BYTES}}) {
id
}
}"
# Update namespace labels.
if [ ! -z "$LAGOON_STORAGE_LABEL_NAMESPACE" ] ; then
${OC} label namespace $ENVIRONMENT_OPENSHIFT_PROJECTNAME lagoon/storage-${PVC}=${STORAGE_BYTES} --overwrite
fi
done
fi

if mariadb_size=$(${OC} exec ${POD} -- sh -c "if [ \"\$MARIADB_HOST\" ]; then mysql -N -s -h \$MARIADB_HOST -u\$MARIADB_USERNAME -p\$MARIADB_PASSWORD -P\$MARIADB_PORT -e 'SELECT ROUND(SUM(data_length + index_length) / 1024, 0) FROM information_schema.tables'; else exit 1; fi") && [ "$mariadb_size" ]; then
echo "$OPENSHIFT_URL - $PROJECT_NAME - $ENVIRONMENT_NAME: Database uses ${mariadb_size} kilobytes"
apiQuery "mutation {
addOrUpdateEnvironmentStorage(input:{environment:${ENVIRONMENT_ID}, persistentStorageClaim:\"mariadb\", bytesUsed:${mariadb_size}}) {
id
}
}"
# Update namespace labels.
if [ ! -z "$LAGOON_STORAGE_LABEL_NAMESPACE" ] ; then
${OC} label namespace $ENVIRONMENT_OPENSHIFT_PROJECTNAME lagoon/storage-mariadb=${mariadb_size} --overwrite
fi
fi

${OC} delete deployment/storage-calc
${OC} delete deployment/storage-calc --ignore-not-found=true

done
else
echo "$PROJECT_NAME: SKIP, does not match Regex: $PROJECT_REGEX"
fi
done
done
done