Merge pull request #100 from daos-stack/develop
DAOSGCP-218 Merge develop to main for v0.5.0

Signed-off-by: Mark Olson <115657904+mark-olson@users.noreply.github.com>
mark-olson authored Jan 17, 2024
2 parents 80d8b18 + 75c61db commit 43f7fe3
Showing 50 changed files with 447 additions and 271 deletions.
3 changes: 3 additions & 0 deletions .gitignore
@@ -1,3 +1,6 @@
+# Direnv
+.envrc

# Local .terraform directories
**/.terraform/
**/.terraform/*
2 changes: 1 addition & 1 deletion .tflint.hcl
@@ -1,6 +1,6 @@
plugin "google" {
enabled = true
-version = "0.16.1"
+version = "0.26.0"
source = "github.com/terraform-linters/tflint-ruleset-google"
}
rule "terraform_deprecated_index" {
61 changes: 22 additions & 39 deletions docs/deploy_daos_cluster_example.md
@@ -7,11 +7,19 @@ These instructions describe how to deploy a DAOS Cluster using the example in [t
Deployment tasks described in these instructions:

- Deploy a DAOS cluster using Terraform
- Log into the first DAOS client instance
- Perform DAOS administrative tasks to prepare the storage
-- Mount a DAOS container with [DFuse (DAOS FUSE)](https://docs.daos.io/v2.0/user/filesystem/?h=dfuse#dfuse-daos-fuse)
+- Mount a DAOS container with [DFuse (DAOS FUSE)](https://docs.daos.io/v2.4/user/filesystem/?h=dfuse#dfuse-daos-fuse)
- Store files in a DAOS container
- Unmount the container
-- Remove the deployment (terraform destroy)
+- Undeploy DAOS cluster (terraform destroy)

+## Prerequisites
+
+The steps in the [Pre-Deployment Guide](pre-deployment_guide.md) must be completed prior to deploying the DAOS cluster in this example.
+
+The [Pre-Deployment Guide](pre-deployment_guide.md) describes how to build the DAOS images that are used to deploy server and client instances.


## Clone the repository

@@ -25,7 +33,7 @@ cd ~/google-cloud-daos/terraform/examples/daos_cluster

## Create a `terraform.tfvars` file

-Before you run `terraform` you need to create a `terraform.tfvars` file in the `terraform/examples/daos_cluster` directory.
+Before you run `terraform apply` to deploy the DAOS cluster you need to create a `terraform.tfvars` file in the `terraform/examples/daos_cluster` directory.

The `terraform.tfvars` file contains the variable values for the configuration.

@@ -111,23 +119,19 @@ gcloud compute instances list \
--format="value(name,INTERNAL_IP)"
```

-## Perform DAOS administration tasks
-
-After your DAOS cluster has been deployed you can log into the first DAOS server instance to perform administrative tasks.
-
-### Log into the first DAOS server instance
+## Log into the first DAOS client instance

Log into the first client instance

```bash
-gcloud compute ssh daos-server-0001
+gcloud compute ssh daos-client-0001
```

-### Verify that all daos-server instances have joined
+## Perform DAOS administration tasks

-The DAOS Management Tool `dmg` is meant to be used by administrators to manage the DAOS storage system and pools.
+The `dmg` command is used to perform administrative tasks such as formatting storage and managing pools and therefore must be run with `sudo`.

-You will need to run `dmg` with `sudo`.
+### Verify that all daos-server instances have joined

Use `dmg` to verify that the DAOS storage system is ready.

@@ -172,9 +176,7 @@ This shows how much NVMe-Free space is available for each server.
Create a pool named `pool1` that uses the total NVMe-Free for all servers.

```bash
-TOTAL_NVME_FREE="$(sudo dmg storage query usage | awk '{split($0,a," "); sum += a[10]} END {print sum}')TB"
-echo "Total NVMe-Free: ${TOTAL_NVME_FREE}"
-sudo dmg pool create --size="${TOTAL_NVME_FREE}" --tier-ratio=3 --label=pool1
+sudo dmg pool create --size="100%" pool1
```

View the ACLs on *pool1*
@@ -193,44 +195,23 @@ A:G:GROUP@:rw

Here we see that root owns the pool.

-Add an [ACE](https://docs.daos.io/v2.0/admin/pool_operations/#adding-and-updating-aces) that will allow any user to create a container in the pool
+Add an [ACE](https://docs.daos.io/v2.4/admin/pool_operations/#adding-and-updating-aces) that will allow any user to create a container in the pool

```bash
sudo dmg pool update-acl -e A::EVERYONE@:rcta pool1
```
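One way to check that the new entry took effect is to re-read the ACL with `sudo dmg pool get-acl pool1` and look for the ACE string. The snippet below only illustrates that matching step against sample ACL output (the ACL lines shown are typical examples, not captured from a real cluster):

```bash
# Illustration only: confirm an expected ACE string appears in (sample) ACL output.
expected_ace="A::EVERYONE@:rcta"
sample_acl="$(printf '%s\n' 'A::OWNER@:rw' 'A:G:GROUP@:rw' "$expected_ace")"
if echo "$sample_acl" | grep -qx "$expected_ace"; then
  echo "ACE present"
fi
```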

This completes the administration tasks for the pool.

For more information about pools see

- [Overview - Storage Model - DAOS Pool](https://docs.daos.io/latest/overview/storage/#daos-pool)
- [Administration Guide - Pool Operations](https://docs.daos.io/latest/admin/pool_operations/)

-### Log out of the first server instance
-
-Now that the administrative tasks have been completed, you may log out of the first server instance.
-
-```bash
-logout
-```

## Create a Container

-User tasks such as creating and mounting a container will be done on the first client
-
-### Log into the first DAOS client instance
-
-Log into the first client instance
-
-```bash
-gcloud compute ssh daos-client-0001
-```


Create a [container](https://docs.daos.io/latest/overview/storage/#daos-container) in the pool

```bash
-daos container create --type=POSIX --properties=rf:0 --label=cont1 pool1
+daos container create --type=POSIX --properties=rf:0 pool1 cont1
```

For more information about containers see
@@ -261,8 +242,10 @@ Create a 20GiB file which will be stored in the DAOS filesystem.

```bash
cd ${HOME}/daos/cont1

+# Create a 20GB file
time LD_PRELOAD=/usr/lib64/libioil.so \
-dd if=/dev/zero of=./test21G.img bs=1G count=20
+dd if=/dev/zero of=./test20.img bs=1G count=20
```
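The `dd` invocation above writes `bs` × `count` bytes (20 × 1 GiB) through the DAOS interception library. The same sizing arithmetic can be sanity-checked locally with a much smaller file, no DAOS involved:

```bash
# Local sketch only: write 4 MiB of zeros and confirm the resulting size,
# the same bs * count arithmetic as the 20 GiB test file above.
tmpfile="$(mktemp)"
dd if=/dev/zero of="$tmpfile" bs=1M count=4 2>/dev/null
stat -c %s "$tmpfile"   # 4 * 1048576 = 4194304 bytes
rm -f "$tmpfile"
```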

## Unmount the container and logout of the first client
2 changes: 1 addition & 1 deletion docs/pre-deployment_guide.md
@@ -20,7 +20,6 @@ Since *project name* and *project ID* are used in many configurations it is reco

To create a project, refer to the following documentation

-- [Get Started with Google Cloud](https://cloud.google.com/docs/get-started)
- [Creating and managing projects](https://cloud.google.com/resource-manager/docs/creating-managing-projects)

Make note of the *Project Name* and *Project ID* for the project that you plan to use for your DAOS deployment as you will be using it later in various configurations.
@@ -152,6 +151,7 @@ If you are currently in Cloud Shell, you don't need to run this command.

```bash
gcloud auth login
+gcloud auth application-default login
```

To learn more about using the Google Cloud CLI see the various [How-to Guides](https://cloud.google.com/sdk/docs/how-to).
93 changes: 59 additions & 34 deletions images/README.md
@@ -1,20 +1,27 @@
# Images

-This directory contains files necessary for building DAOS images using [Cloud Build](https://cloud.google.com/build) and [Packer](https://developer.hashicorp.com/packer/downloads).
+This directory contains files necessary for building DAOS images using
+[Cloud Build](https://cloud.google.com/build) and
+[Packer](https://developer.hashicorp.com/packer/downloads).

## Pre-Deployment steps required

-If you have not done so yet, please complete the steps in [Pre-Deployment Guide](../docs/pre-deployment_guide.md).
+If you have not done so yet, please complete the steps in the
+[Pre-Deployment Guide](../docs/pre-deployment_guide.md).

-The pre-deployment steps will have you run the `images/build.sh` script once in order to build a DAOS server image and a DAOS client image with the configured default settings.
+The pre-deployment steps will have you run the `images/build.sh` script once in
+order to build a DAOS server image and a DAOS client image with the configured
+default settings.

-That should be all you need to run the Terraform examples in the `terraform/examples` directory or to run the [DAOS examples in the Google HPC Toolkit](https://github.com/GoogleCloudPlatform/hpc-toolkit/tree/main/community/examples/intel).
+That should be all you need to run the Terraform examples in
+the `terraform/examples` directory or to run the [DAOS examples in the Google HPC Toolkit](https://github.com/GoogleCloudPlatform/hpc-toolkit/tree/main/community/examples/intel).

-The information in this document is provided in case you need to build custom images with non-default settings.
+The information in this document is provided in case you need to build custom
+images with non-default settings.

## Building DAOS images

-To rebuild the images with the default settings run:
+To build the images with the default settings run:

```bash
cd images
@@ -23,13 +30,32 @@ cd images

## The Packer HCL template file

-A single Packer HCL template file `daos.pkr.hcl` is used to build either a DAOS server or DAOS client image.
+A single Packer HCL template file `daos.pkr.hcl` is used to build either a DAOS
+server or DAOS client image.

-The `daos.pkr.hcl` file does not build both server and client images in a single `packer build` run. This is by design since there are use cases in which only one type of image is needed. If both types of images are needed, then `packer build` must be run twice with different variable values.
+The `daos.pkr.hcl` file does not build both server and client images in a single `packer build` run.
+This is by design since there are use cases in which only one type of image is needed. If both types
+of images are needed, then `packer build` must be run twice with different variable values.

+The `build.sh` script does this for you by running packer twice with different variable values for
+server and client images.

### Source Block

-Within the `daos.pkr.hcl` template there is a single `source` block. Most of the settings for the block are set by variable values.
+Within the `daos.pkr.hcl` template there is a single `source` block. The settings
+for the block are provided by variable values. This allows the settings
+to be passed to packer via a variables file which is specified by the `-var-file` parameter
+of the `packer build` command.

+The `build.sh` script generates a packer variables file from the `GCP_*` and `DAOS_*` environment
+variables defined in the script.

+Run `./build.sh --help` to see a list of environment variables that are used
+by the `./build.sh` script to create a packer variables file that will be
+passed to packer to create the images.

+You can export these variables before running the `build.sh` script to customize
+the images or to modify Cloud Build settings.

### Build Block

@@ -41,7 +67,8 @@ The `build` block consists of provisioners that do the following:

These provisioners are the same for building both DAOS server and DAOS client images.

-The `daos_install_type` variable in the `daos.pkr.hcl` template is passed in the `--extra-vars` parameter when running the `daos.yml` ansible playbook.
+The `daos_install_type` variable in the `daos.pkr.hcl` template is passed in the `--extra-vars`
+parameter of the `ansible-playbook` command when running the `daos.yml` ansible playbook.

If `daos_install_type=server`, then the `daos.yml` playbook will install the DAOS server packages.

@@ -74,13 +101,15 @@ The `images/build.sh` script uses the following environment variables.

To view the default values for these variables see the defaults set in the `build.sh` script.

-Running `build.sh --help` will display the values of these variables so that you can inspect them before running `build.sh`
+Running `build.sh --help` will display the values of these variables so that you can inspect them
+before running `build.sh`

### Controlling the version of DAOS to be installed

Official DAOS packages are hosted at https://packages.daos.io/

-Unfortunately, the paths to the `.repo` files for each repository do not follow a standard convention that can be dynamically created based on something like the `/etc/os-release` file.
+Unfortunately, the paths to the `.repo` files for each repository do not follow a standard
+convention that can be dynamically created based on something like the `/etc/os-release` file.

To specify the path to a repo file the following 3 environment variables are used:

@@ -98,28 +127,16 @@ The values of these variables should not start or end with a `/`

**Examples:**

-To install DAOS v2.2.0 on CentOS 7
-
-```bash
-DAOS_REPO_BASE_URL=https://packages.daos.io
-DAOS_VERSION="2.2.0"
-DAOS_PACKAGES_REPO_FILE="CentOS7/packages/x86_64/daos_packages.repo"
-```
-
-To install DAOS v2.2.0 on Rocky 8
+To install DAOS v2.4.0 on Rocky 8

-```bash
-DAOS_REPO_BASE_URL=https://packages.daos.io
-DAOS_VERSION="2.2.0"
-DAOS_PACKAGES_REPO_FILE="EL8/packages/x86_64/daos_packages.repo"
-```
+```bash
+DAOS_REPO_BASE_URL=https://packages.daos.io
+DAOS_VERSION="2.4.0"
+DAOS_PACKAGES_REPO_FILE="EL8/packages/x86_64/daos_packages.repo"
+```
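The three variables above are combined into the URL of the `.repo` file that gets installed. The sketch below shows how such a URL might be composed; the `v<major.minor>` path component is an assumption here, so check packages.daos.io for the actual layout:

```bash
# Illustrative only: compose a candidate repo-file URL from the three variables.
# The exact path layout on packages.daos.io may differ from this guess.
DAOS_REPO_BASE_URL="https://packages.daos.io"
DAOS_VERSION="2.4.0"
DAOS_PACKAGES_REPO_FILE="EL8/packages/x86_64/daos_packages.repo"
# Strip the patch component ("2.4.0" -> "2.4") for the versioned path segment.
REPO_URL="${DAOS_REPO_BASE_URL}/v${DAOS_VERSION%.*}/${DAOS_PACKAGES_REPO_FILE}"
echo "${REPO_URL}"   # https://packages.daos.io/v2.4/EL8/packages/x86_64/daos_packages.repo
```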

## Building only the DAOS Server or the DAOS Client image

-If you do not want to build one of the images, you must set the appropriate environment variable.

-For example,

To build only the DAOS Server image

```bash
@@ -138,7 +155,8 @@ export DAOS_BUILD_SERVER_IMAGE="false" # Do not run the job to build the DAOS se

## Custom image builds

-To create images that do not use the default settings, export one or more of the environment variables listed above before running `build.sh`
+To create images that do not use the default settings, export one or more of the environment
+variables listed above before running `build.sh`

### Change the name of the image family

@@ -151,7 +169,8 @@ export DAOS_CLIENT_IMAGE_FAMILY="my-daos-client"

### Use a different source image

-For the source image, use the `rocky-linux-8-optimized-gcp` community image instead of the `hpc-rocky-linux-8` image.
+For the source image, use the `rocky-linux-8-optimized-gcp` community image instead of the
+`hpc-rocky-linux-8` image.

```bash
cd images
@@ -204,6 +223,12 @@ export GCP_USE_CLOUDBUILD="false" # Do not run packer in Cloud Build
./build.sh
```

-When running `build.sh` this way, all project configuration steps are skipped.
+When running `build.sh` this way, all GCP project configuration steps (setting permissions) are skipped.

+When `GCP_USE_CLOUDBUILD="true"` the `build.sh` will check your GCP project to ensure the default
+service account has the proper permissions needed for the Cloud Build job to run packer and create
+the images in your project.

-When `GCP_USE_CLOUDBUILD="true"` the `build.sh` will check your GCP project to ensure the default service account has the proper permissions needed for the Cloud Build job to run packer and create the images in your project. Setting `GCP_USE_CLOUDBUILD="true"` will skip the project configuration steps. In this case, it's up to you to make sure the proper permissions are configured for you to run packer locally to build the images.
+Setting `GCP_USE_CLOUDBUILD="false"` will skip the project configuration steps. In this case, it's
+up to you to make sure the proper permissions are configured for you to run packer locally to build
+the images.
3 changes: 2 additions & 1 deletion images/ansible_playbooks/daos.yml
@@ -22,7 +22,7 @@

vars:
daos_install_type: "all"
-daos_version: "2.2.0"
+daos_version: "2.4.0"
daos_repo_base_url: "https://packages.daos.io"
daos_packages_repo_file: "EL8/packages/x86_64/daos_packages.repo"
daos_packages:
@@ -33,6 +33,7 @@
packages:
- clustershell
- curl
+- fuse
- git
- jq
- patch
2 changes: 1 addition & 1 deletion images/build.sh
@@ -16,7 +16,7 @@
set -eo pipefail
trap 'echo "Unexpected and unchecked error. Exiting."' ERR

-: "${DAOS_VERSION:="2.2.0"}"
+: "${DAOS_VERSION:="2.4.0"}"
: "${DAOS_REPO_BASE_URL:="https://packages.daos.io"}"
: "${DAOS_PACKAGES_REPO_FILE:="EL8/packages/x86_64/daos_packages.repo"}"
: "${GCP_PROJECT:=}"
4 changes: 2 additions & 2 deletions images/daos.pkr.hcl
@@ -134,9 +134,9 @@ build {
provisioner "shell" {
execute_command = "echo 'packer' | sudo -S env {{ .Vars }} {{ .Path }}"
inline = [
+"dnf clean packages",
"dnf -y install epel-release",
-"dnf -y install python3.11 python3.11-pip ansible-core",
-"alternatives --set python3 /usr/bin/python3.11"
+"dnf -y install ansible-core"
]
}

