OpenShift AI Hackathon

This repository contains a basic Next.js frontend, designed to be exported as a static site and served via GitHub Pages, for running an OpenShift AI hackathon.

Below are the instructions for manually setting up an environment to run the hackathon.

Pre-requisites

This guide assumes you have the following command-line tools installed locally:

oc version && rosa version && aws --version
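
The one-line check above stops at the first tool that fails. As a minimal sketch, a small helper can instead report every missing tool in one pass. The function name `check_tools` is hypothetical and not part of this repository.

```shell
# Hypothetical helper (not part of this repository): report which of the
# given command-line tools cannot be found on PATH.
check_tools() {
  local missing=0
  local tool
  for tool in "$@"; do
    if ! command -v "${tool}" >/dev/null 2>&1; then
      echo "missing: ${tool}" >&2
      missing=1
    fi
  done
  # Non-zero exit status if at least one tool was not found
  return "${missing}"
}
```

For this guide it would be called as `check_tools oc rosa aws`.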

Cluster provisioning

Each team participating in the hackathon will require a Red Hat OpenShift on AWS (ROSA) cluster, which we will provision via the Red Hat Demo System. When requesting the environments we enable the workshop user interface with:

  • The title `OpenShift AI Hackathon`
  • Number of instances set to `12`
  • AWS Region `eu-west-1`

Cluster setup

For each cluster provisioned for the hackathon, the following steps need to be performed:

Log in to cluster and rosa cli

Before we begin, let's ensure our command-line tools are authenticated. For `rosa` you'll need a token from the ROSA console.

oc login --username "cluster-admin" --password "${PASSWORD}" <api-route>

rosa login --token "${ROSA_TOKEN}"

aws configure
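
After logging in, it can help to confirm that all three tools hold valid credentials before moving on. The sketch below assumes the standard identity commands `oc whoami`, `rosa whoami`, and `aws sts get-caller-identity`. The wrapper name `verify_logins` is hypothetical.

```shell
# Hypothetical helper: confirm each CLI holds valid credentials before
# continuing with the cluster setup.
verify_logins() {
  oc whoami >/dev/null || { echo "oc is not logged in" >&2; return 1; }
  rosa whoami >/dev/null || { echo "rosa is not logged in" >&2; return 1; }
  aws sts get-caller-identity >/dev/null || { echo "aws credentials are not valid" >&2; return 1; }
  echo "oc, rosa and aws are authenticated"
}
```

Running `verify_logins` once per cluster catches an expired token early, before later steps fail partway through.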

Create GPU machine pool

Our first task is to ensure each cluster has a GPU `MachineSet`. We can follow the instructions at https://cloud.redhat.com/experts/rosa/gpu to complete this.

# Define parameters for machineset
export GPU_INSTANCE_TYPE='g5.8xlarge'
export CLUSTER_NAME=rosa-jfccs
export MACHINE_POOL_NAME=nvidia-gpu-pool
export MACHINE_POOL_REPLICA_COUNT=1

# Create the machineset with rosa cli
rosa create machinepool \
  --cluster="${CLUSTER_NAME}" \
  --name="${MACHINE_POOL_NAME}" \
  --replicas="${MACHINE_POOL_REPLICA_COUNT}" \
  --instance-type="${GPU_INSTANCE_TYPE}"

# Wait for the machineset to report the requested number of ready replicas
oc wait --for=jsonpath='{.status.readyReplicas}'="${MACHINE_POOL_REPLICA_COUNT}" machineset \
  --selector hive.openshift.io/machine-pool="${MACHINE_POOL_NAME}" \
  --namespace openshift-machine-api \
  --timeout=600s
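
Once the wait completes, a quick way to confirm the GPU node has actually joined the cluster is to count the Ready nodes of the chosen instance type. This is a sketch only: it assumes the nodes carry the well-known `node.kubernetes.io/instance-type` label, and the function name `count_ready_gpu_nodes` is hypothetical.

```shell
# Hypothetical helper: count Ready nodes of the given instance type.
# Assumes nodes carry the well-known `node.kubernetes.io/instance-type` label.
count_ready_gpu_nodes() {
  local instance_type="$1"
  # With --no-headers the second column of `oc get nodes` is the node status
  oc get nodes \
    --selector "node.kubernetes.io/instance-type=${instance_type}" \
    --no-headers 2>/dev/null \
    | awk '$2 == "Ready" { n++ } END { print n + 0 }'
}
```

Calling `count_ready_gpu_nodes "${GPU_INSTANCE_TYPE}"` should print the same number as `MACHINE_POOL_REPLICA_COUNT` once the pool is healthy.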

Install and configure minio via oc

Once the cluster GPU machine pool has been created, we need to deploy MinIO so we can create storage buckets and pre-seed models on the cluster for hackathon participants to consume.

# Deploy minio
oc new-project minio && oc --namespace minio apply --filename setup/minio-setup.yaml

# Wait for minio to come up
oc --namespace minio rollout status deployment/minio --watch

With MinIO deployed, we need to create a bucket and upload some content to it to pre-seed a model. We can do this by remotely executing MinIO CLI (`mc`) commands within the `minio` pod.

# Retrieve the running minio pod
pod=$(oc get pods --namespace "minio" --output name)

# Retrieve the minio credentials
minio_user=$(oc --namespace "minio" get secret "minio-secret" -o jsonpath='{.data.minio_root_user}' | base64 --decode)
minio_pass=$(oc --namespace "minio" get secret "minio-secret" -o jsonpath='{.data.minio_root_password}' | base64 --decode)

# Configure the minio cli alias
oc --namespace "minio" exec "${pod}" -- mc alias set local http://localhost:9000 "${minio_user}" "${minio_pass}"

# Make the models bucket
oc --namespace "minio" exec "${pod}" -- mc mb "local/models"
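
The commands above assume exactly one pod in the namespace and a bucket that does not exist yet. As a hedged sketch, the two helpers below make the step safe to re-run: one selects only a Running pod, and the other uses the `--ignore-existing` flag of `mc mb`. The function names `running_minio_pod` and `ensure_bucket` are hypothetical.

```shell
# Hypothetical helper: print the name of the first Running pod in the
# minio namespace, ignoring pods that are still starting or terminating.
running_minio_pod() {
  oc get pods --namespace "minio" \
    --field-selector status.phase=Running \
    --output name | head -n 1
}

# Hypothetical helper: create the bucket only if it is missing, so the
# setup can be repeated without failing on an existing bucket.
ensure_bucket() {
  local pod="$1"
  local bucket="$2"
  oc --namespace "minio" exec "${pod}" -- mc mb --ignore-existing "local/${bucket}"
}
```

A typical call would be `ensure_bucket "$(running_minio_pod)" models`, run after the `mc alias set` step.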

Populate minio bucket with model

With the bucket created, we need to push some model content to it, which we can do with `curl` and `mc`. We start by downloading the model files into the pod.

pod=$(oc get pods --namespace "minio" --output name)
oc --namespace "minio" exec "${pod}" -- mkdir -p /tmp/model

# Download each model file from Hugging Face into the pod.
# The URL is quoted so the shell does not treat `?` as a glob character,
# and -L follows the redirects used for the larger files.
base_url='https://huggingface.co/instructlab/granite-7b-lab/resolve/main'
for file in \
  added_tokens.json config.json generation_config.json \
  model-00001-of-00003.safetensors model-00002-of-00003.safetensors model-00003-of-00003.safetensors \
  model.safetensors.index.json special_tokens_map.json \
  tokenizer.json tokenizer.model tokenizer_config.json; do
  oc --namespace "minio" exec "${pod}" -- curl -L "${base_url}/${file}?download=true" -o "/tmp/model/${file}"
done

# Retrieve the pod name
pod=$(oc get pods --namespace "minio" --output name)

# Upload files via minio cli
oc --namespace "minio" exec "${pod}" -- mc cp --recursive /tmp/model local/models/granite-7b-lab
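
After the upload it is worth checking that every model file reached the bucket. The sketch below reads an `mc ls --recursive` listing on standard input and prints any expected file that is absent. It assumes the object name is the last whitespace-separated field on each listing line, and it compares base names only. The function name `missing_files` is hypothetical.

```shell
# Hypothetical helper: read an `mc ls --recursive` listing on stdin and print
# every expected file name (given as arguments) that does not appear in it.
# Assumes the object path is the last field of each listing line.
missing_files() {
  local listing
  local file
  # Keep only the base name of each listed object
  listing=$(awk '{ name = $NF; sub(/.*\//, "", name); print name }')
  for file in "$@"; do
    printf '%s\n' "${listing}" | grep -qxF -- "${file}" || echo "${file}"
  done
}
```

For example, `oc --namespace "minio" exec "${pod}" -- mc ls --recursive local/models/granite-7b-lab | missing_files config.json tokenizer.json tokenizer.model` prints nothing when all three files are present.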