Commit

Fix step numbering
Signed-off-by: Ryan O'Leary <ryanaoleary@google.com>
ryanaoleary committed Sep 25, 2024
1 parent c8d0c8b commit b1aadea
Showing 1 changed file with 4 additions and 4 deletions.
@@ -44,7 +44,7 @@ Push the vLLM image to your Artifact registry:
```sh
docker push LOCATION-docker.pkg.dev/PROJECT-ID/REPOSITORY/IMAGE
```
-## Step 3: Create a Kubernetes Secret for Hugging Face credentials
+## Step 4: Create a Kubernetes Secret for Hugging Face credentials

This example uses meta-llama/Meta-Llama-3-70B, a gated Hugging Face model that requires access to be granted before use. Create a Hugging Face account, if you don't already have one, and follow the steps on the [model page](https://huggingface.co/meta-llama/Meta-Llama-3-70B) to request access to the model. Save your Hugging Face token for the following steps.

@@ -61,7 +61,7 @@ kubectl create secret generic hf-secret \
--dry-run=client -o yaml | kubectl apply -f -
```

-## Step 4: Install the RayService CR
+## Step 5: Install the RayService CR

Create a file named vllm-ray-serve-tpu.yaml with the following contents:
```sh
@@ -169,7 +169,7 @@ Create the RayService CR:
kubectl apply -f vllm-ray-serve-tpu.yaml
```

-## Step 5: View the Serve deployment in the Ray Dashboard
+## Step 6: View the Serve deployment in the Ray Dashboard

Verify that you deployed the RayService CR and it's running:

@@ -185,7 +185,7 @@ Port-forward the Ray Dashboard from the Ray Serve service. To view the dashboard
kubectl port-forward svc/vllm-tpu-serve-svc 8265:8265 2>&1 >/dev/null &
```

-## Step 6: Send prompts to the model server
+## Step 7: Send prompts to the model server

Port-forward the model endpoint from Ray head:
```sh