From 49e9a528d5433ad8c2abfcc5ac326acd5a7e469f Mon Sep 17 00:00:00 2001
From: Zhanghao Wu
Date: Sun, 18 Aug 2024 21:18:59 +0000
Subject: [PATCH 1/8] Add instruction for SkyPilot

---
 README.md | 39 +++++++++++++++++++++++++++++++++++++++
 1 file changed, 39 insertions(+)

diff --git a/README.md b/README.md
index 5434bb25f62..bb7b3bd6abc 100644
--- a/README.md
+++ b/README.md
@@ -87,6 +87,45 @@ docker run --gpus all \
 1. Copy the [compose.yml](./docker/compose.yaml) to your local machine
 2. Execute the command `docker compose up -d` in your terminal.
 
+### Method 5: Run on Clouds with SkyPilot
+
+To deploy on any cloud or Kubernetes cluster, you can use [SkyPilot](https://github.com/skypilot-org/skypilot).
+
+1. Install SkyPilot and setup your cloud or Kubernetes cluster: see [SkyPilot's documentation](https://skypilot.readthedocs.io/en/latest/getting-started/installation.html).
+2. Deploy on your own infra with a single command and get the HTTP API endpoint:
+
+SkyPilot YAML: sglang_server.yaml
+
+```yaml
+# sglang_server.yaml
+envs:
+  HF_TOKEN: null
+
+resources:
+  image_id: docker:lmsysorg/sglang:latest
+  accelerators: A100
+  ports: 30000
+
+run: |
+  conda deactivate
+  python3 -m sglang.launch_server \
+    --model-path meta-llama/Meta-Llama-3.1-8B-Instruct \
+    --host 0.0.0.0 \
+    --port 30000
+```
+
+
+
+```bash
+# Deploy on any cloud or Kubernetes cluster, use --cloud to select a specific cloud provider.
+HF_TOKEN= sky launch -c sglang --env HF_TOKEN sglang_server.yaml
+
+# Get the HTTP API endpoint
+sky status --endpoint 30000 sglang
+```
+
+
+
 ### Common Notes
 - [FlashInfer](https://github.com/flashinfer-ai/flashinfer) is currently one of the dependencies that must be installed for SGLang. If you are using NVIDIA GPU devices below sm80, such as T4, you can't use SGLang for the time being. We expect to resolve this issue soon, so please stay tuned. If you encounter any FlashInfer-related issues on sm80+ devices (e.g., A100, L40S, H100), consider using Triton's kernel by `--disable-flashinfer --disable-flashinfer-sampling` and raise a issue.
 - If you only need to use the OpenAI backend, you can avoid installing other dependencies by using `pip install "sglang[openai]"`.

From 529ff8719fcbbe231ad3658dbb0802a900f4a9c0 Mon Sep 17 00:00:00 2001
From: Zhanghao Wu
Date: Sun, 18 Aug 2024 21:20:16 +0000
Subject: [PATCH 2/8] rename yaml

---
 README.md | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/README.md b/README.md
index bb7b3bd6abc..b94cd92da79 100644
--- a/README.md
+++ b/README.md
@@ -94,10 +94,10 @@ To deploy on any cloud or Kubernetes cluster, you can use [SkyPilot](https://git
 1. Install SkyPilot and setup your cloud or Kubernetes cluster: see [SkyPilot's documentation](https://skypilot.readthedocs.io/en/latest/getting-started/installation.html).
 2. Deploy on your own infra with a single command and get the HTTP API endpoint:
 
-SkyPilot YAML: sglang_server.yaml
+SkyPilot YAML: sglang.yaml
 
 ```yaml
-# sglang_server.yaml
+# sglang.yaml
 envs:
   HF_TOKEN: null
 
@@ -118,7 +118,7 @@ run: |
 
 ```bash
 # Deploy on any cloud or Kubernetes cluster, use --cloud to select a specific cloud provider.
-HF_TOKEN= sky launch -c sglang --env HF_TOKEN sglang_server.yaml
+HF_TOKEN= sky launch -c sglang --env HF_TOKEN sglang.yaml
 
 # Get the HTTP API endpoint
 sky status --endpoint 30000 sglang

From f60b03c402c5fc573b818722e9754ab8649eea59 Mon Sep 17 00:00:00 2001
From: Zhanghao Wu
Date: Sun, 18 Aug 2024 15:20:13 -0700
Subject: [PATCH 3/8] Update README.md

Co-authored-by: Zongheng Yang
---
 README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/README.md b/README.md
index b94cd92da79..11d7007c303 100644
--- a/README.md
+++ b/README.md
@@ -87,7 +87,7 @@ docker run --gpus all \
 1. Copy the [compose.yml](./docker/compose.yaml) to your local machine
 2. Execute the command `docker compose up -d` in your terminal.
 
-### Method 5: Run on Clouds with SkyPilot
+### Method 5: Run on Kubernetes or Cloud VMs with SkyPilot
 
 To deploy on any cloud or Kubernetes cluster, you can use [SkyPilot](https://github.com/skypilot-org/skypilot).
 

From e8af960efcee3d339d917d464b295fccfeb96b67 Mon Sep 17 00:00:00 2001
From: Zhanghao Wu
Date: Sun, 18 Aug 2024 15:20:37 -0700
Subject: [PATCH 4/8] Update README.md

Co-authored-by: Zongheng Yang
---
 README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/README.md b/README.md
index 11d7007c303..42ddf52beb1 100644
--- a/README.md
+++ b/README.md
@@ -91,7 +91,7 @@ docker run --gpus all \
 
 To deploy on any cloud or Kubernetes cluster, you can use [SkyPilot](https://github.com/skypilot-org/skypilot).
 
-1. Install SkyPilot and setup your cloud or Kubernetes cluster: see [SkyPilot's documentation](https://skypilot.readthedocs.io/en/latest/getting-started/installation.html).
+1. Install SkyPilot and set up cloud VM or Kubernetes cluster access: see [SkyPilot's documentation](https://skypilot.readthedocs.io/en/latest/getting-started/installation.html).
 2. Deploy on your own infra with a single command and get the HTTP API endpoint:
 
 SkyPilot YAML: sglang.yaml

From 0e8170b5188b7a421ded7d994473c3a0cc9777d7 Mon Sep 17 00:00:00 2001
From: Zhanghao Wu
Date: Sun, 18 Aug 2024 15:20:42 -0700
Subject: [PATCH 5/8] Update README.md

Co-authored-by: Zongheng Yang
---
 README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/README.md b/README.md
index 42ddf52beb1..e3a9056c325 100644
--- a/README.md
+++ b/README.md
@@ -117,7 +117,7 @@ run: |
 
 
 ```bash
-# Deploy on any cloud or Kubernetes cluster, use --cloud to select a specific cloud provider.
+# Deploy on any cloud or Kubernetes cluster. Use --cloud to select a specific cloud provider.
 HF_TOKEN= sky launch -c sglang --env HF_TOKEN sglang.yaml
 
 # Get the HTTP API endpoint

From f34dcb1e1e0c4e0f9ba420e45e3dd239e124d502 Mon Sep 17 00:00:00 2001
From: Zhanghao Wu
Date: Sun, 18 Aug 2024 22:24:00 +0000
Subject: [PATCH 6/8] update

---
 README.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/README.md b/README.md
index e3a9056c325..a01a9303c69 100644
--- a/README.md
+++ b/README.md
@@ -87,9 +87,9 @@ docker run --gpus all \
 1. Copy the [compose.yml](./docker/compose.yaml) to your local machine
 2. Execute the command `docker compose up -d` in your terminal.
 
-### Method 5: Run on Kubernetes or Cloud VMs with SkyPilot
+### Method 5: Run on Kubernetes or Clouds with SkyPilot
 
-To deploy on any cloud or Kubernetes cluster, you can use [SkyPilot](https://github.com/skypilot-org/skypilot).
+To deploy on Kubernetes or clouds, you can use [SkyPilot](https://github.com/skypilot-org/skypilot).
 
 1. Install SkyPilot and set up cloud VM or Kubernetes cluster access: see [SkyPilot's documentation](https://skypilot.readthedocs.io/en/latest/getting-started/installation.html).
 2. Deploy on your own infra with a single command and get the HTTP API endpoint:

From 3875e011fec79798cd9af159cefc1a589362f986 Mon Sep 17 00:00:00 2001
From: Zhanghao Wu
Date: Sun, 18 Aug 2024 22:32:13 +0000
Subject: [PATCH 7/8] update

---
 README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/README.md b/README.md
index a01a9303c69..563093247ad 100644
--- a/README.md
+++ b/README.md
@@ -91,7 +91,7 @@ docker run --gpus all \
 
 To deploy on Kubernetes or clouds, you can use [SkyPilot](https://github.com/skypilot-org/skypilot).
 
-1. Install SkyPilot and set up cloud VM or Kubernetes cluster access: see [SkyPilot's documentation](https://skypilot.readthedocs.io/en/latest/getting-started/installation.html).
+1. Install SkyPilot and set up clouds or Kubernetes cluster access: see [SkyPilot's documentation](https://skypilot.readthedocs.io/en/latest/getting-started/installation.html).
 2. Deploy on your own infra with a single command and get the HTTP API endpoint:
 
 SkyPilot YAML: sglang.yaml

From 379e54b2defd577781ad8e51c6ede226f23bcf15 Mon Sep 17 00:00:00 2001
From: Zhanghao Wu
Date: Sun, 18 Aug 2024 22:33:37 +0000
Subject: [PATCH 8/8] update

---
 README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/README.md b/README.md
index 563093247ad..1810ca2f19f 100644
--- a/README.md
+++ b/README.md
@@ -91,7 +91,7 @@ docker run --gpus all \
 
 To deploy on Kubernetes or clouds, you can use [SkyPilot](https://github.com/skypilot-org/skypilot).
 
-1. Install SkyPilot and set up clouds or Kubernetes cluster access: see [SkyPilot's documentation](https://skypilot.readthedocs.io/en/latest/getting-started/installation.html).
+1. Install SkyPilot and set up Kubernetes cluster or cloud access: see [SkyPilot's documentation](https://skypilot.readthedocs.io/en/latest/getting-started/installation.html).
 2. Deploy on your own infra with a single command and get the HTTP API endpoint:
 
 SkyPilot YAML: sglang.yaml
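
Reviewer note, not part of the patch series: once `sky status --endpoint 30000 sglang` prints the endpoint, the deployment can be smoke-tested with a small client. The sketch below is a minimal example under stated assumptions. It assumes the endpoint is printed as `HOST:PORT`, and it targets SGLang's native `/generate` route with a `text` plus `sampling_params` payload. The helper names (`normalize_endpoint`, `build_generate_request`, `generate`) are hypothetical and exist only for illustration.

```python
import json
import urllib.request


def normalize_endpoint(endpoint: str) -> str:
    """Turn an endpoint string into a base URL.

    Assumption: `sky status --endpoint` prints something like 'HOST:PORT';
    a value that already carries an http(s) scheme is left unchanged.
    """
    endpoint = endpoint.strip().rstrip("/")
    if not endpoint.startswith(("http://", "https://")):
        endpoint = "http://" + endpoint
    return endpoint


def build_generate_request(
    endpoint: str, prompt: str, max_new_tokens: int = 32
) -> urllib.request.Request:
    """Build (but do not send) a POST request for the /generate route."""
    payload = {
        "text": prompt,
        "sampling_params": {"max_new_tokens": max_new_tokens, "temperature": 0},
    }
    return urllib.request.Request(
        url=normalize_endpoint(endpoint) + "/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(endpoint: str, prompt: str, timeout: float = 60.0) -> dict:
    """Send the request and return the decoded JSON response."""
    request = build_generate_request(endpoint, prompt)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `generate(endpoint, "The capital of France is")` with the value printed by `sky status` would return the decoded JSON body. The exact response fields depend on the SGLang version, so inspect the result rather than relying on a fixed schema.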