From 7d7e3b78a3c265ab3c57eeff43af56f509907998 Mon Sep 17 00:00:00 2001
From: Woosuk Kwon
Date: Thu, 21 Sep 2023 18:26:47 -0700
Subject: [PATCH] Use `--ipc=host` in docker run for distributed inference
 (#1125)

---
 docs/source/getting_started/installation.rst | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/docs/source/getting_started/installation.rst b/docs/source/getting_started/installation.rst
index 7611bc2be228f..1105d050df69a 100644
--- a/docs/source/getting_started/installation.rst
+++ b/docs/source/getting_started/installation.rst
@@ -46,4 +46,5 @@ You can also build and install vLLM from source:
 .. code-block:: console

     $ # Pull the Docker image with CUDA 11.8.
-    $ docker run --gpus all -it --rm --shm-size=8g nvcr.io/nvidia/pytorch:22.12-py3
+    $ # Use `--ipc=host` to make sure the shared memory is large enough.
+    $ docker run --gpus all -it --rm --ipc=host nvcr.io/nvidia/pytorch:22.12-py3
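Editor's note on the change (not part of the patch): Docker caps `/dev/shm` at 64 MB by default, which PyTorch's worker processes use for inter-process communication during distributed inference. The patch switches from an explicit `--shm-size` to `--ipc=host`, which shares the host's IPC namespace and removes the cap entirely. A minimal sketch of the two alternatives (image tag taken from the patch; the 16g size is an illustrative value, not from the source):

```shell
# Option used in this patch: share the host IPC namespace so the
# container's shared memory is not limited to Docker's 64 MB default.
docker run --gpus all -it --rm --ipc=host nvcr.io/nvidia/pytorch:22.12-py3

# Alternative: keep an isolated IPC namespace but raise the /dev/shm
# limit explicitly (the size needed depends on the model and workload).
docker run --gpus all -it --rm --shm-size=16g nvcr.io/nvidia/pytorch:22.12-py3
```

`--ipc=host` avoids having to guess an adequate `--shm-size` up front, at the cost of weaker isolation between the container and the host.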