Use `--ipc=host` in `docker run` for distributed inference #1125

WoosukKwon · 2023-09-21T07:56:33Z

When the shared memory allocated to a Docker container is too small and tensor parallelism (TP) is used, vLLM can hang or perform poorly. Users can manually increase the limit with `--shm-size`, but `--ipc=host` is a simpler solution.
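For illustration, here is a minimal sketch of the two options described above. It only prints the `docker run` commands rather than executing them. The image name `my-vllm-image`, the model, the TP size, and the `8g` value are assumptions chosen for the example and do not come from this PR.

```shell
#!/bin/sh
# Hypothetical placeholders (not from the PR); substitute your own values.
IMAGE="my-vllm-image"        # assumed: an image with vLLM installed
MODEL="facebook/opt-125m"    # assumed example model
TP_SIZE=2                    # tensor-parallel degree (number of GPUs)

# Print the docker command for a given IPC/shared-memory flag.
# The command is only printed, so nothing is launched.
build_cmd() {
    printf '%s\n' "docker run --gpus all $1 $IMAGE python -m vllm.entrypoints.api_server --model $MODEL --tensor-parallel-size $TP_SIZE"
}

# Option 1: share the host's IPC namespace (the approach this PR documents).
build_cmd "--ipc=host"

# Option 2: keep a private IPC namespace but enlarge /dev/shm.
# The size 8g is an illustrative guess, not a recommended value.
build_cmd "--shm-size=8g"
```

`--ipc=host` avoids having to pick a size at all, which is likely why it is the simpler choice here. The trade-off is that the container shares the host's IPC namespace and is less isolated.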

Reference: pytorch/pytorch#1158 (comment)

zhuohan123

LGTM! Thanks for the fix!

Minor fix on installation

71a6327

WoosukKwon requested a review from zhuohan123 September 21, 2023 07:56

zhuohan123 approved these changes Sep 21, 2023

WoosukKwon merged commit 7d7e3b7 into main Sep 22, 2023

WoosukKwon deleted the dist-docs branch September 22, 2023 01:26

hongxiayang pushed a commit to hongxiayang/vllm that referenced this pull request Feb 13, 2024

Use --ipc=host in docker run for distributed inference (vllm-projec…

4a0d53a

…t#1125)

sjchoi1 pushed a commit to casys-kaist-internal/vllm that referenced this pull request May 7, 2024

Use --ipc=host in docker run for distributed inference (vllm-projec…

ed262be

…t#1125)
