
[Ray] Integration compiled DAG off by default #2471

Merged: 9 commits into vllm-project:main on Feb 8, 2024

Conversation

rkooo567 (Collaborator) commented Jan 17, 2024

This PR adds experimental compiled DAG API support for vLLM's NCCL optimization. It improves the benchmark_latency.py benchmark and the 70B × 8 A100 GPU throughput benchmark by around 5~7%.

The feature is off by default and can be enabled with an environment variable (VLLM_USE_RAY_COMPILED_DAG=1). There are some rough edges we are still fixing at Anyscale; once those are resolved and we are more confident, we can turn it on by default.
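For anyone who wants to try it: enabling the feature only requires setting the environment variable before the engine (and its Ray workers) starts. A minimal sketch; the model name, parallelism, and prompt below are illustrative, not taken from this PR:

```python
import os

# Opt in to the experimental Ray compiled-DAG execution path (off by default).
# Set this before the engine is constructed so the Ray workers pick it up.
os.environ["VLLM_USE_RAY_COMPILED_DAG"] = "1"

from vllm import LLM, SamplingParams

# Any tensor-parallel deployment that uses Ray exercises the new path;
# the 70B-on-8-GPU setup mirrors the benchmark quoted above.
llm = LLM(model="meta-llama/Llama-2-70b-hf", tensor_parallel_size=8)
outputs = llm.generate(["Hello, my name is"],
                       SamplingParams(max_tokens=32))
print(outputs[0].outputs[0].text)
```

Under the hood the integration leans on Ray's experimental DAG API: the driver binds one actor-method call per worker into a static graph and compiles it once, so each decoding step reuses the same task graph and communication channels instead of issuing fresh remote calls. A rough sketch of that pattern (hedged: `EchoWorker`/`execute_model` are placeholder names, not vLLM's actual worker interface, and this experimental Ray API has changed across releases):

```python
import ray
from ray.dag import InputNode, MultiOutputNode

@ray.remote
class EchoWorker:  # placeholder actor; vLLM's real workers run a model shard here
    def execute_model(self, data):
        return data

workers = [EchoWorker.remote() for _ in range(2)]

# Bind one method call per worker into a static DAG and compile it once.
with InputNode() as input_data:
    dag = MultiOutputNode([w.execute_model.bind(input_data) for w in workers])
compiled_dag = dag.experimental_compile()

# Each execute() reuses the compiled graph rather than scheduling new tasks.
result = compiled_dag.execute("step input")
print(ray.get(result))  # in recent Ray releases, execute() returns a resolvable ref
```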

rkooo567 reopened this Jan 19, 2024
rkooo567 changed the title from "[WIP] [Ray] Integration compiled DAG off by default" to "[Ray] Integration compiled DAG off by default" Jan 24, 2024
rkooo567 (Collaborator, Author) commented:
cc @simon-mo can you take a look at this PR?

Resolved review threads:
- vllm/engine/llm_engine.py
- vllm/engine/ray_utils.py (outdated)
- vllm/engine/llm_engine.py (outdated)
- vllm/engine/llm_engine.py (outdated)
- vllm/engine/llm_engine.py
rkooo567 (Collaborator, Author) commented Feb 8, 2024

cc @simon-mo to merge!

simon-mo merged commit 65b89d1 into vllm-project:main on Feb 8, 2024
17 checks passed
yhu422 added a commit to yhu422/vllm that referenced this pull request Feb 13, 2024. Its commit message lists:

- [ROCm] Fix build problem resulted from previous commit related to FP8 kv-cache support (vllm-project#2790)
- Add documentation on how to do incremental builds (vllm-project#2796)
- [Ray] Integration compiled DAG off by default (vllm-project#2471)
- Disable custom all reduce by default (vllm-project#2808)
- add usage context
- removed usage_context from Engine_args
- Move IO to another process
- added http request
- [ROCm] support Radeon™ 7900 series (gfx1100) without using flash-attention (vllm-project#2768)
- Add documentation section about LoRA (vllm-project#2834)
- Refactor 2 awq gemm kernels into m16nXk32 (vllm-project#2723)
- Co-authored-by: Chunan Zeng <chunanzeng@Chunans-Air.attlocal.net>
- Added additional arg for from_engine_args
- comments
alexm-redhat pushed a commit to neuralmagic/nm-vllm that referenced this pull request Feb 13, 2024
jvmncs pushed a commit to jvmncs/vllm that referenced this pull request Feb 14, 2024
xjpang pushed a commit to xjpang/vllm that referenced this pull request Feb 20, 2024
xjpang pushed a commit to xjpang/vllm that referenced this pull request Feb 22, 2024
xjpang pushed a commit to xjpang/vllm that referenced this pull request Mar 4, 2024