adapt vllm distributed module to sglang #2244
Conversation
QQ: why not use v0.6.4.post1?
I see pyproject.toml requires vllm>=0.6.3.post1.
This is to maintain compatibility: 0.6.3.post1 uses torch 2.4, and 0.6.4.post1 uses torch 2.5.1. The current main branch is compatible with both torch 2.4 and torch 2.5.1. For the distributed part, I suggest updating directly to v0.6.4.post1.
OK, it currently uses 0.6.4.post1 as the base.
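(Aside, not part of the PR: a minimal sketch of how the torch/vllm pairing discussed above could be surfaced at runtime. The version pairs come from the comment above; the check itself and its placement are assumptions.)

```python
# Hypothetical runtime guard for the compatibility constraint discussed above:
# vllm 0.6.3.post1 pairs with torch 2.4, vllm 0.6.4.post1 pairs with torch 2.5.1.
from importlib.metadata import version
from packaging.version import Version

torch_v = Version(version("torch"))
vllm_v = Version(version("vllm"))

if vllm_v >= Version("0.6.4.post1") and torch_v < Version("2.5"):
    raise RuntimeError(
        f"vllm {vllm_v} expects torch>=2.5.1, but torch {torch_v} is installed"
    )
```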
Overall LGTM, just left some comments. Thanks!
Review comment (resolved): python/sglang/srt/distributed/device_communicators/shm_broadcast.py
Review comment (resolved): python/sglang/srt/distributed/device_communicators/xpu_communicator.py
LGTM. I think we can merge it into remove-vllm-distributed and verify afterward. You can then create another PR from remove-vllm-distributed.
Review comment (resolved): python/sglang/srt/distributed/device_communicators/custom_all_reduce.py
I think the test failure is caused by PR #2266. @zhaochenyang20
LGTM. Since it is not currently in use, I think it's safe to merge. More detailed testing and verification can be conducted when integrating the custom all-reduce CUDA kernel into sgl-kernel. cc @yizhang2077
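(Aside, not part of the PR: a minimal sketch of the kind of verification mentioned above, comparing a custom all-reduce against the NCCL result. `custom_all_reduce` is a hypothetical entry point; the check assumes an already-initialized process group.)

```python
import torch
import torch.distributed as dist

def check_all_reduce(custom_all_reduce, rank: int) -> None:
    """Compare a custom all-reduce kernel against torch.distributed.all_reduce.

    Assumes dist.init_process_group(...) has already been called on every rank.
    `custom_all_reduce` is a placeholder for the future sgl-kernel entry point.
    """
    torch.cuda.set_device(rank)
    x = torch.randn(1024, device="cuda")
    ref = x.clone()
    dist.all_reduce(ref)            # NCCL reference result
    out = custom_all_reduce(x)      # hypothetical custom kernel
    torch.testing.assert_close(out, ref, rtol=1e-3, atol=1e-3)
```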
Motivation
Modifications
Move the vllm distributed module (v0.6.4.post1) into sglang; the current models are still using the vllm.distributed module.
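(Illustration only, not from the PR: what the adaptation amounts to for callers once models are switched over in a follow-up PR. The symbol name follows vllm's distributed API and is an assumption here.)

```python
# Before (models today): the distributed primitives come from vllm.
# from vllm.distributed import get_tensor_model_parallel_world_size

# After the follow-up PR: the same primitives are imported from the copy
# that this PR adds under python/sglang/srt/distributed/.
from sglang.srt.distributed import get_tensor_model_parallel_world_size

tp_size = get_tensor_model_parallel_world_size()
```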
Checklist
What to do Next