Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Input_embeds support #2052

Merged
merged 15 commits into from
Nov 26, 2024
Merged

Input_embeds support #2052

merged 15 commits into from
Nov 26, 2024

Conversation

RinRin-32
Copy link
Contributor

@RinRin-32 RinRin-32 commented Nov 16, 2024

Motivation

Adding input_embeds support for sglang, motivated and inspired from
#745

Modifications

Changes in /manager to support new io_struct
Changes in /model_executor to support new io_struct
test cases under test/srt to demonstrate usage

In this implementation, curl request has to be made to use input_embeds.
This implementation DOES NOT have ARGS made to compliment input_embeds (I don't know where to make this edit),

When serving, to use input embeds please add
--disable-radix tag
i.e.
python3 -m sglang.launch_server --model-path Qwen/Qwen2.5-0.5B --port 30000 --disable-radix

test_input_embeddings.py is built and tested on Qwen2.5-0.5B, please edit the model name according to the model that you'd like to test

input_embeddings.json is my test case embeddings for Qwen2.5-0.5B model and comparison_response.txt is a comparison between text based input and input_embeds input.

Checklist

  • Format your code according to the Contributor Guide.
  • Add unit tests as outlined in the Contributor Guide.
  • Update documentation as needed, including docstrings or example tutorials.

@merrymercy
Copy link
Contributor

Thanks for the contribution. There is a related PR recently. Can you take a review on that? #2082

@RinRin-32
Copy link
Contributor Author

Thanks for the contribution. There is a related PR recently. Can you take a review on that? #2082

It seems that this PR update the documents and did more alteration with batch in mind. If their implementation doesn't require radix disabled then it'd be the perfect one among the 2 pr. My changes follow the guide in that open issue and is pretty minimal, with test to try out the feature.

Also can you please launch the checks again, I fixed some conditions since the last check you triggered, thank you.

Copy link
Contributor

@merrymercy merrymercy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the contribution. I feel your solution is better than the other one. We can try to merge yours! I left a few comments.

python/sglang/srt/managers/schedule_batch.py Outdated Show resolved Hide resolved
python/sglang/srt/managers/schedule_batch.py Outdated Show resolved Hide resolved
python/sglang/srt/managers/schedule_batch.py Outdated Show resolved Hide resolved
python/sglang/srt/managers/scheduler.py Outdated Show resolved Hide resolved
python/sglang/srt/model_executor/forward_batch_info.py Outdated Show resolved Hide resolved
test/srt/test_input_embeddings.py Outdated Show resolved Hide resolved
@merrymercy
Copy link
Contributor

  1. Can you fix the conflicts?
  2. I added you as a co-author in this merged PR so you are not a first-time contributor anymore. The CI will be triggered automatically for any of your future commits. [minor] Clean up unused imports #2122

@RinRin-32
Copy link
Contributor Author

I apologize for the slow reply.
Will get to it as soon as I can. Expect it done in a week!

@merrymercy merrymercy merged commit 1aea19f into sgl-project:main Nov 26, 2024
13 of 14 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants