[Model] Added GLM-4 series hf format model support vllm==0.6.4 #10561
Conversation
👋 Hi! Thank you for contributing to the vLLM project. Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.
Some initial comments. Since the HF-format GLM is highly similar to Llama, I'd prefer to inherit from the existing Llama implementation to reduce code complexity. :)
Thank you for your reply. I have updated the code according to your suggestion. Can it be merged?
Comparing with the Llama implementation, I think we can remove more duplicated code and simplify the model implementation by inheriting from Llama.
Hey @sixsixcoder, I just refactored your GLM implementation to directly inherit from Llama. (Please correct me if there are any issues with the model implementation, because you are the model vendors and more familiar with GLM than me.) :)
Yes, you are right. Thank you for helping to reorganize the code. I hope it can be merged as soon as possible. This work will enable vLLM to support GLM-4 and GLM-Edge series text models.
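For readers following the thread, the inheritance pattern being discussed looks roughly like this. This is a minimal sketch, not the merged code: the import path matches vLLM's Llama module, but the class name and the docstring's details are illustrative.

```python
# Sketch of inheriting vLLM's Llama implementation for GLM-4 (HF format).
# The import path is vLLM's actual Llama module; the class body here is
# illustrative, not the merged implementation.
from vllm.model_executor.models.llama import LlamaForCausalLM


class GlmForCausalLM(LlamaForCausalLM):
    """GLM-4 (HF format) reuses the Llama decoder stack.

    Only config-driven differences (e.g. GLM's partial rotary
    embeddings) need special handling, so almost no code is duplicated.
    """
```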
@sixsixcoder Can you merge from the main branch, so that we can get some of the failing CI fixed?
Looks good. Have you tested this model?
Yes, it can work with both
I have merged from the main branch, but the CI is still failing. What should I do?
I think the basic models test failure is from this PR.
Oh, just noticed this model is introduced in
Overview
This update adds support for GLM-4 series text models (HF format) on vllm==0.6.4; it is distinct from the GLM-4v support in #9242.
The code is written against transformers==4.46.0.
FIX #5306
Usage
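The following is a minimal offline-inference sketch using vLLM's standard `LLM` API; it is not taken from the original PR description. The model ID `THUDM/glm-4-9b-chat-hf` is assumed to be one of the HF-format checkpoints this PR targets.

```python
from vllm import LLM, SamplingParams

# Assumed HF-format GLM-4 checkpoint; swap in a GLM-Edge model ID as needed.
llm = LLM(model="THUDM/glm-4-9b-chat-hf")

sampling_params = SamplingParams(temperature=0.7, max_tokens=128)
outputs = llm.generate(["Tell me about the GLM-4 model family."], sampling_params)

for output in outputs:
    print(output.outputs[0].text)
```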