[Bug] Running OpenGVLab/InternVL2-2B-AWQ fails with KeyError: 'language_model.model.layers.0.feed_forward.w1.weight' #2168
Comments
What is the version of lmdeploy? @jianliao The latest lmdeploy can run the model with the default backend, turbomind.
@AllentDan I upgraded to the latest version (0.5.2.post1), but I am still encountering the same error with the same serve command.
Can you try adding --model-format awq?
@AllentDan @lvhan028 The issue has been resolved after applying the --model-format awq option. Thanks, bro.
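For anyone else hitting this, the full corrected command would presumably be the reproduction command plus the flag confirmed above (a sketch combining the two, not verified on every lmdeploy version):
> lmdeploy serve api_server OpenGVLab/InternVL2-2B-AWQ --model-name InternVL2-2B-AWQ --model-format awq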
I ran the AWQ-quantized model and hit the same problem. My script to start inference looks like this; how should I modify it?
self.pipe = pipeline(model_path, backend_config=TurbomindEngineConfig(
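The same fix likely applies to the Python API: TurbomindEngineConfig exposes a model_format field, so passing model_format='awq' should mirror the --model-format awq CLI flag. A minimal sketch, assuming model_path points at the AWQ checkpoint (e.g. OpenGVLab/InternVL2-2B-AWQ):

from lmdeploy import pipeline, TurbomindEngineConfig

# Tell TurboMind the weights are AWQ-quantized so it maps the
# checkpoint keys correctly instead of raising a KeyError.
pipe = pipeline(
    model_path,
    backend_config=TurbomindEngineConfig(model_format='awq'),
)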
Checklist
Describe the bug
> lmdeploy serve api_server OpenGVLab/InternVL2-2B-AWQ --model-name InternVL2-2B-AWQ
If I switch the backend, the model runs, but it emits a large volume of logs; see the attached bug.log for details.
Reproduction
> lmdeploy serve api_server OpenGVLab/InternVL2-2B-AWQ --model-name InternVL2-2B-AWQ
or
> lmdeploy serve api_server OpenGVLab/InternVL2-2B-AWQ --model-name InternVL2-2B-AWQ --backend pytorch
Environment
Error traceback