
[Bug] Running OpenGVLab/InternVL2-2B-AWQ fails with KeyError: 'language_model.model.layers.0.feed_forward.w1.weight' #2168

Closed
jianliao opened this issue Jul 29, 2024 · 7 comments

Comments

@jianliao

Checklist

  • 1. I have searched related issues but cannot get the expected help.
  • 2. The bug has not been fixed in the latest version.
  • 3. Please note that if the bug-related issue you submitted lacks corresponding environment info and a minimal reproducible demo, it will be challenging for us to reproduce and resolve the issue, reducing the likelihood of receiving feedback.

Describe the bug

> lmdeploy serve api_server OpenGVLab/InternVL2-2B-AWQ --model-name InternVL2-2B-AWQ

KeyError: 'language_model.model.layers.0.feed_forward.w1.weight'

If I switch the backend, the model can run, but it prints a huge amount of log output; see the attached [bug.log](https://github.com/user-attachments/files/16405853/bug.log) for details.

Reproduction

> lmdeploy serve api_server OpenGVLab/InternVL2-2B-AWQ --model-name InternVL2-2B-AWQ

or

lmdeploy serve api_server OpenGVLab/InternVL2-2B-AWQ --model-name InternVL2-2B-AWQ --backend pytorch

Environment

OS: Ubuntu 22.04
Python: 3.12
Model: OpenGVLab/InternVL2-2B-AWQ

Error traceback

Traceback (most recent call last):
  File "/home/jianliao/anaconda3/envs/lmdeploy/bin/lmdeploy", line 8, in <module>
    sys.exit(run())
             ^^^^^
  File "/home/jianliao/anaconda3/envs/lmdeploy/lib/python3.12/site-packages/lmdeploy/cli/entrypoint.py", line 36, in run
    args.run(args)
  File "/home/jianliao/anaconda3/envs/lmdeploy/lib/python3.12/site-packages/lmdeploy/cli/serve.py", line 298, in api_server
    run_api_server(args.model_path,
  File "/home/jianliao/anaconda3/envs/lmdeploy/lib/python3.12/site-packages/lmdeploy/serve/openai/api_server.py", line 1285, in serve
    VariableInterface.async_engine = pipeline_class(
                                     ^^^^^^^^^^^^^^^
  File "/home/jianliao/anaconda3/envs/lmdeploy/lib/python3.12/site-packages/lmdeploy/serve/vl_async_engine.py", line 24, in __init__
    super().__init__(model_path, **kwargs)
  File "/home/jianliao/anaconda3/envs/lmdeploy/lib/python3.12/site-packages/lmdeploy/serve/async_engine.py", line 190, in __init__
    self._build_turbomind(model_path=model_path,
  File "/home/jianliao/anaconda3/envs/lmdeploy/lib/python3.12/site-packages/lmdeploy/serve/async_engine.py", line 235, in _build_turbomind
    self.engine = tm.TurboMind.from_pretrained(
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/jianliao/anaconda3/envs/lmdeploy/lib/python3.12/site-packages/lmdeploy/turbomind/turbomind.py", line 340, in from_pretrained
    return cls(model_path=pretrained_model_name_or_path,
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/jianliao/anaconda3/envs/lmdeploy/lib/python3.12/site-packages/lmdeploy/turbomind/turbomind.py", line 144, in __init__
    self.model_comm = self._from_hf(model_source=model_source,
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/jianliao/anaconda3/envs/lmdeploy/lib/python3.12/site-packages/lmdeploy/turbomind/turbomind.py", line 235, in _from_hf
    output_model = OUTPUT_MODELS.get(output_model_name)(
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/jianliao/anaconda3/envs/lmdeploy/lib/python3.12/site-packages/lmdeploy/turbomind/deploy/target_model/fp.py", line 26, in __init__
    super().__init__(input_model, cfg, to_file, out_dir)
  File "/home/jianliao/anaconda3/envs/lmdeploy/lib/python3.12/site-packages/lmdeploy/turbomind/deploy/target_model/base.py", line 172, in 
[bug.log](https://github.com/user-attachments/files/16405853/bug.log)
__init__
    self.cfg = self.get_config(cfg)
               ^^^^^^^^^^^^^^^^^^^^
  File "/home/jianliao/anaconda3/envs/lmdeploy/lib/python3.12/site-packages/lmdeploy/turbomind/deploy/target_model/fp.py", line 38, in get_config
    w1, _, _ = bin.ffn(i)
               ^^^^^^^^^^
  File "/home/jianliao/anaconda3/envs/lmdeploy/lib/python3.12/site-packages/lmdeploy/turbomind/deploy/source_model/internlm2.py", line 69, in ffn
    return self._ffn(i, 'weight')
           ^^^^^^^^^^^^^^^^^^^^^^
  File "/home/jianliao/anaconda3/envs/lmdeploy/lib/python3.12/site-packages/lmdeploy/turbomind/deploy/source_model/internlm2.py", line 62, in _ffn
    tensor = self.params[
             ^^^^^^^^^^^^
KeyError: 'language_model.model.layers.0.feed_forward.w1.weight'
@lvhan028
Collaborator

The related PRs #1984 and #1913 haven't been merged yet.

@AllentDan
Collaborator

What is your lmdeploy version? @jianliao The latest lmdeploy can run this model with the default turbomind backend.

@jianliao
Author

@AllentDan I upgraded to the latest version (0.5.2.post1), but I am still encountering the same error with the following command:
lmdeploy serve api_server OpenGVLab/InternVL2-2B-AWQ --model-name InternVL2-2B-AWQ.
Here are the details of my lmdeploy version:

(lmdeploy) jianliao@jianliao-ubuntu:~$ pip show lmdeploy
Name: lmdeploy
Version: 0.5.2.post1
Summary: A toolset for compressing, deploying and serving LLM
Home-page: 
Author: OpenMMLab
Author-email: openmmlab@gmail.com
License: 
Location: /home/jianliao/anaconda3/envs/lmdeploy/lib/python3.12/site-packages
Requires: accelerate, einops, fastapi, fire, mmengine-lite, numpy, nvidia-cublas-cu12, nvidia-cuda-runtime-cu12, nvidia-curand-cu12, nvidia-nccl-cu12, peft, pillow, protobuf, pydantic, pynvml, safetensors, sentencepiece, shortuuid, tiktoken, torch, torchvision, transformers, triton, uvicorn
Required-by: 

@AllentDan
Collaborator

Can you try adding --model-format awq?
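For reference, that would presumably be the original serve command with the flag appended:

> lmdeploy serve api_server OpenGVLab/InternVL2-2B-AWQ --model-name InternVL2-2B-AWQ --model-format awq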

@jianliao
Author

jianliao commented Aug 3, 2024

@AllentDan @lvhan028 The issue has been resolved after applying the --model-format awq option. Thanks Bro.

@jianliao jianliao closed this as completed Aug 3, 2024
@lzk9508

lzk9508 commented Dec 16, 2024

Can you try adding --model-format awq?

self.pipe = pipeline(model_path,
                     backend_config=TurbomindEngineConfig(
                         session_len=self.session_len,
                         cache_max_entry_count=self.cache_max_entry_count))
self.pipe.vl_encoder.model.config.max_dynamic_patch = self.max_dynamic_patch

I ran the AWQ-quantized model and hit the same problem. My script to start inference is shown above; how should I modify it?

@AllentDan
Collaborator

  1. Update to the latest lmdeploy; this issue has been resolved there.
  2. Set the model_format argument to 'awq' in TurbomindEngineConfig.
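A minimal sketch of the adjusted script, assuming the pipeline and TurbomindEngineConfig imports from lmdeploy and the same user-defined session settings as in the snippet above:

from lmdeploy import pipeline, TurbomindEngineConfig

# Pass model_format='awq' so turbomind loads the AWQ-quantized weights.
# session_len, cache_max_entry_count and max_dynamic_patch stand in for the
# same user-defined values used in the original script above.
pipe = pipeline(
    model_path,
    backend_config=TurbomindEngineConfig(
        model_format='awq',
        session_len=session_len,
        cache_max_entry_count=cache_max_entry_count))
pipe.vl_encoder.model.config.max_dynamic_patch = max_dynamic_patch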
