
[Feature] turbomind mode support Llava-Qwen with new ImageEncoder #2497

Closed

deepindeed2022 opened this issue Sep 23, 2024 · 1 comment

@deepindeed2022 (Contributor)
Motivation

We would like to run offline inference of the llava-interleave-qwen-7b-hf model with turbomind. Are there any examples we could refer to? In particular: model parameter configuration, the model conversion/loading process, and points to watch out for when implementing model inference.

Related resources

Additional context

No response

@irexyc (Collaborator) commented Sep 24, 2024

To stay compatible with both the pytorch and turbomind backends, we split the vision model out of the VLM.

Supporting turbomind offline inference mainly involves three parts:

  1. the turbomind model-loading logic (for an LLM architecture that is already supported, usually remapping the weight keys is enough)
  2. the vision model loading and inference logic
  3. the chat template, mainly the question of how IMAGE_TOKEN gets inserted.
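As an illustration of step 1, here is a minimal sketch of what remapping weight keys can look like. The `language_model.` prefix and the function name are illustrative assumptions about the checkpoint layout, not LMDeploy's actual mapping tables.

```python
# Hypothetical sketch: map a llava-interleave-qwen HF checkpoint key to
# the plain Qwen key layout that an already-supported turbomind LLM
# loader would expect. Names are illustrative, not LMDeploy's real code.

def remap_llava_qwen_key(hf_key: str) -> str:
    """Strip the VLM wrapper prefix so existing Qwen loading applies."""
    prefix = "language_model."
    if hf_key.startswith(prefix):
        return hf_key[len(prefix):]
    # Keys outside the language model (e.g. vision tower) pass through
    # unchanged and are handled by the separate vision loading path.
    return hf_key

print(remap_llava_qwen_key(
    "language_model.model.layers.0.self_attn.q_proj.weight"))
# → model.layers.0.self_attn.q_proj.weight
```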

I suggest looking at how this PR does it; it covers essentially all of the parts above. If you run into problems, we can discuss further:
https://github.com/InternLM/lmdeploy/pull/1425/files
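For step 3, a rough sketch of inserting IMAGE_TOKEN into a prompt according to a chat template. The placeholder string and the ChatML-style template below are illustrative assumptions, not the model's real template.

```python
# Hypothetical sketch: build a prompt with one image placeholder per
# image, so the engine can later splice vision embeddings in at those
# positions. Token string and template are assumptions for illustration.

IMAGE_TOKEN = "<IMAGE_TOKEN>"

def build_prompt(user_text: str, num_images: int) -> str:
    # Prepend one placeholder per attached image before the user text.
    placeholders = IMAGE_TOKEN * num_images
    return (f"<|im_start|>user\n{placeholders}{user_text}<|im_end|>\n"
            f"<|im_start|>assistant\n")

print(build_prompt("Describe the image.", 1))
```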

deepindeed2022 added a commit to deepindeed2022/lmdeploy that referenced this issue Oct 25, 2024
- fix init raise exception because tie_word_embeddings config
- max_batch_size option for start
deepindeed2022 added a commit to deepindeed2022/lmdeploy that referenced this issue Oct 26, 2024
- fix init raise exception because tie_word_embeddings config