Support Qwen2-VL's multimodal RoPE implementation #384

li-plus · 2024-11-15T06:20:22Z

Summary

Support Qwen2-VL's multimodal RoPE kernel. See original implementation here: https://github.com/huggingface/transformers/blob/a3d69a8994d673899608a7c17fbf4f953f50474e/src/transformers/models/qwen2_vl/modeling_qwen2_vl.py#L203-L245

Finishes the TODO left in #175 and completes feature request #165.
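For readers unfamiliar with multimodal RoPE, the following is a minimal pure-Python sketch of the selection logic as I understand it from the linked reference function. It is not the Triton kernel from this PR. It handles a single token and a single head, and the names `rope_tables`, `select_sections`, and `apply_mrope` are made up for illustration. The assumption is that each token carries three position indices (temporal, height, width) and that `mrope_section` lists how many frequency slots each axis owns in one half of the head dimension.

```python
import math


def rotate_half(x):
    """Map [x1, x2] to [-x2, x1], where x1 and x2 are the two halves of x."""
    half = len(x) // 2
    return [-v for v in x[half:]] + x[:half]


def rope_tables(position, head_dim, base=10000.0):
    """cos/sin vectors for one position; frequencies repeat across both halves."""
    inv_freq = [base ** (-2.0 * i / head_dim) for i in range(head_dim // 2)]
    angles = [position * f for f in inv_freq] * 2
    return [math.cos(a) for a in angles], [math.sin(a) for a in angles]


def select_sections(per_axis, mrope_section):
    """Build one vector from three per-axis vectors (temporal, height, width).

    The section sizes are doubled to cover both halves of the head dimension,
    and chunk i is copied from axis i % 3.
    """
    out, start = [], 0
    for i, size in enumerate(mrope_section * 2):
        out.extend(per_axis[i % 3][start:start + size])
        start += size
    return out


def apply_mrope(q, k, pos_thw, mrope_section):
    """Rotate one query and one key vector using a (t, h, w) position triple."""
    head_dim = len(q)
    assert 2 * sum(mrope_section) == head_dim
    tables = [rope_tables(p, head_dim) for p in pos_thw]
    cos = select_sections([t[0] for t in tables], mrope_section)
    sin = select_sections([t[1] for t in tables], mrope_section)

    def rotate(x):
        return [a * c + b * s for a, b, c, s in zip(x, rotate_half(x), cos, sin)]

    return rotate(q), rotate(k)


# Toy sizes chosen for readability, not taken from any model config.
q = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
k = [0.5, -1.0, 1.5, -2.0, 2.5, -3.0, 3.5, -4.0]
q_out, k_out = apply_mrope(q, k, pos_thw=(3, 1, 2), mrope_section=[1, 1, 2])
```

When the three position indices are equal, the selection picks the same values from every axis and the result matches ordinary 1D RoPE at that position. Because both halves of a rotated pair always come from the same axis, the transform stays a pure rotation and preserves vector norms.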

Testing Done

Hardware Type: A800-SXM4-80GB
run make test to ensure correctness
run make checkstyle to ensure code style
run make test-convergence to ensure convergence

tyler-romero · 2024-11-15T19:58:47Z

Very nice!

Can you update the convergence tests for qwen2_vl to include your RoPE implementation as well? Right now there is a line excluding rope for qwen2_vl models specifically.

Also are you available to add a benchmark for your kernel? https://github.com/linkedin/Liger-Kernel/blob/main/docs/CONTRIBUTING.md#adding-a-new-kernel

ByronHsu · 2024-11-15T21:11:04Z

Thank you for making the non-trivial contribution!

Can we rebase Qwen2-VL Bug / Incompatibility Fixes #388 to properly test convergence?
I notice there is another apply_rotary_pos_emb_vision, should we implement and patch that too?
Are you on wechat? Very impressed by your contribution and want to discuss more with you. My id is wxid_nn8pbmlh9ae712

li-plus · 2024-11-18T03:11:44Z

@tyler-romero The Qwen2VL convergence tests are already fixed in #388. I've updated them to enable M-RoPE kernel injection for Qwen2VL, and added benchmark scripts for M-RoPE in the latest commit. Benchmark results on A800 are visualized below:

[Benchmark plots for M-RoPE on A800]

You may test it on an A100 and update all_benchmark_data.csv. I don't have an A100 at hand.

li-plus · 2024-11-18T03:34:51Z

@ByronHsu Hi,

I've already rebased onto the latest master, which includes Qwen2-VL Bug / Incompatibility Fixes #388, and enabled RoPE for Qwen2VL.
The current apply_rotary_pos_emb_vision implementation is inefficient because it recomputes cos and sin separately for q and k. It would be better to change modeling_qwen2_vl.py in upstream transformers so that RoPE is applied to q and k together. Then we can reuse the RoPE Triton kernel used for llama / mistral.
Thanks for the invitation. I'm on WeChat, but I couldn't find your account from the wxid. Maybe you could email me the QR code?
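To make the point about recomputing cos and sin concrete, here is a small pure-Python sketch. It is not the transformers code and not a Triton kernel. `rope_per_tensor` and `rope_q_and_k` are hypothetical names that contrast the two calling styles, and `table_builds` is only an illustrative counter.

```python
import math

table_builds = 0  # illustrative counter, not part of any real API


def cos_sin(position, head_dim, base=10000.0):
    """Build the cos/sin vectors for one position and count how often we do it."""
    global table_builds
    table_builds += 1
    inv_freq = [base ** (-2.0 * i / head_dim) for i in range(head_dim // 2)]
    angles = [position * f for f in inv_freq] * 2
    return [math.cos(a) for a in angles], [math.sin(a) for a in angles]


def rotate_half(x):
    """Map [x1, x2] to [-x2, x1], where x1 and x2 are the two halves of x."""
    half = len(x) // 2
    return [-v for v in x[half:]] + x[:half]


def rotate(x, cos, sin):
    """Apply the rotary transform to one vector with precomputed tables."""
    return [a * c + b * s for a, b, c, s in zip(x, rotate_half(x), cos, sin)]


def rope_per_tensor(x, position):
    """Hypothetical per-tensor style: the tables are rebuilt on every call."""
    cos, sin = cos_sin(position, len(x))
    return rotate(x, cos, sin)


def rope_q_and_k(q, k, position):
    """Hypothetical fused style: the tables are built once and shared by q and k."""
    cos, sin = cos_sin(position, len(q))
    return rotate(q, cos, sin), rotate(k, cos, sin)


q = [1.0, 2.0, 3.0, 4.0]
k = [0.5, -1.0, 1.5, -2.0]

separate = (rope_per_tensor(q, 7), rope_per_tensor(k, 7))
builds_separate = table_builds  # two table builds, one per tensor

fused = rope_q_and_k(q, k, 7)
builds_fused = table_builds - builds_separate  # one table build for both tensors
```

Both styles return the same rotated vectors. The fused form only halves the trigonometric work, and a signature that takes q and k together is the shape the existing llama / mistral RoPE kernel would need in order to be reused.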

ByronHsu · 2024-11-19T05:45:31Z

Thank you @li-plus !!

li-plus force-pushed the qwen2vl-mrope branch from 69dd704 to b903e7e November 15, 2024 06:20

li-plus added 2 commits November 18, 2024 10:38

Support Qwen2-VL's multimodal RoPE implementation

07ab8d6

Add benchmark scripts & re-enable convergence tests for Qwen2VL

8e2758a

li-plus force-pushed the qwen2vl-mrope branch from 9570f98 to 8e2758a November 18, 2024 02:50

Merge branch 'main' into qwen2vl-mrope

822613f

ByronHsu approved these changes Nov 19, 2024

ByronHsu merged commit cc5561e into linkedin:main Nov 19, 2024
1 check passed

li-plus deleted the qwen2vl-mrope branch November 19, 2024 06:00
