[LLM] Add expert parallel #9368

Merged

@DrownFish19 DrownFish19 commented Nov 5, 2024

PR types

New features

PR changes

APIs

Description

Add expert parallel

  • added in Qwen2Moe
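For readers new to the term, the general idea of expert parallelism can be sketched as below. This is a minimal illustration, not the implementation in this PR: it assumes a contiguous partition of experts across ranks, and the names `expert_owner`, `dispatch`, and `ep_degree` are hypothetical and do not come from PaddleNLP.

```python
from collections import defaultdict


def expert_owner(expert_id, num_experts, ep_degree):
    """Return the rank holding `expert_id` under a contiguous partition (assumed layout)."""
    if num_experts % ep_degree != 0:
        raise ValueError("num_experts must be divisible by ep_degree")
    experts_per_rank = num_experts // ep_degree
    return expert_id // experts_per_rank


def dispatch(token_to_expert, num_experts, ep_degree):
    """Group token indices by the rank that owns each token's routed expert."""
    per_rank = defaultdict(list)
    for token_idx, expert_id in enumerate(token_to_expert):
        rank = expert_owner(expert_id, num_experts, ep_degree)
        per_rank[rank].append(token_idx)
    return dict(per_rank)


# Hypothetical routing: 6 tokens, 8 experts, 4 ranks (2 experts per rank).
routing = [0, 7, 3, 2, 5, 0]
print(dispatch(routing, num_experts=8, ep_degree=4))
```

In a real system each rank would run only its local experts on the tokens sent to it and then return the results to the originating ranks; the sketch covers only the bookkeeping step.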

1. Precision verification code:

import paddle

from paddlenlp.transformers import Qwen2MoeConfig
from paddlenlp.transformers.qwen2_moe.modeling import Qwen2MoeSparseMoEBlock, Qwen2MoeSparseMoEBlock_OLD

config = Qwen2MoeConfig.from_pretrained("Qwen/Qwen2-57B-A14B")
# Default dtype under test.
paddle.set_default_dtype(paddle.float32)

with paddle.amp.auto_cast(True):
    block = Qwen2MoeSparseMoEBlock(config)  # new block with expert parallel support
    block_old = Qwen2MoeSparseMoEBlock_OLD(config)  # original MoE computation, kept for comparison

# Share weights so that any output difference comes from the computation path alone.
state_dict = block.state_dict()
block_old.set_state_dict(state_dict)

# Sweep sequence lengths from 1k to 128k tokens.
for seq_len in [i * 1024 for i in range(1, 129)]:
    hidden_state = paddle.rand([1, seq_len, config.hidden_size], dtype=paddle.float32).cast(paddle.get_default_dtype())

    block_output = block(hidden_state)
    block_output_old = block_old(hidden_state)

    # Maximum absolute element-wise difference between the two outputs.
    print(seq_len, ": ", float(paddle.max(paddle.abs(block_output[0] - block_output_old[0]))))

2. Precision verification results:

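Compared with the original Qwen2Moe MoE computation code (without expert parallelism) at sequence lengths from 1k to 128k, the maximum diff stays unchanged.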

float32 diff: 0.0
float16 diff: 1e-4
bfloat16 diff: 9e-4
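As a rough sanity check on these numbers, the reported diffs can be compared with the machine epsilon of each dtype. The sketch below assumes the block outputs are roughly of unit magnitude, so that an absolute diff can be compared with a relative epsilon; that assumption is not stated in the PR. The mantissa widths are the standard ones for each format, and the helper names are hypothetical.

```python
# Explicit mantissa (fraction) bits of each floating-point format.
MANTISSA_BITS = {"float32": 23, "float16": 10, "bfloat16": 7}

# Maximum diffs as reported in the results above.
REPORTED_MAX_DIFF = {"float32": 0.0, "float16": 1e-4, "bfloat16": 9e-4}


def machine_epsilon(dtype):
    """Spacing between 1.0 and the next representable value for `dtype`."""
    return 2.0 ** -MANTISSA_BITS[dtype]


def within_epsilon(dtype):
    """True if the reported diff does not exceed one machine epsilon."""
    return REPORTED_MAX_DIFF[dtype] <= machine_epsilon(dtype)


for dtype in MANTISSA_BITS:
    print(dtype, machine_epsilon(dtype), within_epsilon(dtype))
```

Under that assumption, all three reported diffs fall below one machine epsilon of their dtype, which is consistent with rounding differences rather than a change in the computation.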


paddle-bot bot commented Nov 5, 2024

Thanks for your contribution!


codecov bot commented Nov 5, 2024

Codecov Report

Attention: Patch coverage is 50.27473% with 181 lines in your changes missing coverage. Please review.

Project coverage is 52.89%. Comparing base (b5e3f0c) to head (358483b).
Report is 21 commits behind head on develop.

Files with missing lines                        Patch %   Lines
paddlenlp/transformers/moe_gate.py              42.29%    131 Missing ⚠️
paddlenlp/transformers/moe_layer.py             54.62%    49 Missing ⚠️
paddlenlp/transformers/qwen2_moe/modeling.py    95.83%    1 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff             @@
##           develop    #9368      +/-   ##
===========================================
- Coverage    53.01%   52.89%   -0.12%     
===========================================
  Files          678      678              
  Lines       108787   108249     -538     
===========================================
- Hits         57668    57262     -406     
+ Misses       51119    50987     -132     


@wawltor wawltor left a comment


LGTM

@wawltor wawltor merged commit 590081a into PaddlePaddle:develop Nov 20, 2024
9 of 12 checks passed
@DrownFish19 DrownFish19 deleted the dev_20241018_add_expert_parallel branch November 20, 2024 07:44
2 participants