
Add sequence parallel strategy support. #734

Merged
merged 2 commits into PaddlePaddle:develop on Sep 16, 2022

Conversation

@GhostScreaming (Contributor) commented on Sep 16, 2022

  1. Add the sequence parallel strategy for GPTModelHybrid (a conceptual
     sketch of the communication pattern is included below).
  2. Output has been checked layer by layer in both the forward and
     backward passes, and the loss curve over the first 5000 steps
     matches the baseline (pretrain_gpt_1.3B_mp8).
  3. Performance improves by about 10% with the sequence_parallel
     strategy compared with pretrain_gpt_1.3B_mp8.

[Image: loss curves over the first 5000 training steps, sequence-parallel run vs. pretrain_gpt_1.3B_mp8 baseline]

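For readers unfamiliar with the technique, the sketch below illustrates the data movement that sequence parallelism adds on top of tensor model parallelism: LayerNorm/dropout activations stay split along the sequence dimension on each model-parallel rank, an all-gather restores the full sequence before the column-parallel matmul, and a reduce-scatter (here simulated as a sum followed by a split) replaces the all-reduce after the row-parallel matmul. This is a single-process NumPy illustration under assumed names (`mp_degree`, `seq_chunks`, a toy MLP); it is not the GPTModelHybrid code added in this PR.

```python
# Conceptual, single-process simulation of sequence parallelism (NumPy).
# mp_degree, seq_chunks, w1/w2 shard names are hypothetical illustration
# values, not identifiers from this PR.
import numpy as np

mp_degree = 4                       # tensor/model parallel degree
batch, seq_len, hidden = 2, 8, 16
ffn_hidden = 4 * hidden

rng = np.random.default_rng(0)
x = rng.standard_normal((batch, seq_len, hidden))
w1 = rng.standard_normal((hidden, ffn_hidden))    # column-parallel weight
w2 = rng.standard_normal((ffn_hidden, hidden))    # row-parallel weight

# With sequence parallelism each rank stores only seq_len / mp_degree tokens
# of the LayerNorm/dropout activations instead of the full sequence.
seq_chunks = np.split(x, mp_degree, axis=1)       # one chunk per "rank"
w1_shards = np.split(w1, mp_degree, axis=1)       # column-parallel split
w2_shards = np.split(w2, mp_degree, axis=0)       # row-parallel split

outputs = []
for rank in range(mp_degree):
    # all-gather along the sequence dim: rebuild the full sequence before
    # the column-parallel matmul (simulated by concatenating all chunks).
    gathered = np.concatenate(seq_chunks, axis=1)          # [b, seq, hidden]
    partial = np.maximum(gathered @ w1_shards[rank], 0.0)  # column-parallel + ReLU
    partial = partial @ w2_shards[rank]                    # row-parallel partial sum
    outputs.append(partial)

# reduce-scatter: sum the partial results across ranks, then split the sum
# back along the sequence dimension so each rank keeps only its own tokens.
summed = np.sum(outputs, axis=0)
out_chunks = np.split(summed, mp_degree, axis=1)

# Reference: the same MLP computed without any parallelism.
ref = np.maximum(x @ w1, 0.0) @ w2
assert np.allclose(np.concatenate(out_chunks, axis=1), ref, atol=1e-6)
print("sequence-parallel simulation matches the serial reference")
```

The point of the exchange pattern is that the all-reduce used by plain tensor parallelism is split into an all-gather/reduce-scatter pair with the same total communication volume, while the replicated LayerNorm and dropout activations shrink by a factor of mp_degree, which is where the memory and throughput gains reported above come from.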
@ForFishes (Member) left a comment:
LGTM

@ForFishes merged commit 85870f8 into PaddlePaddle:develop on Sep 16, 2022