
Fix some bug of sequence_parallel #746

Merged
merged 4 commits on Sep 17, 2022

Conversation

GhostScreaming
Contributor

  1. Add sequence_parallel option for GPTModel
  2. When mp=1, the sequence_parallel option must
     always be set to False

1. Add sequence parallel strategy for GPTModelHybrid
2. Output has been checked layer by layer in both the forward
   and backward passes, and the loss curve over the first
   5000 steps matches the baseline
3. Performance improves by about 10% with the sequence_parallel
   strategy compared with pretrain_gpt_1.3B_mp8
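A minimal sketch of the core idea behind the sequence parallel strategy (hypothetical illustration, not the PaddlePaddle implementation): in regions of the transformer that are not tensor-parallel, activations are sharded along the sequence dimension across the mp ranks, so each rank holds and processes only `seq_len / mp_degree` of the sequence. The function name `split_along_sequence` and the `[seq, batch, hidden]` layout are assumptions for illustration.

```python
import numpy as np

def split_along_sequence(activations: np.ndarray, mp_degree: int, rank: int) -> np.ndarray:
    """Return this rank's shard of [seq_len, batch, hidden] activations."""
    seq_len = activations.shape[0]
    assert seq_len % mp_degree == 0, "seq_len must divide evenly across mp ranks"
    shard = seq_len // mp_degree
    # Each rank keeps a contiguous slice of the sequence dimension.
    return activations[rank * shard:(rank + 1) * shard]

# Toy example: 8 tokens, batch 2, hidden 4, sharded across mp_degree=4 ranks.
x = np.arange(8 * 2 * 4, dtype=np.float32).reshape(8, 2, 4)
shards = [split_along_sequence(x, mp_degree=4, rank=r) for r in range(4)]
# Concatenating the shards along the sequence axis recovers the full tensor.
assert np.array_equal(np.concatenate(shards, axis=0), x)
```

Because each shard is one quarter of the sequence, per-rank activation memory for the sharded regions drops proportionally, which is where the reported speedup and memory savings come from.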
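The mp=1 rule above can be sketched as a small guard (a hypothetical helper, not the actual PaddlePaddle code): sequence parallelism only makes sense when model (tensor) parallelism is enabled, so the flag is forced off when `mp_degree` is 1.

```python
def resolve_sequence_parallel(sequence_parallel: bool, mp_degree: int) -> bool:
    """Return the effective sequence_parallel flag for a given mp degree."""
    if mp_degree <= 1:
        # With a single model-parallel rank there is nothing to shard
        # along the sequence dimension, so the option must be disabled.
        return False
    return sequence_parallel
```

For example, `resolve_sequence_parallel(True, 1)` yields `False`, while `resolve_sequence_parallel(True, 8)` keeps the option enabled.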
Member

@ForFishes ForFishes left a comment


LGTM

@ForFishes ForFishes merged commit d6c186d into PaddlePaddle:develop Sep 17, 2022