Commit
Fix some bugs in sequence_parallel (#746)
* Add sequence parallel strategy support.
  1. Add the sequence parallel strategy for GPTModelHybrid.
  2. Outputs have been checked layer by layer in both the forward and backward passes, and the loss curve over the first 5000 steps matches that of the peer run.
  3. Performance improves by about 10% with the sequence_parallel strategy compared with pretrain_gpt_1.3B_mp8.
* Add the sequence_parallel_utils.py file.
* Fix some bugs in sequence_parallel.
  1. Add the sequence_parallel option for GPTModel.
  2. When mp=1, the sequence_parallel option should always be set to False.
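
The mp=1 rule above can be illustrated with a minimal sketch. This is not the code from this commit: the function name `resolve_sequence_parallel` and the parameter `mp_degree` are hypothetical, and the sketch assumes that "mp" refers to the model-parallel degree, so that with a single model-parallel rank there are no peers to split the sequence across.

```python
import warnings


def resolve_sequence_parallel(mp_degree: int, sequence_parallel: bool) -> bool:
    """Return the effective sequence_parallel flag for a model-parallel degree.

    Hypothetical helper for illustration only; not part of the repository.
    """
    if mp_degree < 1:
        raise ValueError(f"mp_degree must be >= 1, got {mp_degree}")
    if mp_degree == 1 and sequence_parallel:
        # Assumption: with one model-parallel rank the option has no effect,
        # so it is forced off rather than left enabled.
        warnings.warn("sequence_parallel is ignored when mp_degree == 1")
        return False
    return sequence_parallel


if __name__ == "__main__":
    print(resolve_sequence_parallel(mp_degree=8, sequence_parallel=True))
    print(resolve_sequence_parallel(mp_degree=1, sequence_parallel=True))
```

Normalizing the flag in one place, instead of checking `mp_degree` wherever the option is read, keeps the model code free of repeated guards.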