Update on "Made some stylistic changes to apply_dp"
[ghstack-poisoned]
Commit 19cba24 — awgu committed on Jul 10, 2024 (2 parents: 5c04a9b + 67c4e9b)
1 changed file with 1 addition and 0 deletions: torchtitan/parallelisms/parallelize_llama.py
```diff
@@ -459,6 +459,7 @@ def apply_dp(model, world_mesh, parallel_dims, job_config: JobConfig):
         reduce_dtype=TORCH_DTYPE_MAP[job_config.training.mixed_precision_reduce],
     )
     fsdp_config = {"mesh": dp_mesh, "mp_policy": mp_policy}
+
     for layer_id, transformer_block in model.layers.items():
         if parallel_dims.pp_enabled:
             # For PP, do not reshard after forward to avoid per-microbatch
```
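For context, this hunk sits in `apply_dp`, where each transformer block is wrapped with FSDP using `fsdp_config`, and the `reshard_after_forward` setting is driven by whether pipeline parallelism (PP) is enabled. The sketch below is a minimal, self-contained approximation of that control flow — the helper name `per_block_fsdp_kwargs` and the stand-in config values are hypothetical, not torchtitan's actual code, which calls PyTorch's `fully_shard` on each block:

```python
# Hypothetical sketch of the per-block FSDP kwarg selection around the hunk
# above. Real torchtitan code passes these kwargs to fully_shard(); here we
# only build and return them so the logic is easy to inspect.

def per_block_fsdp_kwargs(layer_ids, pp_enabled, fsdp_config):
    """Map each layer id to the kwargs that would be passed to fully_shard.

    With PP enabled, resharding after forward is disabled to avoid
    per-microbatch all-gathers of the block's parameters. (The non-PP
    policy in the real code may be more nuanced, e.g. for the last block.)
    """
    kwargs_by_layer = {}
    for layer_id in layer_ids:
        reshard_after_forward = not pp_enabled
        kwargs_by_layer[layer_id] = {
            **fsdp_config,  # e.g. {"mesh": dp_mesh, "mp_policy": mp_policy}
            "reshard_after_forward": reshard_after_forward,
        }
    return kwargs_by_layer


# Stand-in values; in torchtitan these are a DeviceMesh and a
# MixedPrecisionPolicy built from the job config's mixed-precision dtypes.
cfg = {"mesh": "dp_mesh", "mp_policy": "bf16_policy"}
print(per_block_fsdp_kwargs(["0", "1"], pp_enabled=True, fsdp_config=cfg))
```

The design point mirrored here: `reshard_after_forward=True` trades memory for an extra all-gather in backward, which is usually a good trade for pure data parallelism, but with PP each microbatch's forward would trigger its own all-gather, so the commit's surrounding code turns it off.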
