fix(pt): do not overwrite disp_file when restarting training (#3985)
New contents should be appended when restarting

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **New Features**
- Improved training process by conditionally appending to files based on
the `restart_training` flag.
- Added functionality to create a record file when `SAMPLER_RECORD` is
true.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Signed-off-by: Jinzhe Zeng <jinzhe.zeng@rutgers.edu>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
njzjz and pre-commit-ci[bot] authored Jul 16, 2024
1 parent a2c74a0 commit 782f1e2
Showing 1 changed file with 7 additions and 1 deletion.
8 changes: 7 additions & 1 deletion deepmd/pt/train/training.py
@@ -630,7 +630,13 @@ def warm_up_linear(step, warmup_steps):
 
     def run(self):
         fout = (
-            open(self.disp_file, mode="w", buffering=1) if self.rank == 0 else None
+            open(
+                self.disp_file,
+                mode="w" if not self.restart_training else "a",
+                buffering=1,
+            )
+            if self.rank == 0
+            else None
         )  # line buffered
         if SAMPLER_RECORD:
             record_file = f"Sample_rank_{self.rank}.txt"
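
The effect of the change can be illustrated outside the trainer with a minimal sketch. The helper `open_disp_file`, the `demo` function, and the file name `lcurve.out` are hypothetical names introduced here for illustration; only the mode selection (`"w"` for a fresh run, `"a"` when restarting), the `buffering=1` line buffering, and the rank-0 check are taken from the diff.

```python
import os
import tempfile


def open_disp_file(path, restart_training, rank=0):
    """Hypothetical helper mirroring the mode selection in the diff.

    Only rank 0 opens the file; other ranks get None.
    """
    if rank != 0:
        return None
    # Append when restarting so earlier records are kept; truncate otherwise.
    mode = "a" if restart_training else "w"
    return open(path, mode=mode, buffering=1)  # line buffered


def demo():
    """Write one record in a fresh run and one more after a simulated restart."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "lcurve.out")  # illustrative file name

        fout = open_disp_file(path, restart_training=False)
        fout.write("step 0\n")
        fout.close()

        # Simulated restart: the earlier line must survive.
        fout = open_disp_file(path, restart_training=True)
        fout.write("step 1\n")
        fout.close()

        with open(path) as f:
            return f.read().splitlines()
```

With the previous behavior (always `mode="w"`), the second open would have truncated the file and only the post-restart record would remain.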