-
Notifications
You must be signed in to change notification settings - Fork 3.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ddp fix for trainer.test() + add basic ddp tests #2997
Conversation
Codecov Report
@@ Coverage Diff @@
## master #2997 +/- ##
======================================
- Coverage 90% 90% -0%
======================================
Files 82 82
Lines 7628 7626 -2
======================================
- Hits 6862 6860 -2
Misses 766 766 |
@@ -893,10 +893,6 @@ def init_ddp_connection(self, global_rank: int, world_size: int, is_slurm_managi | |||
log.info(f"initializing ddp: GLOBAL_RANK: {global_rank}, MEMBER: {global_rank+1}/{world_size}") | |||
torch_distrib.init_process_group(torch_backend, rank=global_rank, world_size=world_size) | |||
|
|||
""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why did we remove this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like it does not belong here. My question would be why is it here in the first place?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
* add ddp script variations * add ddp test * rename * shell * test * test * try call * try without subprocess * test * display the error * list all variations * try string * try copy env * debug * pythonpath * path * update test * change * simple ddp test * replace * remove random port * random port * str * clean up * check run spawn * clean up * docs * docs * update test * docs * changelog * changelog
What does this PR do?
Fixes #2683
Fixes #2765
Fixes #2807 (docs for clarification)
Fixes #2537 (docs for clarification)
(maybe also #2901)
Before submitting
PR review
Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in Github issues there's a high chance it will not be merged.
Did you have fun?
Nope, in this one definitely not! :(