Code cleaning in preparation for #7258 [3/n] #7262
Conversation
Codecov Report
```
@@           Coverage Diff            @@
##           master   #7262    +/-   ##
========================================
- Coverage      91%     91%     -0%
========================================
  Files         199     199
  Lines       12799   12793      -6
========================================
- Hits        11701   11675     -26
- Misses       1098    1118     +20
```
```diff
 def scale_batch_size(
-    trainer,
-    model: LightningModule,
+    trainer: 'pl.Trainer',
+    model: 'pl.LightningModule',
     mode: str = 'power',
     steps_per_trial: int = 3,
     init_val: int = 2,
     max_trials: int = 25,
     batch_arg_name: str = 'batch_size',
     **fit_kwargs
-):
+) -> Optional[int]:
```
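For context, a minimal, self-contained sketch of how `scale_batch_size` is typically reached through the `Trainer` flags available around this release (`auto_scale_batch_size` plus `Trainer.tune`). `ToyModel` and its dataloader are made up for illustration, and later Lightning versions reorganize this entry point, so treat this as illustrative rather than canonical.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset
import pytorch_lightning as pl


class ToyModel(pl.LightningModule):
    """Hypothetical module with a `batch_size` attribute for the tuner to adjust."""

    def __init__(self, batch_size: int = 2):
        super().__init__()
        self.batch_size = batch_size
        self.layer = torch.nn.Linear(32, 1)

    def training_step(self, batch, batch_idx):
        x, y = batch
        return torch.nn.functional.mse_loss(self.layer(x), y)

    def configure_optimizers(self):
        return torch.optim.SGD(self.parameters(), lr=0.1)

    def train_dataloader(self):
        # The dataloader reads `self.batch_size`, so each trial rebuilds it
        # with the candidate batch size.
        dataset = TensorDataset(torch.randn(256, 32), torch.randn(256, 1))
        return DataLoader(dataset, batch_size=self.batch_size)


# 'power' mode keeps doubling the batch size until an out-of-memory error is hit.
trainer = pl.Trainer(auto_scale_batch_size='power', max_epochs=1)
trainer.tune(ToyModel())  # runs the batch size finder and updates `model.batch_size`
```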
This should have some caveats that the tuner doesn't work with things like DeepSpeed or sharded DDP, which have different behavior on multiple GPUs, right?
Agree with this. In general, `scale_batch_size` is not really that well tested in multi-GPU settings. Even in the simplest case where you are using multiple GPUs of different types (say one with 8 GB of VRAM and one with 16 GB of VRAM), it will not assign a higher batch size to the second device.
@SkafteNicki since you are the most familiar with the tuner limitations, can you open a PR showing warnings or raising an error for these cases?
@carmocca will do. I basically think that anything other than single CPU/GPU batch scaling is not supported.
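For reference, a rough sketch of what such a guard could look like. The helper name, the `num_gpus` check, and the error message are assumptions, not the actual follow-up implementation; the real check might inspect the training type plugin instead.

```python
import pytorch_lightning as pl
from pytorch_lightning.utilities.exceptions import MisconfigurationException


def _check_scale_batch_size_configuration(trainer: 'pl.Trainer') -> None:
    """Hypothetical guard: reject setups the batch size finder was not designed for."""
    # Assumption: the number of requested GPUs is used as a stand-in for
    # detecting a distributed setup.
    if trainer.num_gpus > 1:
        raise MisconfigurationException(
            'The batch size finder currently supports only a single CPU or GPU. '
            'It cannot account for per-device memory differences or plugins such as '
            'DeepSpeed / sharded DDP.'
        )
```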
LGTM !
LGTM
What does this PR do?
Some changes related to #7258, but not critical, which I have split into this PR:
- Move `_validate_data_hooks` into `ConfigValidator`
- Fix `scale_batch_size`, which would fail if the number of trials was 0 (see the toy sketch after this list)
- Move `scale_batch_size` tests from `tests/trainer/test_trainer_tricks.py` to `tests/tuner/test_scale_batch_size.py`
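To make the zero-trials point concrete, below is a toy, self-contained version of a power-scaling loop; `_power_scale` and `fits_in_memory` are invented for illustration and do not mirror the actual Lightning internals. The point is simply that with `max_trials=0` the loop body never runs and the initial value should come back untouched rather than erroring.

```python
from typing import Callable


def _power_scale(init_val: int, max_trials: int, fits_in_memory: Callable[[int], bool]) -> int:
    """Illustrative doubling search; names and structure are hypothetical."""
    new_size = init_val
    for _ in range(max_trials):  # with max_trials == 0 the loop never runs
        if not fits_in_memory(new_size * 2):
            break
        new_size *= 2
    return new_size


# Pretend anything up to 64 samples fits in memory.
assert _power_scale(init_val=2, max_trials=0, fits_in_memory=lambda b: b <= 64) == 2
assert _power_scale(init_val=2, max_trials=25, fits_in_memory=lambda b: b <= 64) == 64
```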
Before submitting
PR review