prune Tune arguments from Trainer init - LearnRate & BatchSize as callback #9103
Comments
I see 3 options:
The last one is the most extreme, but I wanted to mention it for completeness. Between the first two, I prefer the second one. There's a similar conversation here: #9006
Additional option:
I like this one, and it may be the cleanest way, as we don't add any other logic pattern/layer.
@SkafteNicki @Borda - as a callback, this can be pretty invasive in setting data inside of the trainer. Is that a precedent we should be setting? In what callback hooks would these run? https://github.com/PyTorchLightning/pytorch-lightning/blob/69f66fd6bb361a7932a82291e4ef001f4f381f99/pytorch_lightning/tuner/batch_size_scaling.py#L115-L125
I assume that all of them would run in the before-fit hook...
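For illustration, a rough sketch of the callback variant being discussed, assuming the existing scale_batch_size helper in pytorch_lightning/tuner/batch_size_scaling.py (linked above) could be reused from a callback and that on_fit_start is the "before fit" hook meant here:

```python
from pytorch_lightning.callbacks import Callback
# Assumption: the internal helper linked above is reusable from a callback.
from pytorch_lightning.tuner.batch_size_scaling import scale_batch_size


class BatchSizeFinderCallback(Callback):
    """Hypothetical callback that scales the batch size right before training."""

    def __init__(self, mode: str = "power", steps_per_trial: int = 3):
        self.mode = mode
        self.steps_per_trial = steps_per_trial

    def on_fit_start(self, trainer, pl_module):
        # This is the "invasive" part: the routine mutates trainer/model state
        # (the batch_size attribute, dataloaders) while it searches.
        scale_batch_size(
            trainer,
            pl_module,
            mode=self.mode,
            steps_per_trial=self.steps_per_trial,
        )
```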
IMO, I trust each entrypoint to the Trainer to be independent, as it makes sure features are fully encapsulated and reduces the risk of bugs.
I think this is fine for tune, which is pretty particular, but I don't think we should do this for other functions. Best,
I am also fine with moving the arguments to the tune method.
A practical use case is the combination with FineTune, which would unfreeze the backbone on the 10th epoch.
This gives more freedom to combine these methods during the training cycle; see the sketch below.
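As a minimal sketch of that combination, assuming the existing BackboneFinetuning callback and a hypothetical tune() signature that takes the tuning options directly (model stands for any LightningModule exposing a backbone attribute):

```python
from pytorch_lightning import Trainer
from pytorch_lightning.callbacks import BackboneFinetuning

trainer = Trainer(
    max_epochs=50,
    # unfreeze the backbone on the 10th epoch, as in the use case above
    callbacks=[BackboneFinetuning(unfreeze_backbone_at_epoch=10)],
)
# Hypothetical signature: tuning options passed to .tune() instead of Trainer.__init__
trainer.tune(model, scale_batch_size="power", lr_find=True)
trainer.fit(model)
```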
🚀 Feature
Move the auto_scale_batch_size and auto_lr_find arguments to the .tune method.

Motivation

It is quite confusing to set these auto-find options in the Trainer init when they still need to be triggered with .tune(), which takes another set of arguments.

Pitch
As a naive user, I would expect that the tuning is executed at the beginning of .fit().
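For comparison, a sketch of the current behaviour versus what the pitch expects (argument names as in the Trainer docs at the time; model stands for any LightningModule):

```python
from pytorch_lightning import Trainer

# Current behaviour: the flags live on the Trainer, but nothing happens until
# .tune() is called explicitly -- .fit() alone does not run the finders.
trainer = Trainer(auto_scale_batch_size="power", auto_lr_find=True)
trainer.tune(model)  # runs batch-size scaling and the LR finder
trainer.fit(model)

# What the pitch expects instead: either .fit() triggers the tuning first,
# or the options move to .tune() itself (hypothetical signature):
# trainer.tune(model, scale_batch_size="power", lr_find=True)
```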
Alternatives
Also, in such cases the user does not have much control over the order in which they are called; for example, my use case is:
At this moment I would need to create two Trainer instances :(
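A minimal sketch of that workaround under the current API (the concrete use case is not spelled out above, so the order shown is just an example; model is assumed to expose the batch_size and learning_rate attributes the finders look for):

```python
from pytorch_lightning import Trainer

# The order of the two finders inside a single .tune() call is fixed, so forcing
# a custom order (e.g. LR finder first) currently needs two Trainer instances:
trainer_a = Trainer(auto_lr_find=True)
trainer_a.tune(model)  # only the LR finder runs here

trainer_b = Trainer(auto_scale_batch_size="binsearch", max_epochs=50)
trainer_b.tune(model)  # only batch-size scaling runs here
trainer_b.fit(model)
```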
I am also experiencing some OOM failures when I run the batch-size and LR finders together; I do not have a concrete case for debugging yet.
Additional context
The only justification would be usage with Lightning CLI
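For context, a sketch of why the Trainer-level flags are convenient with the Lightning CLI: every Trainer init argument is exposed on the command line and in YAML configs automatically (MyModel and MyDataModule are placeholder names for a LightningModule and a LightningDataModule defined elsewhere):

```python
from pytorch_lightning.utilities.cli import LightningCLI

# With the flags on the Trainer, they can be toggled without code changes, e.g.
#   python train.py --trainer.auto_lr_find=True --trainer.auto_scale_batch_size=power
cli = LightningCLI(MyModel, MyDataModule)
```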