
make extract_batch_size a LightningModule.hook #10635

Closed
rohitgr7 opened this issue Nov 19, 2021 · 11 comments
Labels: data handling, feature, won't fix

Comments

@rohitgr7
Contributor

rohitgr7 commented Nov 19, 2021

🚀 Feature

Motivation

Copy pasting context from here: #10408 (review)

A generic idea from @ninginthecloud's point of view: extract_batch_size() is the kind of function we could convert into a hook and delegate to the user to implement. Since users know their own batch, it would be easy for them to define the batch_size or how to obtain it. Here, on the other hand, we make a lot of assumptions about what the batch will look like, and we can still only extract a valid batch_size in very limited situations.

Some comments:
#10408 (comment)
#10408 (comment)
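
For illustration, such a hook could look roughly like this (the hook name, signature, and batch structure below are assumptions for the sake of the sketch, not an existing Lightning API):

    import pytorch_lightning as pl


    class MyModel(pl.LightningModule):
        # Hypothetical hook: the user knows their batch structure, so they
        # can return the batch size directly instead of Lightning guessing it.
        def extract_batch_size(self, batch) -> int:
            return batch["images"].size(0)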

Pitch

Alternatives

Additional context



cc @Borda @justusschock @awaelchli @ninginthecloud @tchaton @ananthsub @carmocca @SeanNaren Any thoughts?

@rohitgr7 rohitgr7 added the feature Is an improvement or enhancement label Nov 19, 2021
@rohitgr7 rohitgr7 changed the title make extract_batch_size a hook make extract_batch_size a LightningModule.hook Nov 19, 2021
@rohitgr7 rohitgr7 added the data handling Generic data-related topic label Nov 19, 2021
@awaelchli
Contributor

Can you comment on the differences/advantages this would bring vs. self.log(..., batch_size=...), which to me seems to offer the same today?

@rohitgr7
Contributor Author

I am fine with the current implementation: #10408 (comment)
I just wanted to discuss this in an issue rather than on that PR.

@awaelchli
Contributor

"if we do this, then we will have two different ways by which the user can define their batch size: one will be this hook, and the other will be self.log(..., batch_size=...)"

This was exactly my thought!
A major complaint about Lightning was that there are multiple ways of doing the same thing. This was raised several times recently by FB (#7740).

self.log(..., batch_size=...) seems to be the most flexible at the moment.
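
For reference, passing the batch size explicitly today looks like this (the batch structure and loss computation below are placeholders, not from this issue):

    def training_step(self, batch, batch_idx):
        loss = self.loss_fn(self(batch["images"]), batch["labels"])
        # Tell Lightning the batch size explicitly instead of letting
        # it try to infer one from the batch structure.
        self.log("train_loss", loss, batch_size=batch["images"].size(0))
        return loss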

@carmocca
Contributor

carmocca commented Nov 19, 2021

Pros:

  • The place that defines the data should be able to define this (DataHooks)

Cons:

  • Since the batch size is only used for logging, it makes sense to define it at logging time (self.log) rather than in a data hook.
  • The user may have applied further transformations to their batch and need to customize the value at *_step time.

Leaning towards not doing it.

@stale

stale bot commented Dec 27, 2021

This issue has been automatically marked as stale because it hasn't had any recent activity. It will be closed in 7 days if no further activity occurs. Thank you for your contributions, PyTorch Lightning Team!

@stale stale bot added the won't fix This will not be worked on label Dec 27, 2021
@dnnspark

dnnspark commented Mar 2, 2022

Hello, I'm facing a similar issue while using trainer.track_grad_norm=2. It's calling extract_batch_size(), which does not work with my dataloader.

@rohitgr7
Contributor Author

rohitgr7 commented Mar 7, 2022

It calls extract_batch_size on the current batch. By default, it can handle plain tensors, but if you have a custom batch structure, you can specify your batch_size using self.log(..., batch_size=...). You can try it with Trainer(fast_dev_run=True) to check whether it throws any warning. If it does, I recommend specifying the batch_size as mentioned above; if it doesn't, then you're good to go :)
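
A minimal sketch of that check (assuming a model and datamodule are already defined):

    import pytorch_lightning as pl

    # Runs a single train/val batch; watch the output for the
    # "could not infer batch_size" warning mentioned above.
    trainer = pl.Trainer(fast_dev_run=True, track_grad_norm=2)
    trainer.fit(model, datamodule=datamodule)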

@dnnspark

Thanks for replying and for the detailed explanation, @rohitgr7! One last question: in the case of distributed training, does the batch_size have to be the "effective" batch size (i.e. len(batch) * trainer.gpus * trainer.num_nodes) or the batch size per GPU (i.e. len(batch))?

@carmocca
Contributor

Per GPU

@rubencart

I am getting a ...MisconfigurationException: We could not infer the batch_size from the batch... error when running with a custom batch structure and track_grad_norm=2, since tracking the grad norm makes a call to log_dict without batch_size set.

.../site-packages/pytorch_lightning/core/module.py", line 565, in log_grad_norm
    self.log_dict(grad_norm_dict, on_step=True, on_epoch=True, prog_bar=False, logger=True)

@carmocca
Contributor

What's the structure, specifically? Could you make changes to it so that it's inferable?

The relevant code is in: https://github.com/Lightning-AI/lightning/blob/38acba08fc009caac32b6b0ab8984d8935569db9/src/pytorch_lightning/utilities/data.py#L50-L96
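
Roughly, that utility walks the batch recursively and returns the first size it can recover; a simplified sketch (not the exact implementation, see the link above for the real code):

    from collections.abc import Iterable, Mapping

    import torch


    def infer_batch_size(batch):
        # Plain tensors: first dimension (scalars count as size 1).
        if isinstance(batch, torch.Tensor):
            return 1 if batch.ndim == 0 else batch.size(0)
        if isinstance(batch, (str, bytes)):
            return len(batch)
        # Recurse into mappings and other iterables.
        if isinstance(batch, Mapping):
            batch = batch.values()
        if isinstance(batch, Iterable):
            for element in batch:
                size = infer_batch_size(element)
                if size is not None:
                    return size
        # Opaque custom objects: nothing to infer, so the caller
        # falls back to a warning/error.
        return None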
