
Trainer option to opt-out of warnings which are more like linting than legit issues #10644

Closed
jlperla opened this issue Nov 19, 2021 · 14 comments · Fixed by #10675
Labels
discussion In a discussion stage feature Is an improvement or enhancement


jlperla commented Nov 19, 2021

🚀 Feature

An option on the Trainer (or maybe the logger?) called ignore_minor_warnings - or, even better for me, the inverse: runtime_linting_warnings or something like that.

Then in the cases where a PL warning is more of a suggestion for something to check, it would only show it in the runtime if ignore_minor_warnings = False or runtime_linting_warnings = True.

For example, #7734 introduced a warning on log_every_n_steps which triggers with Trainer default arguments whenever you have < 50 batches. The suggestion would be to make the code https://github.com/PyTorchLightning/pytorch-lightning/blob/44f62435c8d5860b542a1cb2cca66afd9e75cbcc/pytorch_lightning/trainer/data_loading.py#L306-L311 become

if self.logger and self.logger.runtime_linting_warnings and self.num_training_batches < self.log_every_n_steps:
    rank_zero_warn(
        f"The number of training samples ({self.num_training_batches}) is smaller than the logging interval"
        f" Trainer(log_every_n_steps={self.log_every_n_steps}). Set a lower value for log_every_n_steps if"
        f" you want to see logs for the training epoch."
    )

etc.
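A tiny end-to-end sketch of the gating proposed above (everything here is hypothetical and mirrors the proposal, not any real PL API; `rank_zero_warn` is stubbed out with a plain `warnings.warn`):

```python
import warnings


def rank_zero_warn(message: str) -> None:
    # Stand-in for PL's rank_zero_warn, which ultimately calls warnings.warn.
    warnings.warn(message, UserWarning)


def maybe_tip(message: str, runtime_linting_warnings: bool) -> None:
    # Emit a tip-style warning only when the (hypothetical) flag is enabled.
    if runtime_linting_warnings:
        rank_zero_warn(message)


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    maybe_tip("log_every_n_steps is larger than the epoch", runtime_linting_warnings=False)
    maybe_tip("log_every_n_steps is larger than the epoch", runtime_linting_warnings=True)

print(len(caught))  # only the flagged-on call emitted a warning
```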

There are a number of these in PL that shouldn't be shown every time someone runs the model. The ones that cause me the most grief:

Those are the only ones I have run into that are clearly a "linting" sort of situation, but I suspect there are many others. Just to be clear, I don't think this should be a global way to turn off warnings - just a way to turn off warnings that have a high probability of being a false positive.

Motivation

Every time I run my code I see the following.

C:\Users\jesse\anaconda3\lib\site-packages\pytorch_lightning\trainer\data_loading.py:110: UserWarning: The dataloader, train_dataloader, does not have many workers which may be a bottleneck. Consider increasing the value of the `num_workers` argument` (try 20 which is the number of cpus on this machine) in the `DataLoader` init to improve performance.
  rank_zero_warn(
C:\Users\jesse\anaconda3\lib\site-packages\pytorch_lightning\trainer\data_loading.py:393: UserWarning: The number of training samples (8) is smaller than the logging interval Trainer(log_every_n_steps=50). Set a lower value for log_every_n_steps if you want to see logs for the training epoch.
  rank_zero_warn(
C:\Users\jesse\anaconda3\lib\site-packages\pytorch_lightning\trainer\data_loading.py:110: UserWarning: The dataloader, val_dataloader 0, does not have many workers which may be a bottleneck. Consider increasing the value of the `num_workers` argument` (try 20 which is the number of cpus on this machine) in the `DataLoader` init to improve performance.
  rank_zero_warn(
C:\Users\jesse\anaconda3\lib\site-packages\pytorch_lightning\trainer\data_loading.py:110: UserWarning: The dataloader, test_dataloader 0, does not have many workers which may be a bottleneck. Consider increasing the value of the `num_workers` argument` (try 20 which is the number of cpus on this machine) in the `DataLoader` init to improve performance.
  rank_zero_warn(

and even more if I have a GPU accessible on my machine and don't want to use it (as is frequent). I am also concerned that there seem to be even more of these cropping up in PL, so the list may keep growing. The signal-to-noise ratio of the warnings is limited here because I have no idea whether something is a legit problem or not.

Pitch

To me there are two types of warnings involved here: (1) a "true warning," something you would want to know every time the code runs, rather than being shown once so I can decide if it is relevant; and (2) "linting" done at runtime: tips and tricks that may or may not be helpful depending on the exact model. These are more likely to have false positives than true warnings, but the biggest difference is that they call for a one-time decision: "Look at this warning and decide if it is relevant to your setup, then never look at it again." That is why I am calling this more of a linting situation.

I think it is great that PL has more of the "linting" style stuff since it probably helps in some cases, but it is an annoyance and noise if it always shows - even after I have made a decision on its relevance.

Alternatives

Alternative one is just to ignore the warnings. This is tricky because it masks real warnings. New users would have a lot of difficulty figuring out if issues are relevant or not. What generally happens is that the user tries the suggestion (e.g. num_workers = 4) to see if it helps speed. If it doesn't, they would rather never see it again.

The second alternative is for a user to use a hack to manually turn off the warnings with the `warnings` module in their code.

For example, I used to have the following code at the top of my file:

import warnings

warnings.filterwarnings(
    "ignore",
    category=UserWarning,
    module="pytorch_lightning.trainer.data_loading",
    lineno=102,
)

to ignore one of these.

The problems with this are very simple:

  1. It is highly version-specific: lineno=102 may only apply to a particular point release of PL. Even if I pin the PL version in my requirements.txt file, any time things are upgraded I need to go through the PL source and change line numbers. Forget about supporting multiple versions of PL.... and if I forget to change things, there is a chance that the lineno will now point to a totally different warning (as the warnings are usually grouped together), which might be a legit one I want to know about.
  2. The users have to maintain this themselves, which is a lot of work.


cc @Borda

@jlperla jlperla added the feature Is an improvement or enhancement label Nov 19, 2021

carmocca commented Nov 19, 2021

The second alternative is for a user to use a hack to manually turn off the warnings with the `warnings` module in their code.

This is not a hack but the native mechanism provided by Python to disable warnings.

It is highly version specific

You can ignore based on a regex like this:

import warnings

warnings.filterwarnings("ignore", ".*does not have many workers.*")

This is documented here: https://pytorch-lightning.readthedocs.io/en/stable/guides/speed.html#num-workers


jlperla commented Nov 19, 2021

Thanks for the help on the regex. But this really is a hack. You are suggesting that users literally find segments of the source code to avoid a false-positive suggestion for their code from showing every time. That isn't very accessible for new users, and it requires a high degree of sophistication.

That is a little less fragile, but what if you change the text between versions? Or what if you add another warning with the same text that is a legit issue and also gets ignored?


jlperla commented Nov 19, 2021

@awaelchli suggested on Slack that the distinction may be between "info" and "warning" rather than my "linting" interpretation. A way to ignore the "info" ones would be fine as well. To me, the bar for classifying something as "info" vs. "warning" is whether it is just a one-time thing to check and act upon vs. something you want to know every time you run, even after doing a sanity check.
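As a sketch of what that split could look like (purely illustrative, not anything PL implements; the `pl_tips` logger name is made up), the "info" channel could ride on the standard `logging` module, which already gives users a one-line way to silence a whole level:

```python
import logging

# Hypothetical channel for tip-style messages.
logger = logging.getLogger("pl_tips")


def tip(message: str) -> None:
    # One-time "check this" advice: INFO level, easy to silence.
    logger.info(message)


def real_warning(message: str) -> None:
    # Something that should surface on every run.
    logger.warning(message)


# A user who has already acted on the tips raises the threshold once:
logger.setLevel(logging.WARNING)

tip("num_workers is low; consider raising it")        # suppressed
real_warning("checkpoint directory is not writable")  # still shown
```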


jlperla commented Nov 19, 2021

@awaelchli also asked (but wasn't suggesting he would support it!) whether I was proposing that we only show these with fast_dev_run = True. My feeling is that the warnings are all pretty useful to see at least once - and maybe people don't use fast_dev_run often enough. I also think it is best for that switch to default to showing the "info"-style warnings, so that manual intervention is required to hide them.


jlperla commented Nov 19, 2021

@carmocca I tried the text you gave on the stable PL docs https://pytorch-lightning.readthedocs.io/en/stable/guides/speed.html#num-workers and it doesn't work right now because the text changed and the regex is no longer valid. You will want to update those docs (see the new text in https://github.com/PyTorchLightning/pytorch-lightning/blob/master/pytorch_lightning/trainer/data_loading.py#L106-L112), but I hope this is also helpful for considering whether this is a maintainable workaround.


carmocca commented Nov 22, 2021

You are suggesting that users literally find segments of the source code to avoid a false-positive suggestion for their code from showing every time. That isn't very accessible for new users, and it requires a high degree of sophistication.

Users would not find "segments of the source code" - they would filter warnings on demand after they appear in their console or logs. The user does not have to find the warning in the code to filter it.

whether I was suggesting that we only show these in a fast_dev_run = True

This could work for the num workers warning because the DataLoaders will get instantiated in fast_dev_run, but I'm not sure this is a valid solution for all maybe-false-positive warnings, which are the ones this issue tries to address. Additionally, some users might not use fast_dev_run at all.

but what if you change the text between versions?

This is the only downside I see, and it's on us the maintainers to avoid changes. A solution for this would be to expose the message itself:

# user code
from pytorch_lightning.somewhere import num_workers_warning_message
warnings.filterwarnings("ignore", num_workers_warning_message)

The flag runtime_linting_warnings is not a good solution because a user might set it to ignore a specific warning, but then other warnings which could be useful won't appear.

I stand by suggesting filterwarnings, as it's the most flexible solution that works for all warnings.
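One detail worth knowing when writing such filters: the `message` argument to `warnings.filterwarnings` is a regular expression matched (case-insensitively) against the start of the warning text, which is why the snippets in this thread lead with `.*`. A self-contained check:

```python
import warnings

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    # The pattern is matched against the START of the message, so a leading
    # ".*" is needed to hit a phrase that appears in the middle.
    warnings.filterwarnings("ignore", message=".*does not have many workers.*")
    warnings.warn("The dataloader does not have many workers", UserWarning)
    warnings.warn("does not have many workers", UserWarning)

print(len(caught))  # both were suppressed by the one filter
```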

@carmocca carmocca added the discussion In a discussion stage label Nov 22, 2021

marcm-ml commented Nov 22, 2021

Similar discussion in #10182


jlperla commented Nov 22, 2021

Users would not find "segments of the source code" -

But I had to! @carmocca You told me to just follow https://pytorch-lightning.readthedocs.io/en/stable/guides/speed.html#num-workers

I tried it and it didn't work. I had to figure out what file that was in, look at the tagged branch for the release I was currently using (because it might be different on the main branch than on the latest released branch), and update the code you posted. Moreover, with a regex, if there are any other legit warnings that overlap with that text, I will now miss them. False negatives are scary to me, especially for legit warnings as opposed to info tips for better performance.

This is the only downside I see, and it's on us the maintainers to avoid changes

But that means you need to write warnings so the text doesn't overlap? That is a difficult process, because you can't write the text without looking through every other warning, and maybe coming up with a table of regexes that each uniquely identify a single warning without accidentally knocking out others. Then you need to put that in the docs and keep it up to date. A huge amount of work, and I am not sure to what end.

But the biggest downside here is simple: these are not warnings in the standard Python sense.

These are more of a "hey, pro-tip: here is something which may help performance on your model" output. Very useful when people want to see them, but especially annoying if you have already checked them, as it creates noise that makes it harder to see legitimate warnings.

Divide by zero - that is a warning that should always show. A singular matrix in linear algebra is a warning that should always show. Those are true warnings. If someone wants to ignore those, then it shouldn't be easy.

Telling people that, with the defaults in PL, my model is only going to log at the end of the epoch and not in the middle of it is not something that should always show on every run. I would say it shouldn't show at all, because I think it is almost always deliberate or the right default behavior, but it certainly shouldn't show every time you run the code.

This could work for the num workers warning because the DataLoaders will get instantiated in fast_dev_run,

Sorry if I misstated my feeling here; I was passing on that idea but don't especially like the fast_dev_run approach. It is something to consider, though, and if that is what would make these disappear I would support it. That way I can do a fast_dev_run once, see if there is any useful advice, and then not see it again.

The flag runtime_linting_warnings is not a good solution because a user might set it to ignore a specific warning, but then other warnings which could be useful won't appear.

To be clear, I think that all of the info/protip/linting-style warnings should go in there. Anything which is a legit warning should not. People turn the flag on, get your protips, decide if they want to apply any of them, then turn it back off again. Almost all of those things are one-time changes and considerations anyway. Even when they are helpful (and they sometimes are), those tips are not something you want to see every time.


carmocca commented Nov 22, 2021

But I had to! @carmocca You told me to just follow

That's not a segment of the source code but a docs snippet.

Then you need to put that in the docs and keep it up to date. A huge amount of work, and I am not sure to what end.

We just added that one because it's one of the most false-positive-prone warnings, but we don't plan to show in the docs how to filter each possible warning.

But that means you need to write warnings so the text doesn't overlap?

The point is, this decision is on the user as he/she is the person disabling the warning. Personally, I suggest writing a minimal meaningful message that differentiates the warning. Conflicts should be very rare if done that way. Anything you can code in a regex works.

These are not warnings in the standard python sense.

This issue is mixing separate topics which are not necessarily linked:

  1. Should we add a mechanism to disable all warnings?
  2. Should warning X be a warning or a logging message?

My answers are responses to (1) as it's the title of the issue. But I would suggest you open separate issues for each instance of (2) you think should be updated.


jlperla commented Nov 22, 2021

That's not a segment of the source code but a docs snippet.

Yes. And the docs were wrong. To fix it, the user had to find the particular branch for the particular released version within the actual source code. Now this workaround works only for certain versions of PL, and there is a chance that you add a new, legit warning in the future and the regex ends up skipping it.

The point is, this decision is on the user as he/she is the person disabling the warning. Personally, I suggest writing a minimal meaningful message that differentiates the warning. Conflicts should be very rare if done that way.

A workaround that is sensitive to the exact source text means the developers of the package need to be careful how they write the underlying messages: (1) ensure they don't change strings between versions; and (2) avoid overlap when creating new warnings by permuting the warning text, so that pre-existing filter code doesn't produce false negatives. That doesn't sound like a robust workaround for anybody.

This issue is mixing separate topics which are not necessarily linked:
Should we add a mechanism to disable all warnings?
Should warning X be a warning or a logging message?
My answers are responses to (1) as it's the title of the issue. But I would suggest you open separate issues for each instance of (2) you think should be updated.

No, this issue is extremely specific. There are things which are currently warnings that shouldn't be. They are "pro-tips" which may or may not be useful to see once, and which are being displayed through the Python warning system. The issue asks for a way to not show those pro-tip/linting things.

I emphatically am NOT saying that we should skip all warnings, as you suggest. Just that "info/protip/linting" ones are different and should be skippable. I don't see my Python linting every time I run my code, regardless of whether it has useful tips or not.

And there is no way for me to open a separate issue for removing warnings, because at this point there is no alternative destination. The purpose of this issue is to suggest a place that particular "warnings" can be moved to ("info"/"pro-tip"). I identified 3 that should not be warnings as an example, but others could have separate issues.

@carmocca

If using the string regex is not an option for you, another thing we can do is add a custom warning class for these warnings that might be false positives

class PossibleUserWarning(UserWarning):
    ...

So that you can do

# user code
from pytorch_lightning.utilities.warnings import PossibleUserWarning
warnings.filterwarnings("ignore", category=PossibleUserWarning)
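A self-contained sketch of how that category-based filter behaves (the class is redefined locally here so the snippet runs standalone; in Lightning it was later shipped as `pytorch_lightning.utilities.warnings.PossibleUserWarning`): filtering on the category silences every warning raised with it, while plain `UserWarning`s still surface.

```python
import warnings


class PossibleUserWarning(UserWarning):
    """Category for tip-style warnings that may be false positives."""


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    # Silence only the tip-style category; real UserWarnings still surface.
    warnings.filterwarnings("ignore", category=PossibleUserWarning)
    warnings.warn("consider increasing num_workers", PossibleUserWarning)
    warnings.warn("divide by zero encountered", UserWarning)

messages = [str(w.message) for w in caught]
print(messages)  # only the plain UserWarning remains
```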


jlperla commented Nov 22, 2021

@carmocca great idea, and thanks for considering this. My apologies for being impatient and frustrated here.

I think this would be great. Furthermore, I think it opens it up for you to add all sorts of additional protip/linting style warnings. With this, you could add far more suggestions which may be false positives since users can turn them off after consideration.

@carmocca

Can you explicitly list here the warnings you think would fit under this new category?


jlperla commented Nov 22, 2021

Thanks. The only ones I have run into are:

The first in this list is the most recent one, which took us a while to check whether it was a legit error or a false positive. Hence part of my frustration here, and why I wrote the issue :-). I won't comment on the others, but I suspect many (even most?) of the current warnings are more of the "protip" style - and I don't think that is a bad thing, since many of them are helpful as a one-shot "hey, check this" in different circumstances to different people.

ethanwhite added a commit to ethanwhite/DeepForest that referenced this issue Aug 10, 2024
There are a number of warnings in PyTorch that represent advice for
improving model, data loader, and training decisions. However these
warnings are confusing to end users and often aren't clearly relevant
even for advanced users. As a result, PyTorch Lightning supports disabling
this class of warnings: PossibleUserWarning.

See:
https://lightning.ai/docs/pytorch/2.4.0/advanced/warnings.html
Lightning-AI/pytorch-lightning#10644

This change follows the recommended approach to hide these warnings
from users.
ethanwhite added a commit to ethanwhite/DeepForest that referenced this issue Aug 11, 2024
There are a number of warnings in PyTorch that represent advice for
improving model, data loader, and training decisions. These warnings
are often triggered during testing. PyTorch Lightning supports disabling this
class of warnings, PossibleUserWarnings.

See:
https://lightning.ai/docs/pytorch/2.4.0/advanced/warnings.html
Lightning-AI/pytorch-lightning#10644

This change follows the recommended approach to hide these warnings
during testing to avoid cluttering the logs. We were filtering several
of them manually, but those filters had stopped working due to changes
in the wording of the warning.

As of fa8f1df these warnings are
suppressed for users, but pytest appears to override warnings filtered
by the package itself so we still need this.