Trainer option to opt-out of warnings which are more like linting than legit issues #10644
This is not a hack but the native mechanism provided by Python to disable warnings. You can ignore based on a regex like this:

```python
import warnings

warnings.filterwarnings("ignore", ".*does not have many workers.*")
```

This is documented here: https://pytorch-lightning.readthedocs.io/en/stable/guides/speed.html#num-workers
Thanks for the help on the regex. But this really is a hack. You are suggesting that users literally find segments of the source code to keep a false-positive suggestion about their code from showing every time. That isn't very accessible for new users and requires a high degree of sophistication. The regex is a little less fragile, but what if you change the text between versions? Or what if you add another warning with the same text that is a legit issue and also gets ignored?
@awaelchli made the suggestion on Slack that there may be a difference between an "info" and a "warning", rather than my "linting" interpretation. A way to ignore the "info" ones would be fine as well. To me, the bar for classifying something as "info" vs. "warning" is whether it is just a one-time thing to check and act upon vs. something you want to know every time you run the code, even after doing a sanity check.
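As a sketch of what acting on that split could look like: PL emits its informational messages through Python's standard `logging` module under the `pytorch_lightning` logger, so "info" items routed there (rather than through `warnings.warn`) could be tuned out independently of true warnings:

```python
import logging

# Silence info-level messages from PL while keeping true warnings:
# messages emitted via logging (rather than warnings.warn) respect
# the standard logger level.
logging.getLogger("pytorch_lightning").setLevel(logging.WARNING)
```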
@awaelchli also asked (but wasn't suggesting he would support it!) whether I was suggesting that we only show these in a `fast_dev_run`.
@carmocca I tried the text you gave on the stable PL docs https://pytorch-lightning.readthedocs.io/en/stable/guides/speed.html#num-workers and it doesn't work right now, because the text changed and the regex is no longer valid. You will want to update those docs (see the new text in https://github.com/PyTorchLightning/pytorch-lightning/blob/master/pytorch_lightning/trainer/data_loading.py#L106-L112), but I hope this is also helpful for considering whether this is a maintainable workaround.
Users would not have to find "segments of the source code": they would filter warnings on demand, after the warning appears in their console or logs. The user does not have to find it in the code to filter it.
This could work for the num workers warning because the DataLoaders will get instantiated in
This is the only downside I see, and it's on us the maintainers to avoid changes. A solution for this would be to expose the message itself:

```python
# user code
import warnings

from pytorch_lightning.somewhere import num_workers_warning_message

warnings.filterwarnings("ignore", num_workers_warning_message)
```

As for the flag: I stand by suggesting `fast_dev_run`.
Similar discussion in #10182
But I had to! @carmocca You told me to just follow https://pytorch-lightning.readthedocs.io/en/stable/guides/speed.html#num-workers. I tried it and it didn't work. I had to figure out what file the warning was in, look at the tagged branch for the release I was currently using (because it might be different on the main branch than on the latest released branch), and update the code you posted. Moreover, with a regex, if there are any other legit warnings whose text overlaps with it, I will now miss them. False negatives are scary to me, especially for legit warnings as opposed to info on tips for better performance.
But that means you need to write warnings so the text doesn't overlap? That is a difficult process, because it means you can't write the text without looking through every other warning there, and maybe coming up with a table of regexes that each uniquely identify a single warning and don't accidentally knock out other ones. Then you need to put that in the docs and keep it up to date. A huge amount of work, and I am not sure to what end. But the biggest downside here is simple: these are not warnings in the standard Python sense. They are more of a "hey, pro-tip: here is something which may help performance on your model" output. Very useful when people want to see them, but especially annoying if you already checked them, as the noise makes it harder to see legitimate warnings. Divide by zero: that is a warning that should always show. A singular matrix in linear algebra is a warning that should always show. Those are true warnings. If someone wants to ignore those, then it shouldn't be easy. Telling people that with the defaults in PL my model is only going to log at the end of the epoch and not in the middle of it is not something that should show on every run. I would say it shouldn't show at all, because I think it is almost always deliberate or the right default behavior, but it certainly shouldn't show every time you run the code.
Sorry if I misstated my feeling here; I was passing on that idea but don't especially like the `fast_dev_run` approach. It is something to consider, though. But if that is something that would make them disappear, I would support it. That way I can do a `fast_dev_run` once to see the suggestions, then run without the noise.
To be clear, I think that all of the info/pro-tip/linting style warnings should go in there. Anything which is a legit warning should not. People turn the flag on, get the pro-tips, decide if they want to apply any of them, then turn it back off again. Almost all of those things are one-time changes and considerations anyway. Even when they are helpful (and they sometimes are), those tips are not something you want to see every time.
That's not a segment of the source code but a docs snippet.
We just added that one because it's one of the warnings most prone to false positives, but we don't plan to show in the docs how to filter every possible warning.
The point is, this decision is on the user, as he/she is the person disabling the warning. Personally, I suggest writing a minimal meaningful message that differentiates the warning. Conflicts should be very rare if done that way. Anything you can code in a regex works.
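For instance, a sketch of a filter keyed to a phrase distinctive enough to appear in only one warning (the quoted phrase is illustrative, not the exact PL wording):

```python
import warnings

# The message argument is a regular expression matched against the start
# of the warning text. Keying it to a phrase that appears in only one
# warning keeps the filter from swallowing unrelated warnings.
# (The phrase below is illustrative, not the exact PL wording.)
warnings.filterwarnings(
    "ignore",
    message=r".*Consider increasing the value of the `num_workers` argument.*",
)
```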
This issue is mixing separate topics which are not necessarily linked:

1. A mechanism for users to opt out of these warnings.
2. Whether specific existing warnings should be warnings at all.
My answers are responses to (1), as it's the title of the issue. But I would suggest you open separate issues for each instance of (2) you think should be updated.
Yes. And the docs were wrong. To fix it, the user had to find the particular branch for the particular released version within the actual source code. Now this workaround works only for certain versions of PL, and there is a chance that somewhere down the line you add a new warning which is legit and the regex ends up skipping it.
Having a workaround that depends on the exact warning text in the source, and that requires the developers of the package to be careful how they write the underlying messages in order to (1) ensure the strings don't change between versions and (2) avoid overlap when creating new warnings (permuting the warning text to avoid false negatives from pre-existing filters), doesn't sound like a robust workaround for anybody.
No, this issue is extremely specific. There are things which are currently warnings that shouldn't be. They are "pro-tips" which may or may not be useful to see once, and which you are using the Python warning system to display. The issue asks for a way to not show those pro-tip/linting things. I emphatically am NOT saying that we should skip all warnings, as you suggest. Just that the "info/pro-tip/linting" ones are different and should be skippable. I don't see my Python linting every time I run my code, regardless of whether it has useful tips or not. And there is no way for me to open a separate issue for removing a warning, because at this point there is no alternative place for it to go. The purpose of this issue is to suggest a place that particular "warnings" can be moved to: "info/pro-tip". I identified three that should not be warnings as an example, but others could have separate issues.
If using the string regex is not an option for you, another thing we can do is add a custom warning class for these warnings that might be false positives:

```python
class PossibleUserWarning(UserWarning):
    ...
```

So that you can do:

```python
# user code
import warnings

from pytorch_lightning.utilities.warnings import PossibleUserWarning

warnings.filterwarnings("ignore", category=PossibleUserWarning)
```
@carmocca great idea, and thanks for considering this. My apologies for being impatient and frustrated here. I think this would be great. Furthermore, I think it opens things up for you to add all sorts of additional pro-tip/linting style warnings. With this, you could add far more suggestions which may be false positives, since users can turn them off after consideration.
Can you explicitly list here the warnings you think would fit under this new category?
Thanks. The only ones I have run into are:
The first in this list is the most recent one, which took us a while to check to see whether it was a legit error or a false positive. Hence part of my frustration here, and why I wrote the issue :-). I won't comment on the others, but I suspect many (even most?) of the current warnings are more of the "pro-tip" style, and I don't think that is a bad thing, since many of them are helpful as a one-shot "hey, check this" in different circumstances to different people.
There are a number of warnings in PyTorch Lightning that represent advice for improving model, data loader, and training decisions. However, these warnings are confusing to end users and often aren't clearly relevant even for advanced users. As a result, PyTorch Lightning supports disabling this class of warnings, PossibleUserWarnings. See: https://lightning.ai/docs/pytorch/2.4.0/advanced/warnings.html Lightning-AI/pytorch-lightning#10644 This change follows the recommended approach to hide these warnings from users.
There are a number of warnings in PyTorch Lightning that represent advice for improving model, data loader, and training decisions. These warnings are often triggered during testing. PyTorch Lightning supports disabling this class of warnings, PossibleUserWarnings. See: https://lightning.ai/docs/pytorch/2.4.0/advanced/warnings.html Lightning-AI/pytorch-lightning#10644 This change follows the recommended approach to hide these warnings during testing to avoid cluttering the logs. We were filtering several of them manually, but those filters had stopped working due to changes in the wording of the warnings. As of fa8f1df these warnings are suppressed for users, but pytest appears to override warnings filtered by the package itself, so we still need this.
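For reference, a sketch of how a test suite might re-apply that suppression under pytest, which manages its own warning filters; the dotted import path of `PossibleUserWarning` has moved between Lightning versions, so treat it as an assumption:

```python
# conftest.py (sketch): register the filter through pytest's own
# "filterwarnings" ini mechanism. pytest re-applies its configured
# filters around each test, so a plain warnings.filterwarnings() call
# at import time would be discarded. The dotted path below has moved
# between Lightning versions; adjust it to match the installed package.


def pytest_configure(config):
    config.addinivalue_line(
        "filterwarnings",
        "ignore::pytorch_lightning.utilities.warnings.PossibleUserWarning",
    )
```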
🚀 Feature
An option on the trainer (or maybe the logger?), called `ignore_minor_warnings`, or, even better for me, inverted as `runtime_linting_warnings`, or something like that. Then, in the cases where a PL warning is more of a suggestion for something to check, it would only show at runtime if `ignore_minor_warnings = False` or `runtime_linting_warnings = True`.

For example, #7734 introduced a warning on `log_every_n_steps` which triggers with the Trainer default arguments if you ever have < 50 batches. The suggestion would be to make the code at https://github.com/PyTorchLightning/pytorch-lightning/blob/44f62435c8d5860b542a1cb2cca66afd9e75cbcc/pytorch_lightning/trainer/data_loading.py#L306-L311 conditional on that flag, as sketched below, etc.
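A rough sketch of what that gating could look like (the `runtime_linting_warnings` flag is the proposal, not existing Trainer API, and the surrounding code paraphrases data_loading.py):

```python
# Inside pytorch_lightning/trainer/data_loading.py (paraphrased).
# runtime_linting_warnings is the proposed opt-in flag, not real API.
from pytorch_lightning.utilities import rank_zero_warn

if self.runtime_linting_warnings and self.logger and self.num_training_batches < self.log_every_n_steps:
    rank_zero_warn(
        f"The number of training batches ({self.num_training_batches}) is smaller than the logging"
        f" interval Trainer(log_every_n_steps={self.log_every_n_steps})."
    )
```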
There are a number of these in PL that shouldn't be shown every time someone runs the model. The ones that cause me the most grief:
Those are the only ones I have run into that are clearly a "linting" sort of situation, but I suspect there are many others. Just to be clear, I don't think this should be a global way to turn off warnings, just a way to turn off warnings that have a high probability of being a false positive.
Motivation
Every time I run my code I see the following.
and even more if I have a GPU accessible on my machine and don't want to use it (as is frequent). I am also concerned since there seem to be even more of these cropping up in PL, so maybe the list will get bigger and bigger. The signal-to-noise ratio of the warnings is limited here, because I have no idea whether something is a legit problem or not.
Pitch
To me there are two types of warnings involved here: (1) a "true warning", which is something you would want to know about every time the code runs, rather than something to show once and let me decide if it is relevant; and (2) something more like "linting" done at runtime: tips and tricks that may or may not be helpful depending on the exact model. These are more likely to have false positives than true warnings, but the biggest difference is that they involve a one-time decision: look at the warning, decide if it is relevant to your setup, then never look at it again. That is why I am calling this more of a linting situation.
I think it is great that PL has more of the "linting" style stuff, since it probably helps in some cases, but it is an annoyance and noise if it always shows, even after I have made a decision on its relevance.
Alternatives
Alternative one is just to ignore the warnings. This is tricky because it masks real warnings. New users would have a lot of difficulty figuring out whether issues are relevant or not. What generally happens is that the user tries the suggestion (e.g. `num_workers = 4`) to see if it helps speed. If it doesn't, they would rather never see it again.

The second alternative is for a user to use a hack to manually turn off the warnings with `import warnings` in their code. For example, to ignore one of these I used to have code along these lines at the top of my file (a reconstructed sketch: the exact message and module path are illustrative; the `lineno=102` is the detail discussed below):
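```python
import warnings

# Fragile: pinned to the module and the exact line that raises the
# warning in one specific PL release. The message and module path are
# illustrative; lineno=102 only applies to a particular point release.
warnings.filterwarnings(
    "ignore",
    message=".*does not have many workers.*",
    module="pytorch_lightning.trainer.data_loading",
    lineno=102,
)
```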
The problems with this are very simple: `lineno=102` may only apply to a particular point release of PL. Even if I pin the PL version in my `requirements.txt` file, anytime things are upgraded I need to go through the PL source and change line numbers. Forget about supporting multiple versions of PL... and if I forget to change things, there is a chance that the lineno will now point to a totally different warning (as the warnings are usually grouped together), which might be a legit one I want to know about.
cc @Borda