Set precision=16 when use_amp is passed as True #1145

Merged · 11 commits into Lightning-AI:master on Apr 6, 2020

Conversation

@rmrao (Contributor) commented Mar 14, 2020

Before submitting

  • Was this discussed/approved via a GitHub issue? (no need for typos and docs improvements)
  • Did you read the contributor guideline, Pull Request section?
  • Did you make sure to update the docs?
  • Did you write any new necessary tests?
  • If you made a notable change (that affects users), did you update the CHANGELOG?

What does this PR do?

Fixes #1143

PR review

Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in GitHub issues, there's a high chance it will not be merged.

Did you have fun?

Make sure you had fun coding 🙃

@Borda added the labels bug (Something isn't working) and feature (Is an improvement or enhancement) on Mar 14, 2020
@Borda added this to the 0.7.2 milestone on Mar 14, 2020
@Borda (Member) left a comment

rather define deprecated API

Review threads: CHANGELOG.md (resolved); pytorch_lightning/trainer/trainer.py (outdated, resolved)
@rmrao (Contributor, Author) commented Mar 14, 2020

Made the requested changes as far as I can tell. If I missed something, let me know!

@rmrao (Contributor, Author) commented Mar 14, 2020

Hmm, it looks like the trainer sets use_amp at some point, and maybe this causes conflicts. I will debug and test locally.

@rmrao (Contributor, Author) commented Mar 15, 2020

Ok, I have found and fixed the previous issue and ran the tests locally, so I believe things should work.

One concern is that it looks like the trainer uses self.use_amp in a number of places. The deprecation strategy will cause these to raise deprecation warnings even if the user passes in precision=16. So I see two options:

  1. Change all checks from self.use_amp to self.precision == 16, or maybe self.precision < 32 (see the sketch below).
  2. Keep the use_amp argument deprecated, but still have the trainer set the use_amp attribute internally, without the deprecated API (basically reverting to the original solution).

Do you have a preference / other suggestions?
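
To make option 1 concrete, here is a minimal sketch of what such a call-site change would look like (illustrative only; the apex initialization shown is typical apex usage, not the trainer's actual code):

from apex import amp  # NVIDIA apex; assumed available for this sketch

# Before: reading the deprecated property would now emit a DeprecationWarning
if self.use_amp:
    model, optimizers = amp.initialize(model, optimizers, opt_level='O1')

# After (option 1): compare the precision attribute directly, no warning
if self.precision == 16:  # or, more permissively: self.precision < 32
    model, optimizers = amp.initialize(model, optimizers, opt_level='O1')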

@Borda (Member) commented Mar 15, 2020

In such a case, it would make sense to have use_amp as a standard property and make just the setter deprecated... as we want to reduce this kind of duplicated setting via use_amp and precision.

@rmrao (Contributor, Author) commented Mar 15, 2020

Right now, the trainer has these lines:

if self.precision == 16 and self.num_tpu_cores is None:
    use_amp = True
self.init_amp(use_amp)

Then in init_amp we have

def init_amp(self, use_amp):
    self.use_amp = use_amp and APEX_AVAILABLE

That means that the trainer, by default, accesses the setter for use_amp. The easiest way I see to solve this is to simply set precision = 16 when use_amp is passed in as an argument, and avoid the setter/getter property. Alternatively, the property should read something like:

@property
def use_amp(self):
    return self.precision < 32 and self.num_tpu_cores is None and APEX_AVAILABLE

@use_amp.setter
def use_amp(self, use_amp):
    warnings.warn("`use_amp` has been deprecated in favor of `precision` since v0.7.0"
                  " and will be removed in v0.9.0", DeprecationWarning)
    self.precision = 16 if use_amp else 32

But I'm not sure what to do about init_amp in that case. Is this a part of the API that can't / shouldn't be changed?
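
To spell out the behavior that property gives, here is a self-contained sketch (the stub Trainer and the hard-coded APEX_AVAILABLE are for illustration only): reading use_amp stays silent, while assigning to it emits the deprecation warning.

import warnings

APEX_AVAILABLE = True  # stand-in for the real apex availability check

class Trainer:  # heavily reduced stub, for illustration only
    def __init__(self):
        self.precision = 32
        self.num_tpu_cores = None

    @property
    def use_amp(self):
        return self.precision < 32 and self.num_tpu_cores is None and APEX_AVAILABLE

    @use_amp.setter
    def use_amp(self, use_amp):
        warnings.warn("`use_amp` has been deprecated in favor of `precision` since v0.7.0"
                      " and will be removed in v0.9.0", DeprecationWarning)
        self.precision = 16 if use_amp else 32

trainer = Trainer()
print(trainer.use_amp)    # False -- reading the property emits no warning
trainer.use_amp = True    # DeprecationWarning fires here, in the setter
print(trainer.precision)  # 16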

@Borda (Member) left a comment

Good approach, just a bit of cleaning according to your comment that we need use_amp elsewhere.
Please make the changes you suggest in #1145 (comment).

Review threads: pytorch_lightning/trainer/deprecated_api.py (3 outdated threads, resolved)
@rmrao (Contributor, Author) commented Mar 15, 2020

I can make those changes. What should I do about init_amp? Right now it looks like this:

    def init_amp(self, use_amp):
        self.use_amp = use_amp and APEX_AVAILABLE
        if self.use_amp:
            log.info('Using 16bit precision.')

        if use_amp and not APEX_AVAILABLE:  # pragma: no cover
            msg = """
            You set `use_amp=True` but do not have apex installed.
            Install apex first using this guide and rerun with use_amp=True:
            https://github.com/NVIDIA/apex#linux

            this run will NOT use 16 bit precision
            """
            raise ModuleNotFoundError(msg)

That means that the trainer accesses the self.use_amp setter regardless of whether the user actually passes in the use_amp parameter. I can change the method signature, but this may cause problems for users who have overridden this method. I am not sure whether overriding this method is supported, but if it is, then I assume the signature needs to be kept the same.

One option is to make the following change:

    def init_amp(self, precision):
        if isinstance(precision, bool):
            warnings.warn("`use_amp` has been deprecated in favor of `precision` since v0.7.0"
                          " and will be removed in v0.9.0", DeprecationWarning)
            self.use_amp = precision

        if self.use_amp:
            log.info('Using 16bit precision.')
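
For clarity, the two call styles would then behave as follows (illustrative only, and assuming the property-based use_amp from the earlier comment, so that reading self.use_amp derives it from self.precision):

    trainer.init_amp(True)  # old boolean argument: warns, then routes through the deprecated setter
    trainer.init_amp(16)    # new precision value: no warning; `self.use_amp` is read via the property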

@codecov bot commented Mar 16, 2020

Codecov Report

Merging #1145 into master will decrease coverage by 0%.
The diff coverage is 93%.

@@          Coverage Diff           @@
##           master   #1145   +/-   ##
======================================
- Coverage      92%     92%   -0%     
======================================
  Files          63      63           
  Lines        3323    3330    +7     
======================================
+ Hits         3051    3057    +6     
- Misses        272     273    +1     

@Borda (Member) commented Mar 17, 2020

@rmrao could you please update from master? We have fixed the failure in #1163.

@Borda (Member) commented Mar 23, 2020

@rmrao I have applied all previous suggestions, but we still need to add a test for this one.

@mergify bot (Contributor) commented Mar 24, 2020

This pull request is now in conflict... :(

@Borda added the label priority: 0 (High priority task) on Mar 27, 2020
@Borda (Member) commented Mar 27, 2020

@rmrao how is it going, could I help?

@mergify bot requested a review from a team on March 30, 2020 22:33
@pep8speaks commented Mar 30, 2020

Hello @rmrao! Thanks for updating this PR.

Line 24:111: E501 line too long (115 > 110 characters)

Comment last updated at 2020-04-06 07:42:59 UTC

@williamFalcon (Contributor) commented

@Borda can we merge this?

@Borda (Member) left a comment

LGTM 🚀 @williamFalcon ^^

@Borda added the label ready (PRs ready to be merged) on Mar 30, 2020
@Borda requested reviews from a team, justusschock, and tullie on March 31, 2020 07:40
@Borda (Member) commented Mar 31, 2020

it seems that there is an incompatibility with master... @rmrao ^^

@Borda removed the label ready (PRs ready to be merged) on Mar 31, 2020
@Borda assigned Borda and unassigned jeremyjordan on Apr 4, 2020
@rmrao (Contributor, Author) commented Apr 4, 2020

Ok, I just went back to the easy solution. Tests pass locally, so it should be good.
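
For readers following along, the easy solution amounts to translating the deprecated flag into the new argument once, at construction time. A heavily reduced sketch of the idea (not the exact merged diff; the Trainer's other arguments are elided):

import warnings

class Trainer:  # reduced sketch, for illustration only
    def __init__(self, precision=32, use_amp=False):
        if use_amp:
            warnings.warn("`use_amp` has been replaced by `precision` since v0.7.0"
                          " and will be removed in v0.9.0", DeprecationWarning)
            precision = 16
        self.precision = precision

assert Trainer(use_amp=True).precision == 16
assert Trainer(precision=16).precision == 16  # the new path emits no warning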

@Borda (Member) commented Apr 4, 2020

@rmrao it seems that I do not have permission to push to your master to edit this PR; have you enabled edits by maintainers?
https://help.github.com/en/github/collaborating-with-issues-and-pull-requests/allowing-changes-to-a-pull-request-branch-created-from-a-fork

@rmrao (Contributor, Author) commented Apr 4, 2020

I definitely have edits by maintainers enabled; I just double-checked.

@Borda requested a review from jeremyjordan on April 4, 2020 23:14
@Borda (Member) commented Apr 4, 2020

Just curious: there is also use_amp in the model... isn't that kind of a duplication, @jeremyjordan?

@mergify bot (Contributor) commented Apr 5, 2020

This pull request is now in conflict... :(

@Borda (Member) commented Apr 5, 2020

@rmrao I did some more cleaning; could you please check and resolve the conflicts...

@Borda added the label ready (PRs ready to be merged) on Apr 5, 2020
@rmrao (Contributor, Author) commented Apr 5, 2020

It looks good to me!

@williamFalcon (Contributor) commented

@rmrao mind rebasing?

@Borda (Member) commented Apr 5, 2020

@williamFalcon rebased ^^

Review thread: CHANGELOG.md (resolved)
@williamFalcon merged commit 4ed3027 into Lightning-AI:master on Apr 6, 2020
tullie pushed a commit to tullie/pytorch-lightning that referenced this pull request Jun 7, 2020
* Set precision=16 when use_amp is passed as True

* Update CHANGELOG.md

* add use_amp to deprecated API

* Update trainer.py

* Update trainer.py

* move the use_amp attribute to deprecated API

* move use_amp deprecation back to Trainer's __init__

* drop unsed

* drop deprecated

* reorder imports

* typing

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: William Falcon <waf2107@columbia.edu>
Co-authored-by: J. Borovec <jirka.borovec@seznam.cz>
@Borda modified the milestones: v0.7.2, v0.7.x on Apr 18, 2021
Labels: bug (Something isn't working), feature (Is an improvement or enhancement), priority: 0 (High priority task), ready (PRs ready to be merged)

Linked issue: use_amp is broken in 0.7.0 (#1143)

7 participants