Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unify attribute finding logic, fix not using dataloader when hparams present #4559

Merged
merged 21 commits into from
Jan 26, 2021

Conversation

rnett
Copy link
Contributor

@rnett rnett commented Nov 6, 2020

Simple PR to fix #4558.

Unifies attribute finding logic into lightning_get_attr_holder, which does what it says and returns the object or dict with the target attribute, or none if it can't find one. The lightning get/set/has attribute methods then use it.

Changes the logic of attribute finding (in lightning_get_attr_holder) to check model, model.hparams, and dataload, and return the first one that has the attribute. Previously, if the model didn't have the attribute, but did have hparams, it would always return the result of checking hparams and never fall through to dataloader (unless hparams didn't exist, but it's a property). This change needs reviewing.

This successfully fixes the issue in my code.

I'm having issues getting the tests/utilities/parsing.py tests to pass because the test model objects don't have trainer attributes. As far as I can see, this would have caused test failures in the original code too, what's the solution here? It does not seem to be an actual issue.

@pep8speaks
Copy link

pep8speaks commented Nov 6, 2020

Hello @rnett! Thanks for updating this PR.

Line 225:121: E501 line too long (145 > 120 characters)

Comment last updated at 2021-01-25 23:47:24 UTC

@Borda Borda added the refactor label Nov 8, 2020
@Borda Borda added this to the 1.1 milestone Nov 8, 2020
@Borda Borda modified the milestones: 1.1, 1.1.x Nov 30, 2020
@Borda
Copy link
Member

Borda commented Nov 30, 2020

@rnett how is it going, can we merge it next week?

@rnett
Copy link
Contributor Author

rnett commented Nov 30, 2020

This is finals week, but I should at a minimum be able to finish it over the weekend.

@rnett
Copy link
Contributor Author

rnett commented Dec 7, 2020

Should be good to go.

@carmocca
Copy link
Contributor

carmocca commented Dec 7, 2020

Can you merge/rebase master?

@rnett
Copy link
Contributor Author

rnett commented Dec 7, 2020

Rebase done.

@codecov
Copy link

codecov bot commented Dec 7, 2020

Codecov Report

Merging #4559 (553251f) into master (c76cc23) will decrease coverage by 0%.
The diff coverage is 100%.

@@          Coverage Diff           @@
##           master   #4559   +/-   ##
======================================
- Coverage      93%     93%   -0%     
======================================
  Files         135     135           
  Lines       10051   10048    -3     
======================================
- Hits         9383    9380    -3     
  Misses        668     668           

@rnett
Copy link
Contributor Author

rnett commented Dec 7, 2020

Tests are passing again, the code check failures are 1) an error in PyRight and 2) not in my code.

Copy link
Contributor

@carmocca carmocca left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good job! This is shaping up nicely 👍

One more comment:

  • Can you add a test with the case that motivated this change?

Previously, if the model didn't have the attribute, but did have hparams, it would always return the result of checking hparams and never fall through to dataloader (unless hparams didn't exist, but it's a property).

pytorch_lightning/utilities/parsing.py Show resolved Hide resolved
@rnett
Copy link
Contributor Author

rnett commented Dec 7, 2020

Something else I noticed: the old getattr updates attr each time it finds a matching attribute, and will return the last one. I.e. it will prefer dataloder.attr over model.attr. I'm fixing my function to match it, but can someone verify this is the intended behavior? It seems possibly backwards (although I could see the rational either way).

@carmocca
Copy link
Contributor

carmocca commented Dec 7, 2020

Something else I noticed: the old getattr updates attr each time it finds a matching attribute, and will return the last one. I.e. it will prefer dataloder.attr over model.attr. I'm fixing my function to match it, but can someone verify this is the intended behavior? It seems possibly backwards (although I could see the rational either way).

I'm not sure if this is intended or just an implementation detail. You could add a comment to indicate that the last is taken for backwards compatibility

Copy link
Contributor

@carmocca carmocca left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@mergify mergify bot requested a review from a team December 12, 2020 14:59
@SeanNaren SeanNaren added bug Something isn't working priority: 1 Medium priority task labels Jan 7, 2021
Copy link
Contributor

@SeanNaren SeanNaren left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice catch, took me some time to grok but the fix cleans a lot of the attribute handling. Thanks @rnett!

@SeanNaren SeanNaren enabled auto-merge (squash) January 7, 2021 16:45
def lightning_hasattr(model, attribute):
""" Special hasattr for lightning. Checks for attribute in model namespace,
the old hparams namespace/dict, and the datamodule. """
def lightning_get_all_attr_holders(model, attribute):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

be careful, this is API change
we are free to change anything which is protected (starting with _) but the remaining is public API
so even when we are adding new features, make a decision about what is meant to the public and protected api
cc: @PyTorchLightning/core-contributors

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lightning_hasattr is still there, the diff is just not rendering properly

attr = attribute in model.hparams
else:
attr = hasattr(model.hparams, attribute)
if hasattr(model, 'hparams'):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hparams is property so what is the case it would be missing? some old model versions?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function is just collectin attr candidates. If there is no hparams, one of the previous/next candidates will be used

# Check if the attribute in datamodule (datamodule gets registered in Trainer)
if not attr and trainer is not None:
attr = hasattr(trainer.datamodule, attribute)
if trainer is not None and trainer.datamodule is not None and hasattr(trainer.datamodule, attribute):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

datamodule is also quite new

Suggested change
if trainer is not None and trainer.datamodule is not None and hasattr(trainer.datamodule, attribute):
if trainer is not None and hasattr(trainer, 'datamodule') is not None and hasattr(trainer.datamodule, attribute):

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not against being extra-safe but this should not fail. Unless a user is using this code with their own Trainer that does not inherit PL's

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Borda I think we don't need this, since we only expect this code to be used with the current version of the trainer and not some old.

Plus, if you want to add this, instead of hasattr(trainer, 'datamodule') you'd have to do getattr(trainer, 'datamodule', None) sinde hasattr` only return True or False.

return holders


def lightning_get_first_attr_holder(model, attribute):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we really need this? is also not very consistent as the name says "first" but you take last as [-1]

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As mentioned in line 229, the last is taken to preserve backwards compatibility. In case somebody is relying on the attribute candidate order

Comment on lines +233 to +236
def lightning_hasattr(model, attribute):
""" Special hasattr for lightning. Checks for attribute in model namespace,
the old hparams namespace/dict, and the datamodule. """
return lightning_get_first_attr_holder(model, attribute) is not None
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not sure of the usage

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you elaborate?

@@ -20,6 +20,9 @@ def _get_test_cases():
class TestHparamsNamespace:
learning_rate = 1

def __contains__(self, item):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mind add context for this state? (as a docstring)

if len(holders) == 0:
return None
# using the last holder to preserve backwards compatibility
return holders[-1]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So if an attribute key is in both pl_module and datamodule, the datamodule attr will be used? Is this intended?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't say for sure if that was the original intention but it's not a bad idea. Usually, if a datamodule is given it's because you want to use it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The edge case is where the pl_module and datamodule contain the same key but different values. I guess this ties in with #3792

# Check if the attribute in datamodule (datamodule gets registered in Trainer)
if not attr and trainer is not None:
attr = hasattr(trainer.datamodule, attribute)
if trainer is not None and trainer.datamodule is not None and hasattr(trainer.datamodule, attribute):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Borda I think we don't need this, since we only expect this code to be used with the current version of the trainer and not some old.

Plus, if you want to add this, instead of hasattr(trainer, 'datamodule') you'd have to do getattr(trainer, 'datamodule', None) sinde hasattr` only return True or False.

@SeanNaren SeanNaren merged commit eee3b1a into Lightning-AI:master Jan 26, 2021
Borda pushed a commit that referenced this pull request Feb 4, 2021
…present (#4559)

* Rebase onto master

* indent fix

* Remove duplicated logic

* Use single return

* Remove extra else

* add `__contains__` to TestHparamsNamespace to fix tests

* Fix lightning_setattr to set all valid attributes

* update doc

* better names

* fix holder order preference

* tests for new behavior

* Comment about using the last holder

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Sean Naren <sean.narenthiran@gmail.com>

(cherry picked from commit eee3b1a)
Borda pushed a commit that referenced this pull request Feb 4, 2021
…present (#4559)

* Rebase onto master

* indent fix

* Remove duplicated logic

* Use single return

* Remove extra else

* add `__contains__` to TestHparamsNamespace to fix tests

* Fix lightning_setattr to set all valid attributes

* update doc

* better names

* fix holder order preference

* tests for new behavior

* Comment about using the last holder

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Sean Naren <sean.narenthiran@gmail.com>

(cherry picked from commit eee3b1a)
Borda pushed a commit that referenced this pull request Feb 4, 2021
…present (#4559)

* Rebase onto master

* indent fix

* Remove duplicated logic

* Use single return

* Remove extra else

* add `__contains__` to TestHparamsNamespace to fix tests

* Fix lightning_setattr to set all valid attributes

* update doc

* better names

* fix holder order preference

* tests for new behavior

* Comment about using the last holder

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Sean Naren <sean.narenthiran@gmail.com>

(cherry picked from commit eee3b1a)
Borda pushed a commit that referenced this pull request Feb 4, 2021
…present (#4559)

* Rebase onto master

* indent fix

* Remove duplicated logic

* Use single return

* Remove extra else

* add `__contains__` to TestHparamsNamespace to fix tests

* Fix lightning_setattr to set all valid attributes

* update doc

* better names

* fix holder order preference

* tests for new behavior

* Comment about using the last holder

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Sean Naren <sean.narenthiran@gmail.com>

(cherry picked from commit eee3b1a)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working priority: 1 Medium priority task refactor
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Attribute finder doesn't check datamodule when hparams is set
7 participants