[IPU] Call accelerator hooks regardless if LM hook overridden 1/n #7826

Merged · 6 commits · Jun 4, 2021 (showing changes from 1 commit)
10 changes: 6 additions & 4 deletions pytorch_lightning/trainer/trainer.py
@@ -1237,11 +1237,13 @@ def call_hook(self, hook_name: str, *args, **kwargs) -> Any:
hook_fx = getattr(model_ref, hook_name)
output = hook_fx(*args, **kwargs)

- # if the PL module doesn't have the hook then call the accelerator
- # used to auto-reduce things for the user with Results obj
@awaelchli (Contributor) commented on lines -1240 to -1241, Jun 4, 2021:
One example is training_step_end for DP, which is handled in the plugin. If the user doesn't do the reduction, we do it in the accelerator. But now with this change, if the user has it overridden, you still call training_step_end on the DP plugin, which will do another automatic reduction.
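For readers unfamiliar with the behavior being referenced, here is a minimal, hypothetical stand-in for the DP plugin's automatic reduction (assumed behavior for illustration, not the actual Lightning implementation); if the user's override already reduced the outputs, running a hook like this again performs a second reduction.

import torch

# Hypothetical stand-in for the DP plugin's automatic reduction.
def dp_auto_reduce(training_step_end_output: torch.Tensor) -> torch.Tensor:
    # DP produces one output per GPU replica; average them into a single scalar.
    return training_step_end_output.mean()

per_replica_losses = torch.tensor([0.2, 0.4])  # one loss per device
print(dp_auto_reduce(per_replica_losses))      # tensor(0.3000)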

- elif hasattr(self.accelerator, hook_name):
+ # call the accelerator hook
+ if hasattr(self.accelerator, hook_name):
@ananthsub (Contributor) commented, Jun 4, 2021:

I feel less comfortable adding more dependencies to call_hook. This glues together calling code across three distinct interfaces: callback hooks, model hooks, and accelerator hooks. What happens if one takes different arguments, or if we need to call them in different orders?

Separately (unrelated to this PR), we're aggregating all of these hooks together in the profiling results. The model/callbacks/accelerator could all have different behavior. I think it's worth splitting these out into separate profiling sections to provide more fine-grained visibility.

Contributor:

I agree, the accelerator hooks shouldn't be a part of call_hook.

Also agree on the profiling comment.

@SeanNaren (Contributor, PR author):

Thanks @ananthsub!

For the IPU integration we'll move ahead as is, but afterwards I want to remove this accelerator logic from call_hook and call the accelerator hooks explicitly from the loop functions, since this can foster some nasty bugs as you suggested. Does that sound fair?

Contributor:

@SeanNaren that sounds good!

accelerator_hook = getattr(self.accelerator, hook_name)
- output = accelerator_hook(*args, **kwargs)
+ accelerator_output = accelerator_hook(*args, **kwargs)
+ # Rely on the accelerator output if lightningModule hook returns nothing
+ if output is None:
+     output = accelerator_output
carmocca marked this conversation as resolved.

if not skip:
self._cache_logged_metrics()
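Putting the trainer.py hunk together, here is a simplified sketch of the dispatch order this change introduces (an assumed helper for illustration, not the actual Trainer.call_hook implementation): the LightningModule hook runs first, the accelerator hook is now called regardless of whether the module overrides it, and the accelerator's return value is used only when the module hook returned nothing.

from typing import Any

def call_hook_sketch(model, accelerator, hook_name: str, *args: Any, **kwargs: Any) -> Any:
    output = None

    # 1. Call the hook on the LightningModule if it defines it.
    if hasattr(model, hook_name):
        output = getattr(model, hook_name)(*args, **kwargs)

    # 2. Call the accelerator hook as well (previously skipped whenever the
    #    module defined the hook, because of the `elif`).
    if hasattr(accelerator, hook_name):
        accelerator_output = getattr(accelerator, hook_name)(*args, **kwargs)

        # 3. Rely on the accelerator output only if the module hook returned None.
        if output is None:
            output = accelerator_output

    return output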
5 changes: 2 additions & 3 deletions pytorch_lightning/trainer/training_loop.py
@@ -624,9 +624,8 @@ def _on_train_epoch_end_hook(self, processed_epoch_output) -> None:
else:
model_ref.on_train_epoch_end()

- # if the PL module doesn't have the hook then call the accelerator
- # used to auto-reduce things for the user with Results obj
- elif hasattr(self.trainer.accelerator, hook_name):
+ # call the accelerator hook
+ if hasattr(self.trainer.accelerator, hook_name):
accelerator_hook = getattr(self.trainer.accelerator, hook_name)
accelerator_hook()

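A toy before/after comparison of the training_loop.py change (hypothetical classes for illustration, not Lightning code): with the old `elif`, an accelerator that relies on its on_train_epoch_end hook, for example for IPU device bookkeeping, never saw the call once the user overrode the module hook; with the new `if`, both hooks run.

class ToyModule:
    def on_train_epoch_end(self):
        print("LightningModule.on_train_epoch_end (user override)")

class ToyAccelerator:
    def on_train_epoch_end(self):
        print("Accelerator.on_train_epoch_end (device bookkeeping)")

model, accelerator = ToyModule(), ToyAccelerator()
hook_name = "on_train_epoch_end"

# Old behavior: the accelerator hook was only a fallback, so it is skipped here.
if hasattr(model, hook_name):
    getattr(model, hook_name)()
elif hasattr(accelerator, hook_name):
    getattr(accelerator, hook_name)()  # never reached when the module overrides

# New behavior: the accelerator hook runs regardless of the override.
if hasattr(model, hook_name):
    getattr(model, hook_name)()
if hasattr(accelerator, hook_name):
    getattr(accelerator, hook_name)()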