-
Notifications
You must be signed in to change notification settings - Fork 3.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Make default tqdm dict overridable #749
Make default tqdm dict overridable #749
Conversation
@kuynzereb fails on GPU tests...
|
I think I have fixed it. But actually I don't understand how did you get these fails, because travis shows no errors and I was not able to reproduce it locally... |
these are GPU tests. Travis doesn't run GPU tests |
Ah, okay, got it. Moreover several tests require more than 1 gpu, so it turned out that I can't run the full test-suite because now I have only 1 gpu on my machine :( |
tackling #486 |
tqdm_dict['gpu'] = '{}'.format(torch.cuda.current_device()) | ||
|
||
return tqdm_dict | ||
return dict(**ref_model.get_tqdm_dict(), **self.tqdm_metrics) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would go with
if self.tqdm_metrics == None:
return ref_model.get_tqdm_dict()
else:
return self.tqdm_metrics
so that that the default dict can be completely overloaded
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But it is already can be completely overloaded :)
self.tqdm_metrics
is not the default, it is totally determined by progress_bar
dict from the training_end()
and validation_end()
. And model.get_tqdm_dict()
adds some additional information like version number, gpu, etc. With this PR it is also totally controlled by the user.
So if you return an empty dict in model.get_tqdm_dict()
and have no progress_bar
dicts you will have a tqdm progress bar without any additional info.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
right, sorry for the noise, I thought that get_tqdm_dict
was not exposed. Is it discoverable? Should something be added somewhere in the docs?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, now it can be found as one of the methods of LightningModule
. Maybe we can add some information about how progress bar is composed, but I don't understand where it can be inserted in the current docs.
@kuynzereb @CarloLucibello what's the verdict? @Borda |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
@@ -1211,6 +1211,25 @@ def on_save_checkpoint(self, checkpoint): | |||
""" | |||
pass | |||
|
|||
def get_tqdm_dict(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
make it as property like a tqdm_params
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, at first I made it property. But then I accidentally found out that it will cause misleading error messages, if there is an attribute error in the user code. What I mean, if you make
@property
def tqdm_dict(self):
_ = self.trainer.some_non_existing_attribute
return {}
Then after
return dict(**ref_model.tqdm_dict, **self.tqdm_metrics)
you will get
AttributeError: 'CoolSystem' object has no attribute 'tqdm_dict'
I don't know how to fix it, but it seems to be the known issue: https://stackoverflow.com/questions/15348331/why-does-property-decorator-show-object-has-no-attribute
Overall, I decided to stay without property, so the user gets the clear error message if there is some error in his code.
looks fine |
Resolves #629.
This PR moves default
tqdm_dict
definition fromTrainer
toLightningModule
, so it can be overridden by the user. Now you just need to overwriteget_tqdm_dict()
on the module to define your own default tqdm items.Also I have removed
batch_idx
andgpu
from the default tqdm dict and changed allbatch/s
toit/s
as was suggested by @CarloLucibello.@tullie, @williamFalcon, @Borda