Is `hparams` really a good practice? (#1735)
What I like about the […]

@awaelchli I agree, but […]
You raise multiple issues in your first post and I just commented on one. You can't agree with me and then in the same sentence disagree with everything I said xD That doesn't make sense haha. Anyway, I will give my thoughts on all the remaining points. Your point (1): one possibility would be to add the same helper functions from Trainer to the callbacks and loggers, e.g. allowing construction via `callback = Callback.from_argparse_args(args)`. Your point (2): what would happen is that the argparser would complain that you are trying to add the same argument a second time.
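As a hedged illustration of point (2): `argparse` does refuse a second registration of the same option string, which is what a shared parent parser would hit. The flag name below is made up for the sketch.

```python
import argparse
from argparse import ArgumentParser

# Sketch: registering the same option twice, as would happen if Trainer
# and the model both added --lr to a combined parent parser.
parser = ArgumentParser()
parser.add_argument("--lr", type=float, default=1e-3)

try:
    parser.add_argument("--lr", type=float, default=1e-2)
    duplicate_error = None
except argparse.ArgumentError as e:
    duplicate_error = e

print(type(duplicate_error).__name__)  # ArgumentError
```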
Yes, I agree, and it is already possible; Lightning does not prevent us from doing it. I do this in my script so that I can pass only the true hparams to my model and don't have any Trainer args in there. This can be achieved with two parsers and then using […]. Your point (3): we store the hparams in a dict (or Namespace) because otherwise it would not be so easy to restore the model for the user (e.g. when loading from a checkpoint). We typically want to save the hyperparameters with the model. I don't fully understand the rest about the IDE issues you mention. I think what you want to say is that it is better to plug all hyperparameters in as arguments to the module, but with many hyperparams this becomes hard to manage. Again, how would Lightning know where to grab the hparams? Don't get me wrong, I'm open to changes. I just also see many advantages of the current setup (best practice?).
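A minimal sketch of the two-parser split described here, using `parse_known_args` so each parser ignores the other's flags (the argument names are illustrative, not from the thread):

```python
from argparse import ArgumentParser

# Model hyperparameters live in one parser, Trainer flags in another,
# so the model never sees Trainer arguments and vice versa.
model_parser = ArgumentParser()
model_parser.add_argument("--learning_rate", type=float, default=1e-3)
model_parser.add_argument("--hidden_dim", type=int, default=128)

trainer_parser = ArgumentParser()
trainer_parser.add_argument("--max_epochs", type=int, default=10)

argv = ["--learning_rate", "0.01", "--max_epochs", "5"]
hparams, leftover = model_parser.parse_known_args(argv)
trainer_args, _ = trainer_parser.parse_known_args(leftover)

print(hparams.learning_rate, trainer_args.max_epochs)  # 0.01 5
```

With this split, `hparams` can be handed to the model and `trainer_args` to the Trainer without either namespace carrying foreign arguments.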
@awaelchli Seems like we misunderstood each other? 😉
First of all, I don't really understand why you need different training scripts for the same module. I thought the reasons behind having […]. You can even take a look at […].

What I mentioned is expressing a model definition in the […]. Bear in mind that you can always initialize the object using […].
Perhaps a good point, but I don't know PTL that well. Can you elaborate on what the problem is?

Thoughts: I think the topic itself relates very much to how the training takes place. Although it relates to Hydra integration, a similar discussion can be found in #807. @yukw777 looks for the cleanest approach to integrating this, ending up with something considered a bad practice in the PTL docs.

In this case, I don't think it's a good idea to override […]. In addition:
[…] seems very alike, meaning that […]

@mateuszpieniak I don't quite get why you think specifying […]

When it comes to […]

It's fair to point out that the docs refer only to […]
So if you need to "construct an 'empty' model before loading the weights", use a constructor for that; hence the name in object-oriented programming 😉 For example, there is a reason behind […]. As you said, […]. My whole point with that initialization is that I find it an anti-pattern, not a good practice. You can also see that under this link. PL basically does the same, having an unknown dynamic structure (a dict) that you dynamically read from.
@mateuszpieniak Ah, I see what you mean about the way I initialize […]. Another thing to consider about […]. Having said that, I do think that […]. I believe this discussion is also related to: #1755

Also related: #1766
Played around with […]. Maybe we need to have a new abstract method on `LightningModule`:

```python
class LightningModule(nn.Module):
    def __init__(self, *args, **kwargs):
        super().__init__()
        self.hparams = {...}  # we can still have hparams initialized here for logging

    @property
    @abstractmethod
    def constructor_args(self) -> Tuple[Tuple[Any, ...], Dict[str, Any]]:
        """Everything returned by this needs to be picklable."""

    @classmethod
    def _load_model_state(cls, checkpoint: Dict[str, Any]) -> 'LightningModule':
        args, kwargs = checkpoint['constructor_args']
        model = cls(*args, **kwargs)
        model.load_state_dict(checkpoint['state_dict'])
        # give the model a chance to load something
        model.on_load_checkpoint(checkpoint)
        return model


class TrainerIOMixin(ABC):
    def dump_checkpoint(self):
        checkpoint = {}

        # add the constructor args and state_dict from the model
        model = self.get_model()
        checkpoint['constructor_args'] = model.constructor_args
        checkpoint['state_dict'] = model.state_dict()

        # save native amp scaling
        if self.use_amp and self.use_native_amp:
            checkpoint['native_amp_scaling_state'] = self.scaler.state_dict()

        # give the model a chance to add a few things
        model.on_save_checkpoint(checkpoint)
        return checkpoint
```
@awaelchli @Borda @williamFalcon thoughts?
The same logic above also applies to […]. Now we come to the question of what we can do about it.

I imagined that for each […]

Sure, we can still have this, but the logger object has to be created on the same "code level" as the model object. For example, […]
These aren't really constraints to make PL work... you can use PL all day long without hparams. The problems you'll eventually run into are: […]

So, as a best practice, we usually store these details in a csv or in the checkpoint itself. Do you have any other ways that might be better? Ideally we don't need hparams but instead read what defaults the model was init with and save those. However, I don't think Python has that ability.
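As an aside, the standard library can get partway toward "read what the model was init with": `inspect` can recover the arguments a constructor was called with, defaults included. A sketch under the assumption that each class records its own call frame (the class and attribute names are made up):

```python
import inspect

class ToyModel:
    def __init__(self, lr=1e-3, hidden_dim=128):
        # read this frame's declared parameters and their current values;
        # defaults that the caller did not override are already filled in
        frame = inspect.currentframe()
        arg_names, _, _, values = inspect.getargvalues(frame)
        self.captured = {name: values[name] for name in arg_names if name != "self"}

model = ToyModel(lr=0.01)
print(model.captured)  # {'lr': 0.01, 'hidden_dim': 128}
```

The catch, discussed below, is that this only works cleanly when done at the leaf class; a base class cannot see arguments its subclasses never forward.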
@williamFalcon what do you think about my proposal? Basically, have people define how their models should be (de)serialized, and PL simply follows that. We can also have people specify the hyperparameters they want tracked with the loggers themselves, as @mateuszpieniak suggested.
My point is that you cannot call something "a good practice" when it is a necessary condition for the functionality to work. We should at least specify in the docs that some functionality won't work unless you design your class in a certain way.
Now I see the reason! 😉 Currently I see two solutions to make it more OOP-friendly. Frankly speaking, I don't know whether they are better.
Eventually I think that […]
@mateuszpieniak nice suggestions! I like your second suggestion too. I guess I thought it'd be better to have the user be explicit about what they want to save rather than trying to save it automatically, hence my suggestion... but I guess we can be more clever, since we're basically just saving what needs to be passed to […]
Thinking about it more... I don't think Option 2 would work, @mateuszpieniak, when there are multiple levels of inheritance. Consider this: […]
Then, […]. My suggestion and your first suggestion would work because they save the constructor arguments at the leaf level.
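A toy sketch of the inheritance pitfall (class names are made up): anything captured above the leaf class only ever sees what subclasses choose to forward, so the recorded arguments are incomplete.

```python
# If the *base* class tried to record its caller's arguments, it would only
# see what the subclass forwarded, not what the user passed to the leaf class.
class Base:
    def __init__(self, **kwargs):
        self.seen_by_base = dict(kwargs)  # what the base class can observe

class Child(Base):
    def __init__(self, lr=1e-3, hidden_dim=128):
        super().__init__(lr=lr)           # forwards only part of its arguments
        self.hidden_dim = hidden_dim

c = Child(lr=0.01, hidden_dim=64)
print(c.seen_by_base)  # {'lr': 0.01} -- hidden_dim was never forwarded
```

Capturing at the leaf `__init__` avoids this, since that is the only frame where every user-supplied argument is visible.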
@yukw777 You're right 😿
What do you think? I personally like both approaches, but 4) might be a bit risky since we fully rely on one fairly new technology. If we need to control the object's creation process, then we should consider a factory design pattern: we don't just create the object explicitly, but have some other object that controls the creation.
fixed on master
❓ Questions and Help

I am a bit confused about good practices in PyTorch Lightning, having `hparams` in mind in particular. I will provide some of my thoughts about this topic. The docs say: […]

1. How do I pass parameters to the `EarlyStopping` callback, e.g. `patience`? The only way I can think of is using […], a bad practice. Please let me know how to do it in a nice way.
2. What if `Trainer` and `LitModel` both use the same argument name for different purposes? In the docs, they are combined into a parent parser. Perhaps a good practice is to parse them separately?
3. […] whether the `add_model_specific_args` static method & passing only `hparams` into `__init__` is a good idea.
   a) If I properly assigned `__init__` arguments to `self`, then the IDE would flag an `Unresolved reference` error before running the code. It happened to me many times to have some leftovers hanging around because of this `hparams`.
   b) Basically, `add_model_specific_args` requires me to track all occurrences of `self.hparams` in the code visually! If I properly assigned `__init__` arguments to `self`, I would just need to take a look at the `__init__` section.

I think I know the reasons why `hparams` was introduced (easy tracking in TensorBoard or rapid research), but I believe this concept should be reconsidered.
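For illustration, a minimal sketch of the contrast in point 3 (class names are made up): explicit `__init__` attributes are statically resolvable by an IDE, while attributes read off a `Namespace` of hparams are opaque until runtime.

```python
from argparse import Namespace

class ExplicitModel:
    def __init__(self, lr: float, hidden_dim: int):
        self.lr = lr                  # IDE can resolve self.lr at every use site
        self.hidden_dim = hidden_dim

class HparamsModel:
    def __init__(self, hparams: Namespace):
        self.hparams = hparams        # self.hparams.lr is invisible to static checks

m1 = ExplicitModel(lr=0.01, hidden_dim=128)
m2 = HparamsModel(Namespace(lr=0.01, hidden_dim=128))
print(m1.lr == m2.hparams.lr)  # True
```

Both hold the same values, but a typo like `m1.lrr` is flagged before running, whereas `m2.hparams.lrr` only fails at runtime.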