Weight decay causes loaded model to not match saved one #1201
Comments
Is there a way I can force my existing trained models to use the same weight decay as when they were saved, so I can reproduce their results?
Ouch, this was probably broken when we switched to the new model loading format. There are two ways to fix this (cc @xunzhang):
Before, we did option 1 because we didn't want to trigger a heavy operation unrelated to saving every time we save the model, but I think option 2 is probably conceptually simpler and wouldn't require changing the model saving format.
But since I'm setting the weight decay globally to the same value, shouldn't it apply at model load time too?
@neubig will take a look.
Thanks. So if I know how many updates have been done to a model and what was the weight decay rate, can I use this information to recover the exact model (as a workaround for now)?
Yes. The current weight decay value will be |
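The formula itself is truncated above. If DyNet accumulates its L2 weight decay multiplicatively, the lost scale after N updates with rate λ would be (1 - λ)^N, and the freshly loaded parameters could be rescaled by that factor. A rough sketch of that workaround follows; the (1 - λ)^N form, the helper name, and the use of `Parameters.as_array()` are assumptions on my part, not confirmed in this thread.

```python
import numpy as np

def lost_decay_factor(weight_decay_rate, num_updates):
    """Accumulated decay scale after `num_updates` updates (assumed multiplicative form)."""
    return (1.0 - weight_decay_rate) ** num_updates

# Example: a model trained for 50,000 updates with weight_decay=1e-6.
factor = lost_decay_factor(1e-6, 50_000)
print(factor)  # roughly 0.9512

# After loading such a model, multiply each parameter's raw values by `factor`
# before using them, e.g. rescaled = factor * p.as_array(), where p is a DyNet
# Parameters object (that as_array() returns the raw stored values is assumed).
```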
Thanks, it worked! |
Addressed the issue in the above pull request.
Fixed by #1206 |
This caused me a lot of frustration until I finally figured out why my saved models' results don't match when I load them.
After training a model and saving it, I expect it to produce exactly the same results as just before it was saved (assuming no updates were done in between, of course). However, this is not the case when using weight decay. It looks like the weight decay does not apply to the loaded model, even though it is set globally in `dynet_config`.

Minimal working example:
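The original code block did not survive in this capture, so what follows is only a rough reconstruction of what such an example might look like, assuming the usual DyNet Python API for saving and loading (`ParameterCollection.save`/`populate`); the parameter shape, the toy loss, the number of updates, and the file path are illustrative assumptions.

```python
import dynet_config
dynet_config.set(weight_decay=1e-6)   # global weight decay, set before importing dynet
import dynet as dy
import numpy as np

pc = dy.ParameterCollection()
p = pc.add_parameters((10,))
trainer = dy.SimpleSGDTrainer(pc)

# Run a few updates so the lazily accumulated weight-decay factor drifts away from 1.
for _ in range(100):
    dy.renew_cg()
    loss = dy.squared_norm(dy.parameter(p))
    loss.value()
    loss.backward()
    trainer.update()

dy.renew_cg()
before = dy.parameter(p).npvalue()    # values as the computation graph sees them
pc.save("/tmp/wd_mwe.model")

pc2 = dy.ParameterCollection()
p2 = pc2.add_parameters((10,))
pc2.populate("/tmp/wd_mwe.model")
dy.renew_cg()
after = dy.parameter(p2).npvalue()

# With weight_decay != 0 this prints a non-zero difference; with weight_decay=0 it matches.
print(np.abs(before - after).max())
```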
Changing `weight_decay` to 0 fixes the problem.

Related to #917.