About copying a param with shape torch.Size([13380, 468]) from checkpoint, the shape in current model is torch.Size([13475, 468]) #32

Closed
maoao686868 opened this issue Dec 8, 2020 · 4 comments

Comments

@maoao686868

maoao686868 commented Dec 8, 2020

@hobincar
Hello, I downloaded the pretrained RecNet-global model and the preprocessed features for the MSVD dataset. When I execute python run.py, I get this error:

Traceback (most recent call last):
  File "run.py", line 96, in <module>
    run('/root/Workspace/rn/checkpoint/RecNet-global_MSVD(1).ckpt')
  File "run.py", line 44, in run
    decoder.load_state_dict(checkpoint['decoder'])
  File "/root/anaconda3/envs/rn/lib/python3.6/site-packages/torch/nn/modules/module.py", line 777, in load_state_dict
    self.__class__.__name__, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for Decoder:
    size mismatch for embedding.weight: copying a param with shape torch.Size([13380, 468]) from checkpoint, the shape in current model is torch.Size([13475, 468]).
    size mismatch for out.weight: copying a param with shape torch.Size([13380, 512]) from checkpoint, the shape in current model is torch.Size([13475, 512]).
    size mismatch for out.bias: copying a param with shape torch.Size([13380]) from checkpoint, the shape in current model is torch.Size([13475]).

Please help me, I am quite worried about this problem.

@maoao686868
Author

@hobincar
In addition, my environment is as follows: Python 3.6 created with Anaconda3, PyTorch 1.1, torchvision 0.3.0.

@hobincar
Owner

Hi. The error message is saying that the model has a vocabulary of 13475 words but the checkpoint has a vocabulary of 13380 words. I guess the checkpoint was saved using the old implementation. I don't have much time to handle this right now, but I'll look into it. In the meantime, you could train the model yourself instead of using my checkpoint file.
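
For anyone hitting a similar error, a minimal sketch of how the mismatch could be listed parameter by parameter before calling load_state_dict. The helper name find_shape_mismatches is hypothetical and not part of this repository or of PyTorch; the shapes are copied from the traceback in this issue, and the commented PyTorch lines assume that the checkpoint is a dict with a 'decoder' entry, as run.py suggests, with path standing in for the checkpoint file.

```python
def find_shape_mismatches(checkpoint_shapes, model_shapes):
    """Return {name: (checkpoint_shape, model_shape)} for every parameter
    that exists in both mappings but has a different shape."""
    mismatches = {}
    for name, ckpt_shape in checkpoint_shapes.items():
        cur_shape = model_shapes.get(name)
        if cur_shape is not None and tuple(ckpt_shape) != tuple(cur_shape):
            mismatches[name] = (tuple(ckpt_shape), tuple(cur_shape))
    return mismatches


# Shapes copied from the error message in this issue.
checkpoint_shapes = {
    "embedding.weight": (13380, 468),
    "out.weight": (13380, 512),
    "out.bias": (13380,),
}
model_shapes = {
    "embedding.weight": (13475, 468),
    "out.weight": (13475, 512),
    "out.bias": (13475,),
}

for name, (ckpt_shape, cur_shape) in find_shape_mismatches(
        checkpoint_shapes, model_shapes).items():
    print(f"{name}: checkpoint {ckpt_shape} vs model {cur_shape}")

# With PyTorch available, the two mappings could be built from real objects
# (assumption: `path` points at the checkpoint and `decoder` is the model):
#   checkpoint = torch.load(path, map_location="cpu")
#   checkpoint_shapes = {k: tuple(v.shape) for k, v in checkpoint["decoder"].items()}
#   model_shapes = {k: tuple(v.shape) for k, v in decoder.state_dict().items()}
```

All three mismatched parameters differ only in the vocabulary dimension (13380 vs 13475), which is consistent with the model and the checkpoint having been built from different vocabularies.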

@maoao686868
Author

@hobincar Sorry, this problem was caused by my using the wrong virtual environment. I have solved it. Thanks for your reply.
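
The thread does not say what exactly was wrong with the environment. As a rough sketch, a check like the following could confirm which interpreter and package versions are active before running run.py. The function name describe_environment is hypothetical, and the package list is an assumption based on the versions mentioned earlier in this issue.

```python
import importlib
import sys


def describe_environment(packages=("torch", "torchvision")):
    """Return the active interpreter and the version of each listed package."""
    info = {
        "python": sys.version.split()[0],
        "executable": sys.executable,
    }
    for name in packages:
        try:
            module = importlib.import_module(name)
            info[name] = getattr(module, "__version__", "unknown")
        except ImportError:
            # The package is not importable from this interpreter.
            info[name] = "not installed"
    return info


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(f"{key}: {value}")
```

If the printed executable path is not inside the intended conda environment, the script is running under a different interpreter than expected.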

@hobincar
Owner

That's great :) Thank you for the information.
