
Translate Error #65

Closed
qtxue opened this issue Apr 25, 2019 · 9 comments

qtxue commented Apr 25, 2019

Hello, thanks for your code.
I ran train.sh as described in the README; the command is:

CUDA_VISIBLE_DEVICES=0 nohup > nohup_3.log 2>&1 python3 train.py \
--exp_name test_enfr_mlm \
--dump_path ./dumped2/ \
--data_path ./data/processed/en-fr/ \
--lgs 'en-fr' \
--clm_steps '' \
--mlm_steps 'en,fr' \
--emb_dim 512 \
--n_layers 4 \
--n_heads 8 \
--dropout 0.1 \
--attention_dropout 0.1 \
--gelu_activation true \
--batch_size 32 \
--bptt 256 \
--optimizer adam,lr=0.0001 \
--epoch_size 200000 \
--validation_metrics _valid_mlm_ppl \
--stopping_criterion _valid_mlm_ppl,3 &

Then I wanted to use the saved model (training had not finished yet; the model is saved while the program is running) to translate some sentences. The command is:

head -n 10 /home/qtxue/dqxu/data/para/dev/newstest2014-fren-src.fr.60000 | \
CUDA_VISIBLE_DEVICES=4 python3 translate.py --exp_name translate \
--src_lang fr --tgt_lang en \
--model_path /home/qtxue/best-valid_mlm_ppl.pth --output_path /home/qtxue/output.en

The following error appears:
INFO - 04/25/19 10:48:43 - 0:00:00 - ============ Initialized logger ============
INFO - 04/25/19 10:48:43 - 0:00:00 - batch_size: 32
command: python translate.py --exp_name translate --src_lang fr --tgt_lang en --model_path '/home/qtxue/checkpoint.pth' --output_path '/home/qtxue/output.en' --exp_id "19njy282kc"
dump_path: ./dumped/translate/19njy282kc
exp_id: 19njy282kc
exp_name: translate
fp16: False
model_path: /home/qtxue/checkpoint.pth
output_path: /home/qtxue/output.en
src_lang: fr
tgt_lang: en
INFO - 04/25/19 10:48:43 - 0:00:00 - The experiment will be stored in ./dumped/translate/19njy282kc

INFO - 04/25/19 10:48:43 - 0:00:00 - Running command: python translate.py --exp_name translate --src_lang fr --tgt_lang en --model_path '/home/qtxue/checkpoint.pth' --output_path '/home/qtxue/output.en'

INFO - 04/25/19 10:48:48 - 0:00:05 - Supported languages: en, fr
Traceback (most recent call last):
File "translate.py", line 150, in
main(params)
File "translate.py", line 80, in main
encoder.load_state_dict(reloaded['encoder'])
KeyError: 'encoder'
Is there something I need to modify, or is something wrong with my setup?


qtxue commented Apr 25, 2019

sorry, I made a naive mistake.


glample commented Apr 25, 2019

What was the mistake? Did you figure it out?


odel-odel commented May 6, 2019

Hi,
I get the same error. Can you help me solve this?
In the command I loaded "best-valid_en-fr_mt_bleu.pth" as the model file.

Traceback (most recent call last):
File "translate.py", line 150, in
main(params)
File "translate.py", line 80, in main
encoder.load_state_dict(reloaded['encoder'])
File "/root/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 769, in load_state_dict
self.__class__.__name__, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for TransformerModel:
Missing key(s) in state_dict: "pred_layer.proj.weight", "pred_layer.proj.bias".


glample commented May 6, 2019

This means that you are trying to reload a component that requires an output layer, while the reloaded model does not have one. Can you try to set with_output=False for the encoder here: https://github.com/facebookresearch/XLM/blob/master/src/model/__init__.py#L127 and see if this helps?
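Roughly, the change would look like this (a sketch only; the exact code around that line in build_model may differ, and the variable names here are approximate):

# inside build_model() in src/model/__init__.py, encoder-decoder case:
# build the encoder WITHOUT the output projection layer, since the
# reloaded encoder weights were saved without one; the decoder keeps
# its output layer because it is needed to generate tokens.
encoder = TransformerModel(params, dico, is_encoder=True, with_output=False)
decoder = TransformerModel(params, dico, is_encoder=False, with_output=True)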


odel-odel commented May 7, 2019

Hi,
I have already set this parameter to False in the __init__ file...
In translate.py there is another command that loads the model.

I will paste the full error:

Traceback (most recent call last):
File "translate.py", line 153, in
main(params)
File "translate.py", line 80, in main
encoder.load_state_dict(reloaded['encoder'])
File "/usr/local/lib/python3.5/dist-packages/torch/nn/modules/module.py", line 769, in load_state_dict
self.__class__.__name__, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for TransformerModel:
Missing key(s) in state_dict: "position_embeddings.weight", "lang_embeddings.weight", "embeddings.weight", "layer_norm_emb.bias", "layer_norm_emb.weight", "attentions.0.q_lin.bias", "attentions.0.q_lin.weight", "attentions.0.k_lin.bias", "attentions.0.k_lin.weight", "attentions.0.v_lin.bias", "attentions.0.v_lin.weight", "attentions.0.out_lin.bias", "attentions.0.out_lin.weight", "attentions.1.q_lin.bias", "attentions.1.q_lin.weight", "attentions.1.k_lin.bias", "attentions.1.k_lin.weight", "attentions.1.v_lin.bias", "attentions.1.v_lin.weight", "attentions.1.out_lin.bias", "attentions.1.out_lin.weight", "attentions.2.q_lin.bias", "attentions.2.q_lin.weight", "attentions.2.k_lin.bias", "attentions.2.k_lin.weight", "attentions.2.v_lin.bias", "attentions.2.v_lin.weight", "attentions.2.out_lin.bias", "attentions.2.out_lin.weight", "attentions.3.q_lin.bias", "attentions.3.q_lin.weight", "attentions.3.k_lin.bias", "attentions.3.k_lin.weight", "attentions.3.v_lin.bias", "attentions.3.v_lin.weight", "attentions.3.out_lin.bias", "attentions.3.out_lin.weight", "attentions.4.q_lin.bias", "attentions.4.q_lin.weight", "attentions.4.k_lin.bias", "attentions.4.k_lin.weight", "attentions.4.v_lin.bias", "attentions.4.v_lin.weight", "attentions.4.out_lin.bias", "attentions.4.out_lin.weight", "attentions.5.q_lin.bias", "attentions.5.q_lin.weight", "attentions.5.k_lin.bias", "attentions.5.k_lin.weight", "attentions.5.v_lin.bias", "attentions.5.v_lin.weight", "attentions.5.out_lin.bias", "attentions.5.out_lin.weight", "layer_norm1.0.bias", "layer_norm1.0.weight", "layer_norm1.1.bias", "layer_norm1.1.weight", "layer_norm1.2.bias", "layer_norm1.2.weight", "layer_norm1.3.bias", "layer_norm1.3.weight", "layer_norm1.4.bias", "layer_norm1.4.weight", "layer_norm1.5.bias", "layer_norm1.5.weight", "ffns.0.lin1.bias", "ffns.0.lin1.weight", "ffns.0.lin2.bias", "ffns.0.lin2.weight", "ffns.1.lin1.bias", "ffns.1.lin1.weight", "ffns.1.lin2.bias", "ffns.1.lin2.weight", "ffns.2.lin1.bias", "ffns.2.lin1.weight", "ffns.2.lin2.bias", "ffns.2.lin2.weight", "ffns.3.lin1.bias", "ffns.3.lin1.weight", "ffns.3.lin2.bias", "ffns.3.lin2.weight", "ffns.4.lin1.bias", "ffns.4.lin1.weight", "ffns.4.lin2.bias", "ffns.4.lin2.weight", "ffns.5.lin1.bias", "ffns.5.lin1.weight", "ffns.5.lin2.bias", "ffns.5.lin2.weight", "layer_norm2.0.bias", "layer_norm2.0.weight", "layer_norm2.1.bias", "layer_norm2.1.weight", "layer_norm2.2.bias", "layer_norm2.2.weight", "layer_norm2.3.bias", "layer_norm2.3.weight", "layer_norm2.4.bias", "layer_norm2.4.weight", "layer_norm2.5.bias", "layer_norm2.5.weight", "pred_layer.proj.bias", "pred_layer.proj.weight".
Unexpected key(s) in state_dict: "module.position_embeddings.weight", "module.lang_embeddings.weight", "module.embeddings.weight", "module.layer_norm_emb.weight", "module.layer_norm_emb.bias", "module.attentions.0.q_lin.weight", "module.attentions.0.q_lin.bias", "module.attentions.0.k_lin.weight", "module.attentions.0.k_lin.bias", "module.attentions.0.v_lin.weight", "module.attentions.0.v_lin.bias", "module.attentions.0.out_lin.weight", "module.attentions.0.out_lin.bias", "module.attentions.1.q_lin.weight", "module.attentions.1.q_lin.bias", "module.attentions.1.k_lin.weight", "module.attentions.1.k_lin.bias", "module.attentions.1.v_lin.weight", "module.attentions.1.v_lin.bias", "module.attentions.1.out_lin.weight", "module.attentions.1.out_lin.bias", "module.attentions.2.q_lin.weight", "module.attentions.2.q_lin.bias", "module.attentions.2.k_lin.weight", "module.attentions.2.k_lin.bias", "module.attentions.2.v_lin.weight", "module.attentions.2.v_lin.bias", "module.attentions.2.out_lin.weight", "module.attentions.2.out_lin.bias", "module.attentions.3.q_lin.weight", "module.attentions.3.q_lin.bias", "module.attentions.3.k_lin.weight", "module.attentions.3.k_lin.bias", "module.attentions.3.v_lin.weight", "module.attentions.3.v_lin.bias", "module.attentions.3.out_lin.weight", "module.attentions.3.out_lin.bias", "module.attentions.4.q_lin.weight", "module.attentions.4.q_lin.bias", "module.attentions.4.k_lin.weight", "module.attentions.4.k_lin.bias", "module.attentions.4.v_lin.weight", "module.attentions.4.v_lin.bias", "module.attentions.4.out_lin.weight", "module.attentions.4.out_lin.bias", "module.attentions.5.q_lin.weight", "module.attentions.5.q_lin.bias", "module.attentions.5.k_lin.weight", "module.attentions.5.k_lin.bias", "module.attentions.5.v_lin.weight", "module.attentions.5.v_lin.bias", "module.attentions.5.out_lin.weight", "module.attentions.5.out_lin.bias", "module.layer_norm1.0.weight", "module.layer_norm1.0.bias", "module.layer_norm1.1.weight", "module.layer_norm1.1.bias", "module.layer_norm1.2.weight", "module.layer_norm1.2.bias", "module.layer_norm1.3.weight", "module.layer_norm1.3.bias", "module.layer_norm1.4.weight", "module.layer_norm1.4.bias", "module.layer_norm1.5.weight", "module.layer_norm1.5.bias", "module.ffns.0.lin1.weight", "module.ffns.0.lin1.bias", "module.ffns.0.lin2.weight", "module.ffns.0.lin2.bias", "module.ffns.1.lin1.weight", "module.ffns.1.lin1.bias", "module.ffns.1.lin2.weight", "module.ffns.1.lin2.bias", "module.ffns.2.lin1.weight", "module.ffns.2.lin1.bias", "module.ffns.2.lin2.weight", "module.ffns.2.lin2.bias", "module.ffns.3.lin1.weight", "module.ffns.3.lin1.bias", "module.ffns.3.lin2.weight", "module.ffns.3.lin2.bias", "module.ffns.4.lin1.weight", "module.ffns.4.lin1.bias", "module.ffns.4.lin2.weight", "module.ffns.4.lin2.bias", "module.ffns.5.lin1.weight", "module.ffns.5.lin1.bias", "module.ffns.5.lin2.weight", "module.ffns.5.lin2.bias", "module.layer_norm2.0.weight", "module.layer_norm2.0.bias", "module.layer_norm2.1.weight", "module.layer_norm2.1.bias", "module.layer_norm2.2.weight", "module.layer_norm2.2.bias", "module.layer_norm2.3.weight", "module.layer_norm2.3.bias", "module.layer_norm2.4.weight", "module.layer_norm2.4.bias", "module.layer_norm2.5.weight", "module.layer_norm2.5.bias".


odel-odel commented May 7, 2019

I tried changing line 80 in translate.py to:

if all([k.startswith('module.') for k in reloaded['encoder'].keys()]):
    enc_reload = {k[len('module.'):]: v for k, v in reloaded['encoder'].items()}
    encoder.load_state_dict(enc_reload)
else:
    encoder.load_state_dict(reloaded['encoder'])

And now I get a shorter error again:
Traceback (most recent call last):
File "translate.py", line 156, in
main(params)
File "translate.py", line 85, in main
encoder.load_state_dict(enc_reload)
File "/usr/local/lib/python3.5/dist-packages/torch/nn/modules/module.py", line 769, in load_state_dict
self.__class__.__name__, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for TransformerModel:
Missing key(s) in state_dict: "pred_layer.proj.bias", "pred_layer.proj.weight".


glample commented May 7, 2019

Indeed, I think the problem is that you are trying to reload a checkpoint (the "module." prefixes are everywhere). This part of the code takes care of it: https://github.com/facebookresearch/XLM/blob/master/src/model/__init__.py#L155-L156 but it is separate from the translate script. What you did is something similar and should take care of it.

Did you also set with_output=False here? https://github.com/facebookresearch/XLM/blob/master/translate.py#L78
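For reference, after both adjustments the model-loading part of translate.py would look roughly like this (a sketch, not the exact script; strip_module_prefix is just a hypothetical helper name, and the surrounding code may differ slightly):

# build the encoder without an output layer, the decoder with one
encoder = TransformerModel(model_params, dico, is_encoder=True, with_output=False).cuda().eval()
decoder = TransformerModel(model_params, dico, is_encoder=False, with_output=True).cuda().eval()

# checkpoints saved during multi-GPU training wrap every parameter name
# in a DataParallel "module." prefix, so strip it before load_state_dict
def strip_module_prefix(state_dict):
    if all(k.startswith('module.') for k in state_dict.keys()):
        return {k[len('module.'):]: v for k, v in state_dict.items()}
    return state_dict

encoder.load_state_dict(strip_module_prefix(reloaded['encoder']))
decoder.load_state_dict(strip_module_prefix(reloaded['decoder']))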

@odel-odel

Now it works, thanks!!!

glample closed this as completed May 8, 2019

tuyu95 commented Jul 25, 2019

> sorry, I made a naive mistake.

I am running into the same problem as you: I use the pretrained model and then get the same error. Would you tell me your solution?
