Hi! Thanks for your nice work. I am interested in the training strategy described in the paper:
"we first fine-tune the BERT model, then freeze BERT to fine-tune the glyph layer, and finally jointly tune both layers until convergence."
Could you give more details? I am not sure how the training is started.
Do you fine-tune BERT first with the glyph layer frozen inside the glyce_bert model, or do you fine-tune a BERT-only model, then load those weights into glyce_bert and freeze them while fine-tuning the glyph layer? And how many epochs do you train for each stage? (See the sketch below for what I mean by the first interpretation.)
Looking forward to your reply!
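To make the question concrete, here is a minimal sketch of how I imagine the three-stage schedule, assuming a hypothetical `GlyceBertModel` with `bert` and `glyph_encoder` submodules and a placeholder `train_one_epoch` loop (neither name is from this repo):

```python
import torch

def set_requires_grad(module, flag):
    # Freeze or unfreeze all parameters of a submodule.
    for p in module.parameters():
        p.requires_grad = flag

def staged_finetune(model, train_one_epoch, epochs=(2, 2, 4), lr=2e-5):
    # Stage 1: fine-tune BERT only, glyph layer frozen.
    set_requires_grad(model.bert, True)
    set_requires_grad(model.glyph_encoder, False)
    opt = torch.optim.AdamW((p for p in model.parameters() if p.requires_grad), lr=lr)
    for _ in range(epochs[0]):
        train_one_epoch(model, opt)

    # Stage 2: freeze BERT, fine-tune the glyph layer.
    set_requires_grad(model.bert, False)
    set_requires_grad(model.glyph_encoder, True)
    opt = torch.optim.AdamW((p for p in model.parameters() if p.requires_grad), lr=lr)
    for _ in range(epochs[1]):
        train_one_epoch(model, opt)

    # Stage 3: unfreeze everything and jointly tune until convergence.
    set_requires_grad(model, True)
    opt = torch.optim.AdamW(model.parameters(), lr=lr)
    for _ in range(epochs[2]):
        train_one_epoch(model, opt)
```

The epoch counts and learning rate here are placeholders; that is exactly the part I would like you to clarify.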