Hi! Thanks for your nice work. I am interested in the training strategy described in the paper:
"we first fine-tune the BERT model, then freeze BERT to fine-tune the glyph layer, and finally jointly tune both layers until convergence."
Could you give more details? I am not sure how the training is started.
Do you fine-tune BERT first with the glyph layer frozen inside the glyce_bert model, or do you fine-tune a BERT-only model, then load those weights into glyce_bert and freeze them while fine-tuning the glyph layer? And how many epochs do you train for each stage? (See the sketch below for what I mean by the first interpretation.)
Looking forward to your reply!
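To make the question concrete, here is a minimal sketch of how I imagine the three-stage schedule, assuming a hypothetical `GlyceBertModel` with `bert` and `glyph_encoder` submodules and a placeholder `train_one_epoch` loop (neither name is from this repo):

```python
import torch

def set_requires_grad(module, flag):
    # Freeze or unfreeze all parameters of a submodule.
    for p in module.parameters():
        p.requires_grad = flag

def staged_finetune(model, train_one_epoch, epochs=(2, 2, 4), lr=2e-5):
    # Stage 1: fine-tune BERT only, glyph layer frozen.
    set_requires_grad(model.bert, True)
    set_requires_grad(model.glyph_encoder, False)
    opt = torch.optim.AdamW((p for p in model.parameters() if p.requires_grad), lr=lr)
    for _ in range(epochs[0]):
        train_one_epoch(model, opt)

    # Stage 2: freeze BERT, fine-tune the glyph layer.
    set_requires_grad(model.bert, False)
    set_requires_grad(model.glyph_encoder, True)
    opt = torch.optim.AdamW((p for p in model.parameters() if p.requires_grad), lr=lr)
    for _ in range(epochs[1]):
        train_one_epoch(model, opt)

    # Stage 3: unfreeze everything and jointly tune until convergence.
    set_requires_grad(model, True)
    opt = torch.optim.AdamW(model.parameters(), lr=lr)
    for _ in range(epochs[2]):
        train_one_epoch(model, opt)
```

The epoch counts and learning rate here are placeholders; that is exactly the part I would like you to clarify.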