I am trying to reproduce the experimental results presented in the BERT paper. Specifically, I tried to improve the accuracy of the BERT-Large model on SQuAD v1.1 by fine-tuning it first on TriviaQA and then on SQuAD. Unfortunately, I was unable to reproduce the results reported in the paper. Instead, accuracy declined after the TriviaQA fine-tuning step, as shown below.
| Model | Exact Match | F1 |
| --- | --- | --- |
| BERT-Large (SQuAD v1.1 (2 epochs)) | 84.06 | 90.84 |
| BERT-Large (TriviaQA wiki (1 epoch) + SQuAD v1.1 (2 epochs)) | 83.53 | 90.35 |
| BERT-Large (TriviaQA web (1 epoch) + SQuAD v1.1 (2 epochs)) | 83.30 | 90.36 |
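To make the size of the regression explicit, here is a small sketch that computes the change in each metric relative to the SQuAD-only baseline. The numbers are copied from the results above; the dictionary keys are just shorthand labels.

```python
# Scores copied from the results above: (Exact Match, F1).
results = {
    "SQuAD v1.1 only": (84.06, 90.84),
    "TriviaQA wiki + SQuAD v1.1": (83.53, 90.35),
    "TriviaQA web + SQuAD v1.1": (83.30, 90.36),
}

base_em, base_f1 = results["SQuAD v1.1 only"]

# Difference of each two-stage run from the SQuAD-only baseline.
deltas = {
    name: (round(em - base_em, 2), round(f1 - base_f1, 2))
    for name, (em, f1) in results.items()
    if name != "SQuAD v1.1 only"
}

for name, (d_em, d_f1) in deltas.items():
    print(f"{name}: EM {d_em:+.2f}, F1 {d_f1:+.2f}")
```

Both two-stage runs come out roughly 0.5 to 0.8 points lower on Exact Match and about 0.5 points lower on F1.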
For reference, I used SQuAD v1.1 and, in separate runs, the Wikipedia and Web subsets of TriviaQA. The training hyperparameters were as follows.
Batch Size : 12
Learning Rate : 3e-5
Num Training Epochs : 2
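For concreteness, the two-stage procedure can be sketched as two invocations of `run_squad.py` from the BERT repository, with the second stage initialized from the checkpoint written by the first. This is only a sketch under several assumptions: the directory names, file names, and the `model.ckpt-STEP` placeholder are hypothetical, and TriviaQA is assumed to have already been converted to SQuAD-style JSON. The script only builds and prints the commands; it does not run them.

```python
import shlex

# Hyperparameters from the list above.
BATCH_SIZE = 12
LEARNING_RATE = "3e-5"


def run_squad_args(bert_dir, init_checkpoint, train_file, output_dir, epochs):
    """Build the argument list for one fine-tuning stage."""
    return [
        "python", "run_squad.py",
        f"--vocab_file={bert_dir}/vocab.txt",
        f"--bert_config_file={bert_dir}/bert_config.json",
        f"--init_checkpoint={init_checkpoint}",
        "--do_train=True",
        f"--train_file={train_file}",
        f"--train_batch_size={BATCH_SIZE}",
        f"--learning_rate={LEARNING_RATE}",
        f"--num_train_epochs={epochs}",
        f"--output_dir={output_dir}",
    ]


# Hypothetical paths; replace them with your own.
BERT_DIR = "bert_large"

# Stage 1: start from the pre-trained checkpoint, train on TriviaQA for 1 epoch.
stage1 = run_squad_args(
    BERT_DIR, f"{BERT_DIR}/bert_model.ckpt",
    "triviaqa_wiki_squad_format.json", "out/triviaqa", 1,
)

# Stage 2: start from the stage 1 output, train on SQuAD v1.1 for 2 epochs.
# "model.ckpt-STEP" is a placeholder for the checkpoint written by stage 1.
stage2 = run_squad_args(
    BERT_DIR, "out/triviaqa/model.ckpt-STEP",
    "train-v1.1.json", "out/squad", 2,
)

for stage in (stage1, stage2):
    print(shlex.join(stage))
```

The detail worth checking is that stage 2 really loads the TriviaQA-tuned weights through `--init_checkpoint` rather than the original pre-trained checkpoint.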
Could you help me check if the above method is correct and also provide me some guidance on how I can reproduce the same results as presented in the BERT paper?
Thank you for the great work and I would appreciate any help.