This is a write-up of my work on the Feedback Prize - English Language Learning competition.
During the competition I only used deberta-v3-base and deberta-v3-large with mean pooling and attention pooling.
This time I'm going to train more models:
- deberta-v3-base
- deberta-v3-large
- deberta-v2-xlarge
- roberta-large
- distilbert-base-uncased
I also apply different loss weights per target; a minimal sketch of the weighted loss follows below.
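The exact loss isn't stated beyond the weights, so here is a minimal sketch assuming a per-column MSE combined with fixed weights (the weight values are the ones from the configs below; MSE itself is an assumption, and a per-column SmoothL1 or RMSE would slot in the same way):

```python
import torch
import torch.nn as nn

# Per-target loss weights (values from the configs below).
TARGET_WEIGHTS = {'cohesion': 0.21, 'syntax': 0.16, 'vocabulary': 0.10,
                  'phraseology': 0.16, 'grammar': 0.21, 'conventions': 0.16}

class WeightedColumnLoss(nn.Module):
    """Per-target MSE combined with fixed weights into one scalar loss."""
    def __init__(self, weights):
        super().__init__()
        self.register_buffer('weights', torch.tensor(list(weights.values())))

    def forward(self, preds, labels):
        # preds, labels: (batch_size, 6), one column per target
        per_target = ((preds - labels) ** 2).mean(dim=0)   # (6,)
        return (per_target * self.weights).sum()
```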
Info:
- deberta-v3-base
  - attention head (see the sketch after this block)
  - layerwise learning rate decay
  - last layer reinitialization (Kaiming normal)
  - different loss weights per target:
    {'cohesion': 0.21, 'syntax': 0.16, 'vocabulary': 0.10, 'phraseology': 0.16, 'grammar': 0.21, 'conventions': 0.16}
  - hyperparameters tuned with Optuna
  - CV: 0.4502, public LB: 0.4408, private LB: 0.4396
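The attention head isn't spelled out above, so here is a minimal sketch of the pooling layer commonly used in this competition: a small learned MLP scores each token, and a mask-aware softmax turns those scores into pooling weights (the exact architecture is an assumption):

```python
import torch
import torch.nn as nn

class AttentionPooling(nn.Module):
    """Pool token embeddings into one vector via learned attention scores."""
    def __init__(self, hidden_size):
        super().__init__()
        self.attention = nn.Sequential(
            nn.Linear(hidden_size, hidden_size),
            nn.Tanh(),
            nn.Linear(hidden_size, 1),
        )

    def forward(self, last_hidden_state, attention_mask):
        # last_hidden_state: (batch, seq_len, hidden); attention_mask: (batch, seq_len)
        scores = self.attention(last_hidden_state).squeeze(-1)   # (batch, seq_len)
        scores = scores.masked_fill(attention_mask == 0, -1e4)   # ignore padding
        weights = torch.softmax(scores, dim=-1).unsqueeze(-1)    # (batch, seq_len, 1)
        return (weights * last_hidden_state).sum(dim=1)          # (batch, hidden)
```

The pooled vector would then feed a 6-unit linear layer, one output per target.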
- deberta-v3-base
  - mean head (see the sketch after this block)
  - layerwise learning rate decay
  - last layer reinitialization (Kaiming normal)
  - different loss weights per target:
    {'cohesion': 0.21, 'syntax': 0.16, 'vocabulary': 0.10, 'phraseology': 0.16, 'grammar': 0.21, 'conventions': 0.16}
  - hyperparameters tuned with Optuna
  - CV: 0.4501
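The mean head averages the backbone's token embeddings; a minimal sketch (excluding padding tokens via the attention mask is an assumption about the exact implementation):

```python
import torch
import torch.nn as nn

class MeanPooling(nn.Module):
    """Average token embeddings, ignoring padded positions."""
    def forward(self, last_hidden_state, attention_mask):
        mask = attention_mask.unsqueeze(-1).float()        # (batch, seq_len, 1)
        summed = (last_hidden_state * mask).sum(dim=1)     # (batch, hidden)
        counts = mask.sum(dim=1).clamp(min=1e-9)           # avoid division by zero
        return summed / counts
```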
- deberta-v3-base
  - weighted layer pooling (see the sketch after this list)
- deberta-v3-large
  - mean pooling
- deberta-v3-large
  - attention pooling
- deberta-v3-large
  - weighted layer pooling
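"Weighted layer" refers to weighted layer pooling: a learned weighted sum over the hidden states of several transformer layers instead of using only the last one. A minimal sketch, assuming the last four layers are pooled (the layer count is an assumption) and a Hugging Face backbone called with `output_hidden_states=True`:

```python
import torch
import torch.nn as nn

class WeightedLayerPooling(nn.Module):
    """Softmax-weighted sum over the last few hidden layers of a transformer."""
    def __init__(self, num_layers_to_pool=4):
        super().__init__()
        self.num_layers = num_layers_to_pool
        self.layer_weights = nn.Parameter(torch.zeros(num_layers_to_pool))

    def forward(self, all_hidden_states):
        # all_hidden_states: tuple of (batch, seq_len, hidden) tensors,
        # as returned by a Hugging Face model with output_hidden_states=True
        stacked = torch.stack(all_hidden_states[-self.num_layers:], dim=0)
        weights = torch.softmax(self.layer_weights, dim=0).view(-1, 1, 1, 1)
        return (weights * stacked).sum(dim=0)   # (batch, seq_len, hidden)
```

The output still has a sequence dimension, so a mean or attention pooling step would follow before the regression head.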
During the competition I finished 129/2654 on the public LB but dropped to 591/2654 on the private LB, so I want to work on this competition again with late submissions.
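For reference, here is a minimal sketch of the layerwise learning rate decay and last-layer reinitialization listed in the configs above, assuming a Hugging Face DeBERTa-style backbone with `encoder.layer` and `embeddings` attributes; the base LR and decay factor are placeholder values:

```python
import torch.nn as nn

def kaiming_reinit_last_layer(backbone):
    """Reinitialize the linear weights of the last encoder layer (Kaiming normal)."""
    for module in backbone.encoder.layer[-1].modules():
        if isinstance(module, nn.Linear):
            nn.init.kaiming_normal_(module.weight)
            if module.bias is not None:
                nn.init.zeros_(module.bias)

def layerwise_lr_groups(backbone, base_lr=2e-5, decay=0.9):
    """Parameter groups whose LR decays from the top encoder layer downwards."""
    groups, lr = [], base_lr
    for layer in reversed(list(backbone.encoder.layer)):   # top layer gets base_lr
        groups.append({'params': layer.parameters(), 'lr': lr})
        lr *= decay
    groups.append({'params': backbone.embeddings.parameters(), 'lr': lr})
    return groups

# The groups plug straight into the optimizer, e.g.:
# optimizer = torch.optim.AdamW(layerwise_lr_groups(model.deberta))
```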