distilroberta-base-sst-2-distilled

This is a distilled version of a RoBERTa model fine-tuned on the SST-2 task of the GLUE benchmark. It was obtained from the RoBERTa "teacher" model through task-specific knowledge distillation. Because it was fine-tuned on SST-2, the final model can be used directly for sentiment analysis.
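
To make the idea of task-specific knowledge distillation concrete, here is a minimal sketch of the commonly used soft-target loss: a weighted sum of the usual cross-entropy on the true label and a temperature-scaled KL divergence between teacher and student predictions. This is an illustration only, written in plain Python for readability; the exact loss, `alpha`, and `temperature` used for this model are not stated here, so the values below are assumptions (see `knowledge_distillation.ipynb` for the actual training code).

```python
import math


def log_softmax(logits, temperature=1.0):
    """Numerically stable log-softmax of temperature-scaled logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    log_z = m + math.log(sum(math.exp(s - m) for s in scaled))
    return [s - log_z for s in scaled]


def distillation_loss(student_logits, teacher_logits, label,
                      alpha=0.5, temperature=2.0):
    """Soft-target distillation loss for a single example.

    alpha and temperature are hypothetical values chosen for illustration.
    """
    # Hard-label term: cross-entropy of the student against the true label.
    ce = -log_softmax(student_logits)[label]

    # Soft-label term: KL(teacher || student) on temperature-softened outputs.
    log_p_teacher = log_softmax(teacher_logits, temperature)
    log_p_student = log_softmax(student_logits, temperature)
    kl = sum(math.exp(lt) * (lt - ls)
             for lt, ls in zip(log_p_teacher, log_p_student))

    # The T^2 factor keeps the soft-term gradients on a comparable scale.
    return alpha * ce + (1.0 - alpha) * temperature ** 2 * kl


# Example: the student is unsure, the teacher is confident in class 0.
loss = distillation_loss([0.0, 0.0], [3.0, -3.0], label=0)
print(f"distillation loss: {loss:.4f}")
```

When the student's logits match the teacher's, the KL term is zero and only the weighted cross-entropy remains.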

Comparison to the original RoBERTa model:

The final distilled model reaches 92% accuracy on the SST-2 dataset. The original RoBERTa reaches 94.8% accuracy on the same dataset, but it has many more parameters (125M) and the distilled version is nearly twice as fast, so this accuracy is impressive for a model of its size.
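
The trade-off can be put in numbers with a short calculation. The sketch below uses only the two accuracy figures quoted above; the helper name `accuracy_retention` is made up for this example.

```python
TEACHER_ACCURACY = 0.948  # original RoBERTa on SST-2, as quoted above
STUDENT_ACCURACY = 0.92   # distilled model on SST-2, as quoted above


def accuracy_retention(student, teacher):
    """Fraction of the teacher's accuracy that the student keeps."""
    return student / teacher


retention = accuracy_retention(STUDENT_ACCURACY, TEACHER_ACCURACY)
gap_points = (TEACHER_ACCURACY - STUDENT_ACCURACY) * 100

print(f"accuracy retained: {retention:.1%}")    # accuracy retained: 97.0%
print(f"accuracy gap: {gap_points:.1f} points")  # accuracy gap: 2.8 points
```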

Final Training Results after Hyperparameter Tuning

Epoch	Training Loss	Validation Loss	Accuracy
1	0.144000	0.379220	0.907110
2	0.108500	0.466671	0.911697
3	0.078600	0.359551	0.915138
4	0.057400	0.358214	0.920872

Usage

To use the model with the 🤗 Transformers library:

# !pip install transformers

from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("azizbarank/distilroberta-base-sst2-distilled")

model = AutoModelForSequenceClassification.from_pretrained("azizbarank/distilroberta-base-sst2-distilled")
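
Once the tokenizer and model are loaded, a prediction could be obtained roughly as follows. This is a sketch, not tested output from this model: it assumes PyTorch is installed, and it reads the label names from `model.config.id2label`, which may be generic (`LABEL_0`, `LABEL_1`) depending on how the model was saved. The helpers `softmax`, `to_prediction`, and `classify` are names made up for this example.

```python
import math


def softmax(logits):
    """Convert raw logits into probabilities."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def to_prediction(logits, id2label):
    """Return (label, probability) for the highest-scoring class."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return id2label[best], probs[best]


def classify(text, model_id="azizbarank/distilroberta-base-sst2-distilled"):
    """Download the model (on first call) and classify one sentence."""
    # Imported here so the helpers above work without torch/transformers.
    import torch
    from transformers import AutoTokenizer, AutoModelForSequenceClassification

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSequenceClassification.from_pretrained(model_id)
    model.eval()

    inputs = tokenizer(text, return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = model(**inputs).logits[0].tolist()
    return to_prediction(logits, model.config.id2label)
```

Calling `classify("A touching and well-acted film.")` returns a `(label, probability)` pair. In GLUE's SST-2, class 0 is negative and class 1 is positive, but check the model's `id2label` mapping before relying on that.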

Notes:

The link to the model: https://huggingface.co/azizbarank/distilroberta-base-sst2-distilled
