IdSarcasm: Benchmarking and Evaluating Language Models for Indonesian Sarcasm Detection (Suhartono et al., 2024)

w11wo/id_sarcasm

IdSarcasm: Benchmarking and Evaluating Language Models for Indonesian Sarcasm Detection

This project benchmarks and evaluates various language models for sarcasm detection in Indonesian. We experiment with classical machine learning models, fine-tuned transformer models, and zero-shot classification using large language models. All of our models, datasets, and results are openly available via the HuggingFace Hub.

Pre-trained Models

| Base Model | #params | Reddit | Twitter |
| --- | --- | --- | --- |
| IndoNLU IndoBERT Base | 124M | IndoNLU IndoBERT Base Reddit | IndoNLU IndoBERT Base Twitter |
| IndoNLU IndoBERT Large | 335M | IndoNLU IndoBERT Large Reddit | IndoNLU IndoBERT Large Twitter |
| IndoLEM IndoBERT Base | 111M | IndoLEM IndoBERT Base Reddit | IndoLEM IndoBERT Base Twitter |
| mBERT Base | 178M | mBERT Base Reddit | mBERT Base Twitter |
| XLM-R Base | 278M | XLM-R Base Reddit | XLM-R Base Twitter |
| XLM-R Large | 560M | XLM-R Large Reddit | XLM-R Large Twitter |
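
A minimal sketch of running one of the fine-tuned checkpoints listed above, assuming it loads through the `transformers` `text-classification` pipeline. The function name `classify_sarcasm` and the `model_id` parameter are illustrative and not part of this repository; the actual Hub repository id must be copied from the model links, and the label names returned depend on each checkpoint's configuration.

```python
from __future__ import annotations

from typing import Any


def classify_sarcasm(texts: list[str], model_id: str) -> list[dict[str, Any]]:
    """Classify a batch of texts with a fine-tuned checkpoint from the Hub.

    `model_id` is the Hub repository id of one of the models above
    (assumption: supplied by the caller, not hard-coded here).
    """
    # Imported lazily so this module can be imported without transformers.
    from transformers import pipeline

    classifier = pipeline("text-classification", model=model_id)
    # The pipeline returns one {"label": ..., "score": ...} dict per input text.
    return classifier(texts)
```

Calling `classify_sarcasm(["..."], model_id)` downloads the checkpoint on first use, so it needs network access and enough memory for the chosen model size.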

Dataset

We used two datasets for training and evaluation: a novel dataset of Reddit comments and a Twitter dataset. The Reddit dataset consists of 14,116 comments, while the Twitter dataset consists of 12,861 tweets.

| Dataset | Link |
| --- | --- |
| Reddit Indonesia Sarcastic | HuggingFace |
| Twitter Indonesia Sarcastic | HuggingFace |

Results

We compared the performance of various models on both the Reddit and Twitter datasets. The evaluation metric used is the F1-score.

| Model | Reddit F1-score | Twitter F1-score |
| --- | --- | --- |
| Classical | | |
| Logistic Regression | 0.4887 | 0.7142 |
| Naive Bayes | 0.4591 | 0.6721 |
| SVC | 0.4467 | 0.6782 |
| Fine-tuning | | |
| IndoBERT Base (IndoNLU) | 0.6100 | 0.7273 |
| IndoBERT Large (IndoNLU) | 0.6184 | 0.7160 |
| IndoBERT Base (IndoLEM) | 0.5671 | 0.6462 |
| mBERT | 0.5338 | 0.6467 |
| XLM-R Base | 0.5690 | 0.7386 |
| XLM-R Large | 0.6274 | 0.7692 |
| Zero-shot | | |
| BLOOMZ-560M | 0.3870 | 0.3916 |
| BLOOMZ-1.1B | 0.3944 | 0.3987 |
| BLOOMZ-1.7B | 0.3758 | 0.3885 |
| BLOOMZ-3B | 0.4000 | 0.3847 |
| BLOOMZ-7.1B | 0.4036 | 0.3968 |
| mT0 Small | 0.4000 | 0.3988 |
| mT0 Base | 0.3990 | 0.3985 |
| mT0 Large | 0.3998 | 0.3989 |
| mT0 XL | 0.4001 | 0.3988 |
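
For reference, the F1-score used in the table can be computed from true and predicted labels as sketched below. This assumes a binary F1 on the positive (sarcastic) class, encoded here as `1`; the README does not state which averaging was used, and the function name and toy labels are illustrative only, not taken from the project's evaluation code.

```python
def binary_f1(y_true, y_pred, positive=1):
    """F1-score of the positive class: 2*TP / (2*TP + FP + FN)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    denominator = 2 * tp + fp + fn
    # With no positive labels and no positive predictions, F1 is defined as 0 here.
    return 2 * tp / denominator if denominator else 0.0


# Toy labels (not from the datasets): 1 = sarcastic, 0 = not sarcastic.
y_true = [1, 0, 1, 1, 0]
y_pred = [1, 0, 0, 1, 1]
print(round(binary_f1(y_true, y_pred), 4))  # 0.6667
```

In the toy example there are two true positives, one false positive, and one false negative, so the score is 4 / 6.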

Citation

If you use this work in your research, please cite:

@article{10565877,
  author = {Suhartono, Derwin and Wongso, Wilson and Tri Handoyo, Alif},
  journal = {IEEE Access}, 
  title = {IdSarcasm: Benchmarking and Evaluating Language Models for Indonesian Sarcasm Detection}, 
  year = {2024},
  volume = {12},
  number = {},
  pages = {87323-87332},
  keywords = {Social networking (online);Blogs;Machine learning;Feature extraction;Accuracy;Deep learning;Electronic mail;Natural language processing;Sentiment analysis;Low-resource data;low-resource languages;Indonesian sarcasm detection;natural language processing;sarcasm detection;sentiment analysis},
  doi = {10.1109/ACCESS.2024.3416955}
}

Author

GitHub Profile

References

@inproceedings{10.1145/3406601.3406624,
    author = {Khotijah, Siti and Tirtawangsa, Jimmy and Suryani, Arie A.},
    title = {Using LSTM for Context Based Approach of Sarcasm Detection in Twitter},
    year = {2020},
    isbn = {9781450377591},
    publisher = {Association for Computing Machinery},
    address = {New York, NY, USA},
    url = {https://doi.org/10.1145/3406601.3406624},
    doi = {10.1145/3406601.3406624},
    booktitle = {Proceedings of the 11th International Conference on Advances in Information Technology},
    articleno = {19},
    numpages = {7},
    keywords = {context, Sarcasm detection, paragraph2vec, lstm, deep learning},
    location = {Bangkok, Thailand},
    series = {IAIT '20}
}

@article{Ranti2020IndonesianSD,
    title={Indonesian Sarcasm Detection Using Convolutional Neural Network},
    author={Kiefer Stefano Ranti and Abba Suganda Girsang},
    journal={International Journal of Emerging Trends in Engineering Research},
    year={2020},
    url={https://doi.org/10.30534/ijeter/2020/10892020}
}

@article{academicReddit,
    title= {Reddit comments/submissions 2005-06 to 2023-09},
    journal= {},
    author= {stuck_in_the_matrix and Watchful1 and RaiderBDev},
    year= {},
    url= {},
    abstract= {Reddit comments and submissions from 2005-06 to 2023-09 collected by pushshift and u/RaiderBDev. These are zstandard compressed ndjson files. Example python scripts for parsing the data can be found here https://github.com/Watchful1/PushshiftDumps},
    keywords= {reddit},
    terms= {},
    license= {},
    superseded= {}
}

@inproceedings{abu-farha-etal-2022-semeval,
    title = "{S}em{E}val-2022 Task 6: i{S}arcasm{E}val, Intended Sarcasm Detection in {E}nglish and {A}rabic",
    author = "Abu Farha, Ibrahim  and
      Oprea, Silviu Vlad  and
      Wilson, Steven  and
      Magdy, Walid",
    editor = "Emerson, Guy  and
      Schluter, Natalie  and
      Stanovsky, Gabriel  and
      Kumar, Ritesh  and
      Palmer, Alexis  and
      Schneider, Nathan  and
      Singh, Siddharth  and
      Ratan, Shyam",
    booktitle = "Proceedings of the 16th International Workshop on Semantic Evaluation (SemEval-2022)",
    month = jul,
    year = "2022",
    address = "Seattle, United States",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2022.semeval-1.111",
    doi = "10.18653/v1/2022.semeval-1.111",
    pages = "802--814",
}