Italian Hate Speech Corpus (IHSC)

Corpus description

This is a Twitter corpus built with the aim of representing and analyzing hate speech against some minority groups in Italy: immigrants in particular, but also Muslims and Roma.

Similar to the one provided by Waseem and Hovy (2016), the corpus released here contains only the tweet IDs and their annotations. The content of each tweet can thus be retrieved by querying the Twitter API with the corresponding ID.
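As a rough illustration of the retrieval step, the sketch below reads ID/annotation pairs and groups the IDs into batched lookup URLs. It rests on several assumptions that this README does not confirm: a two-column, tab-separated file layout, the `https://api.twitter.com/2/tweets` lookup endpoint with a limit of 100 IDs per request, and the helper names `read_annotations`, `batch_ids` and `lookup_url`, which are hypothetical. It performs no network calls; an authenticated client would still be needed to fetch the tweets.

```python
import csv
import io
from urllib.parse import urlencode

# Assumption: the lookup endpoint and its per-request ID limit.
LOOKUP_URL = "https://api.twitter.com/2/tweets"
MAX_IDS_PER_REQUEST = 100


def read_annotations(tsv_file):
    """Map tweet ID -> annotation, assuming two tab-separated columns per row."""
    reader = csv.reader(tsv_file, delimiter="\t")
    return {row[0]: row[1] for row in reader if len(row) >= 2}


def batch_ids(tweet_ids, size=MAX_IDS_PER_REQUEST):
    """Split the IDs into consecutive lists of at most `size` items."""
    ids = list(tweet_ids)
    return [ids[i:i + size] for i in range(0, len(ids), size)]


def lookup_url(batch):
    """Build the lookup URL for one batch of IDs (comma-separated)."""
    return LOOKUP_URL + "?" + urlencode({"ids": ",".join(batch)}, safe=",")


# Made-up sample rows standing in for the corpus file.
sample = io.StringIO("1\tyes\n2\tno\n3\tno\n")
annotations = read_annotations(sample)
urls = [lookup_url(batch) for batch in batch_ids(annotations, size=2)]
```

Tweets that have been deleted or made private since annotation will not be returned, so the retrieved set may be smaller than the list of IDs.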

The corpus development forms part of the Hate Speech Monitoring program coordinated by the Computer Science Department of the University of Turin (Italy).

References

If you use the resource, please cite:

@InProceedings{SanguinettiEtAlLREC2018,
  author    = {Manuela Sanguinetti and Fabio Poletto and Cristina Bosco and Viviana Patti and Marco Stranisci},
  title     = {An Italian Twitter Corpus of Hate Speech against Immigrants},
  booktitle = {Proceedings of the 11th Conference on Language Resources and Evaluation (LREC 2018)},
  month     = {May},
  year      = {2018},
  address   = {Miyazaki, Japan},
  pages     = {2798--2895}
}

Other references:

Poletto F., Stranisci M., Sanguinetti M., Patti V., Bosco C. (2017) Hate speech annotation: Analysis of an Italian Twitter corpus. In: Proceedings of the 4th Italian Conference on Computational Linguistics (CLiC-it 2017), Rome, Italy.

Acknowledgements

The work is funded by Progetto di Ateneo/CSP 2016 (Immigrants, Hate and Prejudice in Social Media, project S1618_L2_BOSC_01) and by Fondazione CRT (Hate Speech and Social Media, project n. 2016.0688).
