Toxic-Comment-Classification-Challenge

This notebook is my entry for the Kaggle natural language processing competition, the Jigsaw Toxic Comment Classification Challenge:
https://www.kaggle.com/c/jigsaw-toxic-comment-classification-challenge

The task was to classify online comments into 6 categories: toxic, severe_toxic, obscene, threat, insult, identity_hate. A comment can belong to several categories at once. The competition metric was the mean of the individual ROC AUCs, one per predicted class.
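As an illustration of the metric described above, here is a minimal sketch that computes the mean of the per-class ROC AUCs in plain Python. The function names `roc_auc` and `mean_columnwise_auc` and the toy data are hypothetical and not taken from the notebook; in practice a library routine such as `sklearn.metrics.roc_auc_score` would normally be applied to each label column instead.

```python
def roc_auc(y_true, y_score):
    """ROC AUC as the probability that a positive example is scored
    above a negative one (ties count as one half)."""
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC is undefined when only one class is present")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def mean_columnwise_auc(y_true_rows, y_score_rows):
    """Average the per-label AUCs; each row holds one value per label."""
    n_labels = len(y_true_rows[0])
    aucs = [
        roc_auc([row[j] for row in y_true_rows],
                [row[j] for row in y_score_rows])
        for j in range(n_labels)
    ]
    return sum(aucs) / n_labels


# Toy example with two label columns instead of six (made-up values).
y_true = [[1, 0], [0, 1], [1, 1], [0, 0]]
y_score = [[0.9, 0.2], [0.1, 0.8], [0.8, 0.3], [0.3, 0.4]]
score = mean_columnwise_auc(y_true, y_score)
```

The pairwise comparison is quadratic in the number of comments, so it is only meant to make the definition concrete, not to score a full submission.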

The code achieves a score of 0.9823 on the private leaderboard.

I used the concatenation of two pre-trained embeddings: fastText crawl-300d-2M.vec and GloVe glove.840B.300d.txt.
The first can be found here: https://fasttext.cc/docs/en/english-vectors.html
The second can be found here: https://nlp.stanford.edu/projects/glove/
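The concatenation can be sketched roughly as follows: each word's row in the embedding matrix is its first vector followed by its second, so two 300-dimensional embeddings give 600-dimensional rows. This is a minimal sketch under stated assumptions, not the notebook's actual code. The helpers `load_vectors` and `build_embedding_matrix`, the 1-based `word_index`, and the choice of zero-filling words missing from one file are all hypothetical, and tiny in-memory files with 3 dimensions stand in for the real downloads.

```python
import io


def load_vectors(handle, dim):
    """Parse a text embedding file: one token per line, then `dim` floats."""
    vectors = {}
    for line in handle:
        # Split from the right so tokens that contain spaces stay intact.
        parts = line.rstrip().rsplit(" ", dim)
        if len(parts) != dim + 1:
            continue  # e.g. the "count dim" header line of a fastText .vec file
        vectors[parts[0]] = [float(x) for x in parts[1:]]
    return vectors


def build_embedding_matrix(word_index, first, second, dim_first, dim_second):
    """Row i is the first vector followed by the second for the word with index i.

    Assumption: indices start at 1 and row 0 is reserved for padding.
    Assumption: a word missing from one file gets zeros for that half.
    """
    matrix = [[0.0] * (dim_first + dim_second) for _ in range(len(word_index) + 1)]
    for word, i in word_index.items():
        matrix[i] = (first.get(word, [0.0] * dim_first)
                     + second.get(word, [0.0] * dim_second))
    return matrix


# Toy stand-ins for crawl-300d-2M.vec and glove.840B.300d.txt (made-up values).
fasttext_file = io.StringIO("2 3\nhello 0.1 0.2 0.3 \nworld 0.4 0.5 0.6 \n")
glove_file = io.StringIO("hello 1.0 2.0 3.0\ntroll 4.0 5.0 6.0\n")

fasttext_vecs = load_vectors(fasttext_file, dim=3)
glove_vecs = load_vectors(glove_file, dim=3)
word_index = {"hello": 1, "world": 2, "troll": 3}
matrix = build_embedding_matrix(word_index, fasttext_vecs, glove_vecs, 3, 3)
```

With the real files, `dim` would be 300 for each source, and the resulting rows would typically be converted to an array and used to initialise the model's embedding layer.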

You can download the data from the Kaggle competition page linked above.

ppontisso/Toxic-Comment-Classification-Challenge
