# Toxic Comment Classification


This repo contains the Toxic Comment Classification project, part of my data science portfolio. [Google](https://www.technologyreview.com/s/603735/its-easy-to-slip-toxic-language-past-alphabets-toxic-comment-detector) defines a toxic comment as "a rude, disrespectful, or unreasonable comment that is likely to make you leave a discussion." The objective of this project is to identify and classify toxic comments so that online discussions can become more productive and respectful. Toxicity scores are generated by a machine learning model trained on a dataset of comments from Wikipedia's talk page edits, downloaded from [Kaggle](https://www.kaggle.com/c/jigsaw-toxic-comment-classification-challenge). The scores are akin to probabilities and range from 0 (non-toxic) to 1 (highly toxic). The model is deployed as a REST API using Flask, a micro web framework written in Python: clients send a comment to the API and receive a toxicity prediction in response.
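As a rough sketch of how such a scoring model can be built, the snippet below trains a TF-IDF + logistic regression pipeline on a handful of toy comments and produces a score in [0, 1] via `predict_proba`. The tiny training set and the specific vectorizer/classifier choice are illustrative assumptions; the actual model in this repo is trained on the full Kaggle dataset.

```python
# Illustrative sketch only: toy data stands in for the Kaggle
# Wikipedia talk-page dataset, and the TF-IDF + logistic regression
# pipeline is an assumed (common) choice, not necessarily this repo's.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

train_comments = [
    "you are an idiot",
    "go away and never come back, loser",
    "thanks for the helpful edit",
    "great article, well sourced",
]
train_labels = [1, 1, 0, 0]  # 1 = toxic, 0 = non-toxic

model = make_pipeline(TfidfVectorizer(), LogisticRegression())
model.fit(train_comments, train_labels)

# predict_proba returns per-class probabilities; column 1 is the
# "toxic" class, giving a score between 0 and 1.
score = model.predict_proba(["you are an idiot"])[0][1]
```

The score behaves like a probability, which is what lets the API report a graded toxicity level rather than a hard yes/no label.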

Examples of toxic comments in the dataset:

  • "Fuck you, block me, you faggot pussy!"
  • "Stupid peace of shit stop deleting my stuff asshole go die and fall in a hole go to hell!"
  • "Well I dont give a fuck what you think you bitch ass motherfucker"
  • "Mine dispeared, somebody wax my ass."
  • "You are a raging faggot. Kill yourself."

Deployed app: <https://toxic-comment-gcp.appspot.com/>
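A minimal sketch of how the Flask deployment might expose the model over REST is shown below. The `/predict` route, the JSON field names, and the stubbed `score_comment` function are assumptions for illustration, not the repository's actual API.

```python
# Minimal Flask REST sketch. Route name, JSON fields, and the
# stub scoring function are hypothetical; a real deployment would
# call the trained classifier's predict_proba instead.
from flask import Flask, jsonify, request

app = Flask(__name__)

def score_comment(text):
    # Stand-in for the trained model; always returns a fixed score.
    return 0.5

@app.route("/predict", methods=["POST"])
def predict():
    comment = request.get_json()["comment"]
    return jsonify({"comment": comment, "toxicity": score_comment(comment)})

# Exercise the endpoint with Flask's built-in test client.
client = app.test_client()
resp = client.post("/predict", json={"comment": "hello"})
result = resp.get_json()
```

In production the same app would run behind a WSGI server (App Engine handles this for the deployed URL above), with the trained model loaded once at startup rather than per request.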