# Toxic Comment Classification


This repo contains the Toxic Comment Classification project, part of my data science portfolio. [Google](https://www.technologyreview.com/s/603735/its-easy-to-slip-toxic-language-past-alphabets-toxic-comment-detector) defines a toxic comment as "a rude, disrespectful, or unreasonable comment that is likely to make you leave a discussion." The objective of this project is to identify and classify toxic comments so that online discussions can become more productive and respectful. Toxicity scores are generated by a machine learning model trained on a dataset of comments from Wikipedia's talk page edits, downloaded from [Kaggle](https://www.kaggle.com/c/jigsaw-toxic-comment-classification-challenge). The scores are akin to probabilities and range from 0 (non-toxic) to 1 (highly toxic). The model is deployed as a REST API using Flask, a micro web framework written in Python: clients send a comment to the API and receive a toxicity prediction in response.
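As a rough sketch of how such a scoring model can be built, the snippet below trains a TF-IDF + logistic regression pipeline on a handful of toy comments and produces a score in [0, 1] via `predict_proba`. The tiny training set and the specific vectorizer/classifier choice are illustrative assumptions; the actual model in this repo is trained on the full Kaggle dataset.

```python
# Illustrative sketch only: toy data stands in for the Kaggle
# Wikipedia talk-page dataset, and the TF-IDF + logistic regression
# pipeline is an assumed (common) choice, not necessarily this repo's.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

train_comments = [
    "you are an idiot",
    "go away and never come back, loser",
    "thanks for the helpful edit",
    "great article, well sourced",
]
train_labels = [1, 1, 0, 0]  # 1 = toxic, 0 = non-toxic

model = make_pipeline(TfidfVectorizer(), LogisticRegression())
model.fit(train_comments, train_labels)

# predict_proba returns per-class probabilities; column 1 is the
# "toxic" class, giving a score between 0 and 1.
score = model.predict_proba(["you are an idiot"])[0][1]
```

The score behaves like a probability, which is what lets the API report a graded toxicity level rather than a hard yes/no label.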

Examples of toxic comments in the dataset:

  • "Fuck you, block me, you faggot pussy!"
  • "Stupid peace of shit stop deleting my stuff asshole go die and fall in a hole go to hell!"
  • "Well I dont give a fuck what you think you bitch ass motherfucker"
  • "Mine dispeared, somebody wax my ass."
  • "You are a raging faggot. Kill yourself."

Deployed app: <https://toxic-comment-gcp.appspot.com/>
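A minimal sketch of how the Flask deployment might expose the model over REST is shown below. The `/predict` route, the JSON field names, and the stubbed `score_comment` function are assumptions for illustration, not the repository's actual API.

```python
# Minimal Flask REST sketch. Route name, JSON fields, and the
# stub scoring function are hypothetical; a real deployment would
# call the trained classifier's predict_proba instead.
from flask import Flask, jsonify, request

app = Flask(__name__)

def score_comment(text):
    # Stand-in for the trained model; always returns a fixed score.
    return 0.5

@app.route("/predict", methods=["POST"])
def predict():
    comment = request.get_json()["comment"]
    return jsonify({"comment": comment, "toxicity": score_comment(comment)})

# Exercise the endpoint with Flask's built-in test client.
client = app.test_client()
resp = client.post("/predict", json={"comment": "hello"})
result = resp.get_json()
```

In production the same app would run behind a WSGI server (App Engine handles this for the deployed URL above), with the trained model loaded once at startup rather than per request.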