DBA3803 Project

Project done for the NUS module Predictive Analytics in Business in Spring 2019
Done by: Kai Wei Tan, Daniel Lee, Eugene Ng, Tan Quan Hao, Rayner Tay, Kenny Chuen
Supervised by: Professor He Long

This is part of a natural language processing and classification project that uses latent Dirichlet allocation (LDA) for topic modelling ("LDA.ipynb") and logistic regression for text classification ("/flask/model.py"). The project aims to use machine learning to identify hate speech by classifying text under pre-defined categories such as 'racist', 'xenophobic' and 'sexual'. The text data ("text.csv") was scraped from Singaporean forums and from Facebook comments on Singaporean pages, then manually tagged by category.
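
To illustrate the classification step, here is a minimal from-scratch sketch of bag-of-words logistic regression for a single category, using only the Python standard library. It is not the code in "/flask/model.py", which may be built differently. The function names, the toy training texts and the `badword` placeholder token are all assumptions made for illustration, not data from "text.csv".

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately crude tokenizer)."""
    return text.lower().split()


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_logistic_regression(texts, labels, epochs=200, lr=0.5):
    """Fit one weight per word plus a bias, using per-example gradient descent."""
    weights = Counter()  # words not seen yet read as weight 0
    bias = 0.0
    for _ in range(epochs):
        for text, y in zip(texts, labels):
            counts = Counter(tokenize(text))
            z = bias + sum(weights[w] * c for w, c in counts.items())
            error = sigmoid(z) - y  # gradient of the log loss with respect to z
            bias -= lr * error
            for w, c in counts.items():
                weights[w] -= lr * error * c
    return weights, bias


def predict_proba(text, weights, bias):
    """Probability that the text belongs to the category the model was trained on."""
    counts = Counter(tokenize(text))
    z = bias + sum(weights.get(w, 0.0) * c for w, c in counts.items())
    return sigmoid(z)


# Hypothetical toy data: 1 = belongs to the category, 0 = does not.
toy_texts = ["badword here", "badword there", "lovely here", "lovely there"]
toy_labels = [1, 0, 1, 0][:0] or [1, 1, 0, 0]
toy_weights, toy_bias = train_logistic_regression(toy_texts, toy_labels)

print(predict_proba("badword", toy_weights, toy_bias) > 0.5)
print(predict_proba("lovely", toy_weights, toy_bias) > 0.5)
```

With several categories, one such binary model could be trained per category on the manually tagged labels. Whether the project does this or uses a single multi-class model is not stated here.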

I also created a Flask application ("/flask/main.py") which uses the trained logistic regression model to classify any new text input. If this input is classified under any of our pre-defined categories, the application identifies it as hate speech.
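
The decision rule described above (flag the input if it falls under any category) can be sketched in plain Python as follows. This leaves out the Flask routing and the trained model entirely. The `is_hate_speech` function, the keyword stubs and the `*_term` placeholder tokens are hypothetical stand-ins for whatever "/flask/main.py" actually calls, and the category list contains only the three names mentioned in this README.

```python
# Only the categories named in this README; the full list is not shown here.
CATEGORIES = ["racist", "xenophobic", "sexual"]


def make_keyword_stub(keyword):
    """Hypothetical stand-in for a trained per-category classifier."""
    def classify(text):
        return keyword in text.lower().split()
    return classify


# One stub per category, keyed by category name (placeholder tokens, not real data).
stub_classifiers = {name: make_keyword_stub(name + "_term") for name in CATEGORIES}


def is_hate_speech(text, classifiers):
    """Return (flag, matched): flag is True if any category classifier fires."""
    matched = [name for name, classify in classifiers.items() if classify(text)]
    return bool(matched), matched


print(is_hate_speech("hello world", stub_classifiers))
print(is_hate_speech("some racist_term here", stub_classifiers))
```

Returning the matched categories alongside the flag is a design choice for this sketch, so that a caller could show why an input was flagged.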

kaiwei-tan/DBA3803_project

Folders and files

Latest commit

History

Repository files navigation

DBA3803 Project

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages