Project done for module Predictive Analytics in Business

kaiwei-tan/DBA3803_project

DBA3803 Project

Project done for the NUS module Predictive Analytics in Business in Spring 2019
Done by: Kai Wei Tan, Daniel Lee, Eugene Ng, Tan Quan Hao, Rayner Tay, Kenny Chuen
Supervised by: Professor He Long

This repository is part of a natural language processing and classification project that uses latent Dirichlet allocation (LDA) for topic modelling ("LDA.ipynb") and logistic regression for text classification ("/flask/model.py"). The project aims to use machine learning to identify hate speech by classifying text under pre-defined categories, e.g. 'racist', 'xenophobic', 'sexual'. The text data ("text.csv") was scraped from Singaporean forums and from Facebook comments on Singaporean pages, then manually tagged by category.
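To make the classification step more concrete, here is a minimal sketch of one-vs-rest logistic regression over bag-of-words counts, written in plain Python. It is not the code in "/flask/model.py", and it does not cover the LDA step. The category list only reuses the three examples named above, and `ROWS`, `train_binary`, `train_all` and `predict_categories` are hypothetical names, with placeholder tokens standing in for real rows of "text.csv".

```python
import math

# Category names reused from the examples above; the project's full list is not shown here.
CATEGORIES = ["racist", "xenophobic", "sexual"]

# Hypothetical stand-in for text.csv: (text, set of manually assigned tags).
# Placeholder tokens are used instead of real comments.
ROWS = [
    ("rterm about them", {"racist"}),
    ("xterm go back", {"xenophobic"}),
    ("sterm comment here", {"sexual"}),
    ("rterm and xterm together", {"racist", "xenophobic"}),
    ("the weather is nice", set()),
    ("lunch was good today", set()),
]


def bag_of_words(text):
    """Count lower-cased, whitespace-separated tokens."""
    counts = {}
    for token in text.lower().split():
        counts[token] = counts.get(token, 0) + 1
    return counts


def sigmoid(z):
    z = max(-30.0, min(30.0, z))  # clamp to keep math.exp well behaved
    return 1.0 / (1.0 + math.exp(-z))


def score(model, counts):
    """Linear score: bias plus the weighted token counts."""
    weights, bias = model
    return bias + sum(weights.get(tok, 0.0) * n for tok, n in counts.items())


def train_binary(rows, category, epochs=200, lr=0.5):
    """Fit one binary logistic regression with per-example gradient steps."""
    weights, bias = {}, 0.0
    for _ in range(epochs):
        for text, tags in rows:
            counts = bag_of_words(text)
            target = 1.0 if category in tags else 0.0
            error = sigmoid(score((weights, bias), counts)) - target
            bias -= lr * error
            for tok, n in counts.items():
                weights[tok] = weights.get(tok, 0.0) - lr * error * n
    return weights, bias


def train_all(rows, categories=CATEGORIES):
    """Train one independent classifier per category (one-vs-rest)."""
    return {cat: train_binary(rows, cat) for cat in categories}


def predict_categories(models, text, threshold=0.5):
    """Return every category whose predicted probability exceeds the threshold."""
    counts = bag_of_words(text)
    return [cat for cat, model in models.items()
            if sigmoid(score(model, counts)) > threshold]


models = train_all(ROWS)
print(predict_categories(models, "rterm and xterm together"))
```

Training one binary classifier per category lets a single comment carry several tags at once, which seems to fit the manual tagging described above. A real pipeline would more likely rely on a library vectorizer and classifier than on this hand-rolled loop.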

I also created a Flask application ("/flask/main.py") which uses the trained logistic regression model to classify any new text input. If this input is classified under any of our pre-defined categories, the application identifies it as hate speech.
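The decision rule described above (any matched category means hate speech) can be sketched as follows. This is only an illustration of the rule, not the code in "/flask/main.py": it leaves Flask out entirely, `screen_text` and `keyword_stub` are hypothetical names, and the stub classifier is a stand-in for the trained logistic regression model.

```python
from typing import Callable, Dict, List


def screen_text(text: str, classify: Callable[[str], List[str]]) -> Dict[str, object]:
    """Apply the rule: input matching at least one category counts as hate speech."""
    categories = classify(text)
    return {
        "text": text,
        "categories": categories,
        "is_hate_speech": len(categories) > 0,
    }


def keyword_stub(text: str) -> List[str]:
    """Hypothetical stand-in for the trained model: tags text by placeholder keywords."""
    placeholder_keywords = {"rterm": "racist", "xterm": "xenophobic", "sterm": "sexual"}
    tokens = text.lower().split()
    return [cat for word, cat in placeholder_keywords.items() if word in tokens]


flagged = screen_text("rterm and xterm together", keyword_stub)
clean = screen_text("the weather is nice", keyword_stub)
print(flagged["is_hate_speech"], flagged["categories"])
print(clean["is_hate_speech"], clean["categories"])
```

In a web application, a route handler could call a function like `screen_text` on the submitted text and return the resulting dictionary as the response payload.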
