SentimentAnalysis

Trains classifiers on the offline Stanford 'Sentiment140' dataset to classify tweets into sentiment classes.
The aim is to then test the trained classifiers online on real-time tweets.
Motivation: the project can support market research, product review summaries, and campaign analysis, helping to make better business decisions.
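
As a rough illustration of working with the Sentiment140 data named above, here is a minimal sketch of reading its CSV layout with the standard library. It assumes the usual Sentiment140 layout: six columns with no header row, where the first column is the polarity (0 = negative, 2 = neutral, 4 = positive) and the last is the tweet text. The function name `read_sentiment140` and the two sample rows are invented for illustration and are not taken from this repository or from the dataset.

```python
import csv
import io

# Column order of the Sentiment140 CSV files (they have no header row).
COLUMNS = ["polarity", "id", "date", "query", "user", "text"]
# Sentiment140 encodes polarity as 0 = negative, 2 = neutral, 4 = positive.
POLARITY_NAMES = {"0": "negative", "2": "neutral", "4": "positive"}


def read_sentiment140(handle):
    """Yield (label, text) pairs from an open Sentiment140-style CSV handle."""
    for row in csv.reader(handle):
        record = dict(zip(COLUMNS, row))
        yield POLARITY_NAMES[record["polarity"]], record["text"]


# Two invented rows in the same layout, standing in for the real file.
sample = io.StringIO(
    '"0","1","Mon Apr 06 22:19:45 PDT 2009","NO_QUERY","user_a","my phone died again"\n'
    '"4","2","Mon Apr 06 22:20:10 PDT 2009","NO_QUERY","user_b","loving the sunshine today"\n'
)
pairs = list(read_sentiment140(sample))
print(pairs)
```

For a local copy of the real file, the same function could be called on `open(path, encoding="latin-1", newline="")`, where `path` is a placeholder; the file is commonly read as Latin-1 because it does not decode cleanly as UTF-8.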

Python & pySpark

TechStack:

Python libraries
- NLTK
- BeautifulSoup
- sklearn
- pyspark
- tweepy
- textblob
- matplotlib
- pandas
- numpy

Phases completed:

- Data cleaning and tokenizing
- Word vectorizing
- Performing NLP
- Feature extraction
- N-gram testing using Logistic Regression
- Training and evaluating using Multinomial Naive Bayes, Bernoulli Naive Bayes, Ridge Classifier and AdaBoost Classifier
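
The phases above can be sketched roughly as follows with scikit-learn. This is not the code from this repository: `clean_tweet` is a hypothetical helper, the tweets are invented toy data, `CountVectorizer` is an assumed choice of vectorizer, and all classifiers use library defaults. Scores on two held-out toy tweets say nothing about real accuracy; the point is only the shape of the pipeline.

```python
import html
import re

from sklearn.ensemble import AdaBoostClassifier
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression, RidgeClassifier
from sklearn.naive_bayes import BernoulliNB, MultinomialNB
from sklearn.pipeline import make_pipeline

URL_RE = re.compile(r"https?://\S+|www\.\S+")
MENTION_RE = re.compile(r"@\w+")
NON_LETTER_RE = re.compile(r"[^a-z\s]")


def clean_tweet(text):
    """Hypothetical helper: decode HTML entities, lowercase, drop URLs, mentions, non-letters."""
    text = html.unescape(text).lower()
    text = URL_RE.sub(" ", text)
    text = MENTION_RE.sub(" ", text)
    text = NON_LETTER_RE.sub(" ", text)
    return " ".join(text.split())  # collapse runs of whitespace


# Invented toy data (1 = positive, 0 = negative); not taken from Sentiment140.
train_texts = [
    "I love this phone, it is great!",
    "What a wonderful, happy day :)",
    "Great service and friendly staff",
    "I hate waiting, this is awful",
    "Terrible update, the app keeps crashing",
    "Worst customer support ever @acme http://example.com",
]
train_labels = [1, 1, 1, 0, 0, 0]
test_texts = ["I love the friendly staff", "This awful app keeps crashing"]
test_labels = [1, 0]

train_clean = [clean_tweet(t) for t in train_texts]
test_clean = [clean_tweet(t) for t in test_texts]

# N-gram testing: the same Logistic Regression with a growing n-gram range.
ngram_scores = {}
for ngram_range in [(1, 1), (1, 2), (1, 3)]:
    model = make_pipeline(
        CountVectorizer(ngram_range=ngram_range),
        LogisticRegression(max_iter=1000),
    )
    model.fit(train_clean, train_labels)
    ngram_scores[ngram_range] = model.score(test_clean, test_labels)

# Train and evaluate the four listed classifiers on unigram + bigram counts.
classifier_scores = {}
for clf in [MultinomialNB(), BernoulliNB(), RidgeClassifier(), AdaBoostClassifier()]:
    model = make_pipeline(CountVectorizer(ngram_range=(1, 2)), clf)
    model.fit(train_clean, train_labels)
    classifier_scores[type(clf).__name__] = model.score(test_clean, test_labels)

print(ngram_scores)
print(classifier_scores)
```

Wrapping the vectorizer and classifier in one pipeline keeps the vocabulary fitted on training tweets only, so held-out tweets are transformed with the same features the model was trained on.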

Ongoing project
