Sentiment analysis with Pumpkin

The workflow contains three workers (seeds):

tweetinject.py - reads a file with tweets and sends them to the queue
filter.py - performs sentiment analysis on incoming tweets
collector.py - counts tweets and writes result into a file
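The data flow between the three workers can be sketched in plain Python. This is only an illustration of the inject, filter, collect stages: it does not use the Pumpkin API or the trained classifier, and the word lists and function names are hypothetical stand-ins.

```python
from collections import Counter

# Hypothetical word lists standing in for the trained classifier.
POSITIVE = {"good", "great", "love", "happy"}
NEGATIVE = {"bad", "awful", "hate", "sad"}


def inject(lines):
    """Stage 1 (role of tweetinject.py): yield non-empty tweets one by one."""
    for line in lines:
        text = line.strip()
        if text:
            yield text


def classify(tweet):
    """Stage 2 (role of filter.py): label a tweet 'pos' or 'neg'."""
    words = set(tweet.lower().split())
    score = len(words & POSITIVE) - len(words & NEGATIVE)
    return "pos" if score >= 0 else "neg"


def collect(labels):
    """Stage 3 (role of collector.py): count the labels."""
    return Counter(labels)


if __name__ == "__main__":
    tweets = ["I love this", "", "what an awful day", "great great news"]
    counts = collect(classify(t) for t in inject(tweets))
    print(counts["pos"], counts["neg"])
```

In the real workflow each stage runs as a separate seed and the tweets travel between them as messages rather than through a generator chain.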

Install missing dependencies if required

pip install pika
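To see whether a dependency is actually missing before installing it, a small check like the following may help. It is a sketch that only looks up top-level module names; the list contains pika because that is the only package named here, and you may need to extend it.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be found for import."""
    return [n for n in names if importlib.util.find_spec(n) is None]


if __name__ == "__main__":
    # "pika" is the only dependency named in this README; extend as needed.
    missing = missing_modules(["pika"])
    if missing:
        print("Missing: " + ", ".join(missing))
    else:
        print("All dependencies found")
```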

Prepare pumpkin environment

Copy the required files to your VM, as explained in the assignment.

Workers (seeds)

Copy the workers you want to run to the ~/pmk-seeds directory, or to the directory that you specify with the --taskdir option (see the Run Pumpkin section below).

Do the following if you want to run all three workers on one VM:

scp *.py pumpkin:pmk-seeds/

Classifier

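Copy the trained classifier to the nltk_data/classifiers directory in the pumpkin user's home directory on the VM. Create the directory on the VM first if it does not exist: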

mkdir -p /home/pumpkin/nltk_data/classifiers

scp movie_reviews_NaiveBayes.pickle pumpkin1:nltk_data/classifiers/
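After copying, you may want to confirm on the VM that the file is in place and can be unpickled. The sketch below assumes the file is a standard Python pickle and that the library that produced it (presumably NLTK) is installed, since unpickling needs the original classes. The helper name load_classifier is hypothetical. Only unpickle files you trust.

```python
import os
import pickle


def load_classifier(path):
    """Unpickle and return the object stored at path (~ is expanded)."""
    with open(os.path.expanduser(path), "rb") as f:
        return pickle.load(f)


if __name__ == "__main__":
    path = "~/nltk_data/classifiers/movie_reviews_NaiveBayes.pickle"
    if os.path.exists(os.path.expanduser(path)):
        clf = load_classifier(path)
        print(type(clf).__name__)
    else:
        print("Classifier not found at " + path)
```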

Configuration file

Complete the Pumpkin configuration file (see slides) and copy it into the working directory:

scp pumpkin.cfg pumpkin1:pumpkin/

You should also change the group name. Messages are identified in the system by the pair of message type and group, so if several groups use the same value, your workers might communicate with other people's workers.
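Why a shared group name causes cross-talk can be illustrated with a toy routing table keyed by the (message type, group) pair. This is not Pumpkin's implementation; the functions, worker names, and the "tweet" message type are hypothetical.

```python
def register(handlers, worker, msg_type, group):
    """Subscribe a worker to messages identified by (msg_type, group)."""
    handlers.setdefault((msg_type, group), []).append(worker)


def route(handlers, msg_type, group):
    """Return the workers that would receive a (msg_type, group) message."""
    return handlers.get((msg_type, group), [])


if __name__ == "__main__":
    # Two students keep the same group name: both filters get the tweets.
    shared = {}
    register(shared, "alice_filter", "tweet", "default")
    register(shared, "bob_filter", "tweet", "default")
    print(route(shared, "tweet", "default"))

    # With distinct group names the messages stay separate.
    separate = {}
    register(separate, "alice_filter", "tweet", "alice")
    register(separate, "bob_filter", "tweet", "bob")
    print(route(separate, "tweet", "alice"))
```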

Run Pumpkin

To start Pumpkin, execute the following command, replacing $taskdir with your chosen directory, such as '~/pmk-seeds':

python DRHarness.py --supernode --taskdir $taskdir --broadcast --endpoints="tcp://*:*" --gonzales

Prepare final result

Perform the computations as explained in the assignment. Then plot a graph that shows how the number of positive and negative tweets changes over time. First, if needed, sort the data:

sort --output=tweetstats.data tweetstats.data

Then plot the graph using gnuplot:

gnuplot plot.gnu

Finally, open the file tweetstats.png.
