Sentiment-Analysis-MK1

A sentiment analysis project based on the Sentiment140 training data. This first attempt applies some basic rules to turn the raw data into quality input data, which is then fed into an LSTM neural network.

Demo Available Here: https://www.ericsciberras.com/portfolio/#LSTM-Sentiment-Analysis

Getting Started

Play With Already Trained Neural Network

Go to the html folder and open nn.html.

Train Neural Network Yourself

Download the sentiment140 data from here. Alternatively, you may use your own dataset; however, the process_raw_data() function may need to be altered so that your data ends up in the correct format.
Run
pip3 install -r requirements.txt
To install all the required Python 3 libraries.
Run
python3 processData.py /path/to/dataset/
To create 3 files:
- processed_data.csv : which is the data in correct format for converting sentences to arrays
- processed_data_2.csv : which is data that is ready to be fed into the neural network
- tokeniser.pickle : holds the dictionary which is useful for converting the words to numbers and vice versa
Run
python3 neuralnet.py
To start training. Note: using tensorflow-gpu is recommended, as training on a CPU will be very slow. This creates a .hdf5 file for each epoch, e.g. sentiment-ai-04-0.74 (the format is sentiment-ai-<epoch>-<val_acc>), and a file called my-model.hdf5, which holds the last epoch.
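
The role of the tokeniser dictionary (converting words to numbers and back) can be illustrated with a minimal sketch. This is not the code in processData.py: the function names, the padding index 0, the unknown-word index 1, and the fixed sentence length are all assumptions made for illustration.

```python
from collections import Counter

PAD = 0  # assumed index reserved for padding
UNK = 1  # assumed index for words missing from the dictionary


def build_vocab(sentences, max_words=10000):
    """Map the most frequent words to integer ids, starting at 2."""
    counts = Counter(word for s in sentences for word in s.lower().split())
    return {word: i + 2 for i, (word, _) in enumerate(counts.most_common(max_words))}


def encode(sentence, vocab, length=20):
    """Turn a sentence into a fixed-length list of ids (truncate, then pad)."""
    ids = [vocab.get(word, UNK) for word in sentence.lower().split()][:length]
    return ids + [PAD] * (length - len(ids))


def decode(ids, vocab):
    """Turn ids back into words, skipping padding."""
    reverse = {i: word for word, i in vocab.items()}
    return " ".join(reverse.get(i, "<unk>") for i in ids if i != PAD)


vocab = build_vocab(["I love this", "I hate this movie"])
encoded = encode("I love this movie", vocab, length=6)
print(encoded)
print(decode(encoded, vocab))
```

The fixed length matters because this version of the network expects every sentence to be the same size; shorter sentences are padded and longer ones are cut off.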

To play with the model:

Python

Run
python3 runmodel.py /path/to/model.hdf5
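
Since training leaves one checkpoint per epoch, it may help to pick the one with the highest validation accuracy before running the model. The sketch below is a hypothetical helper, not part of this repository; it assumes the per-epoch files are named like sentiment-ai-04-0.74 with a .hdf5 extension, and the names CHECKPOINT_RE and best_checkpoint are made up for illustration.

```python
import re

# Assumed naming scheme: sentiment-ai-<epoch>-<val_acc>.hdf5
CHECKPOINT_RE = re.compile(r"sentiment-ai-(\d+)-(\d+\.\d+)\.hdf5$")


def best_checkpoint(names):
    """Return the checkpoint name with the highest val_acc, or None.

    Files that do not match the naming scheme (such as my-model.hdf5)
    are ignored. On a tie in val_acc, the later epoch wins.
    """
    best = None
    for name in names:
        match = CHECKPOINT_RE.search(str(name))
        if match is None:
            continue
        candidate = (float(match.group(2)), int(match.group(1)), str(name))
        if best is None or candidate > best:
            best = candidate
    return best[2] if best else None


files = ["sentiment-ai-01-0.70.hdf5", "sentiment-ai-04-0.74.hdf5", "my-model.hdf5"]
print(best_checkpoint(files))
```

In practice the list of names could come from something like `pathlib.Path(".").glob("*.hdf5")`, and the result would be the path passed to the run command above.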

OR

Web Browser

Install tfjs-converter
pip install tensorflowjs
Convert the keras model to TF.js Layers format
tensorflowjs_converter --input_format keras path/to/my_model.h5 path/to/tfjs_target_dir
Run
python3 dictionary2json.py ./tokeniser.pickle ./dict.json
To convert the dictionary pickle to JSON.
Replace the current dict.json, group1-shard1of1 and model.json in the html folder with the newly created files.
Open nn.html.
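
The dictionary conversion step above can be sketched roughly as follows. This is an illustration rather than the contents of dictionary2json.py: it assumes the pickle holds a plain Python dict mapping words to integers, and the function name pickle_dict_to_json is hypothetical. Only unpickle files you created yourself, since loading a pickle can execute arbitrary code.

```python
import json
import pickle


def pickle_dict_to_json(pickle_path, json_path):
    """Load a pickled dict and write it out as JSON; return the entry count."""
    with open(pickle_path, "rb") as f:
        word_index = pickle.load(f)  # assumption: a plain dict, e.g. {"good": 12}
    if not isinstance(word_index, dict):
        raise TypeError("expected the pickle to contain a dict")
    with open(json_path, "w", encoding="utf-8") as f:
        json.dump(word_index, f, ensure_ascii=False)
    return len(word_index)
```

A call such as `pickle_dict_to_json("./tokeniser.pickle", "./dict.json")` would then produce a file the browser can load with a normal JSON request.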

Possible Improvements For MK2

Use more or better datasets (create my own ???)
Research Word2vec and its possible advantages
Allow neural network to accept arbitrarily sized sentences
Use more sophisticated techniques for processing raw datasets (e.g. identifying words that don't contribute to sentiment)
Optimise parameters for LSTM neural network or use different neural network
