Sentiment-Analysis-of-Movie-review-dataset-using-Ensemble-Learning

Overview:

This project is about sentiment analysis, an active area of Natural Language Processing and data science. In simple terms, sentiment analysis identifies an author's attitude towards a topic: sentiment analysis tools categorize pieces of writing as positive, neutral, or negative.

Problem Statement:

Rotten Tomatoes hosts movie reviews, which makes it a good source of data. In this project, our NLP model categorizes user reviews into three classes (positive, negative, and neutral) after training on the labeled data provided by nlp.stanford.edu. The Rotten Tomatoes dataset is a large labeled dataset of movie reviews. Potential obstacles include sentence negation, sarcasm, and language ambiguity, all of which make sentence-level prediction more difficult.

Exploring the data:

The dataset consists of tab- and |-separated .txt files containing phrases from the Rotten Tomatoes dataset. It includes four files: datasetSentences, datasetSplit, dictionary, and sentiment_labels.

datasetSentences - Contains 11855 sentences, which are further divided into phrases.
dictionary - Contains 239231 phrases drawn from the sentences in datasetSentences. Each phrase has a unique identifier, the PhraseID.
sentiment_labels - Contains the sentiment value of each phrase, keyed by its PhraseID. The values range from 0.00 (most negative) to 1.00 (most positive).
datasetSplit - An optional file that assigns the data to the train, test, and validation sets.
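
As a rough illustration of how these files fit together, the following Python sketch joins phrases to their sentiment values by PhraseID and buckets each value into one of three classes. The in-memory sample rows, the assumed column order (phrase|PhraseID in dictionary, PhraseID|value with a header row in sentiment_labels), and the 0.4 / 0.6 cut-offs are assumptions made for illustration, not values taken from the project's notebook.

```python
import io

# Tiny in-memory stand-ins for dictionary.txt and sentiment_labels.txt.
# Assumed layouts: "phrase|PhraseID" and "PhraseID|value" preceded by a header row.
DICTIONARY_TXT = "a gripping thriller|0\nutterly dull|1\nit is a film|2\n"
LABELS_TXT = "phrase ids|sentiment values\n0|0.875\n1|0.125\n2|0.5\n"


def load_dictionary(handle):
    """Return {PhraseID: phrase} from a '|'-separated file."""
    phrases = {}
    for line in handle:
        line = line.rstrip("\n")
        if not line:
            continue
        # Split on the last '|' so a phrase that itself contains '|' stays intact.
        text, phrase_id = line.rsplit("|", 1)
        phrases[int(phrase_id)] = text
    return phrases


def load_labels(handle):
    """Return {PhraseID: sentiment value}, skipping the header row."""
    next(handle)  # header row (assumed to be present)
    labels = {}
    for line in handle:
        line = line.strip()
        if line:
            phrase_id, value = line.split("|")
            labels[int(phrase_id)] = float(value)
    return labels


def to_class(value, low=0.4, high=0.6):
    """Bucket a 0.00-1.00 sentiment value into three classes (assumed cut-offs)."""
    if value <= low:
        return "negative"
    if value > high:
        return "positive"
    return "neutral"


phrases = load_dictionary(io.StringIO(DICTIONARY_TXT))
labels = load_labels(io.StringIO(LABELS_TXT))
# Join the two files on PhraseID.
labeled = {phrases[pid]: to_class(value) for pid, value in labels.items()}
```

With the real files, the same functions could be called on handles returned by open(..., encoding="utf-8"), provided the layouts match the assumptions above.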

Applied Approach to the problem:

Dataset visualization and data analysis.
Splitting the data into train, test, and validation sets.
Tokenization of the dictionary.
Word embedding of the reviews.
Training three different models: LSTM, bidirectional LSTM, and feedforward neural network.
Integrated stacking of three different neural networks into an ensemble model.
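
The tokenization and padding steps above can be sketched in plain Python as follows. This is a minimal illustration, not the notebook's code: the helper names (build_vocab, encode, pad), the whitespace tokenizer, the reserved ids 0 and 1, and the choice to pad and truncate at the end of each sequence are all assumptions.

```python
from collections import Counter

PAD_ID = 0  # reserved for padding (assumption)
OOV_ID = 1  # reserved for words not seen when the vocabulary was built (assumption)


def build_vocab(texts, max_words=None):
    """Map each word to an integer id, most frequent words first."""
    counts = Counter(word for text in texts for word in text.lower().split())
    return {
        word: idx
        for idx, (word, _) in enumerate(counts.most_common(max_words), start=2)
    }


def encode(text, vocab):
    """Turn a phrase into a list of word ids, using OOV_ID for unknown words."""
    return [vocab.get(word, OOV_ID) for word in text.lower().split()]


def pad(sequence, length):
    """Truncate or right-pad a sequence so that it has exactly `length` ids."""
    return (sequence[:length] + [PAD_ID] * length)[:length]


train_texts = ["good movie", "bad movie"]
vocab = build_vocab(train_texts)

# Every phrase becomes a fixed-length row that an embedding layer could consume.
batch = [pad(encode(text, vocab), 4) for text in ["good movie", "bad film overall"]]
```

Fixed-length rows are what allow phrases of different lengths to be batched together before the word-embedding step.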

Various Techniques used in the model:

Feedforward neural networks • LSTM • Bidirectional LSTM • early stopping • tokenization • padding • integrated stacking (meta learner's neural network) • word embeddings
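
To make the stacking idea concrete, here is a small pure-Python sketch of the data flow only: class probabilities from three base models are concatenated into one feature vector, and a meta-learner is trained on those vectors. It rests on several assumptions: the base-model outputs are hard-coded stand-ins for the LSTM, bidirectional LSTM, and feedforward network, a single softmax layer stands in for the meta-learner's neural network, and the names stack_features and SoftmaxMetaLearner are hypothetical. It is not the model in the notebook.

```python
import math
import random


def softmax(logits):
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def stack_features(per_model_probs):
    """Concatenate each base model's class probabilities into one vector."""
    return [p for probs in per_model_probs for p in probs]


class SoftmaxMetaLearner:
    """One softmax layer trained with per-example gradient descent."""

    def __init__(self, n_features, n_classes, seed=0):
        rng = random.Random(seed)
        self.weights = [
            [rng.uniform(-0.01, 0.01) for _ in range(n_features)]
            for _ in range(n_classes)
        ]
        self.biases = [0.0] * n_classes

    def predict_proba(self, features):
        logits = [
            sum(w * x for w, x in zip(row, features)) + b
            for row, b in zip(self.weights, self.biases)
        ]
        return softmax(logits)

    def predict(self, features):
        probs = self.predict_proba(features)
        return max(range(len(probs)), key=probs.__getitem__)

    def fit(self, rows, labels, epochs=300, lr=0.5):
        for _ in range(epochs):
            for features, label in zip(rows, labels):
                probs = self.predict_proba(features)
                for k in range(len(self.biases)):
                    # Cross-entropy gradient with respect to logit k.
                    grad = probs[k] - (1.0 if k == label else 0.0)
                    self.biases[k] -= lr * grad
                    for j, x in enumerate(features):
                        self.weights[k][j] -= lr * grad * x


# Hypothetical held-out predictions [negative, neutral, positive] from the
# three base models, one row of three outputs per training example.
base_outputs = [
    [[0.8, 0.1, 0.1], [0.7, 0.2, 0.1], [0.6, 0.3, 0.1]],  # true class 0
    [[0.1, 0.8, 0.1], [0.2, 0.7, 0.1], [0.2, 0.6, 0.2]],  # true class 1
    [[0.1, 0.1, 0.8], [0.1, 0.2, 0.7], [0.1, 0.3, 0.6]],  # true class 2
]
true_labels = [0, 1, 2]

meta_rows = [stack_features(outputs) for outputs in base_outputs]
meta = SoftmaxMetaLearner(n_features=9, n_classes=3)
meta.fit(meta_rows, true_labels)
predicted = [meta.predict(row) for row in meta_rows]
```

In an integrated stacking setup the base models and the meta-learner would typically live in one larger network with the base models' weights frozen; the sketch keeps them separate only to stay short and dependency-free.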

Author Information:

Update History:

final-code-v2.ipynb contains the most recent code with a slight increase in accuracy.

