Aspect Based Sentiment Analysis

1. Introduction

Aspect-based sentiment analysis, also known as feature-based or attribute-based sentiment analysis, analyzes sentiment toward the individual features (attributes, aspects) of a product. A smartphone, for example, has features such as camera, battery life, and touch screen, and sentiment is determined separately for each of these features.

It is often not enough to say whether a post or a product review has a "positive" or a "negative" sentiment overall. The provider may want to know which aspects were rated positively and which negatively.

For example, let's say the customer review for a restaurant is as follows: "The food was great but the service was lousy."

A classifier that scores the whole sentence would likely label this review "neutral", because the two opinions cancel out, which makes the review of little use. If sentiment is instead determined per aspect, as below, the review becomes far more informative:
Aspect: "food", Polarity: "positive"
Aspect: "service", Polarity: "negative"
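
The contrast between sentence-level and aspect-level polarity can be illustrated with a toy example. This is only a sketch: the two-word lexicon, the split on "but", and the helper names `label` and `score` are assumptions made for illustration and are not how the notebook classifies sentences.

```python
# Toy polarity lexicon (assumption); a real system learns polarity from labelled data.
LEXICON = {"great": 1, "lousy": -1}


def label(total):
    """Map a numeric score to a polarity label."""
    if total > 0:
        return "positive"
    if total < 0:
        return "negative"
    return "neutral"


def score(words):
    """Sum the lexicon scores of the given words."""
    return sum(LEXICON.get(w, 0) for w in words)


review = "The food was great but the service was lousy."
text = review.lower().rstrip(".")

# Sentence-level view: the two opinions cancel out.
overall = label(score(text.split()))

# Aspect-level view: score each clause separately (split on "but")
# and attach the result to the aspect named in that clause.
aspects = ["food", "service"]
per_aspect = {}
for clause in text.split(" but "):
    clause_words = clause.split()
    for aspect in aspects:
        if aspect in clause_words:
            per_aspect[aspect] = label(score(clause_words))

print(overall)
print(per_aspect)
```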

2. Brief description of the implementation steps

Read the data as pandas dataframes.
Perform data preprocessing:
a. Replace '[comma]' with an actual ','; replace multiple spaces and special characters with a single space; convert to lower case; remove stopwords; and tokenize.
b. Restrict each sentence to a window of size 5 around the aspect term: keep the aspect term together with the 5 words to its left and the 5 words to its right. If the aspect term occurs more than once, keep the 5 words left of its first occurrence, the 5 words right of its last occurrence, and all the words in between. If the aspect term is missing from the sentence, keep the entire sentence as it is.
c. Learn the vocabulary dictionary and return the term-document matrix using fit_transform on a bigram count vectorizer.
Sample a training set from the 'training dataset', holding out 25% of it for testing (evaluating) the classifiers.
Train the classifiers one by one on this split of the training dataset and obtain a classification report for each using 10-fold cross validation.
Pick the classifier that performed best on the training dataset and use it to predict the class labels for the given test dataset.
Write the generated labels to a new text file, each next to its "example_id" (the unique id of each sentence in the dataset) and separated from it by ";;". Also plot a graph of the number of positive, negative, and neutral labels predicted.
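
The cleaning and windowing in the preprocessing step could look roughly like the sketch below. The function names `preprocess` and `aspect_window`, the short stopword list, and the choice to drop all punctuation (restored commas included) are assumptions made for illustration; the notebook may differ in each of these.

```python
import re

# Small illustrative stopword list (assumption); the notebook may use a fuller one.
STOPWORDS = {"the", "a", "an", "is", "was", "and", "but", "of", "to", "it"}
WINDOW = 5


def preprocess(text):
    """Restore commas, strip special characters, lowercase, drop stopwords, tokenize."""
    # Replace the placeholder first, otherwise "comma" would survive as a token.
    text = text.replace("[comma]", ",")
    # Special characters and runs of whitespace collapse to a single space.
    text = re.sub(r"[^A-Za-z0-9]+", " ", text)
    return [tok for tok in text.lower().split() if tok not in STOPWORDS]


def aspect_window(tokens, aspect_tokens, window=WINDOW):
    """Keep `window` tokens left of the first aspect occurrence and right of the last."""
    n = len(aspect_tokens)
    starts = [i for i in range(len(tokens) - n + 1)
              if tokens[i:i + n] == aspect_tokens]
    if n == 0 or not starts:
        # Aspect term missing: keep the entire sentence.
        return tokens
    left = max(0, starts[0] - window)
    right = min(len(tokens), starts[-1] + n + window)
    return tokens[left:right]


tokens = preprocess("The battery life is great[comma] but the screen is dim.")
print(aspect_window(tokens, preprocess("battery life")))
```

The aspect term is passed through the same `preprocess` function as the sentence, so that a multi-word aspect is matched token by token against the cleaned sentence.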

3. Software installations required to run code

Anaconda Python distribution - preferably Anaconda3 with Python 3.6
The pip packages imported at the top of the notebook file

4. Findings on the dataset used

The following classification reports and overall accuracies were obtained for the classifiers fitted to the training data:

After training, the classifier that performs best on the given dataset is used to predict class labels for the test dataset. In this case (the provided "Laptop" dataset), Multinomial Naive Bayes outperformed the other classifiers, so the labels for the test dataset are predicted with this classifier.
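
A minimal sketch of the evaluation flow with scikit-learn is shown below: a 25% hold-out split, 10-fold cross validation, and a classification report for Multinomial Naive Bayes. The toy sentences and labels are hypothetical stand-ins for the windowed training data, and `ngram_range=(1, 2)` is an assumption about what "bigram count vectorizer" means here; `(2, 2)` would count bigrams only.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.metrics import classification_report
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Hypothetical toy data standing in for the windowed training sentences.
texts = (
    ["screen looks great", "battery life excellent", "keyboard feels good"] * 8
    + ["screen looks dim", "battery drains fast", "keyboard feels cheap"] * 8
    + ["screen is 15 inches", "battery has six cells", "keyboard has backlight"] * 8
)
labels = ["positive"] * 24 + ["negative"] * 24 + ["neutral"] * 24

# Hold out 25% of the training dataset for evaluation.
X_train, X_test, y_train, y_test = train_test_split(
    texts, labels, test_size=0.25, stratify=labels, random_state=0
)

# Count unigrams and bigrams, then fit Multinomial Naive Bayes on the counts.
model = make_pipeline(CountVectorizer(ngram_range=(1, 2)), MultinomialNB())

# 10-fold cross validation on the training portion.
scores = cross_val_score(model, X_train, y_train, cv=10)
print("10-fold CV accuracy: %.3f" % scores.mean())

# Fit on the training portion and report per-class metrics on the hold-out.
model.fit(X_train, y_train)
predicted = model.predict(X_test)
print(classification_report(y_test, predicted))
```

Wrapping the vectorizer and the classifier in one pipeline means the vocabulary is re-learned inside each cross-validation fold, so no information from the validation fold leaks into training.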
