Sentiment Analysis for Amazon-Reviews

Umberto Cocca - 807191

Introduction

In recent years, a growing body of research has broadened our understanding of textual resources, supporting the growth of online services that have changed the face of shopping. E-commerce applications like Amazon acquire an enormous amount of data through their transactions and users. A substantial part of it consists of content generated by users, who evaluate the products they purchased and share their experience through numerical ratings and/or reviews.

Network Analysis

The goal of this project is to extract insights that may prove helpful for business purposes. In particular, the question I want to answer using network analysis is: which are the most recommended books? The answer can help decide how to sort products, for example on a website, so that users first see the ones they are most likely looking for.
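As a rough illustration of the question above, the sketch below ranks books by in-degree in a directed recommendation graph. It rests on assumptions that are not stated in this README: that an edge (A, B) means "B is recommended alongside A", and that "most recommended" can be approximated by the number of incoming edges. It uses only the standard library for brevity; the function name `most_recommended` and the toy edges are hypothetical, and the project itself relies on iGraph for network analysis.

```python
from collections import Counter


def most_recommended(edges, top_n=3):
    """Rank books by in-degree, i.e. how many other books point to them."""
    in_degree = Counter(target for _source, target in edges)
    # Sort by count (descending), then by title so that ties are deterministic.
    return sorted(in_degree.items(), key=lambda kv: (-kv[1], kv[0]))[:top_n]


# Hypothetical toy data: (book, book recommended alongside it).
edges = [
    ("Book A", "Book C"),
    ("Book B", "Book C"),
    ("Book A", "Book B"),
    ("Book D", "Book C"),
    ("Book D", "Book B"),
]

ranking = most_recommended(edges)
print(ranking)
```

In-degree is only one possible notion of "most recommended"; other centrality measures could be swapped in without changing the overall structure.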

Sentiment Analysis

Sentiment Analysis is used to interpret natural language and identify subjective information that denotes opinions, emotions and feelings, determining the corresponding polarity (positive, negative or neutral) and finally summarizing this data so that it can be of value to a company. In this way, decisions can be based on meaningful data rather than on simple intuitions that are not always correct. Sentiment Analysis is important because companies want their brand to be perceived positively. In this regard, the focus can be on positive or negative comments, as well as on customer feedback, to evaluate both strengths and points to improve. To apply Sentiment Analysis in this project, the textual parts of the reviews are systematically analyzed to extract an opinion. A preliminary pre-processing phase prepares the dataset; then ASUM (Aspect Sentiment Unification Model) is used to extract sets of topics that refer to positive and negative sentiments from documents made of sentences.

Existing Software and Tools used

For the preprocessing and the network analysis I used Python, because of the large number of open source tools and libraries available. In particular, the following libraries were used:

Python

Pandas: to load and manipulate the dataset;
iGraph: a collection of network analysis tools with an emphasis on efficiency, portability and ease of use;
NLTK: to split every review into a list of sentences;
re: to perform a partial cleaning of the data, for example deleting words that contain invalid characters.
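The preprocessing described in the list above can be sketched roughly as follows. This is a minimal illustration, not the code in `main.py`: the regex-based `split_sentences` is a naive stand-in for NLTK's sentence tokenizer (used here so the example runs without downloading NLTK data), and the definition of a "valid" word in `VALID_WORD` is an assumption.

```python
import re

# Assumption: a valid word is made of ASCII letters, optionally with one
# inner apostrophe (e.g. "wasn't"). Anything else is dropped.
VALID_WORD = re.compile(r"^[a-z]+(?:'[a-z]+)?$")


def split_sentences(review):
    """Naive sentence splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", review) if s.strip()]


def clean_sentence(sentence):
    """Lowercase the tokens, strip surrounding punctuation, drop invalid words."""
    tokens = (t.strip(".,;:!?\"()").lower() for t in sentence.split())
    return [t for t in tokens if VALID_WORD.match(t)]


review = "Great book!! I read it in 2 days. The ending??? Wasn't expecting th@t."
sentences = [clean_sentence(s) for s in split_sentences(review)]
print(sentences)
```

Each review becomes a list of sentences, and each sentence a list of cleaned words, which is the shape needed to build the ASUM input files.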

ASUM
Using Python, the ad-hoc input for the Java version of ASUM was built. The program input consists of two mandatory files and an optional set of files:

BagOfSentences.txt (mandatory)
This file represents the documents of the corpus as lists of word indexes. For each document, the first line is the number of sentences; the following lines contain the indexes of the words, where each index refers to the position of a word in the WordList file.
WordList.txt (mandatory)
This file maps words to indexes. The first word is assumed to have index 0, the second index 1, and so on.
SentiWords-0.txt, SentiWords-1.txt, ... (optional)
These files are composed of words called "semi-sentimental". The file numbering should start from 0 and increase by one until the number of sentiments searched for is reached. The ASUM model can use this a priori information to help the sampling process. If, for example, we know that a given word is positive because it belongs to the lexicon of positive words, then its probability of being positive is known. For this project two sentiments were searched for, one positive and one negative.
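The two mandatory files described above could be generated along these lines. This is a hedged sketch rather than the project's actual code: the function name `build_asum_input` is hypothetical, and it assumes one line per sentence with space-separated indexes, a detail this README does not spell out.

```python
def build_asum_input(documents):
    """Build the lines of WordList.txt and BagOfSentences.txt.

    documents: list of documents, each a list of sentences,
               each sentence a list of words.
    """
    word_index = {}   # word -> index, assigned in order of first appearance
    bag_lines = []
    for sentences in documents:
        # First line of each document: its number of sentences.
        bag_lines.append(str(len(sentences)))
        for words in sentences:
            indexes = []
            for word in words:
                if word not in word_index:
                    word_index[word] = len(word_index)  # first word gets 0
                indexes.append(str(word_index[word]))
            # Assumption: one line per sentence, indexes separated by spaces.
            bag_lines.append(" ".join(indexes))
    # Dicts preserve insertion order, so line i of WordList.txt is word i.
    word_list = list(word_index)
    return word_list, bag_lines


documents = [
    [["great", "book"], ["loved", "the", "ending"]],
    [["boring", "book"]],
]
word_list, bag_lines = build_asum_input(documents)
print("\n".join(word_list))
print("\n".join(bag_lines))
```

Writing `"\n".join(word_list)` to WordList.txt and `"\n".join(bag_lines)` to BagOfSentences.txt would then produce the two files. The optional SentiWords files are not covered here.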

