Sentiment Analysis

In this repo, a Twitter dataset is analysed. The dataset contains tweets and their labels. The labels are "positive", "negative", and "neutral". The classes are imbalanced: neutral tweets outnumber the others, and negative tweets are the fewest. Here I describe some of the steps.
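
One common way to handle this kind of imbalance is to weight each class inversely to its frequency during training. The README does not say whether the notebooks do this, so the snippet below is only a sketch: the `class_weights` helper and the toy label counts are made up for illustration and are not taken from the dataset.

```python
from collections import Counter


def class_weights(labels):
    """Return a weight per class, inversely proportional to its frequency.

    Uses the "balanced" heuristic: total / (n_classes * class_count).
    """
    counts = Counter(labels)
    total = len(labels)
    n_classes = len(counts)
    return {label: total / (n_classes * count) for label, count in counts.items()}


# Toy labels (hypothetical counts): neutral most frequent, negative least.
labels = ["neutral"] * 5 + ["positive"] * 3 + ["negative"] * 2
weights = class_weights(labels)

for label, weight in sorted(weights.items()):
    print(f"{label}: {weight:.2f}")
```

The rarest class (negative) gets the largest weight, so its mistakes count for more in a weighted loss.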

Preprocessing

1. Removing stop words: stop words are words that carry little information on their own. Sometimes they do matter, though: a negation such as "not" can flip the sentiment of a sentence.
2. Tokenizing: produces a list of tokens.
   2.1 Word tokenizing: splitting a sentence into words.
   2.2 Sentence tokenizing: splitting a text into sentences.
3. Lemmatizing: reducing words to their base form, for example "went, running" to "go, run".
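
The three steps above can be sketched in plain Python. This is only an illustration, not the code from the notebooks: the tiny `STOP_WORDS` set and `LEMMAS` lookup table are assumptions made up for the example, and a real pipeline would normally take a full stop-word list and a proper lemmatizer from an NLP library.

```python
import re

# Toy resources, assumed for illustration only.
STOP_WORDS = {"a", "an", "the", "is", "was", "to", "and", "i", "it"}
LEMMAS = {"went": "go", "running": "run"}


def sentence_tokenize(text):
    """Split a text into sentences after '.', '!' or '?'."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def word_tokenize(sentence):
    """Lowercase a sentence and split it into word tokens."""
    return re.findall(r"[a-z']+", sentence.lower())


def preprocess(text):
    """Return one list of lemmatized, stop-word-free tokens per sentence."""
    result = []
    for sentence in sentence_tokenize(text):
        tokens = word_tokenize(sentence)
        # "not" is deliberately absent from STOP_WORDS so negation survives.
        tokens = [t for t in tokens if t not in STOP_WORDS]
        # Fall back to the token itself when no lemma is known.
        result.append([LEMMAS.get(t, t) for t in tokens])
    return result


processed = preprocess("I went to the park. It was not fun!")
print(processed)
```

Leaving "not" out of the stop-word set is a design choice for sentiment tasks, where dropping negations would lose the signal the model needs.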


MahdiEsrafili/nlp_sentiment_analysis

Folders and files

Latest commit

History

Repository files navigation

Sentiment Analysis

preprocessing

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages