GitHub - yonigottesman/nlp_yelpreview_techniques: Try different techniques and frameworks to solve YelpReviewPolarity dataset

Different techniques and frameworks for sentiment classification on the YelpReviewPolarity dataset: https://www.yelp.com/dataset/challenge

fasttext

https://fasttext.cc/
Parameters as used in https://github.com/facebookresearch/fastText/blob/master/classification-results.sh
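
fastText's supervised mode reads one example per line, with the class written as a `__label__` prefix. A minimal sketch of turning a (label, text) pair into that format follows. The helper name `to_fasttext_line` is hypothetical, and the normalization is a simplified assumption rather than a copy of the shell preprocessing in the linked script.

```python
import re


def to_fasttext_line(label: int, text: str) -> str:
    """Format one review as a fastText supervised training line."""
    # Lowercase and surround common punctuation with spaces so that each
    # punctuation mark becomes its own token (simplified, assumed rules).
    text = text.lower()
    text = re.sub(r"([.!?,'/()])", r" \1 ", text)
    # Collapse the extra whitespace introduced above.
    text = re.sub(r"\s+", " ", text).strip()
    return f"__label__{label} {text}"


print(to_fasttext_line(2, "Great food, friendly staff!"))
# __label__2 great food , friendly staff !
```

A file of such lines can then be passed to `fasttext.train_supervised(input=...)` together with the hyperparameters from the linked script.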

SimpleNN

Simple neural network implemented in PyTorch.
word embeddings -> average over words -> fully connected
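
The pipeline above (embed, average, classify) could be sketched roughly as below. The class name, the embedding size, and the use of index 0 for padding are assumptions for illustration, not values taken from the notebook.

```python
import torch
from torch import nn


class AveragedEmbeddingClassifier(nn.Module):
    """Embeds each word, averages over the sequence, applies one linear layer."""

    def __init__(self, vocab_size: int, embed_dim: int = 100,
                 num_classes: int = 2, pad_idx: int = 0):
        super().__init__()
        self.pad_idx = pad_idx  # assumed padding index
        self.embedding = nn.Embedding(vocab_size, embed_dim, padding_idx=pad_idx)
        self.fc = nn.Linear(embed_dim, num_classes)

    def forward(self, token_ids: torch.Tensor) -> torch.Tensor:
        # token_ids: (batch, seq_len) word indices, padded with pad_idx
        mask = (token_ids != self.pad_idx).unsqueeze(-1).float()  # (batch, seq_len, 1)
        embedded = self.embedding(token_ids) * mask               # zero out padding
        lengths = mask.sum(dim=1).clamp(min=1.0)                  # real words per example
        averaged = embedded.sum(dim=1) / lengths                  # (batch, embed_dim)
        return self.fc(averaged)                                  # (batch, num_classes)


model = AveragedEmbeddingClassifier(vocab_size=1000)
batch = torch.tensor([[5, 8, 2, 0, 0], [7, 3, 9, 4, 1]])
logits = model(batch)
print(logits.shape)
```

Masking the padding positions before averaging keeps short reviews from being diluted by padding when they are batched with long ones.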

rnn

Recurrent networks implemented in PyTorch: GRU and bi-LSTM.
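
One way such recurrent classifiers might look is sketched below. The class name and layer sizes are hypothetical, and reading "bi-lstm * 2" in the results as a two-layer bidirectional LSTM is an assumption. For brevity the sketch does not pack padded sequences.

```python
import torch
from torch import nn


class RecurrentClassifier(nn.Module):
    """Embedding -> GRU or LSTM -> linear layer on the final hidden state."""

    def __init__(self, vocab_size, embed_dim=100, hidden_dim=128, num_layers=1,
                 bidirectional=False, cell="gru", num_classes=2):
        super().__init__()
        self.bidirectional = bidirectional
        self.embedding = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        rnn_cls = nn.GRU if cell == "gru" else nn.LSTM
        self.rnn = rnn_cls(embed_dim, hidden_dim, num_layers=num_layers,
                           bidirectional=bidirectional, batch_first=True)
        out_dim = hidden_dim * (2 if bidirectional else 1)
        self.fc = nn.Linear(out_dim, num_classes)

    def forward(self, token_ids):
        embedded = self.embedding(token_ids)        # (batch, seq_len, embed_dim)
        _, hidden = self.rnn(embedded)
        if isinstance(hidden, tuple):               # LSTM returns (h_n, c_n)
            hidden = hidden[0]
        # hidden: (num_layers * num_directions, batch, hidden_dim)
        if self.bidirectional:
            # Last layer's forward and backward final states, concatenated.
            final = torch.cat([hidden[-2], hidden[-1]], dim=1)
        else:
            final = hidden[-1]
        return self.fc(final)


batch = torch.tensor([[5, 8, 2, 0, 0], [7, 3, 9, 4, 1]])
gru = RecurrentClassifier(vocab_size=1000, cell="gru")
bilstm = RecurrentClassifier(vocab_size=1000, cell="lstm",
                             bidirectional=True, num_layers=2)
print(gru(batch).shape, bilstm(batch).shape)
```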

tfidf

sklearn TF-IDF features + logistic regression
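
A minimal sketch of this baseline with a scikit-learn pipeline is shown below. The toy data, the 0/1 label encoding, and the `ngram_range` and `max_iter` values are assumptions for illustration, not the settings used in the notebook.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy stand-in data; the real run would load the YelpReviewPolarity splits.
train_texts = [
    "great food and friendly staff",
    "loved it, great service",
    "terrible food and rude staff",
    "awful service, hated it",
]
train_labels = [1, 1, 0, 0]  # 1 = positive, 0 = negative (illustrative encoding)

# Unigram and bigram TF-IDF features feeding a logistic regression classifier.
clf = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2)),  # assumed setting
    LogisticRegression(max_iter=1000),    # assumed setting
)
clf.fit(train_texts, train_labels)

predictions = clf.predict(["great friendly service", "terrible rude staff"])
print(predictions)
```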

bert

Fine-tuning the whole BERT model with an additional linear layer.
To save money (Google Cloud run time...) I use only 100K training examples.
To fit in GPU memory I use only 128 tokens (longer examples are truncated).
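
The 128-token limit can be illustrated with a small sketch that truncates and pads already-tokenized input. The helper name `truncate_for_bert` is hypothetical, and counting the `[CLS]` and `[SEP]` tokens toward the 128 is an assumption. The ids 101 and 102 are the special-token ids of the `bert-base-uncased` vocabulary, used here only as example values.

```python
MAX_LEN = 128  # sequence length mentioned above


def truncate_for_bert(token_ids, cls_id, sep_id, pad_id=0, max_len=MAX_LEN):
    """Cut a tokenized review to max_len (special tokens included) and pad it."""
    body = token_ids[: max_len - 2]            # leave room for [CLS] and [SEP]
    input_ids = [cls_id] + body + [sep_id]
    attention_mask = [1] * len(input_ids)      # 1 marks real tokens
    padding = max_len - len(input_ids)
    return input_ids + [pad_id] * padding, attention_mask + [0] * padding


long_review = list(range(1000, 1300))          # 300 fake word-piece ids
ids, mask = truncate_for_bert(long_review, cls_id=101, sep_id=102)
print(len(ids), sum(mask))

short_ids, short_mask = truncate_for_bert([7, 8, 9], cls_id=101, sep_id=102)
print(short_ids[:6], sum(short_mask))
```

Anything past the cut is simply dropped, so for long reviews the model only sees the opening of the text.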

Current results:

model	precision	train time [mm:ss]
fasttext	0.956	00:26
SimpleNN	0.94	03:47
gru	0.962	08:00
bi-lstm * 2	0.967	25:00
tfidf	0.939	00:16
BERT - fine tuning	###	224:00

