tensorflow-cnn-text-classification-without-vocabulary

A CNN for text classification.

This is a special CNN text classifier that takes labels as usual but takes sentences in the form of vectors; the model itself does not do an embedding lookup. The embedding lookup is done by train.py and predict.py.
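A minimal sketch of what a lookup outside the model could look like, assuming whitespace tokenization, zero vectors for unknown words, and zero-padding to a fixed length. The `EMBEDDINGS` table and the `sentence_to_vectors` helper are hypothetical and are not taken from train.py or predict.py; real GloVe vectors have far more dimensions.

```python
from typing import Dict, List

# Hypothetical toy table standing in for the GloVe pre-trained vectors.
EMBEDDINGS: Dict[str, List[float]] = {
    "good": [0.9, 0.1, 0.0],
    "great": [0.8, 0.2, 0.1],
    "movie": [0.1, 0.7, 0.3],
}
EMBEDDING_DIM = 3


def sentence_to_vectors(sentence: str, max_len: int) -> List[List[float]]:
    """Map each token to its vector, then truncate or zero-pad to max_len rows."""
    zero = [0.0] * EMBEDDING_DIM
    tokens = sentence.lower().split()[:max_len]
    # Unknown tokens fall back to the zero vector (an assumption of this sketch).
    rows = [list(EMBEDDINGS.get(tok, zero)) for tok in tokens]
    # Pad short sentences so every input has the same shape.
    rows += [list(zero) for _ in range(max_len - len(rows))]
    return rows


matrix = sentence_to_vectors("Great movie", max_len=4)
print(len(matrix), len(matrix[0]))
```

The resulting `max_len x EMBEDDING_DIM` matrix is the kind of input a model without its own embedding layer would receive in place of word ids.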

This has been done so that the model can receive vectors for all the words in the GloVe pre-trained data, which helps similar words give the same predictions. If we were to use a vocabulary, the words in a sentence would normally be converted to an array of ids, and words not in the vocabulary would be assigned id 0. This 0 has no specialized vector in GloVe, so let's assume its vector is all zeros: such words contribute nothing to the prediction, and the model is therefore not open to different forms of the same sentence.
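The vocabulary limitation described above can be illustrated with a small sketch. The `VOCAB` mapping and the `to_ids` helper are hypothetical and only show the general id-based scheme, not code from this repository.

```python
from typing import Dict, List

# Hypothetical training vocabulary; id 0 is reserved for unknown words.
VOCAB: Dict[str, int] = {"the": 1, "movie": 2, "was": 3, "good": 4}


def to_ids(sentence: str) -> List[int]:
    """Convert a sentence to vocabulary ids, mapping unknown words to 0."""
    return [VOCAB.get(tok, 0) for tok in sentence.lower().split()]


ids_a = to_ids("the movie was superb")
ids_b = to_ids("the movie was dreadful")

# Both unseen words collapse to id 0, so the two sentences
# become indistinguishable to an id-based model.
print(ids_a == ids_b)
```

With a direct lookup in the pre-trained vectors, "superb" and "dreadful" would each keep their own vector, provided both words exist in the pre-trained data.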

The downside of this approach is that there is a high chance of predictions going completely off the rails when we get creative with the words in a sentence.

*** Experiment with it.
