Quora question duplicate detection using Siamese Manhattan LSTM.

bronwynbiro/QuoraQuestionDuplicates

Quora Question Duplicates

Quora question duplicate detection in Keras, based on the paper "Siamese Recurrent Architectures for Learning Sentence Similarity".
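
For orientation, the "Manhattan" part of the Siamese Manhattan LSTM refers to how the two question encodings are compared: the paper scores a pair as exp(-||h_a - h_b||_1), where h_a and h_b are the final LSTM states of the two questions. The snippet below is a minimal pure-Python sketch of that scoring function only, not the code in this repository; the function name `manhattan_similarity` and the example vectors are made up for illustration.

```python
import math


def manhattan_similarity(h_a, h_b):
    """Return exp(-L1 distance) between two equal-length vectors.

    The result lies in (0, 1]: 1.0 for identical vectors, approaching 0
    as the vectors move apart.
    """
    if len(h_a) != len(h_b):
        raise ValueError("vectors must have the same length")
    l1_distance = sum(abs(a - b) for a, b in zip(h_a, h_b))
    return math.exp(-l1_distance)


# Hypothetical final LSTM states for three questions (made-up numbers).
h_q1 = [0.2, -0.1, 0.4]
h_q2 = [0.2, -0.1, 0.4]
h_q3 = [0.9, 0.5, -0.3]

print(manhattan_similarity(h_q1, h_q2))  # identical states score highest
print(manhattan_similarity(h_q1, h_q3))  # distant states score lower
```

Because the score is already bounded between 0 and 1, it can be compared directly against the 0/1 duplicate label during training. In a Keras model the same expression would be applied to the outputs of the shared LSTM, typically inside a custom or Lambda layer.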

Prerequisites

  • Python 3
  • Install necessary packages (gensim, tensorflow, pandas, numpy)
  • Download word2vec model from Google
  • Download Quora datasets
  • Place the downloads in a folder called input

Todo

Pre-processing:

  • augment data with spell-checking
  • augment data with a thesaurus

Embeddings:

  • train embeddings on questions

Model:

  • pretrain LSTM to choose better weights
