Models for Natural Language Inference (TensorFlow version), including 'A Decomposable Attention Model for Natural Language Inference', ..., to be continued.

Notifications You must be signed in to change notification settings

cxncu001/NLI

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

36 Commits
 
 
 
 
 
 
 
 

Repository files navigation

Models for Natural Language Inference (NLI)

We aim to reproduce some classical models from the Natural Language Inference literature and report their performance on the Stanford Natural Language Inference (SNLI) dataset.

Models

Environments

  • TensorFlow 1.3 or higher
  • Python 3.5
  • NumPy
  • scikit-learn

Data preparation

nliutils.py can be used for data preparation.

  • build_vocab(): Build the vocabulary from the training data.
  • load_vocab(): Load the vocabulary from file.
  • convert_data(): Convert NLI data from 'JSON' format to the following 'TXT' format: gold_label ||| sentence1 ||| sentence2.
  • process_file(): Prepare data for the model, including converting words into indexes according to the vocabulary, padding sentences to a fixed length, creating the corresponding mask arrays, and loading the classification labels into a 1-D array.
  • batch_iter(): Generate batches of data.
  • convert_embeddings(): Convert embeddings from TXT (one word embedding per line) to an easy-to-use format in Python, consisting of a 2-D NumPy array for the embeddings and a dictionary for the vocabulary.
  • pre-trained word embeddings: You can download pre-trained word embeddings from GloVe and use convert_embeddings() to convert them to the format required by the code.
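The padding and masking performed by process_file() can be sketched as follows. This is a simplified illustration, not the repository's actual code; the function name pad_and_mask and its signature are hypothetical:

```python
import numpy as np

def pad_and_mask(token_ids, max_len, pad_id=0):
    """Pad token-id sequences to max_len and build 0/1 mask arrays.

    token_ids: list of lists of vocabulary indexes (one list per sentence).
    Returns (data, mask), both of shape (num_sentences, max_len); mask is 1.0
    at real-token positions and 0.0 at padding positions.
    """
    data = np.full((len(token_ids), max_len), pad_id, dtype=np.int64)
    mask = np.zeros((len(token_ids), max_len), dtype=np.float32)
    for i, seq in enumerate(token_ids):
        seq = seq[:max_len]           # truncate sentences longer than max_len
        data[i, :len(seq)] = seq      # copy real token ids
        mask[i, :len(seq)] = 1.0      # mark real positions
    return data, mask
```

The mask arrays let the model ignore padded positions, e.g. when averaging word vectors or normalizing attention weights over a sentence.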

Hyper-parameters

  • decompose:

Train model: python3 decompose/train.py --embeddings ../../res/embeddings/glove.840B.300d.we --train_em 0 -op adagrad -lr 0.05 --require_improvement 50000000 --vocab ../cdata/snli/vocab.txt -ep 300 --normalize 1 -l2 0.0 -bs 4 --report 16000 --save_per_batch 16000 -cl 100

Test model: python3 decompose/test.py -m modelfile -d testdata
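For orientation, the core of the decompose model is the soft-alignment ("attend") step of the decomposable attention paper: each sentence attends over the other, producing an aligned subphrase for every word. A minimal NumPy sketch of that step (omitting the feed-forward networks F, G, H from the paper, and not the repository's actual TensorFlow code):

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attend(a, b):
    """Soft-align two sentences.

    a: (len_a, d) and b: (len_b, d) word representations.
    Returns beta (len_a, d): for each word in a, the aligned subphrase of b,
    and alpha (len_b, d): for each word in b, the aligned subphrase of a.
    """
    e = a @ b.T                        # unnormalized attention scores (len_a, len_b)
    beta = softmax(e, axis=1) @ b      # normalize over b's words, mix b
    alpha = softmax(e, axis=0).T @ a   # normalize over a's words, mix a
    return beta, alpha
```

In the full model, each word is then compared with its aligned subphrase and the comparison vectors are aggregated to predict entailment, contradiction, or neutral.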

Results

Model       Acc reported in paper    Our Acc
decompose   86.3%                    86.28%
