This repository has been archived by the owner on Mar 19, 2021. It is now read-only.

NeuSum

This repository contains the code for the ACL 2018 paper "Neural Document Summarization by Jointly Learning to Score and Select Sentences".

About this code

PyTorch version: This code requires PyTorch v0.3.x.

Python version: This code requires Python 3.

How to run

Prepare the dataset and code

Make a folder for the code and data:

NEUSUM_HOME=~/workspace/neusum
mkdir -p $NEUSUM_HOME/code
cd $NEUSUM_HOME/code
git clone --recursive https://github.com/magic282/NeuSum.git

After preparation, the workspace looks like:

neusum
├── code
│   └── NeuSum
│       └── neusum_pt
│           ├── neusum
│           └── PyRouge
└── data
    └── cnndm
        ├── dev
        ├── glove
        ├── models
        └── train
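The clone commands above create only the code folder. As a minimal sketch, the data side of the tree could be created as follows. It assumes the four subdirectories start out empty and are filled later; how they are populated is not covered here.

```shell
#!/usr/bin/env bash
# Sketch: create the data directories shown in the workspace tree.
# Assumption: the directories start empty; their contents are not specified here.
NEUSUM_HOME="${NEUSUM_HOME:-$HOME/workspace/neusum}"

for sub in dev glove models train; do
    mkdir -p "$NEUSUM_HOME/data/cnndm/$sub"
done

# List the created directories for a quick visual check against the tree.
find "$NEUSUM_HOME/data" -type d | sort
```

Because `mkdir -p` leaves existing directories untouched, the snippet is safe to re-run.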

The paper uses the CNN / Daily Mail dataset.

About the CNN Daily Mail Dataset

About the CNN Daily Mail Dataset 2

Set up the environment

Package Requirements:

nltk numpy pytorch

Warning: Older versions of NLTK have a bug in the PorterStemmer. Therefore, a fresh installation or update of NLTK is recommended.
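A quick way to confirm the environment matches these requirements is sketched below. The helper names `is_torch_0_3` and `pkg_version` are hypothetical and not part of this repository. The check assumes `python3` is on the `PATH` and relies on the `__version__` attribute of the `torch` package.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed only for version strings in the 0.3.x series,
# e.g. "0.3.0" or "0.3.0.post4".
is_torch_0_3() {
    case "$1" in
        0.3|0.3.*) return 0 ;;
        *) return 1 ;;
    esac
}

# Hypothetical helper: print a Python package's version,
# or nothing if the package cannot be imported.
pkg_version() {
    python3 -c "import $1; print($1.__version__)" 2>/dev/null
}

torch_version="$(pkg_version torch)"
if is_torch_0_3 "$torch_version"; then
    echo "PyTorch $torch_version: OK"
else
    echo "PyTorch '${torch_version:-not installed}' is outside the 0.3.x series" >&2
fi
```

The same `pkg_version` helper can be pointed at `nltk` or `numpy`. If NLTK turns out to be old, `pip install --upgrade nltk` is one way to update it.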

A Docker image is also provided.

Docker image

docker pull magic282/pytorch:0.3.0

Run training

The file run.sh is an example. Modify it according to your configuration.

Without Docker

bash $NEUSUM_HOME/code/NeuSum/neusum_pt/run.sh $NEUSUM_HOME/data/cnndm $NEUSUM_HOME/code/NeuSum/neusum_pt
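Since run.sh takes the data directory and the code directory as its two arguments, a small pre-flight check can catch a mistyped path before training starts. The following is only a sketch; `check_neusum_paths` is a hypothetical helper, not something shipped with this repository.

```shell
#!/usr/bin/env bash
# Hypothetical helper: verify both run.sh arguments before launching training.
# $1 = data directory, $2 = code directory expected to contain run.sh
check_neusum_paths() {
    [ -d "$1" ] || { echo "missing data directory: $1" >&2; return 1; }
    [ -f "$2/run.sh" ] || { echo "missing run.sh in: $2" >&2; return 1; }
}

NEUSUM_HOME="${NEUSUM_HOME:-$HOME/workspace/neusum}"
DATA_DIR="$NEUSUM_HOME/data/cnndm"
CODE_DIR="$NEUSUM_HOME/code/NeuSum/neusum_pt"

# Launch training only when both paths look right.
if check_neusum_paths "$DATA_DIR" "$CODE_DIR"; then
    bash "$CODE_DIR/run.sh" "$DATA_DIR" "$CODE_DIR"
fi
```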

With Docker

nvidia-docker run --rm -ti -v $NEUSUM_HOME:/workspace magic282/pytorch:0.3.0

Then, inside the container:

bash code/NeuSum/neusum_pt/run.sh /workspace/data/cnndm /workspace/code/NeuSum/neusum_pt