This repo contains the code needed to replicate the findings of our ACL 2022 paper of the same title. Our implementation is based on FairSeq.
```bash
conda create --name ki python=3.7
conda activate ki
conda install pytorch -c pytorch
cd KI/
pip install --editable ./
```
File Name | Description | Download |
---|---|---|
knowledge_embedding.hdf5 | Pre-extracted knowledge features. Please put this file under `data/` | Link |
transformer.pt | Checkpoint to replicate the results of the Transformer baseline on the WoW dataset | Link |
transformer_ki.pt | Checkpoint to replicate the results of the Transformer+KI model on the WoW dataset | Link |
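As a quick sanity check after downloading, the baseline `transformer.pt` checkpoint should be loadable through fairseq's standard hub interface. The sketch below is illustrative only: the directory layout, `data-bin` path, and input tokenization are assumptions, and `transformer_ki.pt` may additionally require this repo's custom model code on the import path.

```python
from fairseq.models.transformer import TransformerModel

# Minimal sketch: load the downloaded baseline checkpoint together with the
# binarized WoW data (paths are illustrative; adjust to your setup).
model = TransformerModel.from_pretrained(
    'checkpoints/wow_transformer/',      # directory holding transformer.pt
    checkpoint_file='transformer.pt',
    data_name_or_path='data-bin/wow/',   # binarized dataset with dictionaries
)
model.eval()

# Generate a response for a (pre-tokenized, BPE-applied) source utterance.
print(model.translate('do you like jazz music ?'))
```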
Here we demonstrate how to run the code on the Wizard of Wikipedia (WoW) dataset.
```
# We have included the pre-processed (BPE tokenization, knowledge retrieval) data from the dataset
# in the repo, in the following format (taking the train set as an example):
train.src        # source utterances
train.tgt        # target responses
train.voken.src  # knowledge associated with each token in the source utterance; each piece of
                 # knowledge is represented by an ID, which can be used to obtain its representation.
                 # You need a retriever to generate this file (see below).
```

Please download the `knowledge_embedding.hdf5` file above before training.
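To see how `train.voken.src` and the knowledge features fit together, here is a minimal sketch of looking up per-token knowledge representations. The HDF5 dataset key (`'embedding'`) and the example IDs are assumptions; inspect the file's keys to find the actual layout.

```python
import h5py
import numpy as np

# Minimal sketch: each line of train.voken.src gives one knowledge ID per
# source token; an ID indexes a row of features in knowledge_embedding.hdf5.
with h5py.File('data/knowledge_embedding.hdf5', 'r') as f:
    print(list(f.keys()))              # inspect the stored datasets first
    embeddings = f['embedding'][()]    # hypothetical key: [num_knowledge, dim] array

# Knowledge IDs for the tokens of one source utterance (illustrative values,
# as they would be read from a line of train.voken.src).
voken_ids = np.array([3, 3, 17, 42])
features = embeddings[voken_ids]       # one feature vector per source token
print(features.shape)                  # (4, dim)
```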
```bash
bash run_wow.sh  # training
bash generate.sh -b 5 -d data-bin/wow/ -c checkpoint_last10_avg.pt -s test -p checkpoints/wow_transformer_ki/  # inference
bash evaluate.sh -p checkpoints/wow_transformer_ki/ -s test  # evaluate generated responses
```
Script parameters:
- `-b` beam size
- `-g` GPU id to use
- `-d` data dir
- `-c` checkpoint name
- `-s` test split {valid/test/test1}
- `-p` checkpoint dir
After running the evaluation command above, you should see:
Method | PPL | wikiF1 | BLEU-4 | ROUGE-L | Distinct-1 | Distinct-2 | %safe |
---|---|---|---|---|---|---|---|
Transformer+KI | 51.03 | 14.78 | 2.74 | 12.95 | 5.94 | 21.18 | 35.42 |
Transformer | 49.92 | 13.56 | 2.33 | 12.88 | 4.13 | 12.71 | 59.19 |
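For reference, Distinct-n is the ratio of unique n-grams to the total number of n-grams across the generated responses, so higher values indicate more diverse output. A minimal sketch of the computation (not necessarily identical to what `evaluate.sh` does):

```python
from collections import Counter

def distinct_n(responses, n):
    """Ratio of unique n-grams to total n-grams over tokenized responses."""
    ngrams = Counter()
    for tokens in responses:
        ngrams.update(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))
    total = sum(ngrams.values())
    return len(ngrams) / total if total > 0 else 0.0

responses = [r.split() for r in ["i love jazz music", "i love pop music"]]
print(distinct_n(responses, 1))  # 5 unique unigrams / 8 total = 0.625
print(distinct_n(responses, 2))  # 5 unique bigrams / 6 total ~= 0.833
```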
Run `bash run_baseline.sh` to get results for the Transformer baseline.
Notes: These numbers differ slightly from those reported in the paper, since the experiments were replicated on different machines and Python environments. To replicate the results in the paper, you can download the trained checkpoints from the links above.
The code for training and inference of the retriever will be released in another repo.
I have not had time to clean up that code recently, but if you need it for your work, please do not hesitate to email me for an uncleaned version (with basic documentation on how to run the experiments).