Code for the paper Visually Grounded Compound PCFGs by Yanpeng Zhao and Ivan Titov @ EMNLP 2020.
Update (06/02/2022): Please look in vpcfg_text.py, vpcfg.py, and vpcfg_image.py for parsing MSCOCO and Flickr30k captions, creating data splits, and encoding images, respectively, for VC-PCFG. NOTE that VG-NSL and VC-PCFG used a pre-trained ResNet-152 to encode images but mistakenly reported it as ResNet-101. We have corrected it in the image encoding script.
Update (12/12/2021): VC-PCFG has been integrated into a repo dedicated to CFG-focused models.
The processed data can be downloaded here.
The best checkpoints of VC-PCFG can be downloaded here (MD5 checksum: d7de50d06590004061d24a69db0ac64c).
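To confirm that a checkpoint download is intact, its MD5 digest can be compared with the checksum quoted above. The snippet below is a minimal sketch using only the Python standard library; the helper names `md5_of_file` and `checksum_matches` and the example file name are illustrative assumptions, not part of this repo.

```python
import hashlib

# MD5 checksum quoted in this README for the VC-PCFG checkpoint download.
EXPECTED_MD5 = "d7de50d06590004061d24a69db0ac64c"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks to bound memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter(callable, sentinel) keeps reading until read() returns b"" at EOF.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checksum_matches(path, expected=EXPECTED_MD5):
    """Return True if the file's MD5 digest equals the expected hex string."""
    return md5_of_file(path) == expected.strip().lower()


# Hypothetical usage (the file name is an assumption; use whatever you downloaded):
#   checksum_matches("vpcfg_checkpoints.zip")
```

A `False` result suggests the download is incomplete or corrupted and should be repeated.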
python train.py \
--data_path $DATA_PATH \
--logger_name $SAVE_PATH
python train.py \
--visual_mode "O" \
--data_path $DATA_PATH \
--logger_name $SAVE_PATH
In both cases, please specify DATA_PATH and SAVE_PATH before running (DATA_PATH is where the downloaded data resides; SAVE_PATH is where your model will be saved).
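If you prefer to launch training from a Python script instead of exporting shell variables, the two commands above can be assembled programmatically. This is only a sketch: `build_train_command` is a hypothetical helper, the paths are placeholders, and it uses only the flags that appear in the commands above without assuming anything else about `train.py`.

```python
import shlex


def build_train_command(data_path, save_path, visual_mode=None):
    """Assemble the train.py argument list from the flags shown in this README."""
    command = ["python", "train.py"]
    if visual_mode is not None:
        # Only added when requested, mirroring the second command above.
        command += ["--visual_mode", visual_mode]
    command += ["--data_path", data_path, "--logger_name", save_path]
    return command


# Placeholder paths (assumptions); replace with your own DATA_PATH and SAVE_PATH.
plain = build_train_command("/path/to/data", "/path/to/save")
with_mode = build_train_command("/path/to/data", "/path/to/save", visual_mode="O")

# shlex.quote keeps the printed command safe to paste into a shell.
print(" ".join(shlex.quote(arg) for arg in plain))
print(" ".join(shlex.quote(arg) for arg in with_mode))
```

The returned list can be passed directly to `subprocess.run(command, check=True)` from the repo root, which avoids shell quoting issues when a path contains spaces.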
Remember to specify MODEL_FILE and DATA_PATH first.
python eval.py \
--model $MODEL_FILE \
--data_path $DATA_PATH
This code requires a tailored version of Torch-Struct.
git clone --branch beta https://github.com/zhaoyanpeng/vpcfg.git
cd vpcfg
virtualenv -p python3.7 ./pyenv/oops
source ./pyenv/oops/bin/activate
pip install --upgrade pip
pip install --upgrade setuptools
pip install -r requirements.txt
git clone --branch infer_pos_tag https://github.com/zhaoyanpeng/pytorch-struct.git
cd pytorch-struct
pip install -e .
This repo is developed based on VGNSL, C-PCFGs, and Torch-Struct.
MIT