Learning Distinct and Representative Modes for Image Captioning (NeurIPS 2022)

This repo provides the implementation of the paper Learning Distinct and Representative Modes for Image Captioning.

Install

pip install torch torchvision --extra-index-url https://download.pytorch.org/whl/cu116
pip install transformers yacs scipy
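
To confirm that the packages above are visible to the interpreter before training, a quick check along these lines may help. This is a minimal sketch and not part of the repo; the helper name `missing_packages` is hypothetical, and it only tests that each package can be found, not that the CUDA build matches your driver.

```python
import importlib.util

# Packages named in the install commands above.
REQUIRED = ["torch", "torchvision", "transformers", "yacs", "scipy"]


def missing_packages(names=REQUIRED):
    """Return the names that the import system cannot locate."""
    # find_spec returns None for a top-level package that is not installed.
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages()
    print("missing:", ", ".join(missing) if missing else "none")
```

If anything is listed as missing, rerun the corresponding pip command in the same environment.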

Data

Prepare the data by following the instructions in VLP.

Run

python -m modecap.train data_dir PATH_TO_DATA
python -m modecap.inference data_dir PATH_TO_DATA model_path PATH_TO_MODEL
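
Note that the options are passed as bare key/value pairs (data_dir PATH_TO_DATA) rather than as --flags. Since yacs is among the dependencies, these pairs are presumably merged into a config in the style of yacs' merge_from_list; that is an assumption, not something stated here. The sketch below mimics that behavior with the standard library only. The function name `apply_overrides` and the default values are hypothetical; the real option names and defaults live in the repo's config.

```python
import sys

# Hypothetical defaults for illustration; only the two keys used in the
# commands above are included.
DEFAULTS = {"data_dir": "", "model_path": ""}


def apply_overrides(defaults, args):
    """Merge a flat [key, value, key, value, ...] list into a copy of defaults."""
    if len(args) % 2 != 0:
        raise ValueError("overrides must come in key/value pairs")
    cfg = dict(defaults)
    for key, value in zip(args[0::2], args[1::2]):
        # Reject unknown keys so that typos fail loudly.
        if key not in cfg:
            raise KeyError(f"unknown option: {key}")
        cfg[key] = value
    return cfg


if __name__ == "__main__":
    print(apply_overrides(DEFAULTS, sys.argv[1:]))
```

Under this reading, every key must be followed by exactly one value, and a misspelled key is an error rather than being silently ignored.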