Prepare the datasets before running experiments. In the following, `$ROOT` denotes the project directory; annotation generation is run from `$ROOT/data`.

- Download the cleaned referring-expression datasets and extract them into the `$ROOT/data` folder.
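The extraction step can be sketched in Python as follows. This is a minimal sketch: the archive file names you pass in depend on what you actually downloaded, and are not specified by this document.

```python
# Sketch: extract downloaded annotation archives into the data folder.
# Which archives exist and what they are named is an assumption --
# adjust the list to the files you actually downloaded.
import zipfile


def extract_all(archives, data_root):
    """Extract each zip archive in `archives` into `data_root`."""
    for path in archives:
        with zipfile.ZipFile(path) as zf:
            zf.extractall(data_root)
```

For example, `extract_all(["refcoco.zip", "refcoco+.zip"], os.path.expandvars("$ROOT/data"))` would unpack both archives under `$ROOT/data`.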
- Prepare the MSCOCO train2014 images, the original Flickr30K images, the ReferItGame images, and the Visual Genome images, and unzip the annotations. The file structure should then look like:
```
$ROOT/data
|-- refcoco
|   |-- instances.json
|   |-- refs(google).p
|   |-- refs(unc).p
|-- refcoco+
|   |-- instances.json
|   |-- refs(unc).p
|-- refcocog
|   |-- instances.json
|   |-- refs(google).p
|   |-- refs(umd).p
|-- refclef
|   |-- instances.json
|   |-- refs(berkeley).p
|   |-- refs(unc).p
|-- images
|   |-- train2014
|   |-- refclef
|   |-- flickr
|   |-- VG
```
- Run data_process.py to generate the annotations. For example, run the following commands to generate the annotations for RefCOCO:

```shell
cd $ROOT/data
python data_process.py --data_root $ROOT/data --output_dir $ROOT/data --dataset refcoco --split unc --generate_mask
```

Use `--dataset={'refcoco', 'refcoco+', 'refcocog', 'refclef'}` to set the dataset to be processed.
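To process every dataset in one go, the command above can be generated per dataset. This is a sketch, not part of the repo: the per-dataset split names below are an assumption inferred from the `refs(*).p` files listed earlier, and should be checked against `data_process.py`.

```python
# Sketch: build and run the data_process.py command for each dataset.
# The split chosen for each dataset is an assumption (see lead-in).
import subprocess

SPLITS = {"refcoco": "unc", "refcoco+": "unc",
          "refcocog": "umd", "refclef": "berkeley"}


def build_cmd(dataset, data_root="$ROOT/data"):
    """Return the argv list for processing one dataset."""
    return [
        "python", "data_process.py",
        "--data_root", data_root,
        "--output_dir", data_root,
        "--dataset", dataset,
        "--split", SPLITS[dataset],
        "--generate_mask",
    ]


def process_all(data_root):
    """Run data_process.py once per dataset, from the data folder."""
    for dataset in SPLITS:
        subprocess.run(build_cmd(dataset, data_root), check=True, cwd=data_root)
```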
For Flickr and the merged pre-training data, we provide the pre-processed json files: flickr.json and merge.json.
Note: the merged pre-training data contains the training data from RefCOCO train, RefCOCO+ train, RefCOCOg train, ReferIt train, Flickr train, and VG. We also remove the images appearing in the validation and test sets of RefCOCO, RefCOCO+, and RefCOCOg.
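The overlap removal described in the note can be sketched as a simple set filter. The `"image_id"` field name is an assumption about the annotation format, used here only for illustration.

```python
# Sketch of the overlap filtering described above: drop any merged training
# annotation whose image also appears in a held-out (val/test) split.
# The "image_id" key is an assumed field name, not confirmed by the repo.

def filter_overlap(train_anns, heldout_anns):
    """Keep training annotations whose image is absent from the held-out splits."""
    heldout_images = {a["image_id"] for a in heldout_anns}
    return [a for a in train_anns if a["image_id"] not in heldout_images]
```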
- At this point the directory `$ROOT/data` should look like:
```
$ROOT/data
|-- refcoco
|   |-- instances.json
|   |-- refs(google).p
|   |-- refs(unc).p
|-- refcoco+
|   |-- instances.json
|   |-- refs(unc).p
|-- refcocog
|   |-- instances.json
|   |-- refs(google).p
|   |-- refs(umd).p
|-- anns
|   |-- refcoco
|   |   |-- refcoco.json
|   |-- refcoco+
|   |   |-- refcoco+.json
|   |-- refcocog
|   |   |-- refcocog.json
|   |-- refclef
|   |   |-- refclef.json
|   |-- flickr
|   |   |-- flickr.json
|   |-- merge
|   |   |-- merge.json
|-- masks
|   |-- refcoco
|   |-- refcoco+
|   |-- refcocog
|   |-- refclef
|-- images
|   |-- train2014
|   |-- refclef
|   |-- flickr
|   |-- VG
|-- weights
|   |-- pretrained_weights
```
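A quick way to confirm the layout matches the tree above is a small sanity check. This is a sketch; the list of expected paths below only samples a few entries from the tree and can be extended.

```python
# Sketch: verify that key files/folders from the tree above exist under
# $ROOT/data. EXPECTED samples a few paths; extend it as needed.
import os

EXPECTED = [
    "anns/refcoco/refcoco.json",
    "anns/refcoco+/refcoco+.json",
    "anns/refcocog/refcocog.json",
    "anns/refclef/refclef.json",
    "images/train2014",
    "masks/refcoco",
]


def check_layout(data_root):
    """Return the expected paths that are missing under data_root."""
    return [p for p in EXPECTED
            if not os.path.exists(os.path.join(data_root, p))]
```

An empty return value means all sampled paths are in place.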
We provide the weights of the visual backbones pretrained on MS-COCO, with all images appearing in the val+test splits of RefCOCO, RefCOCO+, and RefCOCOg removed. Please download the following weights into `$ROOT/data/weights`.
| Pretrained Weights of Backbone | Link |
|---|---|
| DarkNet53-coco | OneDrive, Baidu Cloud |
| CSPDarkNet-coco | OneDrive, Baidu Cloud |
| Vgg16-coco | OneDrive, Baidu Cloud |
| DResNet101-voc | OneDrive, Baidu Cloud |
We also provide the weights of SimREC pretrained on 0.2M images.
| Pretrained Weights of REC Models | Link |
|---|---|
| SimREC (merge) | OneDrive, Baidu Cloud |