The dataset (excluding VG_100K) is provided here and includes:

```
├─ vg_stats.pt               # frequency bias
├─ zeroshot_triplet.pytorch
└─ stanford_filtered/
   ├─ image_data.json
   ├─ VG-SGG-dicts.json      # with split_GLIPunseen added
   └─ VG-SGG.h5
```

Download VG150 following Scene-Graph-Benchmark.pytorch.
The following is adapted from [Danfei Xu](https://github.com/danfeiX/scene-graph-TF-release/blob/master/data_tools/README.md) and [neural-motifs](https://github.com/rowanz/neural-motifs).
Note that our codebase is intended to support an attribute head as well, so our ```VG-SGG.h5``` and ```VG-SGG-dicts.json``` differ from the original versions in [Danfei Xu](https://github.com/danfeiX/scene-graph-TF-release/blob/master/data_tools/README.md) and [neural-motifs](https://github.com/rowanz/neural-motifs). We add attribute information and rename the files to ```VG-SGG-with-attri.h5``` and ```VG-SGG-dicts-with-attri.json```; the code used to generate them is located at ```datasets/vg/generate_attribute_labels.py```. Although we encourage later researchers to explore the value of attribute features, in our paper "Unbiased Scene Graph Generation from Biased Training" we follow the conventional setting and turn off the attribute head in both detector pretraining and relationship prediction for fair comparison, as does the default setting of this codebase.
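If you want to verify the download or peek at the attribute annotations, here is a minimal inspection sketch (not part of the codebase), assuming ```h5py``` is installed and the file sits at ```datasets/vg/VG-SGG-with-attri.h5```:

```python
# Minimal sketch: list every dataset stored in the attribute-augmented
# annotation file so you can confirm the box/label/relationship arrays
# (and the added attribute array) are present.
import h5py

with h5py.File("datasets/vg/VG-SGG-with-attri.h5", "r") as f:
    for name, dset in f.items():
        print(name, getattr(dset, "shape", None), getattr(dset, "dtype", None))
```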
### Download:
1. Download the VG images [part1 (9 GB)](https://cs.stanford.edu/people/rak248/VG_100K_2/images.zip) and [part2 (5 GB)](https://cs.stanford.edu/people/rak248/VG_100K_2/images2.zip), and extract them to the directory `datasets/vg/VG_100K`. If you want to use another directory, please link it in `DATASETS['VG_stanford_filtered']['img_dir']` of `maskrcnn_benchmark/config/paths_catalog.py` (a Python download/extract sketch follows this list).
2. Download the [scene graphs](https://1drv.ms/u/s!AmRLLNf6bzcir8xf9oC3eNWlVMTRDw?e=63t7Ed) and extract them so that the h5 file is at `datasets/vg/VG-SGG-with-attri.h5`, or edit the path in `DATASETS['VG_stanford_filtered_with_attribute']['roidb_file']` of `maskrcnn_benchmark/config/paths_catalog.py`.
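If you prefer scripting step 1, the following is a minimal sketch assuming a plain Python environment; for ~14 GB of archives, `wget`/`curl` may be more practical, and depending on the archive layout you may need to merge both extracted folders into `VG_100K`:

```python
# Illustration only: download both VG image archives and unpack them
# next to datasets/vg/VG_100K.
import urllib.request
import zipfile
from pathlib import Path

vg_root = Path("datasets/vg")
vg_root.mkdir(parents=True, exist_ok=True)

for url in [
    "https://cs.stanford.edu/people/rak248/VG_100K_2/images.zip",
    "https://cs.stanford.edu/people/rak248/VG_100K_2/images2.zip",
]:
    zip_path = vg_root / url.rsplit("/", 1)[-1]
    urllib.request.urlretrieve(url, zip_path)   # download the archive
    with zipfile.ZipFile(zip_path) as zf:
        zf.extractall(vg_root)                  # unpack the images
```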
Since GroundingDINO pre-training has seen part of the VG150 test images, we remove these images and generate a new split, `split_GLIPunseen`, as VS3 did (please refer to `tools/cleaned_split_GLIPunseen.ipynb`).
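Conceptually, the cleaning step amounts to dropping any VG150 test image whose id occurs in the detector's pre-training data; the actual ids and logic live in the notebook above, and the values below are made-up placeholders:

```python
# Conceptual sketch of the split cleaning (see tools/cleaned_split_GLIPunseen.ipynb
# for the real implementation). The id values here are examples, not real data.
test_image_ids = [2345678, 2345679, 2345680]   # original VG150 test ids (placeholders)
pretraining_seen_ids = {2345679}               # ids seen during GroundingDINO pre-training (placeholder)

split_GLIPunseen = [i for i in test_image_ids if i not in pretraining_seen_ids]
print(split_GLIPunseen)   # -> [2345678, 2345680]
```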
The dataset is organized as follows:

```
data/
└─ visual_genome/
   ├─ VG_100K/
   ├─ vg_stats.pt               # frequency bias
   ├─ zeroshot_triplet.pytorch
   └─ stanford_filtered/
      ├─ image_data.json
      ├─ VG-SGG-dicts.json      # with split_GLIPunseen added
      └─ VG-SGG.h5
```
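An optional sanity check, sketched under the layout above (adjust `root` if your data lives elsewhere):

```python
# Verify that the expected Visual Genome files/directories exist.
from pathlib import Path

root = Path("data/visual_genome")
expected = [
    "VG_100K",
    "vg_stats.pt",
    "zeroshot_triplet.pytorch",
    "stanford_filtered/image_data.json",
    "stanford_filtered/VG-SGG-dicts.json",
    "stanford_filtered/VG-SGG.h5",
]
for rel in expected:
    status = "ok     " if (root / rel).exists() else "MISSING"
    print(status, rel)
```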
Helpful links:
- `zeroshot_triplet.pytorch`
- `vg_stats.pt` is generated by https://github.com/KaihuaTang/Scene-Graph-Benchmark.pytorch/blob/master/maskrcnn_benchmark/data/build.py#L21
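To inspect `vg_stats.pt`, here is a minimal sketch, assuming the file stores the statistics dictionary produced by the utility linked above (the exact keys, e.g. the foreground matrix and predicate distribution, may differ between versions, so it only lists them):

```python
# Load the frequency-bias statistics and print each entry's shape/type.
import torch

stats = torch.load("data/visual_genome/vg_stats.pt", map_location="cpu")
for key, value in stats.items():
    shape = tuple(value.shape) if hasattr(value, "shape") else type(value).__name__
    print(key, shape)
```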
Download the original COCO data into the `data/` folder.
The dataset is organized as follows:

```
data/
└─ coco/
   ├─ train2017/
   ├─ val2017/
   └─ annotations/
      ├─ instances_train2017.json
      ├─ instances_val2017.json
      ├─ captions_train2017.json
      ├─ captions_val2017.json
      ├─ captions_train2017_triple.json   # generated by the scene-graph parser
      └─ captions_val2017_triple.json     # generated by the scene-graph parser
```
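As a quick check that the caption annotations are in place, the following sketch peeks at the standard COCO caption format (the layout of the `*_triple.json` files is produced by the parser and not shown here):

```python
# Load the original COCO captions and print one example.
# Standard COCO caption format: {"images": [...], "annotations": [{"image_id", "caption", ...}]}.
import json

with open("data/coco/annotations/captions_train2017.json") as f:
    caps = json.load(f)

print(len(caps["annotations"]), "captions")
first = caps["annotations"][0]
print(first["image_id"], "->", first["caption"])
```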
You can download `captions_train2017_triple.json`, `captions_val2017_triple.json`, `coco_nouns.txt`, and `coco_relations.txt` from Hugging Face.
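If those files are hosted in a Hugging Face dataset repository, they can be fetched with `huggingface_hub`; the repository id below is a placeholder, not the real one:

```python
# Sketch: fetch the parser outputs from a (placeholder) Hugging Face dataset repo.
from huggingface_hub import hf_hub_download

repo_id = "your-org/your-sgg-assets"   # placeholder; substitute the actual repo id
for name in [
    "captions_train2017_triple.json",
    "captions_val2017_triple.json",
    "coco_nouns.txt",
    "coco_relations.txt",
]:
    path = hf_hub_download(repo_id=repo_id, filename=name, repo_type="dataset")
    print("downloaded to", path)
```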