Data

Data Overview

Dataset	JSON file	JSON size	Images size
VersaD	list_pretrain.json	1.02 GB	140 GB
VHM_SFT	list_sft.json	1.5 GB	124 GB
VHM_Eval	-	-	7.35 GB
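
Before downloading, it can help to add up the sizes in the table and compare them with the free space on the target disk. The following is a minimal sketch: the sizes are copied from the table, while `SIZES_GB`, `required_gb`, `has_room`, and the `headroom` multiplier are hypothetical names introduced here. It treats 1 GB as 10^9 bytes, which is an assumption, since the table does not say whether GB or GiB is meant.

```python
import shutil

# Sizes in GB, copied from the overview table.
# VHM_Eval lists no JSON size, so it is counted as 0 here.
SIZES_GB = {
    "VersaD": {"json": 1.02, "images": 140.0},
    "VHM_SFT": {"json": 1.5, "images": 124.0},
    "VHM_Eval": {"json": 0.0, "images": 7.35},
}


def required_gb(names):
    """Total size in GB (JSON plus images) for the given dataset names."""
    return sum(SIZES_GB[n]["json"] + SIZES_GB[n]["images"] for n in names)


def has_room(path, names, headroom=1.0):
    """Check that the disk holding `path` has enough free space.

    `headroom` is a hypothetical safety multiplier, e.g. 2.0 if archives
    are extracted next to the downloaded files.
    """
    free_gb = shutil.disk_usage(path).free / 1e9  # assumes 1 GB = 10^9 bytes
    return free_gb >= required_gb(names) * headroom
```

For example, `required_gb(SIZES_GB)` sums all three datasets to roughly 273.87 GB, and `has_room(".", ["VersaD"])` checks the current disk for the pretraining data alone.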

VersaD Dataset

This dataset is curated from crowdai, cvact, cvusa, fmow, loveda, millionAID, and other sources, yielding a total of 140M high-quality image-text pairs produced with the help of Gemini-Vision. It is used for pretraining VHM.

SFT Dataset

This dataset contains the VersaD-Instruct, HnstD, and VariousRS-Instruct sub-datasets. The images come from public datasets such as BANDON, DOIR, DOTA, FBP, METER-ML, MSAR, Mts-WH, NWPU-RESISC45, RSITMD, RSVQA, UCM, crowdAI, deepglobe, fair1M, and fmow; the instructions are built from their original labels.

VHM_Eval Dataset

This dataset collects all evaluation data used in the paper, covering Tables 5-9 and Table 11. The tasks map to files as follows:

Task name	Question type	JSON file
Honst Tasks
Honst-presence	open-end	presence_mo_dota.json
Honst-object absolute position	multi-choice	abspos_dota-test_mc.json
Honst-object absolute position-false	multi-choice	abspos_false_dota-test_adversarial_mc.json abspos_false_dota-test_popular_mc.json abspos_false_dota-test_random_mc.json
Honst-object relative position	multi-choice	relpos_dota-test_mc.json
Honst-object relative position-false	multi-choice	relpos_false_dota-test_adversarial_mc.json relpos_false_dota-test_popula_mc.json relpos_false_dota-test_random_mc.json
Honst-color	open-end	color_dota-test_fair1m-val_open.json
Honst-color-false	open-end	color_false_dota-test_random_open.json color_false_dota-test_popular_open.json color_false_dota-test_adversarial_open.json
Honst-color-pan false	open-end	color_false_pan_dota-test.json
VariousRS Tasks
Scene Classification	open-end	cls_WHU_RS19.json cls_SIRI_WHU.json cls_NWPU_RESISC45.json cls_METER_ML.json cls_AID.json
Building footprint vectorization	open-end	bfv_crowdai_val.json
Counting	open-end	counting_dota-test_open.json
Image Resolution	open-end	gsd_dota_fbp.json
Image Modality	multi-choice	imgType_mcq.json
Multi-label Classification	open-end	mlc_fbp_test.json mlc_gid_test.json
Geometric Measurement	open-end	obj_meas_dota_test.json
RSVQA-HR*	open-end	RSVQA_HR-comp_RSVQA.json RSVQA_HR-presence_RSVQA.json
RSVQA-LR*	open-end	RSVQA_LR-presence_RSVQA.json RSVQA_LR-presence_RSVQA.json RSVQA_LR-rural_urban_RSVQA.json
Visual Grounding	open-end	VG_DOIR_RSVG_test.json

* indicates that the provided data is a randomly sampled subset; to evaluate on the entire dataset, you need to download it yourself.
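
The task-to-file mapping above can be used to check that a download is complete. Below is a minimal sketch with only a few rows copied from the table; `TASK_FILES` and `missing_eval_files` are hypothetical names, and the code assumes that the JSON files sit directly under one evaluation directory, which the table does not confirm.

```python
from pathlib import Path

# A few task-to-file entries copied from the table; extend with the rest as needed.
TASK_FILES = {
    "Honst-presence": ["presence_mo_dota.json"],
    "Counting": ["counting_dota-test_open.json"],
    "Image Modality": ["imgType_mcq.json"],
    "Multi-label Classification": ["mlc_fbp_test.json", "mlc_gid_test.json"],
}


def missing_eval_files(eval_dir, task_files=TASK_FILES):
    """Return {task: [missing file names]} for files not found under eval_dir.

    Assumes a flat layout: every JSON file directly inside eval_dir.
    """
    root = Path(eval_dir)
    missing = {}
    for task, names in task_files.items():
        absent = [name for name in names if not (root / name).is_file()]
        if absent:
            missing[task] = absent
    return missing
```

An empty dictionary from `missing_eval_files(eval_dir)` means every listed file was found; otherwise the keys name the tasks that cannot be run yet.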

Data preparation

Pretrain stage dataset preparation

Please download the VersaD dataset.
Prepare the datasets according to the file structure shown below, where pretrain_base denotes the root directory of the entire pretrain dataset.

{pretrain_base}/
    # image dirs
    crowdai/
        image0.jpg
        image1.jpg
        ...
        imagexx.jpg
    cvusa/
    ...

    # json files
    list_pretrain.json
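
The layout above can be checked with a short script before starting training. This is a sketch only: `check_layout` is a hypothetical helper, and because the full list of image directories is elided in the tree, the caller has to pass the directory names it expects.

```python
from pathlib import Path


def check_layout(base, json_name, image_dirs):
    """Return a list of problems found; an empty list means the layout looks right."""
    root = Path(base)
    problems = []
    # The annotation list must sit at the top level of the dataset root.
    if not (root / json_name).is_file():
        problems.append(f"missing {json_name}")
    # Each expected image directory must exist and contain at least one entry.
    for name in image_dirs:
        sub = root / name
        if not sub.is_dir():
            problems.append(f"missing image dir {name}/")
        elif not any(sub.iterdir()):
            problems.append(f"image dir {name}/ is empty")
    return problems
```

For the pretraining data this could be called as `check_layout(pretrain_base, "list_pretrain.json", ["crowdai", "cvusa"])`, adding whichever other image directories the download contains.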

SFT stage dataset preparation

Please download the VHM_SFT dataset.
Prepare the datasets according to the file structure shown below, where sft_base denotes the root directory of the entire SFT dataset.

{sft_base}/
    # image dirs
    BANDON/
        image0.jpg
        image1.jpg
        ...
        imagexx.jpg
    DOTA-train/
    ...

    # json files
    list_sft.json
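
Once the files are in place, the JSON list can be cross-checked against the image directories. The sketch below rests on an assumption that is not documented here: that the JSON file is a list of records, each holding an image path relative to the dataset root under an `"image"` key. `missing_images`, `image_key`, and `limit` are hypothetical names; adjust `image_key` if the actual schema differs.

```python
import json
from pathlib import Path


def missing_images(base, json_name, image_key="image", limit=20):
    """Return up to `limit` image paths that the JSON references but the disk lacks.

    Assumes the JSON top level is a list of dicts and that `image_key`
    holds a path relative to `base` (an assumed schema, not a documented one).
    """
    root = Path(base)
    # json.load reads the whole file into memory; list_sft.json is about 1.5 GB.
    with open(root / json_name, encoding="utf-8") as f:
        records = json.load(f)
    missing = []
    for record in records:
        rel = record.get(image_key)
        if rel and not (root / rel).is_file():
            missing.append(rel)
            if len(missing) >= limit:
                break  # stop early; a few examples are enough to diagnose
    return missing
```

Calling `missing_images(sft_base, "list_sft.json")` and getting an empty list suggests the image directories match the annotation file, at least under the assumed schema.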

Important notice: For convenience, we provide a zip file of the web data. These images may be used for academic purposes only.
