By Yuxiang Ji*, Boyong He*, Zhuoyue Tan, Liaoni Wu
Localization along a flight trajectory after pre-training on the GTA-UAV dataset.
- Part I: Dataset
- Part II: Train and Test
- Part III: Pre-trained Checkpoints
- [Dec 10, 2024]: Game4Loc is accepted by AAAI'25
- [Sep 28, 2024]: Official GTA-UAV dataset release
- Dataset Highlights
- Dataset Access
- Dataset Structure
- Train and Test
- More Features
- Pre-trained Checkpoints
- License
- Acknowledgments
- Citation
The GTA-UAV dataset provides a large continuous-area dataset (covering 81.3 km²) for UAV visual geo-localization, expanding the previously aligned drone-satellite pairs to arbitrary drone-satellite pairs to better match real-world application scenarios. Our dataset contains:
- 33,763 simulated drone-view images, captured at multiple altitudes (80–650 m), multiple attitudes, and in multiple scenes (urban, mountain, coast, forest, etc.).
- 14,640 tiled satellite-view images from 4 zoom levels for arbitrary pairing.
- Overlap (as IoU) of the field of view (FoV) for each drone-satellite pair.
- Drone (camera) 6-DoF labels for each drone image.
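The per-pair IoU overlap above can be illustrated with a minimal sketch. This is only a hedged illustration assuming axis-aligned rectangular ground footprints; the dataset's actual FoV projection accounts for camera pose and may differ:

```python
def rect_iou(a, b):
    """IoU of two axis-aligned rectangles given as (x_min, y_min, x_max, y_max)."""
    # Width/height of the intersection, clamped at zero when disjoint.
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

# A drone footprint half-overlapping a satellite tile of equal size:
print(rect_iou((0, 0, 100, 100), (50, 0, 150, 100)))  # prints 0.3333333333333333 (1/3)
```

Pairs with a high IoU act as positives, while partially overlapping tiles form the semi-positive pairs described below.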
The dataset is released in two versions: low resolution (512×384, 12.8 GB) and high resolution (1920×1440, 133.6 GB).
| Low Resolution Version | High Resolution Version |
| --- | --- |
| HuggingFace | Released soon |
| BaiduDisk | Released soon |
The high resolution dataset will be released soon.
```
└── GTA-UAV
    ├── drone/
    │   └── images/
    │       ├── 200_0001_0000000001.png
    │       ├── 200_0001_0000000002.png
    │       └── ...
    ├── satellite/
    │   ├── 6_0_0_0.png
    │   ├── 6_0_0_1.png
    │   └── ...
    ├── cross-area-drone2sate-train.json
    ├── cross-area-drone2sate-test.json
    ├── same-area-drone2sate-train.json
    └── same-area-drone2sate-test.json
```
Each entry provides a detailed description and the paired satellite images for a single drone image in the training/test set:
```json
{
  "drone_img_dir": "drone/images",
  "drone_img_name": "500_0001_0000025682.png",
  "drone_loc_x_y": [4472.2036708036, 9460.91532053518],
  "sate_img_dir": "satellite",
  "pair_pos_sate_img_list": [
    "4_0_6_13.png"
  ],
  "pair_pos_sate_weight_list": [
    0.47341428718085427
  ],
  "pair_pos_sate_loc_x_y_list": [
    [4492.8, 9331.2]
  ],
  "pair_pos_semipos_sate_img_list": [
    "4_0_6_13.png",
    "5_0_12_27.png",
    "5_0_13_27.png"
  ],
  "pair_pos_semipos_sate_weight_list": [
    0.47341428718085427,
    0.27864086433392504,
    0.2149980058725643
  ],
  "pair_pos_semipos_sate_loc_x_y_list": [
    [4492.8, 9331.2],
    [4320.0, 9504.0],
    [4665.6, 9504.0]
  ],
  "drone_metadata": {
    "height": 494.70794677734375,
    "drone_roll": -0.2723846435546875,
    "drone_pitch": 1.981452226638794,
    "drone_yaw": 84.99999237060547,
    "cam_roll": -90.27238464355469,
    "cam_pitch": 1.981452226638794,
    "cam_yaw": 84.99999237060547
  }
}
```
- `drone_loc_x_y`: The 2D location of the centre of the drone-view image.
- `pair_pos_sate_img(weight/loc_x_y)_list`: The positively paired satellite image / weight (IoU) / 2D location lists.
- `pair_pos_semipos_sate_img(weight/loc_x_y)_list`: The positively & semi-positively paired satellite image / weight (IoU) / 2D location lists.
- `drone_metadata`: The height (altitude above ground level), drone pose (roll, pitch, yaw), and camera pose (roll, pitch, yaw).
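Reading these metadata files is straightforward with the standard library. The sketch below is a hedged illustration (not part of the official codebase) that assumes the top-level JSON is a list of entries shaped like the example above; `load_pairs` is a hypothetical helper name:

```python
import json
import os

def load_pairs(data_root, meta_file="cross-area-drone2sate-train.json"):
    """Yield (drone_path, sate_path, iou_weight) triples from a GTA-UAV
    metadata file, using the positive + semi-positive pairings."""
    with open(os.path.join(data_root, meta_file)) as f:
        entries = json.load(f)  # assumed: a list of entry dicts
    for e in entries:
        drone_path = os.path.join(data_root, e["drone_img_dir"], e["drone_img_name"])
        # Each drone image pairs with one or more satellite tiles, weighted by IoU.
        for name, weight in zip(e["pair_pos_semipos_sate_img_list"],
                                e["pair_pos_semipos_sate_weight_list"]):
            yield drone_path, os.path.join(data_root, e["sate_img_dir"], name), weight
```

For the example entry above, this yields three pairs: one positive tile and two semi-positive tiles, each with its IoU weight.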
If you want to collect your own data from simulated game environments, you can refer here.
To configure the simulation and collection environment, please refer to DeepGTA.
Note that the compiled DeepGTA plugin for our GTA-UAV data simulation and collection is located here.
To pre-process the raw UAV-VisLoc data into the same format as GTA-UAV, you can refer to this script. You can also modify (extend) it to fit your own similar datasets.
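When converting a custom dataset, the target is the entry schema shown above. The following is a rough sketch of a converter; the field names follow the GTA-UAV schema, while `make_entry` and its parameters are hypothetical and should be adapted to your own data source:

```python
def make_entry(drone_name, drone_xy, pos_pairs, semipos_pairs, metadata):
    """Build one GTA-UAV-style metadata entry.

    pos_pairs / semipos_pairs: lists of (sate_img_name, iou_weight, (x, y)).
    All argument names are illustrative, not part of the official tooling.
    """
    # Semi-positive lists contain the positives plus the semi-positives.
    pos_semipos = pos_pairs + semipos_pairs
    return {
        "drone_img_dir": "drone/images",
        "drone_img_name": drone_name,
        "drone_loc_x_y": list(drone_xy),
        "sate_img_dir": "satellite",
        "pair_pos_sate_img_list": [n for n, _, _ in pos_pairs],
        "pair_pos_sate_weight_list": [w for _, w, _ in pos_pairs],
        "pair_pos_sate_loc_x_y_list": [list(xy) for _, _, xy in pos_pairs],
        "pair_pos_semipos_sate_img_list": [n for n, _, _ in pos_semipos],
        "pair_pos_semipos_sate_weight_list": [w for _, w, _ in pos_semipos],
        "pair_pos_semipos_sate_loc_x_y_list": [list(xy) for _, _, xy in pos_semipos],
        "drone_metadata": metadata,
    }
```

Dumping a list of such entries with `json.dump` produces a file with the same shape as `cross-area-drone2sate-train.json`.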
Proposed training and test pipeline
First, install the dependencies:

```shell
cd Game4Loc
# install the project and its dependencies
pip install -e .
pip install -r requirements.txt
```
Then you can run the training experiment on the GTA-UAV cross-area setting with:
```shell
# run experiment (example: GTA-UAV cross-area setting)
python train_gta.py \
    --data_root <The directory of the GTA-UAV dataset> \
    --train_pairs_meta_file "cross-area-drone2sate-train.json" \
    --test_pairs_meta_file "cross-area-drone2sate-test.json" \
    --model "vit_base_patch16_rope_reg1_gap_256.sbb_in1k" \
    --gpu_ids 0 --label_smoothing 0.05 \
    --lr 0.0001 --batch_size 64 --epoch 5 \
    --with_weight --k 5
```
Or run the training experiment on UAV-VisLoc with:
```shell
# run experiment (example: UAV-VisLoc same-area setting)
python train_visloc.py \
    --data_root <The directory of the UAV-VisLoc dataset> \
    --train_pairs_meta_file "same-area-drone2sate-train.json" \
    --test_pairs_meta_file "same-area-drone2sate-test.json" \
    --model "vit_base_patch16_rope_reg1_gap_256.sbb_in1k" \
    --gpu_ids 0 --label_smoothing 0.05 \
    --lr 0.0001 --batch_size 64 --epoch 20 \
    --with_weight --k 5
```
Some studies divide localization into two parts: retrieval and matching. Our work focuses on the former.
Nevertheless, we also provide support for finer localization based on image matching (thanks to the excellent zero-shot capability of GIM).
Set `with_match=True` in the eval script if needed.
To be released soon.
This project is licensed under the Apache 2.0 license.
This work draws inspiration from the following code repositories. We extend our gratitude for these remarkable contributions:
If you find our repository useful for your research, please consider citing our paper:
```bibtex
@article{ji2024game4loc,
  title   = {Game4Loc: A UAV Geo-Localization Benchmark from Game Data},
  author  = {Ji, Yuxiang and He, Boyong and Tan, Zhuoyue and Wu, Liaoni},
  journal = {arXiv preprint arXiv:2409.16925},
  year    = {2024},
}
```