Skip to content

schuar-iosb/mta-dataset

Repository files navigation

The MTA Dataset

The MTA (Multi Camera Track Auto) is a large multi target multi camera tracking dataset. It contains over 2,800 person identities, 6 cameras and a video length of over 100 minutes per camera. It contains a day and a night period.

Take a look at a short timelapse video:

https://www.youtube.com/watch?v=tZ5UwpkcfmE

Or a short extracted part with annotations:

https://www.youtube.com/watch?v=e6WNgCQHcCc

Associated repositories

The dataset was recorded and created by the MTA-Mod (https://github.com/koehlp/MTA-Mod). If you want to perform multi camera tracking on the dataset, you could use the WDA-Tracker (https://github.com/koehlp/wda_tracker).

Release Notes

  • Major changes will be listed here.

Getting the dataset

To obtain the Dataset, you first need the MTA-Download-URLs, so please send an email (to be determined) philipp.koehl@gmail.com (with object "MTA Dataset Download") stating:

  • Your name, title and affilation

  • Your intended use of the data

The following statement:

With this email we declare that we will use the MTA Dataset 
for research and educational purposes only, since we are aware that 
commercial use is prohibited. We also undertake to purchase a copy of Grand Theft Auto V.

We will send you the MTA-Download-URLs as fast as possible that you can use to download the following files:

Multi person multi camera tracking

MTA_videos.zip

  • Contains 12 videos (train and test videos for 6 cameras)
    and multi camera tracking annotations (frame number, person id, bounding box).
  • 41GB zipped and 42GB unzipped
  • Overall video length: 102min

MTA_videos_coords.zip

  • Full annotations e.g. body joints
  • 28.6GB zipped and 235GB unzipped.

MTA_ext_short.zip

  • Extracted short part of MTA_videos (videos and annotations)
  • 1.7GB zipped and 1.8GB unzipped
  • Overall video length: 4min

MTA_ext_short_coords.zip

  • Full annotations of the extracted short part
  • 1.1GB zipped and 8.9GB unzipped

Person re-identification

MTA_reid.zip

  • Re-id dataset based on the MTA_videos
  • 0.8GB zipped and 1.2GB unzipped

Contents

Multi person multi camera tracking

MTA_{videos,ext_short}.zip

  • 6 train and 6 test set camera videos
    • MTA_{videos,ext_short}/{train,test}/cam_{0-5}/cam_{0-5}.mp4
  • 6 train and 6 test set camera tracking annotation csv files
    • MTA_{videos,ext_short}/{train,test}/cam_{0-5}/coords_fib_cam_{0-5}.csv

MTA_{videos,ext_short}_coords.zip

  • 6 train and 6 test set camera full annotation csv files
    • MTA_{videos,ext_short}_coords/{train,test}/cam_{0-5}/coords_cam_{0-5}.csv

Person re-identification

MTA_reid.zip

  • 72301 images in re-id train set
    15165 images in the re-id query set
    60448 images in the re-id test set
    • MTA_reid/{train,query,test}/framegta_{int}_camid_{0-5}_pid_{int}.png
  • 36100 overall excluded distractor images which don't show enough to recognize a person
    • MTA_reid/distractors/{train,query,test}/framegta_{int}_camid_{0-5}_pid_{int}.png

Annotations

Multi person multi camera tracking

Annotations in coords_fib_cam_{0-5}.csv
column name Description
frame_no_cam frame number per camera starting from 0
person_id person id
x_top_left_BB x top left bounding box coordinate
y_top_left_BB y top left bounding box coordinate
x_bottom_right_BB x bottom right bounding box coordinate
y_bottom_right_BB y bottom right bounding box coordinate
Annotations in coords_cam_{0-5}.csv
column name Description
frame_no_gta ingame native frame number
frame_no_cam frame number per camera starting from 0
person_id person id
appearance_id allows to recreate a person with the same appearance
joint_type joint type
x_2D_joint x 2D joint position
y_2D_joint y 2D joint position
x_3D_joint x 3D joint position
y_3D_joint y 3D joint position
z_3D_joint z 3D joint position
joint_occluded 1 if the joint is occluded; 0 otherwise
joint_self_occluded 1 if the joint is occluded by its owner; 0 otherwise
x_3D_cam x 3D camera position
y_3D_cam y 3D camera position
z_3D_cam z 3D camera position
x_rot_cam x camera rotation
y_rot_cam y camera rotation
z_rot_cam z camera rotation
fov camera field of view
x_3D_person x 3D person position
y_3D_person y 3D person position
z_3D_person z 3D person position
x_2D_person x 2D person position
y_2D_person y 2D person position
ped_type pedestrian type
wears_glasses 1 if the person is wearing glasses; 0 otherwise
yaw_person person yaw
hours_gta ingame time hours
minutes_gta ingame time minutes
seconds_gta ingame time seconds
x_top_left_BB x top left bounding box coordinate
y_top_left_BB y top left bounding box coordinate
x_bottom_right_BB x bottom right bounding box coordinate
y_bottom_right_BB y bottom right bounding box coordinate

Note that the camera 3D coordinates are in the GTA 3D coordinate system.

Pedestrian types

ped_type Description
0 Player character Michael
1 Player character Franklin
2 Player character Trevor
29 Army
28 Animal
27 SWAT
21 Los Santos Fire Department
20 Paramedic
6 Cop
4 Male
5 Female
26 Human

Source: http://www.dev-c.com/nativedb/

Joint type

 0: head_top
 1: head_center
 2: neck
 3: right_clavicle
 4: right_shoulder
 5: right_elbow
 6: right_wrist
 7: left_clavicle
 8: left_shoulder
 9: left_elbow
10: left_wrist
11: spine0
12: spine1
13: spine2
14: spine3
15: spine4
16: right_hip
17: right_knee
18: right_ankle
19: left_hip
20: left_knee
21: left_ankle

Person re-identification

The annotations are encoded in the image filenames using the following format:

  • {annotation_name}_{value}.

As the image filename goes as follows:

  • framegta_{int}_camid_{0-5}_pid_{int}.png

The following annotations are available:

annotation name description
framegta ingame frame number aka frame_no_gta
camid camera id aka cam_id
pid person id aka person_id

These annotations come from the multi person multi camera tracking annotations in coords_cam_{0-5}.csv. This means it is possible to get more annotations for re-id images like e.g. the joint positions by linking via frame_no_gta and person_id.

Scripts

To use the following scripts it is necessary to install the python requirements:

pip install -r requirements.txt  

mta_to_coco.py

Converts the mta videos and annotations into the coco annotation format and images.
Note that just the bounding box annotations are available.

Example:

python mta_to_coco.py \
    --mta_dataset_folder /media/philipp/philippkoehl_ssd/MTA_ext_short/test \
    --coco_mta_output_folder /media/philipp/philippkoehl_ssd/coco_MTA_ext_short/test \
    --sampling_rate 41 \
    --camera_ids 0,1,2,3,4,5

draw_full_annotations.py

Draws joint annotations, bounding box annotations and person ids into the frames of the MTA data and outputs a video.

Example:

python draw_full_annotations.py \
    --coords_folder "/media/philipp/philippkoehl_ssd/MTA_ext_short_coords/test" \
    --video_folder "/media/philipp/philippkoehl_ssd/MTA_ext_short/test" \
    --output_folder "/media/philipp/philippkoehl_ssd/MTA_ext_short_annotation_videos" \
    --camera_ids "0,1,2,3,4,5"

Citation

If you use it, please cite our work. The affiliated paper was published at the CVPR 2020 VUHCS Workshop (https://vuhcs.github.io/)

@InProceedings{Kohl_2020_CVPR_Workshops,
    author = {Kohl, Philipp and Specker, Andreas and Schumann, Arne and Beyerer, Jurgen},
    title = {The MTA Dataset for Multi-Target Multi-Camera Pedestrian Tracking by Weighted Distance Aggregation},
    booktitle = {The IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops},
    month = {June},
    year = {2020}
}

About

No description, website, or topics provided.

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages