This section explains how to download the datasets and describes each of the file formats that exist in the datasets.
ARKitScenes includes 3 datasets:

3dod
- The dataset used to train 3D object detection. It includes 3 assets: low-resolution RGB images, low-resolution depth images and the labels (the total size is 623.4 GB for 5047 3dod scans).

upsampling
- The dataset used to train depth upsampling. It includes 3 assets: high-resolution RGB images, low-resolution depth images and high-resolution depth images.

raw
- This dataset includes all data available in ARKitScenes; the 3dod and depth upsampling datasets are a subset of it. It includes many more assets that are not part of 3DOD or depth upsampling.
Each dataset has a CSV file that includes all the visit_id, video_id and fold values available in the dataset.
3DOD CSV path: ARKitScenes/threedod/3dod_train_val_splits.csv
Upsampling CSV path: ARKitScenes/depth_upsampling/upsampling_train_val_splits.csv
Raw CSV path: ARKitScenes/raw/raw_train_val_splits.csv
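If you want to inspect a split CSV before downloading, here is a minimal sketch using pandas, assuming the columns are named exactly visit_id, video_id and fold as described above:

```python
# Minimal sketch: inspect a split CSV (columns per the docs: visit_id, video_id, fold).
import pandas as pd

splits = pd.read_csv("threedod/3dod_train_val_splits.csv")
print(splits.columns.tolist())                     # expected to contain visit_id, video_id, fold
print(splits["fold"].value_counts())               # how many Training vs. Validation videos
train_ids = splits.loc[splits["fold"] == "Training", "video_id"].tolist()
print(len(train_ids), "training video_ids, e.g.", train_ids[:3])
```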
To download each of the datasets, we added a python script - download_data.py.
To download a specific video_id or a list of video_ids, download_data.py
expects the first argument to be the dataset name (i.e. 3dod/upsampling/raw),
the second argument to be the fold (i.e. Training/Validation), followed by one or more video_ids.
python3 download_data.py [3dod/upsampling/raw] --split [Training/Validation] --video_id video_id1 video_id2 \
--download_dir YOUR_DATA_FOLDER
for example
python3 download_data.py raw --split Training --video_id 47333462 \
--download_dir /tmp/ARKitScenes/
or
python3 download_data.py raw --split Training --video_id 47333462 \
--download_dir /tmp/ARKitScenes/ --download_laser_scanner_point_cloud
to download the laser scanner point-clouds (available only for the raw dataset)
To download with a CSV, download_data.py
expects the first argument to be a dataset name (i.e. 3dod/upsampling/raw);
there is no need to specify the fold, because the fold information already exists in the CSV file.
python3 download_data.py [3dod/upsampling/raw] --video_id_csv CSV_PATH \
--download_dir YOUR_DATA_FOLDER
for example
python3 download_data.py 3dod --video_id_csv threedod/3dod_train_val_splits.csv \
--download_dir /tmp/raw_ARKitScenes/
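If you only need a subset of the videos, one option is to filter the split CSV and pass the filtered file via --video_id_csv. The sketch below assumes the column names described above (visit_id, video_id, fold); the video_id used is only illustrative:

```python
# Minimal sketch: download a subset of videos by filtering the split CSV and
# invoking download_data.py with --video_id_csv.
import subprocess
import pandas as pd

splits = pd.read_csv("threedod/3dod_train_val_splits.csv")
wanted = splits[splits["video_id"].isin([47333462])]   # pick the video_ids you need
wanted.to_csv("/tmp/my_subset.csv", index=False)

subprocess.run(
    ["python3", "download_data.py", "3dod",
     "--video_id_csv", "/tmp/my_subset.csv",
     "--download_dir", "/tmp/ARKitScenes/"],
    check=True,
)
```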
Please note that for raw data, you will need to specify the type(s) of data you would like to download. The choices are
mov annotation mesh confidence highres_depth lowres_depth lowres_wide.traj lowres_wide lowres_wide_intrinsics ultrawide
ultrawide_intrinsics vga_wide vga_wide_intrinsics
for example
python3 download_data.py raw --video_id_csv raw/raw_train_val_splits.csv --download_dir /tmp/ar_raw_all/ \
--raw_dataset_assets mov annotation mesh confidence highres_depth lowres_depth lowres_wide.traj \
lowres_wide lowres_wide_intrinsics ultrawide ultrawide_intrinsics vga_wide vga_wide_intrinsics
The data folder (i.e. YOUR_DATA_FOLDER) will include two directories, Training
and Validation, which contain all the assets belonging to the training and validation bins respectively.
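A quick way to check what was downloaded is to list the contents of the two fold directories. This sketch assumes the layout used by the commands above, YOUR_DATA_FOLDER/<fold>/<video_id>/..., with one subdirectory per video_id:

```python
# Minimal sketch: list downloaded video_ids per fold, assuming one subdirectory
# per video_id under Training/ and Validation/.
from pathlib import Path

data_dir = Path("/tmp/ARKitScenes")          # same --download_dir used above
for fold in ("Training", "Validation"):
    fold_dir = data_dir / fold
    if fold_dir.is_dir():
        video_ids = sorted(p.name for p in fold_dir.iterdir() if p.is_dir())
        print(fold, len(video_ids), "videos:", video_ids[:5])
```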
The datasets include the following file formats:
.png
- stores RGB images, depth images and confidence images
  - RGB images - regular uint8, 3-channel images
  - depth images - uint16 png format, in millimeters
  - confidence - uint8 png format
    - 0 - low confidence
    - 2 - high confidence
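A minimal sketch for reading a depth png and its confidence map, assuming the uint16-millimeter depth and uint8 confidence encodings described above and that both images share the same resolution; file names are only illustrative:

```python
# Minimal sketch: load a depth .png (uint16, millimeters) and a confidence .png
# (uint8, 0 = low, 2 = high), then convert depth to meters and mask low-confidence pixels.
import numpy as np
from PIL import Image

depth_mm = np.asarray(Image.open("some_frame_depth.png"))         # values in millimeters
confidence = np.asarray(Image.open("some_frame_confidence.png"))  # 0 = low, 2 = high

depth_m = depth_mm.astype(np.float32) / 1000.0                    # millimeters -> meters
depth_m[confidence < 2] = 0.0                                     # keep only high-confidence pixels
```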
.pincam
- stores the intrinsic matrix for each RGB image
- is a single-line text file, space-delimited, with the following fields:
  width height focal_length_x focal_length_y principal_point_x principal_point_y
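A minimal sketch for parsing a .pincam file into a 3x3 pinhole intrinsic matrix, assuming exactly the six space-delimited fields listed above:

```python
# Minimal sketch: parse a .pincam file (width height fx fy cx cy on one line)
# into a 3x3 intrinsic matrix plus the image size.
import numpy as np

def read_pincam(path):
    width, height, fx, fy, cx, cy = map(float, open(path).read().split())
    K = np.array([[fx, 0.0, cx],
                  [0.0, fy, cy],
                  [0.0, 0.0, 1.0]])
    return K, int(width), int(height)
```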
.json
- stores the object annotations

.traj
- is a space-delimited file where each line represents a camera position at a particular timestamp
  - Column 1: timestamp
  - Columns 2-4: rotation (axis-angle representation in radians)
  - Columns 5-7: translation (in meters)
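A minimal sketch for parsing a single .traj line, assuming the column layout above and using SciPy to convert the axis-angle rotation into a rotation matrix:

```python
# Minimal sketch: parse one .traj line (timestamp, axis-angle rotation in radians,
# translation in meters) into a timestamp, 3x3 rotation matrix and translation vector.
import numpy as np
from scipy.spatial.transform import Rotation

def parse_traj_line(line):
    values = np.array(line.split(), dtype=np.float64)
    timestamp = values[0]
    R = Rotation.from_rotvec(values[1:4]).as_matrix()  # axis-angle -> rotation matrix
    t = values[4:7]
    return timestamp, R, t
```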
.ply
- stores the mesh generated by ARKit or the point clouds generated by the Faro laser scanner

.mov
- video captured with ARKit (raw dataset only)

_pose.txt
- transformation matrix to align/register multiple FARO scans
  - Lines 0-2 contain the rotation matrix and line 3 the translation vector
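A minimal sketch for reading a _pose.txt into a 4x4 transform, assuming each of the first three lines holds one row of the rotation matrix and the fourth line holds the translation vector, as described above:

```python
# Minimal sketch: assemble a 4x4 rigid transform from a _pose.txt file
# (lines 0-2: rotation matrix rows, line 3: translation vector).
import numpy as np

def read_faro_pose(path):
    rows = [list(map(float, line.split())) for line in open(path) if line.strip()]
    T = np.eye(4)
    T[:3, :3] = np.array(rows[0:3])   # rotation matrix from lines 0-2
    T[:3, 3] = np.array(rows[3])      # translation vector from line 3
    return T
```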
To dive deeper into the structure of each dataset, please refer to the documentation of the individual datasets.