The preprocessing supports semantic and instance segmentation for ScanNet20, ScanNet200, and ScanNet Data Efficient.
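The commands below refer to the raw and processed dataset directories through shell variables; the values here are only placeholders, so substitute your own paths:
# example placeholder paths; adjust to your own setup
export RAW_SCANNET_DIR=/path/to/scannet_v2               # downloaded ScanNet v2
export PROCESSED_SCANNET_DIR=/path/to/scannet_processed  # preprocessing output
export CODEBASE_DIR=$(pwd)                               # root of this codebase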
- Download the ScanNet v2 dataset.
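Access is granted after agreeing to the ScanNet terms of use, which provides the official download-scannet.py script (also used for the Data Efficient files below). A full download, using the same -o flag as in the Data Efficient step, might look like this:
# download the full ScanNet v2 release into RAW_SCANNET_DIR (very large)
python download-scannet.py -o ${RAW_SCANNET_DIR}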
- Run preprocessing code for raw ScanNet as follows:
# RAW_SCANNET_DIR: the directory of downloaded ScanNet v2 raw dataset.
# PROCESSED_SCANNET_DIR: the directory of processed ScanNet dataset (output dir).
python ponder/datasets/preprocessing/scannet/preprocess_scannet.py --dataset_root ${RAW_SCANNET_DIR} --output_root ${PROCESSED_SCANNET_DIR}
# extract RGB-D images and 2D semantic labels (ScanNet20 only):
python ponder/datasets/preprocessing/scannet/reader.py --scans_path ${RAW_SCANNET_DIR}/scans --output_path ${PROCESSED_SCANNET_DIR}/rgbd --export_depth_images --export_color_images --export_poses --export_intrinsics --export_label
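As an optional sanity check, the number of processed scenes should match the standard ScanNet v2 split sizes (1201 train / 312 val). The train/val subfolder layout below is an assumption about the script's output structure; adjust the paths if your output is organized differently.
# optional sanity check; assumes per-split subfolders in the output (adjust if needed)
ls ${PROCESSED_SCANNET_DIR}/train | wc -l   # ScanNet v2: 1201 training scenes
ls ${PROCESSED_SCANNET_DIR}/val | wc -l     # ScanNet v2: 312 validation scenes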
- (Optional) Download ScanNet Data Efficient files:
# download-scannet.py is the official download script
# or follow the instructions here: https://kaldir.vc.in.tum.de/scannet_benchmark/data_efficient/documentation#download
python download-scannet.py --data_efficient -o ${RAW_SCANNET_DIR}
# unzip downloads
cd ${RAW_SCANNET_DIR}/tasks
unzip limited-annotation-points.zip
unzip limited-bboxes.zip
unzip limited-reconstruction-scenes.zip
# copy files to processed dataset folder
cp -r ${RAW_SCANNET_DIR}/tasks ${PROCESSED_SCANNET_DIR}
- Link the processed dataset to the codebase:
# PROCESSED_SCANNET_DIR: the directory of processed ScanNet dataset.
mkdir data
ln -s ${PROCESSED_SCANNET_DIR} ${CODEBASE_DIR}/data/scannet
- Download S3DIS data by filling this Google form. Download the Stanford3dDataset_v1.2.zip file and unzip it.
- Run preprocessing code for S3DIS as follows:
# S3DIS_DIR: the directory of downloaded Stanford3dDataset_v1.2 dataset.
# RAW_S3DIS_DIR: the directory of Stanford2d3dDataset_noXYZ dataset. (optional, for parsing normal)
# PROCESSED_S3DIS_DIR: the directory of processed S3DIS dataset (output dir).
# S3DIS with normal vectors, RGB-D images and 2D semantic labels
python ponder/datasets/preprocessing/s3dis/preprocess_s3dis.py --dataset_root ${S3DIS_DIR} --output_root ${PROCESSED_S3DIS_DIR} --raw_root ${RAW_S3DIS_DIR} --parse_normal --parse_rgbd
# if you want S3DIS with aligned angle:
python ponder/datasets/preprocessing/s3dis/preprocess_s3dis.py --dataset_root ${S3DIS_DIR} --output_root ${PROCESSED_S3DIS_DIR} --raw_root ${RAW_S3DIS_DIR} --align_angle --parse_normal --parse_rgbd
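Before running the commands above, it is worth confirming that the unzipped archive has the expected layout; the standard Stanford3dDataset_v1.2 release contains six area folders:
# the unzipped Stanford3dDataset_v1.2 should contain six area folders
ls ${S3DIS_DIR}
# expected: Area_1  Area_2  Area_3  Area_4  Area_5  Area_6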
- Link the processed dataset to the codebase:
# PROCESSED_S3DIS_DIR: the directory of processed S3DIS dataset.
mkdir data
ln -s ${PROCESSED_S3DIS_DIR} ${CODEBASE_DIR}/data/s3dis
- Download the Structured3D panorama-related and perspective (full)-related zip files by filling this Google form (no need to unzip them).
- Organize all downloaded zip files in one folder (${STRUCT3D_DIR}).
- Run preprocessing code for Structured3D as follows:
# STRUCT3D_DIR: the directory of downloaded Structured3D dataset.
# PROCESSED_STRUCT3D_DIR: the directory of processed Structured3D dataset (output dir).
# NUM_WORKERS: number of workers for preprocessing; defaults to the CPU count (might OOM).
export PYTHONPATH=./
python ponder/datasets/preprocessing/structured3d/preprocess_structured3d.py --dataset_root ${STRUCT3D_DIR} --output_root ${PROCESSED_STRUCT3D_DIR} --num_workers ${NUM_WORKERS} --grid_size 0.01 --fuse_prsp --fuse_pano --parse_rgbd
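NUM_WORKERS has to be set before the command above is run; if preprocessing runs out of memory, a value well below the CPU count is a safer choice. The value below is only an example:
# example: cap the number of preprocessing workers to limit peak memory usage
export NUM_WORKERS=16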
Following the instructions of Swin3D, we keep the 25 categories whose frequencies exceed 0.001, out of the original 40 categories.
- Link the processed dataset to the codebase:
# PROCESSED_STRUCT3D_DIR: the directory of processed Structured3D dataset (output dir).
mkdir data
ln -s ${PROCESSED_STRUCT3D_DIR} ${CODEBASE_DIR}/data/structured3d
- Download the official nuScenes dataset (with lidar segmentation) and organize the downloaded files as follows:
NUSCENES_DIR
│── samples
│── sweeps
│── lidarseg
...
│── v1.0-trainval
│── v1.0-test
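A quick way to verify that the downloaded files match the tree above before preprocessing:
# the top level of NUSCENES_DIR should contain at least these folders
ls ${NUSCENES_DIR}
# expected (among others): samples  sweeps  lidarseg  v1.0-trainval  v1.0-test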
- Run information preprocessing code (modified from OpenPCDet) for nuScenes as follows:
# NUSCENES_DIR: the directory of downloaded nuScenes dataset.
# PROCESSED_NUSCENES_DIR: the directory of processed nuScenes dataset (output dir).
# MAX_SWEEPS: Max number of sweeps. Default: 10.
pip install nuscenes-devkit pyquaternion
python ponder/datasets/preprocessing/nuscenes/preprocess_nuscenes_info.py --dataset_root ${NUSCENES_DIR} --output_root ${PROCESSED_NUSCENES_DIR} --max_sweeps ${MAX_SWEEPS} --with_camera
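MAX_SWEEPS has to be set before running the command above; keeping the stated default is usually fine, for example:
# example: keep the default of 10 sweeps
export MAX_SWEEPS=10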
- Link the raw dataset to the processed nuScenes dataset folder:
# NUSCENES_DIR: the directory of downloaded nuScenes dataset.
# PROCESSED_NUSCENES_DIR: the directory of processed nuScenes dataset (output dir).
ln -s ${NUSCENES_DIR} ${PROCESSED_NUSCENES_DIR}/raw
Then the processed nuScenes folder is organized as follows:
nuscenes
│── raw
    │── samples
    │── sweeps
    │── lidarseg
    ...
    │── v1.0-trainval
    │── v1.0-test
│── info
- Link the processed dataset to the codebase:
# PROCESSED_NUSCENES_DIR: the directory of processed nuScenes dataset (output dir).
mkdir data
ln -s ${PROCESSED_NUSCENES_DIR} ${CODEBASE_DIR}/data/nuscenes
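After linking the datasets you plan to use, each entry under data should be a symlink that resolves to the corresponding processed directory; a quick check:
# each entry should be a symlink to its processed dataset directory
ls -l ${CODEBASE_DIR}/data
# expected (depending on which datasets you prepared): scannet  s3dis  structured3d  nuscenes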