Official implementation of Splatter-360: Generalizable 360° Gaussian Splatting for Wide-baseline Panoramic Images
2024.12.15: We uploaded the preprocessing code for the HM3D and Replica datasets.
If you find this repo useful, please give us a star.
If you use our code, please cite the following BibTeX:
@article{chen2024splatter,
title={Splatter-360: Generalizable 360$^{\circ}$ Gaussian Splatting for Wide-baseline Panoramic Images},
author={Chen, Zheng and Wu, Chenming and Shen, Zhelun and Zhao, Chen and Ye, Weicai and Feng, Haocheng and Ding, Errui and Zhang, Song-Hai},
journal={arXiv preprint arXiv:2412.06250},
year={2024}
}
To get started, create a conda virtual environment using Python 3.10+ and install the requirements:
conda create -n splat360 python=3.10
conda activate splat360
pip install torch==2.1.2 torchvision==0.16.2 torchaudio==2.1.2 --index-url https://download.pytorch.org/whl/cu118
pip install -r requirements.txt
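Before running anything, it can help to confirm that the CUDA build of PyTorch is picked up inside the activated splat360 environment. A minimal check (plain PyTorch, nothing project-specific):

# quick sanity check of the PyTorch install (run inside the splat360 environment)
import torch
print(torch.__version__)          # expect 2.1.2+cu118
print(torch.cuda.is_available())  # should print True on a CUDA machine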
Replica: Download replica_dataset.zip (rgb and depth files) and replica_dataset_pt.zip (scene indices) from BaiduNetDisk or OneDrive and unzip them into the same directory. Then set dataset.roots and dataset.rgb_roots in config/experiment/replica.yaml to your storage directories.
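For reference, the two entries might look roughly like the sketch below. The exact nesting and value types in the shipped config/experiment/replica.yaml may differ, and the mapping of each archive to each key is our assumption, so treat this only as an illustration of where the paths go:

# illustrative sketch of the path entries in config/experiment/replica.yaml
dataset:
  roots: /path/to/unzipped/replica_dataset_pt   # scene indices (assumed mapping)
  rgb_roots: /path/to/unzipped/replica_dataset  # rgb and depth files (assumed mapping)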
HM3D: Since the HM3D training set is very large (about 2-3 TB), we release the preprocessing code used to build our training and test data; you can generate the training set yourself. (We encourage follow-up researchers to refine our preprocessing code to save storage space.)
We will upload our HM3D test set.
To render novel views and compute evaluation metrics from a pretrained model,
- get the pretrained models and save them to checkpoints/
- run the following:
# eval on HM3D
output_dir="./outputs/splat360_log_depth_near0.1-100k/"
checkpoint_path="./checkpoints/hm3d.ckpt"
CUDA_VISIBLE_DEVICES=0 python -m src.main \
+experiment=hm3d \
model.encoder.shim_patch_size=8 \
model.encoder.downscale_factor=8 \
model.encoder.depth_sampling_type="log_depth" \
output_dir=$output_dir \
dataset.near=0.1 \
mode="test" \
dataset/view_sampler=evaluation \
checkpointing.load=$checkpoint_path \
dataset.view_sampler.index_path="assets/evaluation_index_hm3d.json" \
test.eval_depth=true
- the rendered novel views will be stored under outputs/test (a PSNR spot-check sketch follows below)
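If you want to spot-check the rendered views against ground truth independently of the built-in metrics, a minimal PSNR sketch is below. The file names and directory layout are placeholders, not the actual output format of outputs/test:

# minimal PSNR spot-check between a rendered view and its ground truth
# (paths are placeholders; adapt them to the actual layout of outputs/test)
import numpy as np
from PIL import Image

def psnr(a, b, max_val=1.0):
    mse = np.mean((a - b) ** 2)
    return float("inf") if mse == 0 else 10.0 * np.log10(max_val ** 2 / mse)

render = np.asarray(Image.open("outputs/test/scene0000/render_0000.png"), dtype=np.float32) / 255.0
gt = np.asarray(Image.open("/path/to/ground_truth/frame_0000.png"), dtype=np.float32) / 255.0
print(f"PSNR: {psnr(render, gt):.2f} dB")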
To render videos from a pretrained model, run the following:
# HM3D render video
output_dir="./outputs/splat360_log_depth_near0.1-100k/"
checkpoint_path="./checkpoints/hm3d.ckpt"
CUDA_VISIBLE_DEVICES=0 python -m src.main \
+experiment=hm3d \
model.encoder.shim_patch_size=8 \
model.encoder.downscale_factor=8 \
model.encoder.depth_sampling_type="log_depth" \
output_dir=$output_dir \
dataset.near=0.1 \
mode="test" \
dataset/view_sampler=evaluation \
checkpointing.load=$checkpoint_path \
dataset.view_sampler.index_path="assets/evaluation_index_hm3d_video.json" \
test.save_video=true \
test.save_image=false \
test.compute_scores=false \
test.eval_depth=true
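If you prefer to save per-frame images instead (test.save_image=true, test.save_video=false) and assemble the video yourself, something like the sketch below works. It assumes imageio plus imageio-ffmpeg are installed and that frames end up as PNGs under a directory of your choosing (the path is hypothetical):

# assemble saved frames into an mp4 (hypothetical frame directory; requires imageio + imageio-ffmpeg)
import glob
import imageio.v2 as imageio

frames = sorted(glob.glob("outputs/test/scene0000/*.png"))
with imageio.get_writer("outputs/test/scene0000.mp4", fps=30) as writer:
    for f in frames:
        writer.append_data(imageio.imread(f))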
# download the pretrained backbone weights from UniMatch and save them to 'checkpoints/'
wget 'https://s3.eu-central-1.amazonaws.com/avg-projects/unimatch/pretrained/gmdepth-scale1-resumeflowthings-scannet-5d9d7964.pth' -P checkpoints
# download the pretrained weights of Depth-Anything-V2 and save them to 'checkpoints/'
wget https://huggingface.co/depth-anything/Depth-Anything-V2-Small/resolve/main/depth_anything_v2_vits.pth -P checkpoints
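A quick load test can confirm the downloads are intact. This is a generic torch.load check; the key layout inside each checkpoint may differ from what the code assumes:

# sanity-check the downloaded weights (key names inside the checkpoints may vary)
import torch

for path in ["checkpoints/gmdepth-scale1-resumeflowthings-scannet-5d9d7964.pth",
             "checkpoints/depth_anything_v2_vits.pth"]:
    ckpt = torch.load(path, map_location="cpu")
    state = ckpt["model"] if isinstance(ckpt, dict) and "model" in ckpt else ckpt
    print(path, "->", len(state), "entries")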
# Our models are trained with 8 V100 (32GB) GPUs.
max_steps=100000
output_dir="./outputs/splat360_log_depth_near0.1-100k/"
CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 python -m src.main \
+experiment=hm3d data_loader.train.batch_size=1 \
model.encoder.shim_patch_size=8 \
model.encoder.downscale_factor=8 \
trainer.max_steps=$max_steps \
model.encoder.depth_sampling_type="log_depth" \
output_dir=$output_dir \
dataset.near=0.1
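If a run is interrupted, pointing checkpointing.load (the same override used for evaluation above) at a saved checkpoint is presumably the way to continue; a sketch under that assumption, reusing $output_dir and $max_steps from above, with a placeholder checkpoint path and no guarantee about how much trainer state is restored:

# sketch: continue training from a saved checkpoint (assumes checkpointing.load also applies in train mode)
CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 python -m src.main \
+experiment=hm3d data_loader.train.batch_size=1 \
model.encoder.shim_patch_size=8 \
model.encoder.downscale_factor=8 \
trainer.max_steps=$max_steps \
model.encoder.depth_sampling_type="log_depth" \
output_dir=$output_dir \
dataset.near=0.1 \
checkpointing.load="path/to/saved/checkpoint.ckpt"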
We use the default model trained on HM3D for cross-dataset evaluations. To evaluate it on, e.g., Replica, run the following command:
output_dir="./outputs/splat360_log_depth_near0.1-100k/"
# eval on Replica
checkpoint_path="./checkpoints/hm3d.ckpt"
CUDA_VISIBLE_DEVICES=0 python -m src.main \
+experiment=replica \
model.encoder.shim_patch_size=8 \
model.encoder.downscale_factor=8 \
model.encoder.depth_sampling_type="log_depth" \
output_dir=$output_dir \
dataset.near=0.1 \
mode="test" \
dataset/view_sampler=evaluation \
checkpointing.load=$checkpoint_path \
dataset.view_sampler.index_path="assets/evaluation_index_replica.json" \
test.eval_depth=true
The project is largely based on pixelSplat, MVSplat, and PanoGRF and has incorporated numerous code snippets from UniMatch. Many thanks to these projects for their excellent contributions!