GRM: Large Gaussian Reconstruction Model for Efficient 3D Reconstruction and Generation
Yinghao Xu*, Zifan Shi*, Wang Yifan, Hansheng Chen, Ceyuan Yang, Sida Peng, Yujun Shen, Gordon Wetzstein
[Paper] [Project Page] [Blender Demo] [HF Demo] [Weights]
3d-generation.mp4
- Release gradio demo code.
- Release inference code.
- Release pretrained models.
- Release training code.
- Hugging Face demo.
- Replicate demo. Thanks @camenduru for the Jupyter code!
- 64-bit Python 3.10 and PyTorch 2.0.1 or higher.
- CUDA 11.8
- Install the packages with the following commands:
conda create -n grm python=3.10
conda activate grm
pip install -r requirements.txt --extra-index-url https://download.pytorch.org/whl/cu118
cd third_party/diff-gaussian-rasterization && pip install -e .
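After installation, it can help to confirm that the interpreter and PyTorch build match the requirements listed above. The following is a minimal sketch, not part of the GRM codebase; the function name `check_environment` and the report keys are illustrative assumptions. It only reads standard attributes (`sys.version_info`, `torch.__version__`, `torch.version.cuda`, `torch.cuda.is_available()`), and it skips the PyTorch checks if PyTorch is not installed yet.

```python
import importlib.util
import sys


def check_environment(min_python=(3, 10)):
    """Return a small report on the requirements listed above.

    Hypothetical helper for illustration; not shipped with GRM.
    """
    report = {
        "python_version": sys.version_info[:3],
        "python_ok": sys.version_info[:2] >= min_python,
        "is_64bit": sys.maxsize > 2**32,
        "torch_version": None,
        "torch_cuda_version": None,
        "cuda_available": False,
    }
    # PyTorch is treated as optional so the check also runs before installation.
    if importlib.util.find_spec("torch") is not None:
        import torch

        report["torch_version"] = torch.__version__
        # CUDA version PyTorch was built against, or None for CPU-only builds.
        report["torch_cuda_version"] = torch.version.cuda
        report["cuda_available"] = torch.cuda.is_available()
    return report


for key, value in check_environment().items():
    print(f"{key}: {value}")
```

If `torch_cuda_version` does not report 11.8, the `--extra-index-url` in the install command may not have been picked up.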
Pretrained weights can be downloaded from Hugging Face.
# Download weights
mkdir checkpoints && cd checkpoints
wget https://huggingface.co/justimyhxu/GRM/resolve/main/grm_u.pth -O grm_u.pth
wget https://huggingface.co/justimyhxu/GRM/resolve/main/grm_r.pth -O grm_r.pth
wget https://huggingface.co/justimyhxu/GRM/resolve/main/grm_zero123plus.pth -O grm_zero123plus.pth
cd ..
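Interrupted downloads can leave missing or zero-byte files behind. As a quick sanity check, the sketch below lists which of the three checkpoint files named in the commands above are absent or empty. The helper `missing_checkpoints` is a hypothetical name introduced here, and it does not verify file contents or checksums.

```python
from pathlib import Path

# File names taken from the wget commands above.
EXPECTED_CHECKPOINTS = ("grm_u.pth", "grm_r.pth", "grm_zero123plus.pth")


def missing_checkpoints(checkpoint_dir="checkpoints", expected=EXPECTED_CHECKPOINTS):
    """Return the expected files that are absent or empty in checkpoint_dir."""
    root = Path(checkpoint_dir)
    missing = []
    for name in expected:
        path = root / name
        # A zero-byte file usually indicates an interrupted or failed download.
        if not path.is_file() or path.stat().st_size == 0:
            missing.append(name)
    return missing


absent = missing_checkpoints()
print("All checkpoints found." if not absent else f"Missing: {absent}")
```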
We provide three checkpoints, which differ in the camera layout of their input views. We use the OpenCV coordinate system.
Checkpoint | Training settings |
---|---|
grm_u.pth | All views have an elevation of 20 degrees, and the azimuths uniformly cover the full 360 degrees. |
grm_r.pth | The azimuths roughly cover the full 360 degrees. |
grm_zero123plus.pth | Three views have an elevation of 30 degrees, with azimuths evenly spaced at 120-degree intervals. A fourth view has an elevation of -20 degrees and an azimuth that differs by 60 degrees from one of the other three. |
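To make the camera layouts in the table concrete, the sketch below turns (elevation, azimuth) pairs into camera positions and OpenCV-style axes (+x right, +y down, +z forward) for cameras looking at the origin. Several things here are assumptions rather than facts from this repository: the z-up world frame, the unit radius, the number of views passed to `uniform_views`, the base azimuth of the `grm_zero123plus` layout, and all function names. GRM's own pose code may use different conventions.

```python
import math


def camera_position(elevation_deg, azimuth_deg, radius=1.0):
    """Point on a sphere around the origin, assuming a z-up world (assumption)."""
    el, az = math.radians(elevation_deg), math.radians(azimuth_deg)
    return (
        radius * math.cos(el) * math.cos(az),
        radius * math.cos(el) * math.sin(az),
        radius * math.sin(el),
    )


def _normalize(v):
    n = math.sqrt(sum(c * c for c in v))
    return tuple(c / n for c in v)


def _cross(a, b):
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def look_at_opencv(position, up=(0.0, 0.0, 1.0)):
    """Return (right, down, forward) axes for a camera looking at the origin.

    OpenCV convention: +x right, +y down, +z forward. Undefined when the
    viewing direction is parallel to `up` (elevation of +/-90 degrees).
    """
    forward = _normalize(tuple(-c for c in position))
    right = _normalize(_cross(forward, up))
    down = _cross(forward, right)
    return right, down, forward


def uniform_views(n_views, elevation_deg=20.0):
    """grm_u-style layout: fixed elevation, evenly spaced azimuths."""
    return [(elevation_deg, 360.0 * i / n_views) for i in range(n_views)]


def zero123plus_views(base_azimuth_deg=0.0):
    """grm_zero123plus-style layout; the base azimuth is an assumption."""
    views = [(30.0, (base_azimuth_deg + 120.0 * i) % 360.0) for i in range(3)]
    views.append((-20.0, (base_azimuth_deg + 60.0) % 360.0))
    return views


for elevation, azimuth in zero123plus_views():
    position = camera_position(elevation, azimuth)
    right, down, forward = look_at_opencv(position)
    print(elevation, azimuth, position, forward)
```

The three axes, together with the position, form the columns of a camera-to-world matrix under the stated assumptions.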
You also need to download the SV3D checkpoint. Replace HF_TOKEN in the command below with your own Hugging Face access token.
cd checkpoints
wget --header="Authorization: Bearer HF_TOKEN" https://huggingface.co/stabilityai/sv3d/resolve/main/sv3d_p.safetensors -O sv3d_p.safetensors && cd ..
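To avoid pasting the token into shell history, the same request can be made from Python with the token read from an environment variable. This is a sketch using only the standard library; `build_request` and `download` are hypothetical helper names, and the use of an `HF_TOKEN` environment variable is an assumption made for this example. It sends the same `Authorization: Bearer` header as the wget command above.

```python
import os
import shutil
import urllib.request

# URL copied from the wget command above.
SV3D_URL = "https://huggingface.co/stabilityai/sv3d/resolve/main/sv3d_p.safetensors"


def build_request(url, token):
    """Attach the same Authorization header that the wget command sends."""
    return urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})


def download(url, destination, token):
    """Stream the response to disk instead of loading the whole file into memory."""
    with urllib.request.urlopen(build_request(url, token)) as response:
        with open(destination, "wb") as out_file:
            shutil.copyfileobj(response, out_file)


def download_sv3d(destination="checkpoints/sv3d_p.safetensors"):
    """Read the token from the HF_TOKEN environment variable (assumed name)."""
    token = os.environ.get("HF_TOKEN")
    if not token:
        raise RuntimeError("Set the HF_TOKEN environment variable first.")
    download(SV3D_URL, destination, token)
```

Calling `download_sv3d()` from the repository root would place the file next to the GRM checkpoints, assuming the `checkpoints` directory already exists.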
# export cuda
export CUDA_HOME=/usr/local/cuda-11.8
export PATH=$CUDA_HOME/bin:$PATH
export LD_LIBRARY_PATH=$CUDA_HOME/lib64:$LD_LIBRARY_PATH
# text-to-3D
python test.py --prompt 'a car made out of cheese'
# image-to-3D with zero123plus-v1.1
python test.py --image_path examples/dragon2.png --model zero123plus-v1.1
# image-to-3D with zero123plus-v1.2
python test.py --image_path examples/dragon2.png --model zero123plus-v1.2 --fuse_mesh True --optimize_texture True
# image-to-3D with SV3D
python test.py --image_path examples/dragon2.png --model sv3d --fuse_mesh True --optimize_texture True
Add --fuse_mesh True to extract a textured mesh.
Add --optimize_texture True to optimize the texture of the extracted mesh.
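For batch runs it can be convenient to assemble these command lines programmatically. The sketch below builds an argument list for `test.py` using only the flags shown above (`--prompt`, `--image_path`, `--model`, `--fuse_mesh`, `--optimize_texture`). The helper `build_command` is a hypothetical name, and the rule that exactly one of prompt or image path is given is an assumption drawn from the examples, not from the script itself.

```python
import shlex


def build_command(prompt=None, image_path=None, model=None,
                  fuse_mesh=False, optimize_texture=False):
    """Assemble a test.py invocation using only the flags shown above."""
    # Assumption: a run is either text-to-3D or image-to-3D, never both.
    if (prompt is None) == (image_path is None):
        raise ValueError("Pass exactly one of prompt or image_path.")
    cmd = ["python", "test.py"]
    if prompt is not None:
        cmd += ["--prompt", prompt]
    else:
        cmd += ["--image_path", image_path]
    if model is not None:
        cmd += ["--model", model]
    if fuse_mesh:
        cmd += ["--fuse_mesh", "True"]
    if optimize_texture:
        cmd += ["--optimize_texture", "True"]
    return cmd


command = build_command(image_path="examples/dragon2.png", model="sv3d",
                        fuse_mesh=True, optimize_texture=True)
# shlex.join quotes arguments so the line can be pasted into a shell.
print(shlex.join(command))
```

The returned list can be passed to `subprocess.run(command, check=True)` from the repository root. Since texture optimization operates on the extracted mesh, it is probably only meaningful together with `fuse_mesh=True`.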
We provide an offline Gradio demo, which can be run with the following command:
python app.py
blender_demo.mp4
sparse-view.mp4
We thank the authors of the following amazing codebases:
- gaussian-splatting, and diff-gaussian-rasterization for depth rendering
- ARF
- zero123++
- Instant3D
- SV3D
- V3D
- nvdiffrast
- MVEdit
@article{xu2024grm,
author = {Xu, Yinghao and Shi, Zifan and Yifan, Wang and Chen, Hansheng and Yang, Ceyuan and Peng, Sida and Shen, Yujun and Wetzstein, Gordon},
title = {GRM: Large Gaussian Reconstruction Model for Efficient 3D Reconstruction and Generation},
journal = {arXiv preprint arXiv:2403.14621},
year = {2024},
}