Tao Hu, Fangzhou Hong, Ziwei Liu
S-Lab, Nanyang Technological University
Project Page · Paper · Video
We propose StructLDM, a structured latent diffusion model that learns 3D human generation from 2D images.
StructLDM generates diverse, view-consistent humans and supports different levels of controllable generation and editing, such as compositional generation by blending the five selected parts from a), and part-aware editing such as identity swapping, local clothing editing, and 3D virtual try-on. Note that generation and editing are clothing-agnostic, without conditioning on clothing types or masks.
Generations on RenderPeople.
NVIDIA GPUs are required for this project. We have trained and tested the code on NVIDIA V100. We recommend using Anaconda to manage the Python environments.
conda create --name structldm python=3.9
conda activate structldm
conda install pytorch==1.10.1 torchvision==0.11.2 cudatoolkit=11.1 -c pytorch
conda install -c fvcore -c iopath -c conda-forge fvcore iopath
conda install pytorch3d -c pytorch3d
pip install -r requirements.txt
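After installation, a quick sanity check can confirm that the key packages are importable and that PyTorch can see a GPU. The sketch below is not part of the repository: the helper name `check_environment` is hypothetical, and the package list simply mirrors the install commands above.

```python
import importlib.util


def check_environment(packages=("torch", "torchvision", "fvcore", "iopath", "pytorch3d")):
    """Return a dict mapping each top-level package name to True if it is installed."""
    return {name: importlib.util.find_spec(name) is not None for name in packages}


if __name__ == "__main__":
    status = check_environment()
    for name, found in status.items():
        print(f"{name}: {'ok' if found else 'MISSING'}")

    if status["torch"]:
        import torch

        # torch.cuda.is_available() reports whether a usable CUDA device is visible.
        print("torch version:", torch.__version__)
        print("CUDA available:", torch.cuda.is_available())
```

Run it inside the `structldm` environment; any package reported as MISSING points to the install step that needs to be repeated.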
Download the sample data, necessary assets, and pretrained models from OneDrive. Put the sample data in DATA_DIR/dataset/renderpeople, the pretrained models in DATA_DIR/result/trained_model, and the assets in DATA_DIR/asset. DATA_DIR is set to ./data by default.
Register and download SMPL models here. Put them in the folder DATA_DIR/asset/smpl_data.
The folder structure should look like this:
DATA_DIR
├── dataset/
│   └── renderpeople/
├── asset/
│   ├── smpl_data/
│   │   └── SMPL_NEUTRAL.pkl
│   ├── uv_sampler/
│   ├── uv_table.npy
│   ├── smpl_uv.obj
│   ├── smpl_template_sdf.npy
│   └── sample_data.pkl
└── result/
    ├── trained_model/modelname/
    │   ├── decoder_xx, diffusion_xx
    │   └── samples/
    └── test_output/
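Before running any scripts, it can help to confirm that the downloaded files landed where the code expects them. The following is a minimal sketch, not part of the repository: `missing_paths` and the `EXPECTED` list are hypothetical, and the relative paths reflect one reading of the folder structure above, so adjust them if your layout differs.

```python
from pathlib import Path

# Relative paths under DATA_DIR, read off the folder structure above (assumed layout).
EXPECTED = [
    "dataset/renderpeople",
    "asset/smpl_data/SMPL_NEUTRAL.pkl",
    "result/trained_model",
]


def missing_paths(data_dir, expected=EXPECTED):
    """Return the entries of `expected` that do not exist under `data_dir`."""
    root = Path(data_dir)
    return [rel for rel in expected if not (root / rel).exists()]


if __name__ == "__main__":
    # ./data is the default DATA_DIR mentioned in this README.
    missing = missing_paths("./data")
    if missing:
        print("Missing under ./data:")
        for rel in missing:
            print("  -", rel)
    else:
        print("All expected paths are present.")
```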
Generate 3D humans with a pretrained model (e.g., one trained on RenderPeople):
bash scripts/renderpeople.sh gpu_ids
The generation results will be found in DATA_DIR/result/test_output.
The training script for the latent diffusion model can be found in struct_diffusion.
bash struct_diffusion/scripts/exec.sh "train" gpu_ids
Trained models will be stored in DATA_DIR/result/trained_model/modelname/diffusion_xx.pt.
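To see which diffusion checkpoints a training run has produced, the model folder can be listed from Python. This is an illustrative sketch only: `find_diffusion_checkpoints` is a hypothetical helper, and the glob pattern assumes that the `xx` in `diffusion_xx.pt` is a placeholder for an arbitrary suffix. `modelname` is kept as the README's placeholder and should be replaced with your model's folder name.

```python
from pathlib import Path


def find_diffusion_checkpoints(model_dir):
    """Return all diffusion_*.pt files directly inside `model_dir`, sorted by name."""
    return sorted(Path(model_dir).glob("diffusion_*.pt"))


if __name__ == "__main__":
    # "modelname" is the placeholder used in this README, not a real folder name.
    for ckpt in find_diffusion_checkpoints("./data/result/trained_model/modelname"):
        print(ckpt)
```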
Refer to the downloaded sample data at ./data/dataset/renderpeople to prepare your own dataset, and modify the corresponding path in the config file.
The inference script for the latent diffusion model can be found in struct_diffusion.
bash struct_diffusion/scripts/test.sh gpu_ids
Samples will be stored in DATA_DIR/result/trained_model/modelname/samples.
Distributed under the S-Lab License. See LICENSE for more information.
SMPL-X related files are subject to the license of SMPL-X.
If you find our code or paper useful for your research, please consider citing:
@misc{hu2024structldm,
title={StructLDM: Structured Latent Diffusion for 3D Human Generation},
author={Tao Hu and Fangzhou Hong and Ziwei Liu},
year={2024},
eprint={2404.01241},
archivePrefix={arXiv},
primaryClass={cs.CV}
}
The structured diffusion model is implemented on top of Latent-Diffusion.