
[NeurIPS2024] Era3D: High-Resolution Multiview Diffusion using Efficient Row-wise Attention

This is the official implementation of Era3D: High-Resolution Multiview Diffusion using Efficient Row-wise Attention.

Teaser

(Video: teaser.mp4)

Create your digital portrait from a single image:

(Videos: result_clr_scale4_Yann_LeCun.mp4, result_clr_scale4_musk.mp4)

Installation

conda create -n Era3D python=3.9
conda activate Era3D

# torch
pip install torch==2.1.2 torchvision==0.16.2 torchaudio==2.1.2 --index-url https://download.pytorch.org/whl/cu118

# install xformers: first download the cu118 wheel from https://download.pytorch.org/whl/cu118
pip install xformers-0.0.23.post1-cp39-cp39-manylinux2014_x86_64.whl

# for reconstruction
pip install git+https://github.com/NVlabs/tiny-cuda-nn/#subdirectory=bindings/torch
pip install git+https://github.com/NVlabs/nvdiffrast

# other dependencies
pip install -r requirements.txt
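
Optionally (a quick sanity check, not part of the original instructions), verify that the CUDA builds of torch and xformers import cleanly:

python -c "import torch; print(torch.__version__, torch.cuda.is_available())"
python -c "import xformers; print(xformers.__version__)"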

Weights

You can download the model directly from Hugging Face, or fetch it with a Python script:

from huggingface_hub import snapshot_download
snapshot_download(repo_id="pengHTYX/MacLab-Era3D-512-6view", local_dir="./pengHTYX/MacLab-Era3D-512-6view/")
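
If you prefer the command line, the same snapshot can be fetched with the huggingface_hub CLI (an equivalent alternative, assuming a recent huggingface_hub version):

huggingface-cli download pengHTYX/MacLab-Era3D-512-6view --local-dir ./pengHTYX/MacLab-Era3D-512-6view/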

Inference

  1. Generate multiview color and normal images by running test_mvdiffusion_unclip.py. For example,
python test_mvdiffusion_unclip.py --config configs/test_unclip-512-6view.yaml \
    pretrained_model_name_or_path='pengHTYX/MacLab-Era3D-512-6view' \
    validation_dataset.crop_size=420 \
    validation_dataset.root_dir=examples \
    seed=600 \
    save_dir='mv_res'  \
    save_mode='rgb'

You can adjust crop_size (400 or 420) and seed (42 or 600) to obtain the best results for some cases.

  2. Typically, we use rembg to predict the alpha channel. If it produces artifacts, try Clipdrop to remove the background instead; see the sketch below.
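
A minimal sketch of rembg's Python API (the file paths are placeholders; rembg also ships a CLI):

from PIL import Image
from rembg import remove

# Predict the alpha channel with rembg and save an RGBA image.
image = Image.open("examples/your_image.png")   # hypothetical input path
rgba = remove(image)                            # background removed, RGBA output
rgba.save("examples/your_image_rgba.png")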

  3. Instant-NSR Mesh Extraction

cd instant-nsr-pl
bash run.sh $GPU $CASE $OUTPUT_DIR

For example,

bash run.sh 0 A_bulldog_with_a_black_pirate_hat_rgba  recon

The textured mesh will be saved in $OUTPUT_DIR.

Training

The original SD2.1-unCLIP base model employs a scaled-linear noise schedule during training. We modify it to a linear schedule for 512-resolution training. Specifically, add the following method to the SchedulerMixin class in /path/to/diffusers/schedulers/scheduling_utils.py:

@classmethod
@validate_hf_hub_args
def from_pretrained_linear(
    cls,
    pretrained_model_name_or_path: Optional[Union[str, os.PathLike]] = None,
    subfolder: Optional[str] = None,
    return_unused_kwargs=False,
    **kwargs,
):
    # Load the scheduler config exactly as from_pretrained would...
    config, kwargs, commit_hash = cls.load_config(
        pretrained_model_name_or_path=pretrained_model_name_or_path,
        subfolder=subfolder,
        return_unused_kwargs=True,
        return_commit_hash=True,
        **kwargs,
    )
    # ...then force a linear beta schedule before instantiating the scheduler.
    config['beta_schedule'] = 'linear'
    return cls.from_config(config, return_unused_kwargs=return_unused_kwargs, **kwargs)
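
With this patch in place, the training code can load the base model's scheduler while overriding its schedule. A hypothetical call (the concrete scheduler class and checkpoint are assumptions; the actual usage depends on the training script):

from diffusers import DDPMScheduler

# DDPMScheduler inherits from SchedulerMixin, so it picks up the patched classmethod.
noise_scheduler = DDPMScheduler.from_pretrained_linear(
    "stabilityai/stable-diffusion-2-1-unclip", subfolder="scheduler"
)
print(noise_scheduler.config.beta_schedule)  # -> 'linear'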

We strongly recommend using wandb for logging, so you need to export your personal API key:

export WANDB_API_KEY="$KEY$"

Then, begin training with:

accelerate launch --config_file node_config/8gpu.yaml train_mvdiffusion_unit_unclip.py --config configs/train-unclip-512-6view.yaml
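
node_config/8gpu.yaml is the repo's accelerate launcher config. If you need to adapt it to a different machine, the same single-node 8-GPU launch can be expressed with explicit launcher flags (the values below are assumptions about a typical setup, not the repo's exact config):

accelerate launch --multi_gpu --num_processes 8 --mixed_precision fp16 \
    train_mvdiffusion_unit_unclip.py --config configs/train-unclip-512-6view.yaml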

Related projects

We borrow code from the following projects. Thanks to the open-source community for their contributions!
diffusers
Wonder3D
Syncdreamer
Instant-nsr-pl

License

This project is under the AGPL-3.0 license, so any downstream solutions and products that include our code or the pretrained model must be open-sourced to comply with the AGPL conditions. If you have any questions about the usage of Era3D, please feel free to contact us.

Citation

If you find this codebase useful, please consider citing our work:

@article{li2024era3d,
  title={Era3D: High-Resolution Multiview Diffusion using Efficient Row-wise Attention},
  author={Li, Peng and Liu, Yuan and Long, Xiaoxiao and Zhang, Feihu and Lin, Cheng and Li, Mengfei and Qi, Xingqun and Zhang, Shanghang and Luo, Wenhan and Tan, Ping and others},
  journal={arXiv preprint arXiv:2405.11616},
  year={2024}
}
