Yicheng Yang, Pengxiang Li, Lu Zhang, Liqian Ma, Ping Hu, Siyu Du, Yunzhi Zhuge, Xu Jia, Huchuan Lu
- [24.11.27] Official release of paper and code!
Note: DreamMix requires a GPU with 24 GB of memory to run.
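If you want to verify this before installing, here is a minimal sketch (assuming PyTorch with CUDA is available) that prints the visible GPU's total memory:

```python
import torch

# Sanity check: DreamMix needs roughly 24 GB of GPU memory.
props = torch.cuda.get_device_properties(0)
print(f"{props.name}: {props.total_memory / 1024**3:.1f} GiB")
```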
- Clone the repository:

  ```bash
  git clone https://github.com/mycfhs/DreamMix.git
  cd DreamMix
  ```
- Prepare the environment:

  ```bash
  conda create -n DreamMix python=3.10
  conda activate DreamMix
  pip install -r requirements.txt
  ```
- Download the necessary models:
  - Fooocus inpaint v26 patch → place in `models/fooocus_inpaint`.
  - fooocus_inpaint_head → place in `models/fooocus_inpaint`.
  - Upscale model → place in `models/upscale_mode`.
  - (Optional) quality-improvement LoRA → place in `models/lora`.
- Install lang-segment-anything (see the usage sketch below):

  ```bash
  git clone https://github.com/mycfhs/lang-segment-anything
  cd lang-segment-anything
  pip install -e .
  ```
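lang-segment-anything produces segmentation masks from free-text prompts. As a rough usage sketch following the upstream `LangSAM` API (the fork pinned above may differ slightly; the image path and prompt here are hypothetical):

```python
from PIL import Image
from lang_sam import LangSAM

# Segment regions matching a free-text prompt.
model = LangSAM()
image = Image.open("example.jpg").convert("RGB")  # hypothetical input image
masks, boxes, phrases, logits = model.predict(image, "teapot")
```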
- Generate regular images using the `make_img.ipynb` notebook.
- Download the DreamBooth dataset here and get the `train_data` directory (expected layout sketched below).
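Judging from the `--instance_data_dir` argument in the training command below, `train_data` should contain one folder per category with an `image` subfolder, roughly like this (file names are hypothetical):

```
train_data/
└── teapot/
    └── image/
        ├── 00.jpg
        └── ...
```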
To begin training, use the following command:
CATEGORY="teapot"
INSTANCE_PROMPT="..."
accelerate launch train.py \
--category="${CATEGORY}" \
--output_dir="lora/${CATEGORY}" \
--regular_dir="./regular_${CATEGORY}" \
--regular_prob=0.3 \
--loss_reweight_object=1.5 \
--loss_reweight_background=0.6 \
--pretrained_model_name_or_path="frankjoshua/juggernautXL_v8Rundiffusion" \
--instance_data_dir="train_data/${CATEGORY}/image" \
--mixed_precision="no" \
--instance_prompt="${INSTANCE_PROMPT}" \
--resolution=1024 \
--train_batch_size=1 \
--gradient_accumulation_steps=4 \
--learning_rate=1e-4 \
--lr_scheduler="constant" \
--lr_warmup_steps=0 \
--max_train_steps=1000 \
--seed="0" \
--checkpointing_steps=250 \
--resume_from_checkpoint="latest" \
--enable_xformers_memory_efficient_attention
To perform inference, follow the instructions in `infer.ipynb`.
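The full DreamMix pipeline lives in that notebook; purely as a sketch of how the trained LoRA could be loaded on top of the base generator with Diffusers (the prompt and image files below are hypothetical placeholders, not part of the official workflow):

```python
import torch
from PIL import Image
from diffusers import StableDiffusionXLInpaintPipeline

# Base generator from the training command, plus the trained LoRA weights.
pipe = StableDiffusionXLInpaintPipeline.from_pretrained(
    "frankjoshua/juggernautXL_v8Rundiffusion", torch_dtype=torch.float16
).to("cuda")
pipe.load_lora_weights("lora/teapot")  # matches --output_dir for CATEGORY="teapot"

scene = Image.open("scene.png").convert("RGB")  # hypothetical scene to edit
mask = Image.open("mask.png").convert("L")      # white = region to inpaint
result = pipe(prompt="a photo of a teapot", image=scene, mask_image=mask).images[0]
result.save("result.png")
```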
If you find our work helpful, please consider giving us a ⭐ or citing our paper:
```bibtex
@misc{yang2024dreammix,
      title={DreamMix: Decoupling Object Attributes for Enhanced Editability in Customized Image Inpainting},
      author={Yicheng Yang and Pengxiang Li and Lu Zhang and Liqian Ma and Ping Hu and Siyu Du and Yunzhi Zhuge and Xu Jia and Huchuan Lu},
      year={2024},
      eprint={2411.17223},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2411.17223},
}
```
We extend our gratitude to the incredible open-source community. Our work is based on the following resources:
- Fooocus: thanks for their fantastic inpaint method, the Inpaint v26 Fooocus Patch.
- We use JuggernautXL v8 Rundiffusion as our base generator.
- Training code is based on the Diffusers SDXL DreamBooth example.
- Image samples are collected from Pixabay and the COCO dataset.
Here are the techniques we have incorporated from Fooocus:

- Blur Guidance: controlled with the `sharpness` parameter.
- ADM Scaler: the `adm_scaler_positive`, `adm_scaler_negative`, and `adm_scaler_end` parameters.
- Inpaint Worker: enhanced inpainting logic.
- Prompt Style Enhancement: improves prompt adaptability.
- Advanced Sampler & Scheduler: `Dpmpp2mSdeGpuKarras`.
- Hybrid Models: uses both the base and inpainting models across different timesteps (`fooocus_time`); see the conceptual sketch below.
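A purely conceptual sketch of the hybrid-model idea (all names here are hypothetical, including which model covers which phase; the real switching logic lives in the Fooocus-derived sampling code):

```python
def select_unet(progress, inpaint_unet, base_unet, fooocus_time=0.5):
    """Pick a model based on denoising progress in [0, 1].

    Assumption for illustration only: the inpainting model handles the
    early steps and the base model refines afterwards; the actual phase
    split and threshold come from the `fooocus_time` configuration.
    """
    return inpaint_unet if progress < fooocus_time else base_unet
```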
For questions or feedback, please reach out to us at mycf2286247133@gmail.com.