Basic Denoising Diffusion Probabilistic Model image generator implemented in PyTorch.
Reproduces the paper Denoising Diffusion Probabilistic Models.
Developed as an educational project, aiming for a simpler PyTorch implementation and development setup than other available DDPM implementations. It is small and lean enough to train on a commodity GPU (in my case, a GeForce 4070 Ti).
The basic idea is to train a model to learn how to denoise images. Images are generated by using this trained model to iteratively remove noise from a random noise image until a coherent image forms.
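The idea can be sketched in a few lines. This is a minimal, dependency-free illustration in plain Python rather than PyTorch, and it is not this repo's code. It assumes the linear noise schedule and T = 1000 steps from Ho et al. (2020), treats an image as a flat list of floats, and uses a hypothetical `predict_noise(x_t, t)` callable in place of the trained network. All function names are made up for illustration.

```python
import math
import random

T = 1000  # number of diffusion steps (paper default, assumed here)

# Linear beta schedule from 1e-4 to 0.02 (paper defaults, assumed here).
betas = [1e-4 + (0.02 - 1e-4) * t / (T - 1) for t in range(T)]
alphas = [1.0 - b for b in betas]

# alpha_bars[t] is the running product of alphas[0..t].
alpha_bars = []
running = 1.0
for a in alphas:
    running *= a
    alpha_bars.append(running)


def add_noise(x0, t, noise):
    """Forward process: jump straight from a clean image x0 to step t."""
    signal = math.sqrt(alpha_bars[t])
    sigma = math.sqrt(1.0 - alpha_bars[t])
    return [signal * x + sigma * n for x, n in zip(x0, noise)]


def denoise_step(x_t, t, predicted_noise, rng=random):
    """Reverse process: one sampling step from step t to step t - 1."""
    coef = betas[t] / math.sqrt(1.0 - alpha_bars[t])
    scale = math.sqrt(alphas[t])
    mean = [(x - coef * e) / scale for x, e in zip(x_t, predicted_noise)]
    if t == 0:
        return mean  # no noise is added on the final step
    sigma = math.sqrt(betas[t])
    return [m + sigma * rng.gauss(0.0, 1.0) for m in mean]


def sample(predict_noise, num_pixels, rng=random):
    """Start from pure noise and denoise for T steps."""
    x = [rng.gauss(0.0, 1.0) for _ in range(num_pixels)]
    for t in reversed(range(T)):
        x = denoise_step(x, t, predict_noise(x, t), rng)
    return x
```

With a perfect noise prediction, a single `denoise_step` at `t = 0` recovers the clean image, which makes a handy sanity check for the formulas.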
Two pretrained models are provided in the checkpoints/ directory: one for Fashion MNIST and one for an 11k Pokemon dataset.
- Reproducible environment with rye. Get set up with a single command.
- Automatic dataset download and preprocessing for certain preloaded datasets.
- Example notebook for sampling and gif generation.
- Train on your own dataset by providing image files in a --data-dir directory.
Fashion MNIST sample generations
Pokemon 11k sample generations
NOTE: With small images and many training epochs, the model likely overfits and becomes able to memorize training samples.
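One rough way to check for memorization is to compare each generated sample with its nearest training image. The sketch below is an illustration under stated assumptions, not part of diffumon: images are flat lists of floats, distance is plain squared L2 in pixel space, and the function names and the `threshold` value are hypothetical.

```python
def nearest_training_match(sample, training_images):
    """Return (index, squared L2 distance) of the closest training image."""
    best_index, best_dist = -1, float("inf")
    for i, image in enumerate(training_images):
        dist = sum((a - b) ** 2 for a, b in zip(sample, image))
        if dist < best_dist:
            best_index, best_dist = i, dist
    return best_index, best_dist


def looks_memorized(sample, training_images, threshold=1e-3):
    """Flag a sample whose nearest training image is almost identical.

    The threshold is an arbitrary illustrative value; tune it per dataset.
    """
    _, dist = nearest_training_match(sample, training_images)
    return dist < threshold
```

Pixel-space distance is a crude measure, so treat a flagged sample as a prompt to look at the two images side by side rather than as proof.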
This repo uses rye as the package/environment manager. Make sure to install it before proceeding.
The following commands install the packages and set up a virtual environment:
# Install packages
rye sync
# Activate virtual environment
. .venv/bin/activate
Once installed, the model can be trained and used via the diffumon command:
diffumon --help
diffumon train --help
diffumon train --preloaded fashion_mnist --num-epochs 100 --checkpoint-path checkpoints/fashion_mnist_100epochs.pth
diffumon train --preloaded pokemon_11k --num-epochs 800 --img-dim 64 --batch-size 64 --checkpoint-path checkpoints/pokemon_11k_800epochs_64dim.pth
diffumon train --data-dir /path/to/dataset --num-epochs 100 --checkpoint-path checkpoints/my_dataset_100_epochs.pth
Here, /path/to/dataset should have a directory structure like the following:
/path/to/dataset/
    train/
        class_0/
            img_0.png
            img_1.png
    test/
        class_0/
            img_0.png
            img_1.png
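Before starting a long training run, it can help to confirm that a dataset directory follows this layout. The helper below is a hypothetical sketch, not part of diffumon: the function name and the set of accepted image suffixes are assumptions (the layout above only shows .png files), and the checks diffumon itself performs may differ.

```python
from pathlib import Path

IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg"}  # assumed; adjust as needed


def check_dataset_layout(root):
    """Return a list of problems found under root; empty means the layout looks right."""
    problems = []
    root = Path(root)
    for split in ("train", "test"):
        split_dir = root / split
        if not split_dir.is_dir():
            problems.append(f"missing directory: {split_dir}")
            continue
        # Each split should contain at least one class subdirectory.
        class_dirs = [d for d in sorted(split_dir.iterdir()) if d.is_dir()]
        if not class_dirs:
            problems.append(f"no class subdirectories in: {split_dir}")
        # Each class subdirectory should contain at least one image file.
        for class_dir in class_dirs:
            has_image = any(
                p.suffix.lower() in IMAGE_SUFFIXES for p in class_dir.iterdir()
            )
            if not has_image:
                problems.append(f"no images in: {class_dir}")
    return problems
```

Calling `check_dataset_layout("/path/to/dataset")` returns an empty list when both splits exist and every class directory holds at least one image.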
diffumon sample --help
diffumon sample --checkpoint-path checkpoints/fashion_mnist_100epochs.pth --num-samples 32 --output-dir samples/fashion_mnist_100epochs
diffumon sample --checkpoint-path checkpoints/pokemon_11k_800epochs_32dim.pth --num-samples 32 --output-dir samples/pokemon_11k_800epochs_32dim
- Denoising Diffusion Probabilistic Models - The original paper by Ho et al. (2020)
- diffusion on GitHub - The official codebase by the authors.
- Improved Denoising Diffusion Probabilistic Models - Improved methodology by Nichol and Dhariwal (2021)
- What are Diffusion Models - By Lilian Weng - Math heavy blog post explaining the concept.
- Tutorial on Diffusion Models for Imaging and Vision - Tutorial by Stanley Chan which succinctly explains the math quite well.
- The Annotated Diffusion Model - Basic diffusion tutorial based on lucidrains' PyTorch implementation. This was the most-used reference for this project!
- lucidrains denoising-diffusion-pytorch - Ports Jonathan Ho's original code to PyTorch along with many of the original implementation's quirks. This was used as the primary code reference for this project.
black, ruff, isort, and pre-commit should come preinstalled as developer packages in the virtual environment.
It's strongly recommended to install the pre-commit hooks, which automatically run the formatters (but not the linters) before each commit to keep the code consistent.
pre-commit install
There are also example notebook(s) in the notebooks/ directory.
Make sure to install the diffumon kernel in Jupyter to run the notebooks.
python -m ipykernel install --user --name diffumon --display-name "Python Diffumon"
- Add support for more preloaded datasets
- Add smarter periodic checkpointing
- Add logging
- Improve learning rate scheduling
- Add DDIM (Denoising Diffusion Implicit Models) support
- Add (Hydra-based?) preconfigured training options