QRes-VAE (Quantized ResNet VAE) is a neural network model for lossy image compression. It is based on the ResNet VAE architecture.
Paper: Lossy Image Compression with Quantized Hierarchical VAEs, WACV 2023 Best Paper Award (Algorithms track)
Arxiv: https://arxiv.org/abs/2208.13056
- Progressive coding: the QRes-VAE model learns a hierarchy of features. It compresses/decompresses images in a coarse-to-fine fashion.
Note: images below are from the CelebA dataset and COCO dataset, respectively.
- Lossy compression efficiency: the QRes-VAE model achieves competitive rate-distortion performance, especially at higher bit rates.
Requirements:
- Python, pytorch>=1.9, tqdm, compressai, timm>=0.5.4.
- Code has been tested in all of the following environments:
  - Both Windows and Linux, with Intel CPUs and Nvidia GPUs
  - Python 3.9
  - pytorch=1.9, 1.10, 1.11 with CUDA 11.3
  - pytorch=1.12 with CUDA 11.6. This setup is recommended. Models run faster (both training and testing) in this setup than in previous ones.
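A quick way to confirm that an environment meets the requirements above is to query the installed package versions. The following is a minimal sketch, not part of the repository: the helper names `parse_version` and `check_requirements` are hypothetical, and it assumes PyTorch is installed under the PyPI name `torch`.

```python
from importlib import metadata

# Minimum versions from the requirements list above; None means no stated
# minimum. Assumption: PyTorch is distributed under the name "torch".
REQUIRED = {"torch": (1, 9), "tqdm": None, "compressai": None, "timm": (0, 5, 4)}


def parse_version(text):
    """Turn '1.12.0+cu116' into (1, 12, 0); stop at the first non-numeric part."""
    parts = []
    for piece in text.split("+")[0].split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def check_requirements(required=REQUIRED):
    """Return {package: status} using metadata only (packages are not imported)."""
    report = {}
    for name, minimum in required.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "missing"
            continue
        if minimum is not None and parse_version(installed) < minimum:
            report[name] = f"too old ({installed})"
        else:
            report[name] = f"ok ({installed})"
    return report


if __name__ == "__main__":
    for name, status in check_requirements().items():
        print(f"{name}: {status}")
```

The check reads package metadata only, so it does not verify that CUDA is available or that the packages import cleanly.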
Download:
- Download the repository;
- Download the pre-trained model checkpoints and put them in the checkpoints folder. See checkpoints/README.md for expected folder structure.
  - QRes-VAE (34M) [Google Drive]: our main model for natural image compression.
  - QRes-VAE (17M) [Google Drive]: a smaller model trained on CelebA dataset for ablation study.
  - QRes-VAE (34M, lossless) [Google Drive]: a lossless compression model. Better than PNG but not as good as WebP.
The lmb in the name of folders is the multiplier for MSE during training, i.e., loss = rate + lmb * mse. A larger lmb produces a higher bit rate but lower distortion.
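The effect of lmb can be illustrated with a small sketch of the objective stated above. The two operating points and the lmb values below are made-up numbers chosen only for illustration (they are not measurements from the model), and `rate_distortion_loss` is a hypothetical helper name.

```python
def rate_distortion_loss(rate, mse, lmb):
    """Training objective as stated above: loss = rate + lmb * mse."""
    return rate + lmb * mse


# Hypothetical operating points (rate, mse); values are illustrative only.
low_rate_point = (0.2, 0.004)   # fewer bits, more distortion
high_rate_point = (0.8, 0.001)  # more bits, less distortion


def preferred_point(lmb):
    """Return whichever hypothetical point has the smaller loss for this lmb."""
    candidates = [low_rate_point, high_rate_point]
    return min(candidates, key=lambda p: rate_distortion_loss(p[0], p[1], lmb))


# A small lmb penalizes distortion weakly, so the low-rate point wins;
# a large lmb penalizes distortion heavily, so the high-rate point wins.
print(preferred_point(16))
print(preferred_point(2048))
```

With these numbers the preference flips at lmb = (0.8 - 0.2) / (0.004 - 0.001) = 200, which is the sense in which a larger lmb trades a higher bit rate for lower distortion.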
- Compression and decompression (lossy): see demo.ipynb
- Compression and decompression (lossless): experiments/demo-lossless.ipynb
- Progressive decoding: experiments/progressive-decoding.ipynb
- Sampling: experiments/uncond-sampling.ipynb
- Latent space interpolation: experiments/latent-interpolation.ipynb
- Inpainting: experiments/inpainting.ipynb
- Rate-distortion: python evaluate.py --root /path/to/dataset
- BD-rate: experiments/bd-rate.ipynb
- Estimate end-to-end FLOPs: experiments/estimate-flops.ipynb
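For readers unfamiliar with rate-distortion evaluation, the two quantities usually reported are bits per pixel (bpp) and PSNR. The sketch below uses their standard definitions; whether evaluate.py computes or reports them in exactly this form is an assumption, and the function names are hypothetical.

```python
import math


def bits_per_pixel(num_bytes, height, width):
    """Rate: size of the compressed bitstream in bits, divided by pixel count."""
    return 8.0 * num_bytes / (height * width)


def psnr(mse, max_value=255.0):
    """Distortion: peak signal-to-noise ratio in dB for a given mean squared error.

    max_value is the peak pixel value (255 for 8-bit images, 1.0 for
    images scaled to [0, 1]).
    """
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


# Example: a 512x768 image compressed to 49152 bytes.
print(bits_per_pixel(49152, 512, 768))
# Example: MSE of 0.01 on images scaled to [0, 1].
print(psnr(0.01, max_value=1.0))
```

One (bpp, PSNR) pair per lmb checkpoint gives one point on a rate-distortion curve.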
We provide training instructions for QRes-VAE in our new project repository: https://github.com/duanzhiihao/lossy-vae/tree/main/lvae/models/qresvae
The code has a non-commercial license, as found in the LICENSE file.
@article{duan2023qres,
title={Lossy Image Compression with Quantized Hierarchical VAEs},
author={Duan, Zhihao and Lu, Ming and Ma, Zhan and Zhu, Fengqing},
journal={Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision},
pages={198--207},
year={2023},
month=Jan
}