Inspired by LIIF, DiffIR, and IDM, we fuse an implicit neural representation (INR) with latent diffusion for arbitrary-scale, efficient, and realistic image super-resolution (SR). Our model is trained in two stages. In the first stage, the model learns to encode a prior from the HR image; by injecting this prior into the latent representation of the LR image, LIIF can upsample the LR image more accurately. In the second stage, the prior encoding module is replaced with a diffusion module, so that sampling the prior from the diffusion module generates plausible details for SR images. Moreover, since the diffusion process runs on the prior rather than on the image, it requires less computation and fewer denoising steps than IDM. To alleviate the over-smoothing problem, we also train with a GAN loss. Quantitative and qualitative results are as follows.
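The stage-1 idea above (encode a prior from the HR image, inject it into the LR latent before LIIF-style upsampling) can be sketched as a toy example. All names, shapes, and the modulation scheme below are illustrative assumptions, not the actual model:

```python
# Toy NumPy sketch of stage-1 prior injection (illustrative only):
# a prior vector encoded from the HR image modulates the LR latent
# channel-wise before continuous upsampling.
import numpy as np

rng = np.random.default_rng(0)

def encode_prior(hr_image, dim=16):
    # Stand-in for the prior encoder: global average pooling
    # followed by a random linear projection.
    pooled = hr_image.mean(axis=(0, 1))           # (C,)
    w = rng.standard_normal((pooled.size, dim))
    return pooled @ w                             # (dim,)

def inject_prior(lr_latent, prior):
    # Feature-wise modulation: the prior predicts a per-channel
    # scale and shift applied to the LR latent.
    c = lr_latent.shape[-1]
    scale = np.tanh(prior[:c])                    # assumes dim >= c
    shift = np.tanh(prior[-c:])
    return lr_latent * (1.0 + scale) + shift

hr = rng.random((64, 64, 3))          # HR image, H x W x C
lr_latent = rng.random((16, 16, 8))   # LR feature map from an encoder
prior = encode_prior(hr, dim=16)
modulated = inject_prior(lr_latent, prior)
print(modulated.shape)  # (16, 16, 8): same shape, now prior-conditioned
```

In stage 2, the point of the design is that `encode_prior` (which needs the HR image) is swapped for a diffusion model that samples the prior vector at test time, so only this low-dimensional prior is denoised rather than the full image.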
We use DIV2K and the standard benchmark datasets: Set5, Set14, B100, and Urban100.
`mkdir load` for putting the dataset folders.
- DIV2K: `mkdir` and `cd` into `load/div2k`. Download and `unzip` the Train_HR, Valid_HR, Valid_LR_X2, Valid_LR_X3, and Valid_LR_X4 archives (provided by the DIV2K website). `mv` the `X4/`, `X3/`, and `X2/` folders of Valid_LR into a single `DIV2K_valid_LR_bicubic` folder.
- Benchmark: `cd` into `load/`. Download and `tar -xf` the benchmark datasets (provided by this repo).
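The layout steps above can be sketched as shell commands. Download URLs are omitted, and the commented archive/folder names are assumptions based on the official DIV2K naming scheme:

```shell
# Create the data root expected by the configs.
mkdir -p load/div2k

# DIV2K: after downloading from the DIV2K website, unzip inside load/div2k,
# e.g. (archive names assumed from the official naming scheme):
# unzip DIV2K_train_HR.zip -d load/div2k
# unzip DIV2K_valid_HR.zip -d load/div2k
# unzip DIV2K_valid_LR_bicubic_X2.zip -d load/div2k   # likewise X3, X4

# Merge the X2/, X3/, X4/ validation LR folders into a single directory.
mkdir -p load/div2k/DIV2K_valid_LR_bicubic
# mv load/div2k/X2 load/div2k/X3 load/div2k/X4 load/div2k/DIV2K_valid_LR_bicubic/

# Benchmark: extract the provided tarball inside load/.
# tar -xf benchmark.tar -C load/
```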
- Stage 1

  `python train.py --config configs/train-two-stage/train_stage1.yaml`
- Stage 2

  `python train.py --config configs/train-two-stage/train_stage2.yaml`
- DIV2K

  `bash scripts/test-div2k.sh [MODEL_PATH] [GPU]`
- Benchmark

  `bash scripts/test-benchmark.sh [MODEL_PATH] [GPU]`
`python demo.py --input [IMAGE_PATH] --model [MODEL_PATH] --scale [SCALE_NUM]`