This project upscales blurry, low-resolution images using ESRGAN.
- ESRGAN is an enhanced version of SRGAN.
- Python 3
- PyTorch >= 1.0 (CUDA version >= 7.5 if installing with CUDA)
- Python packages:
pip install numpy opencv-python glob2
- Clone this GitHub repo:
git clone https://github.com/xinntao/ESRGAN
cd ESRGAN
- Place your own low-resolution images in the ./LR folder. (There are two sample images: baboon and comic.)
- Download the pretrained models from Google Drive and place them in ./models. Two models are provided: one with high perceptual quality (ESRGAN) and one with high PSNR performance (RRDB_PSNR); see the model list.
- Run the test. You can configure which model to use in test.py:
python test.py
- The results are saved in the ./results folder.
Note: I trained this model on the CPU (I don't own a GPU), so to use a GPU you need to change the device value in test.py.
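For reference, the device selection in test.py looks roughly like the following (the exact comment and ordering may differ between versions of the script):

```python
import torch

# Select the device used for inference; switch between the two lines as needed.
device = torch.device('cpu')    # CPU-only machines
# device = torch.device('cuda') # uncomment to run on an NVIDIA GPU with CUDA
```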
Before diving into ESRGAN, let's first get a high-level understanding of GANs. GANs can generate fake data that looks realistic, and one of their applications is enhancing image quality. At a high level, a GAN consists of two networks: a generator and a discriminator. The generator tries to produce fake data, while the discriminator tries to distinguish real data from fake, pushing the generator to produce ever more realistic outputs.
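To make the generator/discriminator game concrete, here is a minimal toy sketch in PyTorch; the two tiny networks and the random "real" data are purely illustrative and unrelated to the actual ESRGAN models:

```python
import torch
import torch.nn as nn

# Toy generator (noise -> fake sample) and discriminator (sample -> realness logit)
G = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 2))
D = nn.Sequential(nn.Linear(2, 32), nn.LeakyReLU(0.2), nn.Linear(32, 1))

bce = nn.BCEWithLogitsLoss()
opt_g = torch.optim.Adam(G.parameters(), lr=1e-3)
opt_d = torch.optim.Adam(D.parameters(), lr=1e-3)

real = torch.randn(64, 2) * 0.5 + 3.0   # stand-in "real" data

for step in range(100):
    # Discriminator step: score real samples as 1 and generated samples as 0
    fake = G(torch.randn(64, 16)).detach()
    d_loss = bce(D(real), torch.ones(64, 1)) + bce(D(fake), torch.zeros(64, 1))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()

    # Generator step: try to make the discriminator score fakes as real
    fake = G(torch.randn(64, 16))
    g_loss = bce(D(fake), torch.ones(64, 1))
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()
```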
The overall architecture of ESRGAN is the same as SRGAN with some modifications. ESRGAN uses the Residual-in-Residual Dense Block (RRDB), which combines a multi-level residual network with dense connections and drops batch normalization (a sketch follows the list of improvements below).
ESRGAN improves on SRGAN in three aspects:
- adopt a deeper model using Residual-in-Residual Dense Block (RRDB) without batch normalization layers.
- employ Relativistic average GAN instead of the vanilla GAN.
- improve the perceptual loss by using the features before activation.
In contrast to SRGAN, which claimed that deeper models are increasingly difficult to train, the deeper ESRGAN model shows superior performance and is easy to train.
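A rough PyTorch sketch of the RRDB building block described above; the channel sizes (nf=64, gc=32) and the 0.2 residual scaling follow the commonly used configuration, but treat this as an illustration rather than the repo's exact code:

```python
import torch
import torch.nn as nn

class ResidualDenseBlock(nn.Module):
    """Five 3x3 convs with dense connections and LeakyReLU, no batch norm."""
    def __init__(self, nf=64, gc=32):
        super().__init__()
        # conv i sees the input plus all previous intermediate features
        self.convs = nn.ModuleList(
            [nn.Conv2d(nf + i * gc, gc if i < 4 else nf, 3, 1, 1) for i in range(5)]
        )
        self.lrelu = nn.LeakyReLU(0.2, inplace=True)

    def forward(self, x):
        feats = [x]
        for i, conv in enumerate(self.convs):
            out = conv(torch.cat(feats, dim=1))
            if i < 4:
                out = self.lrelu(out)
            feats.append(out)
        return x + 0.2 * out          # residual scaling

class RRDB(nn.Module):
    """Residual-in-Residual Dense Block: three dense blocks inside an outer residual."""
    def __init__(self, nf=64, gc=32):
        super().__init__()
        self.blocks = nn.Sequential(*[ResidualDenseBlock(nf, gc) for _ in range(3)])

    def forward(self, x):
        return x + 0.2 * self.blocks(x)
```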
Instead of the standard discriminator, ESRGAN uses a relativistic discriminator, which tries to predict the probability that a real image is relatively more realistic than a fake one.
from keras import backend as K

# Adversarial losses on the discriminator outputs (labels: real = 1, fake = 0)
dis_loss = K.mean(K.binary_crossentropy(K.ones_like(real_logits), real_logits)
                  + K.binary_crossentropy(K.zeros_like(fake_logits), fake_logits))
gen_loss = K.mean(K.binary_crossentropy(K.ones_like(fake_logits), fake_logits)
                  + K.binary_crossentropy(K.zeros_like(real_logits), real_logits))
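The snippet above applies plain binary cross-entropy to the raw scores. In the relativistic average formulation, each logit is first compared against the mean logit of the opposite class. A minimal sketch, assuming real_logits and fake_logits are the raw discriminator outputs for a batch of real and generated images (the placeholder constants only make the snippet runnable):

```python
from keras import backend as K

# Placeholder raw discriminator scores; in practice these come from the discriminator network
real_logits = K.constant([[1.3], [0.6]])
fake_logits = K.constant([[-0.4], [0.2]])

# Relativistic average logits: how much more realistic than the average opposite sample?
real_rel = real_logits - K.mean(fake_logits)
fake_rel = fake_logits - K.mean(real_logits)

# Discriminator: real should be relatively more realistic (label 1), fake relatively less (label 0)
dis_loss = K.mean(K.binary_crossentropy(K.ones_like(real_rel), real_rel, from_logits=True)
                  + K.binary_crossentropy(K.zeros_like(fake_rel), fake_rel, from_logits=True))

# Generator: the reverse labels, pushing fakes to look relatively more realistic than reals
gen_loss = K.mean(K.binary_crossentropy(K.zeros_like(real_rel), real_rel, from_logits=True)
                  + K.binary_crossentropy(K.ones_like(fake_rel), fake_rel, from_logits=True))
```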
A more effective perceptual loss is obtained by constraining the features before the activation function rather than after it.
import tensorflow as tf
from keras.applications.vgg19 import preprocess_input

# vgg: a VGG19 feature extractor truncated before the activation
original_feature = vgg(preprocess_input(img_hr))    # features of the ground-truth HR image
generated_feature = vgg(preprocess_input(gen_hr))   # features of the super-resolved output
percept_loss = tf.losses.mean_squared_error(original_feature, generated_feature)
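The Keras snippet above assumes vgg already returns pre-activation features. With torchvision (which the repo itself builds on), a "before activation" conv5_4 feature extractor is often built by cutting VGG19 just before its last ReLU; a hedged sketch:

```python
import torch.nn.functional as F
import torchvision

# VGG19 features up to conv5_4, cut before its ReLU (index 35 is that ReLU, so [:35] stops just short of it)
# Older torchvision uses pretrained=True; newer versions prefer the weights=... argument
vgg = torchvision.models.vgg19(pretrained=True).features[:35].eval()
for p in vgg.parameters():
    p.requires_grad = False   # the feature extractor stays frozen

def perceptual_loss(sr, hr):
    """Distance between pre-activation VGG features of the SR output and the HR target."""
    return F.l1_loss(vgg(sr), vgg(hr))
```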
ESRGAN also uses a network interpolation strategy to balance visual quality and PSNR. As the interpolation parameter changes from 0 to 1, the output transitions smoothly from the RRDB_PSNR model to the fine-tuned ESRGAN model, giving continuous control over the trade-off between the two.
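The interpolation itself is just a per-parameter linear blend of the two checkpoints (the repo ships a small script for this); a sketch, with illustrative checkpoint filenames:

```python
import torch
from collections import OrderedDict

alpha = 0.8   # 0 = pure RRDB_PSNR, 1 = pure ESRGAN; adjust to taste

# Filenames are illustrative; use whichever checkpoints you placed in ./models
net_PSNR = torch.load('./models/RRDB_PSNR_x4.pth')
net_ESRGAN = torch.load('./models/RRDB_ESRGAN_x4.pth')

# Linearly blend every parameter tensor of the two checkpoints
net_interp = OrderedDict(
    (k, (1 - alpha) * v + alpha * net_ESRGAN[k]) for k, v in net_PSNR.items()
)
torch.save(net_interp, './models/interp_{:02d}.pth'.format(int(alpha * 10)))
```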
ESRGAN scales the low-resolution (LR) image to a high-resolution (HR) image with an upscaling factor of 4. For optimization, the Adam optimizer is used with default values.
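A common way to realize the x4 factor is two successive x2 nearest-neighbour upsampling steps, each followed by a convolution, at the end of the generator. The sketch below is illustrative (the UpsamplerX4 name and layer sizes are assumptions), with the Adam optimizer left at its defaults:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class UpsamplerX4(nn.Module):
    """Illustrative x4 tail: two nearest-neighbour x2 steps, each followed by a conv."""
    def __init__(self, nf=64):
        super().__init__()
        self.up1 = nn.Conv2d(nf, nf, 3, 1, 1)
        self.up2 = nn.Conv2d(nf, nf, 3, 1, 1)
        self.lrelu = nn.LeakyReLU(0.2, inplace=True)

    def forward(self, x):
        x = self.lrelu(self.up1(F.interpolate(x, scale_factor=2, mode='nearest')))
        x = self.lrelu(self.up2(F.interpolate(x, scale_factor=2, mode='nearest')))
        return x

# Adam with its default hyper-parameters, as used for training
optimizer = torch.optim.Adam(UpsamplerX4().parameters())
```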
We have seen how ESRGAN improves on its predecessor, SRGAN, and, practically, how to set it up and run it on your local machine. For more information, see the original paper.
@xinntao