amirdy/colorizing-BW-images

Preview | Details of Implementation | Networks | Hyperparameters - Losses - Tools | Results | References | Useful Resources

The goal of this project is to colorize black-and-white portraits. It is based on deep learning concepts and uses Pix2Pix, a general-purpose GAN approach for image-to-image translation.

Preview

Details of Implementation

Dataset

The dataset consists of about 5,000 images of people, including family portraits, wedding portraits, and so on.

The images come in varying dimensions; the average is about 636 x 788 pixels.

Preprocessing

1- Images that were already black and white were removed from the dataset.

2- All images were resized to 768 x 768.

3- The RGB images were converted to the Lab color space.

  • Why? In the Lab color space, only the second (a) and third (b) channels carry color information. The generator therefore only has to predict two channels, and the L-channel can be passed through unchanged, which makes the problem noticeably easier.

4- Finally, the values were normalized:

  • Every value x in the L-channel was rescaled to [-1, 1].

  • Every value x in the a-channel or b-channel was rescaled to [-1, 1].

  • (These two steps yield a 3-channel tensor of the image in Lab color space in which every value lies in the range [-1, 1].)
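The normalization formulas themselves were shown as images in the original README and are not reproduced here. The sketch below assumes the common Lab convention (L in [0, 100], a and b in [-128, 127]); the project's exact constants may differ.

```python
import numpy as np

def normalize_lab(lab):
    """Scale a Lab image of shape (H, W, 3) so every channel lies in [-1, 1].

    Assumption: L in [0, 100] and a, b in [-128, 127] (the standard Lab
    ranges); the README's exact formulas were not given in text form.
    """
    lab = lab.astype(np.float32)
    out = np.empty_like(lab)
    out[..., 0] = lab[..., 0] / 50.0 - 1.0  # L: [0, 100] -> [-1, 1]
    out[..., 1:] = lab[..., 1:] / 128.0     # a, b: [-128, 127] -> roughly [-1, 1]
    return out
```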

Networks

Generator :

  • Goal:

    • Takes the L-channel of the image as an input and generates the ab-channels as an output.
  • Input:

    • A batch of images [L-channel]:
      • A tensor of shape : (Batch size, 1, 768, 768)
  • Output:

    • The predicted values [ab-channels]:

      • A tensor of shape : (Batch size, 2, 768, 768)
  • Structure:

    • Unet [1]

      • Encoder : EfficientNet-b5 (ImageNet pre-trained weights were used).

      • The input channels were set to 1 (the L-channel).

      • The output channels were set to 2 (the ab-channels).

    The L-channel is concatenated with the ab-channels produced by the generator to obtain the colored image; in that sense, the generator generates the colored image.

Discriminator :

  • Goal:

    • Takes an image (in Lab color space) and produces a 2-dimensional matrix (94 x 94) in which each value indicates whether the corresponding patch of the image is real or fake (a number giving the probability of being real).
  • In the Training phase of the Discriminator:

    • Input

      • A batch of images that contains the original images (in a Lab color space) and the images generated from the generator:
        • A tensor of shape : (Batch size*2, 3, 768, 768)
    • Output:

      • The generated matrix:
        • A tensor of shape : (Batch size*2, 1, 94, 94)
  • In the Training phase of the Generator:

    • Input

      • A batch of images generated from the generator:
        • A tensor of shape : (Batch size, 3, 768, 768)
    • Output:

      • The generated matrix:
        • A tensor of shape : (Batch size, 1, 94, 94)
  • Structure:

       (1) Conv → BN → ReLU
       (2) Conv → BN → ReLU 
       (3) Conv → BN → ReLU 
       (4) Conv → BN → ReLU 
       (5) Conv → Sigmoid
       
       (Please check the source code for more information, especially the number of filters and their sizes)
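The five-layer structure above can be sketched in PyTorch. The actual filter counts and kernel sizes are in the project's source code; the values below are assumptions chosen only so that a 768 x 768 Lab input yields the 94 x 94 output map described earlier.

```python
import torch
import torch.nn as nn

def conv_bn_relu(c_in, c_out, k, s, p):
    """One Conv -> BN -> ReLU block from the structure above."""
    return nn.Sequential(
        nn.Conv2d(c_in, c_out, k, stride=s, padding=p),
        nn.BatchNorm2d(c_out),
        nn.ReLU(inplace=True),
    )

# PatchGAN-style discriminator; channel counts are illustrative assumptions.
discriminator = nn.Sequential(
    conv_bn_relu(3, 16, 4, 2, 1),   # (1) 768 -> 384
    conv_bn_relu(16, 32, 4, 2, 1),  # (2) 384 -> 192
    conv_bn_relu(32, 64, 4, 2, 1),  # (3) 192 -> 96
    conv_bn_relu(64, 64, 3, 1, 0),  # (4) 96  -> 94
    nn.Conv2d(64, 1, 1),            # (5) 94  -> 94
    nn.Sigmoid(),                   # per-patch probability of being real
)

x = torch.randn(2, 3, 768, 768)     # a batch of Lab images
with torch.no_grad():
    patches = discriminator(x)      # per-patch real/fake probabilities
```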

Hyperparameters - Losses - Tools

  • Batch size:

    • 3
  • Train vs Test Split:

    • Approximately : 0.9 | 0.1
  • In the Training phase of the Discriminator:

    • Optimizer:
      • ADAM
        • Learning Rate : 0.001
    • Loss:
      • 0.5 · BCE(Discriminator(X), Y)

        • If X is a real image, then Y is a tensor in which all the values are 1.
        • If X is a fake image (the image is derived from the generator), then Y is a tensor in which all the values are 0.
  • In the Training phase of the Generator:

    • Optimizer:
      • ADAM
        • Learning Rate : 0.001
    • Loss:
      • BCE(Discriminator(X_generator), 1_T) + 10 · |X_real - X_generator|_1

        • 1_T is a tensor in which all the values are 1.
        • X_real is the real image, and X_generator is the fake colored version of X_real produced by the generator.
  • Tools:

    • Python - Pytorch (Using Google Colab Pro)

Results

Training finished after 46 epochs (on a Tesla T4). Every three epochs, the weights of both networks were saved. After reviewing the test images generated by the generator at each of these checkpoints, the epoch that led to the best images was selected: at epoch 27, the generator produced the most realistic test images.

The Generator and Discriminator Losses (for Train images) at each epoch are shown in the figures below.

[Figures: Generator loss and Discriminator loss per epoch (training set)]

References

[1] Segmentation Models Pytorch - Pavel Iakubovskii

Useful Resources

[1] Understanding PatchGAN

[2] How to Develop a Pix2Pix GAN for Image-to-Image Translation

[3] Colorizing black & white images
