Translating an input image into a corresponding output image is a common problem in image processing, graphics, and vision. Although the scenario is always the same, i.e. mapping pixels to pixels, these problems are frequently solved with application-specific techniques. Conditional adversarial networks are a general-purpose approach that appears to be effective for a wide range of these tasks. Automatic image colorization has attracted attention for a variety of applications, including the restoration of aged or deteriorated photos.
Landscape color and grayscale images Dataset
This dataset contains 7129 colour RGB images and 7129 grayscale images of landscapes in JPG format, organized in two folders: each colour image (streets, buildings, mountains, glaciers, trees, etc.) has a corresponding grayscale image.
Pre-processing includes common image-related operations such as loading, resizing, random cropping, and rotation, plus colour-space conversion functions such as RGB to LAB and vice versa.
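As a sketch of this pre-processing stage (the function name, file path, target size, and exact augmentation choices here are illustrative assumptions, not the project's actual pipeline):

```python
import tensorflow as tf

def load_and_augment(path, size=256):
    # load -> decode -> resize slightly larger -> random crop -> random flip/rotation
    img = tf.io.read_file(path)
    img = tf.image.decode_jpeg(img, channels=3)
    img = tf.image.resize(img, [size + 30, size + 30])
    img = tf.image.random_crop(img, [size, size, 3])
    img = tf.image.random_flip_left_right(img)
    img = tf.image.rot90(img, k=tf.random.uniform([], 0, 4, dtype=tf.int32))
    return tf.cast(img, tf.float32) / 255.0  # normalize RGB to the 0-1 range
```

The normalized RGB output can then be fed straight into the colour-space conversion functions below.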
RGB -> LAB
import tensorflow as tf
import tensorflow_io as tfio

@tf.function()
def rgb2lab(target_img, isnormalized=True, normalize_lab=False, comp=''):
    if not isnormalized:
        target_img = target_img / 255.0  # normalize the RGB image to the 0-1 range
    # expects an RGB image in normalized (0-1) form
    target_img = tfio.experimental.color.rgb_to_lab(target_img)
    if normalize_lab:
        tf.Assert(tf.reduce_any(tf.equal(comp, ['vis', 'net'])), data=[comp], name='Lab_Normalization_Error')
        if comp == 'vis':
            target_img = (target_img + [0., 128., 128.]) / [100., 255., 255.]  # 0-1 range, for visualizing the image
        else:
            target_img = target_img / [50., 127.5, 127.5] + [-1., 0., 0.]  # -1 to 1 range, where neural networks perform better
    return target_img
LAB -> RGB
@tf.function()
def lab2rgb(lab_img, isnormalized=False, comp=''):
    if isnormalized:
        tf.Assert(tf.reduce_any(tf.equal(comp, ['vis', 'net'])), data=[comp], name='Lab_Normalization_Error')
        if comp == 'vis':
            lab_img = lab_img * [100., 255., 255.] + [0., -128., -128.]  # undo the 0-1 (visualization) normalization
        else:
            lab_img = (lab_img + [1., 0., 0.]) * [50., 127.5, 127.5]  # undo the -1 to 1 (network) normalization
    # expects a LAB image in unnormalized form
    rgb = tfio.experimental.color.lab_to_rgb(lab_img)
    return rgb
The input (grayscale image) is passed through a series of layers that progressively downsample it until it reaches a bottleneck layer, at which point the process is reversed and the latent representation is upsampled into a colored output image. To let low-level information flow directly across the network, we add skip connections in the shape of a "U-Net": specifically, a skip connection between each layer i and layer n-i, where n is the total number of layers. Each skip connection simply concatenates the channels of layers i and n-i.
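A minimal Keras sketch of such a U-Net generator (the 256x256 input size, filter counts, and depth are illustrative assumptions; the input is the L channel and the output is the two ab channels in a -1 to 1 range via tanh):

```python
import tensorflow as tf

def downsample(filters):
    return tf.keras.Sequential([
        tf.keras.layers.Conv2D(filters, 4, strides=2, padding='same', use_bias=False),
        tf.keras.layers.BatchNormalization(),
        tf.keras.layers.LeakyReLU()])

def upsample(filters):
    return tf.keras.Sequential([
        tf.keras.layers.Conv2DTranspose(filters, 4, strides=2, padding='same', use_bias=False),
        tf.keras.layers.BatchNormalization(),
        tf.keras.layers.ReLU()])

def build_unet_generator(img_size=256):
    inputs = tf.keras.Input(shape=(img_size, img_size, 1))  # grayscale / L channel
    down_stack = [downsample(f) for f in (64, 128, 256, 512)]
    up_stack = [upsample(f) for f in (256, 128, 64)]
    x = inputs
    skips = []
    for down in down_stack:          # encoder: progressively downsample
        x = down(x)
        skips.append(x)
    skips = reversed(skips[:-1])     # skip connections pair layer i with layer n-i
    for up, skip in zip(up_stack, skips):  # decoder: upsample and concatenate channels
        x = up(x)
        x = tf.keras.layers.Concatenate()([x, skip])
    # final layer predicts the two ab channels in [-1, 1]
    outputs = tf.keras.layers.Conv2DTranspose(2, 4, strides=2, padding='same',
                                              activation='tanh')(x)
    return tf.keras.Model(inputs, outputs)
```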
The discriminator attempts to classify each of the N x N patches in an image as real or fake. The final output of D is calculated by averaging all patch responses.
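A sketch of such a patch-based (PatchGAN-style) discriminator; the layer sizes are illustrative assumptions. It is conditioned on the grayscale input, each output logit scores one patch, and averaging the logits gives the image-level decision:

```python
import tensorflow as tf

def build_patch_discriminator(img_size=256):
    # conditional discriminator: sees the grayscale input and the (real or generated) ab channels
    inp = tf.keras.Input(shape=(img_size, img_size, 1))
    tar = tf.keras.Input(shape=(img_size, img_size, 2))
    x = tf.keras.layers.Concatenate()([inp, tar])
    for filters in (64, 128, 256):
        x = tf.keras.layers.Conv2D(filters, 4, strides=2, padding='same')(x)
        x = tf.keras.layers.LeakyReLU(0.2)(x)
    # one logit per receptive-field patch, rather than a single scalar per image
    patch_logits = tf.keras.layers.Conv2D(1, 4, strides=1, padding='same')(x)
    return tf.keras.Model([inp, tar], patch_logits)
```

The final scalar output of D is then `tf.reduce_mean(patch_logits)`, the average over all patch responses.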
This model combines the generator and discriminator, computes losses and metrics, performs gradient-descent steps, and displays results.
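A hedged sketch of one combined training step, assuming a pix2pix-style objective (adversarial loss plus an L1 term with an assumed weight LAMBDA = 100); the model and optimizer arguments are placeholders for whatever generator and discriminator are actually used:

```python
import tensorflow as tf

bce = tf.keras.losses.BinaryCrossentropy(from_logits=True)
LAMBDA = 100  # assumed weight on the L1 term, as in pix2pix

@tf.function
def train_step(generator, discriminator, g_opt, d_opt, l_channel, ab_channels):
    with tf.GradientTape() as g_tape, tf.GradientTape() as d_tape:
        fake_ab = generator(l_channel, training=True)
        real_logits = discriminator([l_channel, ab_channels], training=True)
        fake_logits = discriminator([l_channel, fake_ab], training=True)
        # generator: fool the discriminator and stay close to the ground truth (L1)
        g_loss = bce(tf.ones_like(fake_logits), fake_logits) \
                 + LAMBDA * tf.reduce_mean(tf.abs(ab_channels - fake_ab))
        # discriminator: real patches -> 1, generated patches -> 0
        d_loss = bce(tf.ones_like(real_logits), real_logits) \
                 + bce(tf.zeros_like(fake_logits), fake_logits)
    g_opt.apply_gradients(zip(g_tape.gradient(g_loss, generator.trainable_variables),
                              generator.trainable_variables))
    d_opt.apply_gradients(zip(d_tape.gradient(d_loss, discriminator.trainable_variables),
                              discriminator.trainable_variables))
    return g_loss, d_loss
```

Loss and metric values returned from each step can then be logged and sample colorizations displayed at regular intervals.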
Gist Link : GAN Architecture