I built models that colorize grayscale images using the outputs of a CNN and a ResNet.
Training used the 256×256 small-image validation set from Places365-Standard. Grayscale images in this set were treated as outliers and excluded, since they seemed unlikely to help training.
Starting from ResNet50, I removed the fully connected layer and used ConvTranspose2d layers to upsample the output back to 256×256.
Full structure
----------------------------------------------------------------
Layer (type) Output Shape Param #
================================================================
Conv2d-1 [-1, 64, 128, 128] 3,200
BatchNorm2d-2 [-1, 64, 128, 128] 128
ReLU-3 [-1, 64, 128, 128] 0
Conv_block-4 [-1, 64, 128, 128] 0
MaxPool2d-5 [-1, 64, 64, 64] 0
Conv2d-6 [-1, 64, 64, 64] 4,160
BatchNorm2d-7 [-1, 64, 64, 64] 128
ReLU-8 [-1, 64, 64, 64] 0
Conv_block-9 [-1, 64, 64, 64] 0
Conv2d-10 [-1, 64, 64, 64] 36,928
BatchNorm2d-11 [-1, 64, 64, 64] 128
ReLU-12 [-1, 64, 64, 64] 0
Conv_block-13 [-1, 64, 64, 64] 0
Conv2d-14 [-1, 256, 64, 64] 16,640
BatchNorm2d-15 [-1, 256, 64, 64] 512
Conv_block-16 [-1, 256, 64, 64] 0
Conv2d-17 [-1, 256, 64, 64] 16,640
ReLU-18 [-1, 256, 64, 64] 0
Res_block-19 [-1, 256, 64, 64] 0
Conv2d-20 [-1, 64, 64, 64] 16,448
BatchNorm2d-21 [-1, 64, 64, 64] 128
ReLU-22 [-1, 64, 64, 64] 0
Conv_block-23 [-1, 64, 64, 64] 0
Conv2d-24 [-1, 64, 64, 64] 36,928
BatchNorm2d-25 [-1, 64, 64, 64] 128
ReLU-26 [-1, 64, 64, 64] 0
Conv_block-27 [-1, 64, 64, 64] 0
Conv2d-28 [-1, 256, 64, 64] 16,640
BatchNorm2d-29 [-1, 256, 64, 64] 512
Conv_block-30 [-1, 256, 64, 64] 0
Identity-31 [-1, 256, 64, 64] 0
ReLU-32 [-1, 256, 64, 64] 0
Res_block-33 [-1, 256, 64, 64] 0
Conv2d-34 [-1, 64, 64, 64] 16,448
BatchNorm2d-35 [-1, 64, 64, 64] 128
ReLU-36 [-1, 64, 64, 64] 0
Conv_block-37 [-1, 64, 64, 64] 0
Conv2d-38 [-1, 64, 64, 64] 36,928
BatchNorm2d-39 [-1, 64, 64, 64] 128
ReLU-40 [-1, 64, 64, 64] 0
Conv_block-41 [-1, 64, 64, 64] 0
Conv2d-42 [-1, 256, 64, 64] 16,640
BatchNorm2d-43 [-1, 256, 64, 64] 512
Conv_block-44 [-1, 256, 64, 64] 0
Identity-45 [-1, 256, 64, 64] 0
ReLU-46 [-1, 256, 64, 64] 0
Res_block-47 [-1, 256, 64, 64] 0
Conv2d-48 [-1, 128, 32, 32] 32,896
BatchNorm2d-49 [-1, 128, 32, 32] 256
ReLU-50 [-1, 128, 32, 32] 0
Conv_block-51 [-1, 128, 32, 32] 0
Conv2d-52 [-1, 128, 32, 32] 147,584
BatchNorm2d-53 [-1, 128, 32, 32] 256
ReLU-54 [-1, 128, 32, 32] 0
Conv_block-55 [-1, 128, 32, 32] 0
Conv2d-56 [-1, 512, 32, 32] 66,048
BatchNorm2d-57 [-1, 512, 32, 32] 1,024
Conv_block-58 [-1, 512, 32, 32] 0
Conv2d-59 [-1, 512, 32, 32] 131,584
ReLU-60 [-1, 512, 32, 32] 0
Res_block-61 [-1, 512, 32, 32] 0
Conv2d-62 [-1, 128, 32, 32] 65,664
BatchNorm2d-63 [-1, 128, 32, 32] 256
ReLU-64 [-1, 128, 32, 32] 0
Conv_block-65 [-1, 128, 32, 32] 0
Conv2d-66 [-1, 128, 32, 32] 147,584
BatchNorm2d-67 [-1, 128, 32, 32] 256
ReLU-68 [-1, 128, 32, 32] 0
Conv_block-69 [-1, 128, 32, 32] 0
Conv2d-70 [-1, 512, 32, 32] 66,048
BatchNorm2d-71 [-1, 512, 32, 32] 1,024
Conv_block-72 [-1, 512, 32, 32] 0
Identity-73 [-1, 512, 32, 32] 0
ReLU-74 [-1, 512, 32, 32] 0
Res_block-75 [-1, 512, 32, 32] 0
Conv2d-76 [-1, 128, 32, 32] 65,664
BatchNorm2d-77 [-1, 128, 32, 32] 256
ReLU-78 [-1, 128, 32, 32] 0
Conv_block-79 [-1, 128, 32, 32] 0
Conv2d-80 [-1, 128, 32, 32] 147,584
BatchNorm2d-81 [-1, 128, 32, 32] 256
ReLU-82 [-1, 128, 32, 32] 0
Conv_block-83 [-1, 128, 32, 32] 0
Conv2d-84 [-1, 512, 32, 32] 66,048
BatchNorm2d-85 [-1, 512, 32, 32] 1,024
Conv_block-86 [-1, 512, 32, 32] 0
Identity-87 [-1, 512, 32, 32] 0
ReLU-88 [-1, 512, 32, 32] 0
Res_block-89 [-1, 512, 32, 32] 0
Conv2d-90 [-1, 128, 32, 32] 65,664
BatchNorm2d-91 [-1, 128, 32, 32] 256
ReLU-92 [-1, 128, 32, 32] 0
Conv_block-93 [-1, 128, 32, 32] 0
Conv2d-94 [-1, 128, 32, 32] 147,584
BatchNorm2d-95 [-1, 128, 32, 32] 256
ReLU-96 [-1, 128, 32, 32] 0
Conv_block-97 [-1, 128, 32, 32] 0
Conv2d-98 [-1, 512, 32, 32] 66,048
BatchNorm2d-99 [-1, 512, 32, 32] 1,024
Conv_block-100 [-1, 512, 32, 32] 0
Identity-101 [-1, 512, 32, 32] 0
ReLU-102 [-1, 512, 32, 32] 0
Res_block-103 [-1, 512, 32, 32] 0
Conv2d-104 [-1, 256, 16, 16] 131,328
BatchNorm2d-105 [-1, 256, 16, 16] 512
ReLU-106 [-1, 256, 16, 16] 0
Conv_block-107 [-1, 256, 16, 16] 0
Conv2d-108 [-1, 256, 16, 16] 590,080
BatchNorm2d-109 [-1, 256, 16, 16] 512
ReLU-110 [-1, 256, 16, 16] 0
Conv_block-111 [-1, 256, 16, 16] 0
Conv2d-112 [-1, 1024, 16, 16] 263,168
BatchNorm2d-113 [-1, 1024, 16, 16] 2,048
Conv_block-114 [-1, 1024, 16, 16] 0
Conv2d-115 [-1, 1024, 16, 16] 525,312
ReLU-116 [-1, 1024, 16, 16] 0
Res_block-117 [-1, 1024, 16, 16] 0
Conv2d-118 [-1, 256, 16, 16] 262,400
BatchNorm2d-119 [-1, 256, 16, 16] 512
ReLU-120 [-1, 256, 16, 16] 0
Conv_block-121 [-1, 256, 16, 16] 0
Conv2d-122 [-1, 256, 16, 16] 590,080
BatchNorm2d-123 [-1, 256, 16, 16] 512
ReLU-124 [-1, 256, 16, 16] 0
Conv_block-125 [-1, 256, 16, 16] 0
Conv2d-126 [-1, 1024, 16, 16] 263,168
BatchNorm2d-127 [-1, 1024, 16, 16] 2,048
Conv_block-128 [-1, 1024, 16, 16] 0
Identity-129 [-1, 1024, 16, 16] 0
ReLU-130 [-1, 1024, 16, 16] 0
Res_block-131 [-1, 1024, 16, 16] 0
Conv2d-132 [-1, 256, 16, 16] 262,400
BatchNorm2d-133 [-1, 256, 16, 16] 512
ReLU-134 [-1, 256, 16, 16] 0
Conv_block-135 [-1, 256, 16, 16] 0
Conv2d-136 [-1, 256, 16, 16] 590,080
BatchNorm2d-137 [-1, 256, 16, 16] 512
ReLU-138 [-1, 256, 16, 16] 0
Conv_block-139 [-1, 256, 16, 16] 0
Conv2d-140 [-1, 1024, 16, 16] 263,168
BatchNorm2d-141 [-1, 1024, 16, 16] 2,048
Conv_block-142 [-1, 1024, 16, 16] 0
Identity-143 [-1, 1024, 16, 16] 0
ReLU-144 [-1, 1024, 16, 16] 0
Res_block-145 [-1, 1024, 16, 16] 0
Conv2d-146 [-1, 256, 16, 16] 262,400
BatchNorm2d-147 [-1, 256, 16, 16] 512
ReLU-148 [-1, 256, 16, 16] 0
Conv_block-149 [-1, 256, 16, 16] 0
Conv2d-150 [-1, 256, 16, 16] 590,080
BatchNorm2d-151 [-1, 256, 16, 16] 512
ReLU-152 [-1, 256, 16, 16] 0
Conv_block-153 [-1, 256, 16, 16] 0
Conv2d-154 [-1, 1024, 16, 16] 263,168
BatchNorm2d-155 [-1, 1024, 16, 16] 2,048
Conv_block-156 [-1, 1024, 16, 16] 0
Identity-157 [-1, 1024, 16, 16] 0
ReLU-158 [-1, 1024, 16, 16] 0
Res_block-159 [-1, 1024, 16, 16] 0
Conv2d-160 [-1, 256, 16, 16] 262,400
BatchNorm2d-161 [-1, 256, 16, 16] 512
ReLU-162 [-1, 256, 16, 16] 0
Conv_block-163 [-1, 256, 16, 16] 0
Conv2d-164 [-1, 256, 16, 16] 590,080
BatchNorm2d-165 [-1, 256, 16, 16] 512
ReLU-166 [-1, 256, 16, 16] 0
Conv_block-167 [-1, 256, 16, 16] 0
Conv2d-168 [-1, 1024, 16, 16] 263,168
BatchNorm2d-169 [-1, 1024, 16, 16] 2,048
Conv_block-170 [-1, 1024, 16, 16] 0
Identity-171 [-1, 1024, 16, 16] 0
ReLU-172 [-1, 1024, 16, 16] 0
Res_block-173 [-1, 1024, 16, 16] 0
Conv2d-174 [-1, 256, 16, 16] 262,400
BatchNorm2d-175 [-1, 256, 16, 16] 512
ReLU-176 [-1, 256, 16, 16] 0
Conv_block-177 [-1, 256, 16, 16] 0
Conv2d-178 [-1, 256, 16, 16] 590,080
BatchNorm2d-179 [-1, 256, 16, 16] 512
ReLU-180 [-1, 256, 16, 16] 0
Conv_block-181 [-1, 256, 16, 16] 0
Conv2d-182 [-1, 1024, 16, 16] 263,168
BatchNorm2d-183 [-1, 1024, 16, 16] 2,048
Conv_block-184 [-1, 1024, 16, 16] 0
Identity-185 [-1, 1024, 16, 16] 0
ReLU-186 [-1, 1024, 16, 16] 0
Res_block-187 [-1, 1024, 16, 16] 0
Conv2d-188 [-1, 512, 8, 8] 524,800
BatchNorm2d-189 [-1, 512, 8, 8] 1,024
ReLU-190 [-1, 512, 8, 8] 0
Conv_block-191 [-1, 512, 8, 8] 0
Conv2d-192 [-1, 512, 8, 8] 2,359,808
BatchNorm2d-193 [-1, 512, 8, 8] 1,024
ReLU-194 [-1, 512, 8, 8] 0
Conv_block-195 [-1, 512, 8, 8] 0
Conv2d-196 [-1, 2048, 8, 8] 1,050,624
BatchNorm2d-197 [-1, 2048, 8, 8] 4,096
Conv_block-198 [-1, 2048, 8, 8] 0
Conv2d-199 [-1, 2048, 8, 8] 2,099,200
ReLU-200 [-1, 2048, 8, 8] 0
Res_block-201 [-1, 2048, 8, 8] 0
Conv2d-202 [-1, 512, 8, 8] 1,049,088
BatchNorm2d-203 [-1, 512, 8, 8] 1,024
ReLU-204 [-1, 512, 8, 8] 0
Conv_block-205 [-1, 512, 8, 8] 0
Conv2d-206 [-1, 512, 8, 8] 2,359,808
BatchNorm2d-207 [-1, 512, 8, 8] 1,024
ReLU-208 [-1, 512, 8, 8] 0
Conv_block-209 [-1, 512, 8, 8] 0
Conv2d-210 [-1, 2048, 8, 8] 1,050,624
BatchNorm2d-211 [-1, 2048, 8, 8] 4,096
Conv_block-212 [-1, 2048, 8, 8] 0
Identity-213 [-1, 2048, 8, 8] 0
ReLU-214 [-1, 2048, 8, 8] 0
Res_block-215 [-1, 2048, 8, 8] 0
Conv2d-216 [-1, 512, 8, 8] 1,049,088
BatchNorm2d-217 [-1, 512, 8, 8] 1,024
ReLU-218 [-1, 512, 8, 8] 0
Conv_block-219 [-1, 512, 8, 8] 0
Conv2d-220 [-1, 512, 8, 8] 2,359,808
BatchNorm2d-221 [-1, 512, 8, 8] 1,024
ReLU-222 [-1, 512, 8, 8] 0
Conv_block-223 [-1, 512, 8, 8] 0
Conv2d-224 [-1, 2048, 8, 8] 1,050,624
BatchNorm2d-225 [-1, 2048, 8, 8] 4,096
Conv_block-226 [-1, 2048, 8, 8] 0
Identity-227 [-1, 2048, 8, 8] 0
ReLU-228 [-1, 2048, 8, 8] 0
Res_block-229 [-1, 2048, 8, 8] 0
ConvTranspose2d-230 [-1, 1024, 16, 16] 33,555,456
ReLU-231 [-1, 1024, 16, 16] 0
ConvTrans_block-232 [-1, 1024, 16, 16] 0
ConvTranspose2d-233 [-1, 512, 32, 32] 8,389,120
ReLU-234 [-1, 512, 32, 32] 0
ConvTrans_block-235 [-1, 512, 32, 32] 0
ConvTranspose2d-236 [-1, 256, 64, 64] 2,097,408
ReLU-237 [-1, 256, 64, 64] 0
ConvTrans_block-238 [-1, 256, 64, 64] 0
ConvTranspose2d-239 [-1, 128, 128, 128] 524,416
ReLU-240 [-1, 128, 128, 128] 0
ConvTrans_block-241 [-1, 128, 128, 128] 0
ConvTranspose2d-242 [-1, 64, 256, 256] 131,136
ReLU-243 [-1, 64, 256, 256] 0
ConvTrans_block-244 [-1, 64, 256, 256] 0
Conv2d-245 [-1, 2, 256, 256] 1,154
================================================================
Total params: 68,219,330
Trainable params: 68,219,330
Non-trainable params: 0
----------------------------------------------------------------
Input size (MB): 0.25
Forward/backward pass size (MB): 661.00
Params size (MB): 260.24
Estimated Total Size (MB): 921.49
----------------------------------------------------------------
The second model was trained using only basic Conv2d, ReLU, BatchNorm2d, and ConvTranspose2d layers.
Full structure
----------------------------------------------------------------
Layer (type) Output Shape Param #
================================================================
Conv2d-1 [-1, 64, 256, 256] 640
ReLU-2 [-1, 64, 256, 256] 0
Conv2d-3 [-1, 64, 128, 128] 36,928
ReLU-4 [-1, 64, 128, 128] 0
BatchNorm2d-5 [-1, 64, 128, 128] 128
Conv2d-6 [-1, 128, 128, 128] 73,856
ReLU-7 [-1, 128, 128, 128] 0
Conv2d-8 [-1, 128, 64, 64] 147,584
ReLU-9 [-1, 128, 64, 64] 0
BatchNorm2d-10 [-1, 128, 64, 64] 256
Conv2d-11 [-1, 256, 64, 64] 295,168
ReLU-12 [-1, 256, 64, 64] 0
Conv2d-13 [-1, 256, 64, 64] 590,080
ReLU-14 [-1, 256, 64, 64] 0
Conv2d-15 [-1, 256, 32, 32] 590,080
ReLU-16 [-1, 256, 32, 32] 0
BatchNorm2d-17 [-1, 256, 32, 32] 512
Conv2d-18 [-1, 512, 32, 32] 1,180,160
ReLU-19 [-1, 512, 32, 32] 0
Conv2d-20 [-1, 512, 32, 32] 2,359,808
ReLU-21 [-1, 512, 32, 32] 0
Conv2d-22 [-1, 512, 32, 32] 2,359,808
ReLU-23 [-1, 512, 32, 32] 0
BatchNorm2d-24 [-1, 512, 32, 32] 1,024
Conv2d-25 [-1, 512, 32, 32] 2,359,808
ReLU-26 [-1, 512, 32, 32] 0
Conv2d-27 [-1, 512, 32, 32] 2,359,808
ReLU-28 [-1, 512, 32, 32] 0
Conv2d-29 [-1, 512, 32, 32] 2,359,808
ReLU-30 [-1, 512, 32, 32] 0
BatchNorm2d-31 [-1, 512, 32, 32] 1,024
Conv2d-32 [-1, 512, 32, 32] 2,359,808
ReLU-33 [-1, 512, 32, 32] 0
Conv2d-34 [-1, 512, 32, 32] 2,359,808
ReLU-35 [-1, 512, 32, 32] 0
Conv2d-36 [-1, 512, 32, 32] 2,359,808
ReLU-37 [-1, 512, 32, 32] 0
BatchNorm2d-38 [-1, 512, 32, 32] 1,024
Conv2d-39 [-1, 512, 32, 32] 2,359,808
ReLU-40 [-1, 512, 32, 32] 0
Conv2d-41 [-1, 512, 32, 32] 2,359,808
ReLU-42 [-1, 512, 32, 32] 0
Conv2d-43 [-1, 512, 32, 32] 2,359,808
ReLU-44 [-1, 512, 32, 32] 0
BatchNorm2d-45 [-1, 512, 32, 32] 1,024
ConvTranspose2d-46 [-1, 256, 64, 64] 2,097,408
ReLU-47 [-1, 256, 64, 64] 0
Conv2d-48 [-1, 256, 64, 64] 590,080
ReLU-49 [-1, 256, 64, 64] 0
Conv2d-50 [-1, 313, 64, 64] 80,441
ReLU-51 [-1, 313, 64, 64] 0
ConvTranspose2d-52 [-1, 256, 128, 128] 1,282,304
ReLU-53 [-1, 256, 128, 128] 0
ConvTranspose2d-54 [-1, 256, 256, 256] 1,048,832
ReLU-55 [-1, 256, 256, 256] 0
Conv2d-56 [-1, 2, 256, 256] 512
================================================================
Total params: 33,976,953
Trainable params: 33,976,953
Non-trainable params: 0
----------------------------------------------------------------
Input size (MB): 0.25
Forward/backward pass size (MB): 654.56
Params size (MB): 129.61
Estimated Total Size (MB): 784.42
----------------------------------------------------------------
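Both architectures end in a 2-channel output, i.e. they predict the ab channels from the L channel. A minimal training step for that setup might look like the following; the MSE loss here is an assumption for illustration, since the loss actually used is not stated above.

```python
import torch
import torch.nn as nn

def train_step(model, optimizer, l_batch, ab_batch):
    """One optimization step: predict ab channels from L, regress with MSE.

    l_batch:  (N, 1, 256, 256) grayscale (L-channel) inputs
    ab_batch: (N, 2, 256, 256) ground-truth ab channels
    """
    optimizer.zero_grad()
    pred_ab = model(l_batch)  # (N, 2, 256, 256)
    loss = nn.functional.mse_loss(pred_ab, ab_batch)
    loss.backward()
    optimizer.step()
    return loss.item()
```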
When opening each training image with imageio, if the image's shape does not have length 3, i.e. it is (256, 256) rather than (256, 256, 3), it is a grayscale image and is moved to the outlier folder.
The PSNR between each image in the test folder and the corresponding image in the result folder is measured, and the minimum, mean, and maximum PSNR are printed.
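The PSNR computation itself is straightforward; a self-contained sketch for 8-bit images:

```python
import numpy as np

def psnr(img_a, img_b, max_val=255.0):
    """Peak signal-to-noise ratio between two images, in dB."""
    mse = np.mean((img_a.astype(np.float64) - img_b.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(max_val ** 2 / mse)

def psnr_summary(pairs):
    """Minimum, mean, and maximum PSNR over (test, result) image pairs."""
    scores = [psnr(a, b) for a, b in pairs]
    return min(scores), sum(scores) / len(scores), max(scores)
```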
Each RGB image is converted to the Lab color space and only the L channel is kept, producing the grayscale input image.
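Assuming scikit-image is used for the color-space conversion (the library is not named above), this step can be sketched as:

```python
import numpy as np
from skimage.color import rgb2lab

def rgb_to_l(rgb):
    """Convert an RGB image (floats in [0, 1]) to Lab and keep only L.

    The L channel lies in [0, 100] and serves as the grayscale model input;
    the a/b channels become the regression target.
    """
    return rgb2lab(rgb)[:, :, 0]
```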