Sometimes there's something wrong with an image. Let's call these images 'low quality' (for example: a watermark on the image, blur, red eyes, black and white photos, etc.).
If I can convert good images into low-quality images, then I can train a model to reverse the process and improve low-quality photos.
To achieve this I created a custom loss function that incorporates feature loss (also known as perceptual loss) along with Gram loss.
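
To make the two loss terms concrete, here is a minimal NumPy sketch, not the actual loss used in this project. It assumes the feature maps have already been extracted from some pretrained network (that step is not shown), and the function names, the L1 distance and the weights are all illustrative assumptions.

```python
import numpy as np

def gram_matrix(feat):
    """Channel-by-channel correlations of one (C, H, W) feature map."""
    c, h, w = feat.shape
    flat = feat.reshape(c, h * w)
    # Normalise so the scale does not depend on the feature map size.
    return flat @ flat.T / (c * h * w)

def combined_loss(pred_feats, target_feats, feat_weight=1.0, gram_weight=1.0):
    """Feature (perceptual) loss plus Gram loss over a list of layers.

    pred_feats / target_feats: lists of (C, H, W) arrays, one per layer.
    The L1 distance and the weights are assumptions for illustration.
    """
    pairs = list(zip(pred_feats, target_feats))
    # Feature loss: compare the activations directly.
    feature_loss = sum(np.abs(p - t).mean() for p, t in pairs)
    # Gram loss: compare channel correlations (texture/style statistics).
    gram_loss = sum(np.abs(gram_matrix(p) - gram_matrix(t)).mean() for p, t in pairs)
    return feat_weight * feature_loss + gram_weight * gram_loss
```

The feature term penalises differences in what the network "sees" at each position, while the Gram term ignores position and compares which channels fire together. In a real training loop the same arithmetic would be written with the framework's tensors so that gradients can flow through it.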
- Improve dark image quality
- Improve OCR from the dark image
From this list I chose the MSRA Text Detection 500 Database (MSRA-TD500).
I ignored the text bounding boxes and resized the images to 512px.
With Python's Pillow library I changed the image resolution, brightness, contrast, sharpness and quality. Samples are in the /imgs/ directory.
Then I combined all the changes and made two datasets.
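
The degradation steps can be sketched roughly like this with Pillow. This is a minimal illustration, not the script used for the datasets: the function name `crappify`, the square 512x512 resize, and every default factor are assumptions picked for the example.

```python
import io
from PIL import Image, ImageEnhance

def crappify(img, size=512, scale=0.5, brightness=0.4,
             contrast=0.6, sharpness=0.3, jpeg_quality=20):
    """Return a degraded copy of `img` (all defaults are illustrative guesses)."""
    # Assumption: resize to a square size x size image.
    img = img.convert("RGB").resize((size, size))
    # Lower the effective resolution: shrink, then scale back up.
    small = max(1, int(size * scale))
    img = img.resize((small, small)).resize((size, size))
    # Factors below 1.0 darken, flatten contrast and blur, respectively.
    img = ImageEnhance.Brightness(img).enhance(brightness)
    img = ImageEnhance.Contrast(img).enhance(contrast)
    img = ImageEnhance.Sharpness(img).enhance(sharpness)
    # Round-trip through an in-memory JPEG to add compression artifacts.
    buf = io.BytesIO()
    img.save(buf, format="JPEG", quality=jpeg_quality)
    buf.seek(0)
    out = Image.open(buf)
    out.load()
    return out
```

Applying only one of the enhancers at a time gives the single-change variants, and calling the whole function gives the combined version. Each (original, degraded) pair can then serve as a (target, input) training example.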
Disclaimer: I trained it for less than an hour, and the whole project took 4 hours to make.
Try to detect text in the crappy images, the manually improved images, the images improved with ML and the original images. Compare the results and conclude whether it's something worth investing more time in.
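
One possible way to compare the results is to treat the OCR output on the original image as the reference and score each variant by string similarity. This is only a sketch of that idea: the OCR step itself is not shown (no particular engine is assumed), and the function names and dictionary keys are hypothetical.

```python
from difflib import SequenceMatcher

def text_similarity(reference, candidate):
    """Ratio in [0, 1]; 1.0 means the two strings are identical."""
    return SequenceMatcher(None, reference, candidate).ratio()

def rank_variants(ocr_outputs, reference_key="original"):
    """Rank image variants by how closely their OCR text matches the reference.

    ocr_outputs: dict mapping a variant name to the text some OCR engine
    returned for it. The keys used here are hypothetical.
    """
    reference = ocr_outputs[reference_key]
    scores = {
        name: text_similarity(reference, text)
        for name, text in ocr_outputs.items()
        if name != reference_key
    }
    # Best match first.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)
```

For example, `rank_variants({"original": ..., "crappy": ..., "manual": ..., "ml": ...})` with the four OCR strings for one image returns the three non-reference variants ordered by similarity. Averaging the scores over the whole dataset would show whether the ML-improved images beat the manually improved ones.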