Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Slicing: Quality focused default image format selection and jpg quality #956

Merged
merged 7 commits into from
Nov 5, 2023

Conversation

jokober
Copy link
Contributor

@jokober jokober commented Oct 9, 2023

Image quality is important in the domain of small object detection. i think the default behavior of SAHI could be more focused on image quality. let me explain why:

I realized two things

  • I was using jpg input images and I overlooked the out_ext argument of slice_coco. Hence the images where sliced and then saved as jpg again resulting in lower image quality
  • pillow.save() is called without the quality parameter resulting in saving the images with the default of quality=75

Both together intruduces image quality reduction due to slicing.

I propose two changes

  1. In slice_coco the default suffix should be the original suffix for lossless image formats and png for lossy formats ('.jpg','.jpeg').
  2. Set the quality=95 as a default parameter for pillow.save(). This will only have effect if the suffix is jpg

jokober and others added 3 commits October 9, 2023 14:11
@bertrandchauveau
Copy link

Interesting finding! Perhaps the subsampling=0 argument should also be stated with pillow.save?
Old references but mentioned here:
JPEG Image Quality in PIL - jdhao's digital space
python - Why is the quality of JPEG images produced by PIL so poor? - Stack Overflow

Or why not use the quality='keep'argument for jpg images?
According to the pillow documentation: "The value keep is only valid for JPEG files and will retain the original image quality level, subsampling, and qtables."

@jokober
Copy link
Contributor Author

jokober commented Oct 18, 2023

Thanks for pointing that out! I changed the quality parameter accordingly (quality='keep')

@fcakyon
Copy link
Collaborator

fcakyon commented Nov 5, 2023

@jokober @bertrandchauveau thanks a lot for your valuable discussion! Once the tests pass, we can merge 👍

@fcakyon fcakyon added this pull request to the merge queue Nov 5, 2023
Merged via the queue into obss:main with commit ad25062 Nov 5, 2023
9 checks passed
@@ -317,7 +317,7 @@ def _export_single_slice(image: np.ndarray, output_dir: str, slice_file_name: st
image_pil = read_image_as_pil(image)
slice_file_path = str(Path(output_dir) / slice_file_name)
# export sliced image
image_pil.save(slice_file_path)
image_pil.save(slice_file_path, quality="keep")

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI using quality="keep" fails (silently) for jpg ...

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI @fcakyon - I'd raise an issue but these don't seem to be available on this repo. I'd probably try/except the code executed in the thread pool too, just so we don't silently fail again.

Copy link
Contributor Author

@jokober jokober Jan 30, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi, thanks for the hint. Can you try with following line (only for jpegs)?

image_pil.save(slice_file_path, 'JPEG', quality="keep")

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I just deleted the quality parameter to make it work = ) It looks like it won't work as this still errors for me.

from PIL import Image
Image.new("RGB", (100, 100)).save("tmp.jpg", "JPEG", quality="keep")

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh from above:

According to the pillow documentation: "The value keep is only valid for JPEG files and will retain the original image quality level, subsampling, and qtables."

The problem is these aren't JPG files, these are slices of numpy arrays or similar, so I don't think anything will be kept. Maybe just revert to the quality argument you had?

@kodonnell
Copy link

  • I was using jpg input images and I overlooked the out_ext argument of slice_coco. Hence the images where sliced and then saved as jpg again resulting in lower image quality

Out of interest, do you have any examples of the poorer quality - both visual, but also the impact on model performance? As far as I can tell, converting a JPEG (which already has the compression artifacts etc.) into PNG gains very little, but loses a lot in terms of disk speed and performance - PNG is obviously a lot bigger, but also a lot slower to encode/decode (especially for PIL in one of my tests a while back). Even if the re-compression adds a few more artifacts, IIRC there are papers showing a bunch of computer vision is relatively robust to this - though that might not be for small objects. Anyway, for 99% of use cases, this seems like it will harm most users - especially when there's already the out_ext='.png' option for the 1%. If it's complicated to fix the bug introduced in this PR, then maybe worth reverting? And wrapping a test in to catch it in future.

For context, I ended up in the 99%. I've got 32gb of 4k JPEG and I noticed that the tiled image directory was getting massive. (I think it would have ended up something like 300gb, but I killed it before I could test that.) Hence how I ended up here - specifying out_ext='.jpg' and then wondering why I didn't get any images due to the silent failing.

PS - you should totally look into QOI if you want lossless without the CPU/disk bottlenecking that can come from using PNG (assuming you've got a decent GPU) ...

So qoi isn't far off PNG in terms of compression, but 4x-20x faster to encode and 1.5x-6x faster to decode.
https://github.com/kodonnell/qoi

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants