-
Notifications
You must be signed in to change notification settings - Fork 7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add JPEG augmentation #8316
Add JPEG augmentation #8316
Conversation
🔗 Helpful Links🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/vision/8316
Note: Links to docs will display an error until the docs builds have been completed. ✅ No FailuresAs of commit ae595be with merge base 2ba586d (): This comment was automatically generated by Dr. CI and updates every 15 minutes. |
@NicolasHug For PIL backend, it's actually very straight-forward to also support other image formats by simply passing # `format` can be JPEG, WEBP, or whatever PIL support
def compression(image, format, quality):
...
# `format` can be a list of string, though the same `quality` value does not mean the same quality across formats.
class Compression(nn.Module)
def __init__(self, format, quality):
... For PyTorch backend, only JPEG can be supported natively for now, since torchvision only supports JPEG. We can say this is the limitation for PyTorch backend in the docs? (of course we can also try going through PIL for non-JPEG formats) |
Some questions:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you very much for the excellent PR with such short notice @gau-nernst ! We really appreciate the high quality contribution.
I have a few comments below but it looks great already. If you're not able to address those before the release branch cut on Monday then no worries, just let me know and I'll do it myself.
To answer a few of your questions above:
For PIL backend, it's actually very straight-forward to also support other image formats by simply passing format="WEBP" for example. Do you want to make this more generic?
We try to provide the same offering for PIL images and tensors, so I think it's preferable to just stick to JPEG here (note that we could also have png, since we have png decoding for tensors). That would make the transition from PIL to tensor backend smoother, and for users who are relying on PIL only, it's not too hard to write a custom transform to achieve that anyway.
Should jpeg() checks if the tensor is on CPU and is uint8? If we don't check, encode_jpeg() will raise error, so that is also fine I think.
Great point - it's fine to rely on encode_jpeg()
for the error, but we should clarify these expectations in the docstring
Do I need to handle torchscript somehow? The tests on my local machine don't fail, so it seems like encode_jpeg() and decode_jpeg() is torchscript-compatible?
Yes we need the functional to be torchscript-compatbile. encode_jpeg
and decode_jpeg
already support torchscript and the tests are passing, so there's nothing more you need to do here :)
@@ -317,3 +317,42 @@ def _transform(self, inpt: Any, params: Dict[str, Any]) -> Any: | |||
return output | |||
else: | |||
return inpt | |||
|
|||
|
|||
def _setup_quality(quality: Union[int, Sequence[int]]): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This function is fairly short and only used in a single place, so let's just inline it in the class'__init__
?
test/test_transforms_v2.py
Outdated
|
||
@pytest.mark.parametrize("quality", [5, 75]) | ||
@pytest.mark.parametrize("color_space", ["RGB", "GRAY"]) | ||
def test_functional_image_correctness(self, quality, color_space): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It doesn't hurt to keep this, but it seems mostly a subset of the correctness test in test_transform_image_correctness
so maybe we can get rid of it? I'll let you decide on that!
|
||
@_register_kernel_internal(jpeg, torch.Tensor) | ||
@_register_kernel_internal(jpeg, tv_tensors.Image) | ||
@_register_kernel_internal(jpeg, tv_tensors.Video) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For consistency with the other kernels we should expose a jpeg_video
kernel that calls into jpeg_image()
.
Co-authored-by: Nicolas Hug <contact@nicolas-hug.com>
@NicolasHug I fixed the comments you left. If there are still issues, I can work on this tomorrow (midnight in my country now). If you want to merge this PR by today, you can finish it.
Understand, I totally agree. From what I understand, PNG encoding is lossless (unless you use limited color palette), so there is no point doing PNG compression as an augmentation. Video-based image encoding, like WebP, might be useful to simulate low bitrate video frames (low bitrate JPEG does not quite look the same), but it will be for future PRs :). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, I'll merge once the CI is green. Thank you so much for the great PR @gau-nernst !
Reviewed By: vmoens Differential Revision: D55062770 fbshipit-source-id: 926a1eea4f55cb0b3c1a4f379088c1505ec70479 Co-authored-by: Nicolas Hug <contact@nicolas-hug.com> Co-authored-by: Nicolas Hug <nh.nicolas.hug@gmail.com>
Fixes #8290
cc @vfdev-5