-
Notifications
You must be signed in to change notification settings - Fork 7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[BC-breaking] Unified inputs for grayscale ops and transforms #2586
[BC-breaking] Unified inputs for grayscale ops and transforms #2586
Conversation
…-5/issue-2292-unify-grayscale
…-6/issue-2292-unify-grayscale
- deprecated F.to_grayscale in favor of F.rgb_to_grayscale
Codecov Report
@@ Coverage Diff @@
## master #2586 +/- ##
==========================================
+ Coverage 70.28% 71.64% +1.36%
==========================================
Files 95 94 -1
Lines 8204 8162 -42
Branches 1299 1298 -1
==========================================
+ Hits 5766 5848 +82
+ Misses 2019 1903 -116
+ Partials 419 411 -8
Continue to review full report at Codecov.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the PR Victor!
I've made a few comments, let me know what you think
"""Convert the given RGB Image Tensor to Grayscale. | ||
For RGB to Grayscale conversion, ITU-R 601-2 luma transform is performed which | ||
is L = R * 0.2989 + G * 0.5870 + B * 0.1140 | ||
|
||
Args: | ||
img (Tensor): Image to be converted to Grayscale in the form [C, H, W]. | ||
num_output_channels (int): number of channels of the output image. Value can be 1 or 3. Default, 1. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if we need this at all or if we should move this logic to the transform itself?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Currently, Grayscale
transform is using num_output_channels
as argument and RandomGrayscale
outputs the same number of channels as in the input.
If you are OK with BC change, we can remove it. It is a bit difficult to estimate how many users apply F.to_grayscale
with this option.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would not say to remove it from Grayscale
, but only from the functional implementation.
…-5/issue-2292-unify-grayscale
…-6/issue-2292-unify-grayscale
…-5/issue-2292-unify-grayscale
@fmassa I updated the PR. Changes:
Let me know if it corresponds to your ideas. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!
# This implementation closely follows the TF one: | ||
# https://github.com/tensorflow/tensorflow/blob/v2.3.0/tensorflow/python/ops/image_ops_impl.py#L2105-L2138 | ||
l_img = (0.2989 * r + 0.587 * g + 0.114 * b).to(img.dtype) | ||
l_img = l_img.unsqueeze(dim=-3) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is a BC-breaking change, as before the tensor returned would have 2 dimensions. I'll change the title of the PR to account for this
* [WIP] Unify ops Grayscale and RandomGrayscale * Unified inputs for grayscale op and transforms - deprecated F.to_grayscale in favor of F.rgb_to_grayscale * Fixes bug with fp input * [WIP] Updated code according to review * Removed unused import
Related to #2292
Description:
GrayScale
,RandomGrayscale
,F.rgb_to_grayscale
.F_t.rgb_to_grayscale
implemention to match better PIL implementationtest_adjustments
=> we need to update this test according to how we test results intest_adjust_gamma
.F.to_grayscale
in favor ofF.rgb_to_grayscale
BC-breaking:
rgb_to_grayscale
now returns a tensor with the same number of dimensions as the original tensor.EDIT:
L = (19595 * r + 38470 * g + 7471 * b + 2 ** 15) / 2 ** 16
(0.299 * r + 0.587 * g + 0.114 * b)
Unlocks #2566 #2595