Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Check num of channels on adjust_* transformations #3069

Merged
merged 3 commits into from
Dec 2, 2020

Conversation

datumbox
Copy link
Contributor

@datumbox datumbox commented Dec 1, 2020

Some of the transformations can be applied only to RGB or Grayscale images, nevertheless currently we don't explicitly check to ensure that an image with the right number of channels is provided. In the past this was not a problem since io.image only supported RGB, nevertheless recent PRs #2984, #2988 and #3024 added support for additional image types. Though the tensor meta data are not enough to tell us the type of image and its color space, the number of channels can be used as a proxy.

In this PR, I explicitly check for the right number of channels in all the adjust_* transforms. Note that those transforms don't handle yet correctly transparency. There are other transformations that could potentially provide incorrect results when transparency is available but I don't patch here because they might be used in other applications. Some of these transformations are affine(), rotate() and perspective().

@@ -339,13 +339,13 @@ def freeze_rng_state():
class TransformsTester(unittest.TestCase):

def _create_data(self, height=3, width=3, channels=3, device="cpu"):
tensor = torch.randint(0, 255, (channels, height, width), dtype=torch.uint8, device=device)
tensor = torch.randint(0, 256, (channels, height, width), dtype=torch.uint8, device=device)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

randint is exclusive on the upper bound, so the right value here is 256. I opted in fixing this small bug in-place to avoid doing unnecessary CI runs.

@datumbox datumbox mentioned this pull request Dec 1, 2020
Copy link
Collaborator

@vfdev-5 vfdev-5 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good ! Thanks @datumbox

I think there is no much point to update private docstring as they are not rendered anywhere and just remained from the previous versions... Probably, we can either remove them to avoid the maintainance...

@vfdev-5
Copy link
Collaborator

vfdev-5 commented Dec 1, 2020

Note that those transforms don't handle yet correctly transparency. There are other transformations that could potentially provide incorrect results when transparency is available but I don't patch here because they might be used in other applications.

Speaking about transparency channel for tensor inputs is a bit complicated as the input is unstructured (e.g. x[..., 0, :, :] does not mean that this is the red channel of RGB color space)

@datumbox
Copy link
Contributor Author

datumbox commented Dec 1, 2020

@vfdev-5 thanks!

Probably, we can either remove them to avoid the maintainance...

Good call. I saw it but I was not sure whether we still want it. Let me clean it up here then.

Speaking about transparency channel for tensor inputs is a bit complicated as the input is unstructured (e.g. x[..., 0, :, :] does not mean that this is the red channel of RGB color space)

I agree with your remark that it's complicated and potentially requires more meta data or assumptions to support this. Obviously channels==3 does not mean RGB as there are other colorspaces that use 3 channels, but just to clarify, is this what you mean on your last remark OR there is another underlying technical detail I'm not aware of?

@vfdev-5
Copy link
Collaborator

vfdev-5 commented Dec 1, 2020

Good call. I saw it but I was not sure whether we still want it. Let me clean it up here then.

sounds good!

Obviously channels==3 does not mean RGB as there are other colorspaces that use 3 channels, but just to clarify, is this what you mean on your last remark OR there is another underlying technical detail I'm not aware of?

Yes, this is it :)

@datumbox
Copy link
Contributor Author

datumbox commented Dec 1, 2020

@vfdev-5 On second thought, I think the doc removal should happen on a separate PR to avoid mixing things too much. I believe that if I remove it here, we might need to discuss how we communicate to the user that PIL and Tensor support different types, so this might block the merge and I need the changes for an other PR.

I've created an issue to describe what needs to be done: #3071

@datumbox datumbox merged commit 7f1a05a into pytorch:master Dec 2, 2020
@datumbox datumbox deleted the bugfix/limit_channels branch December 2, 2020 18:20
vfdev-5 pushed a commit to Quansight/vision that referenced this pull request Dec 4, 2020
* Fixing upperbound value on tests and documentation.

* Limit the number of channels on adjust_* transoforms.
facebook-github-bot pushed a commit that referenced this pull request Dec 8, 2020
Summary:
* Fixing upperbound value on tests and documentation.

* Limit the number of channels on adjust_* transoforms.

Reviewed By: datumbox

Differential Revision: D25371948

fbshipit-source-id: e998347eee5443341dc60ad6c5b5372b58e05019
@datumbox datumbox mentioned this pull request Jan 5, 2021
13 tasks
@vfdev-5 vfdev-5 mentioned this pull request Jun 28, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants