Fix write and encode jpeg tests #3908

Merged
merged 4 commits into from
May 25, 2021

NicolasHug
Member

... well... mostly.

This PR fixes the test logic, which is currently wrong: it assumes that the "decode" phase is the same for torchvision and for PIL, which does not hold.
I also ported the tests to use pytest.

Unfortunately, the correct tests don't pass on Windows (probably because of a difference in the underlying libjpeg?), so we keep the old ones around for now. Once this is merged, I will open an issue to track progress on fixing the Windows tests.

This PR should fix the long-standing issues in fbcode (see https://www.internalfb.com/diff/D28605919), but I'm submitting it here instead of in fbcode because the Windows skipping will be easier to verify with CircleCI.

CC @datumbox @pmeier @fmassa as previously discussed

Member

fmassa left a comment


Looks great, thanks!

One thing we could also do in the future if we don't manage to align the libjpeg versions across platforms is to test decode(encode(x)) ~= x, although this is a less robust test compared to what we have now.

For reference, this is how PIL compares images (it uses the histogram).

I would prefer that we keep our more accurate checks in general, but for lossy compression we might need to resort to more approximate checks.
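To make the two approximate checks mentioned above concrete, here is a minimal sketch in plain Python. It is only an illustration: images are modelled as flat lists of 8-bit values, `toy_encode` and `toy_decode` are hypothetical stand-ins for a lossy codec (not JPEG), and the histogram distance is written in the spirit of a histogram comparison, not copied from PIL. In torchvision, `encode` and `decode` would presumably be `torchvision.io.encode_jpeg` and `torchvision.io.decode_jpeg`, with tensors flattened before comparing.

```python
from typing import Callable, List, Sequence


def mean_abs_diff(a: Sequence[int], b: Sequence[int]) -> float:
    """Average absolute per-value difference between two equally sized images."""
    if len(a) != len(b) or not a:
        raise ValueError("inputs must be non-empty and have the same length")
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def histogram(values: Sequence[int], bins: int = 256) -> List[int]:
    """Count how often each 8-bit value occurs."""
    counts = [0] * bins
    for v in values:
        counts[v] += 1
    return counts


def histogram_l1_distance(a: Sequence[int], b: Sequence[int]) -> int:
    """Sum of absolute differences between the two value histograms."""
    return sum(abs(p - q) for p, q in zip(histogram(a), histogram(b)))


def roundtrip_is_close(
    x: Sequence[int],
    encode: Callable[[Sequence[int]], bytes],
    decode: Callable[[bytes], List[int]],
    tol: float,
) -> bool:
    """Check decode(encode(x)) ~= x within a mean-absolute-error tolerance."""
    return mean_abs_diff(decode(encode(x)), x) <= tol


# Toy lossy codec (NOT JPEG): keeps only the top 6 bits of every value.
def toy_encode(x: Sequence[int]) -> bytes:
    return bytes(v // 4 for v in x)


def toy_decode(data: bytes) -> List[int]:
    return [b * 4 + 2 for b in data]


image = list(range(256))
close_enough = roundtrip_is_close(image, toy_encode, toy_decode, tol=2.0)
too_strict = roundtrip_is_close(image, toy_encode, toy_decode, tol=0.5)
```

The tolerance has to be chosen per codec and quality setting, which is part of why such a check is less robust than an exact comparison.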

NicolasHug merged commit eaddb90 into pytorch:master on May 25, 2021
# valid comparison. This is done in test_encode_jpeg, but unfortunately
# these more correct tests fail on windows (probably because of a difference
# in libjpeg between torchvision and PIL).
# FIXME: make the correct tests pass on windows and remove this.
Contributor

I agree with this comment. The right comparison would be to encode from the same decoded image and compare the results. Unfortunately, as Nicolas explains, that's going to fail on Windows. I would not be surprised if the expected image was created on Linux or macOS to make this work. Hence this test is misleading: it does not check that the encoding on our side is the same as the encoding on the PIL side on the same platform.
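The difference between the two comparisons described above can be sketched as follows. This is an illustration only, not the code from the PR: images are modelled as flat lists of 8-bit values, and `toy_encode`, `toy_decode_a` and `toy_decode_b` are hypothetical stand-ins for two libraries whose encoders are identical but whose decoders disagree slightly.

```python
from typing import Callable, List

Pixels = List[int]  # flat list of 8-bit values, standing in for an image tensor
Decoder = Callable[[bytes], Pixels]
Encoder = Callable[[Pixels], bytes]


def flawed_check(
    jpeg_file: bytes,
    decode_a: Decoder,
    decode_b: Decoder,
    encode_a: Encoder,
    encode_b: Encoder,
) -> bool:
    """Each library decodes the file itself, so the encoders may see different pixels."""
    return encode_a(decode_a(jpeg_file)) == encode_b(decode_b(jpeg_file))


def same_input_check(pixels: Pixels, encode_a: Encoder, encode_b: Encoder) -> bool:
    """Both encoders receive the same decoded pixels, so only encoding is compared."""
    return encode_a(pixels) == encode_b(pixels)


# Toy stand-ins (hypothetical, not real JPEG codecs): the two "libraries" share
# an encoder, but their decoders disagree by one intensity level.
def toy_encode(pixels: Pixels) -> bytes:
    return bytes(pixels)


def toy_decode_a(data: bytes) -> Pixels:
    return list(data)


def toy_decode_b(data: bytes) -> Pixels:
    return [min(255, v + 1) for v in data]


jpeg_file = bytes([10, 20, 30])
flawed_result = flawed_check(
    jpeg_file, toy_decode_a, toy_decode_b, toy_encode, toy_encode
)
correct_result = same_input_check(toy_decode_a(jpeg_file), toy_encode, toy_encode)
```

In this toy setup the flawed check reports a mismatch even though the encoders are identical, because the decode step already diverged; the same-input check isolates the encoders.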

Having said that, it's good that Nicolas found a workaround to keep the test until we decide whether we want to keep it or drop it. I'm on the fence about keeping it, and Nicolas briefly considered dropping it. I don't have strong opinions on this. @fmassa thoughts?

Member

I think the underlying issue might be that we are using different libjpeg versions on different OSes, which is less than ideal.

Fixing this would probably fix the issue.

Member Author

I think the key issue is that, apparently, on Windows the libjpeg version of PIL is different from that of torchvision.

I think having different libjpeg versions across OSes wouldn't be a problem if, for a given OS, the libjpeg version were the same for both PIL and torchvision.

Member

@NicolasHug it would be good to log the libraries installed in both places where we run

conda env update --file "${this_dir}/environment.yml" --prune

Our current constraint on libjpeg is <=9b, which in principle allows different versions to be installed on different OSes if one version is missing from conda.
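One possible way to log the relevant versions from inside the test environment is sketched below. It is an assumption-laden illustration, not what the CI scripts do: the function names are hypothetical, the conda package names (`jpeg`, `libjpeg`, `libjpeg-turbo`, `pillow`) are guesses at what the environment might contain, and every lookup falls back gracefully when Pillow or conda is unavailable. It relies on `PIL.features.version("jpg")` for the libjpeg version Pillow was built against and on `conda list --json` for the installed packages.

```python
import json
import shutil
import subprocess
from typing import Dict, Optional, Sequence


def pillow_libjpeg_version() -> Optional[str]:
    """Return the libjpeg version Pillow reports, or None if it is unavailable."""
    try:
        from PIL import features
    except ImportError:
        return None
    version = getattr(features, "version", None)  # missing in older Pillow releases
    return version("jpg") if version is not None else None


def conda_package_versions(
    names: Sequence[str] = ("jpeg", "libjpeg", "libjpeg-turbo", "pillow"),
) -> Dict[str, str]:
    """Return {package: version} for the given names, or {} if conda cannot be queried."""
    conda = shutil.which("conda")
    if conda is None:
        return {}
    try:
        proc = subprocess.run(
            [conda, "list", "--json"],
            capture_output=True, text=True, check=False, timeout=60,
        )
        packages = json.loads(proc.stdout) if proc.returncode == 0 else []
    except (OSError, subprocess.TimeoutExpired, json.JSONDecodeError):
        return {}
    if not isinstance(packages, list):
        return {}
    return {
        p["name"]: p.get("version", "unknown")
        for p in packages
        if isinstance(p, dict) and p.get("name") in names
    }


def jpeg_library_report() -> Dict[str, object]:
    """Collect everything in one dict so it can be printed in the CI log."""
    return {
        "pillow_libjpeg": pillow_libjpeg_version(),
        "conda_packages": conda_package_versions(),
    }


report = jpeg_library_report()
print(json.dumps(report, indent=2))
```

Printing this once per platform would show whether PIL and the conda environment agree on libjpeg for a given OS. It does not report the libjpeg that torchvision itself was compiled against.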

Member

But if the problem lies within PIL, does that mean that a test comparing the encoding of PIL on the CI itself should fail?

Member Author

> it would be good to log the libraries installed

I'm not sure how best to do that yet, but I started looking into it in #3968.

facebook-github-bot pushed a commit that referenced this pull request May 25, 2021
Reviewed By: vincentqb, cpuhrsch

Differential Revision: D28679967

fbshipit-source-id: 000bea1e2bc5fe7db14fbc36d80528300c7f7650
4 participants