-
Notifications
You must be signed in to change notification settings - Fork 7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Added bicubic support for interpolation with AA #3810
Added bicubic support for interpolation with AA #3810
Conversation
…polate-aa-cpu-more-modes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great, thanks!
I think this is good to merge. Can you show me what type of differences in interpolation we have between different methods?
// taken from | ||
// https://github.com/python-pillow/Pillow/blob/6812205f18ca4ef54372e87e1a13ce4a859434df/ | ||
// src/libImaging/Resample.c#L46-L62 | ||
static inline scalar_t _filter(scalar_t x) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For the future: check if this is equivalent / similar to https://github.com/pytorch/pytorch/blob/master/aten/src/ATen/native/UpSample.h#L324-L332
m.impl( | ||
TORCH_SELECTIVE_NAME("torchvision::_interpolate_bicubic_aa"), | ||
TORCH_FN(interpolate_bicubic_aa_forward_kernel)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Up to you, but another option would be to handle this all inside a single function in C++, and don't expose multiple variants to be dispatched on the python side of things.
# High value is mostly required for test cases with | ||
# downsampling and upsampling where we can not exactly | ||
# match PIL implementation. | ||
accepted_tol = 15.0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you share some image examples with me of the image difference between PIL and your implementation?
Something like plt.imshow(pil_interp - tv_interp)
so that I can see what types of differences we are seeing here?
Thanks! I think the differences might be due to PIL computing the bicubic interpolation with integers while we use floats, so they might have more rounding errors due to the more expensive computations. Let's get this merged, can you rebase the PR? |
Summary: * Added support for bicubic mode with AA * Updated comment in the test Reviewed By: cpuhrsch Differential Revision: D28538771 fbshipit-source-id: 8c5bc434a8b3478c2088b46886a28c561d666b55
Summary: Description: - Added antialias flag to interpolate (CPU only) - forward and backward for bicubic mode - added tests Previous PR for bilinear, #65142 ### Benchmarks <details> <summary> Forward pass, CPU. PTH interpolation vs PIL </summary> Cases: - PTH RGB 3 Channels, float32 vs PIL RGB uint8 (apples vs pears) - PTH 1 Channel, float32 vs PIL 1 Channel Float Code: https://gist.github.com/vfdev-5/b173761a567f2283b3c649c3c0574112 ``` Torch config: PyTorch built with: - GCC 9.3 - C++ Version: 201402 - OpenMP 201511 (a.k.a. OpenMP 4.5) - CPU capability usage: AVX2 - CUDA Runtime 11.1 - NVCC architecture flags: -gencode;arch=compute_61,code=sm_61 - CuDNN 8.0.5 - Build settings: BUILD_TYPE=Release, CUDA_VERSION=11.1, CUDNN_VERSION=8.0.5, CXX_COMPILER=/usr/bin/c++, CXX_FLAGS= -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -fopenmp -DNDEBUG -DUSE_KINETO -DUSE_PYTORCH_QNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -DEDGE_PROFILER_USE_KINETO -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-function -Wno-unused-result -Wno-unused-local-typedefs -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Werror=cast-function-type -Wno-stringop-overflow, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_VERSION=1.11.0, USE_CUDA=1, USE_CUDNN=1, USE_EIGEN_FOR_BLAS=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=OFF, USE_MKLDNN=OFF, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=0, USE_OPENMP=ON, USE_ROCM=OFF, Num threads: 1 [------------------- Downsampling (bicubic): torch.Size([1, 3, 906, 438]) -> (320, 196) -------------------] | Reference, PIL 8.4.0, mode: RGB | 1.11.0a0+gitb0bdf58 1 threads: ------------------------------------------------------------------------------------------------- channels_first contiguous torch.float32 | 4.5 | 5.2 channels_last non-contiguous torch.float32 | 4.5 | 5.3 Times are in milliseconds (ms). [------------------- Downsampling (bicubic): torch.Size([1, 3, 906, 438]) -> (460, 220) -------------------] | Reference, PIL 8.4.0, mode: RGB | 1.11.0a0+gitb0bdf58 1 threads: ------------------------------------------------------------------------------------------------- channels_first contiguous torch.float32 | 5.7 | 6.4 channels_last non-contiguous torch.float32 | 5.7 | 6.4 Times are in milliseconds (ms). [------------------- Downsampling (bicubic): torch.Size([1, 3, 906, 438]) -> (120, 96) --------------------] | Reference, PIL 8.4.0, mode: RGB | 1.11.0a0+gitb0bdf58 1 threads: ------------------------------------------------------------------------------------------------- channels_first contiguous torch.float32 | 3.0 | 4.0 channels_last non-contiguous torch.float32 | 2.9 | 4.1 Times are in milliseconds (ms). [------------------ Downsampling (bicubic): torch.Size([1, 3, 906, 438]) -> (1200, 196) -------------------] | Reference, PIL 8.4.0, mode: RGB | 1.11.0a0+gitb0bdf58 1 threads: ------------------------------------------------------------------------------------------------- channels_first contiguous torch.float32 | 14.7 | 17.1 channels_last non-contiguous torch.float32 | 14.8 | 17.2 Times are in milliseconds (ms). [------------------ Downsampling (bicubic): torch.Size([1, 3, 906, 438]) -> (120, 1200) -------------------] | Reference, PIL 8.4.0, mode: RGB | 1.11.0a0+gitb0bdf58 1 threads: ------------------------------------------------------------------------------------------------- channels_first contiguous torch.float32 | 3.5 | 3.9 channels_last non-contiguous torch.float32 | 3.5 | 3.9 Times are in milliseconds (ms). [---------- Downsampling (bicubic): torch.Size([1, 1, 906, 438]) -> (320, 196) ---------] | Reference, PIL 8.4.0, mode: F | 1.11.0a0+gitb0bdf58 1 threads: ------------------------------------------------------------------------------ contiguous torch.float32 | 2.4 | 1.8 Times are in milliseconds (ms). [---------- Downsampling (bicubic): torch.Size([1, 1, 906, 438]) -> (460, 220) ---------] | Reference, PIL 8.4.0, mode: F | 1.11.0a0+gitb0bdf58 1 threads: ------------------------------------------------------------------------------ contiguous torch.float32 | 3.1 | 2.2 Times are in milliseconds (ms). [---------- Downsampling (bicubic): torch.Size([1, 1, 906, 438]) -> (120, 96) ----------] | Reference, PIL 8.4.0, mode: F | 1.11.0a0+gitb0bdf58 1 threads: ------------------------------------------------------------------------------ contiguous torch.float32 | 1.6 | 1.4 Times are in milliseconds (ms). [--------- Downsampling (bicubic): torch.Size([1, 1, 906, 438]) -> (1200, 196) ---------] | Reference, PIL 8.4.0, mode: F | 1.11.0a0+gitb0bdf58 1 threads: ------------------------------------------------------------------------------ contiguous torch.float32 | 7.9 | 5.7 Times are in milliseconds (ms). [--------- Downsampling (bicubic): torch.Size([1, 1, 906, 438]) -> (120, 1200) ---------] | Reference, PIL 8.4.0, mode: F | 1.11.0a0+gitb0bdf58 1 threads: ------------------------------------------------------------------------------ contiguous torch.float32 | 1.7 | 1.3 Times are in milliseconds (ms). ``` </details> Code is moved from torchvision: pytorch/vision#3810 and pytorch/vision#4208 Pull Request resolved: #68819 Reviewed By: mikaylagawarecki Differential Revision: D33339117 Pulled By: jbschlosser fbshipit-source-id: 6a0443bbba5439f52c7dbc1be819b75634cf67c4
Summary: Description: - Added antialias flag to interpolate (CPU only) - forward and backward for bicubic mode - added tests Previous PR for bilinear, #65142 ### Benchmarks <details> <summary> Forward pass, CPU. PTH interpolation vs PIL </summary> Cases: - PTH RGB 3 Channels, float32 vs PIL RGB uint8 (apples vs pears) - PTH 1 Channel, float32 vs PIL 1 Channel Float Code: https://gist.github.com/vfdev-5/b173761a567f2283b3c649c3c0574112 ``` Torch config: PyTorch built with: - GCC 9.3 - C++ Version: 201402 - OpenMP 201511 (a.k.a. OpenMP 4.5) - CPU capability usage: AVX2 - CUDA Runtime 11.1 - NVCC architecture flags: -gencode;arch=compute_61,code=sm_61 - CuDNN 8.0.5 - Build settings: BUILD_TYPE=Release, CUDA_VERSION=11.1, CUDNN_VERSION=8.0.5, CXX_COMPILER=/usr/bin/c++, CXX_FLAGS= -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -fopenmp -DNDEBUG -DUSE_KINETO -DUSE_PYTORCH_QNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -DEDGE_PROFILER_USE_KINETO -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-function -Wno-unused-result -Wno-unused-local-typedefs -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Werror=cast-function-type -Wno-stringop-overflow, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_VERSION=1.11.0, USE_CUDA=1, USE_CUDNN=1, USE_EIGEN_FOR_BLAS=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=OFF, USE_MKLDNN=OFF, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=0, USE_OPENMP=ON, USE_ROCM=OFF, Num threads: 1 [------------------- Downsampling (bicubic): torch.Size([1, 3, 906, 438]) -> (320, 196) -------------------] | Reference, PIL 8.4.0, mode: RGB | 1.11.0a0+gitb0bdf58 1 threads: ------------------------------------------------------------------------------------------------- channels_first contiguous torch.float32 | 4.5 | 5.2 channels_last non-contiguous torch.float32 | 4.5 | 5.3 Times are in milliseconds (ms). [------------------- Downsampling (bicubic): torch.Size([1, 3, 906, 438]) -> (460, 220) -------------------] | Reference, PIL 8.4.0, mode: RGB | 1.11.0a0+gitb0bdf58 1 threads: ------------------------------------------------------------------------------------------------- channels_first contiguous torch.float32 | 5.7 | 6.4 channels_last non-contiguous torch.float32 | 5.7 | 6.4 Times are in milliseconds (ms). [------------------- Downsampling (bicubic): torch.Size([1, 3, 906, 438]) -> (120, 96) --------------------] | Reference, PIL 8.4.0, mode: RGB | 1.11.0a0+gitb0bdf58 1 threads: ------------------------------------------------------------------------------------------------- channels_first contiguous torch.float32 | 3.0 | 4.0 channels_last non-contiguous torch.float32 | 2.9 | 4.1 Times are in milliseconds (ms). [------------------ Downsampling (bicubic): torch.Size([1, 3, 906, 438]) -> (1200, 196) -------------------] | Reference, PIL 8.4.0, mode: RGB | 1.11.0a0+gitb0bdf58 1 threads: ------------------------------------------------------------------------------------------------- channels_first contiguous torch.float32 | 14.7 | 17.1 channels_last non-contiguous torch.float32 | 14.8 | 17.2 Times are in milliseconds (ms). [------------------ Downsampling (bicubic): torch.Size([1, 3, 906, 438]) -> (120, 1200) -------------------] | Reference, PIL 8.4.0, mode: RGB | 1.11.0a0+gitb0bdf58 1 threads: ------------------------------------------------------------------------------------------------- channels_first contiguous torch.float32 | 3.5 | 3.9 channels_last non-contiguous torch.float32 | 3.5 | 3.9 Times are in milliseconds (ms). [---------- Downsampling (bicubic): torch.Size([1, 1, 906, 438]) -> (320, 196) ---------] | Reference, PIL 8.4.0, mode: F | 1.11.0a0+gitb0bdf58 1 threads: ------------------------------------------------------------------------------ contiguous torch.float32 | 2.4 | 1.8 Times are in milliseconds (ms). [---------- Downsampling (bicubic): torch.Size([1, 1, 906, 438]) -> (460, 220) ---------] | Reference, PIL 8.4.0, mode: F | 1.11.0a0+gitb0bdf58 1 threads: ------------------------------------------------------------------------------ contiguous torch.float32 | 3.1 | 2.2 Times are in milliseconds (ms). [---------- Downsampling (bicubic): torch.Size([1, 1, 906, 438]) -> (120, 96) ----------] | Reference, PIL 8.4.0, mode: F | 1.11.0a0+gitb0bdf58 1 threads: ------------------------------------------------------------------------------ contiguous torch.float32 | 1.6 | 1.4 Times are in milliseconds (ms). [--------- Downsampling (bicubic): torch.Size([1, 1, 906, 438]) -> (1200, 196) ---------] | Reference, PIL 8.4.0, mode: F | 1.11.0a0+gitb0bdf58 1 threads: ------------------------------------------------------------------------------ contiguous torch.float32 | 7.9 | 5.7 Times are in milliseconds (ms). [--------- Downsampling (bicubic): torch.Size([1, 1, 906, 438]) -> (120, 1200) ---------] | Reference, PIL 8.4.0, mode: F | 1.11.0a0+gitb0bdf58 1 threads: ------------------------------------------------------------------------------ contiguous torch.float32 | 1.7 | 1.3 Times are in milliseconds (ms). ``` </details> Code is moved from torchvision: pytorch/vision#3810 and pytorch/vision#4208 Pull Request resolved: #68819 Reviewed By: mikaylagawarecki Differential Revision: D33339117 Pulled By: jbschlosser fbshipit-source-id: 6a0443bbba5439f52c7dbc1be819b75634cf67c4
Description:
Note: