[DONOTMERGE] deform_conv2d with headers #3051

datumbox · 2020-11-26T18:25:49Z

Proof of concept for #2956

Builds upon #2998

Moves C++ code from header files to cpp files. Each cpp file now has a separate header file with "public" functions.
Remove unnecessary repeated includes.
Use a similar naming convention as in PyTorch for the kernel implementations in C++ and CUDA.
Remove the _param postfix from the variables (DeformConv2d specific detail).
Register the operators in their own files instead all of them in vision.cpp.

…and cpp files.

- Create header files for kernel implementation and remove definitions from vision_*.h files. - Eliminate unnecessary headers and ensure all cpp include their headers.

datumbox · 2020-11-27T18:07:51Z

Concerning registering the operators in the Operator files (instead in vision.cpp), I checked how PyTorch does this and noticed that they use the DECLARE_DISPATCH macro. Here is an example of how the operator is registered in various places of the implementation:

I'm not sure whether it's possible for us to use the above macro but I understand that we can't just move the TORCH_LIBRARY and TORCH_LIBRARY_IMPL declarations on multiple places. So it's worth checking out what alternatives we have.

ezyang · 2020-11-30T16:33:50Z

DECLARE_DISPATCH serves a different function; it's for making sure we can dispatch to differently vectorized implementations (we build CPU kernels multiple times at different levels of vectorization, then dispatch to them depending on what CPU capability is supported at runtime). This functionality hasn't been integrated into the dispatcher proper so it has to be done separately.

I'm not sure whether it's possible for us to use the above macro but I understand that we can't just move the TORCH_LIBRARY and TORCH_LIBRARY_IMPL declarations on multiple places. So it's worth checking out what alternatives we have.

I'm not sure what problem you're referring to here. TORCH_LIBRARY_IMPL is explicitly designed to be usable from multiple files. The model is:

Define schemas for all operators in torchvision in one file
For each implementation (can be separate file), have a TORCH_LIBRARY_IMPL that registers it

You can see an example in aten/src/ATen/native/quantized/library.cpp

…thods around to minimize git blame changes.

datumbox · 2020-11-30T17:46:11Z

You can see an example in aten/src/ATen/native/quantized/library.cpp

@ezyang Thanks that's useful.

Based on the example, the registration of the forward/backward methods continues happening in a single file (library.cpp in your example) using macro TORCH_LIBRARY but the TORCH_LIBRARY_IMPL calls move in the place where the methods are defined.

In our case then, I understand that:

the TORCH_LIBRARY calls will remain in the csrc/vision.cpp file.
the CPU TORCH_LIBRARY_IMPL calls will go in the csrc/cpu/deform_conv2d_kernel.cpp file.
the CUDA TORCH_LIBRARY_IMPL calls will go in the csrc/cuda/deform_conv2d_kernel.cu file .
the Autocast/Autograd TORCH_LIBRARY_IMPL calls will go in the csrc/deform_conv2d.cpp file.

Edit: I implemented the above on df0e8a7 but unfortunately it fails on windows:

[10/14] C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.1\bin\nvcc --use-local-env -Xcompiler /MD -Xcompiler /wd4819 -Xcompiler /wd4251 -Xcompiler /wd4244 -Xcompiler /wd4267 -Xcompiler /wd4275 -Xcompiler /wd4018 -Xcompiler /wd4190 -Xcompiler /EHsc -Xcudafe --diag_suppress=base_class_has_different_dll_interface -Xcudafe --diag_suppress=field_without_dll_interface -Xcudafe --diag_suppress=dll_interface_conflict_none_assumed -Xcudafe --diag_suppress=dll_interface_conflict_dllexport_assumed -DWITH_CUDA -Dtorchvision_EXPORTS -IC:\Users\circleci\project\torchvision\csrc -IC:\Users\circleci\project\env\lib\site-packages\torch\include -IC:\Users\circleci\project\env\lib\site-packages\torch\include\torch\csrc\api\include -IC:\Users\circleci\project\env\lib\site-packages\torch\include\TH -IC:\Users\circleci\project\env\lib\site-packages\torch\include\THC "-IC:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.1\include" -IC:\Users\circleci\project\env\include -IC:\Users\circleci\project\env\include "-IC:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.27.29110\ATLMFC\include" "-IC:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.27.29110\include" "-IC:\Program Files (x86)\Windows Kits\NETFXSDK\4.8\include\um" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.19041.0\ucrt" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.19041.0\shared" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.19041.0\um" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.19041.0\winrt" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.19041.0\cppwinrt" -c C:\Users\circleci\project\torchvision\csrc\cuda\deform_conv2d_kernel.cu -o C:\Users\circleci\project\build\temp.win-amd64-3.8\Release\Users\circleci\project\torchvision\csrc\cuda\deform_conv2d_kernel.obj -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=_C -D_GLIBCXX_USE_CXX11_ABI=0 -gencode=arch=compute_75,code=sm_75
FAILED: C:/Users/circleci/project/build/temp.win-amd64-3.8/Release/Users/circleci/project/torchvision/csrc/cuda/deform_conv2d_kernel.obj 
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.1\bin\nvcc --use-local-env -Xcompiler /MD -Xcompiler /wd4819 -Xcompiler /wd4251 -Xcompiler /wd4244 -Xcompiler /wd4267 -Xcompiler /wd4275 -Xcompiler /wd4018 -Xcompiler /wd4190 -Xcompiler /EHsc -Xcudafe --diag_suppress=base_class_has_different_dll_interface -Xcudafe --diag_suppress=field_without_dll_interface -Xcudafe --diag_suppress=dll_interface_conflict_none_assumed -Xcudafe --diag_suppress=dll_interface_conflict_dllexport_assumed -DWITH_CUDA -Dtorchvision_EXPORTS -IC:\Users\circleci\project\torchvision\csrc -IC:\Users\circleci\project\env\lib\site-packages\torch\include -IC:\Users\circleci\project\env\lib\site-packages\torch\include\torch\csrc\api\include -IC:\Users\circleci\project\env\lib\site-packages\torch\include\TH -IC:\Users\circleci\project\env\lib\site-packages\torch\include\THC "-IC:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.1\include" -IC:\Users\circleci\project\env\include -IC:\Users\circleci\project\env\include "-IC:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.27.29110\ATLMFC\include" "-IC:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.27.29110\include" "-IC:\Program Files (x86)\Windows Kits\NETFXSDK\4.8\include\um" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.19041.0\ucrt" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.19041.0\shared" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.19041.0\um" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.19041.0\winrt" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.19041.0\cppwinrt" -c C:\Users\circleci\project\torchvision\csrc\cuda\deform_conv2d_kernel.cu -o C:\Users\circleci\project\build\temp.win-amd64-3.8\Release\Users\circleci\project\torchvision\csrc\cuda\deform_conv2d_kernel.obj -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=_C -D_GLIBCXX_USE_CXX11_ABI=0 -gencode=arch=compute_75,code=sm_75
C:/Users/circleci/project/env/lib/site-packages/torch/include\c10/util/ThreadLocalDebugInfo.h(12): warning: modifier is ignored on an enum specifier

C:/Users/circleci/project/env/lib/site-packages/torch/include\ATen/record_function.h(19): warning: modifier is ignored on an enum specifier

C:/Users/circleci/project/env/lib/site-packages/torch/include\torch/csrc/jit/ir/ir.h(1349): error: member "torch::jit::ProfileOptionalOp::Kind" may not be initialized

C:/Users/circleci/project/env/lib/site-packages/torch/include\torch/csrc/j
it/serialization/pickler.h(211): error: member "torch::jit::Pickler::kBufferSize" may not be initialized

C:/Users/circleci/project/env/lib/site-packages/torch/include\torch/csrc/jit/serialization/pickler.h(228): error: expression must have a constant value
C:/Users/circleci/project/env/lib/site-packages/torch/include\torch/csrc/jit/serialization/pickler.h(228): note: the value of member "torch::jit::Pickler::kBufferSize"
(211): here cannot be used as a constant

C:/Users/circleci/project/env/lib/site-packages/torch/include\torch/csrc/autograd/profiler_legacy.h(93): warning: modifier is ignored on an enum specifier

C:/Users/circleci/project/env/lib/site-packages/torch/include\torch/csrc/autograd/profiler_legacy.h(356): warning: modifier is ignored on an enum specifier

C:/Users/circleci/project/env/lib/site-packages/torch/include\torch/csrc/autograd/profiler_kineto.h(16): warning: modifier is ignored on an enum specifier

3 errors detected in the compilation of "C:/Users/circleci/AppData/Local/Temp/tmpxft_00000a54_00000000-7_deform_conv2d_kernel.cpp1.ii".
deform_conv2d_kernel.cu

All other platforms work fine (ignore macOS_py3.7 which currently has an unrelated issue). If you have any recommendation please let me know.

ezyang · 2020-12-01T15:46:18Z

This looks like the library is causing more headers to be included, and the flags you are compiling under trip an error under the flags. To unblock, I'd find the flag controlling "C:/Users/circleci/project/env/lib/site-packages/torch/include\torch/csrc/jit/ir/ir.h(1349): error: member "torch::jit::ProfileOptionalOp::Kind" may not be initialized
" and suppress this error in vision (it should be somewhere in pytorch cmake). The proper fix is to fix torch/csrc/jit/ir/ir.h to fix the warning.

datumbox · 2020-12-01T17:10:17Z

@ezyang Thanks for the pointer. We decided to go ahead with the rest of the changes without including the registration move but I'll do the investigation once the main refactoring is completed.

datumbox · 2020-12-02T13:50:39Z

@ezyang

I traced the original problem back to the use of constexpr instead of const that causes issues under windows. There are two problematic files in PyTorch: ir.h and pickler.h. This issue has appeared before and was patched in different places:

how to use libtorch library in cuda file with nvcc compiler(c++)? pytorch#27958
member "torch::jit::detail::AttributePolicy::all_slots" may not be initialized pytorch#39394
[Caffe2] Windows build errors in generated file caffe2.pb.h pytorch#7962

I think the permanent fix is do replace constexpr with CONSTEXPR_EXCEPT_WIN_CUDA similar to this PR. I'll try sending a PR to PyTorch later this week but since this is a blocker for me, I wanted to explore your other suggestion of using flags on TorchVision's cmake conf. I'm not sure which flag is responsible for turning off the member ”X” may not be initialized warning. I've seen some solutions in an other PR but I'm not certain if this is advised.

If you have any pointers will be highly appreciated.

ezyang · 2020-12-02T15:13:11Z

I was hoping the workaround would be simple, guess it's not obvious. Just send in the upstream fix and ping on work chat, I can expedite its review.

datumbox · 2020-12-04T11:54:35Z

It seems that the issue with pickler.h is resolved by pytorch/pytorch#48717 but the ir.h requires additional work. We now fail with:

C:/Users/circleci/project/env/lib/site-packages/torch/include\torch/csrc/jit/ir/ir.h(1329): error: a member of type "const c10::Symbol" cannot have an in-class initializer

C:/Users/circleci/project/env/lib/site-packages/torch/include\torch/csrc/jit/ir/ir.h(1349): error: a member of type "const c10::Symbol" cannot have an in-class initializer

I'm going to try to move the initializer outside of the class on a separate PR but I would really appreciate some input.

datumbox added 7 commits November 12, 2020 21:34

Rename files.

af140b6

Standardizing method names.

5946432

Adding anonymous namespaces.

a47ff4a

Applying C++ naming rules and alinging variable names across headers …

88596ab

…and cpp files.

Syncing names across implementations.

aefe934

Merge branch 'master' into refactoring/namespaces_ops

f38b4a7

Rename deform_conv2d.h to deform_conv2d.cpp

602acb2

facebook-github-bot added the cla signed label Nov 26, 2020

datumbox force-pushed the refactoring/headers_ops branch 4 times, most recently from 77cbf7c to 868de6c Compare November 26, 2020 19:10

Use header files:

9026df0

- Create header files for kernel implementation and remove definitions from vision_*.h files. - Eliminate unnecessary headers and ensure all cpp include their headers.

datumbox force-pushed the refactoring/headers_ops branch from 868de6c to 9026df0 Compare November 26, 2020 19:20

datumbox added 2 commits November 26, 2020 20:45

Change the naming convention for kernel implementations.

ec11906

Remove the _param postfix from the variables and standardizing names.

f3f8469

datumbox force-pushed the refactoring/headers_ops branch from 76e7140 to f3f8469 Compare November 26, 2020 22:03

datumbox requested a review from fmassa November 26, 2020 22:20

datumbox force-pushed the refactoring/headers_ops branch from e1ecdcd to 6f87104 Compare November 27, 2020 16:53

datumbox force-pushed the refactoring/headers_ops branch from 6f87104 to f6782a0 Compare November 30, 2020 17:16

Exposing public forward/backward methods to the C++ API and moving me…

da80ce1

…thods around to minimize git blame changes.

datumbox force-pushed the refactoring/headers_ops branch from f6782a0 to da80ce1 Compare November 30, 2020 17:20

datumbox force-pushed the refactoring/headers_ops branch 3 times, most recently from f3c5a2e to 085ab41 Compare November 30, 2020 19:17

Registering operators in their files.

df0e8a7

datumbox force-pushed the refactoring/headers_ops branch from 085ab41 to df0e8a7 Compare November 30, 2020 21:33

datumbox closed this Dec 2, 2020

datumbox mentioned this pull request Dec 2, 2020

Error "member may not be initialized" due to constexpr at Windows pytorch/pytorch#48716

Closed

datumbox deleted the refactoring/headers_ops branch December 2, 2020 17:44

This was referenced Dec 4, 2020

error: a member of type "const c10::Symbol" cannot have an in-class initializer on Windows pytorch/pytorch#48835

Closed

Per file C++ Operator registration #3135

Merged

Restructure and clean up the video/video_reader C++ implementations #3155

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[DONOTMERGE] deform_conv2d with headers #3051

[DONOTMERGE] deform_conv2d with headers #3051

datumbox commented Nov 26, 2020 •

edited

Loading

datumbox commented Nov 27, 2020 •

edited

Loading

ezyang commented Nov 30, 2020

datumbox commented Nov 30, 2020 •

edited

Loading

ezyang commented Dec 1, 2020

datumbox commented Dec 1, 2020

datumbox commented Dec 2, 2020

ezyang commented Dec 2, 2020

datumbox commented Dec 4, 2020 •

edited

Loading

[DONOTMERGE] deform_conv2d with headers #3051

[DONOTMERGE] deform_conv2d with headers #3051

Conversation

datumbox commented Nov 26, 2020 • edited Loading

datumbox commented Nov 27, 2020 • edited Loading

ezyang commented Nov 30, 2020

datumbox commented Nov 30, 2020 • edited Loading

ezyang commented Dec 1, 2020

datumbox commented Dec 1, 2020

datumbox commented Dec 2, 2020

ezyang commented Dec 2, 2020

datumbox commented Dec 4, 2020 • edited Loading

datumbox commented Nov 26, 2020 •

edited

Loading

datumbox commented Nov 27, 2020 •

edited

Loading

datumbox commented Nov 30, 2020 •

edited

Loading

datumbox commented Dec 4, 2020 •

edited

Loading