Fixes deform_conv issue with large input/output #4351
Conversation
Branch updated from 30a2324 to 3498de6
if (use_64bits_indexing) {
  // Large inputs/outputs: do the index arithmetic in int64_t to avoid the
  // 32-bit overflow this PR fixes.
  AT_DISPATCH_FLOATING_TYPES_AND_HALF(
      input.scalar_type(), "deformable_im2col", ([&] {
        deformable_im2col_kernel<scalar_t, int64_t><<<blocks, threads>>>(
            num_kernels,
            input.data_ptr<scalar_t>(),
            data_offset.data_ptr<scalar_t>(),
            data_mask.data_ptr<scalar_t>(),
            height,
            width,
            weight_h,
            weight_w,
            pad_h,
            pad_w,
            stride_h,
            stride_w,
            dilation_h,
            dilation_w,
            parallel_imgs,
            n_in_channels,
            deformable_group,
            out_h,
            out_w,
            use_mask,
            data_col.data_ptr<scalar_t>());
      }));
} else {
  // Smaller tensors: keep int indexing; the only difference from the branch
  // above is the index type template argument.
  AT_DISPATCH_FLOATING_TYPES_AND_HALF(
      input.scalar_type(), "deformable_im2col", ([&] {
        deformable_im2col_kernel<scalar_t, int><<<blocks, threads>>>(
            num_kernels,
            input.data_ptr<scalar_t>(),
            data_offset.data_ptr<scalar_t>(),
            data_mask.data_ptr<scalar_t>(),
            height,
            width,
            weight_h,
            weight_w,
            pad_h,
            pad_w,
            stride_h,
            stride_w,
            dilation_h,
            dilation_w,
            parallel_imgs,
            n_in_channels,
            deformable_group,
            out_h,
            out_w,
            use_mask,
            data_col.data_ptr<scalar_t>());
      }));
}
@fmassa any clever ideas on how to reduce code duplication?
I think this is fine. You could write a macro so the kernel launch is spelled out only once, like what PyTorch does, for example, where the macro is defined in one place and then dispatched where needed. If you do this, don't forget to #undef the macros after they are used.
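For illustration only, a macro along the lines fmassa suggests might look like this; DEFORM_IM2COL_LAUNCH is a hypothetical name, not code from the PR:

// Hypothetical sketch of the macro approach: the two dispatch branches
// differ only in the index type, so the launch can be stamped out once
// per type and the duplication collapses to two one-line calls.
#define DEFORM_IM2COL_LAUNCH(index_t)                                     \
  AT_DISPATCH_FLOATING_TYPES_AND_HALF(                                    \
      input.scalar_type(), "deformable_im2col", ([&] {                    \
        deformable_im2col_kernel<scalar_t, index_t><<<blocks, threads>>>( \
            num_kernels,                                                  \
            input.data_ptr<scalar_t>(),                                   \
            data_offset.data_ptr<scalar_t>(),                             \
            data_mask.data_ptr<scalar_t>(),                               \
            height, width,                                                \
            weight_h, weight_w,                                           \
            pad_h, pad_w,                                                 \
            stride_h, stride_w,                                           \
            dilation_h, dilation_w,                                       \
            parallel_imgs, n_in_channels, deformable_group,               \
            out_h, out_w, use_mask,                                       \
            data_col.data_ptr<scalar_t>());                               \
      }))

if (use_64bits_indexing) {
  DEFORM_IM2COL_LAUNCH(int64_t);
} else {
  DEFORM_IM2COL_LAUNCH(int);
}

// Per the review comment: clean up the macro after use.
#undef DEFORM_IM2COL_LAUNCH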
Changes LGTM, thanks @vfdev-5! If you want to send a follow-up PR adding the macros, I'm fine with it, but let's get this merged now.
Summary:
* WIP on fixing index overflow issue
* Fixed backward pass for large num_kernels
* Fixed clang formatting
* Fixed GET_BLOCKS int/int64_t types issue

Reviewed By: fmassa
Differential Revision: D30793320
fbshipit-source-id: ce99a6c2c0f859b32d2c565da451640331f935f8
Co-authored-by: vfdev-5 <vfdev-5@gmail.com>
Co-authored-by: Francisco Massa <fvsmassa@gmail.com>
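Regarding the last bullet: a hedged sketch of what an overflow-safe GET_BLOCKS could look like (the actual helper in torchvision may differ). The point of the fix is to do the ceil-division in int64_t so a large element count cannot overflow int:

#include <algorithm>
#include <cstdint>
#include <limits>

// Sketch only: compute the CUDA grid size in 64-bit arithmetic, then clamp
// to a value that fits the launch configuration.
inline unsigned int GET_BLOCKS(const int64_t N, const int64_t num_threads) {
  constexpr int64_t kMaxGrid = std::numeric_limits<int>::max();
  return static_cast<unsigned int>(
      std::min(kMaxGrid, (N + num_threads - 1) / num_threads));
}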
Fixes #4269
Description:
!!! Currently, no tests are provided, as the reproducible example requires allocating very large tensors. Any suggestions on that?