Unalignment case for conv #242
Comments
CONV currently does not support this feature, but it is fairly easy to add. In GEMM, this line splits a 128-bit load into multiple smaller loads. You just need to do the same in the conv mainloop in https://github.com/NVIDIA/cutlass/blob/master/include/cutlass/conv/threadblock/implicit_gemm_multistage.h . The underlying iterators in https://github.com/NVIDIA/cutlass/blob/master/include/cutlass/conv/threadblock and several defaultxxx files need some plumbing, too. We welcome the community to upstream this feature.
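To make the idea of splitting a 128-bit load concrete, here is a minimal host-side C++ sketch. It is not the CUTLASS code referred to above: `split_vector_load` and its parameters are hypothetical names chosen for illustration, and `std::memcpy` stands in for the device-side global-to-shared copy. It only shows the principle that one 16-byte access can be issued as several narrower accesses, each of which needs only the narrower alignment.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper (not a CUTLASS API): copies one 16-byte (128-bit)
// vector as a sequence of smaller accesses of `access_bytes` each.
// Returns the number of accesses that were issued.
inline int split_vector_load(const std::uint8_t* src, std::uint8_t* dst,
                             std::size_t access_bytes) {
  constexpr std::size_t kVectorBytes = 16;  // 128 bits
  // The access size must evenly divide the full vector.
  assert(access_bytes > 0 && kVectorBytes % access_bytes == 0);

  int accesses = 0;
  for (std::size_t offset = 0; offset < kVectorBytes; offset += access_bytes) {
    // Each piece only requires `access_bytes` alignment, so the source
    // pointer no longer has to be 16-byte aligned.
    std::memcpy(dst + offset, src + offset, access_bytes);
    ++accesses;
  }
  return accesses;
}
```

With `access_bytes = 4`, the same 16 bytes are moved in four 32-bit pieces; with `access_bytes = 16`, the loop degenerates to the single full-width access.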
Okay, thanks. I will try to upload a patch.
Awesome. First figure out how GEMM works, and then do the same for CONV. It is not hard.
support unalignment input for conv2d fprop stage=2 Fix for issue #242
I found that I can set AlignmentA/B in GEMM to handle int4 shapes that are not aligned.
But how can I run conv in a similar case, such as when the channel count equals 1 or 3?
Is there any configuration to set the global load granularity for conv? Thanks.
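As a rough illustration of why channel counts such as 1 or 3 are a problem, the sketch below computes the widest access, in elements, that a given contiguous extent allows. `max_alignment_elements` is a hypothetical helper, not part of CUTLASS, and it assumes that alignments are powers of two capped at one 128-bit access. It ignores any minimum access size the hardware may impose on sub-byte types such as int4.

```cpp
#include <cassert>

// Hypothetical helper (not a CUTLASS API): returns the largest
// power-of-two number of elements that divides `extent` and still
// fits into a single 128-bit access.
inline int max_alignment_elements(int extent, int element_bits) {
  assert(extent > 0 && element_bits > 0 && element_bits <= 128);
  const int max_elems = 128 / element_bits;  // elements per 128-bit access

  int alignment = 1;
  // Double the alignment while it still divides the extent and
  // stays within the 128-bit cap.
  while (alignment * 2 <= max_elems && extent % (alignment * 2) == 0) {
    alignment *= 2;
  }
  return alignment;
}
```

Under these assumptions, a 16-bit tensor with 3 channels allows only single-element accesses, while 8 channels allow the full 8-element (128-bit) access. A value like this is what one would pass as AlignmentA/B in the GEMM case described above.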