-
I see [...]

For the sake of discussion, let us consider this one:

cutlass/include/cutlass/gemm/collective/sm90_mma_tma_gmma_ss.hpp
Lines 424 to 445 in a8f2c80

I noticed that locally, I could disable the first [...] Hence, I wonder what the [...]

I did a little bit of digging, and I found that the [...]

cutlass/include/cute/arch/mma_sm90_gmma.hpp
Lines 92 to 98 in a8f2c80

So, unlike [...]

Thank you!
Replies: 2 comments 1 reply
-
This does not have any correctness implication for the kernel. It is simply an NVVM code motion fence that ensures the registers of the WGMMA instruction do not get touched by anything else in the middle of the WGMMA batch of instructions.
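To make the idea of a register-level code motion fence concrete, here is a minimal host-side sketch, not the CUTLASS code itself. It assumes a GCC- or Clang-compatible compiler (for the inline `asm` syntax), and the names `fence_operand` and `accumulate_with_fence` are hypothetical, chosen only for illustration. The integer loop is a stand-in for the batch of instructions being protected.

```cpp
#include <cstdint>

// Hypothetical host-side analog of a register "code motion fence".
// The empty asm emits no instructions. The "+r" constraint tells the
// compiler that the value is both read and written here, so it must be
// materialized at this point and cannot be cached or moved across it.
inline void fence_operand(std::uint32_t& reg) {
  asm volatile("" : "+r"(reg) :: "memory");
}

// Sums 0..n-1 into an accumulator, fencing the accumulator before and
// after the loop, the way a kernel might bracket a batch of instructions.
inline std::uint32_t accumulate_with_fence(std::uint32_t n) {
  std::uint32_t accum = 0;
  fence_operand(accum);  // earlier writes to accum cannot sink below this point
  for (std::uint32_t i = 0; i < n; ++i) {
    accum += i;          // stand-in for the protected batch of instructions
  }
  fence_operand(accum);  // later uses of accum cannot be hoisted above this point
  return accum;
}
```

The fence does not change the value that is computed; it only constrains how the compiler may reorder code that touches the fenced register, which matches the point that it has no correctness implication on its own.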
-
A reply in 3 mins... Wow! Thank you, @thakkarV!

Following up on your reply: can we come up with an example where something would go wrong without [...]? For instance (I am not sure if it is the case, but), does this [...]