NVIDIA is implementing an optimization that passes the LHS operand of WGMMA ops in registers. This allows an element-wise prologue to pass its intermediate result directly to WGMMA instead of first writing it to shared memory, as is currently done.
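For concreteness, the semantics involved can be modeled in plain NumPy (a sketch with hypothetical names; the real change operates on Triton's WGMMA lowering, not on NumPy arrays). Both paths compute the same result; the optimization only changes where the prologue's output lives, which is what makes the fusion legal:

```python
import numpy as np

def prologue(a):
    # Hypothetical element-wise prologue (here: upcast + scale).
    return a.astype(np.float32) * 0.5

def mma_via_shared_memory(a, b):
    # Current behavior: the prologue result is written to a staging
    # buffer (modeling shared memory) and read back before the MMA.
    smem = np.empty(a.shape, dtype=np.float32)
    smem[...] = prologue(a)   # write to "shared memory"
    lhs = smem.copy()         # read back as the MMA operand
    return lhs @ b

def mma_from_registers(a, b):
    # Proposed behavior: the prologue result feeds the MMA directly,
    # with no staging buffer in between.
    return prologue(a) @ b

rng = np.random.default_rng(0)
A = rng.standard_normal((64, 32)).astype(np.float16)
B = rng.standard_normal((32, 16)).astype(np.float32)
assert np.allclose(mma_via_shared_memory(A, B), mma_from_registers(A, B))
```

On hardware the win is skipping the shared-memory round trip (and its synchronization), not a change in the math.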
The OpenXLA team is currently reviewing NVIDIA's changes with the intent of eventually writing a PR against this repository. We heard through @ThomasRaoux and @gflegar that the Triton team is planning a similar feature, so it would be great to align the requirements. @ThomasRaoux, how can we best achieve this? Are you far enough along in the planning phase that you could provide some feedback to @ggengnv? Or would you prefer for us to do a round of reviews first?
@lezcano is currently working on a mixed-mode kernel that will require this support, but at this point there hasn't been much design work on the MMAv3-specific part yet, and there are a few steps before getting to that.
Looking at the changes in the link, it seems that some more work is needed to productize them, so maybe we should join efforts indeed. What kind of timeline did you have in mind for this work?