Requirements to pass WGMMA LHS operand in registers #4785

Open
chsigg opened this issue Sep 23, 2024 · 3 comments

Comments

chsigg (Collaborator) commented Sep 23, 2024

NVIDIA is implementing an optimization to pass the LHS operand of WGMMA ops in registers. This allows element-wise prologues to pass their intermediate result directly to WGMMA instead of first writing it to shared memory, as is currently required.
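
For illustration, here is a minimal, hypothetical Triton kernel sketch (not taken from NVIDIA's changes) of the pattern this optimization targets: an element-wise prologue applied to the LHS tile right before the dot. Today the transformed tile is staged through shared memory before the WGMMA; with the optimization it could be consumed directly from registers. All names and the tiling are illustrative only.

```python
import triton
import triton.language as tl

# Hypothetical kernel; computes a single BLOCK_M x BLOCK_N tile for brevity.
# On Hopper, tl.dot lowers to WGMMA, and currently the element-wise result
# `a` is written back to shared memory before the MMA consumes it.
@triton.jit
def prologue_matmul(a_ptr, b_ptr, c_ptr, M, N, K,
                    BLOCK_M: tl.constexpr, BLOCK_N: tl.constexpr,
                    BLOCK_K: tl.constexpr):
    offs_m = tl.arange(0, BLOCK_M)
    offs_n = tl.arange(0, BLOCK_N)
    offs_k = tl.arange(0, BLOCK_K)
    acc = tl.zeros((BLOCK_M, BLOCK_N), dtype=tl.float32)
    for k in range(0, K, BLOCK_K):
        a = tl.load(a_ptr + offs_m[:, None] * K + (k + offs_k)[None, :])
        b = tl.load(b_ptr + (k + offs_k)[:, None] * N + offs_n[None, :])
        a = tl.exp(a)            # element-wise prologue on the LHS operand
        acc = tl.dot(a, b, acc)  # WGMMA could take `a` from registers here
    tl.store(c_ptr + offs_m[:, None] * N + offs_n[None, :], acc)
```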

The OpenXLA team is currently reviewing NVIDIA's changes with the intent of eventually writing a PR against this repository. We heard through @ThomasRaoux and @gflegar that the Triton team is planning a similar feature, so it would be great to align the requirements. @ThomasRaoux, how can we best achieve this? Are you far enough along in the planning phase that you could provide some feedback to @ggengnv? Or would you prefer for us to do a round of reviews first?

Jokeren (Contributor) commented Sep 23, 2024

> We heard through @ThomasRaoux and @gflegar that the Triton team is planning a similar feature

What feature are you referring to? I don't think anyone on our side is working on this.

ThomasRaoux (Collaborator) commented

@lezcano is currently working on a mixed-mode kernel that will require this support, but at this point there hasn't been much design work on the MMAv3-specific part yet, and there are a few steps before getting to that.

Looking at the changes in the link, it seems some more work is needed to productize it, so maybe we should indeed join efforts. What kind of timeline did you have in mind for this work?

ggengnv commented Sep 26, 2024

As an update, I've addressed existing PR comments and split the original PR into two for ease of review.

I'm currently on leave until 10/9 and will be happy to address feedback and resume work on this once I return :)
