Inject desired pattern for handling Transpose for fp8 gemm rewrite #17440

Open
wants to merge 6 commits into
base: main

Conversation

wenscarl
Contributor

Related to #17276 and #16975.
This PR updates GemmRewriter to handle the transpose of a non-descending layout directly, so the layout_normalization pass no longer has to correct this error-prone pattern after the rewrite. The desired transformation is injected into GemmRewriter itself, which keeps the handling of the problematic layout internal to the rewriter. Concretely, the PR rewrites the following pattern, in which the transpose of a non-descending layout is the issue:

a = f8e4m3fn[x,y]{0,1} xxx
transpose.0 = f8e4m3fn[y,x]{0,1} transpose(a), dimensions=(1,0)
custom-call(a,...)

to

a = f8e4m3fn[x,y]{0,1} xxx
bt = f8e4m3fn[y,x]{1,0} bitcast(a)
transpose.1 = f8e4m3fn[x,y]{1,0} transpose(bt), dimensions=(1,0)
bt.1 = f8e4m3fn[y,x]{0,1} bitcast(transpose.1)
custom-call(bt.1,...)
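
To see why the rewritten chain can stand in for `transpose.0`, here is a minimal NumPy sketch that models an HLO layout as a flat buffer plus a minor-to-major order. It is only an illustration and not code from this PR: the helpers `to_buffer` and `from_buffer` are hypothetical, the sizes `x = 3` and `y = 4` are arbitrary, and integers stand in for f8e4m3fn because NumPy has no FP8 type. Under those assumptions, `bt.1` has the same logical values and the same bytes in memory as `transpose.0`.

```python
import numpy as np

def to_buffer(arr, minor_to_major):
    """Flatten a logical array into linear memory for the given layout (hypothetical helper)."""
    major_to_minor = tuple(reversed(minor_to_major))
    return np.transpose(arr, major_to_minor).ravel()

def from_buffer(buf, shape, minor_to_major):
    """Reinterpret linear memory as a logical array, like an HLO bitcast (hypothetical helper)."""
    major_to_minor = tuple(reversed(minor_to_major))
    phys = buf.reshape(tuple(shape[d] for d in major_to_minor))
    inverse = tuple(int(k) for k in np.argsort(major_to_minor))
    return np.transpose(phys, inverse)

x, y = 3, 4  # arbitrary sizes chosen for the illustration
a = np.arange(x * y).reshape(x, y)   # logical a[x,y]
mem_a = to_buffer(a, (0, 1))         # a is stored with layout {0,1}

# Original pattern: transpose.0 = [y,x]{0,1} transpose(a), dimensions=(1,0)
transpose_0 = a.T
mem_transpose_0 = to_buffer(transpose_0, (0, 1))

# Rewritten pattern.
bt = from_buffer(mem_a, (y, x), (1, 0))           # bt = [y,x]{1,0} bitcast(a)
transpose_1 = bt.T                                # transpose.1 = [x,y]{1,0} transpose(bt)
mem_transpose_1 = to_buffer(transpose_1, (1, 0))
bt_1 = from_buffer(mem_transpose_1, (y, x), (0, 1))  # bt.1 = [y,x]{0,1} bitcast(transpose.1)

# Same logical values and same bytes in memory as the original transpose.
same_values = np.array_equal(bt_1, transpose_0)
same_memory = np.array_equal(to_buffer(bt_1, (0, 1)), mem_transpose_0)
print(same_values, same_memory)
```

In this model the only operation that moves data is `transpose.1`, and it reads and writes descending `{1,0}` layouts; the two bitcasts merely relabel the buffer.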

@elfiegg
Contributor

elfiegg commented Sep 24, 2024

Also cc @akuegel

Member

@akuegel akuegel left a comment

Can you please also add a test?

Review threads on xla/service/gpu/transforms/gemm_rewriter.cc: 5 (1 outdated)
3 participants