Workaround for matmul kernel crash with i8xf32 operands. #12

Open: 1 commit to be merged into base: main

Commits on Sep 10, 2024

  1. Workaround for matmul kernel crash with i8xf32 operands.

    The BlockedToMMA pass creates a layout with kWidth=4 when one operand is
    i8. However, the TritonGPU to LLVM lowering pass does not support
    kWidth=4 for the other operand, which is f32, and crashes with a
    segmentation fault.
    
    To work around this, if the operands' minBitWidth is 8 and maxBitWidth
    is 32, we use a minBitWidth of 16 instead of 8, creating a layout with
    kWidth=2 for both i8 and f32 operands.
    3gx committed Sep 10, 2024
    b313a8b
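
The selection rule described in the commit message can be modelled in a few lines. This is only an illustrative sketch, not the code in the pass: the function name choose_k_width is hypothetical, and the relation kWidth = 32 / minBitWidth is an assumption, chosen because it reproduces the values quoted above (8 bits gives kWidth=4, 16 bits gives kWidth=2).

```python
def choose_k_width(a_bits: int, b_bits: int) -> int:
    """Hypothetical model of the kWidth choice described in the commit message."""
    min_bits = min(a_bits, b_bits)
    max_bits = max(a_bits, b_bits)

    # Workaround from the commit: an 8-bit operand paired with a 32-bit
    # operand is treated as if the minimum bit width were 16.
    if min_bits == 8 and max_bits == 32:
        min_bits = 16

    # Assumption: kWidth is 32 divided by the (possibly adjusted) minimum
    # bit width, which matches the kWidth values quoted in the commit.
    return 32 // min_bits


if __name__ == "__main__":
    print("i8 x f32 :", choose_k_width(8, 32))
    print("i8 x f16 :", choose_k_width(8, 16))
    print("f16 x f16:", choose_k_width(16, 16))
```

Under these assumptions only the i8 with f32 pairing changes; other operand combinations keep the kWidth they had before the workaround.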