fix the bug that for block_k=16 mma, the compilation crash on Ampere. #15

Closed
wants to merge 11 commits into from

Commits on Aug 26, 2024

  1. b535b55
  2. 7a88766
  3. abdaaff
  4. [BACKEND] Update LLVM version to llvm/llvm-project@99bb9a7 (triton-lang#4410)
    
    Included the use of the non-deprecated version of createMCObjectStreamer (needed after llvm/llvm-project@f1422a8).
    karupayun committed Aug 26, 2024
    3dd3657
  5. 958e9a5
  6. 87538e5
  7. [BACKEND] Update gcc debian package to point to a version 14.1.0-2 which exists in gcc-defaults. (triton-lang#4548)
    
    The LLVM build check tries to download
    http://ftp.de.debian.org/debian/pool/main/g/gcc-defaults/gcc-aarch64-linux-gnu_13.2.0-7_amd64.deb,
    which does not exist, so the check fails. This change updates the version
    to one that exists (14.1.0-2).
    
    [x] I am not making a trivial change, such as fixing a typo in a
    comment.
    [x] I have written a PR description following these
      [rules](https://cbea.ms/git-commit/#why-not-how).
    [x] I have run `pre-commit run --from-ref origin/main --to-ref HEAD`.
    [x] This PR does not need a test because it is not a functional change;
    it should fix the git checks builds.
    [x] I have not added any `lit` tests.
    khasanovaa authored and karupayun committed Aug 26, 2024
    32fc9c5
  8. 494f55c
  9. b2de88f

Commits on Aug 28, 2024

  1. OpenXLA-specific changes

    jax-triton-dev authored and karupayun committed Aug 28, 2024
    7a5940c

Commits on Sep 6, 2024

  1. fix the bug that for block_k=16 mma, the compilation crash on Ampere.

    The original issue is reported in triton-lang#3435.
    Compilation crashes when arith.sitofp (from i8 to fp16) operates on a tensor operand that has a dot_op layout with opIdx = 1 and whose first dimension is 16.
    For example: %104 = arith.sitofp %103 : tensor<16x64xi8, #triton_gpu.dot_op<{opIdx = 1, parent = #mma, kWidth = 4}>> to tensor<16x64xf16, #triton_gpu.dot_op<{opIdx = 1, parent = #mma, kWidth = 4}>>
    
    Investigation shows that the bug is in the TritonGPUToLLVM pass.
    In the corner case (block_k = 16 and opIdx = 1), extra elements are unpacked in include/triton/Conversion/TritonGPUToLLVM/ElementwiseOpToLLVM.h, lines 186-194.
    The code unpacks these extra elements because of an implicit assumption in lib/Dialect/TritonGPU/IR/Dialect.h, at line 2000, that at least 4 reps are loaded.
    
    Therefore, this patch drops the extra loaded elements in the corner case.
    bingyizh233 committed Sep 6, 2024
    daed93f
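
The idea described in the last commit message (unpacking yields more elements than the result tensor expects, so the surplus is dropped) can be illustrated with a small toy model. This is only a sketch under stated assumptions, not the code from the patch: the helper names `unpack_i8_words` and `convert_with_truncation` are hypothetical, the packing of four i8 values per 32-bit word is assumed to be little-endian, and keeping the leading elements while dropping the trailing ones is an assumption about which elements count as "extra".

```python
def unpack_i8_words(words):
    """Unpack each 32-bit word into four signed 8-bit values.

    Assumption: values are packed little-endian, lowest byte first.
    """
    out = []
    for w in words:
        for shift in (0, 8, 16, 24):
            b = (w >> shift) & 0xFF
            # Reinterpret the byte as a signed 8-bit integer.
            out.append(b - 256 if b >= 128 else b)
    return out


def convert_with_truncation(words, expected_elems):
    """Toy analogue of a sitofp lowering (hypothetical helper).

    Unpacks all words, drops any surplus beyond `expected_elems`,
    then converts the remaining integers to floats.
    """
    unpacked = unpack_i8_words(words)
    # Corner case: more values were unpacked than the result holds.
    # Assumption: the surplus sits at the end and can simply be cut off.
    if len(unpacked) > expected_elems:
        unpacked = unpacked[:expected_elems]
    return [float(v) for v in unpacked]


if __name__ == "__main__":
    # Two packed words, but the result is only expected to hold 4 elements.
    print(convert_with_truncation([0x04030201, 0xFF80007F], 4))
```

Without the truncation step, the second word would contribute four more values than the consumer expects, which is the kind of element-count mismatch the commit message describes.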