Add workaround for FP16 multiplication #27

Merged 5 commits on Aug 17, 2020

Commits on Aug 14, 2020

  1. Add workaround for FP16 multiplication

    8d68aeb
  2. Fix type

    thomasfaingnaert committed Aug 14, 2020
    68244e4

Commits on Aug 17, 2020

  1. Revert "Fix type"

    This reverts commit 68244e4.
    thomasfaingnaert committed Aug 17, 2020
    a10648c
  2. Revert "Add workaround for FP16 multiplication"

    This reverts commit 8d68aeb.
    thomasfaingnaert committed Aug 17, 2020
    ec1e8f8
  3. Only multiply FP32 WMMA fragments

    This avoids the conversions to FP32, which slow down the kernel.
    thomasfaingnaert committed Aug 17, 2020
    c700910