Add workaround for FP16 multiplication #27

thomasfaingnaert · 2020-08-14T12:29:17Z

No description provided.

thomasfaingnaert added 2 commits August 14, 2020 14:28

Add workaround for FP16 multiplication

8d68aeb

Fix type

68244e4

thomasfaingnaert marked this pull request as draft August 14, 2020 13:13

thomasfaingnaert added 3 commits August 17, 2020 13:17

Revert "Fix type"

a10648c

This reverts commit 68244e4.

Revert "Add workaround for FP16 multiplication"

ec1e8f8

This reverts commit 8d68aeb.

Only multiply FP32 WMMA fragments

c700910

This avoids the conversions to FP32, which slow down the kernel.

thomasfaingnaert marked this pull request as ready for review August 17, 2020 11:34

thomasfaingnaert merged commit 2b94068 into master Aug 17, 2020

thomasfaingnaert deleted the fix-fp16-mul branch August 17, 2020 12:03

thomasfaingnaert mentioned this pull request Jan 27, 2021

Use native Float16 #69

Merged

maleadt added a commit that referenced this pull request Jan 28, 2021

Revert "Add workaround for FP16 multiplication (#27)"

311b1e6

This reverts commit 2b94068.

maleadt added a commit that referenced this pull request Feb 2, 2021

Revert "Add workaround for FP16 multiplication (#27)"

c0d53a0

This reverts commit 2b94068.

maleadt added a commit that referenced this pull request Feb 2, 2021

Use native Float16 (#69)

40c1dac

* Use native Float16 multiplication.
* Revert "Add workaround for FP16 multiplication (#27)"
* Don't cast to Float32 for vectorization.
