v3.4.0 Dgemm Assembly

@guacamoleo guacamoleo released this 01 Dec 15:00
· 4228 commits to master since this release

Features:

  • Dgemm has been implemented in assembly.
    • GlobalSplitU and LocalSplitU have not yet been implemented for it.
    • 64x64x8 with prefetching is the fastest kernel configuration.
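
For readers unfamiliar with the tile notation, the following is a minimal reference sketch in plain Python, not the assembly kernel itself. It assumes that "64x64x8" means a 64x64 output tile accumulated in steps of 8 along the summation dimension; that reading, the function name `tiled_dgemm`, and its parameter names are illustrative assumptions rather than part of this release. Prefetching and the mapping of tiles onto GPU hardware are not modeled.

```python
def tiled_dgemm(a, b, tile_m=64, tile_n=64, depth_u=8):
    """Compute C = A * B with a blocked loop order.

    a is an m x k matrix and b is a k x n matrix, both given as lists of
    rows of floats. The tile sizes are hypothetical parameters that mirror
    the assumed meaning of "64x64x8".
    """
    m, k, n = len(a), len(b), len(b[0])
    c = [[0.0] * n for _ in range(m)]
    # Walk over the output matrix one tile_m x tile_n tile at a time.
    for i0 in range(0, m, tile_m):
        i1 = min(i0 + tile_m, m)
        for j0 in range(0, n, tile_n):
            j1 = min(j0 + tile_n, n)
            # Accumulate into the tile in slices of depth_u along k.
            for k0 in range(0, k, depth_u):
                k1 = min(k0 + depth_u, k)
                for i in range(i0, i1):
                    row_a = a[i]
                    row_c = c[i]
                    for j in range(j0, j1):
                        acc = row_c[j]
                        for kk in range(k0, k1):
                            acc += row_a[kk] * b[kk][j]
                        row_c[j] = acc
    return c
```

The result does not depend on the tile sizes; they only change the order in which partial products are accumulated, which is what a tuned kernel exploits for data reuse.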