V4.2.0 Performance improvements

@amcamd amcamd released this 15 May 22:10

Features

  • Fractional global capability
  • Additional ResNet sizes
  • Round up for half VGPRs
  • Initial code for PersistentKernel (disabled)
  • Feature inner unroll2
  • Enable BufferStore and buffer_atomic_cmpswap for GSU>1