Simd v6.1.140

ermig1979 released this 19 Aug 10:38

Algorithms

New features
  • Base implementation, SSE4.1, AVX2, AVX-512BW optimizations of function SynetRelu16b.
  • API of SynetAdd16b framework.
  • Base implementation, SSE4.1, AVX2, AVX-512BW optimizations of class SynetAdd16bUniform.
  • Base implementation, SSE4.1, AVX2, AVX-512BW, AMX-BF16 optimizations of class SynetConvolution16bNchwGemm.
Improvements
  • AMX-BF16 optimizations of class SynetInnerProduct16bGemmNN.
Bug fixes
  • Error in Base implementation of class SynetMergedConvolution16bCdc.
  • Error in Base implementation of class SynetMergedConvolution16bDc.
  • Error in Base implementation of class SynetInnerProduct16bGemmNN.
  • Error in Base implementation, SSE4.1, AVX2, AVX-512BW optimizations of function Float32ToBFloat16.
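
For readers unfamiliar with the conversion named in the last item, the following is a minimal scalar sketch of a float32 to bfloat16 conversion. It is not the Simd implementation and does not reproduce the fixed error, which these notes do not describe. The function names are hypothetical, and round-to-nearest-even is an assumption about the rounding mode.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper, not part of the Simd API.
// Converts a float32 value to bfloat16 (the upper 16 bits of the float32
// encoding), assuming round-to-nearest-even on the discarded lower 16 bits.
inline uint16_t Float32ToBFloat16Sketch(float value)
{
    uint32_t bits;
    std::memcpy(&bits, &value, sizeof(bits)); // bit copy without aliasing issues

    // NaN: truncate and force a mantissa bit so the result stays NaN
    // instead of being rounded up into infinity.
    if ((bits & 0x7FFFFFFFu) > 0x7F800000u)
        return uint16_t((bits >> 16) | 0x0040u);

    // Add 0x7FFF plus the lowest kept bit: exact ties round to the even value.
    uint32_t rounding = 0x7FFFu + ((bits >> 16) & 1u);
    return uint16_t((bits + rounding) >> 16);
}

// Hypothetical helper, not part of the Simd API.
// Converts bfloat16 back to float32 by restoring the lower 16 bits as zeros.
inline float BFloat16ToFloat32Sketch(uint16_t value)
{
    uint32_t bits = uint32_t(value) << 16;
    float result;
    std::memcpy(&result, &bits, sizeof(result));
    return result;
}
```

Values that are exactly representable in bfloat16, such as 1.0f, survive a round trip through both helpers unchanged. Other values lose the lower 16 mantissa bits.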

Test framework

New features
  • Tests for verifying functionality of function SynetRelu16b.
  • Tests for verifying functionality of SynetAdd16b framework.