forked from tensorflow/tensorflow
Add in optimizations for softmax for Fusion F1.
Confirmed that the test passes with:

```
make -f tensorflow/lite/micro/tools/make/Makefile TARGET=xtensa OPTIMIZED_KERNEL_DIR=xtensa TARGET_ARCH=fusion_f1 XTENSA_CORE=F1_190305_swupgrade test_kernel_softmax_test -j8
```

However, the end-to-end latency improvement is only ~1000 ticks, as measured with:

```
make -f tensorflow/lite/micro/tools/make/Makefile TARGET=xtensa OPTIMIZED_KERNEL_DIR=xtensa TARGET_ARCH=fusion_f1 XTENSA_CORE=F1_190305_swupgrade test_keyword_benchmark -j8
```

Since Softmax is currently only a small fraction of the overall keyword_benchmark latency, we focus on the latency of this particular op.

With the optimized implementation:

```
SOFTMAX took 749 ticks (0 ms).
```

With the reference implementation:

```
SOFTMAX took 2052 ticks (2 ms).
```

And with the LUT hifimini implementation (for completeness):

```
SOFTMAX took 1142 ticks (1 ms).
```

The gain of ~1300 ticks is still worth merging: even after the other pending optimizations (e.g. tensorflow#47098), it will mean a ~5% improvement for the keyword benchmark, and the benefits may be more significant for other models.
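As a sanity check on the tick counts quoted above, a quick back-of-the-envelope calculation (plain Python, not part of the commit) makes the relative speedups explicit:

```python
# Per-op SOFTMAX tick counts quoted in the commit message.
reference = 2052     # reference implementation
lut_hifimini = 1142  # LUT hifimini implementation
optimized = 749      # new optimized Fusion F1 implementation

saving = reference - optimized
print(f"saving vs reference: {saving} ticks")                 # 1303 ticks
print(f"speedup vs reference: {reference / optimized:.2f}x")  # 2.74x
print(f"speedup vs hifimini LUT: {lut_hifimini / optimized:.2f}x")  # 1.52x
```

The per-op saving (~1300 ticks) is larger than the ~1000-tick end-to-end keyword_benchmark improvement, consistent with Softmax being only one of several ops on the benchmark's critical path.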
1 parent ed58135 · commit 06e80ff
Showing 2 changed files with 73 additions and 2 deletions.