
Build for TARGET_ARCH=fusion_f1 via reference implementation fallback. #45464

Merged
merged 3 commits into tensorflow:master from xtensa-fusion-f1
Dec 9, 2020

Conversation

advaitjain
Member

This change adds reference fallbacks to the optimized xtensa kernels for the case when TARGET_ARCH is anything other than hifimini.

This sets the stage for a baseline from which we can incrementally optimize for architectures other than hifimini.

The goal is to have a starting point where all the unit tests pass for TARGET_ARCH=hifimini (which will use the optimized implementations) or any other TARGET_ARCH (with reference fallback).

Tested for `TARGET_ARCH=fusion_f1` with:

```
make -f tensorflow/lite/micro/tools/make/Makefile -j8 TARGET=xtensa OPTIMIZED_KERNEL_DIR=xtensa TARGET_ARCH=fusion_f1 XTENSA_CORE=Google_F1 test
```

With the following profiling results:

```
make -f tensorflow/lite/micro/tools/make/Makefile -j8 TARGET=xtensa OPTIMIZED_KERNEL_DIR=xtensa TARGET_ARCH=fusion_f1 XTENSA_CORE=Google_F1 test_keyword_benchmark

InitializeKeywordRunner() took 239061 ticks (239 ms)
KeywordRunNIerations(1) took 168564 ticks (168 ms)
KeywordRunNIerations(10) took 1685111 ticks (1685 ms)
```

```
make -f tensorflow/lite/micro/tools/make/Makefile -j8 TARGET=xtensa OPTIMIZED_KERNEL_DIR=xtensa TARGET_ARCH=fusion_f1 XTENSA_CORE=Google_F1 keyword_benchmark BUILD_TYPE=release
xt-size tensorflow/lite/micro/tools/make/gen/xtensa_fusion_f1/bin/keyword_benchmark

   text	   data	    bss	    dec	    hex	filename
  48256	  40132	  24952	 113340	  1babc	tensorflow/lite/micro/tools/make/gen/xtensa_fusion_f1/bin/keyword_benchmark
```

After this change, we can:

  • add a continuous build for Hifi4
  • add optimizations for Hifi4 on a per-kernel basis and keep profiling the impact of these optimizations on the keyword benchmark cycles and binary size.

Also tested that `TARGET_ARCH=hifimini` is unaffected:

```
make -f tensorflow/lite/micro/tools/make/Makefile -j8 TARGET=xtensa OPTIMIZED_KERNEL_DIR=xtensa TARGET_ARCH=hifimini XTENSA_CORE=mini1m1m_RG test_keyword_benchmark

InitializeKeywordRunner() took 1392788 ticks (1392 ms)
KeywordRunNIerations(1) took 89195 ticks (89 ms)
KeywordRunNIerations(10) took 891509 ticks (891 ms)
```

```
make -f tensorflow/lite/micro/tools/make/Makefile -j8 TARGET=xtensa OPTIMIZED_KERNEL_DIR=xtensa TARGET_ARCH=hifimini XTENSA_CORE=mini1m1m_RG keyword_benchmark BUILD_TYPE=release
xt-size tensorflow/lite/micro/tools/make/gen/xtensa_hifimini/bin/keyword_benchmark

   text	   data	    bss	    dec	    hex	filename
  46080	  40204	  24952	 111236	  1b284	tensorflow/lite/micro/tools/make/gen/xtensa_hifimini/bin/keyword_benchmark
```

@google-ml-butler google-ml-butler bot added the size:L CL Change Size: Large label Dec 8, 2020
@google-ml-butler

Thanks for contributing to TensorFlow Lite Micro.

To keep this process moving along, we'd like to make sure that you have completed the items on this list:

We would like to have a discussion on the Github issue first to determine the best path forward, and then proceed to the PR review.

@advaitjain
Member Author

tagging @pnikam-cad @nyadla-sys @kpraving

@gbaned gbaned self-assigned this Dec 8, 2020
@gbaned gbaned added the comp:micro Related to TensorFlow Lite Microcontrollers label Dec 8, 2020
@advaitjain advaitjain added the kokoro:force-run Tests on submitted change label Dec 8, 2020
@kokoro-team kokoro-team removed the kokoro:force-run Tests on submitted change label Dec 8, 2020
@google-ml-butler google-ml-butler bot added kokoro:force-run Tests on submitted change ready to pull PR ready for merge process labels Dec 9, 2020
@kokoro-team kokoro-team removed the kokoro:force-run Tests on submitted change label Dec 9, 2020
@google-ml-butler google-ml-butler bot removed the ready to pull PR ready for merge process label Dec 9, 2020
@advaitjain advaitjain added the kokoro:force-run Tests on submitted change label Dec 9, 2020
@kokoro-team kokoro-team removed the kokoro:force-run Tests on submitted change label Dec 9, 2020
@advaitjain
Member Author

Internal checks were failing (while the external build was OK) because there is an automatic clang-format step before the code is imported into the Google codebase, and my original commit was missing the clang-format formatting and an associated header.

d2fd64f fixes the issue.

@advaitjain advaitjain added the ready to pull PR ready for merge process label Dec 9, 2020
@copybara-service copybara-service bot merged commit 07208f7 into tensorflow:master Dec 9, 2020
copybara-service bot pushed a commit that referenced this pull request Dec 9, 2020
#45464 added a new file but did not add the Apache header. Instead the internal change was force submitted. This resulted in breaking all sync between internal and external.

PiperOrigin-RevId: 346613912
Change-Id: I078c18f677dcf05be01966b2277f28b4ef42ad68
@advaitjain advaitjain deleted the xtensa-fusion-f1 branch December 9, 2020 23:23
Labels
cla: yes comp:micro Related to TensorFlow Lite Microcontrollers ready to pull PR ready for merge process size:L CL Change Size: Large
4 participants