[BYOC][Contrib] Arm Compute Library integration #5915

lhutton1 · 2020-06-24T15:54:12Z

Arm Compute Library (ACL) integration using the BYOC infrastructure. This enables offloading select operators from a Relay graph to ACL so we can achieve faster inference times on Arm CPUs thanks to hand-crafted, optimized routines. The PR adds initial support for offloading FP32 conv2d, maxpool2d and reshape to ACL. The ACL codegen generates a JSON representation of an operator, or 'ACL layer'; the ACL runtime then uses this representation to construct a layer, cache it, and create a packed function for the graph runtime to call into.

RFC here: https://discuss.tvm.ai/t/rfc-byoc-arm-compute-library-integration/7082

Change-Id: If756dcea787ea346b1508e9a191b7eed7bd02b7f
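
For readers unfamiliar with the flow described above (codegen emits JSON, the runtime builds a layer from it, caches it, and hands back a callable), here is a minimal stdlib-only sketch. It is not the code in this PR: `codegen`, `ToyRuntime`, `get_function` and the 1-D "pooling" are hypothetical stand-ins chosen for illustration, and conv2d is left out to keep the example short.

```python
import json
from typing import Callable, Dict, List

# Hypothetical stand-ins: none of these names come from TVM or ACL.
SUPPORTED_OPS = {"nn.max_pool2d", "reshape"}

Layer = Callable[[List[float]], List[float]]


def codegen(op_name: str, attrs: dict) -> str:
    """Serialize one offloaded operator to a JSON string."""
    if op_name not in SUPPORTED_OPS:
        raise ValueError(f"{op_name} is not offloaded in this sketch")
    return json.dumps({"op": op_name, "attrs": attrs}, sort_keys=True)


class ToyRuntime:
    """Builds a callable 'layer' from JSON once, then reuses it."""

    def __init__(self) -> None:
        self._cache: Dict[str, Layer] = {}
        self.layers_built = 0

    def _build_layer(self, node: dict) -> Layer:
        self.layers_built += 1
        if node["op"] == "reshape":
            # On flattened data a reshape is just a copy.
            return lambda data: list(data)
        # Toy 1-D max pooling over non-overlapping windows.
        size = node["attrs"]["pool_size"]
        return lambda data: [
            max(data[i:i + size]) for i in range(0, len(data), size)
        ]

    def get_function(self, json_repr: str) -> Layer:
        """Return the callable for this JSON node, building it on first use."""
        if json_repr not in self._cache:
            self._cache[json_repr] = self._build_layer(json.loads(json_repr))
        return self._cache[json_repr]


runtime = ToyRuntime()
node = codegen("nn.max_pool2d", {"pool_size": 2})
pooled = runtime.get_function(node)([1.0, 3.0, 2.0, 5.0])
runtime.get_function(node)  # second lookup reuses the cached layer
```

Keying the cache on the JSON string means an identical operator description is only turned into a layer once, which mirrors the "construct, cache, call" order in the description.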

comaniac

Just briefly reviewed the flow. Will review the codegen and runtime in detail after the migration to the JSON runtime.

cmake/modules/contrib/ACL.cmake

comaniac · 2020-06-25T17:58:16Z

src/relay/backend/contrib/acl/README.md

+under the License.
+-->
+
+# Relay Arm&reg; Compute Library Integration


It would be better to put this in the document (and use RST doc style) so that people can easily find it.

Makes sense, would somewhere under docs be the best place to put this? Perhaps docs/backend/contrib or docs/runtime/contrib?

Good question. I'm looking at "tutorials" or "deploy". @tqchen do you have a preference?

src/relay/backend/contrib/acl/README.md

tqchen · 2020-06-25T18:37:55Z

Let us consider expanding the name acl to arm_compute_lib or some other alternative, since ACL means different things to ML/NLP audiences.

lhutton1 · 2020-06-25T20:20:03Z

Let us consider expanding the name acl to arm_compute_lib or some other alternative, since ACL means different things to ML/NLP audiences.

I'd be happy to change this, arm_compute_lib sounds good to me, @u99127 do you have any other preference?

lhutton1 · 2020-07-09T16:59:38Z

Please note this PR depends on #5919.

comaniac

Finished reviewing the infra and left some comments. Will review the doc and test cases later on.

python/tvm/relay/op/contrib/arm_compute_lib.py

src/relay/backend/contrib/arm_compute_lib/codegen_acl.h

src/relay/backend/contrib/arm_compute_lib/codegen.cc

src/runtime/contrib/arm_compute_lib/acl_runtime.cc

src/runtime/contrib/arm_compute_lib/acl_utils.cc

zhiics

Thanks for the contribution. I briefly looked at it. Will do a more careful review.

src/relay/backend/contrib/arm_compute_lib/codegen_acl.h

cmake/modules/contrib/ArmComputeLib.cmake

src/runtime/contrib/arm_compute_lib/acl_runtime.cc

docs/deploy/arm_compute_lib.rst

src/runtime/contrib/arm_compute_lib/acl_runtime.cc

docs/deploy/arm_compute_lib.rst

python/tvm/relay/op/contrib/arm_compute_lib.py

src/runtime/contrib/arm_compute_lib/acl_allocator.cc

tests/python/contrib/test_arm_compute_lib/infrastructure.py

comaniac

The changes with the new optimize process look much better :)
Just a few comments left but overall I think it's good to go.

python/tvm/relay/op/contrib/arm_compute_lib.py

src/runtime/contrib/arm_compute_lib/acl_runtime.cc

tests/python/contrib/test_arm_compute_lib/test_runtime.py

comaniac

LGTM. Thanks for the hard work!

lhutton1 · 2020-07-16T19:30:13Z

Not too sure what happened to the Mac build, looks unrelated.

comaniac · 2020-07-16T20:40:45Z

Not too sure what happened to the Mac build, looks unrelated.

Should not be an issue. You could use git commit --allow-empty to re-trigger the CI.

zhiics

@mbaret @manupa-arm @u99127 can you guys take a look as well?

src/relay/backend/contrib/arm_compute_lib/codegen.cc

src/relay/backend/contrib/arm_compute_lib/codegen_acl.h

src/runtime/contrib/arm_compute_lib/acl_allocator.cc

src/runtime/contrib/arm_compute_lib/acl_allocator.h

src/runtime/contrib/arm_compute_lib/acl_runtime.cc

tests/python/contrib/test_arm_compute_lib/infrastructure.py

mbaret

Generally LGTM, just a few minor things and questions. I think at some point (not blocking this PR) we need to think in more detail about how BYOC should treat data layout conversions.

python/tvm/relay/op/contrib/arm_compute_lib.py

src/relay/backend/contrib/arm_compute_lib/codegen.cc

src/runtime/contrib/arm_compute_lib/acl_runtime.cc

manupak

Some minor comments

python/tvm/relay/op/contrib/arm_compute_lib.py

src/relay/backend/contrib/arm_compute_lib/codegen.cc

mbaret

LGTM

manupak

LGTM

zhiics · 2020-07-17T15:57:42Z

@FrozenGene Can you take another look and approve explicitly if it looks good to you?

cmake/config.cmake

docs/deploy/arm_compute_lib.rst

src/runtime/contrib/arm_compute_lib/acl_allocator.cc

FrozenGene

LGTM

src/relay/backend/contrib/arm_compute_lib/codegen.cc

zhiics · 2020-07-21T15:31:15Z

Thanks @lhutton1 and everyone.

* [BYOC][Contrib] Arm Compute Library integration

  Arm Compute Library (ACL) integration using the BYOC infrastructure. This enables offloading select operators from a Relay graph to ACL so we can achieve faster inference times on Arm CPUs thanks to hand-crafted, optimized routines. The PR adds initial support for offloading FP32 conv2d, maxpool2d and reshape to ACL. The ACL codegen generates a JSON representation of an operator, or 'ACL layer'; the ACL runtime then uses this representation to construct a layer, cache it, and create a packed function for the graph runtime to call into.

  RFC here: https://discuss.tvm.ai/t/rfc-byoc-arm-compute-library-integration/7082

  Change-Id: If756dcea787ea346b1508e9a191b7eed7bd02b7f

* Refactor ACL integration to support JSON runtime
  * Now uses JSON runtime
  * Addresses tutorial comments
  * Rename acl to arm_compute_lib in user facing api

  Change-Id: I3b5ef80607f713e898363e82ab4398fbc2cf267a

* Address comments

  Change-Id: I041fda14f3bf9975f3518ba8a4e3ab43ba98403d

* Address comments
  * correct mistakes in tutorial
  * reshuffle runtime to use fewer macro blocks
  * preprocess module using "optimize" functionality
  * use new module api

  Change-Id: I219488e617e5767edd7489b43b8bfce876cd24b8

* Enable ACL codegen tests in CI
  * Skips runtime tests as these are not supported on x86.

  Change-Id: I6843c003a2604afe95cfdccf2323d2a336b56fe5

* Fix check for runtime

  Change-Id: I3f9eec15c599f01b1105d624fb053b73bfb6ed41

* Address comments
  * Add warning to ACL engine creation
  * Correct runtime check so it doesn't fail when codegen not present
  * Improve testing to check acl partitions is what is expected
  * Check results of multiple runs test

  Change-Id: I9522950930805b9b601dad03269adcf8ed3138cc

* Address comments
  * Multiple style improvements
  * Use base class for creating json node for single op
  * Move GetSource to base class
  * Improve annotation checks

  Change-Id: I8219659c4b99e86df887cd914720157cb94c61a0

* Improve tutorial

  Change-Id: I8f610bd37af1e3740fd48c2d502bcc4727d9d712

* Initialize conv with nullptr

  Change-Id: I6c37f0d75a064001c74e171ff83b9f7a7c3f1918

comaniac mentioned this pull request Jun 24, 2020

[CI][Contrib] Add ACL docker installation #5916

Merged

tqchen added the status: need review label Jun 24, 2020

lhutton1 force-pushed the acl-integration branch from 438b3cb to a91007b on June 25, 2020 14:06

comaniac reviewed Jun 25, 2020

lhutton1 force-pushed the acl-integration branch 2 times, most recently from 6d91877 to d087a67, on July 9, 2020 16:57

lhutton1 force-pushed the acl-integration branch from d087a67 to 11d218d on July 10, 2020 13:54

comaniac requested changes Jul 14, 2020

zhiics reviewed Jul 14, 2020

lhutton1 force-pushed the acl-integration branch 2 times, most recently from 699b943 to 6997ec3, on July 14, 2020 13:29

comaniac reviewed Jul 15, 2020

FrozenGene reviewed Jul 15, 2020

zhiics mentioned this pull request Jul 15, 2020

[BYOC][Optimization] Run accelerator specific optimizations #6068

Merged

lhutton1 added 5 commits July 16, 2020 09:35

Refactor ACL integration to support JSON runtime

128dde5

* Now uses JSON runtime
* Addresses tutorial comments
* Rename acl to arm_compute_lib in user facing api

Change-Id: I3b5ef80607f713e898363e82ab4398fbc2cf267a

Address comments

1af38dd

Change-Id: I041fda14f3bf9975f3518ba8a4e3ab43ba98403d

Address comments

04399f3

* correct mistakes in tutorial
* reshuffle runtime to use fewer macro blocks
* preprocess module using "optimize" functionality
* use new module api

Change-Id: I219488e617e5767edd7489b43b8bfce876cd24b8

Enable ACL codegen tests in CI

691ae69

* Skips runtime tests as these are not supported on x86.

Change-Id: I6843c003a2604afe95cfdccf2323d2a336b56fe5
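
The commit above skips the runtime tests because they are not supported on x86. A minimal sketch of that kind of guard, using only the Python standard library, might look like the following. It is an assumption for illustration, not the check used in this PR's test infrastructure: `is_arm_host` and `RuntimeTests` are hypothetical names, and the real tests may decide differently (for example by probing for the runtime itself).

```python
import platform
import unittest


def is_arm_host(machine: str = "") -> bool:
    """Return True when the machine string looks like an Arm CPU.

    Falls back to platform.machine() when no string is given.
    """
    machine = (machine or platform.machine()).lower()
    return machine.startswith(("arm", "aarch64"))


class RuntimeTests(unittest.TestCase):
    # Hypothetical test case: skipped unless the host is an Arm machine.
    @unittest.skipUnless(is_arm_host(), "runtime tests need an Arm host")
    def test_runtime(self):
        self.assertTrue(is_arm_host())
```

Codegen-only tests would carry no such decorator, so they still run on x86 CI machines while the runtime tests are reported as skipped rather than failed.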

lhutton1 force-pushed the acl-integration branch from 6997ec3 to 691ae69 on July 16, 2020 09:14

Fix check for runtime

f0080e8

Change-Id: I3f9eec15c599f01b1105d624fb053b73bfb6ed41

comaniac reviewed Jul 16, 2020

Address comments

d55fbe9

* Add warning to ACL engine creation
* Correct runtime check so it doesn't fail when codegen not present
* Improve testing to check acl partitions is what is expected
* Check results of multiple runs test

Change-Id: I9522950930805b9b601dad03269adcf8ed3138cc

comaniac approved these changes Jul 16, 2020

zhiics reviewed Jul 16, 2020

FrozenGene reviewed Jul 17, 2020

tests/python/contrib/test_arm_compute_lib/infrastructure.py

mbaret reviewed Jul 17, 2020

manupak reviewed Jul 17, 2020

python/tvm/relay/op/contrib/arm_compute_lib.py

src/relay/backend/contrib/arm_compute_lib/codegen.cc

Address comments

51f7b0a

* Multiple style improvements
* Use base class for creating json node for single op
* Move GetSource to base class
* Improve annotation checks

Change-Id: I8219659c4b99e86df887cd914720157cb94c61a0

mbaret approved these changes Jul 17, 2020

manupak approved these changes Jul 17, 2020

zhiics approved these changes Jul 17, 2020

FrozenGene requested changes Jul 20, 2020

cmake/config.cmake

docs/deploy/arm_compute_lib.rst

docs/deploy/arm_compute_lib.rst

src/runtime/contrib/arm_compute_lib/acl_allocator.cc

Improve tutorial

2f4a644

Change-Id: I8f610bd37af1e3740fd48c2d502bcc4727d9d712

FrozenGene approved these changes Jul 21, 2020

src/relay/backend/contrib/arm_compute_lib/codegen.cc

Initialize conv with nullptr

06c869d

Change-Id: I6c37f0d75a064001c74e171ff83b9f7a7c3f1918

zhiics merged commit d8c9bb1 into apache:master Jul 21, 2020

zhiics added status: accepted and removed status: need review labels Jul 21, 2020

lhutton1 deleted the acl-integration branch July 21, 2020 15:32

ZihengJiang mentioned this pull request Sep 25, 2020

TVM v0.7 Release Note Candidate #6486

Closed
