[BYOC][Contrib] Arm Compute Library integration #5915
Force-pushed from 438b3cb to a91007b.
Just briefly reviewed the flow. Will review the codegen and runtime in detail after migrating to JSON runtime.
under the License.
-->

# Relay Arm® Compute Library Integration
It would be better to put this in the document (and use RST doc style) so that people can easily find it.
Makes sense. Would somewhere under docs be the best place to put this? Perhaps docs/backend/contrib or docs/runtime/contrib?
Good question. I'm looking at "tutorials" or "deploy". @tqchen do you have a preference?
Let us consider expanding the name acl to arm_compute_lib or some other alternative, since ACL means different things to ML/NLP audiences.
I'd be happy to change this,
Force-pushed from 6d91877 to d087a67.
Please note this PR depends on #5919.
Force-pushed from d087a67 to 11d218d.
Finished reviewing the infra and left some comments. Will review the doc and test cases later on.
Thanks for the contribution. I briefly looked at it. Will do a more careful review.
Force-pushed from 699b943 to 6997ec3.
Arm Compute Library (ACL) integration using the BYOC infrastructure. This enables offloading select operators from a Relay graph to ACL, so we can achieve faster inference times on Arm CPUs thanks to hand-crafted, optimized routines. The PR adds initial support for offloading FP32 conv2d, maxpool2d and reshape to ACL. ACL codegen generates a JSON representation of an operator, or 'ACL layer'. The ACL runtime then uses this representation to construct a layer, cache it and create a packed function for the graph runtime to call into.
RFC here: https://discuss.tvm.ai/t/rfc-byoc-arm-compute-library-integration/7082
Change-Id: If756dcea787ea346b1508e9a191b7eed7bd02b7f
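For readers new to this flow, the codegen-to-runtime handoff described in the PR summary can be pictured with a toy sketch. This is not the TVM or ACL API: `codegen`, `build_layer`, `ToyRuntime` and the `max_pool1d` operator are hypothetical names invented for illustration, and the "layer" is a plain Python closure rather than a real library kernel. It only shows the shape of the idea, assuming one JSON description per offloaded operator: serialize the operator, build a layer from the JSON once, cache it, and hand back a callable.

```python
import json
from typing import Callable, Dict, List

Layer = Callable[[List[float]], List[float]]


def codegen(op: str, **attrs) -> str:
    # Hypothetical codegen step: describe one offloaded operator as JSON.
    return json.dumps({"op": op, "attrs": attrs}, sort_keys=True)


def build_layer(node: dict) -> Layer:
    # Hypothetical stand-in for constructing a library layer from a JSON node.
    if node["op"] == "max_pool1d":
        size = node["attrs"]["pool_size"]

        def run(data: List[float]) -> List[float]:
            # Non-overlapping windows of `size` elements, max of each window.
            return [max(data[i:i + size])
                    for i in range(0, len(data) - size + 1, size)]

        return run
    raise ValueError(f"unsupported operator: {node['op']}")


class ToyRuntime:
    """Builds each layer once and reuses it on later calls."""

    def __init__(self) -> None:
        self._cache: Dict[str, Layer] = {}
        self.builds = 0  # counts how many layers were actually constructed

    def get_function(self, layer_json: str) -> Layer:
        # Construct the layer only on first request, then serve it from cache.
        if layer_json not in self._cache:
            self._cache[layer_json] = build_layer(json.loads(layer_json))
            self.builds += 1
        return self._cache[layer_json]


runtime = ToyRuntime()
description = codegen("max_pool1d", pool_size=2)
pool = runtime.get_function(description)
result = pool([1.0, 3.0, 2.0, 5.0])
```

Requesting the same description a second time returns the cached callable, so the construction cost is paid once rather than on every inference call.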
* Now uses JSON runtime
* Addresses tutorial comments
* Rename acl to arm_compute_lib in user facing api
Change-Id: I3b5ef80607f713e898363e82ab4398fbc2cf267a

Change-Id: I041fda14f3bf9975f3518ba8a4e3ab43ba98403d

* correct mistakes in tutorial
* reshuffle runtime to use fewer macro blocks
* preprocess module using "optimize" functionality
* use new module api
Change-Id: I219488e617e5767edd7489b43b8bfce876cd24b8

* Skips runtime tests as these are not supported on x86.
Change-Id: I6843c003a2604afe95cfdccf2323d2a336b56fe5
Force-pushed from 6997ec3 to 691ae69.
Change-Id: I3f9eec15c599f01b1105d624fb053b73bfb6ed41
The changes with the new optimize process look much better :)
Just a few comments left but overall I think it's good to go.
* Add warning to ACL engine creation
* Correct runtime check so it doesn't fail when codegen not present
* Improve testing to check acl partitions is what is expected
* Check results of multiple runs test
Change-Id: I9522950930805b9b601dad03269adcf8ed3138cc
LGTM. Thanks for the hard work!
Not too sure what happened to the Mac build, looks unrelated.
Should not be an issue. You could use
@mbaret @manupa-arm @u99127 can you guys take a look as well?
Generally LGTM, just a few minor things and questions. I think at some point (not blocking this PR) we need to think in more detail about how BYOC should treat data layout conversions.
Some minor comments
* Multiple style improvements
* Use base class for creating json node for single op
* Move GetSource to base class
* Improve annotation checks
Change-Id: I8219659c4b99e86df887cd914720157cb94c61a0
LGTM
LGTM
@FrozenGene Can you take another look and approve explicitly if it looks good to you?
Change-Id: I8f610bd37af1e3740fd48c2d502bcc4727d9d712
LGTM
Change-Id: I6c37f0d75a064001c74e171ff83b9f7a7c3f1918
Thanks @lhutton1 and everyone.
* [BYOC][Contrib] Arm Compute Library integration
  Arm Compute Library (ACL) integration using the BYOC infrastructure. This enables offloading select operators from a Relay graph to ACL, so we can achieve faster inference times on Arm CPUs thanks to hand-crafted, optimized routines. The PR adds initial support for offloading FP32 conv2d, maxpool2d and reshape to ACL. ACL codegen generates a JSON representation of an operator, or 'ACL layer'. The ACL runtime then uses this representation to construct a layer, cache it and create a packed function for the graph runtime to call into.
  RFC here: https://discuss.tvm.ai/t/rfc-byoc-arm-compute-library-integration/7082
  Change-Id: If756dcea787ea346b1508e9a191b7eed7bd02b7f
* Refactor ACL integration to support JSON runtime
  * Now uses JSON runtime
  * Addresses tutorial comments
  * Rename acl to arm_compute_lib in user facing api
  Change-Id: I3b5ef80607f713e898363e82ab4398fbc2cf267a
* Address comments
  Change-Id: I041fda14f3bf9975f3518ba8a4e3ab43ba98403d
* Address comments
  * correct mistakes in tutorial
  * reshuffle runtime to use fewer macro blocks
  * preprocess module using "optimize" functionality
  * use new module api
  Change-Id: I219488e617e5767edd7489b43b8bfce876cd24b8
* Enable ACL codegen tests in CI
  * Skips runtime tests as these are not supported on x86.
  Change-Id: I6843c003a2604afe95cfdccf2323d2a336b56fe5
* Fix check for runtime
  Change-Id: I3f9eec15c599f01b1105d624fb053b73bfb6ed41
* Address comments
  * Add warning to ACL engine creation
  * Correct runtime check so it doesn't fail when codegen not present
  * Improve testing to check acl partitions is what is expected
  * Check results of multiple runs test
  Change-Id: I9522950930805b9b601dad03269adcf8ed3138cc
* Address comments
  * Multiple style improvements
  * Use base class for creating json node for single op
  * Move GetSource to base class
  * Improve annotation checks
  Change-Id: I8219659c4b99e86df887cd914720157cb94c61a0
* Improve tutorial
  Change-Id: I8f610bd37af1e3740fd48c2d502bcc4727d9d712
* Initialize conv with nullptr
  Change-Id: I6c37f0d75a064001c74e171ff83b9f7a7c3f1918