-
Notifications
You must be signed in to change notification settings - Fork 3.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Hexagon] Implement model launcher #8986
Conversation
This implements a launcher that allows execution of ML models compiled into a shared library on Hexagon DSP. It consists of two parts: the Hexagon-side skel library and `launcher_android` to be used from `adb shell`. The launcher does not implement any performance-related optimizations, it's built on top of the `graph_executor` from TVM runtime, and so it executes a single layer at a time. This launcher should not be used to measure performance (because if will be highly suboptimal), its main purpose is to help in validating correctness.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@kparzysz-quic thanks for pushing this up! just a couple questions/nits, can merge those in follow on if we really need them.
|
||
The launcher consists of two parts: part running on Hexagon, and part running | ||
on Android. They need to be compiled separately. Since some source files are | ||
shared between these two parts, make sure to delete all object files beteween |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: between
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
- `liblauncher_rpc_skel.so`, | ||
- `libgcc.so` (this one should come from the Hexagon toolchain), | ||
- `launcher_android`, | ||
- `libtvm_runtime.so` (for Android). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe clarify: for the Android-side binary or even launcher_android
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I addressed it, let me know if that's what you meant.
auto input = tvm::runtime::NDArray::FromDLPack(&managed); | ||
|
||
tvm::runtime::PackedFunc set_input = get_module_func(TheModel->graph_executor, "set_input"); | ||
set_input(input_idx, input); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
for my understanding: is it possible to use set_input_zero_copy here? or, does input_value
go away when this call returns?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
set_input_zero_copy
might work, but I haven't tested it.
<!--- KIND, either express or implied. See the License for the --> | ||
<!--- specific language governing permissions and limitations --> | ||
<!--- under the License. --> | ||
# Hexagon Graph Launcher |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we add a mention here of the Hexagon devices / Snapdragon SoCs that we expect the launcher to work on / have tested the launcher on?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah I think one can infer from l47: one of v65, v66, v68
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But explicitly stating the Snapdragon devices could be useful.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
} | ||
``` | ||
|
||
The launcher does not perform any correctness verification. In order to verify |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps preface these 2 paragraphs with # Disclaimer or Future work
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
|
||
These are only the binaries related to the launcher itself. To run a model | ||
copy the shared object with the model and the model JSON file over to the | ||
device (both are obtained from relay). Also, copy all input files for the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be helpful for the general public to indicate how these can be produced out of relay with a small python snippet. Or have a small python script that does this in the same way of the howtodeploy example: https://github.com/apache/tvm/blob/main/apps/howto_deploy/prepare_test_libs.py
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
- `HEXAGON_ARCH` to one of v65, v66, v68 | ||
- `TVM_RUNTIME_HEXAGON=/path/to/libtvm_runtime.a` _statically_ linked | ||
TVM runtime | ||
Make sure to provide the path to launcher's `CMakeLists.txt` directory |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You need an extra space here otherwise this line appears as a continuation of the previous bullet.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
2. Create a subdirectory for the build files, and run `cmake` with the | ||
following variables set: | ||
- `FASTRPC_LIBS=SKEL` | ||
- `HEXAGON_SDK_ROOT` to the path to the Hexagon SDK |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- `HEXAGON_SDK_ROOT` to the path to the Hexagon SDK | |
- `USE_HEXAGON_SDK` to the path to the Hexagon SDK |
nit: would be nice to normalize to the naming convention used for the hexagon cmake variables in TVM.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
- `HEXAGON_SDK_ROOT` to the path to the Hexagon SDK | ||
- `CMAKE_C_COMPILER=hexagon-clang` | ||
- `CMAKE_CXX_COMPILER=hexagon-clang++` | ||
- `HEXAGON_ARCH` to one of v65, v66, v68 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- `HEXAGON_ARCH` to one of v65, v66, v68 | |
- `USE_HEXAGON_ARCH` to one of v65, v66, v68 |
nit: would be nice to normalize to the naming convention used for the hexagon cmake variables in TVM.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
1. Build TVM runtime for Android. Unlike in the Hexagon case, this should be | ||
the dynamic library (which is the default), i.e. `libtvm_runtime.so`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Elaborate here to include that the the compiler used here should be the aarch64 linux compiler for android.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
1. Build the static version of TVM runtime for Hexagon: this step is the same | ||
as building the shared version, except at the cmake step, add | ||
`-DBUILD_STATIC_RUNTIME=ON`. The compilation step should create | ||
`libtvm_runtime.a`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Elaborate here to include that the the compiler used here should be the hexagon-clang compiler from the hexagon toolchain.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
|
||
2. Create a subdirectory for the build files (different from the one used for | ||
Hexagon files), and run `cmake` with the following variables set: | ||
- `FASTRPC_LIBS=STUB` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When I follow these instructions and try to run the launcher on an 888 device I am seeing an error opening the FastRPC channel.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There can be a number of reasons for that. The diagnostic output from mini-dm
usually contains enough information to help resolve it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it worth adding a comment about using mini-dm
to inspect issues with FastRPC? I'll leave that up to your discretion.
- `HEXAGON_SDK_ROOT` -> `USE_HEXAGON_SDK` - `HEXAGON_ARCH` -> `USE_HEXAGON_ARCH`
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you @kparzysz-quic for promptly addressing all of the comments and requests. Before merging we're working on reproducing the example to confirm that the instructions are accurate and that the launcher is functional.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you @kparzysz-quic! I've reproduced the flow as written in the Readme.md locally from this PR, still with a FastRPC error but we can readdress the Readme should it be necessary post-merge. Excited to have model execution on Hexagon via TVM main 🎉 .
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds like we have no more blockers!
* main: (102 commits) Implementation of relay_to_tir target hook (apache#8423) [Onnx] Fix NLL Loss tests (apache#8971) [Bugfix] Fix other div zero errors also in rewrite_simplify (apache#8983) [ONNX] enable the onnx tests after PR apache#8274 merged (apache#9019) [Hexagon] Disable `thread_local` on Hexagon (apache#9025) [Hexagon] Allow undefined symbols in libtvm_runtime.so on Hexagon (apache#9024) [Onnx] Add momentum (apache#9000) fix (apache#9021) [Community] @AndrewZhaoLuo -> Reviewer (apache#9020) [Hexagon] Implement model launcher (apache#8986) [Relay][Pass] Add ExtractOperators pass (apache#8996) [BYOC][TensorRT] Add TensorRT own int8 calibration support to TensorRT BYOC integration (apache#8808) [ONNX] Add Einsum converter (apache#8985) Add standalone_crt/ to be part of the wheel package, when available. (apache#9005) [Relay] Remove memory planing from LowerTEPass (apache#8974) [Hexagon] Treat floats as float32 when passing args to offloaded kernels (apache#9010) [Runtime] Pipeline Executor Initial patch. (apache#8702) [Hexagon] `llvm-options` attribute is an array of strings (apache#9011) disable cuda int8 schedule for non-cuda gpu target (apache#9014) [Torch] Add an option to make imported models compatible with the Relay text parser (apache#9015) ...
* [Hexagon] Implement model launcher This implements a launcher that allows execution of ML models compiled into a shared library on Hexagon DSP. It consists of two parts: the Hexagon-side skel library and `launcher_android` to be used from `adb shell`. The launcher does not implement any performance-related optimizations, it's built on top of the `graph_executor` from TVM runtime, and so it executes a single layer at a time. This launcher should not be used to measure performance (because if will be highly suboptimal), its main purpose is to help in validating correctness. * Address review comments: explanations and elaborations in README.md * Rename cmake variables to be same as for TVM - `HEXAGON_SDK_ROOT` -> `USE_HEXAGON_SDK` - `HEXAGON_ARCH` -> `USE_HEXAGON_ARCH` * Address more review comments * Error out in cmake when USE_HEXAGON_SDK/USE_HEXAGON_ARCH are undefined * Change FATAL_ERROR to SEND_ERROR in cmake file
* [Hexagon] Implement model launcher This implements a launcher that allows execution of ML models compiled into a shared library on Hexagon DSP. It consists of two parts: the Hexagon-side skel library and `launcher_android` to be used from `adb shell`. The launcher does not implement any performance-related optimizations, it's built on top of the `graph_executor` from TVM runtime, and so it executes a single layer at a time. This launcher should not be used to measure performance (because if will be highly suboptimal), its main purpose is to help in validating correctness. * Address review comments: explanations and elaborations in README.md * Rename cmake variables to be same as for TVM - `HEXAGON_SDK_ROOT` -> `USE_HEXAGON_SDK` - `HEXAGON_ARCH` -> `USE_HEXAGON_ARCH` * Address more review comments * Error out in cmake when USE_HEXAGON_SDK/USE_HEXAGON_ARCH are undefined * Change FATAL_ERROR to SEND_ERROR in cmake file
This implements a launcher that allows execution of ML models compiled into a shared library on Hexagon DSP. It consists of two parts: the Hexagon-side skel library and
launcher_android
to be used fromadb shell
.The launcher does not implement any performance-related optimizations, it's built on top of the
graph_executor
from TVM runtime, and so it executes a single layer at a time. This launcher should not be used to measure performance (because if will be highly suboptimal), its main purpose is to help in validating correctness.