Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Hexagon] Implement model launcher #8986

Merged
merged 6 commits into from
Sep 16, 2021
Merged

[Hexagon] Implement model launcher #8986

merged 6 commits into from
Sep 16, 2021

Conversation

kparzysz-quic
Copy link
Contributor

This implements a launcher that allows execution of ML models compiled into a shared library on Hexagon DSP. It consists of two parts: the Hexagon-side skel library and launcher_android to be used from adb shell.

The launcher does not implement any performance-related optimizations, it's built on top of the graph_executor from TVM runtime, and so it executes a single layer at a time. This launcher should not be used to measure performance (because if will be highly suboptimal), its main purpose is to help in validating correctness.

@tmoreau89
Copy link
Contributor

CC @csullivan @adstraw

This implements a launcher that allows execution of ML models compiled
into a shared library on Hexagon DSP. It consists of two parts: the
Hexagon-side skel library and `launcher_android` to be used from
`adb shell`.

The launcher does not implement any performance-related optimizations,
it's built on top of the `graph_executor` from TVM runtime, and so it
executes a single layer at a time. This launcher should not be used to
measure performance (because if will be highly suboptimal), its main
purpose is to help in validating correctness.
Copy link
Contributor

@areusch areusch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kparzysz-quic thanks for pushing this up! just a couple questions/nits, can merge those in follow on if we really need them.


The launcher consists of two parts: part running on Hexagon, and part running
on Android. They need to be compiled separately. Since some source files are
shared between these two parts, make sure to delete all object files beteween
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: between

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

- `liblauncher_rpc_skel.so`,
- `libgcc.so` (this one should come from the Hexagon toolchain),
- `launcher_android`,
- `libtvm_runtime.so` (for Android).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe clarify: for the Android-side binary or even launcher_android

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I addressed it, let me know if that's what you meant.

auto input = tvm::runtime::NDArray::FromDLPack(&managed);

tvm::runtime::PackedFunc set_input = get_module_func(TheModel->graph_executor, "set_input");
set_input(input_idx, input);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for my understanding: is it possible to use set_input_zero_copy here? or, does input_value go away when this call returns?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

set_input_zero_copy might work, but I haven't tested it.

<!--- KIND, either express or implied. See the License for the -->
<!--- specific language governing permissions and limitations -->
<!--- under the License. -->
# Hexagon Graph Launcher
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we add a mention here of the Hexagon devices / Snapdragon SoCs that we expect the launcher to work on / have tested the launcher on?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah I think one can infer from l47: one of v65, v66, v68

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But explicitly stating the Snapdragon devices could be useful.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

}
```

The launcher does not perform any correctness verification. In order to verify
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps preface these 2 paragraphs with # Disclaimer or Future work

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.


These are only the binaries related to the launcher itself. To run a model
copy the shared object with the model and the model JSON file over to the
device (both are obtained from relay). Also, copy all input files for the
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be helpful for the general public to indicate how these can be produced out of relay with a small python snippet. Or have a small python script that does this in the same way of the howtodeploy example: https://github.com/apache/tvm/blob/main/apps/howto_deploy/prepare_test_libs.py

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

- `HEXAGON_ARCH` to one of v65, v66, v68
- `TVM_RUNTIME_HEXAGON=/path/to/libtvm_runtime.a` _statically_ linked
TVM runtime
Make sure to provide the path to launcher's `CMakeLists.txt` directory
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You need an extra space here otherwise this line appears as a continuation of the previous bullet.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

2. Create a subdirectory for the build files, and run `cmake` with the
following variables set:
- `FASTRPC_LIBS=SKEL`
- `HEXAGON_SDK_ROOT` to the path to the Hexagon SDK
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
- `HEXAGON_SDK_ROOT` to the path to the Hexagon SDK
- `USE_HEXAGON_SDK` to the path to the Hexagon SDK

nit: would be nice to normalize to the naming convention used for the hexagon cmake variables in TVM.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

- `HEXAGON_SDK_ROOT` to the path to the Hexagon SDK
- `CMAKE_C_COMPILER=hexagon-clang`
- `CMAKE_CXX_COMPILER=hexagon-clang++`
- `HEXAGON_ARCH` to one of v65, v66, v68
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
- `HEXAGON_ARCH` to one of v65, v66, v68
- `USE_HEXAGON_ARCH` to one of v65, v66, v68

nit: would be nice to normalize to the naming convention used for the hexagon cmake variables in TVM.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Comment on lines 59 to 60
1. Build TVM runtime for Android. Unlike in the Hexagon case, this should be
the dynamic library (which is the default), i.e. `libtvm_runtime.so`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Elaborate here to include that the the compiler used here should be the aarch64 linux compiler for android.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Comment on lines 38 to 41
1. Build the static version of TVM runtime for Hexagon: this step is the same
as building the shared version, except at the cmake step, add
`-DBUILD_STATIC_RUNTIME=ON`. The compilation step should create
`libtvm_runtime.a`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Elaborate here to include that the the compiler used here should be the hexagon-clang compiler from the hexagon toolchain.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.


2. Create a subdirectory for the build files (different from the one used for
Hexagon files), and run `cmake` with the following variables set:
- `FASTRPC_LIBS=STUB`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When I follow these instructions and try to run the launcher on an 888 device I am seeing an error opening the FastRPC channel.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There can be a number of reasons for that. The diagnostic output from mini-dm usually contains enough information to help resolve it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it worth adding a comment about using mini-dm to inspect issues with FastRPC? I'll leave that up to your discretion.

Copy link
Contributor

@tmoreau89 tmoreau89 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @kparzysz-quic for promptly addressing all of the comments and requests. Before merging we're working on reproducing the example to confirm that the instructions are accurate and that the launcher is functional.

Copy link
Contributor

@csullivan csullivan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @kparzysz-quic! I've reproduced the flow as written in the Readme.md locally from this PR, still with a FastRPC error but we can readdress the Readme should it be necessary post-merge. Excited to have model execution on Hexagon via TVM main 🎉 .

Copy link
Contributor

@tmoreau89 tmoreau89 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds like we have no more blockers!

@masahi masahi merged commit 148ddca into apache:main Sep 16, 2021
@kparzysz-quic kparzysz-quic deleted the launcher-upstream branch September 16, 2021 14:08
AndrewZhaoLuo added a commit to AndrewZhaoLuo/tvm that referenced this pull request Sep 16, 2021
* main: (102 commits)
  Implementation of relay_to_tir target hook (apache#8423)
  [Onnx] Fix NLL Loss tests (apache#8971)
  [Bugfix] Fix other div zero errors also in rewrite_simplify (apache#8983)
  [ONNX] enable the onnx tests after PR apache#8274 merged (apache#9019)
  [Hexagon] Disable `thread_local` on Hexagon (apache#9025)
  [Hexagon] Allow undefined symbols in libtvm_runtime.so on Hexagon (apache#9024)
  [Onnx] Add momentum (apache#9000)
  fix (apache#9021)
  [Community] @AndrewZhaoLuo -> Reviewer (apache#9020)
  [Hexagon] Implement model launcher (apache#8986)
  [Relay][Pass] Add ExtractOperators pass (apache#8996)
  [BYOC][TensorRT] Add TensorRT own int8 calibration support to TensorRT BYOC integration (apache#8808)
  [ONNX] Add Einsum converter (apache#8985)
  Add standalone_crt/ to be part of the wheel package, when available. (apache#9005)
  [Relay] Remove memory planing from LowerTEPass  (apache#8974)
  [Hexagon] Treat floats as float32 when passing args to offloaded kernels (apache#9010)
  [Runtime] Pipeline Executor Initial patch. (apache#8702)
  [Hexagon] `llvm-options` attribute is an array of strings (apache#9011)
  disable cuda int8 schedule for non-cuda gpu target (apache#9014)
  [Torch] Add an option to make imported models compatible with the Relay text parser (apache#9015)
  ...
ylc pushed a commit to ylc/tvm that referenced this pull request Sep 29, 2021
* [Hexagon] Implement model launcher

This implements a launcher that allows execution of ML models compiled
into a shared library on Hexagon DSP. It consists of two parts: the
Hexagon-side skel library and `launcher_android` to be used from
`adb shell`.

The launcher does not implement any performance-related optimizations,
it's built on top of the `graph_executor` from TVM runtime, and so it
executes a single layer at a time. This launcher should not be used to
measure performance (because if will be highly suboptimal), its main
purpose is to help in validating correctness.

* Address review comments: explanations and elaborations in README.md

* Rename cmake variables to be same as for TVM

- `HEXAGON_SDK_ROOT` -> `USE_HEXAGON_SDK`
- `HEXAGON_ARCH` -> `USE_HEXAGON_ARCH`

* Address more review comments

* Error out in cmake when USE_HEXAGON_SDK/USE_HEXAGON_ARCH are undefined

* Change FATAL_ERROR to SEND_ERROR in cmake file
ylc pushed a commit to ylc/tvm that referenced this pull request Jan 13, 2022
* [Hexagon] Implement model launcher

This implements a launcher that allows execution of ML models compiled
into a shared library on Hexagon DSP. It consists of two parts: the
Hexagon-side skel library and `launcher_android` to be used from
`adb shell`.

The launcher does not implement any performance-related optimizations,
it's built on top of the `graph_executor` from TVM runtime, and so it
executes a single layer at a time. This launcher should not be used to
measure performance (because if will be highly suboptimal), its main
purpose is to help in validating correctness.

* Address review comments: explanations and elaborations in README.md

* Rename cmake variables to be same as for TVM

- `HEXAGON_SDK_ROOT` -> `USE_HEXAGON_SDK`
- `HEXAGON_ARCH` -> `USE_HEXAGON_ARCH`

* Address more review comments

* Error out in cmake when USE_HEXAGON_SDK/USE_HEXAGON_ARCH are undefined

* Change FATAL_ERROR to SEND_ERROR in cmake file
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants