From 9c593a138be41a28db076e6c84e38090e49b16d2 Mon Sep 17 00:00:00 2001
From: Mher Kazandjian
Date: Tue, 1 Jun 2021 03:01:30 +0200
Subject: [PATCH] allow libbacktrace to be used when cross compiling the runtime (#7917)

---
 cmake/libs/Libbacktrace.cmake |  27 +++++++-
 docs/deploy/index.rst         | 115 ++++++++++++++++++++++++++++++++--
 docs/install/from_source.rst  |  28 +++++----
 3 files changed, 152 insertions(+), 18 deletions(-)

diff --git a/cmake/libs/Libbacktrace.cmake b/cmake/libs/Libbacktrace.cmake
index 742855358809..58eb4e02bb5b 100644
--- a/cmake/libs/Libbacktrace.cmake
+++ b/cmake/libs/Libbacktrace.cmake
@@ -14,14 +14,39 @@
 # KIND, either express or implied. See the License for the
 # specific language governing permissions and limitations
 # under the License.
+
+# On macOS, the default C compiler (/usr/bin/cc) is actually a small script that dispatches to a
+# compiler in the default SDK (usually /Library/Developer/CommandLineTools/usr/bin/ or
+# /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/). CMake
+# automatically detects what is being dispatched and uses it instead, along with all the flags it
+# needs. CMake makes this second compiler available through the CMAKE_C_COMPILER variable, but it
+# does not make the necessary flags available. This leads to configuration errors in libbacktrace
+# because it cannot find system libraries. Our solution is to detect whether CMAKE_C_COMPILER lives
+# in /Library or /Applications and, if so, switch back to the default compiler.
 include(ExternalProject)
+
+if(CMAKE_SYSTEM_NAME MATCHES "Darwin" AND (CMAKE_C_COMPILER MATCHES "^/Library"
+                                           OR CMAKE_C_COMPILER MATCHES "^/Applications"))
+  set(c_compiler "/usr/bin/cc")
+else()
+  set(c_compiler "${CMAKE_C_COMPILER}")
+endif()
+
 ExternalProject_Add(project_libbacktrace
   PREFIX libbacktrace
   SOURCE_DIR ${CMAKE_CURRENT_LIST_DIR}/../../3rdparty/libbacktrace
   BINARY_DIR ${CMAKE_CURRENT_BINARY_DIR}/libbacktrace
   CONFIGURE_COMMAND "${CMAKE_CURRENT_LIST_DIR}/../../3rdparty/libbacktrace/configure"
-                    "--prefix=${CMAKE_CURRENT_BINARY_DIR}/libbacktrace" --with-pic
+                    "--prefix=${CMAKE_CURRENT_BINARY_DIR}/libbacktrace"
+                    --with-pic
+                    "CC=${c_compiler}"
+                    "CFLAGS=${CMAKE_C_FLAGS}"
+                    "LDFLAGS=${CMAKE_EXE_LINKER_FLAGS}"
+                    "CPP=${c_compiler} -E"
+                    "NM=${CMAKE_NM}"
+                    "STRIP=${CMAKE_STRIP}"
+                    "--host=${MACHINE_NAME}"
   INSTALL_DIR "${CMAKE_CURRENT_BINARY_DIR}/libbacktrace"
   BUILD_COMMAND make
   INSTALL_COMMAND make install
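For reference, here is a minimal sketch (not part of the patch) of what the ``CONFIGURE_COMMAND`` above roughly expands to when cross compiling for ``aarch64`` with the toolchain used in the documentation below. The paths are illustrative; the real values are substituted by CMake, and ``CFLAGS``, ``LDFLAGS``, ``NM`` and ``STRIP`` are forwarded from the corresponding CMake variables:

.. code-block:: bash

    # Illustrative expansion only: CC/CPP come from c_compiler, --host from MACHINE_NAME.
    3rdparty/libbacktrace/configure \
        "--prefix=$PWD/build/libbacktrace" \
        --with-pic \
        "CC=/usr/bin/aarch64-linux-gnu-gcc" \
        "CPP=/usr/bin/aarch64-linux-gnu-gcc -E" \
        "--host=aarch64-linux-gnu"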
diff --git a/docs/deploy/index.rst b/docs/deploy/index.rst
index 3cbbb10bd74b..b127de982b61 100644
--- a/docs/deploy/index.rst
+++ b/docs/deploy/index.rst
@@ -25,12 +25,20 @@ as well as how to integrate it with your project.
 
 .. image:: https://tvm.apache.org/images/release/tvm_flexible.png
 
+.. _build-tvm-runtime-on-target-device:
+
+Build the TVM runtime library
+-----------------------------
+
 Unlike traditional deep learning frameworks. TVM stack is divided into two major components:
 
-- TVM compiler, which does all the compilation and optimizations
+- TVM compiler, which does all the compilation and optimizations of the model
 - TVM runtime, which runs on the target devices.
 
-In order to integrate the compiled module, we **do not** need to build entire TVM on the target device. You only need to build the TVM compiler stack on your desktop and use that to cross-compile modules that are deployed on the target device.
+In order to integrate the compiled module, we **do not** need to build the entire
+TVM stack on the target device. You only need to build the TVM compiler stack on your
+desktop and use it to cross-compile modules that are deployed on the target device.
+
 We only need to use a light-weight runtime API that can be integrated into various platforms.
 
 For example, you can run the following commands to build the runtime API
@@ -46,11 +54,103 @@ on a Linux based embedded system such as Raspberry Pi:
 
     cmake ..
    make runtime
 
-Note that we type `make runtime` to only build the runtime library.
+Note that we type ``make runtime`` to only build the runtime library.
+
+It is also possible to cross compile the runtime. Cross compiling
+the runtime library should not be confused with cross compiling models
+for embedded devices.
+
 If you want to include additional runtime such as OpenCL,
-you can modify `config.cmake` to enable these options.
+you can modify ``config.cmake`` to enable these options.
 After you get the TVM runtime library, you can link the compiled library
 
+.. figure:: https://raw.githubusercontent.com/tlc-pack/web-data/main/images/dev/tvm_deploy_crosscompile.svg
+   :align: center
+   :width: 85%
+
+A model (optimized or not by TVM) can be cross compiled by TVM for
+different architectures such as ``aarch64`` on an ``x86_64`` host. Once the model
+is cross compiled it is necessary to have a runtime compatible with the target
+architecture to be able to run the cross compiled model.
+
+
+Cross compile the TVM runtime for other architectures
+------------------------------------------------------
+
+In the example :ref:`above <build-tvm-runtime-on-target-device>` the runtime library was
+compiled on a Raspberry Pi. Producing the runtime library can be done much faster on
+hosts that have high performance processors with ample resources (such as laptops or workstations)
+compared to a target device such as a Raspberry Pi. In order to cross compile the runtime, the
+toolchain for the target device must be installed. After installing the correct toolchain,
+the main difference compared to compiling natively is to pass some additional command
+line arguments to cmake that specify the toolchain to be used. For reference,
+building the TVM runtime library on a modern laptop (using 8 threads) for ``aarch64``
+takes around 20 seconds, versus ~10 minutes to build the runtime on a Raspberry Pi 4.
+
+cross-compile for aarch64
+"""""""""""""""""""""""""
+
+.. code-block:: bash
+
+    sudo apt-get update
+    sudo apt-get install gcc-aarch64-linux-gnu g++-aarch64-linux-gnu
+
+.. code-block:: bash
+
+    cmake .. \
+        -DCMAKE_SYSTEM_NAME=Linux \
+        -DCMAKE_SYSTEM_VERSION=1 \
+        -DCMAKE_C_COMPILER=/usr/bin/aarch64-linux-gnu-gcc \
+        -DCMAKE_CXX_COMPILER=/usr/bin/aarch64-linux-gnu-g++ \
+        -DCMAKE_FIND_ROOT_PATH=/usr/aarch64-linux-gnu \
+        -DCMAKE_FIND_ROOT_PATH_MODE_PROGRAM=NEVER \
+        -DCMAKE_FIND_ROOT_PATH_MODE_LIBRARY=ONLY \
+        -DMACHINE_NAME=aarch64-linux-gnu
+
+    make -j$(nproc) runtime
+
+For bare metal ARM devices, the following toolchain is quite handy to install instead of gcc-aarch64-linux-*:
+
+.. code-block:: bash
+
+    sudo apt-get install gcc-multilib-arm-linux-gnueabihf g++-multilib-arm-linux-gnueabihf
+
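As a quick sanity check (not part of the patch), the architecture of the freshly cross compiled runtime can be inspected with ``file``, analogous to the RISC-V example below; the output shown is illustrative:

.. code-block:: bash

    file libtvm_runtime.so
    # expected to report something along the lines of:
    # libtvm_runtime.so: ELF 64-bit LSB shared object, ARM aarch64, version 1 (GNU/Linux), dynamically linked, not stripped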
+cross-compile for RISC-V
+"""""""""""""""""""""""""
+
+.. code-block:: bash
+
+    sudo apt-get update
+    sudo apt-get install gcc-riscv64-linux-gnu g++-riscv64-linux-gnu
+
+.. code-block:: bash
+
+    cmake .. \
+        -DCMAKE_SYSTEM_NAME=Linux \
+        -DCMAKE_SYSTEM_VERSION=1 \
+        -DCMAKE_C_COMPILER=/usr/bin/riscv64-linux-gnu-gcc \
+        -DCMAKE_CXX_COMPILER=/usr/bin/riscv64-linux-gnu-g++ \
+        -DCMAKE_FIND_ROOT_PATH=/usr/riscv64-linux-gnu \
+        -DCMAKE_FIND_ROOT_PATH_MODE_PROGRAM=NEVER \
+        -DCMAKE_FIND_ROOT_PATH_MODE_LIBRARY=ONLY \
+        -DMACHINE_NAME=riscv64-linux-gnu
+
+    make -j$(nproc) runtime
+
+The ``file`` command can be used to query the architecture of the produced runtime:
+
+.. code-block:: bash
+
+    file libtvm_runtime.so
+    libtvm_runtime.so: ELF 64-bit LSB shared object, UCB RISC-V, version 1 (GNU/Linux), dynamically linked, BuildID[sha1]=e9ak845b3d7f2c126dab53632aea8e012d89477e, not stripped
+
+
+Optimize and tune models for target devices
+-------------------------------------------
+
 The easiest and recommended way to test, tune and benchmark TVM kernels on
 embedded devices is through TVM's RPC API.
 Here are the links to the related tutorials.
@@ -58,8 +158,11 @@ Here are the links to the related tutorials.
 
 - :ref:`tutorial-cross-compilation-and-rpc`
 - :ref:`tutorial-deploy-model-on-rasp`
 
+Deploy optimized model on target devices
+----------------------------------------
+
 After you finished tuning and benchmarking, you might need to deploy the model on the
-target device without relying on RPC. see the following resources on how to do so.
+target device without relying on RPC. See the following resources on how to do so.
 
 .. toctree::
    :maxdepth: 2
@@ -72,3 +175,5 @@ target device without relying on RPC. see the following resources on how to do s
    tensorrt
    vitis_ai
    bnns
+
+
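To illustrate the deployment step that the resources above describe (this example is not part of the patch; ``my_app.cc`` is a hypothetical application using the TVM C++ runtime API, and the include and library paths assume a standard TVM source checkout with the ``aarch64`` runtime from the previous section in ``build/``):

.. code-block:: bash

    # Cross compile and link an application against the aarch64 runtime built above.
    aarch64-linux-gnu-g++ -std=c++14 my_app.cc \
        -I tvm/include \
        -I tvm/3rdparty/dlpack/include \
        -I tvm/3rdparty/dmlc-core/include \
        -L tvm/build -ltvm_runtime -ldl -lpthread \
        -o my_app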
diff --git a/docs/install/from_source.rst b/docs/install/from_source.rst
index bc6cdb90da15..5d723d1ce048 100644
--- a/docs/install/from_source.rst
+++ b/docs/install/from_source.rst
@@ -51,27 +51,31 @@ Build the Shared Library
 
 Our goal is to build the shared libraries:
 
-- On Linux the target library are `libtvm.so`
-- On macOS the target library are `libtvm.dylib`
-- On Windows the target library are `libtvm.dll`
+   - On Linux the target libraries are `libtvm.so` and `libtvm_runtime.so`
+   - On macOS the target libraries are `libtvm.dylib` and `libtvm_runtime.dylib`
+   - On Windows the target libraries are `libtvm.dll` and `libtvm_runtime.dll`
+
+It is also possible to :ref:`build the runtime <build-tvm-runtime-on-target-device>` library only.
+
+The minimal building requirements for the ``TVM`` libraries are:
+
+   - A recent C++ compiler supporting C++ 14 (g++-5 or higher)
+   - CMake 3.5 or higher
+   - We highly recommend building with LLVM to enable all the features.
+   - If you want to use CUDA, CUDA toolkit version >= 8.0 is required. If you are upgrading from an older version, make sure you purge the older version and reboot after installation.
+   - On macOS, you may want to install `Homebrew <https://brew.sh>`_ to easily install and manage dependencies.
+
+To install these minimal prerequisites on Ubuntu/Debian-like
+Linux operating systems, execute (in a terminal):
 
 .. code:: bash
 
     sudo apt-get update
     sudo apt-get install -y python3 python3-dev python3-setuptools gcc libtinfo-dev zlib1g-dev build-essential cmake libedit-dev libxml2-dev
 
-The minimal building requirements are
-
-- A recent c++ compiler supporting C++ 14 (g++-5 or higher)
-- CMake 3.5 or higher
-- We highly recommend to build with LLVM to enable all the features.
-- If you want to use CUDA, CUDA toolkit version >= 8.0 is required. If you are upgrading from an older version, make sure you purge the older version and reboot after installation.
-- On macOS, you may want to install `Homebrew <https://brew.sh>`_ to easily install and manage dependencies.
-
 We use cmake to build the library.
-The configuration of TVM can be modified by `config.cmake`.
+The configuration of TVM can be modified by editing `config.cmake` and/or by passing cmake flags to the command line:
 
 - First, check the cmake in your system. If you do not have cmake,
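As a usage sketch for the sentence added above (not part of the patch; ``USE_LLVM`` is one of the existing options defined in ``config.cmake``, and the exact set of flags to enable depends on your setup), the configuration can be adjusted either way:

.. code-block:: bash

    mkdir -p build && cd build
    # either copy the template configuration and edit it ...
    cp ../cmake/config.cmake .
    # ... and/or pass the equivalent flags directly on the command line:
    cmake .. -DUSE_LLVM=ON
    make -j$(nproc)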