
Jetson AGX/TX2/Nano - Build release from pip3 install #1982

Closed
alexis-gruet-deel opened this issue Jul 9, 2020 · 35 comments

@alexis-gruet-deel

Describe the feature and the current behavior/state.
Running pip3 install tensorflow-addons on an Nvidia TX2 produces No matching distribution found for tensorflow-addons. My current TensorFlow is 2.1.0 from JetPack 4.3.

@bhack
Contributor

bhack commented Jul 9, 2020

The main issue is that we need tensorflow/build#9.

@alexis-gruet-deel
Author

I tried to build r0.8.3 from source; however, I'm not able to build with Bazel. First the build fails to find the CUDA libs; then I hit crosstool:toolchain' does not contain a toolchain for cpu 'aarch64'. My question: how do you compile TFA on a TX2 with JetPack 4.3 (TF 2.1 / CUDA 10.0 / cuDNN 7)?

@bhack
Contributor

bhack commented Jul 12, 2020

Currently we don't release/build TFA on arm64.
As I told you, we need TensorFlow's build infra to be available on that arch. Check again tensorflow/build#9.

@WindQAQ added the build label Jul 13, 2020
@alexis-gruet-deel
Author

Thanks @bhack:
I read your message and tensorflow/build#9 twice; with my limited knowledge in this field, my understanding is also limited. I'm just surprised that I can compile for the CPU but can't make it work for the GPU. I understand you don't provide a release or a build.

Can you confirm that without tensorflow/build#9 there is no way to build TFA from source on the TX2 for the GPU (while it apparently works from source for the CPU)?

@bhack
Contributor

bhack commented Jul 13, 2020

What I mean is that since TFA ships custom ops, we need custom-ops build infra for arm64.
Nvidia packages like the one you are using from https://developer.download.nvidia.com/compute/redist/jp/v44/tensorflow/ are supported by Nvidia and not officially supported by TensorFlow/SIGs.

So if you want to go ahead with the Nvidia packages, I suggest you post in https://forums.developer.nvidia.com/c/agx-autonomous-machines/jetson-embedded-systems/70 because they know their package recipes and could probably prepare a package for TFA as well.
The official (TensorFlow) way to support ARM64 is to solve and upvote tensorflow/build#9.

@alexis-gruet-deel
Author

OK, that makes total sense.
I finally made it work on the GPU by editing the external/local_config_cuda/crosstool/BUILD file and adding "aarch64": ":cc-compiler-local" under cc_toolchain_suite > toolchains {. I've tested it and everything is OK.

@bhack
Contributor

bhack commented Jul 13, 2020

@MI-LA01 If you have solved it in your local setup, you could talk with @bzhaoopenstack about adding a Zuul job for TFA at https://github.com/theopenlab/openlab-zuul-jobs, in addition to the TensorFlow one.

@alexis-gruet-deel
Author

alexis-gruet-deel commented Jul 13, 2020

Sure. Please note my version of JetPack was 4.3, so I was only able to compile tags <= v0.9.1.

@Tetsujinfr

Tetsujinfr commented Sep 28, 2020

> OK, that makes total sense.
> I finally made it work on the GPU by editing the external/local_config_cuda/crosstool/BUILD file and adding "aarch64": ":cc-compiler-local" under cc_toolchain_suite > toolchains {. I've tested it and everything is OK.

Hi,
how did you get TFA to work on Jetson, please?
I got the TF 2.2 distribution from Nvidia for the Jetson GPU, and I would like to avoid building it from source. Did you manage to install TFA on top of the Nvidia TF distribution?

I have built Bazel from source successfully, and now I am trying to build TFA from source with GPU support.

Here is my external/local_config_cuda/crosstool/BUILD file; I do not see what I did wrong, please help me:

licenses(["restricted"])

package(default_visibility = ["//visibility:public"])

load(":cc_toolchain_config.bzl", "cc_toolchain_config")

toolchain(
    name = "toolchain-linux-aarch64",
    exec_compatible_with = [
        "@bazel_tools//platforms:linux",
        "@bazel_tools//platforms:aarch64",
    ],
    target_compatible_with = [
        "@bazel_tools//platforms:linux",
        "@bazel_tools//platforms:aarch64",
    ],
    toolchain = ":cc-compiler-local",
    toolchain_type = "@bazel_tools//tools/cpp:toolchain_type",
)

cc_toolchain_suite(
    name = "toolchain",
    toolchains = {
        "local|compiler": ":cc-compiler-local",
        "k8": ":cc-compiler-local",
        "ppc": ":cc-compiler-local",
    },
)

cc_toolchain(
    name = "cc-compiler-local",
    all_files = ":crosstool_wrapper_driver_is_not_gcc",
    compiler_files = ":empty",
    dwp_files = ":empty",
    linker_files = ":crosstool_wrapper_driver_is_not_gcc",
    objcopy_files = ":empty",
    strip_files = ":empty",
    # To support linker flags that need to go to the start of command line
    # we need the toolchain to support parameter files. Parameter files are
    # last on the command line and contain all shared libraries to link, so all
    # regular options will be left of them.
    supports_param_files = 1,
    toolchain_config = ":cc-compiler-local-config",
    toolchain_identifier = "local_linux",
)

cc_toolchain_config(
    name = "cc-compiler-local-config",
    cpu = "local",
    builtin_include_directories = "/usr/include/c++/7,/usr/include/aarch64-linux-gnu/c++/7,/usr/include/c++/7/backward,/usr/lib/gcc/aarch64-linux-gnu/7/include,/usr/local/include,/usr/lib/gcc/aarch64-linux-gnu/7/include-fixed,/usr/include/aarch64-linux-gnu,/usr/include,/usr/local/cuda/targets/aarch64-linux/include,/usr/local/cuda/include,/usr/local/cuda/include,/usr/include".split(","),
    extra_no_canonical_prefixes_flags = ["-fno-canonical-system-headers"],
    host_compiler_path = "clang/bin/crosstool_wrapper_driver_is_not_gcc",
    host_compiler_prefix = "/usr/bin",
    host_compiler_warnings = [],
    host_unfiltered_compile_flags = [],
    linker_bin_path = "/usr/bin",
)

filegroup(
    name = "empty",
    srcs = [],
)

filegroup(
    name = "crosstool_wrapper_driver_is_not_gcc",
    srcs = ["clang/bin/crosstool_wrapper_driver_is_not_gcc"],
)

Error msg I still have:

external/local_config_cuda/crosstool/BUILD:22:1: in cc_toolchain_suite rule @local_config_cuda//crosstool:toolchain: cc_toolchain_suite '@local_config_cuda//crosstool:toolchain' does not contain a toolchain for cpu 'aarch64'

thanks for your help

@Tetsujinfr

OK, so I misread your initial post and did not edit the BUILD file in the right place.
For the sake of reference, I edited the BUILD.tpl file under addons/build_deps/toolchains/gpu/crosstool/ in the right place and the build worked fine.

Modified part of BUILD.tpl:

cc_toolchain_suite(
    name = "toolchain",
    toolchains = {
        "local|compiler": ":cc-compiler-local",
        "k8": ":cc-compiler-local",
        "ppc": ":cc-compiler-local",
        "aarch64": ":cc-compiler-local",
    },
)

Thanks for your solution on this, brilliant!

@JosephHuang913

JosephHuang913 commented Oct 20, 2020

Hi @MI-LA01

I am trying to build tensorflow-addons-0.7.1 on a Jetson Nano, but in vain.
I have installed bazel-3.6.0 on the Jetson Nano.
The SDK version is JetPack 4.4, TensorFlow is 2.1.0.
CUDA version: 10.2
cuDNN version: 8

When I run ./config.sh, it shows

Configuring TensorFlow Addons to be built from source...
fatal: not a git repository (or any of the parent directories): .git

> TensorFlow Addons will link to the framework in a pre-installed TF pacakge...
> Checking installed packages in /usr/bin/python
Traceback (most recent call last):
File "build_deps/check_deps.py", line 7, in
from pip._internal.req import parse_requirements
ImportError: No module named pip._internal.req
Package tensorflow>=2.1.0 will be installed. Are You Sure? [y/n] y
> Installing...
/usr/bin/python: No module named pip
Traceback (most recent call last):
File "", line 1, in
ImportError: No module named tensorflow
Traceback (most recent call last):
File "", line 1, in
ImportError: No module named tensorflow
Traceback (most recent call last):
File "", line 1, in
ImportError: No module named tensorflow

Configuring GPU setup...

Build configurations successfully written to .bazelrc

How do you solve these problems?
Could you reveal more details?

@Tetsujinfr

Hi. Your error message says "ImportError: No module named tensorflow", so I think you need to install TensorFlow properly first, or make sure TensorFlow is available to your virtual environment if you are using one.
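A quick check along those lines (a sketch; adjust the interpreter name to your environment) is to import TensorFlow from the same interpreter the build scripts will use:

```shell
# Print which python3 is in use, then test the TensorFlow import under
# it; a failure here means the package is missing for that interpreter
# (or the wrong environment is active).
python3 -c "import sys; print(sys.executable)"
python3 -c "import tensorflow as tf; print(tf.__version__)" \
  || echo "TensorFlow is not importable from python3"
```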

@bhack
Contributor

bhack commented Oct 20, 2020

@JosephHuang913

In fact, I have installed tensorflow-gpu-2.1.0 on the Jetson Nano with JetPack 4.4 according to the link provided by @bhack. I also re-installed pip3 but still got the error message "ImportError: No module named pip._internal.req". Should I downgrade to JetPack 4.3? I didn't use a virtual environment.

@bhack
Contributor

bhack commented Oct 20, 2020

I think you have a problem with pip. Try to force-reinstall pip.

@JosephHuang913

Hi @bhack ,

I have force-reinstalled pip using the commands:

curl https://bootstrap.pypa.io/get-pip.py -o get-pip.py
python3 get-pip.py --force-reinstall

The version of pip3 is now up to date; however, I still get the error message.
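One thing worth checking (a sketch, assuming a stock JetPack/Ubuntu 18.04 setup where a bare `python` may still resolve to Python 2.7) is whether the pip that was reinstalled belongs to the same interpreter the configure script invokes:

```shell
# Show which interpreters are on PATH; on Ubuntu 18.04 `python` is
# often Python 2.7 while the reinstalled pip belongs to python3.
command -v python python3 || true
# `python3 -m pip` always uses the pip bound to python3.
python3 -m pip --version || echo "pip is missing for python3"
```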

@bhack
Contributor

bhack commented Oct 20, 2020

Can you try in python:

from pip._internal.req import parse_requirements

@JosephHuang913

JosephHuang913 commented Oct 21, 2020

Hi @bhack

This is the result. The parse_requirements module is imported successfully. Do you have any comment?

joseph@jetson-nano:~/Download/addons-0.7.1$ python3
Python 3.6.9 (default, Oct 8 2020, 12:12:24)
[GCC 8.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import tensorflow as tf
2020-10-21 09:04:18.396359: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudart.so.10.2
2020-10-21 09:04:21.874398: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libnvinfer.so.7
2020-10-21 09:04:21.876817: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libnvinfer_plugin.so.7
>>> tf.__version__
'2.1.0'
>>> from pip._internal.req import parse_requirements
>>>

@bhack
Contributor

bhack commented Oct 21, 2020

What is the error now?

@JosephHuang913

There is no error message.

@Tetsujinfr

I guess the question is: what is the error message when you try to build TF Addons with ./config.sh?

@JosephHuang913

The error messages remain the same.

@bhack
Contributor

bhack commented Oct 21, 2020

“ ImportError: No module named pip._internal.req”

@JosephHuang913

When I import parse_requirements from pip._internal.req in a python3 environment, there is no error message.
When I run ./configure.sh, I get ImportError: No module named pip._internal.req

@bhack
Contributor

bhack commented Oct 21, 2020

Can you run the configure step in the same python3 env?

@JosephHuang913

Hi @bhack ,

Thanks a lot. I have found the problem: the configure.sh of tensorflow-addons uses Python 2.7 instead of Python 3.6, hence the error messages.

joseph@jetson-nano:~/Download/addons-0.7.1$ python
Python 2.7.17 (default, Sep 30 2020, 13:38:04)
[GCC 7.5.0] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import tensorflow as tf
Traceback (most recent call last):
File "", line 1, in
ImportError: No module named tensorflow
>>> from pip._internal.req import parse_requirements
Traceback (most recent call last):
File "", line 1, in
ImportError: No module named pip._internal.req
>>>

What should I do to solve this problem? Remove Python 2.7? If so, how should I uninstall it?
Or what should I do to make tensorflow-addons configure with Python 3.6 instead of Python 2.7?

@JosephHuang913

Hi @bhack ,

I used this command to solve the problem:
ln -s /usr/bin/python3.6 /usr/bin/python

Then I met new problems.
What does "fatal: not a git repository (or any of the parent directories): .git" mean?

When the configure process completed successfully, I ran the following command:
bazel build --enable_runfiles build_pip_pkg
and got these new error messages:

joseph@jetson-nano:~/Download/addons-0.7.1$ bazel build --enable_runfiles build_pip_pkg
Starting local Bazel server and connecting to it...
WARNING: ignoring LD_PRELOAD in environment.
INFO: Repository local_config_cuda instantiated at:
no stack (--record_rule_instantiation_callstack not enabled)
Repository rule cuda_configure defined at:
/home/joseph/Download/addons-0.7.1/build_deps/toolchains/gpu/cuda_configure.bzl:1049:33: in
ERROR: An error occurred during the fetch of repository 'local_config_cuda':
Traceback (most recent call last):
File "/home/joseph/Download/addons-0.7.1/build_deps/toolchains/gpu/cuda_configure.bzl", line 1047, column 38, in _cuda_autoconf_impl
_create_local_cuda_repository(repository_ctx)
File "/home/joseph/Download/addons-0.7.1/build_deps/toolchains/gpu/cuda_configure.bzl", line 824, column 35, in _create_local_cuda_repository
cuda_config = _get_cuda_config(repository_ctx)
File "/home/joseph/Download/addons-0.7.1/build_deps/toolchains/gpu/cuda_configure.bzl", line 629, column 30, in _get_cuda_config
config = find_cuda_config(repository_ctx, ["cuda", "cudnn"])
File "/home/joseph/Download/addons-0.7.1/build_deps/toolchains/gpu/cuda_configure.bzl", line 1037, column 28, in find_cuda_config
auto_configure_fail("Failed to run find_cuda_config.py: %s" % exec_result.stderr)
File "/home/joseph/Download/addons-0.7.1/build_deps/toolchains/gpu/cuda_configure.bzl", line 261, column 9, in auto_configure_fail
fail("\n%sCuda Configuration Error:%s %s\n" % (red, no_color, msg))
Error in fail:
Cuda Configuration Error: Failed to run find_cuda_config.py: Could not find any cudnn.h matching version '8' in any subdirectory:
''
'include'
'include/cuda'
'include/*-linux-gnu'
'extras/CUPTI/include'
'include/cuda/CUPTI'
of:
'/usr'

INFO: Repository rules_java instantiated at:
no stack (--record_rule_instantiation_callstack not enabled)
Repository rule http_archive defined at:
/home/joseph/.cache/bazel/_bazel_joseph/ce34f67686c7dfc7a141bfac8fd86dc1/external/bazel_tools/tools/build_defs/repo/http.bzl:336:31: in
ERROR: /home/joseph/Download/addons-0.7.1/tensorflow_addons/activations/BUILD:5:11: //tensorflow_addons/activations:activations depends on //tensorflow_addons/custom_ops/activations:_activation_ops.so in repository @ which failed to fetch. no such package '@local_config_cuda//cuda':
Cuda Configuration Error: Failed to run find_cuda_config.py: Could not find any cudnn.h matching version '8' in any subdirectory:
''
'include'
'include/cuda'
'include/*-linux-gnu'
'extras/CUPTI/include'
'include/cuda/CUPTI'
of:
'/usr'

ERROR: Analysis of target '//:build_pip_pkg' failed; build aborted: Analysis failed
INFO: Elapsed time: 12.030s
INFO: 0 processes.
FAILED: Build did NOT complete successfully (23 packages loaded, 68 targets configured)
currently loading: tensorflow_addons/custom_ops/activations ... (6 packages)
Fetching @local_config_tf; Restarting.

@JosephHuang913

What does TF_NEED_CUDA="1" mean?
If I set TF_NEED_CUDA="0", does that mean I can't use my GPU with tensorflow-addons?
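For reference, a hedged note based on how TF-style configure scripts generally use this flag: TF_NEED_CUDA=1 tells the configure step to build the CUDA kernels of TFA's custom ops; with 0 only the CPU kernels are compiled, so tensorflow-addons still imports and runs, but its custom ops execute on the CPU.

```shell
# Sketch: request the CUDA kernels for the custom ops before running
# the configure script (flag name as used by TF-style configure).
export TF_NEED_CUDA=1
```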

@bhack
Contributor

bhack commented Oct 21, 2020

Yes, the problem seems to be that it cannot find the cuDNN files on your system.
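For reference, a likely wrinkle worth checking (a sketch based on common JetPack setups; verify on your own board): with cuDNN 8 the version macros moved from cudnn.h to cudnn_version.h, which older find_cuda_config.py scripts may not parse, and on JetPack the cuDNN headers live under /usr/include rather than inside the CUDA tree. Locating the files and exporting the configure variables before rebuilding may help (variable names follow the TF-style cuda_configure.bzl; versions shown assume JetPack 4.4):

```shell
# Locate cuDNN headers/libraries in the usual JetPack install paths.
find /usr/include /usr/lib/aarch64-linux-gnu -maxdepth 1 \
  -name 'cudnn*' 2>/dev/null || true

# Point the configure step at those locations; adjust the version
# numbers to match your JetPack release.
export CUDNN_INSTALL_PATH=/usr/include
export TF_CUDNN_VERSION=8
export CUDA_TOOLKIT_PATH=/usr/local/cuda
export TF_CUDA_VERSION=10.2
```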

@JosephHuang913

Hi @MI-LA01 ,

Do you have any suggestion?

@alexis-gruet-deel
Author

alexis-gruet-deel commented Oct 22, 2020

No, I don't really have suggestions. cuDNN is required, which obviously makes sense.
However, this is the way I fixed things (it was for YOLOv4); note there are some environment variables to export, one of which is for cuDNN:

tfa

Good luck!

As a side note:

  • If you wish to run YOLOv4 inference on the Jetson, I suggest going with tkDNN.
  • You need to check out the release tag of this repository corresponding to your versions of TF, CUDA, and cuDNN. If I remember well, the README.md of this repo provides this information.

@JosephHuang913

Hi @MI-LA01 ,

Thanks a lot for your help. I have built and installed tensorflow-addons successfully. However, when I import tensorflow_addons, I get the following error message. Do you have any idea what's wrong?

joseph@Jetson-Nano:~/Downloads/addons$ python
Python 3.6.9 (default, Oct 8 2020, 12:12:24)
[GCC 8.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import tensorflow_addons as tfa
2020-10-22 16:16:14.321755: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.0
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/joseph/Downloads/addons/tensorflow_addons/__init__.py", line 21, in <module>
from tensorflow_addons import activations
File "/home/joseph/Downloads/addons/tensorflow_addons/activations/__init__.py", line 21, in <module>
from tensorflow_addons.activations.gelu import gelu
File "/home/joseph/Downloads/addons/tensorflow_addons/activations/gelu.py", line 24, in <module>
get_path_to_datafile("custom_ops/activations/_activation_ops.so"))
File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/framework/load_library.py", line 61, in load_op_library
lib_handle = py_tf.TF_LoadLibrary(library_filename)
tensorflow.python.framework.errors_impl.NotFoundError: /home/joseph/Downloads/addons/tensorflow_addons/custom_ops/activations/_activation_ops.so: cannot open shared object file: No such file or directory
>>> tfa.__version__
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
NameError: name 'tfa' is not defined

@alexis-gruet-deel
Author

No. I just understand that /home/joseph/Downloads/addons/tensorflow_addons/custom_ops/activations/_activation_ops.so was not found. What I would do first is ensure the dynamic lib _activation_ops.so is present somewhere in your filesystem. If it is not, something went wrong during compilation.
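A minimal check along those lines (a sketch; the path is taken from the traceback above):

```shell
# Verify the custom-op library exists where Python expects it, and that
# its shared-library dependencies resolve (look for "not found" in ldd).
SO=tensorflow_addons/custom_ops/activations/_activation_ops.so
if [ -f "$SO" ]; then
  ldd "$SO"
else
  echo "missing: $SO -- the compilation step likely did not produce it"
fi
```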

@bhack
Contributor

bhack commented Oct 22, 2020

@JosephHuang913

Hi @bhack ,

Thanks a lot. I have installed tensorflow-addons on Jetson nano successfully.
