tensorflow in armv7l #445

Closed
vmayoral opened this issue Dec 8, 2015 · 27 comments

@vmayoral

vmayoral commented Dec 8, 2015

Hi,

I've cross-compiled TensorFlow for armv7l and generated a wheel successfully. However, when I deploy it to an embedded board with the same architecture (e.g. a Raspberry Pi 2), I get the following when running https://github.com/aymericdamien/TensorFlow-Examples/blob/master/examples/1%20-%20Introduction/helloworld.py:

erle@erle-brain-2 ~/TensorFlow-Examples/examples/1 - Introduction $ python helloworld.py 
I tensorflow/core/common_runtime/local_device.cc:40] Local device intra op parallelism threads: 4
pure virtual method called
terminate called without an active exception
I tensorflow/core/common_runtime/direct_session.cc:60] Direct session inter op parallelism threads: 4
Aborted

Digging a bit more:

erle@erle-brain-2 ~/TensorFlow-Examples/examples/1 - Introduction $ gdb -ex r --args python helloworld.py
GNU gdb (Raspbian 7.7.1+dfsg-5) 7.7.1
Copyright (C) 2014 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "arm-linux-gnueabihf".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from python...(no debugging symbols found)...done.
Starting program: /usr/bin/python helloworld.py
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/arm-linux-gnueabihf/libthread_db.so.1".

Program received signal SIGILL, Illegal instruction.
0x73f1cd08 in ?? () from /usr/lib/arm-linux-gnueabihf/libcrypto.so.1.0.0
(gdb) bt
#0  0x73f1cd08 in ?? () from /usr/lib/arm-linux-gnueabihf/libcrypto.so.1.0.0
#1  0x73f193f4 in OPENSSL_cpuid_setup () from /usr/lib/arm-linux-gnueabihf/libcrypto.so.1.0.0
#2  0x76fdf058 in call_init (l=<optimized out>, argc=2, argv=0x7efff184, env=0x7efff190) at dl-init.c:78
#3  0x76fdf134 in _dl_init (main_map=main_map@entry=0x8e9268, argc=2, argv=0x7efff184, env=0x7efff190) at dl-init.c:126
#4  0x76fe36b4 in dl_open_worker (a=<optimized out>) at dl-open.c:577
#5  0x76fdeef0 in _dl_catch_error (objname=0x76fdeef0 <_dl_catch_error+112>, objname@entry=0x7effcc04, errstring=0x76ff6510, errstring@entry=0x7effcc08, 
    mallocedp=0x7effcc04, mallocedp@entry=0x7effcc03, operate=0x7effcc03, args=args@entry=0x7effcc0c) at dl-error.c:187
#6  0x76fe2da4 in _dl_open (file=0x9094e0 "/usr/lib/python2.7/lib-dynload/_hashlib.arm-linux-gnueabihf.so", mode=-2147483646, 
    caller_dlopen=0x10aa94 <_PyImport_GetDynLoadFunc+272>, nsid=-2, argc=2, argv=0x7efff184, env=0x7efff190) at dl-open.c:661
#7  0x76f66ba8 in dlopen_doit (a=0x7effce58) at dlopen.c:66
#8  0x76fdeef0 in _dl_catch_error (objname=0x76fdeef0 <_dl_catch_error+112>, errstring=0x76ff6510, mallocedp=0x4e93a4, operate=0x4e93a0, args=0x7effce58)
    at dl-error.c:187
#9  0x76f672a8 in _dlerror_run (operate=0x76f66b28 <dlopen_doit>, args=args@entry=0x7effce58) at dlerror.c:163
#10 0x76f66c74 in __dlopen (file=<optimized out>, mode=<optimized out>) at dlopen.c:87
#11 0x0010aa94 in _PyImport_GetDynLoadFunc ()
#12 0x0010a338 in _PyImport_LoadDynamicModule ()
#13 0x00067844 in ?? ()
Backtrace stopped: previous frame identical to this frame (corrupt stack?)
(gdb) Quit
(gdb) quit
A debugging session is active.

crypt* libraries on the machine used to cross-compile TensorFlow:

root@debian-arm:~/TensorFlow-Examples/examples/1 - Introduction# dpkg -l|grep crypt
ii  libcryptsetup4:armhf          2:1.6.6-5                 armhf        disk encryption support - shared library
ii  libgcrypt20:armhf             1.6.3-2                   armhf        LGPL Crypto library - runtime library
ii  libhogweed2:armhf             2.7.1-5                   armhf        low level cryptographic library (public-key cryptos)
ii  libk5crypto3:armhf            1.12.1+dfsg-19            armhf        MIT Kerberos runtime libraries - Crypto Library
ii  libnettle4:armhf              2.7.1-5                   armhf        low level cryptographic library (symmetric and one-way cryptos)
ii  openssl                       1.0.1k-3+deb8u1           armhf        Secure Sockets Layer toolkit - cryptographic utility
ii  python-cryptography           0.6.1-1                   armhf        Python library exposing cryptographic recipes and primitives (Python 2)

crypt* libraries on the target machine (Raspberry Pi 2):

erle@erle-brain-2 ~/TensorFlow-Examples/examples/1 - Introduction $ dpkg -l|grep crypt
ii  cryptsetup-bin                         2:1.6.6-5                                 armhf        disk encryption support - command line tools
ii  libcryptsetup4:armhf                   2:1.6.6-5                                 armhf        disk encryption support - shared library
ii  libgcrypt20:armhf                      1.6.3-2                                   armhf        LGPL Crypto library - runtime library
ii  libhcrypto4-heimdal:armhf              1.6~rc2+dfsg-9+rpi1                       armhf        Heimdal Kerberos - crypto library
ii  libhogweed2:armhf                      2.7.1-5                                   armhf        low level cryptographic library (public-key cryptos)
ii  libk5crypto3:armhf                     1.12.1+dfsg-19                            armhf        MIT Kerberos runtime libraries - Crypto Library
ii  libmhash2:armhf                        0.9.9.9-7                                 armhf        Library for cryptographic hashing and message authentication
ii  libnettle4:armhf                       2.7.1-5                                   armhf        low level cryptographic library (symmetric and one-way crypos)
ii  libpococrypto9                         1.3.6p1-5                                 armhf        C++ Portable Components (POCO) Crypto library
ii  openssl                                1.0.1k-3+deb8u1                           armhf        Secure Sockets Layer toolkit - cryptographic utility
@vmayoral
Author

vmayoral commented Dec 9, 2015

For completeness:

Python 2.7.9 (default, Mar  8 2015, 00:52:26) 
[GCC 4.9.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import tensorflow as tf
>>> hello = tf.constant('Hello, TensorFlow!')
>>> sess = tf.Session()
I tensorflow/core/common_runtime/local_device.cc:40] Local device intra op parallelism threads: 4
pure virtual method called
I tensorflow/core/common_runtime/direct_session.cc:60] Direct session inter op parallelism threads: 4
terminate called without an active exception
Aborted
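
Since the wheel was cross-compiled, one basic sanity check is whether its platform tag matches what the target board reports. The sketch below is only an illustration: the wheel filename is hypothetical (substitute the file the cross-build actually produced), and a matching tag only rules out the most obvious mismatch. It would not by itself explain the "pure virtual method called" abort.

```python
import platform


def wheel_platform_tag(wheel_filename):
    """Return the platform tag, i.e. the last dash-separated field of a wheel name."""
    stem = wheel_filename
    if stem.endswith(".whl"):
        stem = stem[:-len(".whl")]
    return stem.split("-")[-1]


def tag_matches_machine(wheel_filename, machine=None):
    """True if the wheel's platform tag ends with the given (or running) machine type."""
    machine = machine or platform.machine()
    return wheel_platform_tag(wheel_filename).endswith(machine)


if __name__ == "__main__":
    # Hypothetical wheel name, used for illustration only.
    name = "tensorflow-0.6.0-cp27-none-linux_armv7l.whl"
    print("wheel tag: {}, this machine: {}, match: {}".format(
        wheel_platform_tag(name), platform.machine(), tag_matches_machine(name)))
```

Run on the board itself, `platform.machine()` should report `armv7l` for the setup described above.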

@danbri

danbri commented Dec 17, 2015

Any idea what the problem is here?

@nlothian

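No expert here, but I'm pretty sure that SIGILL trace is an artifact of running under the debugger, not the actual crash. OpenSSL probes CPU features at load time (the OPENSSL_cpuid_setup frame in the backtrace) by executing instructions the CPU may not support and catching the resulting SIGILL, and gdb stops on that signal. See https://bugs.launchpad.net/raspbian/+bug/1154042 for a similar issue with a good explanation.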

BTW, this is TensorFlow compiled in CPU only mode for the Pi, right?
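
If the SIGILL really is a debugger artifact, one way to get past it is to tell gdb not to stop on that signal, so the session continues to the actual abort. A minimal sketch follows; the helper name `gdb_command` is made up for illustration, while `handle SIGILL nostop noprint` is a standard gdb command.

```python
def gdb_command(script, ignore_signals=("SIGILL",)):
    """Build a gdb argv that runs `python script` without stopping on the given signals."""
    argv = ["gdb"]
    for sig in ignore_signals:
        # "nostop noprint" keeps gdb from halting; the signal is still passed to the program.
        argv += ["-ex", "handle {} nostop noprint".format(sig)]
    # Run the program, then print a backtrace once it stops (e.g. on the abort).
    argv += ["-ex", "run", "-ex", "bt", "--args", "python", script]
    return argv


if __name__ == "__main__":
    cmd = gdb_command("helloworld.py")
    print(" ".join(cmd))
    # To launch the session on the target board:
    #   import subprocess; subprocess.call(cmd)
```

The backtrace printed after the run should then point at the abort itself rather than at the libcrypto probe.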

@girving
Contributor

girving commented Mar 8, 2016

Looks like this fell through the cracks. @petewarden: Any thoughts here?

@danbri

danbri commented Mar 8, 2016

In the meantime, the somewhat faster Pi v3 shipped. Perhaps that affects the possibilities (and feasibilities) around TensorFlow for Pi?

https://en.wikipedia.org/wiki/Raspberry_Pi says "Raspberry Pi 3 has a new BCM2837 SoC retaining compatibility with the GPU, CPU and connectors of its predecessors BCM2835 (Pi 1) and BCM2836 (Pi 2), so all those projects and tutorials for Pi 1 and Pi 2 hardware should continue to work. The 900 MHz 32-bit quad-core ARM Cortex-A7 CPU complex has been replaced by a 1.2 GHz 64-bit quad-core ARM Cortex-A53. Combining a 33% increase in clock speed with various architectural enhancements, this provides a 50–60% increase in performance in 32-bit mode versus Raspberry Pi 2, or roughly a factor of ten over the original Pi 1."

@DSA101

DSA101 commented Mar 8, 2016

From the use case perspective I'd definitely like to see TensorFlow on Pi, at least for the predict function.

I have a Pi that mines text and data from the web daily, and I am also planning to add NN processing to my scripts. I would train the model on my regular Core i7 dev box with a GPU but would love to run the prediction daily on the Pi (perhaps retraining the model monthly). Why Pi? Because it has more than enough power to run Python scripts and do the data processing that I need, and consumes only 2W at full load (that's the Pi 2; the Pi 3, I believe, is more like 3W).

@girving
Contributor

girving commented Mar 8, 2016

@martinwicke: What's the state of cmake in contrib? Would it be practical to get tensorflow compiling on a Pi rather than cross compiling?

@petewarden
Contributor

My plan is to get Blaze running on a Pi if I can. CMake isn't very pleasant to work with for this, in my experience. I hope to get to this asap.

@samjabrahams
Contributor

For what it's worth, I'm attempting to natively compile TensorFlow on a Raspberry Pi 3 running Raspbian. For the most part, I'm using a modified version of these instructions for building on a Jetson TK1; the main changes are that I'm building Bazel 0.1.4 instead of 0.1.0, and I'm not building for CUDA.


Here's where I'm getting hung up right now: I tried using Bazel to build the tutorials_example_trainer binaries with the following command:

bazel build -c opt --local_resources 1024,0.5,1.0 --verbose_failures tensorflow/cc:tutorials_example_trainer

After about 1.5 hours, it threw the following error message:

ERROR: /home/pi/programming/tensorflow/tensorflow/core/kernels/BUILD:639:1: C++ compilation of rule '//tensorflow/core/kernels:argmax_op' failed: gcc failed: error executing command 
  (cd /home/pi/.cache/bazel/_bazel_pi/ddee419d9c6b5440629c2870bc1d9b2e/tensorflow && \
  exec env - \
    PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/local/games:/usr/games \
  /usr/bin/gcc -U_FORTIFY_SOURCE '-D_FORTIFY_SOURCE=1' -fstack-protector -Wall -Wunused-but-set-parameter -Wno-free-nonheap-object -fno-omit-frame-pointer -g0 -O2 -DNDEBUG -ffunction-sections -fdata-sections '-std=c++0x' -iquote . -iquote bazel-out/local_linux-opt/genfiles -iquote external/bazel_tools -iquote bazel-out/local_linux-opt/genfiles/external/bazel_tools -iquote external/jpeg_archive -iquote bazel-out/local_linux-opt/genfiles/external/jpeg_archive -iquote external/png_archive -iquote bazel-out/local_linux-opt/genfiles/external/png_archive -iquote external/re2 -iquote bazel-out/local_linux-opt/genfiles/external/re2 -iquote external/eigen_archive -iquote bazel-out/local_linux-opt/genfiles/external/eigen_archive -isystem google/protobuf/src -isystem bazel-out/local_linux-opt/genfiles/google/protobuf/src -isystem external/bazel_tools/tools/cpp/gcc3 -isystem external/jpeg_archive/jpeg-9a -isystem bazel-out/local_linux-opt/genfiles/external/jpeg_archive/jpeg-9a -isystem external/png_archive/libpng-1.2.53 -isystem bazel-out/local_linux-opt/genfiles/external/png_archive/libpng-1.2.53 -isystem external/re2 -isystem bazel-out/local_linux-opt/genfiles/external/re2 -isystem third_party/eigen3 -isystem bazel-out/local_linux-opt/genfiles/third_party/eigen3 -isystem external/eigen_archive/eigen-eigen-db7b61411772 -isystem bazel-out/local_linux-opt/genfiles/external/eigen_archive/eigen-eigen-db7b61411772 -fno-exceptions -DEIGEN_AVOID_STL_ARRAY -pthread -no-canonical-prefixes -fno-canonical-system-headers -Wno-builtin-macro-redefined '-D__DATE__="redacted"' '-D__TIMESTAMP__="redacted"' '-D__TIME__="redacted"' '-frandom-seed=bazel-out/local_linux-opt/bin/tensorflow/core/kernels/_objs/argmax_op/tensorflow/core/kernels/argmax_op.o' -MD -MF bazel-out/local_linux-opt/bin/tensorflow/core/kernels/_objs/argmax_op/tensorflow/core/kernels/argmax_op.d -c tensorflow/core/kernels/argmax_op.cc -o 
bazel-out/local_linux-opt/bin/tensorflow/core/kernels/_objs/argmax_op/tensorflow/core/kernels/argmax_op.o): com.google.devtools.build.lib.shell.BadExitStatusException: Process exited with status 4: gcc failed: error executing command 
  (cd /home/pi/.cache/bazel/_bazel_pi/ddee419d9c6b5440629c2870bc1d9b2e/tensorflow && \
  exec env - \
    PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/local/games:/usr/games \
  /usr/bin/gcc -U_FORTIFY_SOURCE '-D_FORTIFY_SOURCE=1' -fstack-protector -Wall -Wunused-but-set-parameter -Wno-free-nonheap-object -fno-omit-frame-pointer -g0 -O2 -DNDEBUG -ffunction-sections -fdata-sections '-std=c++0x' -iquote . -iquote bazel-out/local_linux-opt/genfiles -iquote external/bazel_tools -iquote bazel-out/local_linux-opt/genfiles/external/bazel_tools -iquote external/jpeg_archive -iquote bazel-out/local_linux-opt/genfiles/external/jpeg_archive -iquote external/png_archive -iquote bazel-out/local_linux-opt/genfiles/external/png_archive -iquote external/re2 -iquote bazel-out/local_linux-opt/genfiles/external/re2 -iquote external/eigen_archive -iquote bazel-out/local_linux-opt/genfiles/external/eigen_archive -isystem google/protobuf/src -isystem bazel-out/local_linux-opt/genfiles/google/protobuf/src -isystem external/bazel_tools/tools/cpp/gcc3 -isystem external/jpeg_archive/jpeg-9a -isystem bazel-out/local_linux-opt/genfiles/external/jpeg_archive/jpeg-9a -isystem external/png_archive/libpng-1.2.53 -isystem bazel-out/local_linux-opt/genfiles/external/png_archive/libpng-1.2.53 -isystem external/re2 -isystem bazel-out/local_linux-opt/genfiles/external/re2 -isystem third_party/eigen3 -isystem bazel-out/local_linux-opt/genfiles/third_party/eigen3 -isystem external/eigen_archive/eigen-eigen-db7b61411772 -isystem bazel-out/local_linux-opt/genfiles/external/eigen_archive/eigen-eigen-db7b61411772 -fno-exceptions -DEIGEN_AVOID_STL_ARRAY -pthread -no-canonical-prefixes -fno-canonical-system-headers -Wno-builtin-macro-redefined '-D__DATE__="redacted"' '-D__TIMESTAMP__="redacted"' '-D__TIME__="redacted"' '-frandom-seed=bazel-out/local_linux-opt/bin/tensorflow/core/kernels/_objs/argmax_op/tensorflow/core/kernels/argmax_op.o' -MD -MF bazel-out/local_linux-opt/bin/tensorflow/core/kernels/_objs/argmax_op/tensorflow/core/kernels/argmax_op.d -c tensorflow/core/kernels/argmax_op.cc -o 
bazel-out/local_linux-opt/bin/tensorflow/core/kernels/_objs/argmax_op/tensorflow/core/kernels/argmax_op.o): com.google.devtools.build.lib.shell.BadExitStatusException: Process exited with status 4.
Target //tensorflow/cc:tutorials_example_trainer failed to build
INFO: Elapsed time: 5398.942s, Critical Path: 5019.43s

The first message said that the tensorflow/core/kernels:argmax_op build instruction was failing, so I ran Bazel on just that part to see what messages I'd see along the way. Here's the command I ran:

bazel build -c opt --local_resources 1024,0.5,1.0 --verbose_failures tensorflow/core/kernels:argmax_op

And here's the full output, including Bazel messages before the error:

..............................................
WARNING: Sandboxed execution is not supported on your system and thus hermeticity of actions cannot be guaranteed. See http://bazel.io/docs/bazel-user-manual.html#sandboxing for more information. You can turn off this warning via --ignore_unsupported_sandboxing.
INFO: Found 1 target...
INFO: From Compiling google/protobuf/src/google/protobuf/util/internal/field_mask_utility.cc:
google/protobuf/src/google/protobuf/util/internal/field_mask_utility.cc:47:14: warning: 'google::protobuf::util::Status google::protobuf::util::converter::{anonymous}::CreatePublicError(google::protobuf::util::error::Code, const string&)' defined but not used [-Wunused-function]
 util::Status CreatePublicError(util::error::Code code,
              ^
INFO: From Compiling google/protobuf/src/google/protobuf/util/internal/utility.cc:
google/protobuf/src/google/protobuf/util/internal/utility.cc:50:19: warning: 'const google::protobuf::StringPiece google::protobuf::util::converter::{anonymous}::SkipWhiteSpace(google::protobuf::StringPiece)' defined but not used [-Wunused-function]
 const StringPiece SkipWhiteSpace(StringPiece str) {
                   ^
INFO: From Compiling google/protobuf/src/google/protobuf/util/time_util.cc:
google/protobuf/src/google/protobuf/util/time_util.cc:371:6: warning: 'void google::protobuf::{anonymous}::ToUint128(const google::protobuf::Timestamp&, google::protobuf::uint128*, bool*)' defined but not used [-Wunused-function]
 void ToUint128(const Timestamp& value, uint128* result, bool* negative) {
      ^
google/protobuf/src/google/protobuf/util/time_util.cc:396:6: warning: 'void google::protobuf::{anonymous}::ToTimestamp(const google::protobuf::uint128&, bool, google::protobuf::Timestamp*)' defined but not used [-Wunused-function]
 void ToTimestamp(const uint128& value, bool negative, Timestamp* timestamp) {
      ^
INFO: From Compiling tensorflow/core/util/tensor_slice_reader.cc:
In file included from ./tensorflow/core/platform/default/logging.h:23:0,
                 from ./tensorflow/core/platform/logging.h:24,
                 from ./tensorflow/core/lib/core/status.h:24,
                 from ./tensorflow/core/lib/core/errors.h:19,
                 from ./tensorflow/core/framework/tensor_shape.h:24,
                 from ./tensorflow/core/util/tensor_slice_reader.h:26,
                 from tensorflow/core/util/tensor_slice_reader.cc:16:
./tensorflow/core/platform/default/logging.h: In instantiation of 'std::string* tensorflow::internal::Check_LTImpl(const T1&, const T2&, const char*) [with T1 = int; T2 = unsigned int; std::string = std::basic_string<char>]':
tensorflow/core/util/tensor_slice_reader.cc:136:3:   required from here
./tensorflow/core/platform/default/logging.h:197:35: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
 TF_DEFINE_CHECK_OP_IMPL(Check_LT, < )
                                   ^
./tensorflow/core/platform/macros.h:54:29: note: in definition of macro 'TF_PREDICT_TRUE'
 #define TF_PREDICT_TRUE(x) (x)
                             ^
./tensorflow/core/platform/default/logging.h:197:1: note: in expansion of macro 'TF_DEFINE_CHECK_OP_IMPL'
 TF_DEFINE_CHECK_OP_IMPL(Check_LT, < )
 ^
INFO: From Compiling tensorflow/core/kernels/transpose_functor_cpu.cc:
In file included from ./tensorflow/core/platform/default/logging.h:23:0,
                 from ./tensorflow/core/platform/logging.h:24,
                 from ./tensorflow/core/lib/gtl/array_slice_internal.h:32,
                 from ./tensorflow/core/lib/gtl/array_slice.h:101,
                 from ./tensorflow/core/framework/types.h:33,
                 from ./tensorflow/core/framework/type_traits.h:22,
                 from ./tensorflow/core/framework/allocator.h:25,
                 from ./tensorflow/core/framework/tensor.h:21,
                 from ./tensorflow/core/kernels/transpose_functor.h:19,
                 from tensorflow/core/kernels/transpose_functor_cpu.cc:18:
./tensorflow/core/platform/default/logging.h: In instantiation of 'std::string* tensorflow::internal::Check_EQImpl(const T1&, const T2&, const char*) [with T1 = int; T2 = unsigned int; std::string = std::basic_string<char>]':
tensorflow/core/kernels/transpose_functor_cpu.cc:78:3:   required from here
./tensorflow/core/platform/default/logging.h:194:25: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
                         == )  // Compilation error with CHECK_EQ(NULL, x)?
                         ^
./tensorflow/core/platform/macros.h:54:29: note: in definition of macro 'TF_PREDICT_TRUE'
 #define TF_PREDICT_TRUE(x) (x)
                             ^
./tensorflow/core/platform/default/logging.h:193:1: note: in expansion of macro 'TF_DEFINE_CHECK_OP_IMPL'
 TF_DEFINE_CHECK_OP_IMPL(Check_EQ,
 ^
INFO: From Compiling tensorflow/core/graph/costmodel.cc:
In file included from ./tensorflow/core/platform/default/logging.h:23:0,
                 from ./tensorflow/core/platform/logging.h:24,
                 from ./tensorflow/core/lib/core/status.h:24,
                 from ./tensorflow/core/framework/op_def_builder.h:25,
                 from ./tensorflow/core/framework/op.h:23,
                 from ./tensorflow/core/graph/graph.h:44,
                 from ./tensorflow/core/graph/costmodel.h:22,
                 from tensorflow/core/graph/costmodel.cc:16:
./tensorflow/core/platform/default/logging.h: In instantiation of 'std::string* tensorflow::internal::Check_EQImpl(const T1&, const T2&, const char*) [with T1 = int; T2 = unsigned int; std::string = std::basic_string<char>]':
tensorflow/core/graph/costmodel.cc:65:9:   required from here
./tensorflow/core/platform/default/logging.h:194:25: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
                         == )  // Compilation error with CHECK_EQ(NULL, x)?
                         ^
./tensorflow/core/platform/macros.h:54:29: note: in definition of macro 'TF_PREDICT_TRUE'
 #define TF_PREDICT_TRUE(x) (x)
                             ^
./tensorflow/core/platform/default/logging.h:193:1: note: in expansion of macro 'TF_DEFINE_CHECK_OP_IMPL'
 TF_DEFINE_CHECK_OP_IMPL(Check_EQ,
 ^
./tensorflow/core/platform/default/logging.h: In instantiation of 'std::string* tensorflow::internal::Check_LTImpl(const T1&, const T2&, const char*) [with T1 = int; T2 = unsigned int; std::string = std::basic_string<char>]':
tensorflow/core/graph/costmodel.cc:146:3:   required from here
./tensorflow/core/platform/default/logging.h:197:35: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
 TF_DEFINE_CHECK_OP_IMPL(Check_LT, < )
                                   ^
./tensorflow/core/platform/macros.h:54:29: note: in definition of macro 'TF_PREDICT_TRUE'
 #define TF_PREDICT_TRUE(x) (x)
                             ^
./tensorflow/core/platform/default/logging.h:197:1: note: in expansion of macro 'TF_DEFINE_CHECK_OP_IMPL'
 TF_DEFINE_CHECK_OP_IMPL(Check_LT, < )
 ^
INFO: From Compiling tensorflow/core/common_runtime/function.cc:
In file included from ./tensorflow/core/platform/default/logging.h:23:0,
                 from ./tensorflow/core/platform/logging.h:24,
                 from ./tensorflow/core/lib/gtl/array_slice_internal.h:32,
                 from ./tensorflow/core/lib/gtl/array_slice.h:101,
                 from ./tensorflow/core/framework/types.h:33,
                 from ./tensorflow/core/framework/type_traits.h:22,
                 from ./tensorflow/core/framework/allocator.h:25,
                 from ./tensorflow/core/common_runtime/device.h:35,
                 from ./tensorflow/core/common_runtime/function.h:21,
                 from tensorflow/core/common_runtime/function.cc:16:
./tensorflow/core/platform/default/logging.h: In instantiation of 'std::string* tensorflow::internal::Check_EQImpl(const T1&, const T2&, const char*) [with T1 = int; T2 = unsigned int; std::string = std::basic_string<char>]':
tensorflow/core/common_runtime/function.cc:147:3:   required from here
./tensorflow/core/platform/default/logging.h:194:25: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
                         == )  // Compilation error with CHECK_EQ(NULL, x)?
                         ^
./tensorflow/core/platform/macros.h:54:29: note: in definition of macro 'TF_PREDICT_TRUE'
 #define TF_PREDICT_TRUE(x) (x)
                             ^
./tensorflow/core/platform/default/logging.h:193:1: note: in expansion of macro 'TF_DEFINE_CHECK_OP_IMPL'
 TF_DEFINE_CHECK_OP_IMPL(Check_EQ,
 ^
./tensorflow/core/platform/default/logging.h: In instantiation of 'std::string* tensorflow::internal::Check_EQImpl(const T1&, const T2&, const char*) [with T1 = unsi                         == )  // Compilation error with CHECK_EQ(NULL, x)?
                         ^
./tensorflow/core/platform/macros.h:54:29: note: in definition of macro 'TF_PREDICT_TRUE'
 #define TF_PREDICT_TRUE(x) (x)
                             ^
./tensorflow/core/platform/default/logging.h:193:1: note: in expansion of macro 'TF_DEFINE_CHECK_OP_IMPL'
 TF_DEFINE_CHECK_OP_IMPL(Check_EQ,
 ^
./tensorflow/core/platform/default/logging.h: In instantiation of 'std::string* tensorflow::internal::Check_LTImpl(const T1&, const T2&, const char*) [with T1 = int; T2 = unsigned int; std::string = std::basic_string<char>]':
tensorflow/core/common_runtime/function.cc:1157:5:   required from here
./tensorflow/core/platform/default/logging.h:197:35: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
 TF_DEFINE_CHECK_OP_IMPL(Check_LT, < )
                                   ^
./tensorflow/core/platform/macros.h:54:29: note: in definition of macro 'TF_PREDICT_TRUE'
 #define TF_PREDICT_TRUE(x) (x)
                             ^
./tensorflow/core/platform/default/logging.h:197:1: note: in expansion of macro 'TF_DEFINE_CHECK_OP_IMPL'
 TF_DEFINE_CHECK_OP_IMPL(Check_LT, < )
 ^
INFO: From Compiling tensorflow/core/kernels/argmax_op.cc:
gcc: internal compiler error: Killed (program cc1plus)
Please submit a full bug report,
with preprocessed source if appropriate.
See <file:///usr/share/doc/gcc-4.9/README.Bugs> for instructions.
ERROR: /home/pi/programming/tensorflow/tensorflow/core/kernels/BUILD:639:1: C++ compilation of rule '//tensorflow/core/kernels:argmax_op' failed: gcc failed: error executing command 
  (cd /home/pi/.cache/bazel/_bazel_pi/ddee419d9c6b5440629c2870bc1d9b2e/tensorflow && \
  exec env - \
    PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/local/games:/usr/games \
  /usr/bin/gcc -U_FORTIFY_SOURCE '-D_FORTIFY_SOURCE=1' -fstack-protector -Wall -Wunused-but-set-parameter -Wno-free-nonheap-object -fno-omit-frame-pointer -g0 -O2 -DNDEBUG -ffunction-sections -fdata-sections '-std=c++0x' -iquote . -iquote bazel-out/local_linux-opt/genfiles -iquote external/bazel_tools -iquote bazel-out/local_linux-opt/genfiles/external/bazel_tools -iquote external/jpeg_archive -iquote bazel-out/local_linux-opt/genfiles/external/jpeg_archive -iquote external/png_archive -iquote bazel-out/local_linux-opt/genfiles/external/png_archive -iquote external/re2 -iquote bazel-out/local_linux-opt/genfiles/external/re2 -iquote external/eigen_archive -iquote bazel-out/local_linux-opt/genfiles/external/eigen_archive -isystem google/protobuf/src -isystem bazel-out/local_linux-opt/genfiles/google/protobuf/src -isystem external/bazel_tools/tools/cpp/gcc3 -isystem external/jpeg_archive/jpeg-9a -isystem bazel-out/local_linux-opt/genfiles/external/jpeg_archive/jpeg-9a -isystem external/png_archive/libpng-1.2.53 -isystem bazel-out/local_linux-opt/genfiles/external/png_archive/libpng-1.2.53 -isystem external/re2 -isystem bazel-out/local_linux-opt/genfiles/external/re2 -isystem third_party/eigen3 -isystem bazel-out/local_linux-opt/genfiles/third_party/eigen3 -isystem external/eigen_archive/eigen-eigen-db7b61411772 -isystem bazel-out/local_linux-opt/genfiles/external/eigen_archive/eigen-eigen-db7b61411772 -fno-exceptions -DEIGEN_AVOID_STL_ARRAY -pthread -no-canonical-prefixes -fno-canonical-system-headers -Wno-builtin-macro-redefined '-D__DATE__="redacted"' '-D__TIMESTAMP__="redacted"' '-D__TIME__="redacted"' '-frandom-seed=bazel-out/local_linux-opt/bin/tensorflow/core/kernels/_objs/argmax_op/tensorflow/core/kernels/argmax_op.o' -MD -MF bazel-out/local_linux-opt/bin/tensorflow/core/kernels/_objs/argmax_op/tensorflow/core/kernels/argmax_op.d -c tensorflow/core/kernels/argmax_op.cc -o 
bazel-out/local_linux-opt/bin/tensorflow/core/kernels/_objs/argmax_op/tensorflow/core/kernels/argmax_op.o): com.google.devtools.build.lib.shell.BadExitStatusException: Process exited with status 4: gcc failed: error executing command 
  (cd /home/pi/.cache/bazel/_bazel_pi/ddee419d9c6b5440629c2870bc1d9b2e/tensorflow && \
  exec env - \
    PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/local/games:/usr/games \
  /usr/bin/gcc -U_FORTIFY_SOURCE '-D_FORTIFY_SOURCE=1' -fstack-protector -Wall -Wunused-but-set-parameter -Wno-free-nonheap-object -fno-omit-frame-pointer -g0 -O2 -DNDEBUG -ffunction-sections -fdata-sections '-std=c++0x' -iquote . -iquote bazel-out/local_linux-opt/genfiles -iquote external/bazel_tools -iquote bazel-out/local_linux-opt/genfiles/external/bazel_tools -iquote external/jpeg_archive -iquote bazel-out/local_linux-opt/genfiles/external/jpeg_archive -iquote external/png_archive -iquote bazel-out/local_linux-opt/genfiles/external/png_archive -iquote external/re2 -iquote bazel-out/local_linux-opt/genfiles/external/re2 -iquote external/eigen_archive -iquote bazel-out/local_linux-opt/genfiles/external/eigen_archive -isystem google/protobuf/src -isystem bazel-out/local_linux-opt/genfiles/google/protobuf/src -isystem external/bazel_tools/tools/cpp/gcc3 -isystem external/jpeg_archive/jpeg-9a -isystem bazel-out/local_linux-opt/genfiles/external/jpeg_archive/jpeg-9a -isystem external/png_archive/libpng-1.2.53 -isystem bazel-out/local_linux-opt/genfiles/external/png_archive/libpng-1.2.53 -isystem external/re2 -isystem bazel-out/local_linux-opt/genfiles/external/re2 -isystem third_party/eigen3 -isystem bazel-out/local_linux-opt/genfiles/third_party/eigen3 -isystem external/eigen_archive/eigen-eigen-db7b61411772 -isystem bazel-out/local_linux-opt/genfiles/external/eigen_archive/eigen-eigen-db7b61411772 -fno-exceptions -DEIGEN_AVOID_STL_ARRAY -pthread -no-canonical-prefixes -fno-canonical-system-headers -Wno-builtin-macro-redefined '-D__DATE__="redacted"' '-D__TIMESTAMP__="redacted"' '-D__TIME__="redacted"' '-frandom-seed=bazel-out/local_linux-opt/bin/tensorflow/core/kernels/_objs/argmax_op/tensorflow/core/kernels/argmax_op.o' -MD -MF bazel-out/local_linux-opt/bin/tensorflow/core/kernels/_objs/argmax_op/tensorflow/core/kernels/argmax_op.d -c tensorflow/core/kernels/argmax_op.cc -o 
bazel-out/local_linux-opt/bin/tensorflow/core/kernels/_objs/argmax_op/tensorflow/core/kernels/argmax_op.o): com.google.devtools.build.lib.shell.BadExitStatusException: Process exited with status 4.
Target //tensorflow/core/kernels:argmax_op failed to build
INFO: Elapsed time: 2301.254s, Critical Path: 1884.37s

Not sure if this helps, but I figured I should share what I've got.

@petewarden
Contributor

Thanks for the update! That is helpful to see. I got as far as getting protobuf compiling, but I haven't made it to Bazel or TensorFlow itself yet. I'm hoping to work on that over the next few days.

@samjabrahams
Contributor

Should I upload my Pi's Bazel repository/binary/whatever somewhere? It may (or may not) save you some time.

Additionally, I ran the failed gcc command as mentioned in the Bazel error:

gcc -U_FORTIFY_SOURCE '-D_FORTIFY_SOURCE=1' -fstack-protector -Wall -Wunused-but-set-parameter -Wno-free-nonheap-object -fno-omit-frame-pointer -g0 -O2 -DNDEBUG -ffunction-sections -fdata-sections '-std=c++0x' -iquote . -iquote bazel-out/local_linux-opt/genfiles -iquote external/bazel_tools -iquote bazel-out/local_linux-opt/genfiles/external/bazel_tools -iquote external/jpeg_archive -iquote bazel-out/local_linux-opt/genfiles/external/jpeg_archive -iquote external/png_archive -iquote bazel-out/local_linux-opt/genfiles/external/png_archive -iquote external/re2 -iquote bazel-out/local_linux-opt/genfiles/external/re2 -iquote external/eigen_archive -iquote bazel-out/local_linux-opt/genfiles/external/eigen_archive -isystem google/protobuf/src -isystem bazel-out/local_linux-opt/genfiles/google/protobuf/src -isystem external/bazel_tools/tools/cpp/gcc3 -isystem external/jpeg_archive/jpeg-9a -isystem bazel-out/local_linux-opt/genfiles/external/jpeg_archive/jpeg-9a -isystem external/png_archive/libpng-1.2.53 -isystem bazel-out/local_linux-opt/genfiles/external/png_archive/libpng-1.2.53 -isystem external/re2 -isystem bazel-out/local_linux-opt/genfiles/external/re2 -isystem third_party/eigen3 -isystem bazel-out/local_linux-opt/genfiles/third_party/eigen3 -isystem external/eigen_archive/eigen-eigen-db7b61411772 -isystem bazel-out/local_linux-opt/genfiles/external/eigen_archive/eigen-eigen-db7b61411772 -fno-exceptions -DEIGEN_AVOID_STL_ARRAY -pthread -no-canonical-prefixes -fno-canonical-system-headers -Wno-builtin-macro-redefined '-D__DATE__="redacted"' '-D__TIMESTAMP__="redacted"' '-D__TIME__="redacted"' '-frandom-seed=bazel-out/local_linux-opt/bin/tensorflow/core/kernels/_objs/argmax_op/tensorflow/core/kernels/argmax_op.o' -MD -MF bazel-out/local_linux-opt/bin/tensorflow/core/kernels/_objs/argmax_op/tensorflow/core/kernels/argmax_op.d -c tensorflow/core/kernels/argmax_op.cc -o bazel-out/local_linux-opt/bin/tensorflow/core/kernels/_objs/argmax_op/tensorflow/core/kernels/argmax_op.o

And got the following message:

In file included from ./tensorflow/core/kernels/argmax_op.h:20:0,
                 from tensorflow/core/kernels/argmax_op.cc:24:
./third_party/eigen3/unsupported/Eigen/CXX11/Tensor:1:67: fatal error: eigen-eigen-db7b61411772/unsupported/Eigen/CXX11/Tensor: No such file or directory
 #include "eigen-eigen-db7b61411772/unsupported/Eigen/CXX11/Tensor"

This is my project for the next few days, so I'll try to post progress here. Let me know if any of these messages are irrelevant and I'll remove those bits so I'm not clogging this thread.

@samjabrahams
Contributor

Well, I added a USB drive as swap space (RIP that drive soon), pulled the latest files from yesterday and tried it again, and after 3 hours it looks like I got tutorials_example_trainer compiled! Going to try compiling the pip package now.
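For anyone repeating this, one quick way to see how much RAM and swap the Pi has before starting a multi-hour Bazel build is to read `/proc/meminfo`. This is only a sketch: the helper name `parse_meminfo` is made up for illustration, and it assumes the usual Linux `Key:   value kB` layout.

```python
from __future__ import print_function


def parse_meminfo(text):
    """Return a dict mapping /proc/meminfo keys to their numeric values (kB for sizes)."""
    sizes = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        fields = rest.split()
        # Skip lines that are not "Key: <number> ..." shaped.
        if not sep or not fields or not fields[0].isdigit():
            continue
        sizes[key.strip()] = int(fields[0])
    return sizes


if __name__ == "__main__":
    try:
        with open("/proc/meminfo") as f:
            info = parse_meminfo(f.read())
    except (IOError, OSError):
        info = {}  # not on Linux, or /proc is unavailable
    for key in ("MemTotal", "SwapTotal", "SwapFree"):
        if key in info:
            print("%s: %.0f MB" % (key, info[key] / 1024.0))
```

If `SwapTotal` comes back as 0, the build is running on physical RAM alone.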

@samjabrahams
Contributor

I believe I've got it working. I removed the swap drive after installing and am able to run the base MNIST tutorial scripts from tensorflow/examples/tutorials/mnist. I can also play with the TensorFlow package in a Python REPL, and a loose look at htop indicates that it's using system resources pretty well.

I've got the process documented decently. I'm going to clean up the instructions, test them on another Pi, attempt to install straight from the wheel file, and try out some other shenanigans. Hopefully I'll have this knocked out over the weekend!

@petewarden
Contributor

Thanks Sam! I managed to get the label_image example compiling and running on my Pi 2. Here are my notes:

  • I mostly followed the Jetson instructions too.
  • There was an error in GPUBFCAllocator that meant I had to comment out a couple of lines.
  • I had to link in the rt library to fix a clock_gettime() linking error, by adding "-lrt" to label_image's linkopts.
  • The resulting binary ran extremely slowly, taking over a minute to run the Inception network. I believe this is because the compiler is not using NEON by default (since the Pi 1 doesn't have that). I will be retrying with NEON enabled.
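To check whether the board itself reports NEON before blaming the compiler flags, the kernel's feature list can be inspected. The sketch below is an assumption-laden illustration, not part of the build: `cpu_features` is a hypothetical helper, and it relies on the `Features` line that 32-bit ARM Linux kernels print in `/proc/cpuinfo` (64-bit ARM kernels list `asimd` instead of `neon`).

```python
from __future__ import print_function


def cpu_features(cpuinfo_text):
    """Collect the flags listed on every 'Features' line of /proc/cpuinfo."""
    features = set()
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() == "features":
            features.update(value.split())
    return features


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as f:
            flags = cpu_features(f.read())
    except (IOError, OSError):
        flags = set()  # not on Linux, or /proc is unavailable
    print("NEON reported by kernel:", "neon" in flags)
```

This only says what the hardware supports. Whether GCC actually emits NEON code depends on options such as `-mfpu=neon`, and how those are passed through Bazel is not shown here.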

@samjabrahams
Contributor

Excellent! I was hoping to try out the process on a Pi 2 as well. A couple of responses and additional notes:

  • The GPUBFCAllocator compiling issues are most likely due to changes made in ab6ffc9. On my first go-through (on Friday, just before I pulled in that commit), Bazel didn't complain. The next day (after that commit), I started going through the process again on another Pi to make sure I had it documented properly, but I got errors at gpu_bfc_allocator.cc (probably the same as what you saw).
  • I think the clock_gettime() linking problem will be a recurring theme: running the MNIST example gives completely inaccurate time-to-train values now. On second glance, things appear to be working correctly for the MNIST training. Disregard this bullet!
  • I don't have a good idea of how fast the Pi should be running these models, so let me know if you see improvements with NEON!
  • I have been able to compile the distributed runtime, so I'm hoping to play with inter-device communication between Pis and a Mac today.

@samjabrahams
Contributor

As another data point, I compiled and ran the label_image binary, and it took much less than a minute. I did not play around with any of the compiler settings. It could be a difference between the Pi 2 and the Pi 3. I'm also running Raspbian 8.0.

Output:

W tensorflow/core/kernels/batch_norm_op.cc:36] Op is deprecated. It will cease to work in GraphDef version 9. Use tf.nn.batch_normalization().
I tensorflow/examples/label_image/main.cc:207] military uniform (866): 0.647298
I tensorflow/examples/label_image/main.cc:207] suit (794): 0.0477194
I tensorflow/examples/label_image/main.cc:207] academic gown (896): 0.0232409
I tensorflow/examples/label_image/main.cc:207] bow tie (817): 0.0157354
I tensorflow/examples/label_image/main.cc:207] bolo tie (940): 0.0145024

@vrv

vrv commented Mar 13, 2016

Sorry about the RPi breakage; it should be fixed in d2a06c2.

@samjabrahams
Contributor

Thanks @vrv, I rebuilt from source on a RPi3 and can confirm that it appears to work fine.

I'll release an unofficial step-by-step guide on how I built the standard TensorFlow binaries from source specifically for the Raspberry Pi 3, as well as a link to a pre-compiled pip wheel. The wheel worked on a fresh Pi without any headaches, so I'm hoping people can use that if they don't absolutely have to go through the whole process.

Once a "correct" build process is established, what kind of tests should be run on the build to make you guys comfortable with putting a Raspberry Pi/Raspbian-targeted wheel on PyPI? Or would that be way too much maintenance overhead for you all? I only ask because I think it'd be pretty sexy to be able to pip install tensorflow on a little RPi :P

@samjabrahams
Contributor

Here's the process I used to get TensorFlow running.

@vmayoral
Author

Awesome work! Thanks @samjabrahams for putting everything together. Many of us tried this out. Glad someone succeeded.

Closing this now.

@samjabrahams
Contributor

Thanks @vmayoral for all the effort you put into this on both the TensorFlow and Bazel fronts. You got the momentum for this rolling in the first place. Let me know if the instructions/wheel file work for you!

@mihow

mihow commented Mar 14, 2016

Great work @samjabrahams and everyone else! I look forward to testing this, and I think it's an exciting contribution for the machine learning + physical computing world overall.

@tuong-olli

I installed TensorFlow on a BeagleBone Black with Docker successfully, but when I run a prediction with my model, it fails with:
Illegal instruction (core dumped)

@petewarden
Contributor

As an update, I believe the original "pure virtual method called" crash should be addressed by this solution from the Raspberry Pi Stack Exchange site: https://raspberrypi.stackexchange.com/questions/48225/whats-causing-these-crashes-after-cross-compiling

I'm testing it for the NEON-enabled Pi wheel we're hoping to produce in #11675.

@petewarden
Contributor

As an update, the fix mentioned in the answer above did seem to work.

@voitgxd

voitgxd commented Aug 3, 2017

@vmayoral I have the same issue.

sess = tf.Session()
pure virtual method called
terminate called without an active exception
Aborted
How did you fix it?

@akkaman007

I followed your steps and got no errors while installing TensorFlow, but when I tried to use it, I got this error:

import tensorflow as tf
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ImportError: No module named tensorflow
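
One possible cause, offered as an assumption rather than a diagnosis of this report, is that the wheel was installed for a different Python interpreter than the one being started (for example `pip` tied to one Python version while `python` launches another). A minimal sketch for checking which interpreter is running and whether it can see the package follows; `module_available` is a hypothetical helper name.

```python
from __future__ import print_function

import sys


def module_available(name):
    """Return True if `name` imports cleanly in the interpreter running this code."""
    try:
        __import__(name)
    except ImportError:
        return False
    return True


if __name__ == "__main__":
    print("interpreter:", sys.executable)
    print("version:", sys.version.split()[0])
    print("tensorflow importable:", module_available("tensorflow"))
```

Running the script with the same command used to start the failing session (`python`, `python3`, and so on) and comparing the printed interpreter path with the one pip installed into should show whether the two match.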
