This repository has been archived by the owner on Apr 22, 2023. It is now read-only.

make test fails miserably on Raspberry Pi (ARM) #4565

Closed
Pitel opened this issue Jan 11, 2013 · 14 comments

@Pitel

Pitel commented Jan 11, 2013

[43:54|% 100|+ 175|- 341]: Done

Most of the tests fail like this:

=== release test-child-process-buffering ===                    
Path: simple/test-child-process-buffering
node: ../deps/uv/src/unix/linux/linux-core.c:210: uv__io_poll: Assertion `fd >= 0' failed.
Command: out/Release/node /home/pi/node/test/simple/test-child-process-buffering.js
=== release test-child-process-customfd-bounded ===                    
Path: simple/test-child-process-customfd-bounded
node: ../deps/uv/src/unix/linux/linux-core.c:210: uv__io_poll: Assertion `fd >= 0' failed.
Command: out/Release/node /home/pi/node/test/simple/test-child-process-customfd-bounded.js
=== release test-child-process-cwd ===                                
Path: simple/test-child-process-cwd
node: ../deps/uv/src/unix/linux/linux-core.c:210: uv__io_poll: Assertion `fd >= 0' failed.
Command: out/Release/node /home/pi/node/test/simple/test-child-process-cwd.js

It runs much better on my standard x86_64 Ubuntu (just 2 tests failing).

@bnoordhuis
Member

What does fd contain? You should be able to inspect it in gdb:

$ ulimit -c unlimited # turn on coredumps
$ out/Release/node test/simple/test-child-process-buffering
$ gdb out/Release/node core
> backtrace full

@Pitel
Author

Pitel commented Jan 11, 2013

pi@raspberrypi ~/node $ gdb out/Release/node core
GNU gdb (GDB) 7.4.1-debian
Copyright (C) 2012 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "arm-linux-gnueabihf".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /home/pi/node/out/Release/node...done.
[New LWP 8258]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/arm-linux-gnueabihf/libthread_db.so.1".
Core was generated by `out/Release/node test/simple/test-child-process-buffering'.
Program terminated with signal 6, Aborted.
#0  0xb6c67bfc in raise () from /lib/arm-linux-gnueabihf/libc.so.6
(gdb) backtrace full
#0  0xb6c67bfc in raise () from /lib/arm-linux-gnueabihf/libc.so.6
No symbol table info available.
#1  0xb6c6b97c in abort () from /lib/arm-linux-gnueabihf/libc.so.6
No symbol table info available.
#2  0xb6d67258 in ?? () from /lib/arm-linux-gnueabihf/libc.so.6
No symbol table info available.
#3  0xb6d67258 in ?? () from /lib/arm-linux-gnueabihf/libc.so.6
No symbol table info available.
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

@bnoordhuis
Member

Right. You probably need a debug build to get a meaningful backtrace. make -C out BUILDTYPE=Debug compiles out/Debug/node.

@Pitel
Author

Pitel commented Jan 11, 2013

pi@raspberrypi ~/node $ ulimit -c unlimited

pi@raspberrypi ~/node $ out/Debug/node test/simple/test-child-process-buffering
node: ../deps/uv/src/unix/linux/linux-core.c:216: uv__io_poll: Assertion `(unsigned) fd < loop->nwatchers' failed.
Aborted (SIGABRT) (core dumped)

pi@raspberrypi ~/node $ gdb out/Debug/node core
GNU gdb (GDB) 7.4.1-debian
Copyright (C) 2012 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "arm-linux-gnueabihf".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /home/pi/node/out/Debug/node...done.
[New LWP 28422]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/arm-linux-gnueabihf/libthread_db.so.1".
Core was generated by `out/Debug/node test/simple/test-child-process-buffering'.
Program terminated with signal 6, Aborted.
#0  0xb6cd7bfc in raise () from /lib/arm-linux-gnueabihf/libc.so.6
(gdb) backtrace full
#0  0xb6cd7bfc in raise () from /lib/arm-linux-gnueabihf/libc.so.6
No symbol table info available.
#1  0xb6cdb97c in abort () from /lib/arm-linux-gnueabihf/libc.so.6
No symbol table info available.
#2  0x00000072 in ?? ()
No symbol table info available.
#3  0x00000072 in ?? ()
No symbol table info available.
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

@TooTallNate

I can repro the failing uv__io_poll assert on the master branch on my Pi as well.

@bnoordhuis
Member

@Pitel Thanks. Interesting that a debug build doesn't produce much of a useful backtrace either.

@TooTallNate Can you give me a shell account on your Pi? (Provided it's connected to the Internet.)

@TooTallNate

@bnoordhuis Send me your SSH pub key and access shall be granted, provided you promise to not take down http://n8.io :p

@bnoordhuis
Member

I won't and thanks. :-) My key is here: https://gist.github.com/e8a37ac6983f9fd55d4f

@TooTallNate

@bnoordhuis Ok you should be good to go (ssh pi@n8.io).

@hhchristian

Change
https://github.com/joyent/libuv/blob/master/src/unix/linux/syscalls.h#L74

struct uv__epoll_event {
  __u32 events;
  __u64 data;
} __attribute__((packed));

to

struct uv__epoll_event {
  __u32 events;
  __u32 data;
} __attribute__((packed));

to fix this issue for ARM.

@bnoordhuis
Member

@hhchristian That doesn't look correct to me. sizeof(struct epoll_event) == 12 on all architectures. It might be an alignment issue but that'd be odd, it should be naturally aligned to at least 4 bytes.

I'm logged into Nate's Pi so we'll find out in a couple of hours -- once node is compiled.

@hhchristian

The "Raspbian" distribution (Debian for Raspberry Pi) defines epoll_event in

/usr/include/arm-linux-gnueabihf/sys/epoll.h

with the following structure:

typedef union epoll_data
{
  void *ptr;
  int fd;
  uint32_t u32;
  uint64_t u64;
} epoll_data_t;

struct epoll_event
{
  uint32_t events;  /* Epoll events */
  epoll_data_t data;    /* User data variable */
};

This looks similar to the vanilla kernel definition:
http://www.kernel.org/doc/man-pages/online/pages/man2/epoll_ctl.2.html

Libuv uses another definition of epoll_event in

/src/unix/linux/syscalls.h
struct uv__epoll_event {
  __u32 events;
  __u64 data;
} __attribute__((packed));

A pointer to this structure is passed with the epoll_ctl() syscall.

From my point of view, there's no reason for libuv to use its own definition of this type.
struct epoll_event is only used in uv__epoll_ctl(), which is a wrapper around the epoll_ctl() syscall.
epoll_ctl() uses the first 32 bits (the "events" field) of the passed epoll_event structure for bit flags, and the "fd" field to store the file descriptor.
When the epoll_data_t union is redefined in libuv as a __u64, the byte order becomes significant.
Raspberry Pi is working in Big Endian format, x86/64 uses Little Endian.

The correct solution would be to use the standard epoll_event structure which is defined in "sys/epoll.h".
If this isn't possible for some reason, the "data" field of uv__epoll_event has to be changed in order to support Little Endian platforms.
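
To make the byte-order point concrete, here is a minimal C sketch. It is not libuv code: `union data_view` is a hypothetical local copy of the epoll_data union quoted above, and both helper functions are made up for illustration. It assumes a 32-bit int. The sketch stores a descriptor through the `fd` member and reads it back through the `u64` member.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical local copy of the union quoted above, for illustration only. */
union data_view {
  void *ptr;
  int fd;
  uint32_t u32;
  uint64_t u64;
};

/* Returns 1 on a little-endian host, 0 otherwise. */
static int host_is_little_endian(void) {
  const uint16_t probe = 1;
  uint8_t first;
  memcpy(&first, &probe, 1);
  return first == 1;
}

/* Store fd through the `fd` member, then read all 64 bits back. */
static uint64_t fd_seen_as_u64(int fd) {
  union data_view v;
  v.u64 = 0;   /* clear all 8 bytes first */
  v.fd = fd;   /* overwrites only the first 4 bytes in memory */
  return v.u64;
}
```

On a little-endian host the descriptor lands in the low half of `u64`, on a big-endian host in the high half. The result depends on byte order only because the write and the read go through different members; a value written and read through the same 64-bit member comes back unchanged on either byte order.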

@bnoordhuis
Member

> From my point of view, there's no reason for libuv to use its own definition of this type.

It's a workaround for very old distros (or rather, very old versions of glibc) that don't have epoll syscall wrappers. Probably less relevant now that RHEL 4 is EOL. I might remove it one day.

e4f2a14 should address this; I couldn't get the debug build to link (it was getting OOM killed) and I'm waiting for the release build to compile but the libuv tests are passing.

I was wrong about sizeof(struct epoll_event) always being 12 bytes. 64-bit values require 8-byte alignment on ARM (which kind of makes sense); ergo, the struct is 16 bytes big.
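
The size difference can be checked with a short C sketch. The two struct names below are hypothetical stand-ins, not the libuv or glibc types, and `__attribute__((packed))` assumes GCC or Clang. The sketch prints the size of each layout and the offset of its `data` member for whatever ABI it is compiled on.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for a packed definition: no padding is inserted. */
struct event_packed {
  uint32_t events;
  uint64_t data;
} __attribute__((packed));

/* Hypothetical stand-in for an unpacked definition: the compiler may pad. */
struct event_natural {
  uint32_t events;
  uint64_t data;
};

/* Print the size of each layout and where its `data` member starts. */
static void print_layouts(void) {
  printf("packed:  size=%zu, data at offset %zu\n",
         sizeof(struct event_packed), offsetof(struct event_packed, data));
  printf("natural: size=%zu, data at offset %zu\n",
         sizeof(struct event_natural), offsetof(struct event_natural, data));
}
```

Calling `print_layouts()` from `main` on the target machine shows which case applies. The packed layout is 12 bytes with `data` at offset 4. On an ABI that aligns 64-bit integers to 8 bytes, the unpacked layout should come out at 16 bytes with `data` at offset 8; on an ABI that aligns them to 4 bytes inside structs, such as 32-bit x86, both layouts are the same 12 bytes.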

@hhchristian

> I was wrong about sizeof(struct epoll_event) always being 12 bytes. 64-bit values require 8-byte alignment on ARM (which kind of makes sense); ergo, the struct is 16 bytes big.

The __attribute__((packed)) overrides the standard alignment, so the struct will be 12 bytes big and the data member is no longer 8-byte aligned. I think this is not good. If libuv brings its own definition of epoll_event, it can, or should, also define epoll_data_t.

To get this working on ARM with Big Endian without changing the definition of uv__epoll_event, the access to the data field in void uv__io_poll(uv_loop_t* loop, int timeout) has to be changed:

https://github.com/joyent/libuv/blob/master/src/unix/linux/linux-core.c#L150
e.data = w->fd;

https://github.com/joyent/libuv/blob/master/src/unix/linux/linux-core.c#L213
fd = pe->data;
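
What a read like `fd = pe->data` returns when the two layouts disagree can be simulated in plain C, without epoll. This is only a sketch under a stated assumption: records are written with the 16-byte layout described earlier in the thread (events at offset 0, data at offset 8) and then read back with the 12-byte packed layout. `struct packed_event` and all three helper functions are hypothetical names, not libuv code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical copy of the 12-byte packed layout discussed above. */
struct packed_event {
  uint32_t events;
  uint64_t data;
} __attribute__((packed));

enum { WIDE_STRIDE = 16, WIDE_DATA_OFFSET = 8 };

/* Write record i the way a 16-byte layout would: events at +0, data at +8. */
static void write_wide(unsigned char *buf, int i, uint32_t events, uint64_t data) {
  unsigned char *rec = buf + (size_t)i * WIDE_STRIDE;
  memset(rec, 0, WIDE_STRIDE);              /* padding bytes are zeroed */
  memcpy(rec, &events, sizeof events);
  memcpy(rec + WIDE_DATA_OFFSET, &data, sizeof data);
}

/* Read the data field of record i assuming the 16-byte layout. */
static uint64_t read_wide(const unsigned char *buf, int i) {
  uint64_t data;
  memcpy(&data, buf + (size_t)i * WIDE_STRIDE + WIDE_DATA_OFFSET, sizeof data);
  return data;
}

/* Read the data field of record i assuming the 12-byte packed layout. */
static uint64_t read_packed(const unsigned char *buf, int i) {
  struct packed_event ev;
  memcpy(&ev, buf + (size_t)i * sizeof ev, sizeof ev);
  return ev.data;
}
```

With two records holding descriptors 5 and 9, `read_wide` returns 5 and 9, while `read_packed` returns neither value, because it picks up padding and neighbouring fields instead. A descriptor recovered that way would not be expected to satisfy a check such as `(unsigned) fd < loop->nwatchers`.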
