monerod Segfaulting on FreeBSD #4063

Closed
marmulak opened this issue Jun 27, 2018 · 17 comments

@marmulak

marmulak commented Jun 27, 2018

System Details:

FreeBSD 11.1-RELEASE-p11 i386 (32 bit)
CPU: Intel(R) Atom(TM) CPU N270 @ 1.60GHz (1596.04-MHz 686-class CPU)

Everything on the system is generic prebuilt binaries, from the base system to binary packages installed from the pkg repo.

Problem Scenario:

Installed the monero-cli-0.12.0.0 package from FreeBSD's package repo, which includes monerod. I run monerod on the command line with no arguments. It starts up and segfaults in about 10-15 seconds. Verbose logs don't seem to indicate what the problem is.

Terminal Output:

2018-06-27 11:32:00.996 0x22a14000 INFO global src/daemon/main.cpp:280 Monero 'Lithium Luna' (v0.12.0.0-master-release)
2018-06-27 11:32:00.999 0x22a14000 INFO daemon src/daemon/main.cpp:282 Moving from main() into the daemonize now.
2018-06-27 11:32:01.000 0x22a14000 INFO global src/daemon/protocol.h:53 Initializing cryptonote protocol...
2018-06-27 11:32:01.000 0x22a14000 INFO global src/daemon/protocol.h:58 Cryptonote protocol initialized OK
2018-06-27 11:32:01.003 0x22a14000 TRACE blockchain src/cryptonote_core/blockchain.cpp:161 Blockchain::Blockchain
2018-06-27 11:32:01.005 0x22a14000 INFO global src/daemon/p2p.h:63 Initializing p2p server...
Illegal instruction (core dumped)

GDB Output:

Program terminated with signal 4, Illegal instruction.
#0 0x014d2289 in ?? ()

Steps Taken:

To try to resolve the issue I've been building monero from source, cloning from github recursively and following the build instructions. The binaries build successfully with clang 4.0.0, and the error is exactly the same--segfault after startup.

I fiddled with it for several hours, trying to build with gcc (failed) and 'make debug' also fails. Also tried forcing software AES in the off-chance that that was the problem, but apparently that makes no difference.

This likely points to a bug in monero's code.
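An "Illegal instruction" signal usually means the CPU hit an opcode it could not execute, so one thing worth ruling out is a binary built for CPU features this processor lacks. The sketch below is not part of the original report. It assumes the FreeBSD boot log format, where feature flags appear on lines such as `Features=0x...<FPU,...>`, and the function name `cpu_features` is made up for illustration.

```python
import re

# Matches FreeBSD boot-log lines such as "Features=0x...<A,B>" and
# "Features2=0x...<C,D>"; the flag names sit between the angle brackets.
FEATURE_LINE = re.compile(r"Features\d*=0x[0-9a-fA-F]+<([^>]*)>")


def cpu_features(dmesg_text):
    """Return the set of CPU feature names found in boot-log text."""
    features = set()
    for group in FEATURE_LINE.findall(dmesg_text):
        features.update(name.strip() for name in group.split(",") if name.strip())
    return features
```

On FreeBSD the text could come from `/var/run/dmesg.boot`, for example `"AESNI" in cpu_features(open("/var/run/dmesg.boot").read())`.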

@moneromooo-monero
Collaborator

"bt" when in gdb after it crashed.

@marmulak
Author

This GDB was configured as "i386-marcel-freebsd"...(no debugging symbols found)...
Core was generated by `monerod'.
Program terminated with signal 4, Illegal instruction.
#0 0x014d2289 in ?? ()
(gdb) bt
#0 0x014d2289 in ?? ()
#1 0x0183575c in ?? ()
#2 0x22a0c000 in ?? ()
#3 0x22a2e570 in ?? ()
#4 0x221da1e8 in ?? ()
#5 0x221d155c in ?? ()
#6 0xbfbfdf44 in ?? ()
#7 0x55627415 in ?? ()
#8 0x00000000 in ?? ()
(gdb)

@moneromooo-monero
Collaborator

Build a debug binary, and get a trace again.

@marmulak
Author

Latest attempt at "make debug":
http://paste.debian.net/hidden/e89e07d6/

@moneromooo-monero
Collaborator

moneromooo-monero commented Jun 27, 2018

Looks like a compiler or linker bug.
Check the file is not a 0 byte file, just in case the compiler died and make didn't notice.

@marmulak
Author

I tried a slightly different compiler/linker configuration, but the error is identical:

Scanning dependencies of target wallet
[ 80%] Linking CXX shared library libwallet.so
CMakeFiles/obj_wallet.dir/wallet2.cpp.o: file not recognized: File format not recognized
clang-6.0: error: linker command failed with exit code 1 (use -v to see invocation)
*** Error code 1

When I go to monero/build/debug/CMakeFiles/ there simply is no obj_wallet.dir directory, let alone the object file. Up until that point everything went smoothly, and obj_wallet built successfully right before this.

@marmulak
Author

Never mind, I located the object file. I was looking in the wrong directory. wallet2.cpp.o is 69 MB.

@moneromooo-monero
Collaborator

Then it's a bug in your toolchain.

@moneromooo-monero
Collaborator

moneromooo-monero commented Jun 28, 2018

Out of curiosity, can you run file on that object file, and on another object file from the monero tree:

file CMakeFiles/obj_wallet.dir/wallet2.cpp.o

@marmulak
Author

marmulak commented Jun 28, 2018

Quick update: I'm still working on getting a debug build so I can provide you with a nice stack trace on the segfault. It's taking time because it involved me upgrading FreeBSD to 11.2-RELEASE and setting up gcc6 to attempt a build with that as well. If I can get a working build and it segfaults we can focus on that for this ticket, and I can file any clang build problems as a separate issue.

What may have happened with wallet2.cpp.o is clang may have silently aborted after running out of RAM on my system and delivering an unfinished file, which might explain why it wouldn't link. I've enabled swap on this system so we'll see if my theory is correct. If that fails I'll check the file like you suggested.
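The two checks discussed so far (a 0-byte file, or a file the compiler left unfinished) can be approximated with a short script. This is a minimal sketch, not something used in the thread: the function name `find_suspect_objects` is hypothetical, and it only flags `.o` files that are empty or do not start with the ELF magic bytes. An object that was cut off after a valid header would pass unnoticed.

```python
import os

ELF_MAGIC = b"\x7fELF"  # first four bytes of every ELF file


def find_suspect_objects(root):
    """Walk a build tree and list .o files that are empty or not ELF."""
    suspects = []
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            if not name.endswith(".o"):
                continue
            path = os.path.join(dirpath, name)
            if os.path.getsize(path) == 0:
                suspects.append((path, "empty"))
                continue
            with open(path, "rb") as fh:
                if fh.read(4) != ELF_MAGIC:
                    suspects.append((path, "no ELF magic"))
    return suspects
```

Running it as `find_suspect_objects("build/debug")` would cover the debug tree mentioned above.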

@marmulak
Author

marmulak commented Jun 28, 2018

From building with clang (after the "file format not recognized" linker error):

$ file ./build/debug/src/wallet/CMakeFiles/obj_wallet.dir/wallet2.cpp.o
./build/debug/src/wallet/CMakeFiles/obj_wallet.dir/wallet2.cpp.o: ELF 32-bit LSB relocatable, Intel 80386, version 1 (FreeBSD), stripped

$ file ./build/debug/src/rpc/CMakeFiles/obj_rpc.dir/core_rpc_server.cpp.o
./build/debug/src/rpc/CMakeFiles/obj_rpc.dir/core_rpc_server.cpp.o: ELF 32-bit LSB relocatable, Intel 80386, version 1 (FreeBSD), too many section headers (40297)

$ file ./build/debug/src/rpc/CMakeFiles/obj_daemon_messages.dir/message.cpp.o
./build/debug/src/rpc/CMakeFiles/obj_daemon_messages.dir/message.cpp.o: ELF 32-bit LSB relocatable, Intel 80386, version 1 (FreeBSD), with debug_info, not stripped

This is a selection of file outputs for compiled objects in the build tree. The output seems to vary a lot. I think the last of the three is what we're supposed to be seeing. (Some other files gave output identical to that one.) I watched my RAM during compile, and with clang/llvm RAM was not the issue.
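The section count that file prints can be cross-checked by reading it straight from the ELF header. The sketch below is an illustration rather than anything run in this thread, and the name `elf_section_count` is made up. It reads the `e_shnum` field and, when that field is 0 but a section table exists, falls back to the `sh_size` field of section header 0, which is where the ELF format stores counts too large for `e_shnum`. As far as I know, file prints "too many section headers" when the count exceeds its own configurable limit, so the message on its own may not prove the object is malformed.

```python
import struct


def elf_section_count(data: bytes) -> int:
    """Return the number of section headers declared by an ELF image."""
    if data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    is_64 = data[4] == 2                    # EI_CLASS: 1 = 32-bit, 2 = 64-bit
    endian = "<" if data[5] == 1 else ">"   # EI_DATA: 1 = little-endian
    if is_64:
        header_fmt, sh_size_offset, sh_size_fmt = "16sHHIQQQIHHHHHH", 32, "Q"
    else:
        header_fmt, sh_size_offset, sh_size_fmt = "16sHHIIIIIHHHHHH", 20, "I"
    fields = struct.unpack_from(endian + header_fmt, data, 0)
    e_shoff, e_shnum = fields[6], fields[12]
    if e_shnum == 0 and e_shoff != 0:
        # Extended numbering: the real count lives in sh_size of section 0.
        return struct.unpack_from(endian + sh_size_fmt, data,
                                  e_shoff + sh_size_offset)[0]
    return e_shnum
```

For an object file on disk, `elf_section_count(open(path, "rb").read())` gives the count to compare against the output above.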

@moneromooo-monero
Collaborator

"too many section headers" may be the problem, if the compiler outputs it and the linker doesn't like it.

@marmulak
Author

40297 does seem like a lot of section headers

@iDunk5400
Contributor

iDunk5400 commented Jun 28, 2018

Maybe try this

index cab8535..5d5272e 100644
--- a/CMakeLists.txt
+++ b/CMakeLists.txt
@@ -820,6 +820,9 @@ elseif(APPLE OR OPENBSD OR ANDROID)
   set(EXTRA_LIBRARIES "")
 elseif(FREEBSD)
   set(EXTRA_LIBRARIES execinfo)
+  if(NOT BUILD_64)
+    set(CMAKE_CXX_FLAGS_DEBUG "${CMAKE_CXX_FLAGS_DEBUG} -Wa,-mbig-obj")
+  endif()
 elseif(DRAGONFLY)
   find_library(COMPAT compat)
   set(EXTRA_LIBRARIES execinfo ${COMPAT})

/EDIT: Nvm, that's Windows only.

@marmulak
Author

The patch failed: clang reports that -mbig-obj is an invalid argument. Also check this:

$ file build/release/bin/monerod
build/release/bin/monerod: ELF 32-bit LSB shared object, Intel 80386, version 1 (FreeBSD), dynamically linked, interpreter /libexec/ld-elf.so.1, for FreeBSD 11.2, FreeBSD-style, not stripped

@marmulak
Author

marmulak commented Jun 29, 2018

Hrm, after compiling the latest time, the daemon is now syncing and does not appear to be crashing. I'll let it run for a while just to make sure, but it looks like some code change pulled over the last couple days fixed whatever was going wrong.

@marmulak
Author

marmulak commented Jun 29, 2018

It looks like updating to FreeBSD 11.2 has fixed the segfault issue, so we should probably close this issue and open a new one for the "make debug" problem on FreeBSD.

Out of curiosity, I downloaded the binary package from the repo and installed it on the updated 11.2 system. It still crashes like before, so the problem is with that binary. Two things changed: FreeBSD's toolchain had a major update in 11.2 (clang/llvm 4.0 -> 6.0, among other things), and the monero source has received some updates. I suppose it's likely the former that made the difference.
