Segfaults when building, running #3899

Closed
jasonmhite opened this issue Aug 1, 2013 · 16 comments

@jasonmhite

I'm getting segmentation faults when trying to build on 64-bit Arch Linux. At first, the Arch PKGBUILD file from here was failing with an error

/bin/sh: line 1: 28546 Segmentation fault      (core dumped) /home/jmhite/Build/julia-git/src/julia/usr/bin/julia-release-readline -bf sysimg.jl

and I thought it was because of using makepkg instead of building it directly. So, I tried directly building from a Git clone, but I get the same error. I've tried various combinations of using and not using system libraries, ranging from using no system libraries to using pretty much all system libraries, also with and without MKL, but the error is the same. Note that in particular I tried both using and not using the system version of readline since that was mentioned in the docs.

Anyway, I thought I would try it on a different computer. This one is configured very similarly to the problem machine, except the build works just fine on this one! The only real difference I can think of is that the working computer is a second-gen Intel i7 whereas the problem child is a fourth-gen i7. They have pretty similar sets of packages installed and as far as I remember are otherwise configured very similarly.
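
One way to test the CPU-generation theory is to compare the feature flags the two machines report. The sketch below assumes each machine's /proc/cpuinfo has been saved to a file (the file names old.cpuinfo and new.cpuinfo, and both function names, are hypothetical) and prints the flags that appear only in the first file. It does not show that any of those flags is responsible for the crash.

```shell
# cpu_flags FILE: print the feature flags from a saved /proc/cpuinfo,
# one per line, sorted.
cpu_flags() {
    grep -m1 '^flags' "$1" | cut -d: -f2 | tr ' ' '\n' | grep -v '^$' | LC_ALL=C sort -u
}

# flags_only_in A B: print flags present in cpuinfo dump A but missing from dump B.
flags_only_in() {
    only_a=$(mktemp) && only_b=$(mktemp) || return 1
    cpu_flags "$1" > "$only_a"
    cpu_flags "$2" > "$only_b"
    LC_ALL=C comm -23 "$only_a" "$only_b"
    rm -f "$only_a" "$only_b"
}

# Usage (hypothetical file names):
#   cat /proc/cpuinfo > new.cpuinfo      # on the fourth-gen machine
#   cat /proc/cpuinfo > old.cpuinfo      # on the second-gen machine
#   flags_only_in new.cpuinfo old.cpuinfo
```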

Also, I tried copying the successfully built package from the working machine to the problem machine and installing it. The Julia REPL runs fine and I can even do basic stuff like 1+1, but if I try to do some other things it segfaults. E.g., doing Pkg.install("ZMQ") segfaults and exits. But, on the other machine the same command works fine...

What on earth is going on?

@staticfloat
Member

Interesting. Can you cd to <julia_root>/base (where <julia_root> is the root of the git repository), run /home/jmhite/Build/julia-git/src/julia/usr/bin/julia-release-readline -bf sysimg.jl, and post everything it prints? If nothing too interesting comes out, can you gist the output of strace /home/jmhite/Build/julia-git/src/julia/usr/bin/julia-release-readline -bf sysimg.jl? That will give us an idea of what libraries are being used by Julia while trying to build the system image.

It sounds to me like there is some incompatible library that is being loaded by Julia on one machine but not on the other, although it's possible there is a problem with the CPU architecture as well. Which git commit are you building? (Running git rev-parse HEAD inside the Julia git repo will print it.)
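
To pull the library list out of a saved strace log, something like the following could be used. It is only a sketch: the function name list_loaded_libs and the log file name julia.strace are made up for illustration, and it assumes the log contains open or openat lines in strace's usual format, with failed calls shown as returning -1.

```shell
# list_loaded_libs LOG: print the shared libraries that were opened successfully,
# according to a saved strace log (one unique path per line).
list_loaded_libs() {
    grep -E 'open(at)?\(' "$1" \
        | grep -v ' = -1 ' \
        | grep -oE '"[^"]*\.so(\.[0-9]+)*"' \
        | tr -d '"' \
        | LC_ALL=C sort -u
}

# Usage, from <julia_root>/base (julia.strace is a hypothetical file name):
#   strace -f -o julia.strace ../usr/bin/julia-release-readline -bf sysimg.jl
#   list_loaded_libs julia.strace
```

Comparing this list between the working and the failing machine would show any library that is resolved from a different location on one of them.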

@jasonmhite
Author

All I get from running it directly is

could not open file boot.jl
[1]    3149 segmentation fault (core dumped)  ./julia-release-readline -bf sysimg.jl

So I ran strace as you asked; the output is here. The git SHA is b6b4f02.

Both the working and non-working system should have the same libraries and versions installed, which should pretty much be the latest versions, whatever is available in the Arch repos. I can give you a list of installed packages on both machines if you like. But, I also tried building with no system libraries at all and it still failed.

Thanks for the help so far!

@jasonmhite
Author

Oh, I see in that strace output that it's trying to load the version of readline that ships with the Intel compilers. I wonder if that is the problem?

Interestingly, if I run julia-release-readline on the working system it also segfaults. For some reason it doesn't seem to kill the build process on that machine (or have any real effect).
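
If an unexpected readline is being picked up, one place to look is LD_LIBRARY_PATH, since directories listed there are searched before the system default directories. Whether the Intel compiler setup on this machine actually modifies that variable is an assumption, not something this thread confirms. The helper below (find_in_ld_path is a made-up name) lists files in each LD_LIBRARY_PATH directory whose names start with a given prefix:

```shell
# find_in_ld_path PREFIX: print every file in an LD_LIBRARY_PATH directory
# whose name starts with PREFIX (for example, libreadline).
# The body runs in a subshell so the IFS change does not leak to the caller.
find_in_ld_path() (
    IFS=:
    for dir in $LD_LIBRARY_PATH; do
        [ -n "$dir" ] || continue
        for f in "$dir"/"$1"*; do
            if [ -e "$f" ]; then
                printf '%s\n' "$f"
            fi
        done
    done
)

# Usage:
#   find_in_ld_path libreadline
#   ldd ../usr/bin/julia-release-readline | grep readline   # link-time dependencies as the loader resolves them
```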

@staticfloat
Member

Is there a boot.jl in <julia_root>/base/? Additionally, you need to run julia-release-readline within the base directory.

@jasonmhite
Author

Ah, ok. There is a boot.jl. I was not running it in the base directory, here's the output when I do:

exports.jl
base.jl
reflection.jl
promotion.jl
build_h.jl
c.jl
range.jl
tuple.jl
cell.jl
expr.jl
error.jl
bool.jl
number.jl
int.jl
operators.jl
pointer.jl
float.jl
reduce.jl
complex.jl
rational.jl
abstractarray.jl
subarray.jl
array.jl
bitarray.jl
intset.jl
dict.jl
set.jl
iterator.jl
inference.jl
osutils.jl
[1]    6386 segmentation fault (core dumped)  ../usr/bin/julia-release-readline -bf sysimg.jl

And the strace output.
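
The list above appears to be the files that sysimg.jl pulls in, in order, so the last name printed narrows down where the crash occurs. Assuming the build output has been saved to a file (build.log and the function name last_loaded are hypothetical), a small helper can pick that name out. Whether a name is printed before or after the file finishes loading is not established in this thread, so the crash may lie in that file or just after it.

```shell
# last_loaded LOG: print the last line of LOG that consists only of a .jl file name.
# Lines with spaces (such as the shell's segfault report) are ignored.
last_loaded() {
    grep -E '^[A-Za-z0-9_./-]+\.jl$' "$1" | tail -n 1
}

# Usage, from <julia_root>/base (build.log is a hypothetical file name):
#   ../usr/bin/julia-release-readline -bf sysimg.jl > build.log 2>&1
#   last_loaded build.log
```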

@staticfloat
Member

Interesting. Can you build a debug version of Julia (make debug in the main Julia directory)? It will likely fail during system image generation again, but the executable will be sitting in <julia_root>/usr/bin/, called julia-debug-readline or something similar. We'll use that path to build the system image inside of gdb:

$ cd base/
$ gdb --args ../usr/bin/julia-debug-readline -bf sysimg.jl

Once gdb starts up, just enter r to run the program. It'll crash eventually, at which point you'll have a (gdb) prompt. If you enter bt, it will print a backtrace showing where execution failed. Please post that backtrace, it would be most helpful!
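
The same backtrace can also be captured without typing at the (gdb) prompt, which makes it easier to paste into a gist. This is a sketch using gdb's batch mode; the wrapper name gdb_backtrace and the output file name are hypothetical:

```shell
# gdb_backtrace PROGRAM [ARGS...]: run PROGRAM under gdb in batch mode and,
# once it stops (for example on a segfault), print a backtrace.
gdb_backtrace() {
    gdb -batch -ex run -ex bt --args "$@" 2>&1
}

# Usage, from <julia_root>/base (backtrace.txt is a hypothetical file name):
#   gdb_backtrace ../usr/bin/julia-debug-readline -bf sysimg.jl > backtrace.txt
```

If the program exits normally instead of crashing, gdb has no stack to report, so an empty backtrace here would itself be informative.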

@jasonmhite
Author

Here's the backtrace.

I can dig through with GDB more if you tell me what to look for.

@Keno
Member

Keno commented Aug 1, 2013

Try commenting in the call to verifyFunction in gf.c and see what happens.

@jasonmhite
Author

There isn't a call to verifyFunction in src/gf.c. A quick grep of the code says that verifyFunction only appears in codegen.cpp, around line 251 (and it's already commented out). We're talking about this file and this one, right?

@staticfloat
Member

Yes. Comment that back in and remake.

@staticfloat
Member

EDIT: I suppose I should say, (Un)comment that back in and remake. Also, I would do the one before the FPM->run(*f); that is, the one on line 252.

@jasonmhite
Author

Sorry for the delay, I was on a business trip.

I commented in lines 251-252 and 256-259 of codegen.cpp. Note that I left line 254 commented out as it gives an error that n_compile is not declared if I leave it in.

Here's the full output from running make.

@jasonmhite
Author

Incidentally, I tried building one of the source tarballs (v0.1.2) and I think some of the links that the makefile fetches dependencies from are broken. Wget gives me access denied messages.

@StefanKarpinski
Member

We don't, as far as I know, support building from tarballs. GitHub insists on providing tarball downloads, but that isn't a supported way to build julia – if you want to build from source, you have to use git.

@jasonmhite
Author

Makes sense. I just wanted to see if the old version would build, but I guess it doesn't matter.

@ViralBShah
Member

I think we need to archive the dependencies when we release since some projects do not keep old versions around.

IanButterworth pushed a commit that referenced this issue Jun 4, 2024
Stdlib: Pkg
URL: https://github.com/JuliaLang/Pkg.jl.git
Stdlib branch: master
Julia branch: master
Old commit: ed7a8dca8
New commit: 4e43058c2
Julia version: 1.12.0-DEV
Pkg version: 1.12.0
Bump invoked by: @IanButterworth
Powered by:
[BumpStdlibs.jl](https://github.com/JuliaLang/BumpStdlibs.jl)

Diff:
JuliaLang/Pkg.jl@ed7a8dc...4e43058

```
$ git log --oneline ed7a8dca8..4e43058c2
4e43058c2 Merge pull request #3887 from carlobaldassi/validate_versions
bc7c3207d abort querying more pacakge for hint auto complete (#3913)
a4016aed2 precompile repl switch (#3910)
a48c9c645 Fixed glitch in the manual (#3912)
d875aa213 Add timeout and new tests for resolver
aeb55f7f0 run artifact selection code with minimal compilation (#3899)
0180a0105 avoid doing some checks if the package will not be showed in status output (#3897)
c6c7ed502 improve precompilation for `st` in the Pkg REPL (#3893)
bffd0633c Add version validation during Graph simplification
c2ad07003 Fix padding in resolve's log journal printing
3eb86d29f Revert #2267, with better log message
acdbb727e Small extra check in Graph's check_consistency
1d446c224 Fix small bug in Graph constructor
3efc3cbff Fix show method for VersionSpecs
```

Co-authored-by: Dilum Aluthge <dilum@aluthge.com>