Assertion failure in lookup_root()
#48707
Comments
Thanks for the report. Can you provide a reproducer? Due to teaching deadlines it could be ~1 week before I can get to this.
If you can't provide a reproducer, maybe a bit of explanation will help. The code here is part of the solution for the problem described most completely in the top set of bullets of the OP in #42016. The system image is One way this could break is if you're inserting some weird custom build steps where the standard rules of "either we're building a sysimg or we're precompiling a pkgimg" for some reason aren't true. For example, in the REPL you can force addition of new roots with no module-buildid provenance. But the clean environment of standard package precompilation should not make that possible (I think).
Thanks @timholy, for the explanation. We're seeing these assertion failures (segfaults if assertions are turned off) after a sequence of warnings like:
Followed by:
Our test suite first runs
@vchuravy pointed out to us that the workers must be started with the same arguments as the Julia process that did the precompilation (in particular, bounds checking, code coverage, and optimization level), so we're trying to tweak things so that they don't try to precompile again. This will hopefully either fix these assertion failures or eliminate one possible reason for them. Will keep you posted. Cc: @NHDaly
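To check whether a worker really was started with the same settings as the process that precompiled, one option is to print the relevant options in each process and compare them. The following is a minimal sketch, not something from this thread: `cache_relevant_flags` is a hypothetical helper name, and `Base.JLOptions()` is an internal, unexported API whose fields may change between Julia versions. It only covers the three options named above; other options may also affect cache validity.

```julia
# Hypothetical helper (not from the thread): gather the options that were
# named as needing to match between the precompiling process and the workers.
function cache_relevant_flags()
    opts = Base.JLOptions()  # internal, unexported struct of startup options
    return (
        check_bounds  = Int(opts.check_bounds),   # --check-bounds setting
        code_coverage = Int(opts.code_coverage),  # --code-coverage setting
        opt_level     = Int(opts.opt_level),      # -O level
    )
end

# Print this in the precompiling process and in each worker, then compare.
println(cache_relevant_flags())
```

If the printed tuples differ between the process that ran precompilation and a worker, that mismatch is one candidate explanation for the worker trying to precompile again.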
Thanks for the extra details. FWIW, I built a debug build of |
🤔 I'm not sure why the test's worker processes should even have to be doing any package compilation in parallel, since from what I understand, Pkg runs here, I think?: So it should be that everything is already precompiled, right?
This is looking more like a package or Pkg bug and less like a problem in the method roots code. I'm at least temporarily unassigning myself since that's less obviously an area of expertise for me. But I'll be happy to keep poking at it as more info comes in. FWIW I did a |
Our tests are running with multiple distributed worker processes, so a race seems likely, indeed. How can we help with debugging the possible Pkg bug?
You can run with |
Thanks, great suggestion. We're seeing messages like these, which imply to us that somehow the
then later:
(We think the second, Do you know why the cached .ji file should be built with |
This is how we're invoking the tests:
@time Pkg.test("RAICode"; julia_args=[Base.julia_cmd().exec[2:end]..., "-p", "$(nworkers)", "--eval", code])
from a Julia process started as:
Yeah, agreed. But doesn't called from: |
Yes, |
Thanks, Kristoffer. We'll have to do some more digging, it seems :/
Running with FORCE_ASSERTIONS=1, we're seeing either:
Or:
Quite regularly. A full stack trace for one of these is below. Still trying to get an rr trace, but hopefully this will be useful anyway.
Full stack trace.