Rework stdlib caching #48069
This was introduced in https://github.com/JuliaLang/julia/pull/45222/files to allow QuickerSort to use randomized pivot selection but still function (albeit with deterministic pivot selection) without Random loaded. I think that the best solution will be to move the complex and high performance sorting algorithms out of base and into a stdlib that depends on Random. The sorting stdlib would still pirate |
That sounds bad. There are already a bunch of headaches with stdlibs committing type piracy, so I don't think adding more is a good idea, is it? |
Another TODO is perhaps to move the precompile stuff from Would also be interesting to run a PkgEval here just to see how bad the type piracy from LinearAlgebra and Random is. |
That's indeed my first TODO :) That needs to happen before we do an honest evaluation of the latency of this change. |
A difference between Sorting and Random would be that |
This is side tracking the discussion of the PR a bit but why would we not want |
My understanding is that we need a basic sort for the compiler that doesn't need to be hyper-optimized, so we have a simple implementation for that. Then later, we define all the fancier radixsort et al algorithms that may have other dependencies that are optimized. |
I would want
Yep! It is a heap sort (unstable, n log n, rarely slower than 3x the runtime of Base.sort, and sometimes faster, implementation here). Unfortunately, we can't use it as the default without Sorting loaded because we guarantee stability by default. |
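For readers unfamiliar with the trade-off mentioned above, here is a minimal sketch of an in-place heap sort. It is not the implementation linked in the comment; the names `sift_down!` and `heapsort!` are hypothetical, and the sketch assumes a 1-based `AbstractVector` whose elements support `isless`. It runs in O(n log n) and needs no dependency on Random, but it is unstable, which is why it could not serve as a default that promises stability.

```julia
# Hypothetical helper: restore the max-heap property for the subtree
# rooted at `root`, considering only indices 1:last.
function sift_down!(v::AbstractVector, root::Int, last::Int)
    while true
        child = 2 * root
        child > last && return v
        # Pick the larger of the two children, if a right child exists.
        if child < last && isless(v[child], v[child + 1])
            child += 1
        end
        # Stop once the parent is not smaller than its largest child.
        isless(v[root], v[child]) || return v
        v[root], v[child] = v[child], v[root]
        root = child
    end
end

# Hypothetical name: sorts `v` in place in ascending order. Not stable.
function heapsort!(v::AbstractVector)
    n = length(v)
    # Build a max-heap bottom-up.
    for i in (n ÷ 2):-1:1
        sift_down!(v, i, n)
    end
    # Repeatedly move the maximum to the end and shrink the heap.
    for last in n:-1:2
        v[1], v[last] = v[last], v[1]
        sift_down!(v, 1, last - 1)
    end
    return v
end
```

Calling `heapsort!(copy(x))` on a vector `x` should give the same ordering as `sort(x)` whenever equal elements are indistinguishable; when they are distinguishable, their relative order may differ.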
The goal here is to remove stdlibs permanently from the sysimg. So we will have to figure out a solution to this (and yes this might mean including the code from two locations) |
If replacing the method overwrite with an ordinary invalidation would work, I could hack that out pretty easily:
sort.jl
"""
_select_pivot(lo::Integer, hi::Integer, ::Any) -> deterministic
_select_pivot(lo::Integer, hi::Integer, ::Nothing) -> randomized
This internal method exists to allow the Random stdlib to redefine Base.Sort's `select_pivot`
function without triggering Revise's method overwrite errors. Base.Sort defines the `::Any`
method and Random defines the `::Nothing` method. The function is always called with
`nothing` as its third argument.
"""
_select_pivot(lo::Integer, hi::Integer, ::Any) = typeof(hi-lo)(hash(lo) % (hi-lo+1)) + lo
select_pivot(lo::Integer, hi::Integer) = _select_pivot(lo::Integer, hi::Integer, nothing)
Random.jl
_select_pivot(lo::Integer, hi::Integer, ::Nothing) = rand(lo:hi)
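To make the proposal concrete, here is a self-contained sketch of the same dispatch trick using two stand-in modules. `MiniSort` and `MiniRandom` are hypothetical names that play the roles of Base.Sort and Random; nothing here is the actual Base code. The point it illustrates is that the second module adds a new, more specific method (`::Nothing`) rather than overwriting the existing `::Any` method, so loading it changes behavior through ordinary dispatch.

```julia
# Hypothetical stand-in for Base.Sort.
module MiniSort
# Fallback used while no randomized method is defined: deterministic pivot.
_select_pivot(lo::Integer, hi::Integer, ::Any) =
    typeof(hi - lo)(hash(lo) % (hi - lo + 1)) + lo
# Always called with `nothing`, so a `::Nothing` method takes priority once it exists.
select_pivot(lo::Integer, hi::Integer) = _select_pivot(lo, hi, nothing)
end

# Before the "Random" stand-in is loaded, pivot selection is deterministic.
det1 = MiniSort.select_pivot(1, 10)
det2 = MiniSort.select_pivot(1, 10)

# Hypothetical stand-in for the Random stdlib.
module MiniRandom
import ..MiniSort
# Adds a more specific method; the `::Any` method is left untouched.
MiniSort._select_pivot(lo::Integer, hi::Integer, ::Nothing) = rand(lo:hi)
end

# After loading, the same entry point dispatches to the randomized method.
r = MiniSort.select_pivot(1, 10)
```

Because `Nothing` is more specific than `Any`, later calls to `MiniSort.select_pivot` reach the randomized method, while `methods(MiniSort._select_pivot)` lists both definitions.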
Can we just have a separate non-public sorting function that the compiler can use that does whatever? |
We already have |
Just very naive question. Would it make sense to create a single |
Partially yes, I was thinking of something like |
976fbbe to 195056e
195056e to a1a9194
12c1f14 to 28e6493
28e6493 to 50d4632
LGTM
Am I missing something? |
Should have been more clear about this, but this PR is no longer removing all stdlibs from the sysimage. |
Are all the empty |
We can move them into a separate folder or something, but at least my Makefile-fu doesn't let me avoid them... We don't know the cache suffix we will choose. |
New
In discussion with @KristofferC I decided that it might be better to do this step by step.
So for now this just updates the infrastructure to generate cache files for "upgradable stdlibs" like DelimitedFiles and SparseArrays.
Old
This is mostly meant to start a discussion about how to get to this point.
One of my motivations behind package images was to make it possible to upgrade stdlibs
independently of Julia. For this we need to move stdlibs out of the system image and
instead cache them with package images.
Right now REPL is the only stdlib still included in the sysimage here, since otherwise precompilation will simply hang.
Results
sys.so shrinks from 173M to 96M
julia --startup-file=no startup time improves to 77ms from 127ms on my machine
method_table_insert