Radically smaller sysimage; excise Pkg/REPL from it, and maybe LinearAlgebra (or at least openBLAS) #50833

Closed
PallHaraldsson opened this issue Aug 8, 2023 · 6 comments
Labels
performance Must go faster

Comments

@PallHaraldsson
Contributor

PallHaraldsson commented Aug 8, 2023

The PR is at PallHaraldsson#4. You can skip to the final comment, past the (then) work-in-progress text; most of what follows here at the top may be invalid:

When I comment out :REPL from sysimg.jl and build Julia (again) apparently nothing bad happens, and the sysimage still shrinks, and with it Julia startup, which is more-or-less linear in the size of the sysimage.

Probably most of REPL is brought in (but why not all?) because Pkg depends on it. I can now excise Pkg if I keep all its dependencies in, including :p7zip_jll, which is missing from sysimg.jl.

I realize that this excision alone isn't a big win (note that the REPL still operates, so why is it there?), so I'm investigating excising more, e.g. the big-ticket LinearAlgebra and others; dropping LinearAlgebra alone makes the sysimage 4% smaller. With it additionally commented out I can still start Julia and I don't run into trouble (it does break matrix multiply until I do using LinearAlgebra, so such a simple change would be a breaking change for 2.0, but I'm looking at what can be done in 1.x).

Another way we could reduce the sysimage is to excise all doc/help texts from it: 1.5% of it, 2.6 MB (and growing; possibly an overestimate of the gain):

$ strings -n 40 usr_lib_julia/sys.so |wc
  38268  261726 2768701
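
As a rough cross-check of that estimate, here is a minimal Python sketch of what `strings -n 40 ... | wc` measures: it counts runs of printable ASCII of at least 40 bytes and sums their lengths (plus one newline each). The function name `long_string_stats` is made up for illustration, and the matching rule only approximates `strings`, so treat the result as an estimate, not an exact equivalent.

```python
import re

def long_string_stats(data: bytes, min_len: int = 40):
    """Approximate `strings -n MIN_LEN FILE | wc`.

    Returns (number of printable runs, total bytes including one
    trailing newline per run, as `wc` would count them).
    """
    # Printable ASCII plus tab, repeated at least min_len times.
    pattern = re.compile(rb"[\t\x20-\x7e]{%d,}" % min_len)
    runs = pattern.findall(data)
    total_bytes = sum(len(run) + 1 for run in runs)
    return len(runs), total_bytes

# The byte count reported above, converted to MiB.
reported_bytes = 2768701
reported_mib = reported_bytes / 2**20
print(f"{reported_mib:.2f} MiB")
```

To apply it to a real file, read the sysimage in binary mode and pass the bytes in. Not every long printable run is a docstring, which is one reason the figure may overestimate the gain.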

I do get a smaller sysimage when excising all but Pkg and LinearAlgebra, and while I think a PR could be made, the sysimage is only 0.05% smaller, though strangely startup is 0.74% faster. The big, but breaking, change comes with additionally excising LinearAlgebra (I believe it can be excised in a non-breaking way, but that's more work than I'd like to do; one way would be dropping BLAS and using naive matmul until using LinearAlgebra), or likely (and then non-breaking) if Pkg is excised. Or both, or some of Base additionally.

I got a 1-161.2/174.0 = 7.4% faster startup (for min) with a sysimage containing only Pkg (or rather, Pkg and whatever it may have brought in). I'm not fully sure of my timing: I thought I had disabled the web browser, but I had only suspended the wrong process, not the master process:

$ hyperfine './julia -e ""'
Benchmark 1: ./julia -e ""
  Time (mean ± σ):     182.4 ms ±  16.7 ms    [User: 113.5 ms, System: 69.0 ms]
  Range (min … max):   161.4 ms … 200.9 ms    14 runs

I believe my machine is not loaded (web browser suspended), so the variation in min and mean is a bit worrying:

$ hyperfine './julia -e ""'
Benchmark 1: ./julia -e ""
  Time (mean ± σ):     199.0 ms ±   3.4 ms    [User: 117.7 ms, System: 81.1 ms]
  Range (min … max):   194.2 ms … 204.1 ms    14 runs
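
To put the numbers side by side, here is a small Python sketch. It reproduces the 7.4% figure from the 161.2 ms and 174.0 ms minimums quoted above, then computes the spread between the minimums of the two hyperfine runs shown. The helper name `relative_speedup` is made up for illustration, and the spread is only a noise estimate if both runs used the same build, which the thread implies but does not state outright.

```python
def relative_speedup(new_ms: float, old_ms: float) -> float:
    """Fraction by which new_ms is faster than old_ms (1 - new/old)."""
    return 1.0 - new_ms / old_ms

# Figures quoted in the text: min startup with the trimmed sysimage vs. before.
speedup = relative_speedup(161.2, 174.0)
print(f"speedup: {speedup:.1%}")

# Minimums of the two hyperfine runs shown above.
min_run1, min_run2 = 161.4, 194.2
spread = (min_run2 - min_run1) / min_run1
print(f"run-to-run spread in min: {spread:.1%}")
```

If the spread between repeated runs is larger than the claimed speedup, the speedup cannot be separated from noise without more runs on a quieter machine.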

Seemingly I can't excise Pkg, but I would really like to be able to (along with its dependencies, e.g. Downloads, which I believe should go in 2.0; strangely, I can excise Pkg's dependencies). I'm unclear on why it fails (it seems to hang), while e.g. excising REPL is ok.

With my past excision I got:

Base64 ─────────── 5.963757 seconds
..
Pkg ────────────── 20.987424 seconds

Note, these timings change depending on what else is excised.

I'm currently excising all but Pkg, which is the problem, including most importantly :LinearAlgebra, which I think accounts for almost all the reduction in sysimage size, because in effect Pkg is holding the reduction back. The reduction, as is, isn't as large as I hoped for; what would it take to reduce it massively? I think Base could also be made smaller (in the sysimage, but I'm not sure how).

Base 48.235718 seconds
Pkg ─ 38.298616 seconds
Stdlibs total 38.307447 seconds
Sysimage built. Summary:
Base ──────── 48.235718 seconds 55.734%
Stdlibs ───── 38.307447 seconds 44.2624%

Note that the timing for Pkg just goes up the more I excise its dependencies, i.e. what I think is happening is that it effectively forces them into the sysimage. But I don't want Pkg in the sysimage (nor its dependencies). It's not needed most of the time.
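
The effect described here amounts to a transitive closure: listing a module pulls in everything it depends on, whether or not those dependencies are listed themselves. Below is a minimal Python sketch of that idea. The graph is deliberately partial and contains only edges mentioned in this thread (Pkg needing REPL, Markdown and p7zip_jll; Markdown needing Base64); it is not the real stdlib dependency graph, and the function name `effective_contents` is made up for illustration.

```python
# Partial dependency graph built only from edges mentioned in this thread.
# It is NOT the full stdlib graph; modules without an entry are treated as leaves.
DEPS = {
    "Pkg": ["REPL", "Markdown", "p7zip_jll"],
    "Markdown": ["Base64"],
}

def effective_contents(requested, deps=DEPS):
    """Return the requested modules plus everything they transitively need."""
    seen = set()
    stack = list(requested)
    while stack:
        name = stack.pop()
        if name in seen:
            continue
        seen.add(name)
        # Queue this module's dependencies, if any are known.
        stack.extend(deps.get(name, []))
    return seen

print(sorted(effective_contents(["Pkg"])))
```

Under this model, removing a dependency from the list while keeping Pkg changes nothing about what ends up in the image; the build work just gets attributed to Pkg instead.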

More radical, i.e. excising ALL stdlibs, meaning additionally Pkg, gets me a much better timing (for building) but doesn't work:

Sysimage built. Summary:
Base ────────  47.605037 seconds 99.9933%
Stdlibs ─────   0.000005 seconds 9.92475e-6%
Total ───────  47.608228 seconds
    JULIA usr/lib/julia/sys-o.a
Collecting and executing precompile statements
└ Collect (Basic: ◒ , REPL 0/0: ◒ 382) => Execute ◒ 33┌ Warning: The call to compilecache failed to create a usable precompiled cache file for Markdown [d6f4376e-aef5-505a-96c1-9c027394607a]
│   exception = ErrorException("Required dependency Base64 [2a0f44e3-6c83-55bd-87e4-b1978d98bd5f] failed to load from a cache file.")
└ @ Base loading.jl:1986
└ Collect (Basic: ✓ 913, REPL 0/0: ◐ 402) => Execute ◐ 506┌ Warning: Failed to precompile expression
│   form = precompile(Tuple{typeof(InteractiveUtils.include), String})
│   exception = UndefVarError(:InteractiveUtils)
└ @ nothing nothing:0
└ Collect (Basic: ✓ 913, REPL 0/0: ◓ 402) => Execute ◓ 562 (1 failed)┌ Warning: Failed to precompile expression
│   form = precompile(Tuple{typeof(Base.convert), Type{Function}, InteractiveUtils.var"#2#3"{Bool, InteractiveUtils.var"#8#22", String}})
│   exception = UndefVarError(:InteractiveUtils)
└ @ nothing nothing:0
┌ Warning: Failed to precompile expression
│   form = precompile(Tuple{typeof(InteractiveUtils.define_default_editors)})
│   exception = UndefVarError(:InteractiveUtils)
└ @ nothing nothing:0
┌ Warning: Failed to precompile expression
│   form = precompile(Tuple{typeof(Core.kwcall), NamedTuple{(:wait,), Tuple{Bool}}, typeof(InteractiveUtils.define_editor), Function, Array{String, 1}})
│   exception = UndefVarError(:InteractiveUtils)
└ @ nothing nothing:0
┌ Warning: Failed to precompile expression
│   form = precompile(Tuple{typeof(Core.kwcall), NamedTuple{(:wait,), Tuple{Bool}}, typeof(InteractiveUtils.define_editor), Function, Array{Base.Regex, 1}})
│   exception = UndefVarError(:InteractiveUtils)
└ @ nothing nothing:0
└ Collect (Basic: ✓ 913, REPL 0/0: ◑ 402) => Execute ◑ 659 (5 failed)

@KristofferC, it seemingly hangs there, so I can't excise everything, and Pkg is the reason (not Base64 as I first thought). Does anyone know why it might hang?

I can actually try to excise Pkg, but keep some of its dependencies in:

I'm now down to the following, with more packages in, with better timing and hopefully a smaller sysimage and faster startup (if I get there...):
Base ───────────── 47.120218 seconds
Markdown ───────── 6.928026 seconds
Base64 ─────────── 0.000132 seconds
InteractiveUtils ─ 0.745049 seconds
Stdlibs total ──── 7.682920 seconds
Sysimage built. Summary:
Base ──────── 47.120218 seconds 85.9757%
Stdlibs ───── 7.682920 seconds 14.0183%
JULIA usr/lib/julia/sys-o.a
Collecting and executing precompile statements
└ Collect (Basic: ✓ 718, REPL 0/0: ◒ 229) => Execute ◒ 477

It seemingly hangs there (at another time it hung at 500, not 477, in case that's helpful info). Which of these (and possibly more) are dependencies of Pkg?

@brenhinkeller brenhinkeller added the performance Must go faster label Aug 8, 2023
@ViralBShah
Member

I thought removing LinearAlgebra would have a much bigger impact on sysimage size than 4%.

@PallHaraldsson
Contributor Author

PallHaraldsson commented Aug 9, 2023

Most of it is in a stdlib that's not going away or getting smaller (nor is BLAS going away, but it could; I just looked at the reduction in the sysimage itself). What's in the sysimage is * and some other operators (all of them?); I only confirmed it went away, and `using LinearAlgebra` recovers it (and [presumably] everything).

JuliaLang/Pkg.jl#3570 is blocking further progress here, not LinearAlgebra. [EDIT: I can excise Pkg from the sysimage, but then it no longer works. It should still be possible to NOT have it in the sysimage and have it working.]

@KristofferC
Member

Pkg was already moved out once. It was put back because it had some strange behaviors with the automatic precompilation that is going on. But once that is fixed it should be fine to move it out again.

@PallHaraldsson
Contributor Author

PallHaraldsson commented Aug 9, 2023

:p7zip_jll seems to be missing from the sysimg.jl file; I'm trying now with all Pkg dependencies:

I got past the hang with it (and could likely trim now further):

Base  ────────── 50.967063 seconds
Artifacts  ─────  6.826454 seconds
Dates  ─────────  2.372306 seconds
Downloads  ─────  2.946057 seconds
FileWatching  ──  0.000143 seconds
LibGit2  ───────  2.346177 seconds
Libdl  ─────────  0.000141 seconds
Logging  ───────  0.053288 seconds
Markdown  ──────  0.954717 seconds
Printf  ────────  0.000103 seconds
REPL  ────────── 16.221375 seconds
Random  ────────  0.915210 seconds
SHA  ───────────  0.000153 seconds
Serialization  ─  0.343131 seconds
TOML  ──────────  0.086998 seconds
Tar  ───────────  0.377395 seconds
UUIDs  ─────────  0.021489 seconds
p7zip_jll  ─────  0.021181 seconds
Stdlibs total  ─ 33.496940 seconds
Sysimage built. Summary:
Base ────────  50.967063 seconds 60.3381%
Stdlibs ─────  33.496940 seconds 39.6559%
Total ───────  84.469062 seconds
    JULIA usr/lib/julia/sys-o.a
Collecting and executing precompile statements
└ Collect (Basic: ✓ 856, REPL 22/22: ✓ 1024) => Execute ✓ 1220
Precompilation complete. Summary:
Total ───────  67.736688 seconds
Outputting sysimage file...
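
The percentages in the build summary can be checked directly from the timings. This Python sketch uses the three totals printed above and shows that the shares are taken relative to the reported Total, which is slightly larger than Base plus Stdlibs. The helper name `share` is made up for illustration.

```python
# Timings copied from the build summary above (seconds).
base_s = 50.967063
stdlibs_s = 33.496940
total_s = 84.469062

def share(part: float, total: float) -> float:
    """Percentage of the total build time taken by one part."""
    return 100.0 * part / total

base_pct = share(base_s, total_s)
stdlibs_pct = share(stdlibs_s, total_s)
# Time counted in Total but attributed to neither Base nor the stdlibs.
overhead_s = total_s - base_s - stdlibs_s

print(f"Base:     {base_pct:.4f}%")
print(f"Stdlibs:  {stdlibs_pct:.4f}%")
print(f"Overhead: {overhead_s:.6f} s")
```

Keep in mind that these are shares of build time, which is only a proxy for how much each part contributes to sysimage size.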

Predictably I can no longer go into Pkg mode, which is ok for my purposes, but it should be possible. Right now the sysimage is 30% smaller with no bad effects (Pkg being out is not bad). New startup timing is coming, but I want to reduce further first.

This now works:

stdlibs = [
    :Artifacts, :Dates, :Downloads, :FileWatching, :LibGit2, :Libdl, :Logging, :Markdown,
    :Printf, :REPL, :Random, :SHA, :Serialization, :TOML, :Tar, :UUIDs, :p7zip_jll,

    # Excising :LinearAlgebra is also ok, but seemingly you then need to expand the list above.
    # Excising it alone gives a 4% smaller sysimage than the full original list.
    :LinearAlgebra,
]

@KristofferC could the order matter?

Note (to myself): it doesn't hang (it finishes) if I see numbers this high:
└ Collect (Basic: ✓ 832, REPL 22/22: ✓ 1009) => Execute ✓ 1204

For some reason, though, I see these numbers when working with another list of stdlibs:
└ Collect (Basic: ✓ 832, REPL 22/22: ✓ 1063) => Execute ✓ 1220

PallHaraldsson added a commit to PallHaraldsson/julia that referenced this issue Aug 9, 2023
All of Pkg's dependencies seemingly need to be in, at least :Markdown, even with Pkg excised.

30% smaller sysimage is possible, see at: JuliaLang#50833 (comment)
@PallHaraldsson
Contributor Author

PallHaraldsson commented Aug 9, 2023

It works!

It's now 30.2% faster and 37.5% smaller (with just REPL and p7zip_jll; the previous record was 37.4% with Markdown, Printf, REPL, p7zip_jll). Before that it was 36% smaller and 30% faster (a less precise measurement, i.e. with the web browser on), which is what's in the PR if I recall.

Base ────── 50.109432 seconds
REPL ────── 24.246528 seconds
p7zip_jll ─ 0.052952 seconds
Stdlibs total 24.309599 seconds
Sysimage built. Summary:
Base ──────── 50.109432 seconds 67.3309%
Stdlibs ───── 24.309599 seconds 32.6643%
Total ─────── 74.422584 seconds

This longer list didn't work (likely since REPL is missing) but SHOULD work, and should reduce the sysimage a lot, i.e. 18.4066% vs 32.6643% above:
Base ────── 49.924721 seconds
Base64 ──── 6.370239 seconds
Artifacts ─ 0.338055 seconds
Dates ───── 2.366586 seconds
Markdown ── 0.952169 seconds
Printf ──── 0.000098 seconds
Random ──── 1.174201 seconds
p7zip_jll ─ 0.052071 seconds
Stdlibs total 11.263258 seconds
Sysimage built. Summary:
Base ──────── 49.924721 seconds 81.588%
Stdlibs ───── 11.263258 seconds 18.4066%

TODO: Excise Base... well part of Base, could it (or Core) be made a stdlib and lazily loaded on first use? There's a lot in Base that's never used (at least for startup).

@PallHaraldsson
Contributor Author

PallHaraldsson commented Aug 9, 2023

I would like my PR considered (and why did the PR not end up at JuliaLang?), i.e. for PkgEval. The list was clearly (mostly) redundant, at least much shorter with fewer levels than:

$ julia contrib/print_sorted_stdlibs.jl
    # Stdlibs sorted in dependency, then alphabetical, order by contrib/print_sorted_stdlibs.jl
    stdlibs = [
        # No dependencies
        :ArgTools,
[..]
        # 1-depth packages
        :CompilerSupportLibraries_jll,
        :DelimitedFiles,
[..]
        # 9-depth packages
        :Statistics,
        :SuiteSparse,

This was attempted in #49135, then reverted, but should at some point be retried. I'm ok with Pkg just being disabled for now, but in the end it must work.

@PallHaraldsson PallHaraldsson changed the title Radically smaller sysimage; excise REPL (at least from it), and maybe LinearAlgebra... or at least openBLAS Radically smaller sysimage; excise Pkg/REPL from it, and maybe LinearAlgebra (or at least openBLAS) Aug 9, 2023
@vtjnash vtjnash closed this as completed Oct 27, 2023
5 participants