Pall haraldsson patch 1 #2

PallHaraldsson · 2023-08-09T17:50:55Z

32% smaller sysimage, with 25% faster startup (measured with a web browser running). More slimming may be coming, along with more precise timings.

…liaLang#50374) * ensure GC_FINAL_STATS is consistent with new page metadata layout

This link used to exist in the docs up to v0.2, but it was removed in e91294f as it was pointing to a doc page that was removed in ef0c44d. This change restores the link in the original place, pointing to the up-to-date location.

* Set `VERSION` to `1.11.0-DEV` * move NEWS to HISTORY Co-authored-by: KristofferC <kristoffer.carlsson@juliacomputing.com>

* Extend ifelse lifting to regular SROA * Fix oracle violation This is a pre-existing bug, but was exposed by my improvements to SROA.

Added in JuliaLang#50168

…Lang#50259) * fix(stdlib/Dates/periods.jl): conversion of empty CompoundPeriod to zero units * add(stdlib/Dates/test/periods.jl): add test for empty CompoundPeriod

Co-authored-by: Gabriel Baraldi <baraldigabriel@gmail.com>

The new commands are \guillemotleft and \guillemotright, respectively. These commands are in line with the corresponding commands defined in the LaTeΧ package csquotes. Co-authored-by: Steven G. Johnson <stevenj@mit.edu>

Our inlining cost model is extremely primitive, though surprisingly functional given its limitations. The basic idea for it was just that we'd give every intrinsic the approximate cost in cycles, such that for sufficiently large functions (>100 cycles), the cost of the extra call would be dwarfed by the cost of the function.

However, there are a few problems with this. For one, the real issue is usually not the extra overhead of the call (which is small and well-predicted), but rather the inhibition of optimizations that inlining might have allowed. Additionally, the relevant cost comparison is not generally latency, but rather the size of the resulting binary. Lastly, the latency metric is misleading on modern superscalar architectures, because the core will perform other tasks while the operation is executing. In fact, somewhat counter-intuitively, this means that it is *more* important to inline high-latency instructions, to allow the compiler to perform better latency hiding by spreading out the high-latency instructions.

We probably need a full-on rethink of the inlining model at some point, but for the time being, this fixes a problem that I ran into in real code by reducing the inlining cost for floating point division to be the same as that of floating point multiplication. The particular case where I saw this was the batched forward AD rule for division, which had 6 calls to div_float. Inlining these provided substantially better performance.
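To make the division case concrete, here is a minimal sketch of what a batched forward-mode rule for division can look like. The `BatchDual` type and its fields are hypothetical, invented for illustration, and are not the AD code referred to above. The point is only that a single rule contains several floating point divisions, each of which lowers to `div_float`.

```julia
# Hypothetical batched dual number: one primal value plus N partial derivatives.
struct BatchDual{N}
    val::Float64
    partials::NTuple{N,Float64}
end

# Quotient rule: d(a/b) = (da - (a/b) * db) / b.
# One division for the primal plus one per partial, so N + 1 divisions in total.
function Base.:/(a::BatchDual{N}, b::BatchDual{N}) where {N}
    q = a.val / b.val
    dq = ntuple(i -> (a.partials[i] - q * b.partials[i]) / b.val, Val(N))
    return BatchDual{N}(q, dq)
end

a = BatchDual{2}(6.0, (1.0, 0.0))
b = BatchDual{2}(2.0, (0.0, 1.0))
r = a / b
```

If each of those divisions is charged a high inlining cost, a small rule like this can cross the inlining threshold even though its code size is modest, which is the situation the commit message describes.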

Lower inlining cost of floating point div

…aLang#50421) Co-authored-by: Steven G. Johnson <stevenj@mit.edu>

Co-authored-by: Daniel Karrasch <daniel.karrasch@posteo.de>

* Add GC metric `last_incremental_sweep` * Update gc.c * Update gc.c

Note that this defines the lock order as `out` then `in` for streams which may try to take both locks. This is now a mandatory API convention for all future streams. Co-authored-by: Rafael Fourquet <fourquet.rafael@gmail.com>

…ng#50391) X-ref #50377

The descriptions had `i <= 5` while the code block had `i <=3`.

Currently the `compact!`-ion pass fails to fold constant `PiNode` like `PiNode(0.0, Const(0.0))`. This commit fixes it up.

…ompact!`-ion (JuliaLang#50767)

In code like below

```julia
Base.@assume_effects :nothrow function erase_before_inlining(x, y)
    z = sin(y)
    if x
        return "julia"
    end
    return z
end

let y::Float64
    length(erase_before_inlining(true, y))
end
```

the constant prop' can figure out the constant return type of `erase_before_inlining(true, y)`, while it is profitable not to inline expand it, since otherwise we would leave some `!:nothrow` callees there (xref: JuliaLang#47305).

In order to work around this problem, this commit makes `compact!` move inlineable constants into argument positions, so that such "inlineable, but safe as a whole" calls can be erased during compaction. This should give us a general compile-time performance improvement too, as we no longer need to expand the IR for those calls.

Requires:
- JuliaLang#50764
- JuliaLang#50765
- JuliaLang#50768

fixes JuliaLang#50472

…wing `InexactError`. (JuliaLang#50777)

If something odd happens during GC (e.g. the PC goes to sleep), or there is a very big transient, the heuristics might make a bad decision. What this PR implements is a fallback: if we try to make our target more than double the one we had before, we fall back to a more conservative method. This fixes the new issue @vtjnash found in JuliaLang#40644 for me.
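The guard described above can be sketched as follows. This is only an illustration in Julia with hypothetical names: the actual heuristics live in the C runtime, and what the "more conservative method" computes is not stated here, so it is passed in as a plain value.

```julia
# Hypothetical names throughout; the real logic lives in the C runtime.
# `proposed` is the heap target suggested by the main heuristic,
# `conservative` is the target a more cautious method would choose.
function guarded_target(previous::Integer, proposed::Integer, conservative::Integer)
    # Distrust any single step that more than doubles the previous target.
    return proposed > 2 * previous ? conservative : proposed
end

accepted = guarded_target(100, 150, 120)  # within 2x: the proposal is kept
fallback = guarded_target(100, 500, 120)  # more than 2x: the conservative value is used
```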

fixes JuliaLang#50780 caused by JuliaLang#47013.

…iaLang#50497) addresses JuliaLang#50440

```
julia> @ccall jl_dump_host_cpu()::Cvoid
CPU: znver2
Features: sse3, pclmul, ssse3, fma, cx16, sse4.1, sse4.2, movbe, popcnt, aes, xsave, avx, f16c, rdrnd, fsgsbase, bmi, avx2, bmi2, rdseed, adx, clflushopt, clwb, sha, rdpid, sahf, lzcnt, sse4a, prfchw, mwaitx, xsaveopt, xsavec, xsaves, clzero, wbnoinvd

julia> target = only(Base.current_image_targets())
znver2; flags=0; features_en=(sse3, pclmul, ssse3, fma, cx16, sse4.1, sse4.2, movbe, popcnt, aes, xsave, avx, f16c, fsgsbase, bmi, avx2, bmi2, adx, clflushopt, clwb, sha, rdpid, sahf, lzcnt, sse4a, prfchw, mwaitx, xsavec, xsaves, clzero, wbnoinvd)
```

Co-authored-by: Prem Chintalapudi <prem.chintalapudi@gmail.com>
Co-authored-by: Jameson Nash <vtjnash@gmail.com>

Followup to JuliaLang#45964, JuliaLang#46506, and https://discourse.julialang.org/t/class-of-variables/83892. The error

```
julia> println(_)
ERROR: syntax: all-underscore identifier used as rvalue
```

is hard to interpret if you are not familiar with the term `rvalue`, which is not used in any other context in Julia, and as discussed previously the use here is not clearly matching the wikipedia page referred to in the documentation either. This PR does away with the term `rvalue` by changing the error to

```
ERROR: syntax: all-underscore identifiers are write-only and their values cannot be used in expressions
```

and updates the documentation accordingly.
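A small sketch of the behaviour behind this error message: assigning to `_` is fine, while reading it is rejected when the expression is lowered. The example inspects the lowered form rather than evaluating it, so nothing is thrown. The exact message text depends on the Julia version, so it is not checked here.

```julia
# Assigning to an all-underscore identifier is allowed; the value is discarded.
x, _ = (1, 2)

# Reading it back is rejected. Lowering the expression (instead of evaluating it)
# lets us inspect the failure without throwing.
lowered = Meta.lower(Main, Meta.parse("println(_)"))
is_rejected = lowered isa Expr && lowered.head === :error
```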

We don't really use anything meaningful from libm for this to matter much.

…uliaLang#50844) Detailed discussion and benchmarks by @oscardssmith in JuliaPackaging/Yggdrasil#7189

…#50851) Co-authored-by: Dilum Aluthge <dilum@aluthge.com>

This bumps the build numbers for stdlib and binary dependency JLLs, updates libssh2 to 1.11.0, libgit2 to 1.6.4, and objconv to 2.53. Julia's FreeBSD CI has been running on FreeBSD 13.2 for a while, but until more recently, Yggdrasil was still building FreeBSD binaries using the 12.2 sysroot. The sysroot was updated to 13.2 and I went through and rebuilt the dependencies that Julia uses. The updated build numbers correspond to these rebuilt but otherwise unchanged binaries. The actual version updates are because libssh2 in Yggdrasil was at 1.11.0 so I left it there (its [release notes](https://github.com/libssh2/libssh2/releases/tag/libssh2-1.11.0) suggest it's a safe update), libgit2 had a newer patch version available and needed to be fixed anyway since the Windows build was broken, and objconv needed its Yggdrasil build recipe fixed but Elliot's GitHub mirror of objconv was at 2.53 so I updated to use that.

All of Pkg's dependencies seemingly need to be in, at least :Markdown, even with Pkg excised. 30% smaller sysimage is possible, see at: JuliaLang#50833 (comment)

oscardssmith · 2023-08-09T19:38:50Z

Something appears to be radically wrong with your git history. You appear to have merged a lot of random things in here.

PallHaraldsson · 2023-08-09T21:00:49Z

Yes, I'm not sure why, I didn't merge anything here intentionally, nor meant to make a PR at my own repo. I believe I synced (correctly?) my repo here though rather recently. I'm not sure why my PR didn't go to JuliaLang in the first place as intended, nor how to fix my repo here...

…#51489)

This exposes the GC "stop the world" API to the user, for causing a thread to quickly stop executing Julia code. This adds two APIs (that will need to be exported and documented later):

```
julia> @ccall jl_safepoint_suspend_thread(#=tid=#1::Cint, #=magicnumber=#2::Cint)::Cint # roughly tkill(1, SIGSTOP)

julia> @ccall jl_safepoint_resume_thread(#=tid=#1::Cint)::Cint # roughly tkill(1, SIGCONT)
```

You can even suspend yourself, if there is another task to resume you 10 seconds later:

```
julia> ccall(:jl_enter_threaded_region, Cvoid, ())

julia> t = @task let; Libc.systemsleep(10); print("\nhello from $(Threads.threadid())\n"); @ccall jl_safepoint_resume_thread(0::Cint)::Cint; end; ccall(:jl_set_task_tid, Cint, (Any, Cint), t, 1); schedule(t);

julia> @time @ccall jl_safepoint_suspend_thread(0::Cint, 2::Cint)::Cint

hello from 2
 10 seconds (6 allocations: 264 bytes)
1
```

The meaning of the magic number is actually the kind of stop that you want:

```
// n.b. suspended threads may still run in the GC or GC safe regions
// but shouldn't be observable, depending on which enum the user picks (only 1 and 2 are typically recommended here)
// waitstate = 0 : do not wait for suspend to finish
// waitstate = 1 : wait for gc_state != 0 (JL_GC_STATE_WAITING or JL_GC_STATE_SAFE)
// waitstate = 2 : wait for gc_state != 0 (JL_GC_STATE_WAITING or JL_GC_STATE_SAFE) and that GC is not running on that thread
// waitstate = 3 : wait for full suspend (gc_state == JL_GC_STATE_WAITING) -- this may never happen if thread is sleeping currently
// if another thread comes along and calls jl_safepoint_resume, we also return early
// return new suspend count on success, 0 on failure
```

Only magic number 2 is currently meaningful to the user though. The difference between waitstate 1 and 2 is only relevant in C code which is calling this from JL_GC_STATE_SAFE, since otherwise it is a priori known that GC isn't running, else we too would be running the GC. But the distinction of those states might be useful if we have a concurrent collector.

Very important warning: if the stopped thread is holding any locks (e.g. for codegen or types) that you then attempt to acquire, your thread will deadlock. This is very likely, unless you are very careful. A future update to this API may try to change the waitstate to give the option to wait for the thread to release internal or known locks.

pchintalapudi and others added 30 commits July 2, 2023 13:10

Expose PassBuilder callback registration via C api (JuliaLang#50390)

ecca2c5

ensure GC_FINAL_STATS is consistent with new page metadata layout (Ju…

43bf2c8

…liaLang#50374) * ensure GC_FINAL_STATS is consistent with new page metadata layout

Set VERSION to 1.11.0-DEV (JuliaLang#50314)

8a4ab11

* Set `VERSION` to `1.11.0-DEV` * move NEWS to HISTORY Co-authored-by: KristofferC <kristoffer.carlsson@juliacomputing.com>

Implement new GC heuristics.

5f36833

add replace(io, str, patterns...) (JuliaLang#48625)

ce1b420

allow show_tuple_as_call to be used for abstract call signature (Ju…

02272f0

…liaLang#50398)

Extend ifelse lifting to regular SROA (JuliaLang#50403)

7fc8646

* Extend ifelse lifting to regular SROA * Fix oracle violation This is a pre-existing bug, but was exposed by my improvements to SROA.

Fix weird dispatch of * with zero arguments (JuliaLang#50411)

d70ee20

Add devdocs/jit.md to the menu (JuliaLang#50310)

63fefe0

Added in JuliaLang#50168

fix conversion of empty Dates.CompoundPeriod() to zero units (Julia…

1279de6

…Lang#50259) * fix(stdlib/Dates/periods.jl): conversion of empty CompoundPeriod to zero units * add(stdlib/Dates/test/periods.jl): add test for empty CompoundPeriod
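As an illustration of the case this commit targets, the snippet below converts an empty `CompoundPeriod` to a concrete period type. It assumes a Julia build that includes this change; on earlier builds the conversion is expected to fail instead.

```julia
using Dates

empty_period = Dates.CompoundPeriod()         # no components at all
zero_seconds = convert(Second, empty_period)  # zero units of the requested type, per the fix
```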

Remove dynamic dispatch from _wait/wait2 (JuliaLang#50202)

e025877

Co-authored-by: Gabriel Baraldi <baraldigabriel@gmail.com>

Check input expresion in numbered prompt (JuliaLang#50064)

7e3c706

Use tempdir() to store heap snapshot files instead of abspatch ~= roo…

877b368

…tdir (JuliaLang#50026)

Docs: Windows build devdocs clean up (JuliaLang#49760)

929a845

enhance Timer call taking callback to accept any timeout arg and kw…

fb2ceea

…args (JuliaLang#50027)

Add CPU feature helper function (JuliaLang#50402)

435c1c1

gc: fix time unit in jl_print_gc_stats (JuliaLang#50417)

23c0418

Relax some of the atomics for sweeping

2f42cd5

update halfpages pointer after actually sweeping pages (JuliaLang#50387)

c09a199

Merge pull request JuliaLang#50428 from JuliaLang/kf/divinlinecost

2360140

Lower inlining cost of floating point div

avoid potential type-instability in _replace_(str, ...) (JuliaLang#50424

0b54ded

)

Add paragraph on abstract Function fields to performance tips (Juli…

5579566

…aLang#50421) Co-authored-by: Steven G. Johnson <stevenj@mit.edu>

Fix some inference checks in reduce tests (JuliaLang#50437)

77ce343

remove type parameter from AbstractTriangular (JuliaLang#26307)

feb2988

Co-authored-by: Daniel Karrasch <daniel.karrasch@posteo.de>

Add GC metric last_incremental_sweep (JuliaLang#50190)

64ab537

* Add GC metric `last_incremental_sweep` * Update gc.c * Update gc.c

copyuntil(out::IO, in::IO, delim) (JuliaLang#48273)

c14d4bb

Note that this defines the lock order as `out` then `in` for streams which may try to take both locks. This is now a mandatory API convention for all future streams. Co-authored-by: Rafael Fourquet <fourquet.rafael@gmail.com>
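A minimal usage sketch of `copyuntil`, assuming a Julia version that ships it (1.11 or later). The `IOBuffer`s here do no locking of their own. For a stream type that takes locks on both ends, the convention stated above means acquiring the lock of `out` before the lock of `in`.

```julia
input = IOBuffer("key=value")
out = IOBuffer()

# Copy everything up to (not including) the delimiter into `out`.
copyuntil(out, input, '=')
key = String(take!(out))

# The delimiter has been consumed, so the rest of the stream is the value.
value = read(input, String)
```

Compared with `readuntil`, this writes directly into an existing stream instead of allocating a new string for each chunk.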

Define Base.isstored for Diagonals and Triangular matrices (JuliaLa…

b6bfe98

…ng#50391) X-ref #50377

frankebel and others added 24 commits August 3, 2023 09:24

docs: fix wrong description (JuliaLang#50774)

1b9eeca

The descriptions had `i <= 5` while the code block had `i <=3`.

ssair: compact! constant PiNode (JuliaLang#50768)

b3e8bd0

Currently the `compact!`-ion pass fails to fold constant `PiNode` like `PiNode(0.0, Const(0.0))`. This commit fixes it up.

Support rounding Irrationals (JuliaLang#45598)

881e08b

Optimize PLT and jl_load_and_lookup calls (JuliaLang#50745)

c03a346

re-allow non-string values in ENV get! (JuliaLang#50771)

6dd763b

fixes JuliaLang#50472

Minor refactor to image generation (JuliaLang#50779)

f337c3d

Clarify that trunc/ceil/floor are allowed to round without thro…

2c45e3b

…wing `InexactError`. (JuliaLang#50777)

Model :consistent for compilerbarrier (JuliaLang#50793)

67d600c

inference: continue const-prop' when concrete-eval returns non-inline…

117ef2e

…able (JuliaLang#50618)

fix bit_map! with aliasing (JuliaLang#50781)

3e04129

fixes JuliaLang#50780 caused by JuliaLang#47013.

Move round(T::Type, x) docstring above round(z::Complex, ...) doc…

9f9e989

…string (JuliaLang#50775)

add warning to Iterators.filter about assumptions on predicate (Jul…

8b5e3e9

…iaLang#50497) addresses JuliaLang#50440

Make symbols internal in jl_create_native, and only externalize them …

8b8da91

…when partitioning (JuliaLang#50791)

Test that we emit names properly in code_llvm (JuliaLang#50818)

c40ecd3

Remove libm from versioninfo(). (JuliaLang#50841)

aca081d

We don't really use anything meaningful from libm for this to matter much.

Bump OpenBLAS binaries to use the new GEMM multithreading threshold (J…

626f687

…uliaLang#50844) Detailed discussion and benchmarks by @oscardssmith in JuliaPackaging/Yggdrasil#7189

Small refactorings, make imaging_mode easier to grep for (JuliaLang#5…

275a2a9

…0840)

🤖 [master] Bump the Pkg stdlib from 2c04d5a98 to b044bf6a2 (JuliaLang…

d99f249

…#50851) Co-authored-by: Dilum Aluthge <dilum@aluthge.com>

Smaller sysimage, Pkg excised

3ee14cb

All of Pkg's dependencies seemingly need to be in, at least :Markdown, even with Pkg excised. 30% smaller sysimage is possible, see at: JuliaLang#50833 (comment)

PallHaraldsson mentioned this pull request Aug 9, 2023

Radically smaller sysimage; excise Pkg/REPL from it, and maybe LinearAlgebra (or at least openBLAS) JuliaLang/julia#50833

Closed

PallHaraldsson closed this Aug 9, 2023

PallHaraldsson deleted the PallHaraldsson-patch-1 branch August 9, 2023 21:01
