-
-
Notifications
You must be signed in to change notification settings - Fork 5.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Allocates when precompiled #35972
Comments
Minimized from
|
As mentioned in JuliaGraphics/Colors.jl#425, this problem is more serious than the impression given by the MWE.
This problem seems to have nothing to do with type instability. The problem occurs even if the concrete type is specified explicitly. As far as I can tell, not only |
The difference in the likelihood of problems depending on the type of function may be due to the difference in the cost of inlining. If so, this could be a variation of #34055. |
IIRC, in the typical cases, this could be avoided by not running However, at least in Julia v1.6.1, there seems to be a case where |
There is a difference in whether the trigger is a precompilation or a JIT compilation, but the phenomenon is similar to #35537. |
FYI, here is an MWE that shows that this is related to inlining. module Issue35972
lg20(x) = x > 0.0 ? -f20(1.0 - x) : 0.0
lg21(x) = x > 0.0 ? -f21(1.0 - x) : 0.0
# inline-worthy
f20(x) = @evalpoly(x,
0.0, 1/1, 1/2, 1/3, 1/4, 1/5, 1/6, 1/7, 1/8, 1/9,
1/10, 1/11, 1/12, 1/13, 1/14, 1/15, 1/16, 1/17, 1/18, 1/19,
1/20)
# not inline-worthy
f21(x) = @evalpoly(x,
0.0, 1/1, 1/2, 1/3, 1/4, 1/5, 1/6, 1/7, 1/8, 1/9,
1/10, 1/11, 1/12, 1/13, 1/14, 1/15, 1/16, 1/17, 1/18, 1/19,
1/20, 1/21)
precompile(lg20, (Float64,))
precompile(lg21, (Float64,))
end # module julia> using Issue35972
julia> x = rand(1_000_000);
julia> Issue35972.lg20.(x);
julia> Issue35972.lg21.(x);
julia> @time Issue35972.lg20.(x);
0.004678 seconds (4 allocations: 7.629 MiB)
julia> @time Issue35972.lg21.(x);
0.022517 seconds (2.00 M allocations: 38.147 MiB)
julia> versioninfo()
Julia Version 1.8.0-DEV.333
Commit c12e63fba3 (2021-08-11 02:09 UTC)
Platform Info:
OS: Windows (x86_64-w64-mingw32)
CPU: 11th Gen Intel(R) Core(TM) i7-1165G7 @ 2.80GHz
WORD_SIZE: 64
LIBM: libopenlibm
LLVM: libLLVM-12.0.1 (ORCJIT, tigerlake) |
In the demo in the OP (and every other one I've tried), running the second
Also worth noting that the line number for the statement after the |
This is also interesting: julia> using M, BenchmarkTools, Profile
julia> @code_llvm M.f(3.2)
; @ /tmp/pkgs/M/src/M.jl:3 within `f`
define double @julia_f_273(double %0) #0 {
top:
%1 = alloca {}*, align 8
%gcframe7 = alloca [3 x {}*], align 16
%gcframe7.sub = getelementptr inbounds [3 x {}*], [3 x {}*]* %gcframe7, i64 0, i64 0
%2 = bitcast [3 x {}*]* %gcframe7 to i8*
call void @llvm.memset.p0i8.i32(i8* nonnull align 16 dereferenceable(24) %2, i8 0, i32 24, i1 false)
%thread_ptr = call i8* asm "movq %fs:0, $0", "=r"() #6
%ppgcstack_i8 = getelementptr i8, i8* %thread_ptr, i64 -8
%ppgcstack = bitcast i8* %ppgcstack_i8 to {}****
%pgcstack = load {}***, {}**** %ppgcstack, align 8
; ┌ @ operators.jl:378 within `>`
; │┌ @ float.jl:490 within `<` @ float.jl:444
%3 = bitcast [3 x {}*]* %gcframe7 to i64*
store i64 4, i64* %3, align 16
%4 = getelementptr inbounds [3 x {}*], [3 x {}*]* %gcframe7, i64 0, i64 1
%5 = bitcast {}** %4 to {}***
%6 = load {}**, {}*** %pgcstack, align 8
store {}** %6, {}*** %5, align 8
%7 = bitcast {}*** %pgcstack to {}***
store {}** %gcframe7.sub, {}*** %7, align 8
%8 = fcmp ule double %0, 0.000000e+00
; └└
br i1 %8, label %L8, label %L6
L6: ; preds = %top
%ptls_field3 = getelementptr inbounds {}**, {}*** %pgcstack, i64 2305843009213693954
%9 = bitcast {}*** %ptls_field3 to i8**
%ptls_load45 = load i8*, i8** %9, align 8
%10 = call noalias nonnull {}* @jl_gc_pool_alloc(i8* %ptls_load45, i32 1392, i32 16) #2
%11 = bitcast {}* %10 to i64*
%12 = getelementptr inbounds i64, i64* %11, i64 -1
store atomic i64 140704265234768, i64* %12 unordered, align 8
%13 = bitcast {}* %10 to double*
store double %0, double* %13, align 8
%14 = getelementptr inbounds [3 x {}*], [3 x {}*]* %gcframe7, i64 0, i64 2
store {}* %10, {}** %14, align 16
store {}* %10, {}** %1, align 8
%15 = call nonnull {}* @jl_invoke({}* inttoptr (i64 140704310911760 to {}*), {}** nonnull %1, i32 1, {}* inttoptr (i64 140704435100720 to {}*))
%16 = bitcast {}* %15 to double*
%17 = load double, double* %16, align 8
%18 = load {}*, {}** %4, align 8
%19 = bitcast {}*** %pgcstack to {}**
store {}* %18, {}** %19, align 8
ret double %17
L8: ; preds = %top
%20 = load {}*, {}** %4, align 8
%21 = bitcast {}*** %pgcstack to {}**
store {}* %20, {}** %21, align 8
ret double 0.000000e+00
}
julia> x2 = fill(1.0, 1000000);
julia> @btime M.f.($x2);
24.999 ms (2000002 allocations: 38.15 MiB)
julia> @code_llvm M.f(3.2)
; @ /tmp/pkgs/M/src/M.jl:3 within `f`
define double @julia_f_457(double %0) #0 {
top:
; ┌ @ operators.jl:378 within `>`
; │┌ @ float.jl:490 within `<` @ float.jl:444
%1 = fcmp ule double %0, 0.000000e+00
; └└
br i1 %1, label %L8, label %L6
L6: ; preds = %top
%2 = call double @j_cbrt_459(double %0) #0
ret double %2
L8: ; preds = %top
ret double 0.000000e+00
} Changing that first |
Partial progress: the difference seems to come down to the fact that we get to Lines 3696 to 3699 in eb83c4d
handled = false when it's loaded from the cache, but with handled = true when we define the module in-session. This is because the CodeInstance isn't cached (it's not internal ) and so Line 3654 in eb83c4d
jl_nothing .
|
And more progress. With this diff that puts debugging statements in both codegen and type-inference: diff --git a/base/compiler/typeinfer.jl b/base/compiler/typeinfer.jl
index a3895c380f..052c0a4baa 100644
--- a/base/compiler/typeinfer.jl
+++ b/base/compiler/typeinfer.jl
@@ -850,9 +850,12 @@ end
# compute (and cache) an inferred AST and return type
function typeinf_ext(interp::AbstractInterpreter, mi::MethodInstance)
method = mi.def::Method
+ println(mi)
for i = 1:2 # test-and-lock-and-test
i == 2 && ccall(:jl_typeinf_begin, Cvoid, ())
code = get(code_cache(interp), mi, nothing)
+ println(i)
+ println(code)
if code isa CodeInstance
# see if this code already exists in the cache
inf = code.inferred
@@ -895,14 +898,22 @@ function typeinf_ext(interp::AbstractInterpreter, mi::MethodInstance)
end
end
if ccall(:jl_get_module_infer, Cint, (Any,), method.module) == 0
+ println("retrieving ", mi)
return retrieve_code_info(mi)
end
+ println("inferring ", mi)
lock_mi_inference(interp, mi)
frame = InferenceState(InferenceResult(mi), #=cached=#true, interp)
frame === nothing && return nothing
typeinf(interp, frame)
ccall(:jl_typeinf_end, Cvoid, ())
frame.src.inferred || return nothing
+ print("for ", mi, ", the cache is ")
+ if isdefined(mi, :cache)
+ println(typeof(mi.cache.inferred))
+ else
+ println("not cached")
+ end
return frame.src
end
diff --git a/src/codegen.cpp b/src/codegen.cpp
index d05080ee42..022d1958bb 100644
--- a/src/codegen.cpp
+++ b/src/codegen.cpp
@@ -3622,6 +3622,7 @@ static jl_cgval_t emit_invoke(jl_codectx_t &ctx, jl_expr_t *ex, jl_value_t *rt)
bool handled = false;
jl_cgval_t result;
+ jl_(ex);
if (lival.constant) {
jl_method_instance_t *mi = (jl_method_instance_t*)lival.constant;
assert(jl_is_method_instance(mi));
@@ -3643,6 +3644,7 @@ static jl_cgval_t emit_invoke(jl_codectx_t &ctx, jl_expr_t *ex, jl_value_t *rt)
else {
jl_value_t *ci = ctx.params->lookup(mi, ctx.world, ctx.world); // TODO: need to use the right pair world here
jl_code_instance_t *codeinst = (jl_code_instance_t*)ci;
+ jl_printf(JL_STDOUT, "Is the CodeInstance lookup nothing? %d\n", ci == jl_nothing);
if (ci != jl_nothing && codeinst->invoke != jl_fptr_sparam) { // check if we know we definitely can't handle this specptr
if (codeinst->invoke == jl_fptr_const_return) {
result = mark_julia_const(codeinst->rettype_const); with the debugging to Core.Compiler not built (hit Ctrl-C after using Issue35972, MethodAnalysis
cd("/home/tim/src/julia-master/base")
include("compiler/compiler.jl")
x = rand(1_000_000);
map(Issue35972.lg21, x); # map is less compile-intensive than broadcast yields the following final section:
but then next: mi = methodinstance(evalpoly, (Float64, NTuple{22,Float64}))
julia> mi.cache
Core.CodeInstance(MethodInstance for evalpoly(::Float64, ::NTuple{22, Float64}), #undef, 0x000000000000360b, 0xffffffffffffffff, Float64, #undef, nothing, true, true, Ptr{Nothing} @0x00007f0400263df0, Ptr{Nothing} @0x00007f0400263d50) The inferring collect_similar(Array{Float64, 1}, Base.Generator{Array{Float64, 1}, typeof(Issue35972.lg21)}) from collect_similar(Any, Any)
for collect_similar(Array{Float64, 1}, Base.Generator{Array{Float64, 1}, typeof(Issue35972.lg21)}) from collect_similar(Any, Any), the cache is Array{UInt8, 1}
Expr(:invoke, _collect(Array{Float64, 1}, Base.Generator{Array{Float64, 1}, typeof(Issue35972.lg21)}, Base.EltypeUnknown, Base.HasShape{1}) from _collect(Any, Any, Base.EltypeUnknown, Union{Base.HasLength, Base.HasShape{N} where N}), Base._collect, Core.Argument(n=2), Core.Argument(n=3), quote Base.EltypeUnknown() end, quote Base.HasShape{1}() end)
Is the CodeInstance lookup nothing? 0
Expr(:invoke, evalpoly(Float64, Tuple{Float64, Float64, Float64, Float64, Float64, Float64, Float64, Float64, Float64, Float64, Float64, Float64, Float64, Float64, Float64, Float64, Float64, Float64, Float64, Float64, Float64, Float64}) from evalpoly(Any, Tuple), Base.Math.evalpoly, SSAValue(75), (0, 1, 0.5, 0.333333, 0.25, 0.2, 0.166667, 0.142857, 0.125, 0.111111, 0.1, 0.0909091, 0.0833333, 0.0769231, 0.0714286, 0.0666667, 0.0625, 0.0588235, 0.0555556, 0.0526316, 0.05, 0.047619))
Is the CodeInstance lookup nothing? 0
Expr(:invoke, evalpoly(Float64, Tuple{Float64, Float64, Float64, Float64, Float64, Float64, Float64, Float64, Float64, Float64, Float64, Float64, Float64, Float64, Float64, Float64, Float64, Float64, Float64, Float64, Float64, Float64}) from evalpoly(Any, Tuple), Base.Math.evalpoly, SSAValue(24), (0, 1, 0.5, 0.333333, 0.25, 0.2, 0.166667, 0.142857, 0.125, 0.111111, 0.1, 0.0909091, 0.0833333, 0.0769231, 0.0714286, 0.0666667, 0.0625, 0.0588235, 0.0555556, 0.0526316, 0.05, 0.047619))
Is the CodeInstance lookup nothing? 0 |
vs when it's loaded from the precompile cache:
Guess: since it already has the MethodInstance it's (inappropriately) not inferring |
(A briefer version of this was also posted to slack internals) To help decide how to fix this (either (1) run inference before codegen, or (2) cache the needed Base diff --git a/src/dump.c b/src/dump.c
index 8c9d68f518..ce42448159 100644
--- a/src/dump.c
+++ b/src/dump.c
@@ -505,6 +505,8 @@ enum METHOD_SERIALIZATION_MODE {
METHOD_EXTERNAL_MT = 2,
};
+static int codeinst_count[3]; // number of external, thunk, and workslist code instances, respectively
+
static void jl_serialize_value_(jl_serializer_state *s, jl_value_t *v, int as_literal) JL_GC_DISABLED
{
if (jl_serialize_generic(s, v)) {
@@ -691,6 +693,7 @@ static void jl_serialize_value_(jl_serializer_state *s, jl_value_t *v, int as_li
internal = 1;
else if (module_in_worklist(mi->def.method->module))
internal = 2;
+ codeinst_count[internal] += 1;
write_uint8(s->s, internal);
if (!internal) {
// also flag this in the backref table as special
@@ -2240,6 +2243,7 @@ JL_DLLEXPORT int jl_save_incremental(const char *fname, jl_array_t *worklist)
return 1;
}
JL_GC_PUSH2(&mod_array, &udeps);
+ codeinst_count[0] = codeinst_count[1] = codeinst_count[2] = 0;
mod_array = jl_get_loaded_modules();
serializer_worklist = worklist;
@@ -2349,6 +2353,7 @@ JL_DLLEXPORT int jl_save_incremental(const char *fname, jl_array_t *worklist)
}
write_int32(&f, 0); // mark the end of the source text
ios_close(&f);
+ jl_printf(JL_STDOUT, "external: %d; module-thunk: %d; worklist: %d\n", codeinst_count[0], codeinst_count[1], codeinst_count[2]);
JL_GC_POP();
return 0; Results: https://gist.github.com/timholy/502a5a545a69f492b6180050a0b745e6 (the packages were chosen as ones that now have substantial precompilation) |
I don't fully understand the precompilation, but empirically it seems to me that the inference is context dependent. If external methods are cached by individual packages, will the behavior change depending on the order of loading? I agree with the idea of caching more So, IMO, it would be better to achieve (1) first. |
xref #42016 |
This was from an effort to play around with precompilation some more. There's a tricky balance where if we precompile too much of `CSV.File`, then we run into the Base bug where there are Base methods missing and cause painful allocations via invoke (JuliaLang/julia#35972). The state of precompilation for CSV.jl is thus pretty disappointing: precompiling currently takes ~7.5s (which is fine, this is the number we're fine to have keep growing), loading CSV.jl precompiled takes ~2.5s (seems a little high, but not terrible), but the first call to `CSV.File` takes ~6.5s :sad_face:. At this point, I've spent too much time and don't understand the ins and outs of precompilation interactions with later runtime code to debug further, but hopefully we can recruit help from some experts to take CSV.jl precompilation to the next level. As-is, it's workable, so let's go with this for now and get this darn release out.
This was from an effort to play around with precompilation some more. There's a tricky balance where if we precompile too much of `CSV.File`, then we run into the Base bug where there are Base methods missing and cause painful allocations via invoke (JuliaLang/julia#35972). The state of precompilation for CSV.jl is thus pretty disappointing: precompiling currently takes ~7.5s (which is fine, this is the number we're fine to have keep growing), loading CSV.jl precompiled takes ~2.5s (seems a little high, but not terrible), but the first call to `CSV.File` takes ~6.5s :sad_face:. At this point, I've spent too much time and don't understand the ins and outs of precompilation interactions with later runtime code to debug further, but hopefully we can recruit help from some experts to take CSV.jl precompilation to the next level. As-is, it's workable, so let's go with this for now and get this darn release out.
Putting the string interpolation behind a `@noinline` is the key change, but it doesn't hurt to hoise the range creation. Of the few remaining allocations, about half will be fixed by JuliaLang/julia#43990, as they are another instance of JuliaLang/julia#35972
Putting the string interpolation behind a `@noinline` is the key change, but it doesn't hurt to hoist the range creation. Of the few remaining allocations, about half will be fixed by JuliaLang/julia#43990, as they are another instance of JuliaLang/julia#35972
Prior to this PR, Julia's precompiled `*.ji` files saved just two categories of code: unspecialized method definitions and type-specialized code for the methods defined by the package. Any novel specializations needed from other code (Base, other packages) were not saved, and therefore effectively thrown away. This PR caches all the code---internal or external---called during package definition that hadn't been previously inferred. This makes precompilation more intuitive (now it saves all relevant inference results), and substantially reduces latency for inference-bound packages. Closes #42016 Fixes #35972 Issue #35972 arose because codegen got started without re-inferring some discarded CodeInstances. This forced the compiler to insert a `jl_invoke`. This PR fixes the issue because needed CodeInstances are no longer discarded by precompilation.
commit 88a1cbf Author: Tim Holy <tim.holy@gmail.com> Date: Sat Feb 19 03:14:45 2022 -0600 Copy over CodeInstances too commit 48c91b3 Author: Tim Holy <tim.holy@gmail.com> Date: Thu Feb 17 14:22:02 2022 -0600 Exclude MethodInstances that don't link to worklist module This should prevent us from serializing too much code. commit 3241c4c Author: Tim Holy <tim.holy@gmail.com> Date: Thu Feb 17 12:23:12 2022 -0600 Add invalidation test commit ead1fd9 Author: Tim Holy <tim.holy@gmail.com> Date: Thu Feb 17 10:23:52 2022 -0600 Fix a failure to invalidate commit b44a8fc Author: Tim Holy <tim.holy@gmail.com> Date: Thu Jan 27 02:54:47 2022 -0600 Serialize external CodeInstances Prior to this PR, Julia's precompiled `*.ji` files saved just two categories of code: unspecialized method definitions and type-specialized code for the methods defined by the package. Any novel specializations needed from other code (Base, other packages) were not saved, and therefore effectively thrown away. This PR caches all the code---internal or external---called during package definition that hadn't been previously inferred. This makes precompilation more intuitive (now it saves all relevant inference results), and substantially reduces latency for inference-bound packages. Closes JuliaLang#42016 Fixes JuliaLang#35972 Issue JuliaLang#35972 arose because codegen got started without re-inferring some discarded CodeInstances. This forced the compiler to insert a `jl_invoke`. This PR fixes the issue because needed CodeInstances are no longer discarded by precompilation.
commit 88a1cbf Author: Tim Holy <tim.holy@gmail.com> Date: Sat Feb 19 03:14:45 2022 -0600 Copy over CodeInstances too commit 48c91b3 Author: Tim Holy <tim.holy@gmail.com> Date: Thu Feb 17 14:22:02 2022 -0600 Exclude MethodInstances that don't link to worklist module This should prevent us from serializing too much code. commit 3241c4c Author: Tim Holy <tim.holy@gmail.com> Date: Thu Feb 17 12:23:12 2022 -0600 Add invalidation test commit ead1fd9 Author: Tim Holy <tim.holy@gmail.com> Date: Thu Feb 17 10:23:52 2022 -0600 Fix a failure to invalidate commit b44a8fc Author: Tim Holy <tim.holy@gmail.com> Date: Thu Jan 27 02:54:47 2022 -0600 Serialize external CodeInstances Prior to this PR, Julia's precompiled `*.ji` files saved just two categories of code: unspecialized method definitions and type-specialized code for the methods defined by the package. Any novel specializations needed from other code (Base, other packages) were not saved, and therefore effectively thrown away. This PR caches all the code---internal or external---called during package definition that hadn't been previously inferred. This makes precompilation more intuitive (now it saves all relevant inference results), and substantially reduces latency for inference-bound packages. Closes JuliaLang#42016 Fixes JuliaLang#35972 Issue JuliaLang#35972 arose because codegen got started without re-inferring some discarded CodeInstances. This forced the compiler to insert a `jl_invoke`. This PR fixes the issue because needed CodeInstances are no longer discarded by precompilation.
commit 88a1cbf Author: Tim Holy <tim.holy@gmail.com> Date: Sat Feb 19 03:14:45 2022 -0600 Copy over CodeInstances too commit 48c91b3 Author: Tim Holy <tim.holy@gmail.com> Date: Thu Feb 17 14:22:02 2022 -0600 Exclude MethodInstances that don't link to worklist module This should prevent us from serializing too much code. commit 3241c4c Author: Tim Holy <tim.holy@gmail.com> Date: Thu Feb 17 12:23:12 2022 -0600 Add invalidation test commit ead1fd9 Author: Tim Holy <tim.holy@gmail.com> Date: Thu Feb 17 10:23:52 2022 -0600 Fix a failure to invalidate commit b44a8fc Author: Tim Holy <tim.holy@gmail.com> Date: Thu Jan 27 02:54:47 2022 -0600 Serialize external CodeInstances Prior to this PR, Julia's precompiled `*.ji` files saved just two categories of code: unspecialized method definitions and type-specialized code for the methods defined by the package. Any novel specializations needed from other code (Base, other packages) were not saved, and therefore effectively thrown away. This PR caches all the code---internal or external---called during package definition that hadn't been previously inferred. This makes precompilation more intuitive (now it saves all relevant inference results), and substantially reduces latency for inference-bound packages. Closes JuliaLang#42016 Fixes JuliaLang#35972 Issue JuliaLang#35972 arose because codegen got started without re-inferring some discarded CodeInstances. This forced the compiler to insert a `jl_invoke`. This PR fixes the issue because needed CodeInstances are no longer discarded by precompilation.
commit 88a1cbf Author: Tim Holy <tim.holy@gmail.com> Date: Sat Feb 19 03:14:45 2022 -0600 Copy over CodeInstances too commit 48c91b3 Author: Tim Holy <tim.holy@gmail.com> Date: Thu Feb 17 14:22:02 2022 -0600 Exclude MethodInstances that don't link to worklist module This should prevent us from serializing too much code. commit 3241c4c Author: Tim Holy <tim.holy@gmail.com> Date: Thu Feb 17 12:23:12 2022 -0600 Add invalidation test commit ead1fd9 Author: Tim Holy <tim.holy@gmail.com> Date: Thu Feb 17 10:23:52 2022 -0600 Fix a failure to invalidate commit b44a8fc Author: Tim Holy <tim.holy@gmail.com> Date: Thu Jan 27 02:54:47 2022 -0600 Serialize external CodeInstances Prior to this PR, Julia's precompiled `*.ji` files saved just two categories of code: unspecialized method definitions and type-specialized code for the methods defined by the package. Any novel specializations needed from other code (Base, other packages) were not saved, and therefore effectively thrown away. This PR caches all the code---internal or external---called during package definition that hadn't been previously inferred. This makes precompilation more intuitive (now it saves all relevant inference results), and substantially reduces latency for inference-bound packages. Closes JuliaLang#42016 Fixes JuliaLang#35972 Issue JuliaLang#35972 arose because codegen got started without re-inferring some discarded CodeInstances. This forced the compiler to insert a `jl_invoke`. This PR fixes the issue because needed CodeInstances are no longer discarded by precompilation.
commit 88a1cbf Author: Tim Holy <tim.holy@gmail.com> Date: Sat Feb 19 03:14:45 2022 -0600 Copy over CodeInstances too commit 48c91b3 Author: Tim Holy <tim.holy@gmail.com> Date: Thu Feb 17 14:22:02 2022 -0600 Exclude MethodInstances that don't link to worklist module This should prevent us from serializing too much code. commit 3241c4c Author: Tim Holy <tim.holy@gmail.com> Date: Thu Feb 17 12:23:12 2022 -0600 Add invalidation test commit ead1fd9 Author: Tim Holy <tim.holy@gmail.com> Date: Thu Feb 17 10:23:52 2022 -0600 Fix a failure to invalidate commit b44a8fc Author: Tim Holy <tim.holy@gmail.com> Date: Thu Jan 27 02:54:47 2022 -0600 Serialize external CodeInstances Prior to this PR, Julia's precompiled `*.ji` files saved just two categories of code: unspecialized method definitions and type-specialized code for the methods defined by the package. Any novel specializations needed from other code (Base, other packages) were not saved, and therefore effectively thrown away. This PR caches all the code---internal or external---called during package definition that hadn't been previously inferred. This makes precompilation more intuitive (now it saves all relevant inference results), and substantially reduces latency for inference-bound packages. Closes JuliaLang#42016 Fixes JuliaLang#35972 Issue JuliaLang#35972 arose because codegen got started without re-inferring some discarded CodeInstances. This forced the compiler to insert a `jl_invoke`. This PR fixes the issue because needed CodeInstances are no longer discarded by precompilation.
commit 88a1cbf Author: Tim Holy <tim.holy@gmail.com> Date: Sat Feb 19 03:14:45 2022 -0600 Copy over CodeInstances too commit 48c91b3 Author: Tim Holy <tim.holy@gmail.com> Date: Thu Feb 17 14:22:02 2022 -0600 Exclude MethodInstances that don't link to worklist module This should prevent us from serializing too much code. commit 3241c4c Author: Tim Holy <tim.holy@gmail.com> Date: Thu Feb 17 12:23:12 2022 -0600 Add invalidation test commit ead1fd9 Author: Tim Holy <tim.holy@gmail.com> Date: Thu Feb 17 10:23:52 2022 -0600 Fix a failure to invalidate commit b44a8fc Author: Tim Holy <tim.holy@gmail.com> Date: Thu Jan 27 02:54:47 2022 -0600 Serialize external CodeInstances Prior to this PR, Julia's precompiled `*.ji` files saved just two categories of code: unspecialized method definitions and type-specialized code for the methods defined by the package. Any novel specializations needed from other code (Base, other packages) were not saved, and therefore effectively thrown away. This PR caches all the code---internal or external---called during package definition that hadn't been previously inferred. This makes precompilation more intuitive (now it saves all relevant inference results), and substantially reduces latency for inference-bound packages. Closes JuliaLang#42016 Fixes JuliaLang#35972 Issue JuliaLang#35972 arose because codegen got started without re-inferring some discarded CodeInstances. This forced the compiler to insert a `jl_invoke`. This PR fixes the issue because needed CodeInstances are no longer discarded by precompilation.
commit 88a1cbf Author: Tim Holy <tim.holy@gmail.com> Date: Sat Feb 19 03:14:45 2022 -0600 Copy over CodeInstances too commit 48c91b3 Author: Tim Holy <tim.holy@gmail.com> Date: Thu Feb 17 14:22:02 2022 -0600 Exclude MethodInstances that don't link to worklist module This should prevent us from serializing too much code. commit 3241c4c Author: Tim Holy <tim.holy@gmail.com> Date: Thu Feb 17 12:23:12 2022 -0600 Add invalidation test commit ead1fd9 Author: Tim Holy <tim.holy@gmail.com> Date: Thu Feb 17 10:23:52 2022 -0600 Fix a failure to invalidate commit b44a8fc Author: Tim Holy <tim.holy@gmail.com> Date: Thu Jan 27 02:54:47 2022 -0600 Serialize external CodeInstances Prior to this PR, Julia's precompiled `*.ji` files saved just two categories of code: unspecialized method definitions and type-specialized code for the methods defined by the package. Any novel specializations needed from other code (Base, other packages) were not saved, and therefore effectively thrown away. This PR caches all the code---internal or external---called during package definition that hadn't been previously inferred. This makes precompilation more intuitive (now it saves all relevant inference results), and substantially reduces latency for inference-bound packages. Closes JuliaLang#42016 Fixes JuliaLang#35972 Issue JuliaLang#35972 arose because codegen got started without re-inferring some discarded CodeInstances. This forced the compiler to insert a `jl_invoke`. This PR fixes the issue because needed CodeInstances are no longer discarded by precompilation.
Prior to this PR, Julia's precompiled `*.ji` files saved just two categories of code: unspecialized method definitions and type-specialized code for the methods defined by the package. Any novel specializations of methods from Base or previously-loaded packages were not saved, and therefore effectively thrown away. This PR caches all the code---internal or external---called during package definition that hadn't been previously inferred, as long as there is a backedge linking it back to a method owned by a module being precompiled. (The latter condition ensures it will actually be called by package methods, and not merely transiently generated for the purpose of, e.g., metaprogramming or variable initialization.) This makes precompilation more intuitive (now it saves all relevant inference results), and substantially reduces latency for inference-bound packages. Closes #42016 Fixes #35972 Issue #35972 arose because codegen got started without re-inferring some discarded CodeInstances. This forced the compiler to insert a `jl_invoke`. This PR fixes the issue because needed CodeInstances are no longer discarded by precompilation.
Prior to this PR, Julia's precompiled `*.ji` files saved just two categories of code: unspecialized method definitions and type-specialized code for the methods defined by the package. Any novel specializations of methods from Base or previously-loaded packages were not saved, and therefore effectively thrown away. This PR caches all the code---internal or external---called during package definition that hadn't been previously inferred, as long as there is a backedge linking it back to a method owned by a module being precompiled. (The latter condition ensures it will actually be called by package methods, and not merely transiently generated for the purpose of, e.g., metaprogramming or variable initialization.) This makes precompilation more intuitive (now it saves all relevant inference results), and substantially reduces latency for inference-bound packages. Closes #42016 Fixes #35972 Issue #35972 arose because codegen got started without re-inferring some discarded CodeInstances. This forced the compiler to insert a `jl_invoke`. This PR fixes the issue because needed CodeInstances are no longer discarded by precompilation. (cherry picked from commit df81bf9)
Prior to this PR, Julia's precompiled `*.ji` files saved just two categories of code: unspecialized method definitions and type-specialized code for the methods defined by the package. Any novel specializations of methods from Base or previously-loaded packages were not saved, and therefore effectively thrown away. This PR caches all the code---internal or external---called during package definition that hadn't been previously inferred, as long as there is a backedge linking it back to a method owned by a module being precompiled. (The latter condition ensures it will actually be called by package methods, and not merely transiently generated for the purpose of, e.g., metaprogramming or variable initialization.) This makes precompilation more intuitive (now it saves all relevant inference results), and substantially reduces latency for inference-bound packages. Closes JuliaLang#42016 Fixes JuliaLang#35972 Issue JuliaLang#35972 arose because codegen got started without re-inferring some discarded CodeInstances. This forced the compiler to insert a `jl_invoke`. This PR fixes the issue because needed CodeInstances are no longer discarded by precompilation.
Prior to this PR, Julia's precompiled `*.ji` files saved just two categories of code: unspecialized method definitions and type-specialized code for the methods defined by the package. Any novel specializations of methods from Base or previously-loaded packages were not saved, and therefore effectively thrown away. This PR caches all the code---internal or external---called during package definition that hadn't been previously inferred, as long as there is a backedge linking it back to a method owned by a module being precompiled. (The latter condition ensures it will actually be called by package methods, and not merely transiently generated for the purpose of, e.g., metaprogramming or variable initialization.) This makes precompilation more intuitive (now it saves all relevant inference results), and substantially reduces latency for inference-bound packages. Closes #42016 Fixes #35972 Issue #35972 arose because codegen got started without re-inferring some discarded CodeInstances. This forced the compiler to insert a `jl_invoke`. This PR fixes the issue because needed CodeInstances are no longer discarded by precompilation. (cherry picked from commit df81bf9)
Prior to this PR, Julia's precompiled `*.ji` files saved just two categories of code: unspecialized method definitions and type-specialized code for the methods defined by the package. Any novel specializations of methods from Base or previously-loaded packages were not saved, and therefore effectively thrown away. This PR caches all the code---internal or external---called during package definition that hadn't been previously inferred, as long as there is a backedge linking it back to a method owned by a module being precompiled. (The latter condition ensures it will actually be called by package methods, and not merely transiently generated for the purpose of, e.g., metaprogramming or variable initialization.) This makes precompilation more intuitive (now it saves all relevant inference results), and substantially reduces latency for inference-bound packages. Closes JuliaLang#42016 Fixes JuliaLang#35972 Issue JuliaLang#35972 arose because codegen got started without re-inferring some discarded CodeInstances. This forced the compiler to insert a `jl_invoke`. This PR fixes the issue because needed CodeInstances are no longer discarded by precompilation.
This still seems to happen with FunctionWrappers. Here's my test package: module FWAllocs
using PrecompileTools
using FunctionWrappers
function test_precomp(N)
fw = FunctionWrappers.FunctionWrapper{Float64, Tuple{Float64}}(cos)
res = 0.0
for i in 1:N
res += fw(i*0.1)
end
return res
end
function test_no_precomp(N)
fw = FunctionWrappers.FunctionWrapper{Float64, Tuple{Float64}}(cos)
res = 0.0
for i in 1:N
res += fw(i*0.1)
end
return res
end
@setup_workload begin
@compile_workload begin
test_precomp(10)
end
end
end and here's what happens on Julia 1.10.4: julia> using FWAllocs
julia> @time FWAllocs.test_precomp(10)
0.053401 seconds (96.14 k allocations: 5.794 MiB, 99.86% compilation time)
8.177847573818267
julia> @time FWAllocs.test_precomp(10)
0.000004 seconds (22 allocations: 376 bytes)
8.177847573818267
julia> @time FWAllocs.test_precomp(20)
0.000011 seconds (42 allocations: 696 bytes)
8.377322108212503
julia> @time FWAllocs.test_no_precomp(20)
0.000007 seconds (2 allocations: 56 bytes)
8.377322108212503
julia> Also, is it odd that |
I confirmed that the julia> versioninfo()
Julia Version 1.12.0-DEV.738
Commit 67c9989e83 (2024-06-15 01:40 UTC)
Build Info:
Official https://julialang.org/ release
Platform Info:
OS: Windows (x86_64-w64-mingw32)
CPU: 8 × 11th Gen Intel(R) Core(TM) i7-1165G7 @ 2.80GHz
WORD_SIZE: 64
LLVM: libLLVM-17.0.6 (ORCJIT, tigerlake)
Threads: 1 default, 0 interactive, 1 GC (on 8 virtual cores) Since the original problem with this issue seems to have been resolved, I think there is something else causing it. |
Of course, since |
If I make a package
M
with this content ofsrc/M.jl
:and run
the result is
Without
precompile(f, (Float64,))
the excessive allocations go away.The text was updated successfully, but these errors were encountered: