Allow inlining methods with unmatched type parameters #45062

ianatol · 2022-04-22T22:55:38Z

Rebase of #44656 to solve some inference regressions in #44803

As Keno predicted, there are some regressions in this PR as well, including quite a few test suite failures that I will be fixing up.

Additionally, some future work is needed to teach SROA how to fold svec_ref more generally, as the current handling is very specific.

The original description follows below.

Currently we do not allow inlining any methods that have unmatched
type parameters. The original reason for this restriction is that
I didn't really know what to put for an inlined :static_parameter,
so I just had inlining bail. As a result, in code like:

f(x) = Val{x}()

the call to Val{x}() would not be inlined unless x was known
through constant propagation.
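As a minimal sketch of the situation described above (not taken from the PR itself), the snippet below shows that the type parameter of `Val{x}()` is only known at run time when `x` is an ordinary argument. The names `f` and `rt` are illustrative, and the exact inferred type printed may vary between Julia versions.

```julia
# `Val{x}()` dispatches to the constructor method `Val{x}() where x`.
# When `x` is only known to be an `Int`, the method's static parameter
# cannot be determined at compile time: it is "unmatched".
f(x) = Val{x}()

# At run time the parameter is of course known.
@assert f(3) === Val{3}()

# Inference can only conclude a non-concrete return type for `f(::Int)`.
rt = only(Base.return_types(f, (Int,)))
println("inferred return type of f(::Int): ", rt)
@assert !isconcretetype(rt)
```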

This PR attempts to remedy that. A new builtin is added that computes
the static parameters for a given method/argument list. Additionally,
sroa gains the ability to simplify and fold this builtin. As a result,
inlining can insert an expression that computes the correct values
for the inlinee's static parameters.
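To illustrate what "static parameters for a given method/argument list" means, here is a hedged sketch that asks the existing method-matching machinery for them at the type level. It does not call the new builtin; `elparam` and `sparams_for` are hypothetical names, and `Base._methods_by_ftype` is an internal, unexported function whose behavior may change between Julia versions.

```julia
# A method with one static parameter `T`.
elparam(::Vector{T}) where {T} = T

# Hypothetical helper: find the single method match for an argument tuple
# type and return the static parameter values computed for that match.
function sparams_for(tt::Type)
    matches = Base._methods_by_ftype(tt, -1, Base.get_world_counter())
    return only(matches).sparams   # a Core.SimpleVector
end

# Fully matched: the argument type determines `T`.
matched = sparams_for(Tuple{typeof(elparam), Vector{Int}})
@assert matched[1] === Int

# Unmatched: `Vector` leaves `T` open, so the entry is still a TypeVar.
# This is the case in which the value has to be computed at run time.
unmatched = sparams_for(Tuple{typeof(elparam), Vector})
@assert unmatched[1] isa TypeVar
```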

The change benchmarks favorably:

Before:

julia> function foo()
          for i = 1:10000
              Base.donotdelete(Val{i}())
          end
       end
foo (generic function with 1 method)

julia> @time foo()
  0.375567 seconds (4.24 M allocations: 274.440 MiB, 14.67% gc time, 72.96% compilation time)

julia> @time foo()
  0.012387 seconds (9.49 k allocations: 148.266 KiB)

After:

julia> function foo()
          for i = 1:10000
              Base.donotdelete(Val{i}())
          end
       end
foo (generic function with 1 method)

julia> @time foo()
  0.003058 seconds (29.47 k allocations: 1.546 MiB)

julia> @time foo()
  0.001200 seconds (9.49 k allocations: 148.266 KiB)

Note that this particular benchmark could also be fixed by #44654,
but this change is more general.

There is a potential downside, which is that we remove a specialization
barrier here. We already do that in the case when all type parameters
are matched, so it's not egregious. However, there is anecdotal
knowledge in the community that extra type parameters force specialization.
Some of that is due to the handling of type parameters in the specialization
code, but some of it might also be due to inlining's prior refusal
to perform this inlining. We'll have to keep an eye out for any
regressions.
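For readers unfamiliar with the idiom mentioned above, the following sketch shows the commonly used pattern of adding a type parameter to force specialization on a function argument. It only demonstrates the two spellings and that they compute the same result; it does not measure specialization, and the function names are made up for illustration.

```julia
# Julia may choose not to specialize on a `Function` argument that is
# merely passed through to another call.
apply_plain(f, xs) = map(f, xs)

# Naming the argument's type with a type parameter is the usual way
# to request specialization on it.
apply_forced(f::F, xs) where {F} = map(f, xs)

@assert apply_plain(abs, [-1, 2]) == [1, 2]
@assert apply_forced(abs, [-1, 2]) == [1, 2]
```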

ianatol · 2022-04-27T02:00:22Z

Re: partial reverting of #44512:

choltype from LinearAlgebra was causing BoundsErrors during test suite runs.

Looking further, it appeared that in the following IR,

Expr(:call, Main.eltype, Core.Argument(n=2)),
  Expr(:call, Main.oneunit, SSAValue(1)),
  Expr(:call, Main.sqrt, SSAValue(2)),
  Expr(:call, Main.typeof, SSAValue(3)),
  Expr(:call, Main.promote_type, SSAValue(4), Main.Float32),
  Core.ReturnNode(val=SSAValue(5))]

we were choosing to inline promote_type with the following match: promote_type(DataType, Type{Float32}) from promote_type(Type{T}, Type{T}) where {T}.

This triggered _compute_sparams where
m = promote_type(Type{T}, Type{T}) where {T}
tt = Tuple{typeof(Base.promote_type), Type{Float64}, Type{Float32}}, which resulted in env = svec(), and our BoundsError.

Keno helped me to figure out that this was due to inlining not being able to discern which signature was right due to the spec_types of promote_type(Type{T}, Type{T}) where {T} and promote_type(Type{T}, Type{S}) where {T,S} being identical in our case. However, DataType in promote_type(DataType, Type{Float32}) would later become Float64, meaning that this selection was actually incorrect.
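The ambiguity can be reproduced in miniature with two stand-in methods. This is a sketch with hypothetical functions named `pt` that mirror the two `promote_type` signatures; it does not touch the real `promote_type` or the inliner.

```julia
# Hypothetical stand-ins for the two signatures discussed above.
pt(::Type{T}, ::Type{T}) where {T} = :diagonal    # both arguments are the same type
pt(::Type{T}, ::Type{S}) where {T,S} = :general   # any two types

# Knowing only (DataType, Type{Float32}) does not settle which method runs.
candidates = methods(pt, (DataType, Type{Float32}))
println("applicable methods: ", length(candidates))

# The runtime value of the first argument decides.
@assert pt(Float32, Float32) === :diagonal
@assert pt(Float64, Float32) === :general
```

Committing to the diagonal method at compile time is therefore only valid if the first argument later turns out to be `Float32`.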

By bringing back this branch that was removed in #44512, we were able to be a bit more fine-grained about when we could bypass validating sparams (as well as choosing not to inline when we had a direct match of sig and spec_types).

ianatol · 2022-04-27T20:10:49Z

@nanosoldier runtests(ALL, vs = ":master")

nanosoldier · 2022-04-28T03:57:32Z

Your package evaluation job has completed - possible new issues were detected. A full report can be found here.

ianatol · 2022-04-28T16:19:11Z

@nanosoldier runtests(["AlgebraicPetri", "AlphaStableDistributions", "Earth2014", "Evolutionary", "FameSVD", "FlashWeave", "FractionalSystems", "GeoDatasets", "GeometricFlux", "Glimmer", "GraphSignals", "GridapEmbedded", "GridapGmsh", "Hecke", "Infernal", "InformationGeometry", "IntensityScans", "JSONLines", "JetPack", "Kahuna", "LoggingExtras", "MatrixMarket", "Metida", "MultiScaleTreeGraph", "NeuralArithmetic", "NeuralOperators", "ODEInterface", "ODEInterfaceDiffEq", "OptimKit", "OptimizationAlgorithms", "Oracle", "Org", "PDENLPModels", "ParallelAnalysis", "PhaseSpaceTools", "Plasma", "PoreMatMod", "Quadrature", "Relief", "StochasticRounding", "VIDA", "ValueShapes", "YAAD"], vs = ":master")

nanosoldier · 2022-04-28T17:31:31Z

Your package evaluation job has completed - possible new issues were detected. A full report can be found here.

ianatol · 2022-04-28T22:27:50Z

The failures in GeometricFlux, GraphSignals, GridapEmbedded, GridapGmsh, JetPack, NeuralOperators, Org, PDENLPModels, and ValueShapes are all caused by the same issue with FillArrays, where we end up calling compute_sparams on

Tuple{Type{FillArrays.Fill{T, N, Axes} where Axes where N where T}, T, Vararg{Integer, N}} where N where T
and
Tuple{Type{FillArrays.Fill{T, N, Axes} where Axes where N where T}, Float32, Tuple{Int64}}

This seems similar to the case described above, but I am still investigating where the possibly incorrect inlining choice happens.

ianatol · 2022-05-02T21:53:57Z

@nanosoldier runtests(["Evolutionary", "GeometricFlux", "GraphSignals", "GridapEmbedded", "GridapGmsh", "Hecke", "InformationGeometry", "JetPack", "Metida", "NeuralOperators", "ODEInterface", "ODEInterfaceDiffEq", "Org", "PDENLPModels", "ValueShapes"], vs = ":master")

nanosoldier · 2022-05-02T23:05:22Z

Your package evaluation job has completed - possible new issues were detected. A full report can be found here.

ianatol · 2022-05-05T00:12:06Z

@nanosoldier runtests(["Hecke", "Metida", "ODEInterface", "Org"], vs = ":master")

ianatol · 2022-05-05T00:15:56Z

@aviatesk Could I ask for your review here? Specifically on 3a6b36c where I revert some of #44512, as well as on 884e5ea, where I somewhat broadly restrict the cases in which we can perform this inlining. It feels like we could do better, but I'm not familiar enough with this part of the codebase to fully understand

nanosoldier · 2022-05-05T01:20:31Z

Your package evaluation job has completed - possible new issues were detected. A full report can be found here.

ianatol · 2022-05-05T22:52:42Z

Re: test failures in Metida and ODEInterface:

The ccalls and cfunctions showing up in these tests do not have their types properly calculated during inlining, because we enter ssa_substitute_op! while spvals is still an SSAValue.

My idea is that we can do something similar to svec_ref, where we insert a node (in this case a call to jl_instantiate_type_in_env) while spvals is still an SSAValue. Then, before emitting the substituted ccall, we replace spvals in our inserted nodes with our calculation of the intersection.

aviatesk · 2022-05-06T05:44:13Z

> @aviatesk Could I ask for your review here? Specifically on 3a6b36c where I revert some of #44512, as well as on 884e5ea, where I somewhat broadly restrict the cases in which we can perform this inlining. It feels like we could do better, but I'm not familiar enough with this part of the codebase to fully understand

Sure, I will look into this later today or the weekend.

aviatesk · 2022-05-06T09:09:24Z

@nanosoldier runbenchmarks(!"scalar", vs=":master")

aviatesk · 2022-05-06T09:14:48Z

test/compiler/inline.jl

+# basic tests for inlining of `apply_type` in the presence of unmatched type parameters
+f44656(x) = Val{x}()
+
+function g44656()
+    for i = 1:10000
+        Base.donotdelete(Val{i}())
+    end
+end
+
+let srcs = (code_typed1(f44656, (Any,)),
+            code_typed1(g44656))
+    for src in srcs
+        @test count(isnew, src.code) == 1
+        @test count(iscall((src, Core.apply_type), src.code)) == 0
+    end
+end


We should add more test cases. E.g. since this PR adds a change on the inlining pass for constant prop'ed callsite, we should have a corresponding test case.

aviatesk · 2022-05-06T09:23:46Z

> By bringing back this branch that was removed in #44512, we were able to be a bit more fine-grained about when we could bypass validating sparams (as well as choosing not to inline when we had a direct match of sig and spec_types).

#44512 removed the previous (somewhat random, imho) special casing, and you can recover some of it if needed. But I'd like to revert it in a way that at least lets us understand which optimizable cases handled by #44512 are now ignored and which are still optimized. As far as I understand, what is essentially needed is the only_method information specifically, and that can co-exist with the code after #44512?

nanosoldier · 2022-05-06T15:32:22Z

Your benchmark job has completed - possible performance regressions were detected. A full report can be found here.

vtjnash · 2022-09-09T18:14:57Z

base/compiler/ssair/inlining.jl

+            nonva_args = argexprs[1:end-1]
+            va_arg = argexprs[end]
+            tuple_call = Expr(:call, TOP_TUPLE, def, nonva_args...)
+            tuple_type = tuple_tfunc(Any[argextype(arg, compact) for arg in nonva_args])
+            tupl = insert_node_here!(compact, NewInstruction(tuple_call, tuple_type, topline))
+            apply_iter_expr = Expr(:call, Core._apply_iterate, iterate, Core._compute_sparams, tupl, va_arg)
+            sparam_vals = insert_node_here!(compact,
+                effect_free(NewInstruction(apply_iter_expr, SimpleVector, topline)))


This seems like it would add significant cost (allocations and new DataTypes) that was not present before inlining. Are we sure this is profitable?

Prior to the fix_va_argexprs! call that you have above, the argexprs list was required to have a known length and did not have or need a va_arg.

Keno: Addressed in #46700.
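For context on the `Core._apply_iterate` call in the quoted diff: it is the builtin that argument splatting lowers to. The sketch below uses `tuple` as a stand-in for the callee and made-up values for the fixed and vararg parts; it only demonstrates the calling convention, not the inlining pass.

```julia
fixed = (1, 2)   # stands in for the tuple of non-vararg arguments
rest = (3, 4)    # stands in for the trailing vararg container

# `callee(fixed..., rest...)` is lowered to a call of this builtin.
via_builtin = Core._apply_iterate(iterate, tuple, fixed, rest)

@assert via_builtin === (1, 2, 3, 4)
@assert via_builtin === tuple(fixed..., rest...)
```

Building the intermediate tuple and iterating over it is the extra work the review comment is asking about.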

vtjnash · 2022-09-09T18:19:39Z

base/compiler/ssair/inlining.jl

+        elseif head === :foreigncall && isa(spvals, SimpleVector)
            @assert !isa(spsig, UnionAll) || !isempty(spvals)


Shouldn't this be an assert? If it managed to get this far but didn't run this fixup code here, it will generate corrupt code and possibly segfault later at runtime.

Keno: I'm assuming you're talking about the isa? If so, I don't think so: there can be :foreigncalls that got inlined from a callee that don't need the fixup here (though it's not harmful either).

vtjnash · 2022-09-12T17:31:45Z

base/compiler/ssair/passes.jl

@@ -720,6 +720,97 @@ function perform_lifting!(compact::IncrementalCompact,
    return stmt_val # N.B. should never happen
 end

+function lift_svec_ref!(compact::IncrementalCompact, idx::Int, stmt::Expr)


This should probably be removed and replaced with an inference tfunc for _svec_ref, since it is rather unfortunate (generally) when optimization ends up with a more precise answer than inference; it tends to confuse and annoy people, since it is usually less reliable and accurate.

Keno: This lifts svec_ref(_compute_sparams(...), ...), which is inserted by inlining and does not exist at inference time.

vtjnash: It seems possible that the optimizer could have already lifted that static_parameter expr into a typeof call inside the method before it reached here. Is the optimizer not accurately marking the svec_ref with the result type constant (if known from inference) when it inserts it? It still seems confusing that inference couldn't determine this static_parameter value when the IR was simple, but optimization could figure it out after the IR was made more complicated.

Keno: If the optimizer knows the type of the static parameter, it shouldn't insert the svec_ref (it currently does on master, but that's a bug fixed in #46703). Note that the optimizer here is not figuring out the value of the static parameters, but merely an SSAValue that will contain that value at runtime, which is a different analysis.

vtjnash: Ah, okay, I thought this was attempting to forward the type. It makes more sense that this is attempting to lift the computation of T through a call that just gets the parameter back directly, e.g. eltype(A{T}) -> T.
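A small user-level sketch of the pattern being lifted: a value is wrapped into a type parameter and a callee immediately reads it back through a static parameter. The function names are hypothetical, and the snippet shows only the semantics the optimization must preserve, not the pass itself.

```julia
# Reads the parameter back out of a `Val` through the static parameter `T`.
unwrap(::Val{T}) where {T} = T

# Wraps a runtime value in a type parameter and immediately unwraps it.
roundtrip(x) = unwrap(Val{x}())

# Semantically this is the identity on valid type-parameter values,
# which is what lifting the static parameter lookup exploits.
@assert roundtrip(5) === 5
@assert roundtrip(:a) === :a
```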

vtjnash · 2022-09-12T17:33:48Z

base/compiler/ssair/passes.jl

+    m = m.val
+    isa(m, Method) || return nothing
+    # TODO: More general structural analysis of the intersection
+    length(def.args) >= 3 || return nothing


This check must come earlier in the function, before you access def.args[2].

vtjnash · 2022-09-12T17:47:34Z

base/compiler/ssair/passes.jl

+    i = findfirst(j->has_typevar(sig.parameters[j], tvar), 1:length(sig.parameters))
+    i === nothing && return nothing
+    _any(j->has_typevar(sig.parameters[j], tvar), i+1:length(sig.parameters)) && return nothing
+
+    arg = sig.parameters[i]
+    isa(arg, DataType) || return nothing
+
+    rarg = def.args[2 + i]
+    isa(rarg, SSAValue) || return nothing
+    argdef = compact[rarg][:inst]
+    if isexpr(argdef, :new)
+        rarg = argdef.args[1]
+        isa(rarg, SSAValue) || return nothing
+        argdef = compact[rarg][:inst]
+    end
+
+    is_known_call(argdef, Core.apply_type, compact) || return nothing
+    length(argdef.args) == 3 || return nothing
+
+    applyT = argextype(argdef.args[2], compact)
+    isa(applyT, Const) || return nothing
+    applyT = applyT.val
+
+    isa(applyT, UnionAll) || return nothing
+    applyTvar = applyT.var
+    applyTbody = applyT.body
+
+    isa(applyTbody, DataType) || return nothing
+    applyTbody.name == arg.name || return nothing
+    length(applyTbody.parameters) == length(arg.parameters) == 1 || return nothing
+    applyTbody.parameters[1] === applyTvar || return nothing
+    arg.parameters[1] === tvar || return nothing
+    return argdef.args[3]


I need to think about this logic more, but this code needs substantial comments about the intent here. As written, I don't think the structural assumptions here are valid outside of trivial common cases (though it appears to try to limit itself to only the most trivial of trivial cases). But with such sparse comments, it is hard to review whether it is doing only what it intends and not including cases that it won't handle properly.
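To make the structural checks in the quoted snippet easier to follow, here is a sketch that applies the same comparisons to concrete reflection objects for the simplest case, a method taking `Val{T}` called with `Val{x}()`. The name `valparam` is hypothetical, and this is my reading of the snippet's intent rather than a statement of what the pass guarantees.

```julia
valparam(::Val{T}) where {T} = T

m = only(methods(valparam))
sig = m.sig                                     # Tuple{typeof(valparam), Val{T}} where T
tvar = sig.var                                  # the method's static parameter T
arg = Base.unwrap_unionall(sig).parameters[2]   # Val{T}, a DataType

applyT = Val              # the constant first argument of `Core.apply_type(Val, x)`
@assert applyT isa UnionAll
body = applyT.body        # Val{x}, where x is `applyT.var`

# The same comparisons as in the snippet:
@assert arg isa DataType && body isa DataType
@assert body.name === arg.name                           # same type constructor
@assert length(body.parameters) == length(arg.parameters) == 1
@assert body.parameters[1] === applyT.var                # Val{x} uses its own variable directly
@assert arg.parameters[1] === tvar                       # the signature uses T directly
```

When all of these hold, the second argument of the `apply_type` call is exactly the value bound to `T`, which is why the snippet returns `argdef.args[3]`.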

vtjnash · 2022-09-12T17:48:48Z

base/compiler/ssair/passes.jl

+            end
+            return
+        elseif is_known_call(def, Core._compute_sparams, compact)
+            res = _lift_svec_ref(def, compact)


How can you lift an svec_ref for valI without including the valI for the index of the value that you are referencing as an argument?

Keno: It currently assumes valI == 1 and that the svec is inbounds, but yeah, that should be checked (and generalized in the future).

vtjnash · 2022-09-12T20:05:08Z

base/compiler/ssair/passes.jl

+
+    if isa(vec, SimpleVector)
+        if valI <= length(val)
+            compact[idx] = vec[valI]


This is missing a QuoteNode
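The following sketch shows, at the surface AST level, why a value spliced into code generally needs a `QuoteNode`. It is an analogy under the assumption that a static parameter value can be a `Symbol` (as in `Val{:a}`); it does not reproduce the compiler's IR handling.

```julia
# A bare Symbol in argument position is read as a variable reference.
unquoted = Expr(:call, :identity, :not_a_defined_variable)
threw = try
    eval(unquoted)
    false
catch err
    err isa UndefVarError
end
@assert threw

# Wrapped in a QuoteNode, the same Symbol is treated as a literal value.
quoted = Expr(:call, :identity, QuoteNode(:not_a_defined_variable))
@assert eval(quoted) === :not_a_defined_variable
```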

`Core._svec_ref` has accepted a `boundscheck` value as its first argument since it was added in #45062. Nonetheless, `Core._svec_ref` simply calls `jl_svec_ref` in both the interpreter and codegen, so the `boundscheck` value isn't used by any optimization. Worse, this `boundscheck` argument negatively influences the effect analysis (xref #50167 for details) and has caused type inference regressions, as reported in #50544. For these reasons, this commit simply eliminates the `boundscheck` argument from `Core._svec_ref`. Consequently, `getindex(::SimpleVector, ::Int)` is now concrete-eval eligible. Closes #50544.
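As a brief illustration of the user-facing operation involved, the snippet below indexes a `SimpleVector` through the ordinary `getindex` path and shows the `BoundsError` that an empty one produces. It deliberately avoids calling `Core._svec_ref` directly, since its argument list differs between Julia versions.

```julia
sv = Core.svec(Int, Float64)
@assert sv[2] === Float64

# Indexing an empty SimpleVector fails with a BoundsError.
empty_sv = Core.svec()
@assert length(empty_sv) == 0
out_of_bounds = try
    empty_sv[1]
    false
catch err
    err isa BoundsError
end
@assert out_of_bounds
```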

The deleted branch was added in #45062, although it had not been tested. I tried the following diff to find cases optimized by that, but I just found the handling proved to be in vain in all cases I tried.

```diff
diff --git a/base/compiler/ssair/inlining.jl b/base/compiler/ssair/inlining.jl
index 318b21b09b..7e42a65aa4 100644
--- a/base/compiler/ssair/inlining.jl
+++ b/base/compiler/ssair/inlining.jl
@@ -1473,6 +1473,14 @@ function compute_inlining_cases(@nospecialize(info::CallInfo), flag::UInt32, sig
         handle_any_const_result!(cases,
             result, match, argtypes, info, flag, state; allow_abstract=true, allow_typevars=true)
         fully_covered = handled_all_cases = match.fully_covers
+        if length(cases) == 1 && fully_covered
+            println("first case: ", only_method)
+        elseif length(cases) == 1
+            atype = argtypes_to_type(sig.argtypes)
+            if atype isa DataType && cases[1].sig isa DataType
+                println("second case: ", only_method)
+            end
+        end
     elseif !handled_all_cases
         # if we've not seen all candidates, union split is valid only for dispatch tuples
         filter!(case::InliningCase->isdispatchtuple(case.sig), cases)
```

Circa #45062 and #46975

ianatol added the compiler:optimizer Optimization passes (mostly in base/compiler/ssair/) label Apr 23, 2022

ianatol force-pushed the kf/sparaminline branch from c3da8e6 to b2ea6d1 Compare April 27, 2022 01:30

ianatol force-pushed the kf/sparaminline branch 2 times, most recently from a8dbd6a to cc1f9e0 Compare May 2, 2022 21:53

aviatesk self-requested a review May 6, 2022 05:44

aviatesk self-assigned this May 6, 2022

aviatesk reviewed May 6, 2022

ianatol force-pushed the kf/sparaminline branch 3 times, most recently from 3b7c5d8 to 5e92b7d Compare May 10, 2022 00:04

ianatol force-pushed the kf/sparaminline branch from 1c585d1 to fa66298 Compare May 27, 2022 21:27

ianatol force-pushed the kf/sparaminline branch from fa66298 to 8a02c91 Compare June 29, 2022 18:47

aviatesk mentioned this pull request Sep 6, 2022

inlining: relax finalizer inlining control-flow restriction #46651

Merged

vtjnash reviewed Sep 9, 2022

vtjnash reviewed Sep 12, 2022

ianatol mentioned this pull request Sep 19, 2022

Minor cleanup to follow up 45062 #46832

Merged

vtjnash mentioned this pull request Oct 19, 2022

WIP: Allow inlining method matches with unmatched type parameters #44656

Closed

aviatesk mentioned this pull request Dec 14, 2022

Reduce invalidations when loading JuliaData packages #47889

Merged

t-bltg mentioned this pull request Mar 9, 2023

Simple type-unstable code: 150x regression wrt 1.8 #48612

Closed

N5N3 mentioned this pull request Mar 27, 2023

1.9-rc1 regression - StaticArray allocates Core.SimpleVector objects, runtime dispatch #49145

Open

maleadt mentioned this pull request Jun 11, 2023

order of magnitude type-unstable performance regression #50130

Open

aviatesk mentioned this pull request Jul 15, 2023

remove :boundscheck argument from Core._svec_ref #50561

Merged

aviatesk mentioned this pull request Nov 9, 2023

inlining: remove ineffective handling for unmatched params #52092

Merged

vtjnash added a commit that referenced this pull request Dec 5, 2023

delete unused code from simplevector

1b4f67f

Circa #45062 and #46975

vtjnash mentioned this pull request Dec 5, 2023

delete unused code from simplevector #52412

Merged

aviatesk pushed a commit that referenced this pull request Dec 6, 2023

delete unused code from simplevector (#52412)

b5abac4

Circa #45062 and #46975

aviatesk pushed a commit that referenced this pull request Dec 6, 2023

delete unused code from simplevector (#52412)

c2186fe

Circa #45062 and #46975
