-
-
Notifications
You must be signed in to change notification settings - Fork 5.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Failure to inline length(::SimpleVector)
when the inference result is Core.Const
#37230
Conversation
`fieldcount` is not evaluated at compile time, and it gets called by the `getindex` and `setindex!` methods so runs for each array-access. This circumvents the performance problem by evaluating the condition at compile time and storing it as a type parameter.
It seems that julia> @generated _fieldcount(::Type{T}) where {T} = fieldcount(T)
_fieldcount (generic function with 1 method)
julia> @code_warntype _fieldcount(Rational)
Variables
#self#::Core.Compiler.Const(_fieldcount, false)
#unused#::Type{Rational}
Body::Int64
1 ─ return 2
julia> @code_warntype fieldcount(Rational)
Variables
#self#::Core.Compiler.Const(fieldcount, false)
t@_2::Type{Rational}
names::Any
@_4::Int64
types::Any
abstr::Bool
t@_7::Any
@_8::Bool
@_9::Bool
@_10::Bool
@_11::Bool
Body::Int64
1 ── nothing
│ (t@_7 = t@_2)
│ Core.NewvarNode(:(names))
│ Core.NewvarNode(:(@_4))
│ Core.NewvarNode(:(types))
│ Core.NewvarNode(:(abstr))
│ %7 = (t@_7::Type{Rational} isa Base.UnionAll)::Bool
└─── goto #3 if not %7
2 ── (@_8 = %7)
└─── goto #4
3 ── (@_8 = t@_7::Type{Rational} isa Base.Union)
4 ┄─ goto #8 if not @_8
5 ── (t@_7 = Base.argument_datatype(t@_7::Type{Rational}))
│ %14 = (t@_7 === Base.nothing)::Bool
└─── goto #7 if not %14
6 ── %16 = Base.ArgumentError("type does not have a definite number of fields")::ArgumentError
└─── Base.throw(%16)
7 ┄─ (t@_7 = Core.typeassert(t@_7, Base.DataType))
└─── goto #10
8 ── %20 = t@_7::Type{Rational}::Type{Rational}
│ %21 = Core.apply_type(Base.Union)::Core.Compiler.Const(Union{}, false)
│ %22 = (%20 == %21)::Bool
└─── goto #10 if not %22
9 ── %24 = Base.ArgumentError("The empty type does not have a well-defined number of fields since it does not have instances.")::ArgumentError
└─── Base.throw(%24)
10 ┄ %26 = (t@_7::Union{DataType, Type{Rational}} isa Base.DataType)::Bool
│ %27 = !%26::Bool
└─── goto #12 if not %27
11 ─ %29 = Base.TypeError(:fieldcount, Base.DataType, t@_7::Type{Rational})::Core.Compiler.PartialStruct(TypeError, Any[Core.Compiler.Const(:fieldcount, false), String, Core.Compiler.Const(DataType, false), Type{Rational}])
└─── Base.throw(%29)
12 ┄ %31 = Base.getproperty(t@_7::DataType, :name)::Core.TypeName
│ %32 = (%31 === Base.NamedTuple_typename)::Bool
└─── goto #21 if not %32
13 ─ %34 = Base.getproperty(t@_7::DataType, :parameters)::Core.SimpleVector
│ %35 = Base.indexed_iterate(%34, 1)::Tuple{Any,Int64}
│ (names = Core.getfield(%35, 1))
│ (@_4 = Core.getfield(%35, 2))
│ %38 = Base.indexed_iterate(%34, 2, @_4)::Tuple{Any,Int64}
│ (types = Core.getfield(%38, 1))
│ %40 = (names isa Base.Tuple)::Bool
└─── goto #15 if not %40
14 ─ %42 = Base.length(names::Tuple)::Int64
└─── return %42
15 ─ %44 = (types isa Base.DataType)::Bool
└─── goto #17 if not %44
16 ─ (@_9 = types::DataType <: Base.Tuple)
└─── goto #18
17 ─ (@_9 = false)
18 ┄ goto #20 if not @_9
19 ─ %50 = Base.fieldcount(types)::Int64
└─── return %50
20 ─ (abstr = true)
└─── goto #28
21 ─ %54 = Base.getproperty(t@_7::DataType, :abstract)::Bool
└─── goto #23 if not %54
22 ─ (@_10 = %54)
└─── goto #27
23 ─ %58 = Base.getproperty(t@_7::DataType, :name)::Core.TypeName
│ %59 = Base.getproperty(Base.Tuple, :name)::Core.Compiler.Const(Tuple, false)
│ %60 = (%58 === %59)::Bool
└─── goto #25 if not %60
24 ─ (@_11 = Base.isvatuple(t@_7::DataType))
└─── goto #26
25 ─ (@_11 = false)
26 ┄ (@_10 = @_11)
27 ┄ (abstr = @_10)
28 ┄ goto #30 if not abstr
29 ─ %68 = Base.ArgumentError("type does not have a definite number of fields")::ArgumentError
└─── Base.throw(%68)
30 ┄ %70 = Base.isdefined(t@_7::DataType, :types)::Bool
└─── goto #32 if not %70
31 ─ %72 = Base.getproperty(t@_7::DataType, :types)::Core.SimpleVector
│ %73 = Base.length(%72)::Int64
└─── return %73
32 ─ %75 = Base.getproperty(t@_7::DataType, :name)::Core.TypeName
│ %76 = Base.getproperty(%75, :names)::Core.SimpleVector
│ %77 = Base.length(%76)::Int64
└─── return %77 (Technically |
If the compiler can't evaluate fieldcount at compile time anymore, that's a bug we should look into. |
It seems to be intentional though: Line 648 in e0cc82b
|
|
Codecov Report
@@ Coverage Diff @@
## master #37230 +/- ##
=======================================
Coverage 86.15% 86.15%
=======================================
Files 356 356
Lines 66950 66950
=======================================
Hits 57682 57682
Misses 9268 9268 Continue to review full report at Codecov.
|
Shall I drop the change to the |
I looked at this a bit more; check out the first few lines of julia> using FixedPointNumbers, ColorTypes
julia> code_typed(Base._getindex_ra, (Base.ReinterpretArray{N0f8,1,Gray{N0f8},Vector{Gray{N0f8}}}, Int64, Tuple{}))
1-element Vector{Any}:
CodeInfo(
1 ── goto #4 if not true
2 ── invoke Base.fieldcount($(Expr(:static_parameter, 1))::Any)::Core.Const(1, false)
│ invoke Base.fieldcount($(Expr(:static_parameter, 3))::Any)::Core.Const(1, false)
└─── goto #4 if not false
3 ── nothing::Nothing
4 ┄─ goto #9 if not $(Expr(:boundscheck))
5 ── %7 = Core.tuple(i1)::Tuple{Int64}
│ %8 = Base.getfield(a, :parent)::Vector{Gray{N0f8}} So, the compiler knows that the answer will be 1, but it still doesn't eliminate the calls and you can see from the first statement that they always get executed. |
Any chance this is #37147? (I haven't bisected, that's just a guess.) |
Maybe it thinks the function could throw? |
julia> code_typed(fieldcount, Tuple{Type{Base.RefValue{Int}}})
1-element Vector{Any}:
CodeInfo(
1 ── goto #3 if not false
2 ── nothing::Nothing
3 ┄─ goto #5 if not false
4 ── nothing::Nothing
5 ┄─ goto #7 if not false
6 ── nothing::Nothing
7 ┄─ %7 = Base.not_int(true)::Bool
└─── goto #9 if not %7
8 ── nothing::Nothing
9 ┄─ goto #11 if not false
10 ─ nothing::Nothing
11 ┄ %12 = π (t@_2, Core.Const(Base.RefValue{Int64}, false))
│ %13 = Base.getfield(%12, :abstract)::Bool
└─── goto #13 if not %13
12 ─ goto #16
13 ─ goto #15 if not false
14 ─ nothing::Nothing
15 ┄ %18 = π (false, Core.Const(false, false))
16 ┄ %19 = φ (#12 => %13, #15 => %18)::Bool
└─── goto #18 if not %19
17 ─ %21 = Base.ArgumentError("type does not have a definite number of fields")::Any
│ Base.throw(%21)::Union{}
└─── unreachable
18 ─ %24 = π (t@_2, Core.Const(Base.RefValue{Int64}, false))
│ %25 = Base.getfield(%24, :types)::Core.SimpleVector
│ %26 = Base.length(%25)::Core.Const(1, false)
└─── return %26
) => Int64 Note that the |
You can try: --- a/base/compiler/tfuncs.jl
+++ b/base/compiler/tfuncs.jl
@@ -28,6 +28,7 @@ const DATATYPE_NAME_FIELDINDEX = fieldindex(DataType, :name)
const DATATYPE_PARAMETERS_FIELDINDEX = fieldindex(DataType, :parameters)
const DATATYPE_TYPES_FIELDINDEX = fieldindex(DataType, :types)
const DATATYPE_SUPER_FIELDINDEX = fieldindex(DataType, :super)
+const DATATYPE_ABSTRACT_FIELDINDEX = fieldindex(DataType, :abstract)
const DATATYPE_MUTABLE_FIELDINDEX = fieldindex(DataType, :mutable)
const DATATYPE_INSTANCE_FIELDINDEX = fieldindex(DataType, :instance)
@@ -611,6 +612,7 @@ is_dt_const_field(fld::Int) = (
fld == DATATYPE_PARAMETERS_FIELDINDEX ||
fld == DATATYPE_TYPES_FIELDINDEX ||
fld == DATATYPE_SUPER_FIELDINDEX ||
+ fld == DATATYPE_ABSTRACT_FIELDINDEX ||
fld == DATATYPE_MUTABLE_FIELDINDEX ||
fld == DATATYPE_INSTANCE_FIELDINDEX
) Not sure if it's missing from this list for a good reason, although I can't think of one. |
Hm: julia> Int.abstract
false
julia> Int.abstract = true
true
julia> Int.abstract
true although this also happens on 1.5. But on 1.5, bits of julia> code_typed(Base._getindex_ra, (Base.ReinterpretArray{N0f8,1,Gray{N0f8},Vector{Gray{N0f8}}}, Int64, Tuple{}); debuginfo=:source)
1-element Array{Any,1}:
CodeInfo(
@ reinterpretarray.jl:144 within `_getindex_ra'
1 ── %1 = $(Expr(:static_parameter, 1))::Core.Compiler.Const(Normed{UInt8,8}, false)
│ ┌ @ reflection.jl:719 within `fieldcount'
│ │ %2 = π (%1, Type{Normed{UInt8,8}})
│ │┌ @ Base.jl:28 within `getproperty'
│ ││ %3 = Base.getfield(%2, :abstract)::Bool
│ │└
└───│ goto #3 if not %3
2 ──│ goto #4
3 ──│ nothing::Nothing
4 ┄─│ %7 = φ (#2 => %3, #3 => false)::Bool
│ │ @ reflection.jl:721 within `fieldcount'
└───│ goto #6 if not %7
│ @ reflection.jl:722 within `fieldcount'
│┌ @ boot.jl:288 within `ArgumentError'
5 ──││ %9 = %new(Core.ArgumentError, "type does not have a definite number of fields")::ArgumentError
│ │└
│ │ Base.throw(%9)::Union{}
└───│ $(Expr(:unreachable))::Union{}
│ @ reflection.jl:725 within `fieldcount'
6 ┄─│ %12 = π (%1, Type{Normed{UInt8,8}})
│ │┌ @ Base.jl:28 within `getproperty'
│ ││ %13 = Base.getfield(%12, :types)::Core.SimpleVector
│ │└
│ │┌ @ essentials.jl:594 within `length'
│ ││ %14 = $(Expr(:gc_preserve_begin, :(%13)))
│ ││ @ essentials.jl:595 within `length'
│ ││┌ @ pointer.jl:147 within `pointer_from_objref'
│ │││ $(Expr(:foreigncall, :(:jl_value_ptr), Ptr{Nothing}, svec(Any), 0, :(:ccall), :(%13)))::Ptr{Nothing}
│ ││└
│ ││ @ essentials.jl:596 within `length'
│ ││ $(Expr(:gc_preserve_end, :(%14)))
│ │└
└───│ goto #7
└
... so it actually looks like a change in inlining. It's also worth noting that on 1.5, much of |
Datatypes are technically mutable, but practically not. I also can't think of a good read why the |
@martinholters, that definitely fixes part of the performance problem, as julia> @btime rand(Gray{N0f8}, 10^3);
59.179 μs (1 allocation: 1.06 KiB) which is about 4x better but still bad compared to 1.5: julia> @btime rand(Gray{N0f8}, 10^3);
3.495 μs (1 allocation: 1.06 KiB) I can see calls to
so the lack of inlining seems to be the other problem. Adding a julia> @btime rand(Gray{N0f8}, 10^3);
49.107 μs (1 allocation: 1.06 KiB)
julia> code_typed(Base._getindex_ra, (Base.ReinterpretArray{N0f8,1,Gray{N0f8},Vector{Gray{N0f8}}}, Int64, Tuple{}); debuginfo=:source)
1-element Vector{Any}:
CodeInfo(
@ reinterpretarray.jl:144 within `_getindex_ra'
1 ── goto #4 if not true
2 ── %2 = $(Expr(:static_parameter, 1))::Core.Const(N0f8, false)
│ ┌ @ reflection.jl:678 within `fieldcount'
│ │ %3 = π (%2, Type{N0f8})
│ │┌ @ Base.jl:28 within `getproperty'
│ ││ %4 = Base.getfield(%3, :types)::Core.SimpleVector
│ │└
│ │ Base.length(%4)::Int64
│ └
│ %6 = $(Expr(:static_parameter, 3))::Core.Const(Gray{N0f8}, false)
│ ┌ @ reflection.jl:678 within `fieldcount'
│ │ %7 = π (%6, Type{Gray{N0f8}})
│ │┌ @ Base.jl:28 within `getproperty'
│ ││ %8 = Base.getfield(%7, :types)::Core.SimpleVector
│ │└
│ │ Base.length(%8)::Int64
│ └
└─── goto #4 if not false
3 ── nothing::Nothing
└
@ reinterpretarray.jl:147 within `_getindex_ra'
... (where line 147 is the next block after the fieldcount), so it got rid of everything but the length. However, when you profile it you discover that My guess is that on 1.5 somehow all this gets hoisted out of the loop in the callee? And runs only once? But even better would be for it to not run at all. |
Would it be reasonable to mark |
It's constant if initialized. We should be able to work with that. |
Isn't everything in const_datatype_getfield_tfunc(@nospecialize(sv), fld::Int) =
isdefined(sv, fld) ? Const(getfield(sv, fld)) : Union{} ? |
Actually, I think inference should probably just initialize the field of it wants to look at it and then treat it as const. No reason not to really. |
Also worth noting: it never uses the result of |
OK, so the difference seems to be that 1.5 is willing to inline the call to julia> code_typed(fieldcount, (Type{VersionNumber},))
1-element Vector{Any}:
CodeInfo(
1 ── goto #3 if not false
2 ── nothing::Nothing
3 ┄─ goto #5 if not false
4 ── nothing::Nothing
5 ┄─ goto #7 if not false
6 ── nothing::Nothing
7 ┄─ %7 = Base.not_int(true)::Bool
└─── goto #9 if not %7
8 ── nothing::Nothing
9 ┄─ goto #11 if not false
10 ─ nothing::Nothing
11 ┄ %12 = π (t@_2, Core.Const(VersionNumber, false))
│ %13 = Base.getfield(%12, :abstract)::Bool
└─── goto #13 if not %13
12 ─ goto #16
13 ─ goto #15 if not false
14 ─ nothing::Nothing
15 ┄ %18 = π (false, Core.Const(false, false))
16 ┄ %19 = φ (#12 => %13, #15 => %18)::Bool
└─── goto #18 if not %19
17 ─ %21 = Base.ArgumentError("type does not have a definite number of fields")::Any
│ Base.throw(%21)::Union{}
└─── unreachable
18 ─ %24 = π (t@_2, Core.Const(VersionNumber, false))
│ %25 = Base.getfield(%24, :types)::Core.SimpleVector
│ %26 = Base.length(%25)::Core.Const(5, false)
└─── return %26
) => Int64 and here's 1.5: julia> code_typed(fieldcount, (Type{VersionNumber},))
1-element Array{Any,1}:
CodeInfo(
1 ── goto #3 if not false
2 ── nothing::Nothing
3 ┄─ goto #5 if not false
4 ── nothing::Nothing
5 ┄─ goto #7 if not false
6 ── nothing::Nothing
7 ┄─ %7 = Base.not_int(true)::Bool
└─── goto #9 if not %7
8 ── nothing::Nothing
9 ┄─ goto #11 if not false
10 ─ nothing::Nothing
11 ┄ %12 = π (t@_2, Core.Compiler.Const(VersionNumber, false))
│ %13 = Base.getfield(%12, :abstract)::Bool
└─── goto #13 if not %13
12 ─ goto #16
13 ─ goto #15 if not false
14 ─ nothing::Nothing
15 ┄ %18 = π (false, Core.Compiler.Const(false, false))
16 ┄ %19 = φ (#12 => %13, #15 => %18)::Bool
└─── goto #18 if not %19
17 ─ %21 = %new(Core.ArgumentError, "type does not have a definite number of fields")::ArgumentError
│ Base.throw(%21)::Union{}
└─── $(Expr(:unreachable))::Union{}
18 ┄ %24 = π (t@_2, Core.Compiler.Const(VersionNumber, false))
│ %25 = Base.getfield(%24, :types)::Core.SimpleVector
│ %26 = $(Expr(:gc_preserve_begin, :(%25)))
│ %27 = $(Expr(:foreigncall, :(:jl_value_ptr), Ptr{Nothing}, svec(Any), 0, :(:ccall), :(%25)))::Ptr{Nothing}
│ %28 = Base.bitcast(Ptr{Int64}, %27)::Ptr{Int64}
│ %29 = Base.pointerref(%28, 1, 1)::Int64
│ $(Expr(:gc_preserve_end, :(%26)))
└─── return %29
) => Int64 Any thoughts on how to go about fixing that? Note it's left as a The cost model doesn't seem to complain about julia> Base.print_statement_costs(length, (Core.SimpleVector,))
length(v::Core.SimpleVector) in Base at essentials.jl:601
1 ─ 0 %1 = $(Expr(:gc_preserve_begin, Core.Argument(2)))
│ 20 %2 = $(Expr(:foreigncall, :(:jl_value_ptr), Ptr{Nothing}, svec(Any), 0, :(:ccall), Core.Argument(2)))::Ptr{Nothing}
│ 1 %3 = Base.bitcast(Ptr{Int64}, %2)::Ptr{Int64}
│ 4 %4 = Base.pointerref(%3, 1, 1)::Int64
│ 0 $(Expr(:gc_preserve_end, :(%1)))
└── 0 return %4 so all seems well there. |
Ah... it’s this issue. I also noticed it on another branch. The length I lines just fine but not when a constant return value was inferred. |
With --- a/base/compiler/abstractinterpretation.jl
+++ b/base/compiler/abstractinterpretation.jl
@@ -982,7 +982,7 @@ function abstract_call_known(interp::AbstractInterpreter, @nospecialize(f),
return CallMeta(abstract_call_known(interp, <:, fargs, argtypes, sv).rt, false)
elseif la == 2 && isa(argtypes[2], Const) && isa(argtypes[2].val, SimpleVector) && istopfunction(f, :length)
# mark length(::SimpleVector) as @pure
- return CallMeta(Const(length(argtypes[2].val)), false)
+ return CallMeta(Const(length(argtypes[2].val)), nothing)
elseif la == 3 && isa(argtypes[2], Const) && isa(argtypes[3], Const) &&
isa(argtypes[2].val, SimpleVector) && isa(argtypes[3].val, Int) && istopfunction(f, :getindex)
# mark getindex(::SimpleVector, i::Int) as @pure on top of #37293 I get julia> code_typed(fieldcount, (Type{VersionNumber},))
1-element Vector{Any}:
CodeInfo(
1 ── goto #3 if not false
2 ── nothing::Nothing
3 ┄─ goto #5 if not false
4 ── nothing::Nothing
5 ┄─ goto #7 if not false
6 ── nothing::Nothing
7 ┄─ %7 = Base.not_int(true)::Bool
└─── goto #9 if not %7
8 ── nothing::Nothing
9 ┄─ goto #11 if not false
10 ─ nothing::Nothing
11 ┄ goto #13 if not false
12 ─ nothing::Nothing
13 ┄ goto #15 if not false
14 ─ nothing::Nothing
15 ┄ %16 = π (false, Core.Const(false, false))
└─── goto #17 if not %16
16 ─ nothing::Nothing
17 ┄ %19 = π (t@_2, Core.Const(VersionNumber, false))
│ %20 = Base.getfield(%19, :types)::Core.SimpleVector
│ %21 = $(Expr(:gc_preserve_begin, :(%20)))
│ %22 = $(Expr(:foreigncall, :(:jl_value_ptr), Ptr{Nothing}, svec(Any), 0, :(:ccall), :(%20)))::Ptr{Nothing}
│ %23 = Base.bitcast(Ptr{Int64}, %22)::Ptr{Int64}
│ %24 = Base.pointerref(%23, 1, 1)::Int64
│ $(Expr(:gc_preserve_end, :(%21)))
└─── return %24
) => Int64 This also goes a long way performance-wise: julia> @btime rand(Gray{N0f8}, 10^3);
3.645 μs (1 allocation: 1.06 KiB) # #37293 + diff above
4.817 μs (1 allocation: 1.06 KiB) # 1.5.1
355.971 μs (1 allocation: 1.06 KiB) # master (41e603e583) However, I'm not sure whether setting the |
It seems that the constant return value was lost? |
True, it should just hardwire the result into the caller. |
It seems, but it isn't. There's just no place to put/annotate it. But note that it looks the same on 1.5. And: julia> foo() = fieldcount(VersionNumber) + 1
foo (generic function with 1 method)
julia> @code_typed foo()
CodeInfo(
1 ─ %1 = Main.VersionNumber::Core.Const(VersionNumber, false)
│ %2 = π (%1, Type{VersionNumber})
│ %3 = Base.getfield(%2, :types)::Core.SimpleVector
│ %4 = $(Expr(:gc_preserve_begin, :(%3)))
│ $(Expr(:foreigncall, :(:jl_value_ptr), Ptr{Nothing}, svec(Any), 0, :(:ccall), :(%3)))::Ptr{Nothing}
│ $(Expr(:gc_preserve_end, :(%4)))
└── return 6
) => Int64 Note the Furthermore: julia> foo() = length(VersionNumber.types) + 1
foo (generic function with 1 method)
julia> @code_llvm foo()
; @ REPL[4]:1 within `foo'
define i64 @julia_foo_306() {
top:
; ┌ @ Base.jl:28 within `getproperty'
%0 = load %jl_value_t*, %jl_value_t** inttoptr (i64 140179246653912 to %jl_value_t**), align 8
%1 = icmp eq %jl_value_t* %0, null
br i1 %1, label %fail, label %pass
fail: ; preds = %top
call void @jl_throw(%jl_value_t* inttoptr (i64 140179251335200 to %jl_value_t*))
unreachable
pass: ; preds = %top
; └
ret i64 6
} It think there is an isdefined check left from accessing |
No, this is an explicit marker not to inline. The idea was that if you know it's const, you also know that it's nothrow, and ideally we wouldn't have inlining to method queries that inference doesn't do. It's fine to change for now, but maybe add a TODO comment to put it back once we have the ability to remember that it's nothrow. |
fieldcount
at constructionlength(::SimpleVector)
when the inference result is Core.Const
Apparently I'd added this a couple days later in #37333, but didn't connect the dots back to here. |
EDIT: analysis reveals that this ~100x performance regression is due to a new failure to inline
length(::SimpleVector)
when the result isCore.Const
. While this was submitted as a PR it's now basically an issue report.JuliaGraphics/ColorTypes.jl#122 has noted that the performance of ReinterpretArrays can be pretty awful in some circumstances. I profiled and noticed that most of the time is taken by
fieldcount
. I'm guessing that in @Keno's implementation he was expecting it to be elided at compile time; if this happened previously, it doesn't seem to be happening now.This works around the problem by doing that evaluation at construction time, and by doing it only once you largely eliminate the cost. That said, if someone would prefer to have the fieldcount evaluate at compile time, that would be great too.