Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Failure to inline length(::SimpleVector) when the inference result is Core.Const #37230

Closed
wants to merge 2 commits into from

Conversation

timholy
Copy link
Member

@timholy timholy commented Aug 27, 2020

EDIT: analysis reveals that this ~100x performance regression is due to a new failure to inline length(::SimpleVector) when the result is Core.Const. While this was submitted as a PR it's now basically an issue report.

JuliaGraphics/ColorTypes.jl#122 has noted that the performance of ReinterpretArrays can be pretty awful in some circumstances. I profiled and noticed that most of the time is taken by fieldcount. I'm guessing that in @Keno's implementation he was expecting it to be elided at compile time; if this happened previously, it doesn't seem to be happening now.

This works around the problem by doing that evaluation at construction time, and by doing it only once you largely eliminate the cost. That said, if someone would prefer to have the fieldcount evaluate at compile time, that would be great too.

`fieldcount` is not evaluated at compile time, and it gets called by the
`getindex` and `setindex!` methods so runs for each array-access.
This circumvents the performance problem by evaluating the condition at compile time
and storing it as a type parameter.
@simeonschaub
Copy link
Member

It seems that fieldcount explicitly has a @nospecialize annotation, which prevents constant prop here. I don't know why it was put in in the first place, but another way to fix this, if perhaps a bit hacky, would be to define the following generated function, which does constant propagate:

julia> @generated _fieldcount(::Type{T}) where {T} = fieldcount(T)
_fieldcount (generic function with 1 method)

julia> @code_warntype _fieldcount(Rational)
Variables
  #self#::Core.Compiler.Const(_fieldcount, false)
  #unused#::Type{Rational}

Body::Int64
1return 2

julia> @code_warntype fieldcount(Rational)
Variables
  #self#::Core.Compiler.Const(fieldcount, false)
  t@_2::Type{Rational}
  names::Any
  @_4::Int64
  types::Any
  abstr::Bool
  t@_7::Any
  @_8::Bool
  @_9::Bool
  @_10::Bool
  @_11::Bool

Body::Int64
1 ──       nothing
│          (t@_7 = t@_2)
│          Core.NewvarNode(:(names))
│          Core.NewvarNode(:(@_4))
│          Core.NewvarNode(:(types))
│          Core.NewvarNode(:(abstr))
│    %7  = (t@_7::Type{Rational} isa Base.UnionAll)::Bool
└───       goto #3 if not %7
2 ──       (@_8 = %7)
└───       goto #4
3 ──       (@_8 = t@_7::Type{Rational} isa Base.Union)
4 ┄─       goto #8 if not @_8
5 ──       (t@_7 = Base.argument_datatype(t@_7::Type{Rational}))
│    %14 = (t@_7 === Base.nothing)::Bool
└───       goto #7 if not %14
6 ── %16 = Base.ArgumentError("type does not have a definite number of fields")::ArgumentError
└───       Base.throw(%16)
7 ┄─       (t@_7 = Core.typeassert(t@_7, Base.DataType))
└───       goto #10
8 ── %20 = t@_7::Type{Rational}::Type{Rational}%21 = Core.apply_type(Base.Union)::Core.Compiler.Const(Union{}, false)
│    %22 = (%20 == %21)::Bool
└───       goto #10 if not %22
9 ── %24 = Base.ArgumentError("The empty type does not have a well-defined number of fields since it does not have instances.")::ArgumentError
└───       Base.throw(%24)
10%26 = (t@_7::Union{DataType, Type{Rational}} isa Base.DataType)::Bool%27 = !%26::Bool
└───       goto #12 if not %27
11%29 = Base.TypeError(:fieldcount, Base.DataType, t@_7::Type{Rational})::Core.Compiler.PartialStruct(TypeError, Any[Core.Compiler.Const(:fieldcount, false), String, Core.Compiler.Const(DataType, false), Type{Rational}])
└───       Base.throw(%29)
12%31 = Base.getproperty(t@_7::DataType, :name)::Core.TypeName%32 = (%31 === Base.NamedTuple_typename)::Bool
└───       goto #21 if not %32
13%34 = Base.getproperty(t@_7::DataType, :parameters)::Core.SimpleVector%35 = Base.indexed_iterate(%34, 1)::Tuple{Any,Int64}
│          (names = Core.getfield(%35, 1))
│          (@_4 = Core.getfield(%35, 2))
│    %38 = Base.indexed_iterate(%34, 2, @_4)::Tuple{Any,Int64}
│          (types = Core.getfield(%38, 1))
│    %40 = (names isa Base.Tuple)::Bool
└───       goto #15 if not %40
14%42 = Base.length(names::Tuple)::Int64
└───       return %42
15%44 = (types isa Base.DataType)::Bool
└───       goto #17 if not %44
16 ─       (@_9 = types::DataType <: Base.Tuple)
└───       goto #18
17 ─       (@_9 = false)
18 ┄       goto #20 if not @_9
19%50 = Base.fieldcount(types)::Int64
└───       return %50
20 ─       (abstr = true)
└───       goto #28
21%54 = Base.getproperty(t@_7::DataType, :abstract)::Bool
└───       goto #23 if not %54
22 ─       (@_10 = %54)
└───       goto #27
23%58 = Base.getproperty(t@_7::DataType, :name)::Core.TypeName%59 = Base.getproperty(Base.Tuple, :name)::Core.Compiler.Const(Tuple, false)
│    %60 = (%58 === %59)::Bool
└───       goto #25 if not %60
24 ─       (@_11 = Base.isvatuple(t@_7::DataType))
└───       goto #26
25 ─       (@_11 = false)
26 ┄       (@_10 = @_11)
27 ┄       (abstr = @_10)
28 ┄       goto #30 if not abstr
29%68 = Base.ArgumentError("type does not have a definite number of fields")::ArgumentError
└───       Base.throw(%68)
30%70 = Base.isdefined(t@_7::DataType, :types)::Bool
└───       goto #32 if not %70
31%72 = Base.getproperty(t@_7::DataType, :types)::Core.SimpleVector%73 = Base.length(%72)::Int64
└───       return %73
32%75 = Base.getproperty(t@_7::DataType, :name)::Core.TypeName%76 = Base.getproperty(%75, :names)::Core.SimpleVector%77 = Base.length(%76)::Int64
└───       return %77

(Technically fieldcount is a generic function, so it could theoretically be overloaded by the user and invalidate the generated function, but if you are doing that, you're already asking for trouble.)

@Keno
Copy link
Member

Keno commented Aug 27, 2020

If the compiler can't evaluate fieldcount at compile time anymore, that's a bug we should look into.

@simeonschaub
Copy link
Member

It seems to be intentional though:

function fieldcount(@nospecialize t)

@Keno
Copy link
Member

Keno commented Aug 27, 2020

@nospecialize prevents compiling, it doesn't prevent inference from looking through it.

@codecov-commenter
Copy link

Codecov Report

Merging #37230 into master will not change coverage.
The diff coverage is n/a.

Impacted file tree graph

@@           Coverage Diff           @@
##           master   #37230   +/-   ##
=======================================
  Coverage   86.15%   86.15%           
=======================================
  Files         356      356           
  Lines       66950    66950           
=======================================
  Hits        57682    57682           
  Misses       9268     9268           

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update e0cc82b...f35c762. Read the comment docs.

@timholy
Copy link
Member Author

timholy commented Aug 27, 2020

Shall I drop the change to the ReinterpretArray struct and let you fix it in the compiler? The change in specialization of the error messages is independently useful.

@timholy
Copy link
Member Author

timholy commented Aug 28, 2020

I looked at this a bit more; check out the first few lines of _getindex_ra:

julia> using FixedPointNumbers, ColorTypes

julia> code_typed(Base._getindex_ra, (Base.ReinterpretArray{N0f8,1,Gray{N0f8},Vector{Gray{N0f8}}}, Int64, Tuple{}))
1-element Vector{Any}:
 CodeInfo(
1 ──       goto #4 if not true
2 ──       invoke Base.fieldcount($(Expr(:static_parameter, 1))::Any)::Core.Const(1, false)
│          invoke Base.fieldcount($(Expr(:static_parameter, 3))::Any)::Core.Const(1, false)
└───       goto #4 if not false
3 ──       nothing::Nothing
4 ┄─       goto #9 if not $(Expr(:boundscheck))
5 ── %7  = Core.tuple(i1)::Tuple{Int64}%8  = Base.getfield(a, :parent)::Vector{Gray{N0f8}}

So, the compiler knows that the answer will be 1, but it still doesn't eliminate the calls and you can see from the first statement that they always get executed.

@timholy
Copy link
Member Author

timholy commented Aug 28, 2020

Any chance this is #37147? (I haven't bisected, that's just a guess.)

@KristofferC
Copy link
Member

KristofferC commented Aug 28, 2020

Maybe it thinks the function could throw?

@martinholters
Copy link
Member

julia> code_typed(fieldcount, Tuple{Type{Base.RefValue{Int}}})
1-element Vector{Any}:
 CodeInfo(
1 ──       goto #3 if not false
2 ──       nothing::Nothing
3 ┄─       goto #5 if not false
4 ──       nothing::Nothing
5 ┄─       goto #7 if not false
6 ──       nothing::Nothing
7 ┄─ %7  = Base.not_int(true)::Bool
└───       goto #9 if not %7
8 ──       nothing::Nothing
9 ┄─       goto #11 if not false
10nothing::Nothing
11%12 = π (t@_2, Core.Const(Base.RefValue{Int64}, false))
│    %13 = Base.getfield(%12, :abstract)::Bool
└───       goto #13 if not %13
12 ─       goto #16
13 ─       goto #15 if not false
14nothing::Nothing
15%18 = π (false, Core.Const(false, false))
16%19 = φ (#12 => %13, #15 => %18)::Bool
└───       goto #18 if not %19
17%21 = Base.ArgumentError("type does not have a definite number of fields")::Any
│          Base.throw(%21)::Union{}
└───       unreachable
18%24 = π (t@_2, Core.Const(Base.RefValue{Int64}, false))
│    %25 = Base.getfield(%24, :types)::Core.SimpleVector%26 = Base.length(%25)::Core.Const(1, false)
└───       return %26
) => Int64

Note that the Base.getfield(%12, :abstract) is not inferred as Const, so yes, this function is believed to possibly throw.

@martinholters
Copy link
Member

You can try:

--- a/base/compiler/tfuncs.jl
+++ b/base/compiler/tfuncs.jl
@@ -28,6 +28,7 @@ const DATATYPE_NAME_FIELDINDEX = fieldindex(DataType, :name)
 const DATATYPE_PARAMETERS_FIELDINDEX = fieldindex(DataType, :parameters)
 const DATATYPE_TYPES_FIELDINDEX = fieldindex(DataType, :types)
 const DATATYPE_SUPER_FIELDINDEX = fieldindex(DataType, :super)
+const DATATYPE_ABSTRACT_FIELDINDEX = fieldindex(DataType, :abstract)
 const DATATYPE_MUTABLE_FIELDINDEX = fieldindex(DataType, :mutable)
 const DATATYPE_INSTANCE_FIELDINDEX = fieldindex(DataType, :instance)
 
@@ -611,6 +612,7 @@ is_dt_const_field(fld::Int) = (
      fld == DATATYPE_PARAMETERS_FIELDINDEX ||
      fld == DATATYPE_TYPES_FIELDINDEX ||
      fld == DATATYPE_SUPER_FIELDINDEX ||
+     fld == DATATYPE_ABSTRACT_FIELDINDEX ||
      fld == DATATYPE_MUTABLE_FIELDINDEX ||
      fld == DATATYPE_INSTANCE_FIELDINDEX
     )

Not sure if it's missing from this list for a good reason, although I can't think of one.

@timholy
Copy link
Member Author

timholy commented Aug 28, 2020

Hm:

julia> Int.abstract
false

julia> Int.abstract = true
true

julia> Int.abstract
true

although this also happens on 1.5.

But on 1.5, bits of fieldcount get inlined into _getindex_ra:

julia> code_typed(Base._getindex_ra, (Base.ReinterpretArray{N0f8,1,Gray{N0f8},Vector{Gray{N0f8}}}, Int64, Tuple{}); debuginfo=:source)
1-element Array{Any,1}:
 CodeInfo(
     @ reinterpretarray.jl:144 within `_getindex_ra'
1 ── %1   = $(Expr(:static_parameter, 1))::Core.Compiler.Const(Normed{UInt8,8}, false)
│   ┌ @ reflection.jl:719 within `fieldcount'
│   │ %2   = π (%1, Type{Normed{UInt8,8}})
│   │┌ @ Base.jl:28 within `getproperty'
│   ││ %3   = Base.getfield(%2, :abstract)::Bool
│   │└
└───│        goto #3 if not %3
2 ──│        goto #4
3 ──│        nothing::Nothing
4 ┄─│ %7   = φ (#2 => %3, #3 => false)::Bool
│   │ @ reflection.jl:721 within `fieldcount'
└───│        goto #6 if not %7
    │ @ reflection.jl:722 within `fieldcount'
    │┌ @ boot.jl:288 within `ArgumentError'
5 ──││ %9   = %new(Core.ArgumentError, "type does not have a definite number of fields")::ArgumentError
│   │└
│   │        Base.throw(%9)::Union{}
└───│        $(Expr(:unreachable))::Union{}
    │ @ reflection.jl:725 within `fieldcount'
6 ┄─│ %12  = π (%1, Type{Normed{UInt8,8}})
│   │┌ @ Base.jl:28 within `getproperty'
│   ││ %13  = Base.getfield(%12, :types)::Core.SimpleVector
│   │└
│   │┌ @ essentials.jl:594 within `length'
│   ││ %14  = $(Expr(:gc_preserve_begin, :(%13)))
│   ││ @ essentials.jl:595 within `length'
│   ││┌ @ pointer.jl:147 within `pointer_from_objref'
│   │││        $(Expr(:foreigncall, :(:jl_value_ptr), Ptr{Nothing}, svec(Any), 0, :(:ccall), :(%13)))::Ptr{Nothing}
│   ││└
│   ││ @ essentials.jl:596 within `length'
│   ││        $(Expr(:gc_preserve_end, :(%14)))
│   │└
└───│        goto #7...

so it actually looks like a change in inlining.

It's also worth noting that on 1.5, much of code_native's output is traced to lines of fieldcount, so we're definitely not doing very well optimizing this.

@Keno
Copy link
Member

Keno commented Aug 28, 2020

Datatypes are technically mutable, but practically not. I also can't think of a good read why the abstract field would have been omitted from the list

@timholy
Copy link
Member Author

timholy commented Aug 28, 2020

@martinholters, that definitely fixes part of the performance problem, as fieldcount is known to not throw. However, it's still called. Now I get

julia> @btime rand(Gray{N0f8}, 10^3);
  59.179 μs (1 allocation: 1.06 KiB)

which is about 4x better but still bad compared to 1.5:

julia> @btime rand(Gray{N0f8}, 10^3);
  3.495 μs (1 allocation: 1.06 KiB)

I can see calls to fieldcount even in the native code:

; ┌ @ reinterpretarray.jl:144 within `_getindex_ra'
	movq	$8, (%rbx)
	movq	-15784(%rax), %rcx
	movq	%rcx, 8(%rbx)
	movq	%rbx, %rcx
	movq	%rcx, -15784(%rax)
	leaq	-15784(%rax), %r15
	movabsq	$fieldcount, %r13
	movabsq	$140046555294128, %rdi  # imm = 0x7F5F212E29B0
	vzeroupper
	callq	*%r13
	movabsq	$140046560702864, %rdi  # imm = 0x7F5F2180B190
	callq	*%r13

so the lack of inlining seems to be the other problem. Adding a @_inline_meta in the body of fieldcount gives me:

julia> @btime rand(Gray{N0f8}, 10^3);
  49.107 μs (1 allocation: 1.06 KiB)

julia> code_typed(Base._getindex_ra, (Base.ReinterpretArray{N0f8,1,Gray{N0f8},Vector{Gray{N0f8}}}, Int64, Tuple{}); debuginfo=:source)
1-element Vector{Any}:
 CodeInfo(
     @ reinterpretarray.jl:144 within `_getindex_ra'
1 ──       goto #4 if not true
2 ── %2  = $(Expr(:static_parameter, 1))::Core.Const(N0f8, false)
│   ┌ @ reflection.jl:678 within `fieldcount'
│   │ %3  = π (%2, Type{N0f8})
│   │┌ @ Base.jl:28 within `getproperty'
│   ││ %4  = Base.getfield(%3, :types)::Core.SimpleVector
│   │└
│   │       Base.length(%4)::Int64
│   └
│    %6  = $(Expr(:static_parameter, 3))::Core.Const(Gray{N0f8}, false)
│   ┌ @ reflection.jl:678 within `fieldcount'
│   │ %7  = π (%6, Type{Gray{N0f8}})
│   │┌ @ Base.jl:28 within `getproperty'
│   ││ %8  = Base.getfield(%7, :types)::Core.SimpleVector
│   │└
│   │       Base.length(%8)::Int64
│   └
└───       goto #4 if not false
3 ──       nothing::Nothing

     @ reinterpretarray.jl:147 within `_getindex_ra'
...

(where line 147 is the next block after the fieldcount), so it got rid of everything but the length. However, when you profile it you discover that length(T.types) is accounting for the majority of the runtime.

My guess is that on 1.5 somehow all this gets hoisted out of the loop in the callee? And runs only once? But even better would be for it to not run at all.

@timholy
Copy link
Member Author

timholy commented Aug 28, 2020

Would it be reasonable to mark .types const too? I don't even know what this field is for.

@yuyichao
Copy link
Contributor

types is the only one that is not constant. Ref #37184 #31877

@Keno
Copy link
Member

Keno commented Aug 28, 2020

It's constant if initialized. We should be able to work with that.

@martinholters
Copy link
Member

Isn't everything in DataType constant once it reaches Julia land? Can't we just do

const_datatype_getfield_tfunc(@nospecialize(sv), fld::Int) =
    isdefined(sv, fld) ? Const(getfield(sv, fld)) : Union{}

?

@Keno
Copy link
Member

Keno commented Aug 28, 2020

Actually, I think inference should probably just initialize the field of it wants to look at it and then treat it as const. No reason not to really.

@timholy
Copy link
Member Author

timholy commented Aug 28, 2020

Also worth noting: it never uses the result of length(t.types), yet it feels obliged to keep the call. It's also not inlined, despite being really cheap. I don't have time to look at this any more today, but I'm betting we need to look at the cost-computation of length(::SimpleVector). (Side note: we might want to put a utility function for the cost-analysis in https://docs.julialang.org/en/latest/devdocs/inference/#The-inlining-algorithm-(inline_worthy)-1 into compiler/utilities, and test it, just to keep it from bit-rotting.)

@timholy
Copy link
Member Author

timholy commented Aug 29, 2020

OK, so the difference seems to be that 1.5 is willing to inline the call to length(::SimpleVector) whereas master is not. Here's master (the interesting stuff starts at statement 26):

julia> code_typed(fieldcount, (Type{VersionNumber},))
1-element Vector{Any}:
 CodeInfo(
1 ──       goto #3 if not false
2 ──       nothing::Nothing
3 ┄─       goto #5 if not false
4 ──       nothing::Nothing
5 ┄─       goto #7 if not false
6 ──       nothing::Nothing
7 ┄─ %7  = Base.not_int(true)::Bool
└───       goto #9 if not %7
8 ──       nothing::Nothing
9 ┄─       goto #11 if not false
10nothing::Nothing
11%12 = π (t@_2, Core.Const(VersionNumber, false))
│    %13 = Base.getfield(%12, :abstract)::Bool
└───       goto #13 if not %13
12 ─       goto #16
13 ─       goto #15 if not false
14nothing::Nothing
15%18 = π (false, Core.Const(false, false))
16%19 = φ (#12 => %13, #15 => %18)::Bool
└───       goto #18 if not %19
17%21 = Base.ArgumentError("type does not have a definite number of fields")::Any
│          Base.throw(%21)::Union{}
└───       unreachable
18%24 = π (t@_2, Core.Const(VersionNumber, false))
│    %25 = Base.getfield(%24, :types)::Core.SimpleVector%26 = Base.length(%25)::Core.Const(5, false)
└───       return %26
) => Int64

and here's 1.5:

julia> code_typed(fieldcount, (Type{VersionNumber},))
1-element Array{Any,1}:
 CodeInfo(
1 ──       goto #3 if not false
2 ──       nothing::Nothing
3 ┄─       goto #5 if not false
4 ──       nothing::Nothing
5 ┄─       goto #7 if not false
6 ──       nothing::Nothing
7 ┄─ %7  = Base.not_int(true)::Bool
└───       goto #9 if not %7
8 ──       nothing::Nothing
9 ┄─       goto #11 if not false
10nothing::Nothing
11%12 = π (t@_2, Core.Compiler.Const(VersionNumber, false))
│    %13 = Base.getfield(%12, :abstract)::Bool
└───       goto #13 if not %13
12 ─       goto #16
13 ─       goto #15 if not false
14nothing::Nothing
15%18 = π (false, Core.Compiler.Const(false, false))
16%19 = φ (#12 => %13, #15 => %18)::Bool
└───       goto #18 if not %19
17%21 = %new(Core.ArgumentError, "type does not have a definite number of fields")::ArgumentError
│          Base.throw(%21)::Union{}
└───       $(Expr(:unreachable))::Union{}
18%24 = π (t@_2, Core.Compiler.Const(VersionNumber, false))
│    %25 = Base.getfield(%24, :types)::Core.SimpleVector%26 = $(Expr(:gc_preserve_begin, :(%25)))
│    %27 = $(Expr(:foreigncall, :(:jl_value_ptr), Ptr{Nothing}, svec(Any), 0, :(:ccall), :(%25)))::Ptr{Nothing}%28 = Base.bitcast(Ptr{Int64}, %27)::Ptr{Int64}%29 = Base.pointerref(%28, 1, 1)::Int64$(Expr(:gc_preserve_end, :(%26)))
└───       return %29
) => Int64

Any thoughts on how to go about fixing that? Note it's left as a :call statement, not an :invoke.

The cost model doesn't seem to complain about length(::SimpleVector). Using #37275,

julia> Base.print_statement_costs(length, (Core.SimpleVector,))
length(v::Core.SimpleVector) in Base at essentials.jl:601
10 %1 = $(Expr(:gc_preserve_begin, Core.Argument(2)))
│   20 %2 = $(Expr(:foreigncall, :(:jl_value_ptr), Ptr{Nothing}, svec(Any), 0, :(:ccall), Core.Argument(2)))::Ptr{Nothing}1 %3 = Base.bitcast(Ptr{Int64}, %2)::Ptr{Int64}4 %4 = Base.pointerref(%3, 1, 1)::Int640      $(Expr(:gc_preserve_end, :(%1)))
└──  0      return %4

so all seems well there.

@yuyichao
Copy link
Contributor

Ah... it’s this issue. I also noticed it on another branch. The length I lines just fine but not when a constant return value was inferred.

@timholy timholy added the regression Regression in behavior compared to a previous version label Aug 31, 2020
@timholy timholy added this to the 1.6 features milestone Aug 31, 2020
@martinholters
Copy link
Member

With

--- a/base/compiler/abstractinterpretation.jl
+++ b/base/compiler/abstractinterpretation.jl
@@ -982,7 +982,7 @@ function abstract_call_known(interp::AbstractInterpreter, @nospecialize(f),
         return CallMeta(abstract_call_known(interp, <:, fargs, argtypes, sv).rt, false)
     elseif la == 2 && isa(argtypes[2], Const) && isa(argtypes[2].val, SimpleVector) && istopfunction(f, :length)
         # mark length(::SimpleVector) as @pure
-        return CallMeta(Const(length(argtypes[2].val)), false)
+        return CallMeta(Const(length(argtypes[2].val)), nothing)
     elseif la == 3 && isa(argtypes[2], Const) && isa(argtypes[3], Const) &&
             isa(argtypes[2].val, SimpleVector) && isa(argtypes[3].val, Int) && istopfunction(f, :getindex)
         # mark getindex(::SimpleVector, i::Int) as @pure

on top of #37293 I get length to inline, too:

julia> code_typed(fieldcount, (Type{VersionNumber},))
1-element Vector{Any}:
 CodeInfo(
1 ──       goto #3 if not false
2 ──       nothing::Nothing
3 ┄─       goto #5 if not false
4 ──       nothing::Nothing
5 ┄─       goto #7 if not false
6 ──       nothing::Nothing
7 ┄─ %7  = Base.not_int(true)::Bool
└───       goto #9 if not %7
8 ──       nothing::Nothing
9 ┄─       goto #11 if not false
10nothing::Nothing
11 ┄       goto #13 if not false
12nothing::Nothing
13 ┄       goto #15 if not false
14nothing::Nothing
15%16 = π (false, Core.Const(false, false))
└───       goto #17 if not %16
16nothing::Nothing
17%19 = π (t@_2, Core.Const(VersionNumber, false))
│    %20 = Base.getfield(%19, :types)::Core.SimpleVector%21 = $(Expr(:gc_preserve_begin, :(%20)))
│    %22 = $(Expr(:foreigncall, :(:jl_value_ptr), Ptr{Nothing}, svec(Any), 0, :(:ccall), :(%20)))::Ptr{Nothing}%23 = Base.bitcast(Ptr{Int64}, %22)::Ptr{Int64}%24 = Base.pointerref(%23, 1, 1)::Int64$(Expr(:gc_preserve_end, :(%21)))
└───       return %24
) => Int64

This also goes a long way performance-wise:

julia> @btime rand(Gray{N0f8}, 10^3);
  3.645 μs (1 allocation: 1.06 KiB) # #37293 + diff above
  4.817 μs (1 allocation: 1.06 KiB) # 1.5.1
  355.971 μs (1 allocation: 1.06 KiB) # master (41e603e583)

However, I'm not sure whether setting the CallMeta.info to nothing instead of false has any unintended consequences here. @Keno?

@yuyichao
Copy link
Contributor

It seems that the constant return value was lost?

@timholy
Copy link
Member Author

timholy commented Aug 31, 2020

True, it should just hardwire the result into the caller.

@martinholters
Copy link
Member

It seems that the constant return value was lost?

It seems, but it isn't. There's just no place to put/annotate it. But note that it looks the same on 1.5. And:

julia> foo() = fieldcount(VersionNumber) + 1
foo (generic function with 1 method)

julia> @code_typed foo()
CodeInfo(
1%1 = Main.VersionNumber::Core.Const(VersionNumber, false)
│   %2 = π (%1, Type{VersionNumber})
│   %3 = Base.getfield(%2, :types)::Core.SimpleVector%4 = $(Expr(:gc_preserve_begin, :(%3)))
│        $(Expr(:foreigncall, :(:jl_value_ptr), Ptr{Nothing}, svec(Any), 0, :(:ccall), :(%3)))::Ptr{Nothing}$(Expr(:gc_preserve_end, :(%4)))
└──      return 6
) => Int64

Note the return 6 here.

Furthermore:

julia> foo() = length(VersionNumber.types) + 1
foo (generic function with 1 method)

julia> @code_llvm foo()

;  @ REPL[4]:1 within `foo'
define i64 @julia_foo_306() {
top:
; ┌ @ Base.jl:28 within `getproperty'
   %0 = load %jl_value_t*, %jl_value_t** inttoptr (i64 140179246653912 to %jl_value_t**), align 8
   %1 = icmp eq %jl_value_t* %0, null
   br i1 %1, label %fail, label %pass

fail:                                             ; preds = %top
   call void @jl_throw(%jl_value_t* inttoptr (i64 140179251335200 to %jl_value_t*))
   unreachable

pass:                                             ; preds = %top
; └
  ret i64 6
}

It think there is an isdefined check left from accessing types, but the length code is completely elided.

@Keno
Copy link
Member

Keno commented Aug 31, 2020

However, I'm not sure whether setting the CallMeta.info to nothing instead of false has any unintended consequences here. @Keno?

No, this is an explicit marker not to inline. The idea was that if you know it's const, you also know that it's nothrow, and ideally we wouldn't have inlining to method queries that inference doesn't do. It's fine to change for now, but maybe add a TODO comment to put it back once we have the ability to remember that it's nothrow.

@timholy timholy added the DO NOT MERGE Do not merge this PR! label Sep 13, 2020
@timholy timholy added the performance Must go faster label Oct 13, 2020
@timholy timholy changed the title ReinterpretArray: improve performance by evaluating fieldcount at construction Failure to inline length(::SimpleVector) when the inference result is Core.Const Oct 13, 2020
vtjnash added a commit that referenced this pull request Oct 27, 2020
@vtjnash
Copy link
Member

vtjnash commented Oct 27, 2020

once we have the ability to remember that it's nothrow

Apparently I'd added this a couple days later in #37333, but didn't connect the dots back to here.

vtjnash added a commit that referenced this pull request Oct 28, 2020
@timholy timholy deleted the teh/reinterparray branch November 3, 2020 22:56
@DilumAluthge DilumAluthge removed the DO NOT MERGE Do not merge this PR! label Jun 18, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
performance Must go faster regression Regression in behavior compared to a previous version
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants