Performance regression in `vcat(::Vector, ::Number)` in v1.9 #50831

nickrobinson251 · 2023-08-08T11:08:23Z

Seems to be fixed in v1.10

v1.8

julia> VERSION
v"1.8.4"

julia> v = collect(Any, 1:1000);

julia> i = 1;

julia> using BenchmarkTools

julia> @btime vcat($v, $i);
  515.272 ns (1 allocation: 8.00 KiB)

julia> @which vcat(v, i)
vcat(A::Union{Number, LinearAlgebra.AbstractTriangular{T, A} where {T, A<:(Matrix)}, LinearAlgebra.Adjoint{<:Any, <:Vector}, LinearAlgebra.Bidiagonal, LinearAlgebra.Diagonal, LinearAlgebra.Hermitian{T, A} where {T, A<:(Matrix)}, LinearAlgebra.SymTridiagonal, LinearAlgebra.Symmetric{T, A} where {T, A<:(Matrix)}, LinearAlgebra.Transpose{<:Any, <:Vector}, LinearAlgebra.Tridiagonal, Matrix, Vector}...) in LinearAlgebra at /Applications/Julia-1.8.app/Contents/Resources/julia/share/julia/stdlib/v1.8/LinearAlgebra/src/special.jl:419

v1.9

julia> VERSION
v"1.9.2"

julia> v = collect(Any, 1:1000);

julia> i = 1;

julia> using BenchmarkTools

julia> @btime vcat($v, $i);
  1.050 μs (13 allocations: 8.28 KiB)

julia> @which vcat(v, i)
vcat(A::Union{Number, Matrix, Vector}...)
     @ Base array.jl:1957

v1.10

julia> VERSION
v"1.10.0-beta1"

julia> v = collect(Any, 1:1000);

julia> i = 1;

julia> using BenchmarkTools

julia> @btime vcat($v, $i);
  521.382 ns (1 allocation: 8.00 KiB)

julia> @which vcat(v, i)
vcat(A::Union{Number, AbstractArray}...)
     @ Base abstractarray.jl:1990

nickrobinson251 · 2023-08-08T11:24:15Z

silly workaround is to replace i with [i]

julia> VERSION
v"1.8.4"

julia> i_vec = [i];

julia> @btime vcat($v, $i);
  515.092 ns (1 allocation: 8.00 KiB)

julia> @btime vcat($v, $i_vec);
  517.818 ns (1 allocation: 8.00 KiB)
  
julia> @btime vcat($v, [$i]);
  532.624 ns (2 allocations: 8.06 KiB)

julia> VERSION
v"1.9.2"

julia> i_vec = [i];

julia> @btime vcat($v, $i);
  1.054 μs (13 allocations: 8.28 KiB)

julia> @btime vcat($v, $i_vec);
  516.581 ns (1 allocation: 8.00 KiB)
  
julia> @btime vcat($v, [$i]);
  543.651 ns (2 allocations: 8.06 KiB)

julia> VERSION
v"1.10.0-beta1"

julia> @btime vcat($v, $i);
  523.656 ns (1 allocation: 8.00 KiB)

julia> @btime vcat($v, $i_vec);
  522.568 ns (1 allocation: 8.00 KiB)
  
julia> @btime vcat($v, [$i]);
  544.312 ns (2 allocations: 8.06 KiB)

EDIT: hmm, although this workaround doesn't seem to recover performance in my real use-case (which is a little tricky to reduce down... will work on it)

NHDaly · 2023-08-08T16:03:27Z

Can we also try this as a workaround?:

julia> @btime push!(copy($v), $i);
  454.703 ns (2 allocations: 24.81 KiB)

EDIT: thanks, i meant push not append 👍

nickrobinson251 · 2023-08-08T16:32:18Z

yes, push!(copy($v), $i) recovers the time on this example (and seems to be a better workaround, as it also recovers the time on the real use-case!)

So does append!(copy($v), $i) (although that relies on i having a length, which it does in this example but not in 1 of the real use-cases where i found this)

julia> VERSION
v"1.8.4"

julia> @btime vcat($v, $i);
  503.625 ns (1 allocation: 8.00 KiB)

julia> @btime push!(copy($v), $i);
  472.148 ns (2 allocations: 24.81 KiB)
  
julia> @btime append!(copy($v), $i);
  472.862 ns (2 allocations: 24.81 KiB)

julia> VERSION
v"1.9.2"

julia> @btime vcat($v, $i);
  1.058 μs (13 allocations: 8.28 KiB)

julia> @btime push!(copy($v), $i);
  483.031 ns (2 allocations: 24.81 KiB)
  
julia> @btime append!(copy($v), $i);
  483.087 ns (2 allocations: 24.81 KiB)

julia> VERSION
v"1.10.0-beta1"

julia> @btime vcat($v, $i);
  521.052 ns (1 allocation: 8.00 KiB)

julia> @btime push!(copy($v), $i);
  573.370 ns (2 allocations: 24.81 KiB)

julia> @btime append!(copy($v), $i);
  579.710 ns (2 allocations: 24.81 KiB)

NHDaly · 2023-08-08T17:54:12Z

Bisect points to 3f04640, which backported #48933, as introducing this regression for 1.9.

I'll try again to see if i can get bisect to tell me where it was fixed.

NHDaly · 2023-08-08T17:58:33Z

(CC: @staticfloat for the above)

staticfloat · 2023-08-08T18:15:55Z

@dkarrasch may know, as he authored the original PR.

nickrobinson251 · 2023-08-09T13:39:01Z

btw, for our real-use we just rolled our own function for now (and i'm sure there are other good ways to write this same thing):

function vcat_element(v::Vector{T}, x::T) where T
    len = length(v)
    new_v = Vector{T}(undef, len + 1)
    unsafe_copyto!(new_v, 1, v, 1, len)
    new_v[end] = x
    return new_v
end

nickrobinson251 · 2023-08-09T13:39:55Z

Also, i see a few other vcat perf issues being reported for recent releases e.g.

Should we add some more vcat-related benchmarks somewhere?

NHDaly · 2023-08-10T02:46:53Z

More benchmarks sounds good.

Also sorry I forgot to report back.. Bisect results were confusing as to finding which commit fixed this on the 1.10 branch. After looking more closely at the bisect log, I realized why it was confusing:

The original commit, e6c84a1, didn't cause a regression on the 1.10 branch! The regression only occurred on the 1.9 branch during the backport.

So either there was a bug during backporting, or somehow this commit was depending on something only present in 1.10 to make it effective.

NHDaly · 2023-08-10T02:47:22Z

I'm not sure how best to find out what commits in 1.10 were needed for the original commit to not cause a regression? I guess we could do a bisect that replays the backport onto various commits inbetween 1.9 and 1.10 to see when it wouldn't have caused a regression?

NHDaly · 2023-08-10T16:21:55Z

Okay, I ran that bisect! Here's the bisect scripts I used, in case it's helpful to anyone in the future:

# Usage: ./bisect-script.sh <julia script>

# assert that the script is given
if [ -z "$1" ]; then
    echo "Usage: ./bisect-script.sh <julia script>"
    exit 1
fi

git cherry-pick -m1 e6c84a1eae1f75cad1c62f62f2074c0f1bf124d1  # <-- this was the extra part I added for this
make clean
make -j || (make cleanall; make -j)
./julia --startup=no $1 && (
    git reset --hard HEAD~  # cleanup <-- also added this
    exit 0
) || (
    git reset --hard HEAD~  # cleanup <-- and this
    exit 1
)

@info VERSION

const v = collect(Any, 1:1000);
const i = 1;

using Test

f(i) = vcat(v, i)
# warmup
@test length(f(i)) == 1001
# final actual test
# NOTE: Swapped the test while bisecting for the fix. "failing" means it's fixed.
@test @allocations(f(i)) > 5

NHDaly · 2023-08-10T16:23:21Z

Bisect pointed to 1543cdd as the first commit inbetween 1.9 and 1.10, where backporting the vcat change doesn't introduce a regression.

That's coming from #48720.

I guess adding these two lines is what prevented the regression?:

+checkindex(::Type{Bool}, inds::IdentityUnitRange, i::Real) = checkindex(Bool, inds.indices, i)
+checkindex(::Type{Bool}, inds::OneTo{T}, i::T) where {T<:BitInteger} = unsigned(i - one(i)) < unsigned(last(inds))

NHDaly · 2023-08-10T16:25:36Z

The question now is how to proceed.

I'm very far away from this kind of performance tuning work. I see at least these two options:

Back-out the original change that introduced this regression in 1.9. My understanding is that this was a bug fix, though? So we probably don't want to do that?
Backport faster iteration over subarrays (views) #48720 into 1.9 as well, to resolve the perf regression.
- Is there a chance that this can itself cause more harm though? I obviously prefer avoiding introducing new changes, since they can have their own unintended side effects....

Thanks!!
Tagging @N5N3, @matthias314, @staticfloat, and @KristofferC who could maybe help decide how to proceed? Thanks

KristofferC · 2023-08-10T16:36:15Z

If it fixes it, backporting #48720 doesnt seem so bad.

NHDaly · 2023-08-10T17:36:39Z

I have confirmed that backporting #48720 fixes the regression locally. I'll mark it for backporting then. 👍

Thank you!

NHDaly · 2023-08-10T17:38:19Z

... Do you understand why it fixes the regression? That would probably be good to understand before backporting, to ensure that we aren't introducing any other complexities?

NHDaly · 2023-08-29T04:17:57Z

This should be fixed in 1.9.3. Leaving a note here for us to confirm.

nickrobinson251 · 2023-08-31T09:42:25Z

If I wanted to follow-up about adding some benchmarks related to vcat performance, where/how would i do that?

KristofferC · 2023-08-31T11:29:28Z

https://github.com/JuliaCI/BaseBenchmarks.jl probably

brenhinkeller added the regression 1.9 Regression in the 1.9 release label Aug 8, 2023

mbauman added the performance Must go faster label Aug 8, 2023

mbauman changed the title ~~Regression in vcat(::Vector, ::Number) in v1.9~~ Performance regression in vcat(::Vector, ::Number) in v1.9 Aug 8, 2023

NHDaly mentioned this issue Aug 10, 2023

faster iteration over subarrays (views) #48720

Merged

NHDaly mentioned this issue Aug 23, 2023

[release-1.9] Backports for Julia 1.9 #50977

Merged

31 tasks

vtjnash closed this as completed Aug 31, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Performance regression in `vcat(::Vector, ::Number)` in v1.9 #50831

Performance regression in `vcat(::Vector, ::Number)` in v1.9 #50831

nickrobinson251 commented Aug 8, 2023

nickrobinson251 commented Aug 8, 2023 •

edited

Loading

NHDaly commented Aug 8, 2023 •

edited

Loading

nickrobinson251 commented Aug 8, 2023 •

edited

Loading

NHDaly commented Aug 8, 2023 •

edited

Loading

NHDaly commented Aug 8, 2023

staticfloat commented Aug 8, 2023

nickrobinson251 commented Aug 9, 2023

nickrobinson251 commented Aug 9, 2023 •

edited

Loading

NHDaly commented Aug 10, 2023

NHDaly commented Aug 10, 2023

NHDaly commented Aug 10, 2023

NHDaly commented Aug 10, 2023

NHDaly commented Aug 10, 2023

KristofferC commented Aug 10, 2023 •

edited by oscardssmith

Loading

NHDaly commented Aug 10, 2023

NHDaly commented Aug 10, 2023

NHDaly commented Aug 29, 2023

nickrobinson251 commented Aug 31, 2023

KristofferC commented Aug 31, 2023

Performance regression in vcat(::Vector, ::Number) in v1.9 #50831

Performance regression in vcat(::Vector, ::Number) in v1.9 #50831

Comments

nickrobinson251 commented Aug 8, 2023

nickrobinson251 commented Aug 8, 2023 • edited Loading

NHDaly commented Aug 8, 2023 • edited Loading

nickrobinson251 commented Aug 8, 2023 • edited Loading

NHDaly commented Aug 8, 2023 • edited Loading

NHDaly commented Aug 8, 2023

staticfloat commented Aug 8, 2023

nickrobinson251 commented Aug 9, 2023

nickrobinson251 commented Aug 9, 2023 • edited Loading

NHDaly commented Aug 10, 2023

NHDaly commented Aug 10, 2023

NHDaly commented Aug 10, 2023

NHDaly commented Aug 10, 2023

NHDaly commented Aug 10, 2023

KristofferC commented Aug 10, 2023 • edited by oscardssmith Loading

NHDaly commented Aug 10, 2023

NHDaly commented Aug 10, 2023

NHDaly commented Aug 29, 2023

nickrobinson251 commented Aug 31, 2023

KristofferC commented Aug 31, 2023

Performance regression in `vcat(::Vector, ::Number)` in v1.9 #50831

Performance regression in `vcat(::Vector, ::Number)` in v1.9 #50831

nickrobinson251 commented Aug 8, 2023 •

edited

Loading

NHDaly commented Aug 8, 2023 •

edited

Loading

nickrobinson251 commented Aug 8, 2023 •

edited

Loading

NHDaly commented Aug 8, 2023 •

edited

Loading

nickrobinson251 commented Aug 9, 2023 •

edited

Loading

KristofferC commented Aug 10, 2023 •

edited by oscardssmith

Loading