Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Performance regression in vcat(::Vector, ::Number) in v1.9 #50831

Closed
nickrobinson251 opened this issue Aug 8, 2023 · 19 comments
Closed

Performance regression in vcat(::Vector, ::Number) in v1.9 #50831

nickrobinson251 opened this issue Aug 8, 2023 · 19 comments
Labels
performance Must go faster regression 1.9 Regression in the 1.9 release

Comments

@nickrobinson251
Copy link
Contributor

Seems to be fixed in v1.10

v1.8

julia> VERSION
v"1.8.4"

julia> v = collect(Any, 1:1000);

julia> i = 1;

julia> using BenchmarkTools

julia> @btime vcat($v, $i);
  515.272 ns (1 allocation: 8.00 KiB)

julia> @which vcat(v, i)
vcat(A::Union{Number, LinearAlgebra.AbstractTriangular{T, A} where {T, A<:(Matrix)}, LinearAlgebra.Adjoint{<:Any, <:Vector}, LinearAlgebra.Bidiagonal, LinearAlgebra.Diagonal, LinearAlgebra.Hermitian{T, A} where {T, A<:(Matrix)}, LinearAlgebra.SymTridiagonal, LinearAlgebra.Symmetric{T, A} where {T, A<:(Matrix)}, LinearAlgebra.Transpose{<:Any, <:Vector}, LinearAlgebra.Tridiagonal, Matrix, Vector}...) in LinearAlgebra at /Applications/Julia-1.8.app/Contents/Resources/julia/share/julia/stdlib/v1.8/LinearAlgebra/src/special.jl:419

v1.9

julia> VERSION
v"1.9.2"

julia> v = collect(Any, 1:1000);

julia> i = 1;

julia> using BenchmarkTools

julia> @btime vcat($v, $i);
  1.050 μs (13 allocations: 8.28 KiB)

julia> @which vcat(v, i)
vcat(A::Union{Number, Matrix, Vector}...)
     @ Base array.jl:1957

v1.10

julia> VERSION
v"1.10.0-beta1"

julia> v = collect(Any, 1:1000);

julia> i = 1;

julia> using BenchmarkTools

julia> @btime vcat($v, $i);
  521.382 ns (1 allocation: 8.00 KiB)

julia> @which vcat(v, i)
vcat(A::Union{Number, AbstractArray}...)
     @ Base abstractarray.jl:1990
@nickrobinson251
Copy link
Contributor Author

nickrobinson251 commented Aug 8, 2023

silly workaround is to replace i with [i]

julia> VERSION
v"1.8.4"

julia> i_vec = [i];

julia> @btime vcat($v, $i);
  515.092 ns (1 allocation: 8.00 KiB)

julia> @btime vcat($v, $i_vec);
  517.818 ns (1 allocation: 8.00 KiB)
  
julia> @btime vcat($v, [$i]);
  532.624 ns (2 allocations: 8.06 KiB)
julia> VERSION
v"1.9.2"

julia> i_vec = [i];

julia> @btime vcat($v, $i);
  1.054 μs (13 allocations: 8.28 KiB)

julia> @btime vcat($v, $i_vec);
  516.581 ns (1 allocation: 8.00 KiB)
  
julia> @btime vcat($v, [$i]);
  543.651 ns (2 allocations: 8.06 KiB)
julia> VERSION
v"1.10.0-beta1"

julia> @btime vcat($v, $i);
  523.656 ns (1 allocation: 8.00 KiB)

julia> @btime vcat($v, $i_vec);
  522.568 ns (1 allocation: 8.00 KiB)
  
julia> @btime vcat($v, [$i]);
  544.312 ns (2 allocations: 8.06 KiB)

EDIT: hmm, although this workaround doesn't seem to recover performance in my real use-case (which is a little tricky to reduce down... will work on it)

@brenhinkeller brenhinkeller added the regression 1.9 Regression in the 1.9 release label Aug 8, 2023
@NHDaly
Copy link
Member

NHDaly commented Aug 8, 2023

Can we also try this as a workaround?:

julia> @btime push!(copy($v), $i);
  454.703 ns (2 allocations: 24.81 KiB)

EDIT: thanks, i meant push not append 👍

@mbauman mbauman added the performance Must go faster label Aug 8, 2023
@mbauman mbauman changed the title Regression in vcat(::Vector, ::Number) in v1.9 Performance regression in vcat(::Vector, ::Number) in v1.9 Aug 8, 2023
@nickrobinson251
Copy link
Contributor Author

nickrobinson251 commented Aug 8, 2023

yes, push!(copy($v), $i) recovers the time on this example (and seems to be a better workaround, as it also recovers the time on the real use-case!)

So does append!(copy($v), $i) (although that relies on i having a length, which it does in this example but not in 1 of the real use-cases where i found this)

julia> VERSION
v"1.8.4"

julia> @btime vcat($v, $i);
  503.625 ns (1 allocation: 8.00 KiB)

julia> @btime push!(copy($v), $i);
  472.148 ns (2 allocations: 24.81 KiB)
  
julia> @btime append!(copy($v), $i);
  472.862 ns (2 allocations: 24.81 KiB)
julia> VERSION
v"1.9.2"

julia> @btime vcat($v, $i);
  1.058 μs (13 allocations: 8.28 KiB)

julia> @btime push!(copy($v), $i);
  483.031 ns (2 allocations: 24.81 KiB)
  
julia> @btime append!(copy($v), $i);
  483.087 ns (2 allocations: 24.81 KiB)
julia> VERSION
v"1.10.0-beta1"

julia> @btime vcat($v, $i);
  521.052 ns (1 allocation: 8.00 KiB)

julia> @btime push!(copy($v), $i);
  573.370 ns (2 allocations: 24.81 KiB)

julia> @btime append!(copy($v), $i);
  579.710 ns (2 allocations: 24.81 KiB)

@NHDaly
Copy link
Member

NHDaly commented Aug 8, 2023

Bisect points to 3f04640, which backported #48933, as introducing this regression for 1.9.

I'll try again to see if i can get bisect to tell me where it was fixed.

@NHDaly
Copy link
Member

NHDaly commented Aug 8, 2023

(CC: @staticfloat for the above)

@staticfloat
Copy link
Member

@dkarrasch may know, as he authored the original PR.

@nickrobinson251
Copy link
Contributor Author

btw, for our real-use we just rolled our own function for now (and i'm sure there are other good ways to write this same thing):

function vcat_element(v::Vector{T}, x::T) where T
    len = length(v)
    new_v = Vector{T}(undef, len + 1)
    unsafe_copyto!(new_v, 1, v, 1, len)
    new_v[end] = x
    return new_v
end

@nickrobinson251
Copy link
Contributor Author

nickrobinson251 commented Aug 9, 2023

Also, i see a few other vcat perf issues being reported for recent releases e.g.

Should we add some more vcat-related benchmarks somewhere?

@NHDaly
Copy link
Member

NHDaly commented Aug 10, 2023

More benchmarks sounds good.

Also sorry I forgot to report back.. Bisect results were confusing as to finding which commit fixed this on the 1.10 branch. After looking more closely at the bisect log, I realized why it was confusing:

  • The original commit, e6c84a1, didn't cause a regression on the 1.10 branch! The regression only occurred on the 1.9 branch during the backport.

So either there was a bug during backporting, or somehow this commit was depending on something only present in 1.10 to make it effective.

@NHDaly
Copy link
Member

NHDaly commented Aug 10, 2023

I'm not sure how best to find out what commits in 1.10 were needed for the original commit to not cause a regression? I guess we could do a bisect that replays the backport onto various commits inbetween 1.9 and 1.10 to see when it wouldn't have caused a regression?

@NHDaly
Copy link
Member

NHDaly commented Aug 10, 2023

Okay, I ran that bisect! Here's the bisect scripts I used, in case it's helpful to anyone in the future:

# Usage: ./bisect-script.sh <julia script>

# assert that the script is given
if [ -z "$1" ]; then
    echo "Usage: ./bisect-script.sh <julia script>"
    exit 1
fi

git cherry-pick -m1 e6c84a1eae1f75cad1c62f62f2074c0f1bf124d1  # <-- this was the extra part I added for this
make clean
make -j || (make cleanall; make -j)
./julia --startup=no $1 && (
    git reset --hard HEAD~  # cleanup <-- also added this
    exit 0
) || (
    git reset --hard HEAD~  # cleanup <-- and this
    exit 1
)
@info VERSION

const v = collect(Any, 1:1000);
const i = 1;

using Test

f(i) = vcat(v, i)
# warmup
@test length(f(i)) == 1001
# final actual test
# NOTE: Swapped the test while bisecting for the fix. "failing" means it's fixed.
@test @allocations(f(i)) > 5

@NHDaly
Copy link
Member

NHDaly commented Aug 10, 2023

Bisect pointed to 1543cdd as the first commit inbetween 1.9 and 1.10, where backporting the vcat change doesn't introduce a regression.

That's coming from #48720.

I guess adding these two lines is what prevented the regression?:

+checkindex(::Type{Bool}, inds::IdentityUnitRange, i::Real) = checkindex(Bool, inds.indices, i)
+checkindex(::Type{Bool}, inds::OneTo{T}, i::T) where {T<:BitInteger} = unsigned(i - one(i)) < unsigned(last(inds))

@NHDaly
Copy link
Member

NHDaly commented Aug 10, 2023

The question now is how to proceed.

I'm very far away from this kind of performance tuning work. I see at least these two options:

  1. Back-out the original change that introduced this regression in 1.9. My understanding is that this was a bug fix, though? So we probably don't want to do that?
  2. Backport faster iteration over subarrays (views) #48720 into 1.9 as well, to resolve the perf regression.
    • Is there a chance that this can itself cause more harm though? I obviously prefer avoiding introducing new changes, since they can have their own unintended side effects....

Thanks!!
Tagging @N5N3, @matthias314, @staticfloat, and @KristofferC who could maybe help decide how to proceed? Thanks

@KristofferC
Copy link
Member

KristofferC commented Aug 10, 2023

If it fixes it, backporting #48720 doesnt seem so bad.

@NHDaly
Copy link
Member

NHDaly commented Aug 10, 2023

I have confirmed that backporting #48720 fixes the regression locally. I'll mark it for backporting then. 👍

Thank you!

@NHDaly
Copy link
Member

NHDaly commented Aug 10, 2023

... Do you understand why it fixes the regression? That would probably be good to understand before backporting, to ensure that we aren't introducing any other complexities?

@NHDaly
Copy link
Member

NHDaly commented Aug 29, 2023

This should be fixed in 1.9.3. Leaving a note here for us to confirm.

@vtjnash vtjnash closed this as completed Aug 31, 2023
@nickrobinson251
Copy link
Contributor Author

If I wanted to follow-up about adding some benchmarks related to vcat performance, where/how would i do that?

@KristofferC
Copy link
Member

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
performance Must go faster regression 1.9 Regression in the 1.9 release
Projects
None yet
Development

No branches or pull requests

7 participants