Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use lispy tuples in cat (fixes #21673) #39314

Merged
merged 2 commits into from
Jan 20, 2021
Merged

Conversation

timholy
Copy link
Member

@timholy timholy commented Jan 19, 2021

The cat pipeline has long had poor inferrability.
Together with #39292 and #39294, this should basically put an
end to that problem.

This (which is on top of #39292) gives me the following for the benchmarks in #21673:

julia> @btime test1(20);
  808.795 ns (18 allocations: 720 bytes)

julia> @btime test2(20);
  28.607 ns (1 allocation: 256 bytes)

which is already a considerable improvement for test1 (I got 2.448 μs before any of these changes, and 1.746 μs with just #39292).
But with this and #39294, I get

julia> @btime test1(20)
  33.313 ns (1 allocation: 256 bytes)

That's close to a 100x improvement over where we started, and within 20% of test2. I think that's good enough to declare #21673 fixed.

The `cat` pipeline has long had poor inferrability.
Together with #39292 and #39294, this should basically put an
end to that problem.

Together, at least in simple cases these make the performance
of `cat` essentially equivalent to the manual version.
In other words, the `test1` and `test2` of #21673 benchmark
very similarly.
@jebej
Copy link
Contributor

jebej commented Jan 19, 2021

Thanks for working on this! It appears that 1.6 has regressed substantially compared to 1.5, so hopefully these PRs can be backported.

__cat(A, shape, catdims, X...) = __cat_offset!(A, shape, catdims, ntuple(zero, length(shape)), X...)

function __cat_offset!(A, shape, catdims, offsets, x, X...)
# splitting the "work" on x from X... may reduce latency (fewer costly specializations)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Clever!

@JeffBezanson JeffBezanson added performance Must go faster backport 1.6 Change should be backported to release-1.6 labels Jan 19, 2021
@timholy timholy merged commit 78d55e2 into master Jan 20, 2021
@timholy timholy deleted the teh/more_cat_inferrability branch January 20, 2021 06:46
KristofferC pushed a commit that referenced this pull request Jan 20, 2021
The `cat` pipeline has long had poor inferrability.
Together with #39292 and #39294, this should basically put an
end to that problem.

Together, at least in simple cases these make the performance
of `cat` essentially equivalent to the manual version.
In other words, the `test1` and `test2` of #21673 benchmark
very similarly.

(cherry picked from commit 78d55e2)
@KristofferC KristofferC removed the backport 1.6 Change should be backported to release-1.6 label Feb 1, 2021
KristofferC pushed a commit that referenced this pull request Feb 1, 2021
The `cat` pipeline has long had poor inferrability.
Together with #39292 and #39294, this should basically put an
end to that problem.

Together, at least in simple cases these make the performance
of `cat` essentially equivalent to the manual version.
In other words, the `test1` and `test2` of #21673 benchmark
very similarly.

(cherry picked from commit 78d55e2)
ElOceanografo pushed a commit to ElOceanografo/julia that referenced this pull request May 4, 2021
The `cat` pipeline has long had poor inferrability.
Together with JuliaLang#39292 and JuliaLang#39294, this should basically put an
end to that problem.

Together, at least in simple cases these make the performance
of `cat` essentially equivalent to the manual version.
In other words, the `test1` and `test2` of JuliaLang#21673 benchmark
very similarly.
antoine-levitt pushed a commit to antoine-levitt/julia that referenced this pull request May 9, 2021
The `cat` pipeline has long had poor inferrability.
Together with JuliaLang#39292 and JuliaLang#39294, this should basically put an
end to that problem.

Together, at least in simple cases these make the performance
of `cat` essentially equivalent to the manual version.
In other words, the `test1` and `test2` of JuliaLang#21673 benchmark
very similarly.
staticfloat pushed a commit that referenced this pull request Dec 23, 2022
The `cat` pipeline has long had poor inferrability.
Together with #39292 and #39294, this should basically put an
end to that problem.

Together, at least in simple cases these make the performance
of `cat` essentially equivalent to the manual version.
In other words, the `test1` and `test2` of #21673 benchmark
very similarly.

(cherry picked from commit 78d55e2)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
performance Must go faster
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants