Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

use a tuple instead of Ref in broadcasts #85

Merged
merged 1 commit into from
Jun 1, 2024
Merged

use a tuple instead of Ref in broadcasts #85

merged 1 commit into from
Jun 1, 2024

Conversation

rafaqz
Copy link
Member

@rafaqz rafaqz commented May 31, 2024

Ref is not free

Copy link
Member

@jw3126 jw3126 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. In my experience the the compiler optimizes broadcast Ref like this very well. Are you aware of an example where Tuple is acually faster in a broadcast like this? Anyway I like simple stupid Tuple more than relying on optimizer so LGTM.

@rafaqz
Copy link
Member Author

rafaqz commented Jun 1, 2024

Benchmarking something basic was a Ns faster even though the allocation was elided in Ref. I don't have the code right now to demonstrate.

And yes keeping it simple is better anyway, this is really low level code.

@aplavin
Copy link
Member

aplavin commented Jun 1, 2024

Wow good to know if that's the case!
I've always thought that Ref is the recommended solution for broadcasting as a scalar – it's even recommended in the Julia docs. Do you happen to have a relevant Julia issue that Ref is slower? Everyone else also deserves to know :)

@rafaqz
Copy link
Member Author

rafaqz commented Jun 1, 2024

I thought performance was not the reason for the Ref recommendation, instead it's because of scalar unwrapping details.

Tuple logically cant be slower than Ref, and Ref needs compiler optimisations to match a Tuple. I don't think this is any huge revelation.

See e.g. https://discourse.julialang.org/t/marking-types-as-scalar-for-broadcasting-ref-vs-tuple/29105

@rafaqz
Copy link
Member Author

rafaqz commented Jun 1, 2024

Also see the original issues for performance concerns.

JuliaLang/julia#18379

The compiler has to actively elide Ref allocations. My suspicion is at times this complexity will stop it seeing other optimisations in a larger context.

That's important here where this small broadcast will often be in the middle of a much more complicated process that needs high performance.

@rafaqz
Copy link
Member Author

rafaqz commented Jun 1, 2024

Here's a very specific reference to the problem:

JuliaLang/julia#39151 (comment)

And here, Ref blocking constant propagation is identified as a key problem. That may be the reason my own benchmarks were slower too:

JuliaLang/julia#39151 (comment)

@rafaqz rafaqz merged commit 0442058 into master Jun 1, 2024
41 of 72 checks passed
@rafaqz rafaqz deleted the no_ref branch June 1, 2024 12:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants