use a tuple instead of Ref in broadcasts #85
LGTM. In my experience the compiler optimizes a broadcast over Ref like this very well. Are you aware of an example where Tuple is actually faster in a broadcast like this? Anyway, I like a simple, stupid Tuple more than relying on the optimizer, so LGTM.
In a basic benchmark the Tuple version was a ns faster, even though the allocation was elided for Ref. I don't have the code right now to demonstrate. And yes, keeping it simple is better anyway; this is really low-level code.
Wow, good to know if that's the case!
I thought performance was not the reason for the Ref recommendation; instead it's because of scalar unwrapping details. Tuple logically can't be slower than Ref, and Ref needs compiler optimisations to match a Tuple. I don't think this is any huge revelation. See e.g. https://discourse.julialang.org/t/marking-types-as-scalar-for-broadcasting-ref-vs-tuple/29105
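For readers unfamiliar with the two spellings, here is a minimal sketch of what is being compared. The helper `scale` and the values are made up for illustration and are not taken from this PR's diff; the point is only that both wrappers make broadcast treat `p` as a single scalar argument.

```julia
# Hypothetical helper, not from this PR: scales both components of a point.
scale(p::NTuple{2,Float64}, k::Float64) = (p[1] * k, p[2] * k)

p  = (1.0, 2.0)        # should be passed whole to every call
ks = [1.0, 2.0, 3.0]

# Ref wraps `p` in a zero-dimensional container, so broadcast reuses it for each `k`.
with_ref = scale.(Ref(p), ks)

# A 1-tuple is a length-1 collection, which broadcast likewise stretches to match `ks`.
with_tuple = scale.((p,), ks)

# Unwrapped, `scale.(p, ks)` would try to iterate over the components of `p`
# and fail with a DimensionMismatch (length 2 vs. length 3).

@assert with_ref == with_tuple == [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]
```

Both forms return the same values; the discussion here is only about how much work the compiler has to do to make them equally cheap.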
Also see the original issues for performance concerns. The compiler has to actively elide Ref allocations. My suspicion is that at times this complexity will stop it from seeing other optimisations in a larger context. That's important here, where this small broadcast will often be in the middle of a much more complicated process that needs high performance.
Here's a very specific reference to the problem: JuliaLang/julia#39151 (comment). And here, Ref blocking constant propagation is identified as a key problem. That may be the reason my own benchmarks were slower too:
Ref is not free