-
-
Notifications
You must be signed in to change notification settings - Fork 5.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
simplify reinterpret array code #43955
Conversation
Can you comment on how does it interact with TBAA? Doesn't this make Julia less strict even than C wrt aliasing? I don't know how much of Julia's type is passed down to LLVM currently but I wondered if it closes future possibilities. cc @vchuravy |
Your benchmark job has completed - possible performance regressions were detected. A full report can be found here. |
Both memcpy and unsafe calls disable TBAA optimizations. That is sort of the point of these intrinsics, and of reinterpretarray. |
Thanks for the clarification. I didn't know that |
@Keno bump |
What's the status on this? Is it good to merge? |
Waiting for Keno's review |
If it passes tests, it's fine by me. The implementation looks like, but LLVM is very fussy about this kind of code, since it depends on the idiom recognition to turn it back into memcpy for performance. If there is a particular motivating case, perhaps that can be turned into a codegen test? |
Why do we want it to add extra memcpy calls? I thought we wanted it to remove those and turn it back into bitcast. This should just simplify our emitted llvm IR from being |
Depends how big the struct is that you're copying. For small things, it's just a bitcast of course. |
Avoid one of the memcpy calls, when possible.
Avoid one of the memcpy calls, when possible.
Avoid one of the memcpy calls, when possible.
This converts most
memcpy
into aunsafe_load
orunsafe_store!
(only one, not both), which should avoid at least one alloca in most cases, and hopefully helps LLVM with AA in some of those cases also.No measurable impact on the internal tests:
@nanosoldier
runbenchmarks("array", vs=":master")