-
Notifications
You must be signed in to change notification settings - Fork 12.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Manually implement PartialEq for Option<T> and specialize non-nullable types #103556
Manually implement PartialEq for Option<T> and specialize non-nullable types #103556
Conversation
Hey! It looks like you've submitted a new PR for the library teams! If this PR contains changes to any Examples of
|
r? @m-ou-se (rust-highfive has picked a reviewer for you, use r? to override) |
perf run was requested @bors try @rust-timer queue |
Awaiting bors try build completion. @rustbot label: +S-waiting-on-perf |
⌛ Trying commit d0a58800564093f4b6db3ae0b5f76547c7564f4b with merge 05058ddce07865ec73eedb7c8d2cddd50d96d959... |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a pretty significant codegen win, although I'm surprised we don't get this already. Needs a fix for some UB, tho (fixing the UB in godbolt still produces the codegen win).
This comment has been minimized.
This comment has been minimized.
@bors try @rust-timer queue |
Awaiting bors try build completion. @rustbot label: +S-waiting-on-perf |
⌛ Trying commit a8140e59b80fae0c3b27e3a1d6e0b176dd5fb757 with merge 8032e517bc4d1e2309051ba99ef9c8beaff83a82... |
This comment has been minimized.
This comment has been minimized.
Made the NonNull test support |
@bors try @rust-timer queue |
Awaiting bors try build completion. @rustbot label: +S-waiting-on-perf |
⌛ Trying commit 5ed28fed2a0b927a267da146d51ec99bce8bc92f with merge a041a05c3184bb8c38b8940422e8951e99b6d3f1... |
|
☀️ Try build successful - checks-actions |
Queued a041a05c3184bb8c38b8940422e8951e99b6d3f1 with parent 6365e5a, future comparison URL. |
I went to try to see if there's anything we could do to make LLVM understand this, and realized that right now we're shooting outselves in the foot: https://rust.godbolt.org/z/Ye5xr8P8x What's pub fn demo_std(x: &NonZeroU32, y: &NonZeroU32) -> bool {
x == y
} define noundef zeroext i1 @_ZN7example8demo_std17hcce6db1e74f1c1d4E(ptr noalias nocapture noundef readonly align 4 dereferenceable(4) %0, ptr noalias nocapture noundef readonly align 4 dereferenceable(4) %1) unnamed_addr #0 {
%_9 = load i32, ptr %0, align 4
%_10 = load i32, ptr %1, align 4
%2 = icmp eq i32 %_9, %_10
ret i1 %2
} Whereas if you write the obvious implementation yourself pub fn demo_obvious(x: &NonZeroU32, y: &NonZeroU32) -> bool {
x.get() == y.get()
} Then the loads get the efine noundef zeroext i1 @_ZN7example12demo_obvious17haee70b6eb73f133dE(ptr noalias nocapture noundef readonly align 4 dereferenceable(4) %x, ptr noalias nocapture noundef readonly align 4 dereferenceable(4) %y) unnamed_addr #0 {
%self = load i32, ptr %x, align 4, !range !2, !noundef !3
%self1 = load i32, ptr %y, align 4, !range !2, !noundef !3
%0 = icmp eq i32 %self, %self1
ret i1 %0
}
!2 = !{i32 1, i32 0} It's possible that LLVM still might not be able to optimize this even with that for other reasons (#49572 (comment)), but I think we should at least find out whether giving LLVM the obvious information would be enough to let it make this transform -- it would be great if we could solve this in the EDIT: Oh, if nikic already looked then there's probably no easy fix. |
cc #49892 |
b6b33c2
to
a1b650c
Compare
@rustbot ready |
☔ The latest upstream changes (presumably #103797) made this pull request unmergeable. Please resolve the merge conflicts. |
a1b650c
to
20f2d8b
Compare
Thanks! It's great that this worked out without adding any @bors r+ |
Weird, let's try that again @bors r+ |
☀️ Test successful - checks-actions |
Finished benchmarking commit (8841bee): comparison URL. Overall result: ❌✅ regressions and improvements - ACTION NEEDEDNext Steps: If you can justify the regressions found in this perf run, please indicate this with @rustbot label: +perf-regression Instruction countThis is a highly reliable metric that was used to determine the overall result at the top of this comment.
Max RSS (memory usage)ResultsThis is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
CyclesResultsThis is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
|
Perf changes are few, tiny, and not a concern. @rustbot label: +perf-regression-triaged |
…eq, r=scottmcm Manually implement PartialEq for Option<T> and specialize non-nullable types This PR manually implements `PartialEq` and `StructuralPartialEq` for `Option`, which seems to produce slightly better codegen than the automatically derived implementation. It also allows specializing on the `core::num::NonZero*` and `core::ptr::NonNull` types, taking advantage of the niche optimization by transmuting the `Option<T>` to `T` to be compared directly, which can be done in just two instructions. A comparison of the original, new and specialized code generation is available [here](https://godbolt.org/z/dE4jxdYsa).
This PR manually implements
PartialEq
andStructuralPartialEq
forOption
, which seems to produce slightly better codegen than the automatically derived implementation.It also allows specializing on the
core::num::NonZero*
andcore::ptr::NonNull
types, taking advantage of the niche optimization by transmuting theOption<T>
toT
to be compared directly, which can be done in just two instructions.A comparison of the original, new and specialized code generation is available here.