Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Let compiler assume, that Layout::align() is always a power of two #75281

Closed

Conversation

TimDiekmann
Copy link
Member

A perf-run may be useful. I don't expect a great speed up there though, as Layouts created with Layout::new() are known at compile time and this is always (almost?) used in std.

Fixes #75264

@rust-highfive
Copy link
Collaborator

r? @LukasKalbertodt

(rust_highfive has picked a reviewer for you, use r? to override)

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Aug 8, 2020
@LukasKalbertodt
Copy link
Member

Good idea. Let's just try.

@bors try @rust-timer queue

@rust-timer
Copy link
Collaborator

Awaiting bors try build completion

@bors
Copy link
Contributor

bors commented Aug 8, 2020

⌛ Trying commit 82cbd18 with merge c6f0e225a4165554133f9072026caea2e61861ff...

@bors
Copy link
Contributor

bors commented Aug 8, 2020

☀️ Try build successful - checks-actions, checks-azure
Build commit: c6f0e225a4165554133f9072026caea2e61861ff (c6f0e225a4165554133f9072026caea2e61861ff)

@rust-timer
Copy link
Collaborator

Queued c6f0e225a4165554133f9072026caea2e61861ff with parent c989ac1, future comparison URL.

@rust-timer
Copy link
Collaborator

Finished benchmarking try commit (c6f0e225a4165554133f9072026caea2e61861ff): comparison url.

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. Please note that if the perf results are neutral, you should likely undo the rollup=never given below by specifying rollup- to bors.

Importantly, though, if the results of this run are non-neutral do not roll this PR up -- it will mask other regressions or improvements in the roll up.

@bors rollup=never

@LukasKalbertodt
Copy link
Member

A couple of cases have regressed by at most a few percent. I would almost say the differences are all just noise. But I am not sure, I don't really have experience with interpreting these perf results. But yeah, apart from the debug_assert, I can't possibly imagine how your change makes stuff slower...

Should we ping some people who can help interpret those results?

@TimDiekmann
Copy link
Member Author

TimDiekmann commented Aug 8, 2020

I could remove the debug-assertion and do another perf-run. Actually it just double checking if Layout::from_size_align_unchecked wasn't used wrong so maybe there should be an assertion instead. Or we'll leave it out altogether.

@Amanieu You may want to take a look at this?


I don't really have experience with interpreting these perf results

Me neither, I always hope everything is green 😄

@Amanieu
Copy link
Member

Amanieu commented Aug 8, 2020

I don't think we gain anything from this: in the standard library we are careful to always use from_size_align_unchecked when dealing with alignment that are known to be a power of 2 to avoid the extra check. And at this point I don't think anyone is using Layout outside of the standard library.

@Mark-Simulacrum
Copy link
Member

The debug assert shouldn't matter (std in perf doesn't have them on), but the assume and check is likely causing more code to be generated or inclined - e.g. query counts here changed quite a bit: https://perf.rust-lang.org/detailed-query.html?commit=c6f0e225a4165554133f9072026caea2e61861ff&base_commit=c989ac132a81503ceaaa1054da0a3a8c5c305ef0&benchmark=issue-46449-opt&run_name=full

@TimDiekmann
Copy link
Member Author

And at this point I don't think anyone is using Layout outside of the standard library.

I don't think this is a true. Layout is stable and anyone implementing GlobalAlloc deals with it. Additionally the allocator-API is using this type. It might be true, that often this path can be optimized away as a layout is used, which is known at compile time using Layout::new<T>.

the assume and check is likely causing more code to be generated or inclined

I also think this is the main reason, why there are a very few small regressions. Is there a change, that intrinsics::assume will work better once it's const?

@LukasKalbertodt
Copy link
Member

And at this point I don't think anyone is using Layout outside of the standard library.

I do use it in a crate of mine (and already for quite some time). However, I do not call align().

I am not quite sure how to proceed with this PR :/
Should we ping more people to better judge if the regressions are real or noise? Or do we just close as a change only motivated by performance didn't show any performance advantages? Do we even have practical examples where we would expect a performance improvement? What exactly does the knowledge of an integer being power-of-2 give us (except eliding checks that check exactly that)? And maybe this all does not matter anyway because it is always next to allocation code which takes a lot more time anyway?

@Amanieu
Copy link
Member

Amanieu commented Aug 10, 2020

I'm personally tending towards closing this PR. I don't think we should be complicating the code with optimizations that don't actually improve performance in practice.

@TimDiekmann
Copy link
Member Author

TimDiekmann commented Aug 11, 2020

What exactly does the knowledge of an integer being power-of-2 give us?

It's very fast to align a pointer to a power of two or round it down to the previous power of two using bit operations. size & !(align - 1) removes the last bits and will never wrap, as align is always > 0 (0 is not a power of two), (ptr + align - 1) & !(align - 1) aligns up a pointer to align and only ptr + align has to be checked for wrapping.

However, this optimiziation is only useful if you have to check usize::is_power_of_two and if you have check it at runtime. In std, most of the code uses Layout::new (or mem::size_of/align_of), which takes T as parameter, so the compiler implies this check. As soon as you use Layout::from_size_align at runtime, the check has to be made. As @Amanieu pointed out, std uses the unsafe counterpart Layout::from_size_align_unchecked whenever possible to bypasses this check (and the check to avoid wrapping size + align mentioned above).

And maybe this all does not matter anyway because it is always next to allocation code which takes a lot more time anyway?

While this is true for most allocation code, there a still allocations in the hot path using primitive, very fast allocators like stack allocators (e.g. bumpalo or Region). For the latter, the allocation code is this, I thing bumpalo's allocator is similar. Note, that there are only two branches:

fn alloc(layout) -> NonNull<[u8]> {
    let new_ptr = self.ptr.checked_sub(layout.size()).ok_or(AllocErr)?;
    let aligned_ptr = new_ptr & !(layout.align() - 1);

    if unlikely(aligned_ptr < self.start()) {
        return Err(AllocErr);
    }

    let ptr = unsafe { NonNull::new_unchecked(ptr as *mut u8) };
    let memory = NonNull::slice_from_raw_parts(ptr, self.ptr - aligned_ptr);
    self.ptr = aligned_ptr;
    Ok(memory)
}

@crlf0710 crlf0710 added I-needs-decision Issue: In need of a decision. S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Aug 28, 2020
@crlf0710 crlf0710 added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Sep 18, 2020
@Dylan-DPC-zz Dylan-DPC-zz added S-waiting-on-team Status: Awaiting decision from the relevant subteam (see the T-<team> label). and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Sep 18, 2020
@Amanieu
Copy link
Member

Amanieu commented Sep 18, 2020

Considering the disappointing perf results, I'm going to close this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
I-needs-decision Issue: In need of a decision. S-waiting-on-team Status: Awaiting decision from the relevant subteam (see the T-<team> label). T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Performance: Layout::align() can assume, that the alignment is a power of two
9 participants