Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Begin exercising rustc's right to reorder struct fields for optimal padding #28951

Closed
bstrie opened this issue Oct 10, 2015 · 18 comments
Closed

Comments

@bstrie
Copy link
Contributor

bstrie commented Oct 10, 2015

The representation of structs that are not tagged with a repr attribute is currently and deliberately undefined, in order to allow us to optimize the representation to automatically reduce memory usage due to excess padding (as described in http://www.catb.org/esr/structure-packing/ ). I don't believe we're currently exercising this anywhere, which could present a subtle backcompat hazard if people are using unsafe code whose semantics relies on the unstable representation. Even if we've technically reserved the right to break this sort of code, failing to act soon enough could potentially make such breakage more than we can tolerate, tying our hands.

I don't believe that this requires an RFC, as it's a longstanding design decision and the "ideal" representation is fairly straightforward and uncontroversial (issues like false sharing notwithstanding).

@bstrie
Copy link
Contributor Author

bstrie commented Oct 10, 2015

Let's use this as a tracking issue to begin discussing blockers that need to be resolved. @eddyb mentions that we should start by removing trailing padding... whatever that means. :P

@eddyb
Copy link
Member

eddyb commented Oct 10, 2015

C mixes pointers and arrays in various ways and thus requires that the size of any type is a multiple of the alignment.
We don't have to do this, except for #[repr(C)] structs, so ((u16, u8), u8) could be 4 bytes instead of 6 (that it currently is).
ptr.offset(n) would continue to work by rounding up the size to be a multiple of the alignment, and we currently don't make guarantees about size_of so nobody should mix pointers to sequences and addresses/sizes.

@petrochenkov
Copy link
Contributor

@eddyb
That sounds.. unsettling.
If size_of<T> will still report size of single T and not "in-array" T, then the first thing it will break is std::Vec (and who knows what else in external crates).
On the other hand if size_of<T> will report size of "in-array" T, then nothing bad will probably happen, just some pessimization. A new function min_size_of will probably be useful, e.g. for single allocations.

@eddyb
Copy link
Member

eddyb commented Oct 10, 2015

I was thinking only about pointers + .offset vs addresses + size_of, so I missed the cases where the size of [T; n] is computed for runtime n.
Maybe it's time for a grep across crates.io, to find all the fragile uses.

@petrochenkov
Copy link
Contributor

A quick grep across stdlib tells that whole libcollections and libarena is one big fragile use.
It's interesting, that most of size_of results either multiplied on something, either compared with 0, or used on something where calculation method doesn't matter, like on u64 to count bits. Cases where the size of a single element is needed seem to be rare.

@petrochenkov
Copy link
Contributor

BTW, what is the algorithm for the optimal structure packing? (Let's start with "in-array" packing, which is more important to optimize.)
A quick googling doesn't give an answer.
The article above mentions sorting by alignment, but it's not always optimal, counterexample -
4 fields x(s:1,a:2), y(s:1,a:2), z(s:1,a:1), w(s:1,a:1), the optimal packing
|x|z|y|w| is not sorted by alignment.
The algorithm should be a part of ABI (if it's ever stabilized), so it shouldn't involve solving NP hard problems or something :D

@sfackler
Copy link
Member

NP hard problems are not really concerning if n is 4 :P

@pnkfelix
Copy link
Member

On the other hand if size_of will report size of "in-array" T, then nothing bad will probably happen, just some pessimization. A new function min_size_of will probably be useful, e.g. for single allocations.

This is an interesting point. The user-defined allocator API that I've been working on has some utility methods for doing stuff with sizes and alignment; I, like @eddyb, had assumed that we should make size_of method provided by that API not be required to return a value that is a multiple of the alignment (and instead there is a separate function to calculate the size of such padding).

But after seeing the discussion here, I am wondering if that is wise ... given the precedent that has been set by C, and the likelihood of widespread assumptions about the size_of API, maybe I should have size_of round up to multiple-of-alignment, and add a second method, size_of_one, that does not.

@ranma42
Copy link
Contributor

ranma42 commented Oct 12, 2015

I am wary about having the same type T with different layout or size depending on the context where it is used (array vs single value, for example).
I definitely expect some code to use size_of in order to create slices of values of a type and unsafely copy them. This would break if in-array and single element had different in-memory representation (example: if in-array is padded and is slice-copied to a non-padded single element, something else is getting overwritten by the padding) or if the type changed layout in different contexts.

Nonetheless AFAICT there should be no problem with rearranging the fields of a struct as long as the layout (padding included) is the same for all of the uses of the type.

@petrochenkov
Copy link
Contributor

@ranma42
"Useful" part of layout of T will have to be the same everywhere, because a reference &T doesn't know where it points to - in array or not, but it needs to know offsets of all fields. So, the single instances will have to have in-array layout (because, again, it's more important to optimize) and the only possible difference is presence or absence of the final padding.
Edit: but treating a single element as slice ([T] with dynamic size, not [T; 1]) (http://doc.rust-lang.org/nightly/std/slice/fn.mut_ref_slice.html and friends) still can pose problems, it needs investigation.
Edit2: The strikethrough line is wrong, if the structure is optimally packed without the final padding, then it's optimally packed with the final padding as well, the opposite is not true.

@Amanieu
Copy link
Member

Amanieu commented Dec 6, 2015

Some notes from the Swift ABI related to this.

@petrochenkov
Copy link
Contributor

@sfackler

NP hard problems are not really concerning if n is 4 :P

Structures like this have to conform to ABI as well.

@Amanieu
Copy link
Member

Amanieu commented Apr 12, 2016

The algorithm should be a part of ABI (if it's ever stabilized), so it shouldn't involve solving NP hard problems or something :D

Actually I don't think we need to make the struct packing algorithm part of the ABI. Instead we can just put the chosen struct layout in the metadata of the crate which defines the struct. The only downside is that two different structs with the same contents are not guaranteed to have the same layout.

@eddyb
Copy link
Member

eddyb commented Apr 12, 2016

@Amanieu At least in theory, layouts have to be deterministic, but can depend on the crate SVH (as a seed).
In practice, the items taken into consideration have to be part of the dependency subgraph of the type itself, for incremental recompilation.

@ahicks92
Copy link
Contributor

I think #37429 addresses this for the most part, but note that it's being turned off by #38523. There is a thread on it here.

There is more exotic stuff being discussed here, in particular the Swift ABI. I think it might be worth doing someday, but I doubt it will help much. Especially as compared to the amount of work required to make it happen given that it wasn't planned for from the start, and the potential pessimization of using packed structs for everything in LLVM. That said, I thought field reordering would help tremendously and it didn't help tremendously, so perhaps I'm wrong about this too.

@bstrie wants to know if they can close this; I'd leave it open for now. The functionality is in and working, but it's not rolled out yet.

@Mark-Simulacrum
Copy link
Member

I think we turned this back on with optimization fuel; I'm uncertain whether there's any reason to keep this open. cc @camlorn @eddyb

@SimonSapin
Copy link
Contributor

Yes, in #40377.

@ahicks92
Copy link
Contributor

Yeah, someone can close this.

We can technically do better, but not without a major refactor on order with the one I already did, and not without losing performance elsewhere. Since we already do better than most other languages (the only one I know of that does better is Swift), I think it's more than sufficient.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

10 participants