-
Notifications
You must be signed in to change notification settings - Fork 12.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
rustc_trans: rewrite mips64 ABI code #47964
Conversation
Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @estebank (or someone else) soon. If any changes to this PR are deemed necessary, please add them as extra commits. This ensures that the reviewer can see what has changed since they last reviewed the code. Due to the way GitHub handles out-of-date commits, this should also make it reasonably obvious what issues have or haven't been addressed. Large or tricky changes may require several passes of review and changes. Please see the contribution instructions for more information. |
r? @eddyb |
Does clang need to do this? How many fields is that struct limited to? I'd rather not add the general case here. |
src/librustc_trans/cabi_mips64.rs
Outdated
chunks.push(bits_to_int_reg(final_int_bits)); | ||
} | ||
|
||
arg.cast_to(CastTarget::Chunked(chunks)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, you don't need this! This is literally what Uniform { unit: Reg::i64(), total: trailing_int_size }
does!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's only equivalent to Uniform
if chunks
is empty when entering the else block.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah sorry I didn't see it came from more code above 😞.
Yes clang needs to do this. If you pass the structure directly to LLVM, it will allocate 1 register per field instead of merging them together into 64-bit chunks which is what happens here. There is no limit to the size of structures passed using this method. For example, if a function takes one structure argument containing 8 |
src/librustc_trans/cabi_mips64.rs
Outdated
|
||
// We only care about aligned doubles | ||
if let layout::Abi::Scalar(ref scalar) = field.abi { | ||
if let layout::F64 = scalar.value { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How come this isn't recursive?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As in why doesn't it recurse into structures within structures?
I originally thought this from reading the ABI document, but it's a bit ambiguous and it seems the ABI is intentionally not recursive. Eg this structure is passed in an integer register by clang and gcc even though it is an aligned double.
struct test {
struct {
double x;
};
};
@jcowgill So only a prefix actually gets put into registers? How many registers are there to use? (is this something we can rely on, like we do on x64?) |
Yes only a prefix is put into registers. My earlier comment about "8 |
@jcowgill Thanks for investigating this! I'm still not super eager to merge the version using a How do you feel about I might also want to maybe move towards a model that better match hardware register sizes, so we wouldn't need to store the 64-bit size of the used registers (because they can't be anything else). |
I think that would work, although it is a bit more complicated. Would this be implemented using a separate structure the same way If I understand correctly, these would then also be equivalent? |
I meant only a |
I think my equivalence is wrong anyway since it doesn't work for floats. I don't like If I'll have a go at implementing this... |
@jcowgill I thought |
I've pushed a new version using |
src/librustc_trans/abi.rs
Outdated
cx.data_layout().aggregate_align | ||
.max(Reg { kind: RegKind::Integer, size: chunk }.align(cx)) | ||
.max(Reg { kind: RegKind::Float, size: chunk }.align(cx)) | ||
.max(Reg { kind: RegKind::Vector, size: chunk }.align(cx)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd rather iterate the array and set 3 bool flags - some of these might not be used.
src/librustc_trans/cabi_mips64.rs
Outdated
let dl = &cx.tcx.data_layout; | ||
let size = arg.layout.size; | ||
let align = arg.layout.align.max(dl.i32_align).min(dl.i64_align); | ||
|
||
if arg.layout.is_aggregate() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should handle the simple (non-aggregate) case first with an early return, to avoid the indentation in the aggregate case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
src/librustc_trans/cabi_mips64.rs
Outdated
use rustc::ty::layout::Size; | ||
fn extend_integer_width_mips(arg: &mut ArgType, bits: u64) { | ||
// This is like ArgType::extend_integer_with_to, but always sign extends | ||
// integers >= 32 bits as required by the n64 ABI |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is pretty confusing - so specifically, u32
is the only value that gets passed differently? Can you use extend_integer_with_to
except in the u32
case? That way it's more obvious where the exception lies.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok I've adjusted the code to do that.
src/librustc_trans/cabi_mips64.rs
Outdated
}, | ||
_ => None | ||
} | ||
}).collect(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you do this without a Vec
? e.g. you can have a variable with the mutable iterator and call .next()
three times, checking for combinations of Some
and None
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've refactored the float reg testing code into a separate function which I call manually for the different numbers of fields. I think it looks a bit better.
src/librustc_trans/cabi_mips64.rs
Outdated
|
||
if ret.layout.fields.count() == float_regs.len() { | ||
if float_regs.len() == 1 { | ||
ret.cast_to(Uniform { unit: float_regs[0], total: float_regs[0].size }); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have conversion impls so you can specifically do ret.cast_to(float_regs[0])
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
src/librustc_trans/cabi_mips64.rs
Outdated
// Insert enough integers to cover [last_offset, offset) | ||
assert!(last_offset.is_abi_aligned(dl.f64_align)); | ||
for _ in 0..((offset - last_offset).bits() / 64) { | ||
chunks.push(RegKind::Integer); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can keep an index into the final array to avoid allocating a Vec
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
src/librustc_trans/cabi_mips64.rs
Outdated
total: size | ||
}); | ||
if !offset.is_abi_aligned(align) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's no alignment padding anymore? I'm not even sure what clang does wrt that for other targets - especially since there's a stackalign
attribute now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is correct. In 32-bit MIPS, we have to pad 64-bit aligned structures manually because we cast to an i32 Uniform so LLVM doesn't know that the structure is misaligned. In 64-bit MIPS we always use i64 chunks so this issue doesn't arise.
src/librustc_trans/abi.rs
Outdated
@@ -407,7 +407,8 @@ impl<'tcx> LayoutExt<'tcx> for TyLayout<'tcx> { | |||
#[derive(Clone, Copy, PartialEq, Eq, Debug)] | |||
pub enum CastTarget { | |||
Uniform(Uniform), | |||
Pair(Reg, Reg) | |||
Pair(Reg, Reg), | |||
ChunkedPrefix { prefix: [RegKind; 8], chunk: Size, total: Size } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Really starting to wonder if we can turn all 3 variants into one (such that CastTarget
becomes a struct
).
If the prefix
array uses Option<RegKind>
instead to indicate that not all 8 slots are full, then we can replace total
with rest: Uniform
and have it represent strictly what's after the prefix.
And then Pair(a, b)
can become prefix: [Some(a.kind), None, ..., None], chunk: a.size, rest: Uniform::from(b)
.
Oh and chunk
makes more sense as prefix_chunk
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've attempted to implement this (in separate commits for the moment). It seems to work ok.
@eddyb the user pushed some commits that seems to have fixed your concerns, can you review this again? |
// Simplify to a single unit when there is no prefix and size <= unit size | ||
if self.rest.total <= self.rest.unit.size { | ||
return rest_ll_unit; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does Uniform
need to be a separate struct
at this point? Probably not, but we can revisit later.
Apologies for the delay, still digging through my notifications. This new @bors r+ |
📌 Commit 47c33f7 has been approved by |
rustc_trans: rewrite mips64 ABI code This PR rewrites the ABI handling code for 64-bit MIPS and should fix various FFI issues including rust-lang#47290. To accomodate the 64-bit ABI I have had to add a new `CastTarget` variant which I've called `Chunked` (though maybe this isn't the best name). This allows an ABI to cast to some arbitrary structure of `Reg` types. This is required on MIPS which might need to cast to a structure containing a mixture of `i64` and `f64` types.
@jcowgill I see you have a bunch of MIPS-related PRs. We are currently having trouble implementing MIPS SIMD intrinsics in coresimd (we can't compile binaries with the |
@gnzlbg I can probably help with that - assuming I find some time :) |
@jcowgill this would be a start: #46310
I started implementing those intrinsics, but because of the issue above, did not progress much. Maybe once that issue is solved, implementing the intrinsics will "just work". |
Since rust-lang#47964 was merged, 64-bit mips started passing all structures using 64-bit chunks regardless of their contents. The repr-transparent-aggregates tests needs updating to cope with this.
This PR rewrites the ABI handling code for 64-bit MIPS and should fix various FFI issues including #47290.
To accomodate the 64-bit ABI I have had to add a new
CastTarget
variant which I've calledChunked
(though maybe this isn't the best name). This allows an ABI to cast to some arbitrary structure ofReg
types. This is required on MIPS which might need to cast to a structure containing a mixture ofi64
andf64
types.