-
Notifications
You must be signed in to change notification settings - Fork 12.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix UB in LLVM FFI when passing zero or >1 bundle #123941
Conversation
Caught this while working on #123936. |
Perf is probably fairly useless because we don't have bundles AFAICT outside of MSVC or CFI. I suspect that there's potential for optimizing here to avoid copies in the case that N=1 because ArrayRef works out, and that is likely a relatively common case, but this seems like the right fix to start with and should be cheap on Linux since the vector isn't ever pushed to. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
r=me with a nit.
Rust passes a *const &OperandBundleDef to these APIs, usually from a Vec<&OperandBundleDef> or so. Previously we were dereferencing that pointer and passing it to the ArrayRef constructor with some length (N). This meant that if the length was 0, we were dereferencing a pointer to nowhere, and if the length was >1 then loading the *second* element somewhere in LLVM would've been reading past the end. Since Rust can't hold OperandBundleDef by-value we're forced to indirect through a vector that copies out the OperandBundleDefs from the by-reference list on the Rust side in order to match the LLVM expected API.
34c110e
to
bf3decc
Compare
@bors r=nikic r? nikic |
Fix UB in LLVM FFI when passing zero or >1 bundle Rust passes a `*const &OperandBundleDef` to these APIs, usually from a `Vec<&OperandBundleDef>` or so. Previously we were dereferencing that pointer and passing it to the ArrayRef constructor with some length (N). This meant that if the length was 0, we were dereferencing a pointer to nowhere (if the vector on the Rust side didn't actually get allocated or so), and if the length was >1 then loading the *second* element somewhere in LLVM would've been reading past the end. Since Rust can't hold OperandBundleDef by-value we're forced to indirect through a vector that copies out the OperandBundleDefs from the by-reference list on the Rust side in order to match the LLVM expected API.
💔 Test failed - checks-actions |
@@ -1597,11 +1613,18 @@ LLVMRustBuildCallBr(LLVMBuilderRef B, LLVMTypeRef Ty, LLVMValueRef Fn, | |||
IndirectDestsUnwrapped.push_back(unwrap(IndirectDests[i])); | |||
} | |||
|
|||
// FIXME: Is there a way around this? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So to make sure I understand correctly... in Rust we have this data as an array of pointers such that the actual OperandBundleDef are non-contiguous, but LLVM wants them as a single flat array with all the OperandBundleDef next to each other? So really in Rust we'd want Vec<OperandBundleDef>
instead of Vec<&OperandBundleDef>
? But for some reason this can't be by-val in Rust, despite LLVM requiring this as a packed array (which can only be created by-val)?
May be worth adding some comment to explain this...
The fact that in Rust we have two types called OperandBundleDef
where one is a pointer to the other does not help either.^^
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, that's right. OperandBundleDef is a C++ class - we can't easily pass that around by value in Rust without something like cxx to produce the right bindings (and even then it might not be possible if anything isn't trivially movable inside it).
I'm not sure there's a good place for a comment. I think long-term if we want to optimize this the right strategy is to have our wrapper code allocate the Vec and let Rust push into that Vec (roughly speaking).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure we have two types on the rust side (can't grep right now). The OperandBundleDef I'm aware of is an extern type (I.e. lacks a size) so it can only be held by reference. It's the same thing as the C++ OperandBundleDef though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah we have two, see here.
Fix UB in LLVM FFI when passing zero or >1 bundle Rust passes a `*const &OperandBundleDef` to these APIs, usually from a `Vec<&OperandBundleDef>` or so. Previously we were dereferencing that pointer and passing it to the ArrayRef constructor with some length (N). This meant that if the length was 0, we were dereferencing a pointer to nowhere (if the vector on the Rust side didn't actually get allocated or so), and if the length was >1 then loading the *second* element somewhere in LLVM would've been reading past the end. Since Rust can't hold OperandBundleDef by-value we're forced to indirect through a vector that copies out the OperandBundleDefs from the by-reference list on the Rust side in order to match the LLVM expected API.
@@ -1524,13 +1524,21 @@ extern "C" void LLVMRustFreeOperandBundleDef(OperandBundleDef *Bundle) { | |||
|
|||
extern "C" LLVMValueRef LLVMRustBuildCall(LLVMBuilderRef B, LLVMTypeRef Ty, LLVMValueRef Fn, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
extern "C" LLVMValueRef LLVMRustBuildCall(LLVMBuilderRef B, LLVMTypeRef Ty, LLVMValueRef Fn, | |
// OpBundlesIndirect is an array of pointers (*not* a pointer to an array). | |
extern "C" LLVMValueRef LLVMRustBuildCall(LLVMBuilderRef B, LLVMTypeRef Ty, LLVMValueRef Fn, |
Something like, maybe, for the comment?
"Pointer to array" is how the old code treated it.
Rollup of 12 pull requests Successful merges: - rust-lang#123423 (Distribute LLVM bitcode linker as a preview component) - rust-lang#123548 (libtest: also measure time in Miri) - rust-lang#123666 (Fix some typos in doc) - rust-lang#123864 (Remove a HACK by instead inferring opaque types during expected/formal type checking) - rust-lang#123896 (Migrate some diagnostics in `rustc_resolve` to session diagnostic) - rust-lang#123919 (builtin-derive: tag → discriminant) - rust-lang#123922 (Remove magic constants when using `base_n`.) - rust-lang#123931 (Don't leak unnameable types in `-> _` recover) - rust-lang#123933 (move the LargeAssignments lint logic into its own file) - rust-lang#123934 (`rustc_data_structures::graph` mini refactor) - rust-lang#123941 (Fix UB in LLVM FFI when passing zero or >1 bundle) - rust-lang#123957 (disable create_dir_all_bare test on all(miri, windows)) r? `@ghost` `@rustbot` modify labels: rollup
Rollup merge of rust-lang#123941 - Mark-Simulacrum:fix-llvm-ub, r=nikic Fix UB in LLVM FFI when passing zero or >1 bundle Rust passes a `*const &OperandBundleDef` to these APIs, usually from a `Vec<&OperandBundleDef>` or so. Previously we were dereferencing that pointer and passing it to the ArrayRef constructor with some length (N). This meant that if the length was 0, we were dereferencing a pointer to nowhere (if the vector on the Rust side didn't actually get allocated or so), and if the length was >1 then loading the *second* element somewhere in LLVM would've been reading past the end. Since Rust can't hold OperandBundleDef by-value we're forced to indirect through a vector that copies out the OperandBundleDefs from the by-reference list on the Rust side in order to match the LLVM expected API.
…-Simulacrum llvm RustWrapper: explain OpBundlesIndirect argument type Follow-up to rust-lang#123941 r? `@Mark-Simulacrum`
…-Simulacrum llvm RustWrapper: explain OpBundlesIndirect argument type Follow-up to rust-lang#123941 r? ``@Mark-Simulacrum``
Rollup merge of rust-lang#124132 - RalfJung:OpBundlesIndirect, r=Mark-Simulacrum llvm RustWrapper: explain OpBundlesIndirect argument type Follow-up to rust-lang#123941 r? ``@Mark-Simulacrum``
Rust passes a
*const &OperandBundleDef
to these APIs, usually from aVec<&OperandBundleDef>
or so. Previously we were dereferencing that pointer and passing it to the ArrayRef constructor with some length (N).This meant that if the length was 0, we were dereferencing a pointer to nowhere (if the vector on the Rust side didn't actually get allocated or so), and if the length was >1 then loading the second element somewhere in LLVM would've been reading past the end.
Since Rust can't hold OperandBundleDef by-value we're forced to indirect through a vector that copies out the OperandBundleDefs from the by-reference list on the Rust side in order to match the LLVM expected API.