Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix UB in LLVM FFI when passing zero or >1 bundle #123941

Merged
merged 1 commit into from
Apr 15, 2024

Conversation

Mark-Simulacrum
Copy link
Member

@Mark-Simulacrum Mark-Simulacrum commented Apr 14, 2024

Rust passes a *const &OperandBundleDef to these APIs, usually from a Vec<&OperandBundleDef> or so. Previously we were dereferencing that pointer and passing it to the ArrayRef constructor with some length (N).

This meant that if the length was 0, we were dereferencing a pointer to nowhere (if the vector on the Rust side didn't actually get allocated or so), and if the length was >1 then loading the second element somewhere in LLVM would've been reading past the end.

Since Rust can't hold OperandBundleDef by-value we're forced to indirect through a vector that copies out the OperandBundleDefs from the by-reference list on the Rust side in order to match the LLVM expected API.

@rustbot
Copy link
Collaborator

rustbot commented Apr 14, 2024

r? @cuviper

rustbot has assigned @cuviper.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Apr 14, 2024
@Mark-Simulacrum
Copy link
Member Author

Caught this while working on #123936.

@Mark-Simulacrum
Copy link
Member Author

Perf is probably fairly useless because we don't have bundles AFAICT outside of MSVC or CFI. I suspect that there's potential for optimizing here to avoid copies in the case that N=1 because ArrayRef works out, and that is likely a relatively common case, but this seems like the right fix to start with and should be cheap on Linux since the vector isn't ever pushed to.

Copy link
Contributor

@nikic nikic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

r=me with a nit.

compiler/rustc_llvm/llvm-wrapper/RustWrapper.cpp Outdated Show resolved Hide resolved
compiler/rustc_llvm/llvm-wrapper/RustWrapper.cpp Outdated Show resolved Hide resolved
Rust passes a *const &OperandBundleDef to these APIs, usually from a
Vec<&OperandBundleDef> or so. Previously we were dereferencing that
pointer and passing it to the ArrayRef constructor with some length (N).

This meant that if the length was 0, we were dereferencing a pointer to
nowhere, and if the length was >1 then loading the *second* element
somewhere in LLVM would've been reading past the end.

Since Rust can't hold OperandBundleDef by-value we're forced to indirect
through a vector that copies out the OperandBundleDefs from the
by-reference list on the Rust side in order to match the LLVM expected
API.
@Mark-Simulacrum
Copy link
Member Author

@bors r=nikic

r? nikic

@bors
Copy link
Contributor

bors commented Apr 15, 2024

📌 Commit bf3decc has been approved by nikic

It is now in the queue for this repository.

@rustbot rustbot assigned nikic and unassigned cuviper Apr 15, 2024
@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Apr 15, 2024
bors added a commit to rust-lang-ci/rust that referenced this pull request Apr 15, 2024
Fix UB in LLVM FFI when passing zero or >1 bundle

Rust passes a `*const &OperandBundleDef` to these APIs, usually from a `Vec<&OperandBundleDef>` or so. Previously we were dereferencing that pointer and passing it to the ArrayRef constructor with some length (N).

This meant that if the length was 0, we were dereferencing a pointer to nowhere (if the vector on the Rust side didn't actually get allocated or so), and if the length was >1 then loading the *second* element somewhere in LLVM would've been reading past the end.

Since Rust can't hold OperandBundleDef by-value we're forced to indirect through a vector that copies out the OperandBundleDefs from the by-reference list on the Rust side in order to match the LLVM expected API.
@bors
Copy link
Contributor

bors commented Apr 15, 2024

⌛ Testing commit bf3decc with merge 55779ee...

@rust-log-analyzer
Copy link
Collaborator

The job x86_64-gnu-aux failed! Check out the build log: (web) (plain)

Click to see the possible cause of the failure (guessed by this bot)

@bors
Copy link
Contributor

bors commented Apr 15, 2024

💔 Test failed - checks-actions

@bors bors added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. labels Apr 15, 2024
@Mark-Simulacrum
Copy link
Member Author

@bors retry - miri reports a memory leak but I doubt it's related to this PR (#123961 for follow-up)

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Apr 15, 2024
@@ -1597,11 +1613,18 @@ LLVMRustBuildCallBr(LLVMBuilderRef B, LLVMTypeRef Ty, LLVMValueRef Fn,
IndirectDestsUnwrapped.push_back(unwrap(IndirectDests[i]));
}

// FIXME: Is there a way around this?
Copy link
Member

@RalfJung RalfJung Apr 15, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So to make sure I understand correctly... in Rust we have this data as an array of pointers such that the actual OperandBundleDef are non-contiguous, but LLVM wants them as a single flat array with all the OperandBundleDef next to each other? So really in Rust we'd want Vec<OperandBundleDef> instead of Vec<&OperandBundleDef>? But for some reason this can't be by-val in Rust, despite LLVM requiring this as a packed array (which can only be created by-val)?

May be worth adding some comment to explain this...

The fact that in Rust we have two types called OperandBundleDef where one is a pointer to the other does not help either.^^

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that's right. OperandBundleDef is a C++ class - we can't easily pass that around by value in Rust without something like cxx to produce the right bindings (and even then it might not be possible if anything isn't trivially movable inside it).

I'm not sure there's a good place for a comment. I think long-term if we want to optimize this the right strategy is to have our wrapper code allocate the Vec and let Rust push into that Vec (roughly speaking).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure we have two types on the rust side (can't grep right now). The OperandBundleDef I'm aware of is an extern type (I.e. lacks a size) so it can only be held by reference. It's the same thing as the C++ OperandBundleDef though.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah we have two, see here.

jieyouxu added a commit to jieyouxu/rust that referenced this pull request Apr 15, 2024
Fix UB in LLVM FFI when passing zero or >1 bundle

Rust passes a `*const &OperandBundleDef` to these APIs, usually from a `Vec<&OperandBundleDef>` or so. Previously we were dereferencing that pointer and passing it to the ArrayRef constructor with some length (N).

This meant that if the length was 0, we were dereferencing a pointer to nowhere (if the vector on the Rust side didn't actually get allocated or so), and if the length was >1 then loading the *second* element somewhere in LLVM would've been reading past the end.

Since Rust can't hold OperandBundleDef by-value we're forced to indirect through a vector that copies out the OperandBundleDefs from the by-reference list on the Rust side in order to match the LLVM expected API.
@@ -1524,13 +1524,21 @@ extern "C" void LLVMRustFreeOperandBundleDef(OperandBundleDef *Bundle) {

extern "C" LLVMValueRef LLVMRustBuildCall(LLVMBuilderRef B, LLVMTypeRef Ty, LLVMValueRef Fn,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
extern "C" LLVMValueRef LLVMRustBuildCall(LLVMBuilderRef B, LLVMTypeRef Ty, LLVMValueRef Fn,
// OpBundlesIndirect is an array of pointers (*not* a pointer to an array).
extern "C" LLVMValueRef LLVMRustBuildCall(LLVMBuilderRef B, LLVMTypeRef Ty, LLVMValueRef Fn,

Something like, maybe, for the comment?

"Pointer to array" is how the old code treated it.

bors added a commit to rust-lang-ci/rust that referenced this pull request Apr 15, 2024
Rollup of 12 pull requests

Successful merges:

 - rust-lang#123423 (Distribute LLVM bitcode linker as a preview component)
 - rust-lang#123548 (libtest: also measure time in Miri)
 - rust-lang#123666 (Fix some typos in doc)
 - rust-lang#123864 (Remove a HACK by instead inferring opaque types during expected/formal type checking)
 - rust-lang#123896 (Migrate some diagnostics in `rustc_resolve` to session diagnostic)
 - rust-lang#123919 (builtin-derive: tag → discriminant)
 - rust-lang#123922 (Remove magic constants when using `base_n`.)
 - rust-lang#123931 (Don't leak unnameable types in `-> _` recover)
 - rust-lang#123933 (move the LargeAssignments lint logic into its own file)
 - rust-lang#123934 (`rustc_data_structures::graph` mini refactor)
 - rust-lang#123941 (Fix UB in LLVM FFI when passing zero or >1 bundle)
 - rust-lang#123957 (disable create_dir_all_bare test on all(miri, windows))

r? `@ghost`
`@rustbot` modify labels: rollup
@bors bors merged commit 4f3a39b into rust-lang:master Apr 15, 2024
11 of 12 checks passed
@rustbot rustbot added this to the 1.79.0 milestone Apr 15, 2024
rust-timer added a commit to rust-lang-ci/rust that referenced this pull request Apr 15, 2024
Rollup merge of rust-lang#123941 - Mark-Simulacrum:fix-llvm-ub, r=nikic

Fix UB in LLVM FFI when passing zero or >1 bundle

Rust passes a `*const &OperandBundleDef` to these APIs, usually from a `Vec<&OperandBundleDef>` or so. Previously we were dereferencing that pointer and passing it to the ArrayRef constructor with some length (N).

This meant that if the length was 0, we were dereferencing a pointer to nowhere (if the vector on the Rust side didn't actually get allocated or so), and if the length was >1 then loading the *second* element somewhere in LLVM would've been reading past the end.

Since Rust can't hold OperandBundleDef by-value we're forced to indirect through a vector that copies out the OperandBundleDefs from the by-reference list on the Rust side in order to match the LLVM expected API.
jieyouxu added a commit to jieyouxu/rust that referenced this pull request Apr 20, 2024
…-Simulacrum

llvm RustWrapper: explain OpBundlesIndirect argument type

Follow-up to rust-lang#123941

r? `@Mark-Simulacrum`
jieyouxu added a commit to jieyouxu/rust that referenced this pull request Apr 20, 2024
…-Simulacrum

llvm RustWrapper: explain OpBundlesIndirect argument type

Follow-up to rust-lang#123941

r? ``@Mark-Simulacrum``
rust-timer added a commit to rust-lang-ci/rust that referenced this pull request Apr 21, 2024
Rollup merge of rust-lang#124132 - RalfJung:OpBundlesIndirect, r=Mark-Simulacrum

llvm RustWrapper: explain OpBundlesIndirect argument type

Follow-up to rust-lang#123941

r? ``@Mark-Simulacrum``
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants