Rollup of 7 pull requests #95941

Dylan-DPC · 2022-04-11T14:56:41Z

Successful merges:

#95320 (Document the current MIR semantics that are clear from existing code)
#95801 (Replace RwLock by a futex based one on Linux)
#95864 (Fix miscompilation of inline assembly with outputs in cases where we emit an invoke instead of call instruction.)
#95894 (Fix formatting error in pin.rs docs)
#95895 (Clarify str::from_utf8_unchecked's invariants)
#95901 (Remove duplicate aliases for check codegen_{cranelift,gcc} and fix build codegen_gcc)
#95927 (CI: do not compile libcore twice when performing LLVM PGO)

Failed merges:

r? @ghost
@rustbot modify labels: rollup

We may sometimes emit an `invoke` instead of a `call` for inline assembly during the MIR -> LLVM IR lowering. But we failed to update the IR builder's current basic block before writing the results to the outputs. This would result in invalid IR because the basic block would end in a `store` instruction, which isn't a valid terminator.

Document the current MIR semantics that are clear from existing code

This PR adds documentation to places, operands, rvalues, statementkinds, and terminatorkinds that describes their existing semantics and requirements. In many places the semantics depend on the Rust memory model or other T-Lang decisions - when this is the case, it is just noted as such with links to UCG issues where possible. I'm hopeful that none of the documentation added here can be used to justify optimizations that depend on the memory model. The documentation for places and operands probably comes closest to running afoul of this - if people think that it cannot be merged as is, it can definitely also be taken out.

The goal here is to only document parts of MIR that seem to be decided already, or are at least depended on by existing code. That leaves quite a number of open questions - those are marked as "needs clarification." I'm not sure what to do with those in this PR - we obviously can't decide all these questions here. Should I just leave them in as is? Take them out? Keep them in but as `//` instead of `///` comments?

If this is too big to review at once, I can split this up.

r? rust-lang/mir-opt

Replace RwLock by a futex based one on Linux

This replaces the pthread-based RwLock on Linux by a futex based one.

This implementation is similar to [the algorithm](https://gist.github.com/kprotty/3042436aa55620d8ebcddf2bf25668bc) suggested by `@kprotty`, but modified to prefer writers and spin before sleeping. It uses two futexes: One for the readers to wait on, and one for the writers to wait on. The readers futex contains the state of the RwLock: The number of readers, a bit indicating whether writers are waiting, and a bit indicating whether readers are waiting. The writers futex is used as a simple condition variable and its contents are meaningless; it just needs to be changed on every notification.

Using two futexes rather than one has the obvious advantage of allowing a separate queue for readers and writers, but it also means we avoid the problem a single-futex RwLock would have of making it hard for a writer to go to sleep while the number of readers is rapidly changing up and down, as the writers futex is only changed when we actually want to wake up a writer.

It always prefers writers, as we decided [here](rust-lang#93740 (comment)).

To be able to prefer writers, it relies on futex_wake to return the number of awoken threads to be able to handle write-unlocking while both the readers-waiting and writers-waiting bits are set. Instead of waking both and letting them race, it first wakes writers and only continues to wake the readers too if futex_wake reported there were no writers to wake up.

r? `@Amanieu`
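As a rough illustration of the state word and the writer-preferring wake-up described above, here is a minimal sketch. The bit positions, the `Woken` enum, and every function name are assumptions made for this example and are not taken from the PR; the two closures stand in for `futex_wake` calls on the writers and readers futexes, with the first returning whether any thread was actually woken.

```rust
// Hypothetical bit layout for the readers futex; the constants used by the
// actual implementation may differ.
const READER_MASK: u32 = (1 << 30) - 1;
const READERS_WAITING: u32 = 1 << 30;
const WRITERS_WAITING: u32 = 1 << 31;

/// Number of readers encoded in the state word.
fn reader_count(state: u32) -> u32 {
    state & READER_MASK
}

/// True if the readers-waiting bit is set.
fn has_readers_waiting(state: u32) -> bool {
    state & READERS_WAITING != 0
}

/// True if the writers-waiting bit is set.
fn has_writers_waiting(state: u32) -> bool {
    state & WRITERS_WAITING != 0
}

/// Which queue a write-unlock ended up waking (illustrative only).
#[derive(Debug, PartialEq)]
enum Woken {
    Nobody,
    Writer,
    Readers,
}

/// Writer-preferring wake policy on write-unlock.
///
/// `wake_writer` models a futex_wake on the writers futex and reports whether
/// a thread was woken; `wake_readers` models waking the readers futex.
fn wake_after_write_unlock(
    state: u32,
    wake_writer: impl FnOnce() -> bool,
    wake_readers: impl FnOnce(),
) -> Woken {
    // Try writers first; stop here if one was really woken.
    if has_writers_waiting(state) && wake_writer() {
        return Woken::Writer;
    }
    // No writer was woken, so fall through to the readers.
    if has_readers_waiting(state) {
        wake_readers();
        return Woken::Readers;
    }
    Woken::Nobody
}
```

With both waiting bits set, `wake_after_write_unlock` only touches the readers when the writer wake-up reports that nobody was woken, which is why the return value of `futex_wake` matters for this design.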

…compile, r=Amanieu

Fix miscompilation of inline assembly with outputs in cases where we emit an invoke instead of call instruction.

We ran into this bug where rustc would segfault while trying to compile certain uses of inline assembly.

Here is a simple repro that demonstrates the issue:

```rust
#![feature(asm_unwind)]

fn main() {
    let _x = String::from("string here just cause we need something with a non-trivial drop");
    let foo: u64;
    unsafe {
        std::arch::asm!(
            "mov {}, 1",
            out(reg) foo,
            options(may_unwind)
        );
    }
    println!("{}", foo);
}
```

([playground link](https://play.rust-lang.org/?version=nightly&mode=debug&edition=2021&gist=7d6641e83370d2536a07234aca2498ff))

But crucially `feature(asm_unwind)` is not actually needed and this can be triggered on stable as a result of the way async functions/generators are handled in the compiler. e.g.:

```rust
extern crate futures; // 0.3.21

async fn bar() {
    let foo: u64;
    unsafe {
        std::arch::asm!(
            "mov {}, 1",
            out(reg) foo,
        );
    }
    println!("{}", foo);
}

fn main() {
    futures::executor::block_on(bar());
}
```

([playground link](https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=1c7781c34dd4a3e80ae4bd936a0c82fc))

An example of the incorrect LLVM generated:

```llvm
bb1:                                              ; preds = %start
  %1 = invoke i64 asm sideeffect alignstack inteldialect unwind "mov ${0:q}, 1", "=&r,~{dirflag},~{fpsr},~{flags},~{memory}"()
          to label %bb2 unwind label %cleanup, !srcloc !9
  store i64 %1, i64* %foo, align 8

bb2:
[...snip...]
```

The store should not be placed after the asm invoke but rather should be in the normal control flow basic block (`bb2` in this case).

[Here](https://gist.github.com/luqmana/be1af5b64d2cda5a533e3e23a7830b44) is a writeup of the investigation that led to finding this.

Fix formatting error in pin.rs docs

Not sure if there's more formatting issues I missed; I kinda lost interest reading midway through.

Clarify str::from_utf8_unchecked's invariants

Specifically, make it clear that it is immediately UB to pass ill-formed UTF-8 into the function. The previous wording left space to interpret that the UB only occurred when calling another function, which "assumes that `&str`s are valid UTF-8."

This does not change whether str being UTF-8 is a safety or a validity invariant. (As per previous discussion, it is a safety invariant, not a validity invariant.) It just makes it clear that valid UTF-8 is a precondition of str::from_utf8_unchecked, and that emitting an Abstract Machine fault (e.g. UB or a sanitizer error) on invalid UTF-8 is a valid thing to do.

If user code wants to create an unsafe `&str` pointing to ill-formed UTF-8, it must be done via transmutes. Also, just, don't.

Zulip discussion: https://rust-lang.zulipchat.com/#narrow/stream/136281-t-lang.2Fwg-unsafe-code-guidelines/topic/str.3A.3Afrom_utf8_unchecked.20Safety.20requirement
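To make the precondition concrete, the sketch below contrasts the checked and unchecked conversions. The wrapper names `checked` and `trusted` are invented for this example; only `str::from_utf8` and `str::from_utf8_unchecked` are real standard library functions.

```rust
use std::str;

/// Validates the bytes first, so the caller has no precondition to uphold.
fn checked(bytes: &[u8]) -> Option<&str> {
    str::from_utf8(bytes).ok()
}

/// Skips validation entirely.
///
/// # Safety
/// `bytes` must be well-formed UTF-8. Passing anything else is a violation of
/// this function's precondition at the call itself, not merely at some later
/// use of the returned `&str`.
unsafe fn trusted(bytes: &[u8]) -> &str {
    // Debug-only sanity check; release builds perform no validation.
    debug_assert!(str::from_utf8(bytes).is_ok());
    // SAFETY: the caller guarantees that `bytes` is well-formed UTF-8.
    unsafe { str::from_utf8_unchecked(bytes) }
}
```

A caller that cannot prove its input is valid should use the checked path, for example `checked(&[0xff, 0xfe])` yields `None`, and reserve `trusted` for bytes that came from an existing `&str` or were validated earlier.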

…Mark-Simulacrum

Remove duplicate aliases for `check codegen_{cranelift,gcc}` and fix `build codegen_gcc`

* Remove duplicate aliases

  Bootstrap already allows selecting these in `PathSet::has`, which allows any string that matches the end of a full path. I found these by adding `assert!(path.exists())` in `StepDescription::paths`. I think ideally we wouldn't have any aliases that aren't paths, but I've held off on enforcing that here since it may be controversial; I'll open a separate PR.

* Add `build compiler/rustc_codegen_gcc` as an alias for `CodegenBackend`

  These paths (`_cranelift` and `_gcc`) are somewhat misleading, since they actually tell bootstrap to build *all* codegen backends. But this seems like a useful improvement in the meantime.

cc ``@bjorn3`` ``@antoyo``
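The suffix matching mentioned in the first bullet can be pictured with the sketch below. The `PathSet` struct here is a simplified stand-in written for this example and is not bootstrap's actual type; it assumes the check behaves like `Path::ends_with`, which compares whole path components.

```rust
use std::path::{Path, PathBuf};

/// Simplified, hypothetical stand-in for the set of paths a step is
/// registered under.
struct PathSet {
    paths: Vec<PathBuf>,
}

impl PathSet {
    /// True if `needle` matches the end of any registered path,
    /// component by component.
    fn has(&self, needle: &Path) -> bool {
        self.paths.iter().any(|p| p.ends_with(needle))
    }
}
```

Under that assumption, a step registered as `compiler/rustc_codegen_gcc` is already selectable as `rustc_codegen_gcc`, so a separate alias with that name would be redundant, while a partial component such as `codegen_gcc` would not match.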

CI: do not compile libcore twice when performing LLVM PGO

I forgot to delete the first compilation when modifying this file in a previous PR.

r? ``@lqd``

Dylan-DPC · 2022-04-11T14:57:06Z

@bors r+ rollup=never p=5

bors · 2022-04-11T14:57:08Z

📌 Commit 633e004 has been approved by Dylan-DPC

bors · 2022-04-11T17:47:20Z

🔒 Merge conflict

This pull request and the master branch diverged in a way that cannot be automatically merged. Please rebase on top of the latest master branch, and let the reviewer approve again.

How do I rebase?

Assuming self is your fork and upstream is this repository, you can resolve the conflict following these steps:

1. `git checkout rollup-9k6ryns` (switch to your branch)
2. `git fetch upstream master` (retrieve the latest master)
3. `git rebase upstream/master -p` (rebase on top of it)
4. Follow the on-screen instruction to resolve conflicts (check `git status` if you got lost).
5. `git push self rollup-9k6ryns --force-with-lease` (update this PR)

You may also read Git Rebasing to Resolve Conflicts by Drew Blessing for a short tutorial.

Please avoid the "Resolve conflicts" button on GitHub. It uses git merge instead of git rebase which makes the PR commit history more difficult to read.

Sometimes step 4 will complete without asking for resolution. This is usually due to difference between how Cargo.lock conflict is handled during merge and rebase. This is normal, and you should still perform step 5 to update this PR.

Error message

Auto-merging compiler/rustc_middle/src/mir/mod.rs
CONFLICT (content): Merge conflict in compiler/rustc_middle/src/mir/mod.rs
Auto-merging compiler/rustc_const_eval/src/transform/validate.rs
Automatic merge failed; fix conflicts and then commit the result.

Dylan-DPC · 2022-04-11T17:53:13Z

@bors r-

bors · 2022-04-11T18:32:43Z

☔ The latest upstream changes (presumably #95125) made this pull request unmergeable. Please resolve the merge conflicts.

m-ou-se and others added 30 commits April 7, 2022 11:34

Return status from futex_wake().

f1a4041

Improve documentation of Place and Operand

4f28344

Adjust computation of place types to detect more invalid places

8368590

Add documentation for the semantics of MIR rvalues

e000179

Extend the MIR validator to check many more things around rvalues.

5fc8676

Improve documentation for MIR statement kinds.

148beaf

Improve documentation for MIR terminators

c996bc0

Adjust MIR validator to check a few more things for terminators

3c169f3

Improve MIR phases documentation with summaries of changes

8ef4af7

Address various comments and change some details around place to value conversions

14fb427

Add futex-based RwLock on Linux.

6cb463c

Fix typo in futex rwlock.

307aa58

Co-authored-by: Amanieu d'Antras <amanieu@gmail.com>

Add more clarifications in response to Ralf's comments

a5d2c04

Remove rule that place loads may not happen with variant index set

9745a17

Update asm-may_unwind test to handle use of asm with outputs.

0b2f360

Fix formatting error in pin.rs docs

bb3a071

Add build compiler/rustc_codegen_gcc as an alias for CodegenBackend

4c14383

These paths (`_cranelift` and `_gcc`) are somewhat misleading, since they actually tell bootstrap to build *all* codegen backends. But this seems like a useful improvement in the meantime.

CI: do not compile libcore twice when performing LLVM PGO

aeb3df7

Add doc comments to futex operations.

7c28791

Add comments to futex rwlock implementation.

1f2c2bb

Use compare_exchange_weak in futex rwlock implementation.

c4a4f48

Use is_ or has_ prefix for pure -> bool functions.

8339381

Rollup merge of rust-lang#95894 - nyanpasu64:fix-pin-docs, r=Dylan-DPC

fac70d8

Fix formatting error in pin.rs docs Not sure if there's more formatting issues I missed; I kinda lost interest reading midway through.

Dylan-DPC added 2 commits April 11, 2022 16:56

Rollup merge of rust-lang#95927 - Kobzol:ci-pgo-libcore, r=lqd

633e004

CI: do not compile libcore twice when performing LLVM PGO

I forgot to delete the first compilation when modifying this file in a previous PR.

r? ``@lqd``

rustbot added the T-compiler and rollup labels Apr 11, 2022

bors added the S-waiting-on-bors label Apr 11, 2022

bors added the S-waiting-on-author label and removed the S-waiting-on-bors label Apr 11, 2022

Dylan-DPC closed this Apr 11, 2022

Dylan-DPC deleted the rollup-9k6ryns branch April 11, 2022 18:45
