Replace some operators in libcore with their short-circuiting equivalents #90346
Using short-circuiting operators makes it easier to perform some kinds of source code analysis, like MC/DC code coverage (a requirement in safety-critical environments). The optimized x86_64 assembly is the same between the old and new versions:
```
    mov eax, edi
    add dl, -1
    adc eax, esi
    setb dl
    ret
```
Using short-circuiting operators makes it easier to perform some kinds of source code analysis, like MC/DC code coverage (a requirement in safety-critical environments). The optimized x86_64 assembly is the same between the old and new versions:
```
    mov eax, edi
    add dl, -1
    sbb eax, esi
    setb dl
    ret
```
Using short-circuiting operators makes it easier to perform some kinds of source code analysis, like MC/DC code coverage (a requirement in safety-critical environments). The optimized x86_64 assembly is equivalent between the old and new versions.
Old assembly of that condition:
```
    mov rax, qword ptr [rdi + rdx + 8]
    or rax, qword ptr [rdi + rdx]
    test rax, r9
    je .LBB0_7
```
New assembly of that condition:
```
    mov rax, qword ptr [rdi + rdx]
    or rax, qword ptr [rdi + rdx + 8]
    test rax, r8
    je .LBB0_7
```
Using short-circuit operators makes it easier to perform some kinds of source code analysis, like MC/DC code coverage (a requirement in safety-critical environments). The optimized x86 assembly is the same between the old and new versions:
```
    xor eax, eax
    test esi, esi
    je .LBB0_1
    cmp edi, -2147483648
    jne .LBB0_4
    cmp esi, -1
    jne .LBB0_4
    ret
.LBB0_1:
    ret
.LBB0_4:
    mov eax, edi
    cdq
    idiv esi
    mov edx, eax
    mov eax, 1
    ret
```
Using short-circuit operators makes it easier to perform some kinds of source code analysis, like MC/DC code coverage (a requirement in safety-critical environments). The optimized x86 assembly is the same between the old and new versions:
```
    xor eax, eax
    test esi, esi
    je .LBB0_1
    cmp edi, -2147483648
    jne .LBB0_4
    cmp esi, -1
    jne .LBB0_4
    ret
.LBB0_1:
    ret
.LBB0_4:
    mov eax, edi
    cdq
    idiv esi
    mov eax, 1
    ret
```
r? @yaahc (rust-highfive has picked a reviewer for you, use r? to override) |
Are the optimizations here checked at both -Copt-level=3 and a size-optimization level (like -Copt-level=z)? |
The optimizations were checked with |
Huh, interesting. I assumed that this would get vectorized, at least with some recent target-cpu. But it's only SWAR. |
Looking at Godbolt for |
Can you elaborate on this? Why would the short circuiting affect the analysis? Doesn't that mean the analysis should support bitwise ops in a way that it behaves similar to the short circuiting op? Also: are you performing these analyses on MIR (with or without opts?)? |
The safety standards we (Ferrocene) are targeting require that libcore achieves full MC/DC code coverage. With short-circuiting operators we can prove that just doing decision coverage is enough to achieve MC/DC, but with bitwise operators we're forced to add MC/DC instrumentation as we can't prove the same.
We need to run the coverage analysis on the final binary, and the most promising solution right now is to perform the analysis on the unoptimized assembly generated by the compiler. We can't do the analysis on MIR nor use the compiler's builtin code coverage support as that would result in a chicken-and-egg problem (with the compiler needing to be qualified to qualify the compiler). |
😨 so, the program that'll run in the safety critical env will need to be the unoptimized one, too?
I got that, but not why. Especially in unoptimized asm, can't you track that a register holds multiple values and-ed/or-ed together and make the jump behave as if it were two chained jumps? That said, I believe it would be a straightforward MIR transform to replace all bitops followed by a switch with the equivalent short-circuiting MIR. This would give you much wider coverage, without affecting anyone but the compiler team that'll maintain the MIR transform. For now the transform could have a custom -Z flag, but once we get #77665 rebased and merged you can just choose to run no opts except that one |
Is it not possible to sufficiently prove equivalence between two assembly outputs for optimization validation? There are many formal models for a reasonably large subset of x86 already, I think this is not beyond existing verification tools. |
r? @oli-obk |
@bors try @rust-timer queue |
Awaiting bors try build completion. @rustbot label: +S-waiting-on-perf |
⌛ Trying commit 68a4460 with merge 8508dc2f4d8d85d812ea371de5baca75ca7c4020... |
We picked an existing qualified tool to perform the code coverage analysis, and the lack of non-short-circuiting operators is one of the requirements it has. It could be possible to perform additional transformations, analysis and proofs on top of it, but all of that work would need to be submitted to the safety inspectors and be approved by them, while the tool we're using is already pre-qualified. Just changing the operators in libcore whenever it doesn't impact performance is less overall work for everyone. We don't want to put the burden of ensuring no non-short-circuiting operators land in libcore on the teams or the contributors, though. It's perfectly fine for the project to accept new changes with them, and if the newly added uses of the operators can be removed without impacting performance, we'll take on the burden of sending PRs after the fact. |
Thanks for the explanation. That sounds good to me, so with perf clean r=me |
😕 This PR changes a couple of operators to short-circuiting operators, but here you say that there must not be short-circuiting operators for this tool to work. Or did I misunderstand something? |
Oh that was a typo! Thanks for catching it @bjorn3, I fixed the sentence :) |
☀️ Try build successful - checks-actions |
Queued 8508dc2f4d8d85d812ea371de5baca75ca7c4020 with parent 9ed5b94, future comparison URL. |
Finished benchmarking commit (8508dc2f4d8d85d812ea371de5baca75ca7c4020): comparison url. Summary: This change led to small relevant improvements 🎉 in compiler performance.
If you disagree with this performance assessment, please file an issue in rust-lang/rustc-perf. Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR led to changes in compiler perf. @bors rollup=never |
@bors r+ |
📌 Commit 68a4460 has been approved by |
☀️ Test successful - checks-actions |
Finished benchmarking commit (6d42707): comparison url. Summary: This benchmark run did not return any relevant changes. If you disagree with this performance assessment, please file an issue in rust-lang/rustc-perf. @rustbot label: -perf-regression |
In libcore there are a few occurrences of bitwise operators used in boolean expressions instead of their short-circuiting equivalents. This makes it harder to perform some kinds of source code analysis over libcore, for example MC/DC code coverage (a requirement in safety-critical environments).
This PR aims to remove as many bitwise operators in boolean expressions from libcore as possible, without any performance regression and without other changes. This means not all bitwise operators are removed, only the ones that don't have any difference with their short-circuiting counterparts. This already simplifies achieving MC/DC coverage, and the other functions can be changed in future PRs.
The PR is best reviewed commit-by-commit, and each commit has the resulting assembly in the message.
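To make the distinction concrete, here is a minimal sketch (not code from libcore) of how `|` and `||` differ on `bool` operands. The helper names `probe`, `or_bitwise` and `or_short_circuit` are hypothetical and exist only for this illustration. Both operators produce the same result, but `||` skips the right operand when the left one is already `true`, which gives each condition its own branch in the control flow.

```rust
use std::cell::Cell;

/// Returns `value` and records that this operand was evaluated.
/// (Hypothetical helper, used only for this illustration.)
fn probe(evaluations: &Cell<u32>, value: bool) -> bool {
    evaluations.set(evaluations.get() + 1);
    value
}

/// `a | b` on booleans: both operands are always evaluated.
fn or_bitwise(a: bool, b: bool) -> (bool, u32) {
    let evaluations = Cell::new(0);
    let result = probe(&evaluations, a) | probe(&evaluations, b);
    (result, evaluations.get())
}

/// `a || b`: the right operand is skipped when the left one is `true`.
fn or_short_circuit(a: bool, b: bool) -> (bool, u32) {
    let evaluations = Cell::new(0);
    let result = probe(&evaluations, a) || probe(&evaluations, b);
    (result, evaluations.get())
}

fn main() {
    // The boolean result is identical for every input combination.
    for a in [false, true] {
        for b in [false, true] {
            assert_eq!(or_bitwise(a, b).0, or_short_circuit(a, b).0);
        }
    }
    // Only the number of evaluated operands differs.
    assert_eq!(or_bitwise(true, false).1, 2);
    assert_eq!(or_short_circuit(true, false).1, 1);
}
```

When the operands are plain comparisons without side effects, as in the functions touched by this PR, the two forms are observably equivalent, so the optimizer is free to emit the same machine code for both.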
Checked integer methods
These methods recently switched to bitwise operators in PRs #89459 and #89351. I confirmed bitwise operators are needed in most of the functions, except these two:
- {integer}::checked_div (Godbolt link (nightly))
- {integer}::checked_rem (Godbolt link (nightly))

@tspiteri already mentioned this was the case in #89459 (comment), but opted to also switch those two to bitwise operators for consistency. As that makes MC/DC analysis harder, this PR proposes switching those two back to short-circuiting operators.
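As a rough illustration of the two variants being compared, the sketch below writes a signed checked division both ways. It is a simplified stand-in for `i32` only, assuming the usual overflow condition (division by zero, or `i32::MIN / -1`); the function names are hypothetical and the real libcore methods are generated by macros and may use intrinsics instead of `/`.

```rust
/// Checked division using short-circuiting operators (the form this PR proposes).
fn checked_div_short_circuit(lhs: i32, rhs: i32) -> Option<i32> {
    if rhs == 0 || (lhs == i32::MIN && rhs == -1) {
        None
    } else {
        Some(lhs / rhs)
    }
}

/// The same check written with bitwise operators on booleans.
fn checked_div_bitwise(lhs: i32, rhs: i32) -> Option<i32> {
    if (rhs == 0) | ((lhs == i32::MIN) & (rhs == -1)) {
        None
    } else {
        Some(lhs / rhs)
    }
}

fn main() {
    // Both variants agree on ordinary inputs and on the two failure cases.
    let cases = [(7, 2), (7, 0), (i32::MIN, -1), (i32::MIN, 1), (-7, 2)];
    for (lhs, rhs) in cases {
        assert_eq!(
            checked_div_short_circuit(lhs, rhs),
            checked_div_bitwise(lhs, rhs)
        );
    }
}
```

Because every operand is a side-effect-free comparison, the two functions are observably equivalent, which is consistent with the identical optimized assembly reported for these methods.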
{unsigned_ints}::carrying_add
Godbolt link (1.56.0)
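For readers without the Godbolt link at hand, the following is a minimal sketch of a carrying addition built from two `overflowing_add` calls, written as free functions for `u32`. The function names are hypothetical and the body is an approximation for illustration, not a copy of the libcore source; the only difference between the two variants is the operator that combines the two overflow flags.

```rust
/// Carrying addition that combines the overflow flags with `||`.
fn carrying_add_short_circuit(lhs: u32, rhs: u32, carry: bool) -> (u32, bool) {
    let (sum, overflow_a) = lhs.overflowing_add(rhs);
    let (sum, overflow_b) = sum.overflowing_add(carry as u32);
    (sum, overflow_a || overflow_b)
}

/// The same computation combining the overflow flags with `|`.
fn carrying_add_bitwise(lhs: u32, rhs: u32, carry: bool) -> (u32, bool) {
    let (sum, overflow_a) = lhs.overflowing_add(rhs);
    let (sum, overflow_b) = sum.overflowing_add(carry as u32);
    (sum, overflow_a | overflow_b)
}

fn main() {
    let cases = [
        (1, 2, true),
        (u32::MAX, 0, true),
        (u32::MAX, 1, false),
        (u32::MAX, u32::MAX, true),
    ];
    for (lhs, rhs, carry) in cases {
        assert_eq!(
            carrying_add_short_circuit(lhs, rhs, carry),
            carrying_add_bitwise(lhs, rhs, carry)
        );
    }
}
```

Both flags are already computed values by the time they are combined, so skipping the second operand has no observable effect.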
In this instance replacing the `|` with `||` produces the exact same assembly when optimizations are enabled, so switching to the short-circuiting operator shouldn't have any impact.
{unsigned_ints}::borrowing_sub
Godbolt link (1.56.0)
In this instance replacing the `|` with `||` produces the exact same assembly when optimizations are enabled, so switching to the short-circuiting operator shouldn't have any impact.
String UTF-8 validation
Godbolt link (1.56.0)
In this instance replacing the `|` with `||` produces practically the same assembly, with the two operands for the "or" swapped: