Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Replace some operators in libcore with their short-circuiting equivalents #90346

Merged
merged 5 commits into from
Oct 30, 2021

Conversation

pietroalbini
Copy link
Member

In libcore there are a few occurrences of bitwise operators used in boolean expressions instead of their short-circuiting equivalents. This makes it harder to perform some kinds of source code analysis over libcore, for example MC/DC code coverage (a requirement in safety-critical environments).

This PR aims to remove as many bitwise operators in boolean expressions from libcore as possible, without any performance regression and without other changes. This means not all bitwise operators are removed, only the ones that don't have any difference with their short-circuiting counterparts. This already simplifies achieving MC/DC coverage, and the other functions can be changed in future PRs.

The PR is best reviewed commit-by-commit, and each commit has the resulting assembly in the message.

Checked integer methods

These methods recently switched to bitwise operators in PRs #89459 and #89351. I confirmed bitwise operators are needed in most of the functions, except these two:

@tspiteri already mentioned this was the case in #89459 (comment), but opted to also switch those two to bitwise operators for consistency. As that makes MC/DC analysis harder this PR proposes switching those two back to short-circuiting operators.

{unsigned_ints}::carrying_add

Godbolt link (1.56.0)

In this instance replacing the | with || produces the exact same assembly when optimizations are enabled, so switching to the short-circuiting operator shouldn't have any impact.

{unsigned_ints}::borrowing_sub

Godbolt link (1.56.0)

In this instance replacing the | with || produces the exact same assembly when optimizations are enabled, so switching to the short-circuiting operator shouldn't have any impact.

String UTF-8 validation

Godbolt link (1.56.0)

In this instance replacing the | with || produces practically the same assembly, with the two operands for the "or" swapped:

; Old
mov  rax, qword ptr [rdi + rdx + 8]
or   rax, qword ptr [rdi + rdx]
test rax, r9
je   .LBB0_7

; New
mov  rax, qword ptr [rdi + rdx]
or   rax, qword ptr [rdi + rdx + 8]
test rax, r8
je   .LBB0_7

Using short-circuiting operators makes it easier to perform some kinds
of source code analysis, like MC/DC code coverage (a requirement in
safety-critical environments). The optimized x86_64 assembly is the same
between the old and new versions:

```
mov eax, edi
add dl, -1
adc eax, esi
setb dl
ret
```
Using short-circuiting operators makes it easier to perform some kinds
of source code analysis, like MC/DC code coverage (a requirement in
safety-critical environments). The optimized x86_64 assembly is the same
between the old and new versions:

```
mov eax, edi
add dl, -1
sbb eax, esi
setb dl
ret
```
Using short-circuiting operators makes it easier to perform some kinds
of source code analysis, like MC/DC code coverage (a requirement in
safety-critical environments). The optimized x86_64 assembly is
equivalent between the old and new versions.

Old assembly of that condition:

```
mov  rax, qword ptr [rdi + rdx + 8]
or   rax, qword ptr [rdi + rdx]
test rax, r9
je   .LBB0_7
```

New assembly of that condition:

```
mov  rax, qword ptr [rdi + rdx]
or   rax, qword ptr [rdi + rdx + 8]
test rax, r8
je   .LBB0_7
```
Using short-circuit operators makes it easier to perform some kinds of
source code analysis, like MC/DC code coverage (a requirement in
safety-critical environments). The optimized x86 assembly is the same
between the old and new versions:

```
xor eax, eax
test esi, esi
je .LBB0_1
cmp edi, -2147483648
jne .LBB0_4
cmp esi, -1
jne .LBB0_4
ret
.LBB0_1:
ret
.LBB0_4:
mov eax, edi
cdq
idiv esi
mov edx, eax
mov eax, 1
ret
```
Using short-circuit operators makes it easier to perform some kinds of
source code analysis, like MC/DC code coverage (a requirement in
safety-critical environments). The optimized x86 assembly is the same
between the old and new versions:

```
xor eax, eax
test esi, esi
je .LBB0_1
cmp edi, -2147483648
jne .LBB0_4
cmp esi, -1
jne .LBB0_4
ret
.LBB0_1:
ret
.LBB0_4:
mov eax, edi
cdq
idiv esi
mov eax, 1
ret
```
@rust-highfive
Copy link
Collaborator

r? @yaahc

(rust-highfive has picked a reviewer for you, use r? to override)

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Oct 27, 2021
@thomcc
Copy link
Member

thomcc commented Oct 27, 2021

Are the optimizations here checked at both -Copt-level=3 and a size-optimization level (like -Copt-level=z)?

@pietroalbini
Copy link
Member Author

The optimiziations were checked with -O, which is equivalent to -C opt-level=2. I'll check -C opt-level=z tomorrow.

@camelid camelid added the T-libs Relevant to the library team, which will review and decide on the PR/issue. label Oct 27, 2021
@the8472
Copy link
Member

the8472 commented Oct 27, 2021

String UTF-8 validation

Huh, interesting. I assumed that this would get vectorized, at least with some recent target-cpu. But it's only SWAR.

@pietroalbini
Copy link
Member Author

Looking at Godbolt for -C opt-level=z, the results are the same as -O and -C opt-level=3.

@oli-obk
Copy link
Contributor

oli-obk commented Oct 28, 2021

This makes it harder to perform some kinds of source code analysis over libcore,

Can you elaborate on this? Why would the short circuiting affect the analysis? Doesn't that mean the analysis should support bitwise ops in a way that it behaves similar to the short circuiting op?

Also: are you performing these analyses on MIR (with or without opts?)?

@pietroalbini
Copy link
Member Author

Can you elaborate on this? Why would the short circuiting affect the analysis? Doesn't that mean the analysis should support bitwise ops in a way that it behaves similar to the short circuiting op?

The safety standards we (Ferrocene) are targeting require that libcore achieves full MC/DC code coverage. With short-circuiting operators we can prove that just doing decision coverage is enough to achieve MC/DC, but with bitwise operators we're forced to add MC/DC instrumentation as we can't prove the same.

Also: are you performing these analyses on MIR (with or without opts?)?

We need to run the coverage analysis on the final binary, and the most promising solution right now is to perform the analysis on the unoptimized assembly generated by the compiler. We can't do the analysis on MIR nor use the compiler's builtin code coverage support as that would result in a chicken-and-egg problem (with the compiler needing to be qualified to qualify the compiler).

@oli-obk
Copy link
Contributor

oli-obk commented Oct 28, 2021

perform the analysis on the unoptimized assembly generated by the compiler

😨 so, the program that'll run in the safety critical env will need to be the unoptimized one, too?

With short-circuiting operators we can prove that just doing decision coverage is enough to achieve MC/DC, but with bitwise operators we're forced to add MC/DC instrumentation as we can't prove the same.

I got that, but not why. Especially in unoptimized asm, can't you track that a register has multiple values anded/ored and make the jump behave as if it were two chained jumps?

That said, I believe it would be a straight forward mir transform to replace all bitops followed by a switch into the equivalent short circuit mir. This would give you much wider coverage, without affecting anyone but the compiler team that'll maintain the mir transform. For now the transform could have a custom -Z flag, but once we get #77665 rebased and merged you can just select to run no opts but that one

@workingjubilee
Copy link
Member

perform the analysis on the unoptimized assembly generated by the compiler

😨 so, the program that'll run in the safety critical env will need to be the unoptimized one, too?

Is it not possible to sufficiently prove equivalence between two assembly outputs for optimization validation? There are many formal models for a reasonably large subset of x86 already, I think this is not beyond existing verification tools.

@oli-obk
Copy link
Contributor

oli-obk commented Oct 29, 2021

r? @oli-obk

@rust-highfive rust-highfive assigned oli-obk and unassigned yaahc Oct 29, 2021
@oli-obk
Copy link
Contributor

oli-obk commented Oct 29, 2021

@bors try @rust-timer queue

@rust-timer
Copy link
Collaborator

Awaiting bors try build completion.

@rustbot label: +S-waiting-on-perf

@rustbot rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Oct 29, 2021
@bors
Copy link
Contributor

bors commented Oct 29, 2021

⌛ Trying commit 68a4460 with merge 8508dc2f4d8d85d812ea371de5baca75ca7c4020...

@pietroalbini
Copy link
Member Author

pietroalbini commented Oct 29, 2021

We picked an existing qualified tool to perform the code coverage analysis, and the lack of non-short-circuiting operators is one of the requirements it has. It could be possible to perform additional transformations, analysis and proofs on top of it, but all of that work would need to be submitted to the safety inspectors and be approved by them, while the tool we're using is already pre-qualified. Just changing the operators in libcore whenever it doesn't impact performance is less overall work for everyone.

We don't want to put the burden on the teams or the contributors on ensuring no short-circuiting operators land in libcore though. It's perfectly fine for the project to accept new changes with them, and if the newly added uses of the operators could be removed without impacting performance we'll take the burden of sending PRs after the fact.

@oli-obk
Copy link
Contributor

oli-obk commented Oct 29, 2021

Thanks for the explanation. That sounds good to me, so with perf clean r=me

@bjorn3
Copy link
Member

bjorn3 commented Oct 29, 2021

and the lack of short-circuiting operators is one of the requirements it has

😕 This PR changes a couple of operators to short-circuiting operators, but here you say that there must not be short-circuiting operators for this tool to work. Or did I misunderstand something?

@pietroalbini
Copy link
Member Author

Oh that was a typo! Thanks for catching it @bjorn3, I fixed the sentence :)

@bors
Copy link
Contributor

bors commented Oct 29, 2021

☀️ Try build successful - checks-actions
Build commit: 8508dc2f4d8d85d812ea371de5baca75ca7c4020 (8508dc2f4d8d85d812ea371de5baca75ca7c4020)

@rust-timer
Copy link
Collaborator

Queued 8508dc2f4d8d85d812ea371de5baca75ca7c4020 with parent 9ed5b94, future comparison URL.

@rust-timer
Copy link
Collaborator

Finished benchmarking commit (8508dc2f4d8d85d812ea371de5baca75ca7c4020): comparison url.

Summary: This change led to small relevant improvements 🎉 in compiler performance.

  • Small improvement in instruction counts (up to -1.2% on incr-full builds of coercions)

If you disagree with this performance assessment, please file an issue in rust-lang/rustc-perf.

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR led to changes in compiler perf.

@bors rollup=never
@rustbot label: +S-waiting-on-review -S-waiting-on-perf -perf-regression

@rustbot rustbot removed the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Oct 29, 2021
@oli-obk
Copy link
Contributor

oli-obk commented Oct 29, 2021

@bors r+

@bors
Copy link
Contributor

bors commented Oct 29, 2021

📌 Commit 68a4460 has been approved by oli-obk

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Oct 29, 2021
@bors
Copy link
Contributor

bors commented Oct 29, 2021

⌛ Testing commit 68a4460 with merge 6d42707...

@bors
Copy link
Contributor

bors commented Oct 30, 2021

☀️ Test successful - checks-actions
Approved by: oli-obk
Pushing 6d42707 to master...

@bors bors added the merged-by-bors This PR was explicitly merged by bors. label Oct 30, 2021
@bors bors merged commit 6d42707 into rust-lang:master Oct 30, 2021
@rustbot rustbot added this to the 1.58.0 milestone Oct 30, 2021
@rust-timer
Copy link
Collaborator

Finished benchmarking commit (6d42707): comparison url.

Summary: This benchmark run did not return any relevant changes.

If you disagree with this performance assessment, please file an issue in rust-lang/rustc-perf.

@rustbot label: -perf-regression

@pietroalbini pietroalbini deleted the pa-short-circuit branch October 30, 2021 09:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
merged-by-bors This PR was explicitly merged by bors. S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-libs Relevant to the library team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.