Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Compress RWU from at least 32 bits to 4 bits #79727

Merged
merged 1 commit into from
Dec 8, 2020
Merged

Conversation

tmiasko
Copy link
Contributor

@tmiasko tmiasko commented Dec 5, 2020

The liveness uses a mixed representation of RWUs based on the
observation that most of them have invalid reader and invalid
writer. The packed variant uses 32 bits and unpacked 96 bits.
Unpacked data contains reader live node and writer live node.

Since live nodes are used only to determine their validity,
RWUs can always be stored in a packed form with four bits for
each: reader bit, writer bit, used bit, and one extra padding
bit to simplify packing and unpacking operations.

@lcnr
Copy link
Contributor

lcnr commented Dec 5, 2020

@bors try @rust-timer queue

@rust-timer
Copy link
Collaborator

Awaiting bors try build completion

@bors
Copy link
Contributor

bors commented Dec 5, 2020

⌛ Trying commit 02778c8c9a364c52053545d46089a428ccb4024e with merge b47bf19739a8e6218a9b76d6e157bc0d76ff9d91...

@bors
Copy link
Contributor

bors commented Dec 5, 2020

☀️ Try build successful - checks-actions
Build commit: b47bf19739a8e6218a9b76d6e157bc0d76ff9d91 (b47bf19739a8e6218a9b76d6e157bc0d76ff9d91)

@rust-timer
Copy link
Collaborator

Queued b47bf19739a8e6218a9b76d6e157bc0d76ff9d91 with parent 0781b44, future comparison URL.

@rust-timer
Copy link
Collaborator

Finished benchmarking try commit (b47bf19739a8e6218a9b76d6e157bc0d76ff9d91): comparison url.

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. Please note that if the perf results are neutral, you should likely undo the rollup=never given below by specifying rollup- to bors.

Importantly, though, if the results of this run are non-neutral do not roll this PR up -- it will mask other regressions or improvements in the roll up.

@bors rollup=never
@rustbot modify labels: +S-waiting-on-review -S-waiting-on-perf

@rustbot rustbot added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Dec 5, 2020
@tmiasko
Copy link
Contributor Author

tmiasko commented Dec 5, 2020

instructions:u
 clap-rs-check           avg: -1.9%	min: -4.8%	max:  0.0%
 inflate-check           avg: -2.8%	min: -3.9%	max: -0.0%
 inflate-debug           avg: -1.6%	min: -2.2%	max: -0.0%
 clap-rs-debug           avg: -0.7%	min: -1.8%	max:  0.0%
 keccak-check            avg: -0.9%	min: -1.4%	max:  0.0%
 cranelift-codegen-check avg: -0.8%	min: -1.3%	max:  0.0%

max-rss:
 keccak-debug      avg: -12.1%	 min: -26.1%	max:  0.1%
 cargo-opt         avg:  -1.8%	 min:  -7.0%	max:  0.5%
 inflate-opt       avg:  -1.3%?	 min:  -6.9%?	max:  0.9%?
 inflate-check     avg:  -4.5%	 min:  -6.6%	max: -0.0%
 inflate-debug     avg:  -1.0%	 min:  -4.5%	max:  1.5%
 style-servo-debug avg:   0.5%	 min:  -1.2%	max:  4.4%

@tmiasko tmiasko changed the title Compress RWU from 32 >= bits to 4 bits Compress RWU from at least 32 bits to 4 bits Dec 5, 2020
@tmiasko tmiasko force-pushed the 8bit-rwu branch 2 times, most recently from 1b0f854 to 9d12e73 Compare December 5, 2020 14:12
@tmiasko
Copy link
Contributor Author

tmiasko commented Dec 5, 2020

r? @lcnr

@lqd
Copy link
Member

lqd commented Dec 5, 2020

@bors try @rust-timer queue

@rust-timer
Copy link
Collaborator

Awaiting bors try build completion

@bors
Copy link
Contributor

bors commented Dec 5, 2020

⌛ Trying commit 84c38d870bc93bd269e843152866b91bdcb80551 with merge 16eae4b989421e982b659f3043d7a012f3f03861...

@tmiasko
Copy link
Contributor Author

tmiasko commented Dec 5, 2020

The current implementation trades some memory for simplicity by performing row operations using complete words. To do that it has to round up the size of a row up to the whole word. As a result it has average memory overhead of number of live nodes * word-size / 2. It might be actually beneficial to use smaller word size, let's find out.

@bors
Copy link
Contributor

bors commented Dec 5, 2020

☀️ Try build successful - checks-actions
Build commit: 16eae4b989421e982b659f3043d7a012f3f03861 (16eae4b989421e982b659f3043d7a012f3f03861)

@rust-timer
Copy link
Collaborator

Queued 16eae4b989421e982b659f3043d7a012f3f03861 with parent 551a2c6, future comparison URL.

@rust-timer
Copy link
Collaborator

Finished benchmarking try commit (16eae4b989421e982b659f3043d7a012f3f03861): comparison url.

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. Please note that if the perf results are neutral, you should likely undo the rollup=never given below by specifying rollup- to bors.

Importantly, though, if the results of this run are non-neutral do not roll this PR up -- it will mask other regressions or improvements in the roll up.

@bors rollup=never
@rustbot modify labels: +S-waiting-on-review -S-waiting-on-perf

@tmiasko
Copy link
Contributor Author

tmiasko commented Dec 5, 2020

instructions:u:
 clap-rs-check           avg: -1.9%	min: -4.8%	max:  0.0%
 inflate-check           avg: -2.8%	min: -3.9%	max: -0.0%
 inflate-debug           avg: -1.6%	min: -2.1%	max: -0.0%
 clap-rs-debug           avg: -0.7%	min: -1.7%	max:  0.0%
 keccak-check            avg: -0.9%	min: -1.4%	max: -0.0%
 cranelift-codegen-check avg: -0.8%	min: -1.4%	max:  0.0%

max-rss
 keccak-debug      avg: -5.0%	min: -11.0%	max:  0.1%
 cargo-debug       avg:  0.9%   min:  -0.9%     max:  4.4%
 inflate-opt       avg: -1.6%?	min:  -6.1%?	max:  0.8%?
 inflate-check     avg: -4.4%	min:  -6.7%	max: -0.0%
 inflate-debug     avg: -2.1%	min:  -6.9%	max:  2.9%
 style-servo-debug avg: -1.7%	min:  -5.8%	max:  0.5%

The liveness uses a mixed representation of RWUs based on the
observation that most of them have invalid reader and invalid
writer. The packed variant uses 32 bits and unpacked 96 bits.
Unpacked data contains reader live node and writer live node.

Since live nodes are used only to determine their validity,
RWUs can always be stored in a packed form with four bits for
each: reader bit, writer bit, used bit, and one extra padding
bit to simplify packing and unpacking operations.
Copy link
Contributor

@lcnr lcnr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

surprised that we didn't use the node kind stored in the rwu_table.

It might make sense to move the implementation of the rwu table into a separate file so it's clearer what is part of it's API and what's only used for its implementation, but I don't have a strong opinion about this.

Good job 👍

compiler/rustc_passes/src/liveness.rs Show resolved Hide resolved
compiler/rustc_passes/src/liveness.rs Show resolved Hide resolved
compiler/rustc_passes/src/liveness.rs Show resolved Hide resolved
@lcnr
Copy link
Contributor

lcnr commented Dec 6, 2020

@bors try @rust-timer queue

@rust-timer
Copy link
Collaborator

Awaiting bors try build completion

@bors
Copy link
Contributor

bors commented Dec 6, 2020

⌛ Trying commit bc8317a with merge b017cc71b863fb1e4859d1de21889e95c01cc85b...

@bors
Copy link
Contributor

bors commented Dec 6, 2020

☀️ Try build successful - checks-actions
Build commit: b017cc71b863fb1e4859d1de21889e95c01cc85b (b017cc71b863fb1e4859d1de21889e95c01cc85b)

@rust-timer
Copy link
Collaborator

Queued b017cc71b863fb1e4859d1de21889e95c01cc85b with parent 0f6f2d6, future comparison URL.

@rust-timer
Copy link
Collaborator

Finished benchmarking try commit (b017cc71b863fb1e4859d1de21889e95c01cc85b): comparison url.

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. Please note that if the perf results are neutral, you should likely undo the rollup=never given below by specifying rollup- to bors.

Importantly, though, if the results of this run are non-neutral do not roll this PR up -- it will mask other regressions or improvements in the roll up.

@bors rollup=never
@rustbot modify labels: +S-waiting-on-review -S-waiting-on-perf

@lcnr
Copy link
Contributor

lcnr commented Dec 6, 2020

It does look like using u8 is slightly more efficient

@tmiasko
Copy link
Contributor Author

tmiasko commented Dec 8, 2020

Yep, I think we should land this last version with u8. I opened #79777 to remove the first_merge, and I will open another PR to move the implementation of RWUTable. With recent experiences of perf regression in exhaustiveness checking after simple code reorganization I would prefer to do that separately.

@lcnr
Copy link
Contributor

lcnr commented Dec 8, 2020

@bors r+ rollup=never

@bors
Copy link
Contributor

bors commented Dec 8, 2020

📌 Commit bc8317a has been approved by lcnr

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Dec 8, 2020
@bors
Copy link
Contributor

bors commented Dec 8, 2020

⌛ Testing commit bc8317a with merge 5109b2e4f231cdf82b91ac04ae3c8631d4a347f6...

@bors
Copy link
Contributor

bors commented Dec 8, 2020

💥 Test timed out

@bors bors added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. labels Dec 8, 2020
@lqd
Copy link
Member

lqd commented Dec 8, 2020

@bors retry

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Dec 8, 2020
@bors
Copy link
Contributor

bors commented Dec 8, 2020

⌛ Testing commit bc8317a with merge 1700ca0...

@bors
Copy link
Contributor

bors commented Dec 8, 2020

☀️ Test successful - checks-actions
Approved by: lcnr
Pushing 1700ca0 to master...

@bors bors added the merged-by-bors This PR was explicitly merged by bors. label Dec 8, 2020
@bors bors merged commit 1700ca0 into rust-lang:master Dec 8, 2020
@rustbot rustbot added this to the 1.50.0 milestone Dec 8, 2020
@tmiasko tmiasko deleted the 8bit-rwu branch December 8, 2020 23:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
merged-by-bors This PR was explicitly merged by bors. S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants