Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Encode DepKind as u16 #115391

Merged
merged 1 commit into from
Sep 4, 2023
Merged

Encode DepKind as u16 #115391

merged 1 commit into from
Sep 4, 2023

Conversation

saethlin
Copy link
Member

@saethlin saethlin commented Aug 30, 2023

The derived Encodable/Decodable impls serialize/deserialize as a varint, which results in a lot of code size around the encoding/decoding of these types which isn't justified: The full range of values here is rather small but doesn't quite fit in to a u8. Growing all serialized DepKind to 2 bytes costs us on average 1% size in the incr comp dep graph, which I plan to recoup in #110050 by taking advantage of the unused bits in all the serialized DepKind.

r? @nnethercote

@saethlin saethlin added the S-experimental Status: Ongoing experiment that does not require reviewing and won't be merged in its current state. label Aug 30, 2023
@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Aug 30, 2023
@saethlin
Copy link
Member Author

@bors try @rust-timer queue

@rust-timer

This comment has been minimized.

@rustbot rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Aug 30, 2023
@rust-log-analyzer

This comment has been minimized.

@saethlin
Copy link
Member Author

@bors try @rust-timer queue

@rust-timer

This comment has been minimized.

@bors
Copy link
Contributor

bors commented Aug 30, 2023

⌛ Trying commit d79627a264e31f967f84ff5bdbfcfa7e4eb0a19c with merge 6ff94474e1d115fd1b393d245725fb5f4cd4b74b...

@bors
Copy link
Contributor

bors commented Aug 30, 2023

☀️ Try build successful - checks-actions
Build commit: 6ff94474e1d115fd1b393d245725fb5f4cd4b74b (6ff94474e1d115fd1b393d245725fb5f4cd4b74b)

@rust-timer

This comment has been minimized.

@rust-timer
Copy link
Collaborator

Finished benchmarking commit (6ff94474e1d115fd1b393d245725fb5f4cd4b74b): comparison URL.

Overall result: ✅ improvements - no action needed

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

@bors rollup=never
@rustbot label: -S-waiting-on-perf -perf-regression

Instruction count

This is a highly reliable metric that was used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
- - 0
Regressions ❌
(secondary)
- - 0
Improvements ✅
(primary)
-0.8% [-1.0%, -0.4%] 9
Improvements ✅
(secondary)
-0.4% [-0.4%, -0.4%] 4
All ❌✅ (primary) -0.8% [-1.0%, -0.4%] 9

Max RSS (memory usage)

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
- - 0
Regressions ❌
(secondary)
- - 0
Improvements ✅
(primary)
-0.9% [-1.3%, -0.5%] 2
Improvements ✅
(secondary)
-2.5% [-4.4%, -1.2%] 8
All ❌✅ (primary) -0.9% [-1.3%, -0.5%] 2

Cycles

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
- - 0
Regressions ❌
(secondary)
- - 0
Improvements ✅
(primary)
-0.5% [-0.6%, -0.4%] 3
Improvements ✅
(secondary)
- - 0
All ❌✅ (primary) -0.5% [-0.6%, -0.4%] 3

Binary size

This benchmark run did not return any relevant results for this metric.

Bootstrap: 632.335s -> 630.426s (-0.30%)
Artifact size: 316.60 MiB -> 316.61 MiB (0.00%)

@rustbot rustbot removed the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Aug 31, 2023
@saethlin saethlin marked this pull request as ready for review September 1, 2023 03:15
@saethlin saethlin removed the S-experimental Status: Ongoing experiment that does not require reviewing and won't be merged in its current state. label Sep 3, 2023
const VARIANTS: u16 = {
let deps: &[DepKind] = &[$(DepKind::$variant,)*];
let mut i = 0;
while i < deps.len() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for i in 0..deps.len() { will be simpler, no need for mut i, no i += 1.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since #110393 I'm pretty sure the only way to do this is with while or loop

@nnethercote
Copy link
Contributor

r=me with the comments addressed.

@bors delegate=saethlin

@bors
Copy link
Contributor

bors commented Sep 3, 2023

✌️ @saethlin, you can now approve this pull request!

If @nnethercote told you to "r=me" after making some further change, please make that change, then do @bors r=@nnethercote

@saethlin
Copy link
Member Author

saethlin commented Sep 4, 2023

@bors r=nnethercote

@bors
Copy link
Contributor

bors commented Sep 4, 2023

📌 Commit 9867023 has been approved by nnethercote

It is now in the queue for this repository.

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Sep 4, 2023
@bors
Copy link
Contributor

bors commented Sep 4, 2023

⌛ Testing commit 9867023 with merge 9df32ddd9124ce0e979f8c060486dde73fb751c1...

@bors
Copy link
Contributor

bors commented Sep 4, 2023

☀️ Test successful - checks-actions
Approved by: nnethercote
Pushing 9df32ddd9124ce0e979f8c060486dde73fb751c1 to master...

@bors bors added the merged-by-bors This PR was explicitly merged by bors. label Sep 4, 2023
@bors
Copy link
Contributor

bors commented Sep 4, 2023

👀 Test was successful, but fast-forwarding failed: 422 Update is not a fast forward

@saethlin
Copy link
Member Author

saethlin commented Sep 4, 2023

rust-lang/homu#75
@bors retry

@bors
Copy link
Contributor

bors commented Sep 4, 2023

⌛ Testing commit 9867023 with merge b14b074...

@bors
Copy link
Contributor

bors commented Sep 4, 2023

☀️ Test successful - checks-actions
Approved by: nnethercote
Pushing b14b074 to master...

@bors bors merged commit b14b074 into rust-lang:master Sep 4, 2023
11 checks passed
@rustbot rustbot added this to the 1.74.0 milestone Sep 4, 2023
@saethlin saethlin deleted the depkind-discrim branch September 4, 2023 06:23
@rust-timer
Copy link
Collaborator

Finished benchmarking commit (b14b074): comparison URL.

Overall result: ✅ improvements - no action needed

@rustbot label: -perf-regression

Instruction count

This is a highly reliable metric that was used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
- - 0
Regressions ❌
(secondary)
- - 0
Improvements ✅
(primary)
-0.3% [-0.3%, -0.3%] 5
Improvements ✅
(secondary)
-0.3% [-0.3%, -0.3%] 3
All ❌✅ (primary) -0.3% [-0.3%, -0.3%] 5

Max RSS (memory usage)

This benchmark run did not return any relevant results for this metric.

Cycles

This benchmark run did not return any relevant results for this metric.

Binary size

This benchmark run did not return any relevant results for this metric.

Bootstrap: 628.522s -> 628.358s (-0.03%)
Artifact size: 316.23 MiB -> 316.14 MiB (-0.03%)

bors added a commit to rust-lang-ci/rust that referenced this pull request Sep 7, 2023
…hercote

Use a specialized varint + bitpacking scheme for DepGraph encoding

The previous scheme here uses leb128 to encode the edge tables that represent the incr comp dependency graph. The problem with that scheme is that leb128 has overhead for larger values, and generally relies on the distribution of encoded values being heavily skewed towards smaller values. That is definitely not the case for a dep node index, since they are handed out sequentially and the whole range is covered, the distribution is actually biased in the opposite direction: Most dep nodes are large.

This PR implements a different varint encoding scheme. Instead of applying varint encoding to individual dep node indices (which is extremely branchy) we now apply it per node.

While being built, each node now stores its edges in a `SmallVec` with a bit of extra logic to track the max value of each edge. Then we varint encode the whole batch. This is a gamble: We save on space by only claiming 2 bits per node instead of ~3 bits per edge which is a nice savings but needs to balance out with the space overhead that a single large index in a node with a lot of edges will encode unnecessary bytes in each of that node's edge indices.

Then, to keep the runtime overhead of this encoding scheme down we deserialize our indices by loading 4 bytes for each then masking off the bytes that are't ours. This is much less code and branches than leb128, but relies on having some readable bytes past the end of each edge list. We explicitly add such padding to the in-memory data during decoding. And we also do this decoding lazily, turning a dense on-disk encoding into a peak memory reduction.

Then we apply a bit-packing scheme; since in rust-lang#115391 we now have unused bits on `DepKind`, we use those unused bits (currently there are 7!) to store the 2 bits that we need for the byte width of the edges in each node, then use the remaining bits to store the length of the edge list, if it fits.

r? `@nnethercote`
github-actions bot pushed a commit to rust-lang/miri that referenced this pull request Sep 12, 2023
Use a specialized varint + bitpacking scheme for DepGraph encoding

The previous scheme here uses leb128 to encode the edge tables that represent the incr comp dependency graph. The problem with that scheme is that leb128 has overhead for larger values, and generally relies on the distribution of encoded values being heavily skewed towards smaller values. That is definitely not the case for a dep node index, since they are handed out sequentially and the whole range is covered, the distribution is actually biased in the opposite direction: Most dep nodes are large.

This PR implements a different varint encoding scheme. Instead of applying varint encoding to individual dep node indices (which is extremely branchy) we now apply it per node.

While being built, each node now stores its edges in a `SmallVec` with a bit of extra logic to track the max value of each edge. Then we varint encode the whole batch. This is a gamble: We save on space by only claiming 2 bits per node instead of ~3 bits per edge which is a nice savings but needs to balance out with the space overhead that a single large index in a node with a lot of edges will encode unnecessary bytes in each of that node's edge indices.

Then, to keep the runtime overhead of this encoding scheme down we deserialize our indices by loading 4 bytes for each then masking off the bytes that are't ours. This is much less code and branches than leb128, but relies on having some readable bytes past the end of each edge list. We explicitly add such padding to the in-memory data during decoding. And we also do this decoding lazily, turning a dense on-disk encoding into a peak memory reduction.

Then we apply a bit-packing scheme; since in rust-lang/rust#115391 we now have unused bits on `DepKind`, we use those unused bits (currently there are 7!) to store the 2 bits that we need for the byte width of the edges in each node, then use the remaining bits to store the length of the edge list, if it fits.

r? `@nnethercote`
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
merged-by-bors This PR was explicitly merged by bors. S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants