Add cache-based fast path for cfgs and check-cfgs #122207
@bors try @rust-timer queue
Failed to set assignee to
Add cache-based fast path for cfgs and check-cfgs TODO, waiting for perf.
@Urgau I believe the handling specially-recognizes In any case, it's somewhat useless after the fact, as they will remain subscribed to the thread and must manually unsubscribe.
Yeah, I kind of knew about it but I was hoping I was wrong 😄. The real issue is that rustbot didn't consider the Draft status and assigned someone anyway. It should have waited for the status to change, but that's a topic for another time.
☀️ Try build successful - checks-actions |
Finished benchmarking commit (4ee97b1): comparison URL.
Overall result: ✅ improvements - no action needed
Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.
@bors rollup=never
Instruction count: This is a highly reliable metric that was used to determine the overall result at the top of this comment.
Max RSS (memory usage): This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
Cycles: This benchmark run did not return any relevant results for this metric.
Binary size: This benchmark run did not return any relevant results for this metric.
Bootstrap: 649.894s -> 649.476s (-0.06%)
hot unambiguous perf results! |
24d06a9 to b5e463d
No perf improvement or regression, which is expected since we don't have any check-cfg or big cfg test. Let's try an approach that includes all expected cfgs and cfg. It again shouldn't have any impact but it's switching from |
@bors try @rust-timer queue |
Add cache-based fast path for cfgs and check-cfgs TODO, waiting for perf.
☀️ Try build successful - checks-actions |
Finished benchmarking commit (478862c): comparison URL.
Overall result: ✅ improvements - no action needed
Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.
@bors rollup=never
Instruction count: This is a highly reliable metric that was used to determine the overall result at the top of this comment.
Max RSS (memory usage): This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
Cycles: This benchmark run did not return any relevant results for this metric.
Binary size: This benchmark run did not return any relevant results for this metric.
Bootstrap: 649.076s -> 653.276s (0.65%)
(The artifact size change is just getting the |
This cache complicates the cfg logic, so I don't want to merge it until there are some benchmarks showing improvements. |
There are merge commits (commits with multiple parents) in your changes. We have a no merge policy so these commits will need to be removed for this pull request to be merged. You can start a rebase with the following commands:
The following commits are merge commits: |
@bors try @rust-timer queue |
Add cache-based fast path for cfgs and check-cfgs

This PR adds a fast path for testing whether a cfg is enabled and expected. The cache reduces the number of lookups from 2 or 3 to just 1 (in the good path).

Currently:
- when there are no expected cfgs, we do 2 lookups (one to find whether the cfg is enabled and a second to try to get the expected values)
- when there are some expected cfgs, we do 3 lookups in the good case (same as above, plus one to see whether the value is in the expected values) and 2 in the "bad" case

With this PR we only do 1 lookup if the cfg is expected (the good case), whether or not the cfg is actually enabled. See the PR changes/commits for more details.

The perf runs below show anything from no change to a slight improvement, which is expected as we do less work, but we also don't have any benchmark with many cfgs, so we can't be assertive.
☀️ Try build successful - checks-actions |
Finished benchmarking commit (81d5a42): comparison URL.
Overall result: ❌ regressions - no action needed
Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.
@bors rollup=never
Instruction count: This is a highly reliable metric that was used to determine the overall result at the top of this comment.
Max RSS (memory usage): This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
Cycles: This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
Binary size: This benchmark run did not return any relevant results for this metric.
Bootstrap: 676.654s -> 676.922s (0.04%)
Nothing significant. Closing. |
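This PR adds a fast path for testing whether a cfg is enabled and expected.
The cache reduces the number of lookups from 2 or 3 to just 1 (in the good path).
Currently:
- when there are no expected cfgs, we do 2 lookups (one to find whether the cfg is enabled and a second to try to get the expected values)
- when there are some expected cfgs, we do 3 lookups in the good case (same as above, plus one to see whether the value is in the expected values) and 2 in the "bad" case
With this PR we only do 1 lookup if the cfg is expected (the good case), whether or not the cfg is actually enabled. See the PR changes/commits for more details.
The perf runs below show anything from no change to a slight improvement, which is expected as we do less work, but we also don't have any benchmark with many cfgs, so we can't be assertive.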
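To make the lookup counts in this description concrete, here is a minimal, self-contained sketch of the idea. It is not the code from this PR or from rustc: `CfgState`, `ExpectedValues`, `Outcome`, `check_slow`, `check_fast`, and the `cfg` helper are hypothetical names chosen for illustration, and `expected` here simply means "found in the expected table". The cache maps every expected (name, value) pair to whether it is enabled, so a hit answers both questions with one lookup.

```rust
use std::collections::{HashMap, HashSet};

/// A cfg is a name plus an optional value, e.g. `unix` or `feature="std"`.
type Cfg = (String, Option<String>);

/// Hypothetical stand-in for the values allowed for one cfg name.
enum ExpectedValues {
    /// Any value is accepted; such names cannot be enumerated into the cache.
    Any,
    /// Only these values are accepted.
    Known(HashSet<Option<String>>),
}

/// Result of testing one cfg, plus how many hash lookups it took.
#[derive(Debug, PartialEq)]
struct Outcome {
    enabled: bool,
    expected: bool,
    lookups: u32,
}

struct CfgState {
    enabled: HashSet<Cfg>,
    expecteds: HashMap<String, ExpectedValues>,
    /// Fast path: each expected (name, value) pair maps to "is it enabled?".
    cache: HashMap<Cfg, bool>,
}

/// Small helper to build a cfg from string slices.
fn cfg(name: &str, value: Option<&str>) -> Cfg {
    (name.to_string(), value.map(str::to_string))
}

impl CfgState {
    fn new(enabled: HashSet<Cfg>, expecteds: HashMap<String, ExpectedValues>) -> Self {
        let mut cache = HashMap::new();
        for (name, values) in &expecteds {
            if let ExpectedValues::Known(values) = values {
                for value in values {
                    let key = (name.clone(), value.clone());
                    let is_enabled = enabled.contains(&key);
                    cache.insert(key, is_enabled);
                }
            }
        }
        CfgState { enabled, expecteds, cache }
    }

    /// Path without the cache: 2 lookups, or 3 when expected values are listed.
    fn check_slow(&self, cfg: &Cfg) -> Outcome {
        let enabled = self.enabled.contains(cfg); // lookup 1
        let mut lookups = 2; // lookup 2 is the `expecteds.get` below
        let expected = match self.expecteds.get(&cfg.0) {
            None => false,
            Some(ExpectedValues::Any) => true,
            Some(ExpectedValues::Known(values)) => {
                lookups += 1; // lookup 3: is the value among the expected ones?
                values.contains(&cfg.1)
            }
        };
        Outcome { enabled, expected, lookups }
    }

    /// Cached path: 1 lookup on a hit, otherwise the slow path plus the miss.
    fn check_fast(&self, cfg: &Cfg) -> Outcome {
        match self.cache.get(cfg) {
            Some(&enabled) => Outcome { enabled, expected: true, lookups: 1 },
            None => {
                let mut outcome = self.check_slow(cfg);
                outcome.lookups += 1; // count the failed cache lookup
                outcome
            }
        }
    }
}
```

In this sketch an expected cfg costs one lookup whether or not it is enabled, while an unexpected cfg pays for the cache miss on top of the slow path, which is the trade-off the description refers to as the good and "bad" cases.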