[PROF-10201] Disable profiler allocation counting feature by default #3798
Conversation
**What does this PR do?**

This PR introduces a new setting for the profiler: `profiling.advanced.allocation_counting_enabled`, which controls the profiler allocation counting feature. This setting is off by default, enabling us to reduce allocation profiling overhead slightly.

**Motivation:**

We actually used to have this setting back in the ddtrace 1.x series. We introduced it in #2635 and removed it in #3281 -- by tying it directly to allocation profiling. I decided to re-introduce it so we can disable this feature by default.

**Additional Notes:**

This PR sits atop #3797 because it makes sense to measure both overhead improvements together, but it is otherwise completely independent.

**How to test the change?**

Here are the results for running the `profiler_allocation` benchmark:

```
ruby 2.7.7p221 (2022-11-24 revision 168ec2b1e5) [x86_64-linux]
Warming up --------------------------------------
Allocations (baseline)                   1.419M i/100ms
Calculating -------------------------------------
Allocations (baseline)                  14.535M (± 2.1%) i/s -    146.197M in  10.062717s
Warming up --------------------------------------
Allocations (alloc_profiling_enabled)    1.122M i/100ms
Calculating -------------------------------------
Allocations (alloc_profiling_enabled)   11.636M (± 2.2%) i/s -    116.679M in  10.032209s
Warming up --------------------------------------
Allocations (alloc_profiling_disabled)   1.124M i/100ms
Calculating -------------------------------------
Allocations (alloc_profiling_disabled)  11.866M (± 2.6%) i/s -    119.175M in  10.050580s

Comparison:
                 Allocations (baseline): 14534979.3 i/s
 Allocations (alloc_profiling_disabled): 11865926.7 i/s - 1.22x slower
  Allocations (alloc_profiling_enabled): 11635919.9 i/s - 1.25x slower
```

The difference is close to the margin of error; nevertheless, this feature was showing up on the native profiler, and since it was on the hot path for allocation profiling, I think it's worth it.
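For illustration, here is a minimal sketch of how an application could opt back in to allocation counting once this change lands. The option path `profiling.advanced.allocation_counting_enabled` comes from this PR; the `Datadog.configure` block and the other option names are assumed to follow the gem's usual configuration API and may differ depending on the gem version.

```ruby
# Illustrative sketch only -- assumes the standard Datadog.configure block
# from the datadog gem; option names other than the one introduced in this PR
# are shown for context and are not guaranteed to match every gem version.
require 'datadog'

Datadog.configure do |c|
  c.profiling.enabled = true                                # turn the profiler on
  c.profiling.advanced.allocation_counting_enabled = true   # opt back in to allocation counting
                                                             # (off by default after this PR)
end
```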
**Benchmarks**

Benchmark execution time: 2024-07-22 12:50:17

Comparing candidate commit 94f14f0 in PR branch.

Found 0 performance improvements and 0 performance regressions! Performance is the same for 10 metrics, 2 unstable metrics.
**Codecov Report**

All modified and coverable lines are covered by tests ✅

Additional details and impacted files:

```
@@                      Coverage Diff                        @@
##   ivoanjo/prof-10201-coarse-timestamps    #3798     +/-   ##
===============================================================
-  Coverage                         97.91%   97.91%   -0.01%
===============================================================
   Files                              1248     1248
   Lines                             75192    75210      +18
   Branches                           3638     3638
===============================================================
+  Hits                              73627    73643      +16
-  Misses                             1565     1567       +2
```

☔ View full report in Codecov by Sentry.
```c
struct cpu_and_wall_time_worker_state *state = active_sampler_instance_state; // Read from global variable, see "sampler global state safety" note above

// This should not happen in a normal situation because the tracepoint is always enabled after the instance is set
// and disabled before it is cleared, but just in case...
if (state == NULL) return;

if (RB_UNLIKELY(state->allocation_counting_enabled)) {
```
Is `RB_UNLIKELY` some kind of compiler optimisation macro to optimise the resulting code for the false outcome?

And is `allocation_counting_enabled` unlikely because it is off by default and you don't expect many users to enable it?
Yes and yes!
(Here's where the wrapper is defined)
TBH I don't expect it to make a huge difference whether this macro is there or not, but I liked using it in this situation to make the code more self-describing about what the expected common path is and what the exception is.
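For readers unfamiliar with the macro: `RB_UNLIKELY` is provided by the Ruby C API headers. The sketch below shows the general shape of such a branch-prediction hint on GCC/Clang; the `MY_UNLIKELY` name and exact definition are illustrative, not the actual Ruby header source.

```c
// Illustrative sketch of an "unlikely" hint macro (not the actual Ruby header
// definition). __builtin_expect tells GCC/Clang that the condition is expected
// to evaluate to false, so the compiler lays out the common case (counting
// disabled) as the straight-line, fall-through path.
#if defined(__GNUC__) || defined(__clang__)
  #define MY_UNLIKELY(expr) __builtin_expect(!!(expr), 0)
#else
  #define MY_UNLIKELY(expr) (expr)
#endif

// Usage mirrors the check in this PR: the guarded body is treated as the cold path.
// if (MY_UNLIKELY(state->allocation_counting_enabled)) { ... }
```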
Looks good! I stumbled upon `RB_UNLIKELY`, which I'm not familiar with, and asked a question for my own education :)