-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix(datadog_traces sink): APM stats payloads are sent independent of trace payloads and at a set interval. #15084
Conversation
…ocumentation. probably going to reorg a bit.
✅ Deploy Preview for vector-project ready!
To edit notification comments on pull requests, go to your Netlify site settings. |
✅ Deploy Preview for vrl-playground canceled.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left a few nity comments, just around the inconsistent grammar/capitalization of code comments (I didn't note them all...) - and then one question on an error type. Otherwise the logic seems sound, and the refactoring is appreciated.
Soak Test ResultsBaseline: d620352 ExplanationA soak test is an integrated performance test for vector in a repeatable rig, with varying configuration for vector. What follows is a statistical summary of a brief vector run for each configuration across SHAs given above. The goal of these tests are to determine, quickly, if vector performance is changed and to what degree by a pull request. Where appropriate units are scaled per-core. The table below, if present, lists those experiments that have experienced a statistically significant change in their throughput performance between baseline and comparision SHAs, with 90.0% confidence OR have been detected as newly erratic. Negative values mean that baseline is faster, positive comparison. Results that do not exhibit more than a ±8.87% change in mean throughput are discarded. An experiment is erratic if its coefficient of variation is greater than 0.3. The abbreviated table will be omitted if no interesting changes are observed. No interesting changes in throughput with confidence ≥ 90.00% and absolute Δ mean >= ±8.87%: Fine details of change detection per experiment.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Most of my comments are sort of a "take it or leave it", and I would have otherwise approved... but the possible infinite loop bug on the error path in flush_apm_stats_thread
actually stands out as a potentially big issue to me.
Regression Test Results
Baseline: cf73729 Explanation
A regression test is an integrated performance test for
The table below, if present, lists those experiments that have experienced a
statistically significant change in their No interesting changes in Fine details of change detection per experiment.
|
Soak Test ResultsBaseline: cf73729 ExplanationA soak test is an integrated performance test for vector in a repeatable rig, with varying configuration for vector. What follows is a statistical summary of a brief vector run for each configuration across SHAs given above. The goal of these tests are to determine, quickly, if vector performance is changed and to what degree by a pull request. Where appropriate units are scaled per-core. The table below, if present, lists those experiments that have experienced a statistically significant change in their throughput performance between baseline and comparision SHAs, with 90.0% confidence OR have been detected as newly erratic. Negative values mean that baseline is faster, positive comparison. Results that do not exhibit more than a ±8.87% change in mean throughput are discarded. An experiment is erratic if its coefficient of variation is greater than 0.3. The abbreviated table will be omitted if no interesting changes are observed. No interesting changes in throughput with confidence ≥ 90.00% and absolute Δ mean >= ±8.87%: Fine details of change detection per experiment.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice! This all looks really good to me. I really appreciate the extensive commenting here.
Just have a number of minor nits.
Regression Test Results
Baseline: ba02a47 Explanation
A regression test is an integrated performance test for
The table below, if present, lists those experiments that have experienced a
statistically significant change in their No interesting changes in Fine details of change detection per experiment.
|
Soak Test ResultsBaseline: ba02a47 ExplanationA soak test is an integrated performance test for vector in a repeatable rig, with varying configuration for vector. What follows is a statistical summary of a brief vector run for each configuration across SHAs given above. The goal of these tests are to determine, quickly, if vector performance is changed and to what degree by a pull request. Where appropriate units are scaled per-core. The table below, if present, lists those experiments that have experienced a statistically significant change in their throughput performance between baseline and comparision SHAs, with 90.0% confidence OR have been detected as newly erratic. Negative values mean that baseline is faster, positive comparison. Results that do not exhibit more than a ±8.87% change in mean throughput are discarded. An experiment is erratic if its coefficient of variation is greater than 0.3. The abbreviated table will be omitted if no interesting changes are observed. No interesting changes in throughput with confidence ≥ 90.00% and absolute Δ mean >= ±8.87%: Fine details of change detection per experiment.
|
…trace payloads and at a set interval. (#15084) - APM stats are now "flushed" in a separate thread, detached from the sink's stream loop. - The stats flushing thread flushes the oldest bucket every 10 seconds, and caches the last two 10 second buckets. - When sink is shutting down, the APM stats thread flushes remaining buckets.
…trace payloads and at a set interval. (#15084) - APM stats are now "flushed" in a separate thread, detached from the sink's stream loop. - The stats flushing thread flushes the oldest bucket every 10 seconds, and caches the last two 10 second buckets. - When sink is shutting down, the APM stats thread flushes remaining buckets.
…trace payloads and at a set interval. (#15084) - APM stats are now "flushed" in a separate thread, detached from the sink's stream loop. - The stats flushing thread flushes the oldest bucket every 10 seconds, and caches the last two 10 second buckets. - When sink is shutting down, the APM stats thread flushes remaining buckets.
Reviewer notes:
Behavioral changes:
Refactor (re-organizing only) changes:
stats.rs
was up to 1.2k lines of code. That's too much for my sanity. So I reorganized the code to more closely match the Agent's Golang code organization.Manual testing results
The two plotted metrics should be "collapsed" into the same line.
// Agent only (base case)
// Agent -> Vector (before fix)
// Agent -> Vector (after fix)