ekump/APMSP-1279 benchmark trace exporting #531

ekump · 2024-07-15T16:16:47Z

What does this PR do?

Move some logic that resided in the sidecar for building a TracerPayloadCollection from msgpack bytes and TracerHeaderTags into the trace_utils crate as this logic can / should be used anywhere TracerPayloadCollection is used. No changes to the logic were made. This logic includes the deserialization of traces in msgpack format into an internal representation of traces (currently pb). Previously, this functionality wasn't directly covered under unit tests. Now it is.
Add a micro-benchmark test that covers the creation of TracerPayloadCollections for v04 traces.

Motivation

There is a known problem with excessive allocations when processing traces through libdatadog that is currently blocking the sidecar from being turned on by default for PHP customers. This problem also blocks adoption of data-pipeline by .NET and Node libraries. These benchmarks should help establish a baseline for forthcoming work to improve performance. By moving the deserialization of msgpack closer to the TracerPayloadCollection implementation it ensures common encapsulated logic for both data-pipeline and SIdecar.

Additional Notes

This implementation currently only works for v04. Work to support v05 and v07 should follow.

How to test the change?

Unit tests added.
benchmarks can be run via cargo bench --package datadog-trace-utils

codecov-commenter · 2024-07-15T16:25:28Z

Codecov Report

Attention: Patch coverage is 86.82927% with 27 lines in your changes missing coverage. Please review.

Project coverage is 70.42%. Comparing base (63d4ec8) to head (fda4ffa).

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #531      +/-   ##
==========================================
+ Coverage   70.29%   70.42%   +0.13%     
==========================================
  Files         206      206              
  Lines       27818    27964     +146     
==========================================
+ Hits        19554    19694     +140     
- Misses       8264     8270       +6

Components	Coverage Δ
crashtracker	`16.87% <ø> (ø)`
datadog-alloc	`98.73% <ø> (ø)`
data-pipeline	`51.15% <0.00%> (ø)`
data-pipeline-ffi	`0.00% <ø> (ø)`
ddcommon	`86.41% <ø> (ø)`
ddcommon-ffi	`75.31% <ø> (ø)`
ddtelemetry	`59.02% <ø> (ø)`
ipc	`84.13% <ø> (ø)`
profiling	`78.68% <ø> (ø)`
profiling-ffi	`58.26% <ø> (ø)`
serverless	`0.00% <ø> (ø)`
sidecar	`35.70% <0.00%> (+0.05%)`	⬆️
sidecar-ffi	`0.00% <ø> (ø)`
spawn-worker	`54.98% <ø> (ø)`
trace-mini-agent	`71.04% <100.00%> (+0.10%)`	⬆️
trace-normalization	`98.24% <ø> (ø)`
trace-obfuscation	`95.73% <ø> (ø)`
trace-protobuf	`77.16% <ø> (ø)`
trace-utils	`90.75% <91.95%> (+0.07%)`	⬆️

pr-commenter · 2024-07-15T20:08:39Z

Benchmarks

This comment was omitted because it was over 65536 characters.Please check the Gitlab Job logs to see its output.

bwoebi · 2024-07-15T21:00:09Z

trace-utils/src/tracer_payload.rs

@@ -106,14 +105,140 @@ impl TracerPayloadCollection {
    }
 }

+pub type TracerPayloadChunkProcessor<'a> = Box<dyn Fn(&mut pb::TraceChunk, usize) + 'a>;


The proper way to implement this is having a trait TracerPayloadChunkProcessor { fn process(&mut pb::TraceChunk, usize); } and adding a PhantomData<T> to TracerPayloadParams with T: TracerPayloadChunkProcessor.

This also ~~avoids~~ allows rust fully compiling it away if not used.

This also avoids rust fully compiling it away if not used.

Could you elaborate on why that is beneficial?

I meant allows, not avoids.

Saving an extra memory allocation and call mostly. Nothing major, just idiomatic and a tiny bit better.

Probably better to prefer static dispatch. Using boxed fn is a bit like Virtual (dynamic) method dispatch in C++ - useful when polymorphism is required but adding overhead.

Since this is part of trace chunk processing that can happen in a tight loop.

In addition to extra instructions it will have higher chance to incur a page fault whenever processor is called.
Ideally we should keep all code needed to do chunk processing within the CPU cache when possible.

@bwoebi Ah ok, that makes more sense.

I agree, using dynamic dispatch is not optimal. I was just trying to limit the blast radius of the refactor and maintain the same function signature for trace_utils::collect_trace_chunks.

I didn't want to commit the changes to this branch yet as I still need to do a bunch of cleanup as well as fix tests. But is this what you had in mind? It uses a trait as you suggested and avoids the need for any dynamic dispatch. There is no need for PhantomData though. If this makes sense, I'll incorporate it into this PR.

Yes, that works too :-)

VianneyRuhlmann · 2024-07-16T15:40:20Z

tools/docker/Dockerfile.build

@@ -157,4 +159,4 @@ RUN ./build-profiling-ffi.sh /build/output

 FROM scratch as ffi_build_output

-COPY --from=ffi_build /build/output/ ./
+COPY --from=ffi_build /build/output/ ./


nitt: missing newline

I'll get my IDE configured correctly one of these days

VianneyRuhlmann · 2024-07-16T15:41:43Z

trace-utils/Cargo.toml


 [features]
-test-utils = ["httpmock", "testcontainers", "cargo_metadata"]
+test-utils = ["httpmock", "testcontainers", "cargo_metadata"]


nitt: missing newline

VianneyRuhlmann · 2024-07-16T15:59:49Z

trace-utils/src/tracer_payload.rs

 #[cfg(test)]
 mod tests {
    use super::*;
    use crate::test_utils::create_test_span;
-    use datadog_trace_protobuf::pb::TraceChunk;
+    use datadog_trace_protobuf::pb::{Span, TraceChunk};


Why do you remove the import of Span everywhere except here ?

That was not intentional, will fix.

VianneyRuhlmann · 2024-07-16T16:13:35Z

trace-utils/src/tracer_payload.rs

+    tracer_header_tags: &'a TracerHeaderTags<'a>,
+    /// A boxed closure that processes each `TraceChunk`, allowing for custom logic to be applied
+    /// during the conversion process.
+    process_chunk: TracerPayloadChunkProcessor<'a>,


Do we often use a ChunkProcessor ? If not it feels a bit heavy to have to create an empty closure every time we call it maybe it can be an Option or provide an alternative constructor which creates the empty closure.

That's a fair point. It currently only works for v07 and is only used by the mini-agent. For now, I created a DefaultTraceChunkProcessor that can be passed so we don't have to implement the trait just to do a no-op. The compiler should optimize away the call to the processor in collect_trace_chunks in this situation.

I'm inclined to leave it as is for now. I think we may want to do some further refactoring to TracerPayloads in the not to distant future and we may want to think of ways to streamline how some of this works.

pawelchcki · 2024-07-16T17:07:45Z

trace-utils/src/tracer_payload.rs

+            start: 1,
+            duration: 5,
+            error: 0,
+            meta: HashMap::new(),


Not related to this PR. But we whould generally avoid using Hashmaps for data which doesn't require it.

e.g. in many tracers we replaced HashMap for meta and metrics with a list of key,values - for internal representation - because it avoid a lot of unnecessary memory operations. And is generally alot faster since the map is very rarely looked up.

I don't know about that, but at least PHP does quite a bunch of meta/metrics lookups at serialization time.
Also, e.g. sampling can be tag based.
And there are also quite a bunch of tags which are tied to state, like sampling, decision makers etc. which end up requiring lookups.
I doubt that a key-value list would bring tangible benefits over a solid HashTable impl.

into TracerPayload. This functionality to turn msgpack bytes and header tags into a TracerPayloadCollection should live somewhere more common than the sidecar as it will need to also be used by data-pipeline (and presumably agentless). It only supports v04 at the moment, but should eventually support v05 and v07 as well.

Deserializing msgpack bytes into an internal representation is currently inefficient. Before work starts on fixing the performance issues we want to set some baseline benchmarks.

…c dispatch

trace-utils/Cargo.toml

Co-authored-by: Bob Weinand <bob.weinand@datadoghq.com>

github-actions bot added mini-agent sidecar labels Jul 15, 2024

ekump force-pushed the ekump/APMSP-1279-benchmark-trace-exporting branch from a67c5ed to 1709ecc Compare July 15, 2024 19:17

ekump marked this pull request as ready for review July 15, 2024 19:19

ekump requested review from a team as code owners July 15, 2024 19:19

ekump force-pushed the ekump/APMSP-1279-benchmark-trace-exporting branch 2 times, most recently from 7acf063 to bbc9f26 Compare July 15, 2024 19:37

ekump requested a review from a team as a code owner July 15, 2024 19:54

bwoebi reviewed Jul 15, 2024

View reviewed changes

hoolioh approved these changes Jul 16, 2024

View reviewed changes

VianneyRuhlmann approved these changes Jul 16, 2024

View reviewed changes

pawelchcki reviewed Jul 16, 2024

View reviewed changes

github-actions bot added the data-pipeline label Jul 17, 2024

ekump added 11 commits July 17, 2024 15:37

Adding some initial benchmarks for deserializing traces from msgpack

a02c221

Deserializing msgpack bytes into an internal representation is currently inefficient. Before work starts on fixing the performance issues we want to set some baseline benchmarks.

attempting to fix benchmarking on CI

9a670e2

minor cleanup

11185a2

WIP: Switch the chunk processing to use a trait impl instead of dyn

ad029ff

change the chunk processor from a closure to a trait to reduce dynami…

b28bbf1

…c dispatch

wip

101a4a0

cleanup

dee6b83

fixing new benchmarks

527a895

fix rustfmt issues

7de91fe

fixing example code

33799c5

ekump force-pushed the ekump/APMSP-1279-benchmark-trace-exporting branch from 4d97808 to 33799c5 Compare July 17, 2024 19:37

bwoebi reviewed Jul 17, 2024

View reviewed changes

trace-utils/Cargo.toml Show resolved Hide resolved

bwoebi approved these changes Jul 17, 2024

View reviewed changes

Update trace-utils/Cargo.toml

fda4ffa

Co-authored-by: Bob Weinand <bob.weinand@datadoghq.com>

ekump merged commit f07597b into main Jul 17, 2024
32 checks passed

ekump deleted the ekump/APMSP-1279-benchmark-trace-exporting branch July 17, 2024 21:18

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

ekump/APMSP-1279 benchmark trace exporting #531

ekump/APMSP-1279 benchmark trace exporting #531

ekump commented Jul 15, 2024 •

edited

Loading

codecov-commenter commented Jul 15, 2024 •

edited

Loading

pr-commenter bot commented Jul 15, 2024 •

edited

Loading

bwoebi Jul 15, 2024 •

edited

Loading

ekump Jul 16, 2024

bwoebi Jul 16, 2024

pawelchcki Jul 16, 2024

ekump Jul 17, 2024

bwoebi Jul 17, 2024

VianneyRuhlmann Jul 16, 2024

ekump Jul 17, 2024

VianneyRuhlmann Jul 16, 2024

VianneyRuhlmann Jul 16, 2024

ekump Jul 17, 2024

VianneyRuhlmann Jul 16, 2024 •

edited

Loading

ekump Jul 17, 2024

pawelchcki Jul 16, 2024

bwoebi Jul 16, 2024

ekump/APMSP-1279 benchmark trace exporting #531

ekump/APMSP-1279 benchmark trace exporting #531

Conversation

ekump commented Jul 15, 2024 • edited Loading

What does this PR do?

Motivation

Additional Notes

How to test the change?

codecov-commenter commented Jul 15, 2024 • edited Loading

Codecov Report

pr-commenter bot commented Jul 15, 2024 • edited Loading

Benchmarks

bwoebi Jul 15, 2024 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

VianneyRuhlmann Jul 16, 2024 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

ekump commented Jul 15, 2024 •

edited

Loading

codecov-commenter commented Jul 15, 2024 •

edited

Loading

pr-commenter bot commented Jul 15, 2024 •

edited

Loading

bwoebi Jul 15, 2024 •

edited

Loading

VianneyRuhlmann Jul 16, 2024 •

edited

Loading