Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Report profiling data in v2.4 intake format; compress files #53

Merged
merged 10 commits into from
Sep 27, 2022

Conversation

ivoanjo
Copy link
Member

@ivoanjo ivoanjo commented Sep 22, 2022

What does this PR do?

This PR changes libdatadog to report profiling data using the v2.4 intake format. It also compresses the files including excluding the new event.json file using lz4.

This was tested with both the Ruby (agent and agentless modes) and the PHP profilers.

This also introduces two breaking changes:

  1. Breaking API change: the
    ddog_ProfileExporter_new / ProfileExporter::new functions now take two additional arguments -- the profiling library name and version.
  2. Breaking behavior change: Files are now automatically compressed with lz4 by libdatadog so libraries mustn't do their own compression (Ruby is such an example).

We expect that other than the API change, using the v2.4 intake format and the added compression will be transparent to the libdatadog users.

Thanks to @morrisonlevi for pairing with me on this.

Note that this does not (yet) include support for including attributes in the reporting data. I'll leave that for a separate PR.

Motivation

Get libdatadog users to use intake v2.4, and lay the ground work for including attributes in the reported data.

Also save some bytes over the wire for files. A few stats:
- PHP CLI: running composer create-project symfony/symfony-demo resulted in a 40016 byte pprof which compressed to 18139 bytes in 141661 nanoseconds. That's a compression ratio of 2.2061 and a space savings of 54.67%.
- Ruby github relenv test app:

Uncompressed Gzip (Current Ruby profiler) LZ4 (libdatadog)
code-provenance.json 49K 6K 9.5K
rubyprofile.pprof 402K 48K 76K

Additional Notes

(Nothing)

How to test the change?

Validate that data can still be reported to the backend, in both agent as well as agentless modes.

This was tested with both the Ruby (agent and agentless modes) and
the PHP profilers.

This also introduces a breaking API change: the
`ddog_ProfileExporter_build` / `ProfileExporter::build` functions
now take two additional arguments -- the profiling library name
and version.

Other than that change, using the v2.4 intake format is transparent
to the libdatadog users.

Thanks to @morrisonlevi for pairing with me on this.

Note that this does not (yet) include support for including
attributes in the reporting data. I'll leave that for a separate PR.
@ivoanjo ivoanjo requested review from a team as code owners September 22, 2022 15:12
@morrisonlevi morrisonlevi changed the title Report profiling data in v2.4 intake format Report profiling data in v2.4 intake format; compress files Sep 22, 2022
@ivoanjo
Copy link
Member Author

ivoanjo commented Sep 23, 2022

@morrisonlevi I was able to test your lz4 changes successfully with the Ruby profiler as well. I had, of course, to disable my own compression -- funnily enough the intake would still accept profiles that were gzipped by Ruby + lz4'd by libdatadog but they never showed up on the profile list ;)

)
let mut encoder = FrameEncoder::new(Vec::new());
encoder.write_all(file.bytes)?;
form.add_reader_file(file.name, Cursor::new(encoder.finish()?), file.name)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How is the compression format defined in the payload ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AFAIK the intake just looks for magic bits to distinguish if an uploaded file is [gzip|lz4] compressed or not. Otherwise no changes needed.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ivo is right, according to Florian:

Levi: How does it know what compression format to use? Inspect magic bytes?
Florian: yes
Florian: I was looking at java profiler and it uses LZ4 by default
Florian: Jaroslav experiments has shown that it's a good space/CPU tradeoff for the profiler

It's also is why I picked lz4. I didn't test or verify these claims beyond "works for me."

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. There are indeed magic values at the start of the payload that you can use.
As a Datadog agnostic library, this is not super friendly.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As a Datadog agnostic library, this is not super friendly.

Do you mean, if other teams at Datadog want to reuse this, or if non-datadog people want to reuse this? I'd like to understand your concerns :)

@morrisonlevi
Copy link
Contributor

I'm clarifying with Florian whether it's important that name and filename are like this:

Content-Disposition: form-data; name="time"; filename="time.pprof"

Or if this is fine:

Content-Disposition: form-data; name="time.pprof"; filename="time.pprof"

Copy link
Contributor

@morrisonlevi morrisonlevi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm still waiting on Florian for a question about name vs filename. Aside from this, with fresh I eyes I think we should set profiling library name and version at the same place we set family, which is when we build the exporter, not the request, as this information is unlikely to change.

I also don't know how many people are re-using an exporter vs just making a new one so... it may be a moot point.

@@ -120,26 +122,46 @@ impl ProfileExporter {
}

/// Build a Request object representing the profile information provided.
#[allow(clippy::too_many_arguments)]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO this is fine for now, but we should probably work on breaking this up when we add support for the extra attributes feature.

)
let mut encoder = FrameEncoder::new(Vec::new());
encoder.write_all(file.bytes)?;
form.add_reader_file(file.name, Cursor::new(encoder.finish()?), file.name)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ivo is right, according to Florian:

Levi: How does it know what compression format to use? Inspect magic bytes?
Florian: yes
Florian: I was looking at java profiler and it uses LZ4 by default
Florian: Jaroslav experiments has shown that it's a good space/CPU tradeoff for the profiler

It's also is why I picked lz4. I didn't test or verify these claims beyond "works for me."

@ivoanjo
Copy link
Member Author

ivoanjo commented Sep 27, 2022

I'm still waiting on Florian for a question about name vs filename. Aside from this, with fresh I eyes I think we should set profiling library name and version at the same place we set family, which is when we build the exporter, not the request, as this information is unlikely to change.

Right, that makes sense. I'll probably not have time today, but I've made and note and will come back soon-ish™️

@morrisonlevi
Copy link
Contributor

I'm clarifying with Florian whether it's important that name and filename are like this:

Content-Disposition: form-data; name="time"; filename="time.pprof"

Or if this is fine:

Content-Disposition: form-data; name="time.pprof"; filename="time.pprof"

I confirmed that intake doesn't care about the form field name for files other than event.json, so we can reuse the filename as the form field name 👍🏻

Copy link
Contributor

@morrisonlevi morrisonlevi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good to me, but then again I wrote or paired for all of it, so that's expected ^_^. I'll let someone else review it.

profiling/src/exporter/mod.rs Outdated Show resolved Hide resolved
profiling/src/exporter/mod.rs Outdated Show resolved Hide resolved
) -> anyhow::Result<ProfileExporter> {
) -> anyhow::Result<ProfileExporter>
where
F: Into<Cow<'static, str>>,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why do we need to alias to different types ?

Also the Cow 'static lifetime seems strange to me.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of thinking Cow<str> as a string that gets cloned on write, think of it instead as either a Borrowed string, or an Owned one. In this case, we can borrow a 'static string because that will, by definition, live long enough. Doing so allows us to save some memory allocations if we can show the lifetime is static (such as PHP profiler calling this from Rust). However, if we can't prove it, then we need an Owned version that copies it; this is what will happen across the C FFI.
The reason for taking an Into<Cow<_>> is so that you can pass a &str or a String or anything else that the compiler knows how to convert via into or from, making it nicer for the caller since they don't have to wrap it into a Cow<_>. The reason for having it repeated 3 times is so each parameter can independently do this -- if we had only one type IntoCow: Into<Cow<'static, str>> that they all used, then they'd all have to be the same type, which isn't so nice. For instance, maybe the library name is a &str but the version is a String.

Copy link
Contributor

@r1viollet r1viollet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM !
Happy to integrate this in native !

@morrisonlevi morrisonlevi merged commit 5b5a120 into main Sep 27, 2022
@morrisonlevi morrisonlevi deleted the ivoanjo/intake-v24-v2 branch September 27, 2022 15:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants