Skip to content

Commit

Permalink
[PROF-7645] Add support for attaching internal metadata in profiling …
Browse files Browse the repository at this point in the history
…export (#181)

* [PROF-7645] Add support for attaching internal metadata in profiling exporter

**What does this PR do?**:

This PR adds support for attaching internal metadata to profiles sent
via the profiling exporter.

This is a (small) breaking API change, since the signatures of
both `ProfileExporter::build` and `ddog_prof_Exporter_Request_build`
now take an extra argument.

FYI if you're upgrading libdatadog, you'll probably want to supply
`None` / `null` until in the future you figure out that you want
to send internal metadata.

**Motivation**:

The intention of this internal metadata is to allow profilers to attach
extra pieces of information to profiles that are not very interesting
to show to customers by default (such as Ruby's
"no_signal_workaround_enabled" or Go's "execution_trace_enabled").

For more details on this, check the Datadog internal
"RFC: Attaching internal metadata to pprof profiles".

**Additional Notes**:

Design-wise, I could've gone in a few different directions for:

a. How to represent the internal metadata in `ProfileExporter::build`
b. How to represent the internal metadata across the FFI in
   `ddog_prof_Exporter_Request_build`

Starting with a: Since the information is going to be represented in
JSON, I opted to "leak" this implementation detail by making the
parameter a `serde_json::Value`. This means that callers can take full
advantage of JSON to send whatever they want (e.g. nested objects,
types other than string, etc), rather than being constrained to some
smaller subset (e.g. if I imposed a list of pairs of strings).

This seemed a reasonable trade-off; I don't expect we'll go away from
JSON for encoding this info anytime soon, and even if we do, we can
always put a JSON string inside whatever we end up going with.

Concerning b: Getting complex types across the FFI boundary is really
really really annoying, for both libdatadog (which needs to expose a
bunch of types, and handle them), and the caller (which needs to think
about how they're going to manage lifetimes and whatnot of the whole
thing). To avoid this, I chose to instead represent the parameter as a
raw JSON string across the FFI. This allows ffi users the same
expressiveness as Rust users in terms of what they can send as internal
metadata, with the trade-off that ffi callers need to serialize their
stuff as JSON and libdatadog needs to deserialize it again.

Since internal metadata is something that gets passed along only once
per minute AND it's not expected to have high complexity, I think the
small overhead of throwing JSON strings across the ffi boundary is
worth the simplification to code on both sides.

**How to test the change?**:

I have expanded the tests to test the `event.json` file, including
the new parameter.

You should shortly see linked in this PR the Ruby PR to make use
of this feature to send the `no_signal_workaround_enabled`
parameter.

* Make rustfmt happy

* Add comment asking people to track usage of internal_metadata parameter

* Minor: Fix comment using wrong name for variable
  • Loading branch information
ivoanjo authored Jun 29, 2023
1 parent 26795f3 commit 9d47fc7
Show file tree
Hide file tree
Showing 6 changed files with 282 additions and 16 deletions.
2 changes: 2 additions & 0 deletions Cargo.lock

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

3 changes: 3 additions & 0 deletions examples/ffi/exporter.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -131,6 +131,8 @@ int main(int argc, char *argv[]) {
}};

ddog_prof_Exporter_Slice_File files = {.ptr = files_, .len = sizeof files_ / sizeof *files_};
ddog_CharSlice internal_metadata_example =
DDOG_CHARSLICE_C("{\"no_signals_workaround_enabled\": \"true\", \"execution_trace_enabled\": \"false\"}");

ddog_prof_Exporter_Request_BuildResult build_result = ddog_prof_Exporter_Request_build(
exporter,
Expand All @@ -139,6 +141,7 @@ int main(int argc, char *argv[]) {
files,
nullptr,
nullptr,
&internal_metadata_example,
30000
);
ddog_prof_EncodedProfile_drop(encoded_profile);
Expand Down
2 changes: 2 additions & 0 deletions profiling-ffi/Cargo.toml
Original file line number Diff line number Diff line change
Expand Up @@ -21,3 +21,5 @@ ddcommon = { path = "../ddcommon"}
ddcommon-ffi = { path = "../ddcommon-ffi"}
libc = "0.2"
tokio-util = "0.7.1"
serde_json = {version = "1.0"}
futures = { version = "0.3", default-features = false }
214 changes: 202 additions & 12 deletions profiling-ffi/src/exporter.rs
Original file line number Diff line number Diff line change
Expand Up @@ -202,33 +202,48 @@ impl From<RequestBuildResult> for Result<Box<Request>, String> {
/// If successful, builds a `ddog_prof_Exporter_Request` object based on the
/// profile data supplied. If unsuccessful, it returns an error message.
///
/// For details on the `optional_internal_metadata_json`, please reference the Datadog-internal
/// "RFC: Attaching internal metadata to pprof profiles".
/// If you use this parameter, please update the RFC with your use-case, so we can keep track of how this
/// is getting used.
///
/// # Safety
/// The `exporter`, `additional_stats`, and `endpoint_stats` args should be
/// valid objects created by this module, except NULL is allowed.
/// The `exporter`, `optional_additional_stats`, and `optional_endpoint_stats` args should be
/// valid objects created by this module.
/// NULL is allowed for `optional_additional_tags`, `optional_endpoints_stats` and
/// `optional_internal_metadata_json`.
#[no_mangle]
#[must_use]
pub unsafe extern "C" fn ddog_prof_Exporter_Request_build(
exporter: Option<&mut ProfileExporter>,
start: Timespec,
end: Timespec,
files: Slice<File>,
additional_tags: Option<&ddcommon_ffi::Vec<Tag>>,
endpoints_stats: Option<&profiled_endpoints::ProfiledEndpointsStats>,
optional_additional_tags: Option<&ddcommon_ffi::Vec<Tag>>,
optional_endpoints_stats: Option<&profiled_endpoints::ProfiledEndpointsStats>,
optional_internal_metadata_json: Option<&CharSlice>,
timeout_ms: u64,
) -> RequestBuildResult {
match exporter {
None => RequestBuildResult::Err(anyhow::anyhow!("exporter was null").into()),
Some(exporter) => {
let timeout = std::time::Duration::from_millis(timeout_ms);
let converted_files = into_vec_files(files);
let tags = additional_tags.map(|tags| tags.iter().cloned().collect());
let tags = optional_additional_tags.map(|tags| tags.iter().cloned().collect());

let internal_metadata =
match parse_internal_metadata_json(optional_internal_metadata_json) {
Ok(parsed) => parsed,
Err(err) => return RequestBuildResult::Err(err.into()),
};

match exporter.build(
start.into(),
end.into(),
converted_files.as_slice(),
tags.as_ref(),
endpoints_stats,
optional_endpoints_stats,
internal_metadata,
timeout,
) {
Ok(request) => {
Expand All @@ -240,6 +255,25 @@ pub unsafe extern "C" fn ddog_prof_Exporter_Request_build(
}
}

unsafe fn parse_internal_metadata_json(
internal_metadata_json: Option<&CharSlice>,
) -> anyhow::Result<Option<serde_json::Value>> {
match internal_metadata_json {
None => Ok(None),
Some(internal_metadata_json) => {
let json = internal_metadata_json.try_to_utf8()?;
match serde_json::from_str(json) {
Ok(parsed) => Ok(Some(parsed)),
Err(error) => Err(anyhow::anyhow!(
"Failed to parse contents of internal_metadata json string (`{}`): {}.",
json,
error
)),
}
}
}
}

/// # Safety
/// Each pointer of `request` may be null, but if non-null the inner-most
/// pointer must point to a valid `ddog_prof_Exporter_Request` object made by
Expand Down Expand Up @@ -390,6 +424,7 @@ pub unsafe extern "C" fn ddog_CancellationToken_drop(token: Option<&mut Cancella
mod test {
use super::*;
use ddcommon_ffi::Slice;
use serde_json::json;

fn profiling_library_name() -> CharSlice<'static> {
CharSlice::from("dd-trace-foo")
Expand All @@ -411,6 +446,26 @@ mod test {
CharSlice::from(base_url())
}

fn parsed_event_json(request: RequestBuildResult) -> serde_json::Value {
let request = Result::from(request).unwrap();

// Really hacky way of getting the event.json file contents, because I didn't want to implement a full multipart parser
// and didn't find a particularly good alternative.
// If you do figure out a better way, there's another copy of this code in the profiling tests, please update there too :)
let body = request.body();
let body_bytes: String = String::from_utf8_lossy(
&futures::executor::block_on(hyper::body::to_bytes(body)).unwrap(),
)
.to_string();
let event_json = body_bytes
.lines()
.skip_while(|line| !line.contains(r#"filename="event.json""#))
.nth(2)
.unwrap();

serde_json::from_str(event_json).unwrap()
}

#[test]
fn profile_exporter_new_and_delete() {
let mut tags = ddcommon_ffi::Vec::default();
Expand Down Expand Up @@ -478,17 +533,151 @@ mod test {
Slice::from(files),
None,
None,
None,
timeout_milliseconds,
)
};

let build_result = Result::from(build_result);
build_result.unwrap();
let parsed_event_json = parsed_event_json(build_result);

assert_eq!(parsed_event_json["attachments"], json!(["foo.pprof"]));
assert_eq!(parsed_event_json["endpoint_counts"], json!(null));
assert_eq!(
parsed_event_json["start"],
json!("1970-01-01T00:00:12.000000034Z")
);
assert_eq!(
parsed_event_json["end"],
json!("1970-01-01T00:00:56.000000078Z")
);
assert_eq!(parsed_event_json["family"], json!("native"));
assert_eq!(parsed_event_json["internal"], json!({}));
assert_eq!(parsed_event_json["tags_profiler"], json!(""));
assert_eq!(parsed_event_json["version"], json!("4"));

// TODO: Assert on contents of attachments, as well as on the headers/configuration for the exporter
}

#[test]
fn test_build_with_internal_metadata() {
let exporter_result = unsafe {
ddog_prof_Exporter_new(
profiling_library_name(),
profiling_library_version(),
family(),
None,
endpoint_agent(endpoint()),
)
};

let mut exporter = match exporter_result {
ExporterNewResult::Ok(e) => e,
ExporterNewResult::Err(_) => panic!("Should not occur!"),
};

let files: &[File] = &[File {
name: CharSlice::from("foo.pprof"),
file: ByteSlice::from(b"dummy contents" as &[u8]),
}];

let start = Timespec {
seconds: 12,
nanoseconds: 34,
};
let finish = Timespec {
seconds: 56,
nanoseconds: 78,
};
let timeout_milliseconds = 90;

let raw_internal_metadata = CharSlice::from(
r#"
{
"no_signals_workaround_enabled": "true",
"execution_trace_enabled": "false",
"extra object": {"key": [1, 2, true]}
}
"#,
);

let build_result = unsafe {
ddog_prof_Exporter_Request_build(
Some(exporter.as_mut()),
start,
finish,
Slice::from(files),
None,
None,
Some(&raw_internal_metadata),
timeout_milliseconds,
)
};

// TODO: Currently, we're only testing that a request was built (building did not fail), but
// we have no coverage for the request actually being correct.
// It'd be nice to actually perform the request, capture its contents, and assert that
// they are as expected.
let parsed_event_json = parsed_event_json(build_result);

assert_eq!(
parsed_event_json["internal"],
json!({
"no_signals_workaround_enabled": "true",
"execution_trace_enabled": "false",
"extra object": {"key": [1, 2, true]}
})
);
}

#[test]
fn test_build_with_invalid_internal_metadata() {
let exporter_result = unsafe {
ddog_prof_Exporter_new(
profiling_library_name(),
profiling_library_version(),
family(),
None,
endpoint_agent(endpoint()),
)
};

let mut exporter = match exporter_result {
ExporterNewResult::Ok(e) => e,
ExporterNewResult::Err(_) => panic!("Should not occur!"),
};

let files: &[File] = &[File {
name: CharSlice::from("foo.pprof"),
file: ByteSlice::from(b"dummy contents" as &[u8]),
}];

let start = Timespec {
seconds: 12,
nanoseconds: 34,
};
let finish = Timespec {
seconds: 56,
nanoseconds: 78,
};
let timeout_milliseconds = 90;

let raw_internal_metadata = CharSlice::from("this is not a valid json string");

let build_result = unsafe {
ddog_prof_Exporter_Request_build(
Some(exporter.as_mut()),
start,
finish,
Slice::from(files),
None,
None,
Some(&raw_internal_metadata),
timeout_milliseconds,
)
};

match build_result {
RequestBuildResult::Ok(_) => panic!("Should not happen!"),
RequestBuildResult::Err(message) => assert!(String::from(message).starts_with(
r#"Failed to parse contents of internal_metadata json string (`this is not a valid json string`)"#
)),
}
}

#[test]
Expand All @@ -511,6 +700,7 @@ mod test {
Slice::default(),
None,
None,
None,
timeout_milliseconds,
)
};
Expand Down
12 changes: 12 additions & 0 deletions profiling/src/exporter/mod.rs
Original file line number Diff line number Diff line change
Expand Up @@ -91,6 +91,10 @@ impl Request {
self.req.headers()
}

pub fn body(self) -> hyper::Body {
self.req.into_body()
}

async fn send(
self,
client: &HttpClient,
Expand Down Expand Up @@ -148,14 +152,21 @@ impl ProfileExporter {
})
}

#[allow(clippy::too_many_arguments)]
/// Build a Request object representing the profile information provided.
///
/// For details on the `internal_metadata_json`, please reference the Datadog-internal
/// "RFC: Attaching internal metadata to pprof profiles".
/// If you use this parameter, please update the RFC with your use-case, so we can keep track of how this
/// is getting used.
pub fn build(
&self,
start: DateTime<Utc>,
end: DateTime<Utc>,
files: &[File],
additional_tags: Option<&Vec<Tag>>,
endpoint_counts: Option<&ProfiledEndpointsStats>,
internal_metadata: Option<serde_json::Value>,
timeout: std::time::Duration,
) -> anyhow::Result<Request> {
let mut form = multipart::Form::default();
Expand Down Expand Up @@ -213,6 +224,7 @@ impl ProfileExporter {
"family": self.family.as_ref(),
"version": "4",
"endpoint_counts" : endpoint_counts,
"internal": internal_metadata.unwrap_or_else(|| json!({})),
})
.to_string();

Expand Down
Loading

0 comments on commit 9d47fc7

Please sign in to comment.