Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Create a new ddcommon-ffi crate and factorise generic ffi code inside it #24

Merged
merged 20 commits into from
Jul 19, 2022

Conversation

paullegranddc
Copy link
Contributor

@paullegranddc paullegranddc commented Jun 7, 2022

What does this PR do?

  • Create a ddcommon-ffi crate, to contain generic ffi code that can be used by both ddprof and ddtelemetry ffis
  • Move the Slice, Vec and Tag implementations to the ddcommon-ffi crate
  • Rename the generated header from ddprof/ffi.h to datadog/common.h and datadog/profiling.h
  • Add a tool crate for custom scripts used during build, that are to complex to be implemented in bash
  • Add a step to move every struct and enum definition from the ddprof ffi header to the ddcommon ffi header
  • Rename the namspace prefix from ddprof_ffi_* to ddog_*

Motivation

I want to reuse the same abstractions for the ddtelemetry-ffi layer.
By splitting these into different crate, we can generate inividual header file for ddprof-ffi, ddtelemetry-ffi, and ddcommon-ffi.

This causes issues though. Cbindgen only generates struct definition when the generics are resolved, this means that some struct headers will be generated multiple time if we try to run cbindgen for multiple crates. This is why this PR adds an extra processing step to deduplicate common struct definitions spread in multiple headers, and move them to the ddcommon-ffi header.

@paullegranddc paullegranddc requested review from a team as code owners June 7, 2022 17:13
@r1viollet
Copy link
Contributor

Factorizing is nice ! Thanks
The cbindgen files need to be updated we are no longer defining definitions to vec slice tag

Copy link
Contributor

@pawelchcki pawelchcki left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me!
Thanks @paullegranddc !

@paullegranddc paullegranddc force-pushed the paullgdc/factorise_generic_ffi_code branch from fc06f24 to 1b2a692 Compare July 8, 2022 17:37
Comment on lines 6 to 7
#include <libdatadog/ddcommon.h>
#include <libdatadog/ddprof.h>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we do this instead?

#include <datadog/common.h>
#include <datadog/profiling.h>

Thoughts?

Copy link
Contributor Author

@paullegranddc paullegranddc Jul 12, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah agreed, removing the lib and dd prefixes will be less heavy

Copy link
Contributor

@morrisonlevi morrisonlevi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the re-usability idea here. However, ddcommon-ffi defines structs and functions which are getting exported as ddprof_ffi_*. Wouldn't it be better if we did something like ddog_*, ddog_ffi_*, datadog_*, or datadog_ffi_* for these instead of ddprof_ffi_*? What do you think?

@morrisonlevi
Copy link
Contributor

morrisonlevi commented Jul 8, 2022

The clippy lint seems like a legitimate bug; see rust-lang/rust-clippy#8886. Actually, on closer inspection, it looks like a false positive after all: you are doing the resize technique! Hmm.

@morrisonlevi morrisonlevi reopened this Jul 8, 2022
@morrisonlevi
Copy link
Contributor

Whoops, didn't mean to close. Sorry about that!

@paullegranddc paullegranddc force-pushed the paullgdc/factorise_generic_ffi_code branch from 315a60e to 36ad2f3 Compare July 11, 2022 15:08
@paullegranddc
Copy link
Contributor Author

I like the re-usability idea here. However, ddcommon-ffi defines structs and functions which are getting exported as ddprof_ffi_. Wouldn't it be better if we did something like ddog_, ddog_ffi_, datadog_, or datadog_ffi_* for these instead of ddprof_ffi_*? What do you think?

Ideally we should have a single prefix for everything in libdatadog, ddprof or other. I forgot to rename it because I wasn't sure which prefix to use. For me, the shorter identifier the better, especially if it's for namespace prefixes, so my preference would be ddog_*

The clippy lint seems like a legitimate bug; see rust-lang/rust-clippy#8886. Actually, on closer inspection, it looks like a false positive after all: you are doing the resize technique! Hmm.

Actually, the code was more generic than I needed, the branch with the resize was never taken, so I removed it. Maybe the lint was confused because I didn't pass a constant to resize?

From ddprof_ffi_ to ddog_
@paullegranddc paullegranddc force-pushed the paullgdc/factorise_generic_ffi_code branch from da38569 to 0a5c2d8 Compare July 12, 2022 12:50
@@ -73,7 +74,7 @@ pub extern "C" fn endpoint_agent(base_url: CharSlice) -> EndpointV3 {
/// # Arguments
/// * `site` - Contains a host and port e.g. "datadoghq.com".
/// * `api_key` - Contains the Datadog API key.
#[export_name = "ddprof_ffi_EndpointV3_agentless"]
#[export_name = "ddog_EndpointV3_agentless"]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't have to fix it in this PR, but this is an example of a name that's probably too generic because this endpoint is specific to profiling. Before it was fine, because it was scoped by ddprof_*.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Arguably this could be used for agentless as well in telemetry (I think they're planning on supporting it?); the two things that go here (site and api key) are what they'd need as well.

Copy link
Member

@ivoanjo ivoanjo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left a few small notes and noob questions, but overall this looks good to me :)

Here, kind sir, take my very-late approval as well :)

Comment on lines +1103 to +1109
[[package]]
name = "tools"
version = "0.1.0"
dependencies = [
"lazy_static",
"regex",
]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a rust noob so this question may make absolutely no sense, but is calling this crate tools problematic once we start publishing stuff on cargo? Or would we never publish this thing?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We probably shouldn't publish this one

root_name: "ddprof, ddprof-exporter, ddcommon, ddprof-profiles, ddprof-ffi"
root_name: "ddprof, ddprof-exporter, ddcommon, ddprof-profiles, ddprof-ffi, ddcommon-ffi, ddtelemetry"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should tools also go in here? (and if not, perhaps we could have a comment explaining why?)

Comment on lines 10 to +11
"ddcommon",
"ddcommon-ffi",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was talking a bit to Levi about how many packages we have yesterday and I got to wondering: why exactly do we need to have the ffi packages separated from the non-ffi ones?

I realize that if you're consuming these things from Rust, then you don't care about the ffi parts of the code. But if you're just using Rust, can't the compiler realize you're not using the ffi bits and just drop them from the final binary?

E.g. what would be the problem if we folded every -ffi package back into its "parent"?

(This may be an incredibly stupid question, apologies!)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, not that bad of an idea. I don't think it would cause issues.

We could probably have an ffi module in ddcommon, ddprof, etc.. that exposes only what needs to be exposed.

find_path(DDProf_INCLUDE_DIR ddprof/ffi.h
find_path(DDProf_INCLUDE_DIR datadog/profiling.h
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we take the opportunity to rename this file LibdatadogConfig.cmake.in and possibly also adjust the stuff we call DDProf in this file?

(On the other hand I realize this PR already includes quite a lot of things, so I'm also OK with leaving this for later -- it's not like we'll forget about it, it's pretty obvious and easy to find)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Leaving it for latter

Comment on lines +1 to +2
[package]
name = "ddcommon-ffi"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⬆️ I think we're missing the usual licence two-liner here

header = """// Unless explicitly stated otherwise all files in this repository are licensed under the Apache License Version 2.0.
// This product includes software developed at Datadog (https://www.datadoghq.com/). Copyright 2021-Present Datadog, Inc.
"""
include_guard = "DDOG_COMMON_H"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor: Should this be DATADOG_COMMON_H to match the path of the file in the filesystem? (Same note applies to other files in the PR as well)

Comment on lines +17 to +22
#if defined(_MSC_VER)
#define DDOG_CHARSLICE_C(string) \\
/* NOTE: Compilation fails if you pass in a char* instead of a literal */ {.ptr = "" string, .len = sizeof(string) - 1}
#else
#define DDOG_CHARSLICE_C(string) \\
/* NOTE: Compilation fails if you pass in a char* instead of a literal */ ((ddog_CharSlice){ .ptr = "" string, .len = sizeof(string) - 1 })
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting that we needed the top variant for Windows; I also run into issues with really old versions of GCC (https://github.com/DataDog/dd-trace-rb/blob/master/ext/ddtrace_profiling_native_extension/stack_recorder.h#L16) in some cases...

Comment on lines 26 to 33
# define DD_CHECK_RETURN [[nodiscard]]
#elif defined(_Check_return_) /* SAL */
# define DD_CHECK_RETURN _Check_return_
#elif (defined(__has_attribute) && __has_attribute(warn_unused_result)) || \\
(defined(__GNUC__) && (__GNUC__ >= 4))
# define DD_CHECK_RETURN __attribute__((__warn_unused_result__))
#else
# define DD_CHECK_RETURN
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should the prefix on these macros be DDOG_, for consistency?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PROBABLY.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will change

@@ -73,7 +74,7 @@ pub extern "C" fn endpoint_agent(base_url: CharSlice) -> EndpointV3 {
/// # Arguments
/// * `site` - Contains a host and port e.g. "datadoghq.com".
/// * `api_key` - Contains the Datadog API key.
#[export_name = "ddprof_ffi_EndpointV3_agentless"]
#[export_name = "ddog_EndpointV3_agentless"]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Arguably this could be used for agentless as well in telemetry (I think they're planning on supporting it?); the two things that go here (site and api key) are what they'd need as well.

Comment on lines -28 to +29
#[export_name = "ddprof_ffi_NewProfileExporterV3Result_drop"]
#[export_name = "ddog_NewProfileExporterV3Result_drop"]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@morrisonlevi not for this PR but this got me thinking -- since we're still iterating on the API and breaking it all the time, did we add the V3 suffix here too early?

Would you be against me submitting a PR removing the V3 suffixes from the API names?

Looking at what's needed to move to the v2.4 intake (where v3 here was actually renamed v1.3 in our internal specs), I don't think that will need changing the ffi APIs, only the libdatadog implementation, and anyway we wouldn't need to support both at the same time because the agent just proxies stuff.

Happy to talk more, just thought this was a nice time to bring it up since we're renaming these APIs again anyway so all the time so far we called it v3 it wasn't that helpful lol, we never made use of it, and broke the API anyway :P

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, V3 was meant for the v1.3 intake format/protocol. Not sure what we need to do to support both 1.3 and 2.4 and how that's negotiated, but we can figure that out when we need it and not earlier.

@paullegranddc paullegranddc merged commit 0d0ebd6 into main Jul 19, 2022
ivoanjo added a commit that referenced this pull request Jul 19, 2022
As discussed in
<#24 (comment)>
having the "V3" suffix on APIs doesn't seem that useful because:

1. We're still iterating on APIs all the time. We just broke the API
   again by changing the prefixes to `ddog_` so it seems a bit too
   early to version things. We never supported more than a version
   at a time!

2. Since the datadog agent just proxies profiling requests, we
   probably don't need to support more than one API version at a
   time.

3. The new intake format (internally called v2.4, whereas the
   old v3 was renamed to v1.3) does not need extensive API changes,
   so it's quite possible that we can move profilers to use v2.4
   without changing the libdatadog APIs. So having v3 in the API
   names would force us to break the API even if not needed.

For these reasons, I decided to ride the wave of #24 already breaking
all our APIs and also dropping the V3 suffix. Let me know your
thoughts :)
ivoanjo added a commit that referenced this pull request Jul 19, 2022
As discussed in
<#24 (comment)>
having the "V3" suffix on APIs doesn't seem that useful because:

1. We're still iterating on APIs all the time. We just broke the API
   again by changing the prefixes to `ddog_` so it seems a bit too
   early to version things. We never supported more than a version
   at a time!

2. Since the datadog agent just proxies profiling requests, we
   probably don't need to support more than one API version at a
   time.

3. The new intake format (internally called v2.4, whereas the
   old v3 was renamed to v1.3) does not need extensive API changes,
   so it's quite possible that we can move profilers to use v2.4
   without changing the libdatadog APIs. So having v3 in the API
   names would force us to break the API even if not needed.

For these reasons, I decided to ride the wave of #24 already breaking
all our APIs and also dropping the V3 suffix. Let me know your
thoughts :)
ivoanjo added a commit that referenced this pull request Jul 19, 2022
As discussed in
<#24 (comment)>
having the "V3" suffix on APIs doesn't seem that useful because:

1. We're still iterating on APIs all the time. We just broke the API
   again by changing the prefixes to `ddog_` so it seems a bit too
   early to version things. We never supported more than a version
   at a time!

2. Since the datadog agent just proxies profiling requests, we
   probably don't need to support more than one API version at a
   time.

3. The new intake format (internally called v2.4, whereas the
   old v3 was renamed to v1.3) does not need extensive API changes,
   so it's quite possible that we can move profilers to use v2.4
   without changing the libdatadog APIs. So having v3 in the API
   names would force us to break the API even if not needed.

For these reasons, I decided to ride the wave of #24 already breaking
all our APIs and also dropping the V3 suffix. Let me know your
thoughts :)
ivoanjo added a commit that referenced this pull request Jul 20, 2022
As discussed in
<#24 (comment)>
having the "V3" suffix on APIs doesn't seem that useful because:

1. We're still iterating on APIs all the time. We just broke the API
   again by changing the prefixes to `ddog_` so it seems a bit too
   early to version things. We never supported more than a version
   at a time!

2. Since the datadog agent just proxies profiling requests, we
   probably don't need to support more than one API version at a
   time.

3. The new intake format (internally called v2.4, whereas the
   old v3 was renamed to v1.3) does not need extensive API changes,
   so it's quite possible that we can move profilers to use v2.4
   without changing the libdatadog APIs. So having v3 in the API
   names would force us to break the API even if not needed.

For these reasons, I decided to ride the wave of #24 already breaking
all our APIs and also dropping the V3 suffix. Let me know your
thoughts :)
@morrisonlevi morrisonlevi deleted the paullgdc/factorise_generic_ffi_code branch January 27, 2023 16:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants