Skip to content
This repository has been archived by the owner on Jun 28, 2022. It is now read-only.

Add Vec type, remove Buffer, add SendResult_drop #41

Merged
merged 8 commits into from
Mar 31, 2022
Merged

Conversation

morrisonlevi
Copy link
Collaborator

What does this PR do?

Adds:

  • A new repr(C) Vec type.
  • FFI functions to deal with Vec<Tag>, including a helper for parsing
    DD_TAGS.
  • ddprof_ffi_SendResult_drop destructor.

Removes:

  • Buffer. Use Vec<u8> instead.
  • ddprof_ffi_Buffer_reset. Probably you should use
    ddprof_ffi_SendResult_drop instead, as resetting the buffer was only
    really used in the failure case of a SendResult.
  • ddprof_ffi_Buffer_free. There aren't any APIs left which create a
    Box<Vec<u8>> to use it on.

Motivation

I wanted an API in PHP for dynamic tag management, and Ivo said he'd
like it for Ruby as well.

Additional Notes

I started off using opaque types, but cbindgen was failing to generate
correct C code for Box<Vec<Tag>> when used in an enum. So, instead I
made the FFI vec type. This technically saves an allocation as well.

How to test the change?

Make the above changes, then use as normal. The functions below may be
helpful creating a dynamic array of tags:

ddprof_ffi_Vec_tag ddprof_ffi_Vec_tag_new(void);
void ddprof_ffi_Vec_tag_push(ddprof_ffi_Vec_tag *vec, ddprof_ffi_Tag tag);
ddprof_ffi_Slice_tag ddprof_ffi_Vec_tag_as_slice(const ddprof_ffi_Vec_tag *vec);
void ddprof_ffi_Vec_tag_drop(ddprof_ffi_Vec_tag vec);
void ddprof_ffi_VecTagResult_drop(ddprof_ffi_VecTagResult result);
ddprof_ffi_VecTagResult ddprof_ffi_Vec_tag_parse(ddprof_ffi_CharSlice string);
ddprof_ffi_VecTagResult ddprof_ffi_Vec_tag_clone(const ddprof_ffi_Vec_tag *vec);
void ddprof_ffi_VecTagResult_drop(ddprof_ffi_VecTagResult result);

});
}
let name = &chunk[..first_colon_position];
let value = &chunk[(first_colon_position + 1)..];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if : is the last char from chunk ?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's handled as a consequence of other bits of code I think, but I'll add explicit tests; I meant to do that and forgot, so thanks!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My bad! Indeed in that case value slice will be empty, which is fine, sorry for the noise

Copy link
Member

@ivoanjo ivoanjo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left a few notes but overall pretty happy!

ddprof-ffi/src/exporter.rs Show resolved Hide resolved
Comment on lines +270 to +282
#[export_name = "ddprof_ffi_Vec_tag_new"]
pub extern "C" fn vec_tag_new<'a>() -> crate::Vec<Tag<'a>> {
crate::Vec::default()
}

/// Pushes the tag into the vec.
#[export_name = "ddprof_ffi_Vec_tag_push"]
pub unsafe extern "C" fn vec_tag_push<'a>(vec: &mut crate::Vec<Tag<'a>>, tag: Tag<'a>) {
vec.push(tag)
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm assuming these can fail to allocate memory. What happens in that case -- does a panic get triggered?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, they will panic. I previously used vec.try_reserve to detect this and return a Result instead, but that requires Rust 1.57 and we are only on 1.56 (and are stuck there for a bit due to Alpine Linux).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 I think we can live with this until we move to a newer version.

Comment on lines 317 to 325
let name = &chunk[..first_colon_position];
let value = &chunk[(first_colon_position + 1)..];
Ok(Tag {
name: CharSlice::from(name),
value: CharSlice::from(value),
})
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor: I was looking at the Ruby implementation and it additionally strips any leading/trailing whitespace on each chunk. May be worth including here as well?

Copy link
Collaborator Author

@morrisonlevi morrisonlevi Mar 30, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does Ruby also handle , and as separators, or does it require just one of them? Edit: you already answered this, just ,.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, it only supports ,. I hadn't seen the before... 🤔

Comment on lines 313 to 318
return Err(ParseTagsError {
message: "Tag cannot start with a colon.",
});
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor: I think it would be a nice UX if we could include the actual tag that failed to parse in the error message (applies to other error messages added in this PR -- it would be nice to always have at least an excerpt of the actual thing that triggered the failure).

Otherwise a user needs to go dig what's up, and may get confused on which tags made it where, depending on how complex their setup is, and giving them something they can grep their codebase with here I think may save valuable time.

Comment on lines 296 to 311
impl Debug for ParseTagsError {
fn fmt(&self, f: &mut Formatter<'_>) -> std::fmt::Result {
write!(f, "Tag Error: {}", self.message)
}
}

impl Display for ParseTagsError {
fn fmt(&self, f: &mut Formatter<'_>) -> std::fmt::Result {
write!(f, "The tags string was malformed.")
}
}

impl Error for ParseTagsError {}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My rust-foo is somewhat failing me here. Does this arrangement mean that by default we don't present a more specific error message, and only debug builds get it? If so, why? 😉

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Rust convention is that Debug is for developers and Display is meant for everyone else. They are both present, and it depends on whether the programmer stringifies it with {} or {:?}.

/// Parse a string of tags typically provided by environment variables
/// The tags are expected to be either space or comma separated:
/// "key1:value1,key2:value2"
/// "key1:value1 key2:value2"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is interesting; looking at the Ruby code, it really only allows commas. Can you point me in the direction where you found the space-separated version? I may need to bring this up with the rest of the Ruby folks if we're doing this wrong.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Python previously just supported , and later added space separated to match the agent. Note that as far as I can tell, neither spaces nor commas are supported in tags: https://docs.datadoghq.com/getting_started/tagging/.

However, there is a flaw in this technique for choosing between them. Given tag env:staging:east, the key is env and the value is staging:east but neither method will split this correctly because it assumes only 1 colon is present per key-value chunk. Python has this same flaw.

Copy link
Member

@ivoanjo ivoanjo Mar 30, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One additional thing the page you've linked to is

A tag can be in the format value or :. For optimal functionality, Datadog recommends constructing tags in the : format.

which I'm pretty sure will not work in Ruby either. I'll raise these, thanks!

Update: Slack link to dd-trace-rb discussion.

Comment on lines +382 to 395
#[allow(clippy::ptr_arg)]
#[export_name = "ddprof_ffi_Vec_tag_clone"]
pub extern "C" fn vec_tag_clone<'a>(vec: &'a crate::Vec<Tag<'a>>) -> VecTagResult {
let mut clone = Vec::new();
for tag in vec.into_iter() {
clone.push(tag.to_owned())
}
VecTagResult::Ok(crate::Vec::from(clone))
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm somewhat curious about this one -- what's the use case for cloning the vec?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In PHP I store the result of parsing DD_TAGS globally. On each request, I clone these tags and add potentially specific values like runtime-id to the clone, which might help to support fork (it's not complete for PHP though, so maybe misguided).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmmm... have you considered doing that by parsing DD_TAGS globally, adding them once at exporter creation, and then starting from an empty vec when adding the extra tags during send?

(Although I don't mind if you prefer doing it this way, seems harmless as well)

/// Buffer holds the raw parts of a Rust Vec; it should only be created from
/// Rust, never from C.
#[repr(C)]
pub struct Buffer {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

RIP Buffer, you served us well but it's time to go 🙇

ddprof-ffi/src/vec.rs Show resolved Hide resolved
ddprof-ffi/src/exporter.rs Outdated Show resolved Hide resolved
@morrisonlevi morrisonlevi force-pushed the levi/vec branch 2 times, most recently from 2e39741 to 76aad26 Compare March 30, 2022 15:24
The Vec type is a generalization of Buffer. In particular, this PR
exposes an API to create and manage Vec<Tag> and replaces existing uses
of Buffer with Vec<u8>. It includes a helper to parse DD_TAGS as well.

One of the APIs, ddprof_ffi_Buffer_reset, was removed. Previously this
was required when using the failure case of the SendResult type, but
now there is ddprof_ffi_SendResult_drop, which should be used instead.
Previously it couldn't handle a tags like:
  env:staging:east
  value

Tag messages on failure have improved.

As part of adding tests, added more trait support to Slice and
CharSlice so they can be used in equivalence statements.

As part of doing that, I realized Slice could implement IntoIterator
and that it cleaned up 3+ places where we call .into_slice().iter().
I also removed unsafe from `into_slice`. Technically, it could still
fail such as if the user creates a slice of length 4 but only has 2
elements, but this is a user error that cannot really be checked.
Copy link
Member

@ivoanjo ivoanjo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM 👍

Comment on lines +124 to +126
return Err(Box::new(TagsError {
message: "tag name must not be empty".to_string(),
}));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor: Perhaps include here the tag value?

Comment on lines +297 to +299
#[allow(clippy::ptr_arg)]
#[export_name = "ddprof_ffi_Vec_tag_as_slice"]
pub extern "C" fn vec_tag_as_slice<'a>(vec: &'a crate::Vec<Tag<'a>>) -> Slice<'a, Tag<'a>> {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor: Less important than the others, but I guess still makes sense to have as #[must_use]...?

Base automatically changed from levi/send-tags to main March 31, 2022 15:40
@morrisonlevi morrisonlevi merged commit cca6cd0 into main Mar 31, 2022
@morrisonlevi morrisonlevi deleted the levi/vec branch March 31, 2022 15:42
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants