-
Notifications
You must be signed in to change notification settings - Fork 14
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add a request id header for all communications. #17
Conversation
Ruby's UUID version 4 (UUID4) is not particularly fast and we recently removed its use in the Ruby agent, favoring a simpler random number generator. In high-throughput environments, generating a UUID4 can lead to file handle exhaustion since Ruby's implementation makes use of so the question here: can we get away with simple random number generations, or must it truly be UUID4 standard? |
That's a good point @mwlang , I assumed that requests would not be high throughput (we default to batching every 5 seconds), but that's no excuse for poor performance 😄 That being said, there are ~122 bits of entropy in UUID4 according to https://en.wikipedia.org/wiki/Universally_unique_identifier#Version_4_(random) We can change the request ID to using a random hex number but we should probably figure out how much entropy is needed. We'll have to look at the maximum window we'd like to deduplicate payloads in (I'm thinking 1-5 minutes), how many requests we expect in that window both now and in the future, and what the probability is for a collision. We certainly want to avoid cases where a collision would cause data loss. For the entropy required, I think having input from our ingest team would be helpful. I'll reach out to them for comment. |
One or two UUIDs per harvest cycle doesn't seem too onerous. Is it really that bad, that you can't generate a couple every 5 seconds or so without impacting things? I would hope the cost of generating a UUID would be much, much less than the network latency. Am I wrong about that? |
More than the performance impact of generating a UUID4 in ruby, the fact that it opens a new file descriptor to read from I totally agree with Allan's proposal of requiring (or at least strongly suggesting) some minimum amount of entropy. My hunch is that 122 bits is far more than is strictly necessary, but also the cost of extra bits isn't too much. I wouldn't be opposed to suggesting something like tracestate -- 32 characters of hex representing 128 bits of random. |
Nope. Not wrong. Justin goes into a little more detail about our pain point. My primary goal is to avoid bringing back that pain if there's an alternative that's "good enough"(tm). @justinfoote going with the 32 characters of hex representing 128 bits of random seems plenty good to me and would entirely side-step our earlier issues. |
Ah, I get it. The problem isn't UUIDs in general, it's the SecureRandom version in ruby that requires a file descriptor to be opened. Gotcha. Sounds like this is a ruby-specific implementation detail, rather than a general problem with UUIDs. |
@mwlang @justinfoote I've updated the proposal to a 32 character hex string based on the feedback! 😄 |
Ruby agent approves this change! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🎉
After chatting with folks, I've changed the proposal back to uuid4. Part of this decision is that we believe the risk of collisions (which would result in data loss under the proposed usage of the request id) resulting from the use of a non-uniform random number generator is too high of a risk. Using a uuid provides greater collision resistance. |
The x-request-id header is a uuid4 value that may be used by ingest in the future. See newrelic/newrelic-telemetry-sdk-specs#17 for details.
The x-request-id header is a uuid4 value that may be used by ingest in the future. See newrelic/newrelic-telemetry-sdk-specs#17 for details.
The x-request-id header is a uuid4 value that may be used by ingest in the future. See newrelic/newrelic-telemetry-sdk-specs#17 for details.
The x-request-id header is a uuid4 value that may be used by ingest in the future. See newrelic/newrelic-telemetry-sdk-specs#17 for details.
dang. wish i had seen this earlier. this will be slowwwwww in Node land. up to several milliseconds occasionally to generate (possible to be even longer right after startup). while it is only on request, we've seen attempted uses in the wild where metrics are fired off right after generating. the whole ID situation in Node is quite unfortunate. given we've likely missed the boat here, i guess we may need to get very prescriptive in documentation on forcing batching. |
Adding a request ID header makes it feasible in the future for ingest to deduplicate data being sent from the SDKs without any updates being required to the SDK clients.