-
Notifications
You must be signed in to change notification settings - Fork 9
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Trace Tag Replacer: functionality to scrub sensitive data from spans #111
Conversation
/// parse_rules_from_string takes an array of rules, represented as an array of length 3 arrays | ||
/// holding the tag name, regex pattern, and replacement string as strings. | ||
/// * returns a vec of ReplaceRules | ||
pub fn parse_rules_from_string<'a>( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this may be changed in the future depending on how the "agentless" code consumes config variables, and how we want to parse the "replace_tags" config variables.
trace-obfuscation/src/lib.rs
Outdated
|
||
#![deny(clippy::all)] | ||
|
||
pub mod pb { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think with this setup, the protocol buffer definitions will be included in the binary multiple times. Also might mean the normalization and obfuscation crates are using different types. Might be best to either bundle them all into a single crate, or move this protocol buffer definitions into a single dedicated crate and reference that instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've moved all the protobuf files + definitions into it's own dir/crate in the libdatadog root trace-protobuf
, though we will likely re-organize the structure of all our agentless related code in the near future
…tadog into david.lee/trace-obfuscator
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tiny nit + suggestion to add benchmark for the parsing/replacing fn.
Otherwise looks good to me!
// developed at Datadog (https://www.datadoghq.com/). Copyright 2023-Present | ||
// Datadog, Inc. | ||
|
||
#![deny(clippy::all)] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this clippy setting still required?
} | ||
|
||
/// replace_trace_tags replaces the tag values of all spans within a trace with a given set of rules. | ||
pub fn replace_trace_tags(trace: &mut [pb::Span], rules: &[ReplaceRule]) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might be worth addin a criterion benchmark for this function.
In near feature I plan to add the benchmarking platform to the CI config. So that we'll get PR comments if a regression gets detected
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
added a new criterion bench for replace_trace_tags
which can be run with cargo bench
What does this PR do?
Implements the functionality behind Scrubbing Sentitive Data from Spans
introduces:
replace_trace_tags
ReplaceRule
s to the tags of a given array of spansparse_rules_from_string
ReplaceRule
s from a given array of length 3 string arrays, each representing a rule with a tag name, regex pattern, and replacement stringMotivation
Implements a part of the greater Trace Obfuscator, to be used in the serverless agentless code.
This directory will also include
obfuscator.rs
in the future, covering more general trace obfuscation.Additional Notes
Anything else we should know when reviewing?
How to test the change?
Unit test input/expected output was copied from the go agent tag replacer