Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add design doc for processing query language. #4444
Add design doc for processing query language. #4444
Changes from all commits
fc1c4c2
de83015
16d64ef
9d42af4
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How does the string in the call get converted to a regexp.Regexp here? The example above shows:
replace_wildcards("/user/*/list/*", "/user/{userId}/list/{listId}", attributes["http.target"])
Is reflection used for every invocation of of replace_wildcards to figure out that
"/user/*/list/*"
must be compiled into regexp.Regexp? Or is it the responsibility of the compiler to figure this out?Related to this: do we adopt weak typing, when you can pass a string when regexp is expected? Weak typing may be more complicated to optimize (i.e. precompile regexp patterns once) so it may be worth thinking about this aspect of the language.
Do we intend to make the compiler smart enough to figure out that the regexp compilation can be done ahead of time and not for every invocation for every span? That's a fairly big ask for a simpler compiler, but without that the execution can be quite slow.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great callout - while writing the POC I realized the functions need to be factories of the actual logic function.
(a bit of shorthand)
The UX for defining a function is a bit reduced but seems quite reasonable. And with this it will be possible to use reflection to convert types passed into the factory so instread of string, it could be defined to accept regex. The framework would only need to reflect, convert regex, and invoke factory, once during config parse time.
I'll add a note in the doc about this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
BTW, is it the same smartness you're referring to? 😅 I think it achieves some balance by creating the factory / logic function split to make the optimization easier without making the compiler have to be too complex.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The factory approach requires that some arguments are compile time constants. We loose the ability to use a value computed at runtime as the regular expression, for example. It also requires us to declare which of the function parameters are compile-time parameters and which are runtime parameters. In this example the fist 2 arguments of replace_wildcards are compile-time constants.
What if I want to replace a span attribute by a value of another attribute and not by a constant string? It doesn't seem to be possible. To be honest, I have reservations about the approach you suggest. I think it limits expressiveness of the language. Implementation-wise this approach is deviating from how language compilers and VMs are typically implemented and it may be difficult to fix it without major rewrites in the future.
Alternative approach is to have Regexp type as a first class citizen in the language. So you can do this instead:
replace_wildcards(regexp("/user/*/list/*"), "/user/{userId}/list/{listId}", attributes["http.target"])
whereregexp
is a function that takes a string and returns a Regexp. Then a sufficiently smart compiler can perform the Regexp value computation at compile time and avoid calling theregexp()
function at time if the argument toregexp()
is a compile-time constant. See how our current expr evaluator added support for that: https://github.com/antonmedv/expr/pull/96/filesWe have another unclear moment with the 3rd parameter of replace_wildcards. It is not clear how
attributes["http.target"]
becomes aTelemetryPath
(or aSpan
and aResource
pair that is passed todoReplace
). Is the semantic of the language thatattributes["http.target"]
is an lvalue of typeTelemetryPath
that can be either evaluated to get the value of or can be passed as a reference to a function to store a value into?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you describe an example of a regex computed at runtime? As this is a static config (well possibly dynamically updated via a remote), I figured that by nature the config elements, such as a regex, would be static.
Yeah - since it's not a string, it is parsed as a telemetry path. If the function's argument in that position does not accept a telemetry path, it would be a configuration error and cause a warning. It's true that currently only telemetry paths are runtime values - I think this hits all of the use cases for telemetry transform (the only runtime data is the telemetry itself) but let me know what other ones there are that may need something more.
Currently the
value
type is a literal, path expression, or invocation of a functionhttps://github.com/anuraaga/opentelemetry-collector-contrib/pull/1/files#diff-3dcf16cd5cabbe637313a07c03600765fd4b043e429d66b9ef024f62f3adbd51R24
It would be relatively easy to expand support for special syntax, e.g.
/regex/
or evenregexp("foo")
if it's useful I think. That being said, I don't think we'd avoid factories (though they wouldn't necessarily need to be required). We probably can't support all types of config-time resolved values. Forkeep
, we need to convert a slice of strings into a map for quick presence testshttps://github.com/anuraaga/opentelemetry-collector-contrib/pull/1/files#diff-d38eee897f4ad52c000a3df17502753a839f5afb2f7bdf3f573807cea2588561R26
There doesn't seem to be a way to handle this generically, but the factory approach seems to work well for it.