Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Query language transform processor #6985

Closed

Conversation

anuraaga
Copy link
Contributor

@anuraaga anuraaga commented Jan 3, 2022

Description:
This implements an initial version of a telemetry transform processor that supports arbitrary processing queries. The initial version scopes to

  • Traces only for now, though the design should work for non-traces
  • Simple transformation on instants of data without temporal transformations
  • Only two functions, set and keep for now. This PR is for the framework, and after that can add many functions in followups

For this code I have interfaces like traces.getter instead of more generic getter to apply to multiple signals. The types are mostly passthrough between different parts and I think replacing Span with interface{} would mostly work to simplify things. I'd like to try that when adding a new signal since when initially trying to keep things simple I was losing sanity due to dropping the compile time safety. Or even better could be to leave such a cleanup for Go 1.18.

Link to tracking Issue: open-telemetry/opentelemetry-collector#4444

Testing: Unit tests

Documentation: README

@bogdandrutu @punya

Benchmark results

goos: linuxgoarch: amd64
pkg: github.com/open-telemetry/opentelemetry-collector-contrib/processor/transformprocessor/internal/traces
cpu: Intel(R) Xeon(R) Platinum 8175M CPU @ 2.50GHz
BenchmarkTwoSpans
BenchmarkTwoSpans/no_processing
BenchmarkTwoSpans/no_processing-16         	  689240	      1558 ns/op
BenchmarkTwoSpans/set_attribute
BenchmarkTwoSpans/set_attribute-16         	  696786	      1760 ns/op
BenchmarkTwoSpans/keep_attribute
BenchmarkTwoSpans/keep_attribute-16        	  631849	      1882 ns/op
BenchmarkTwoSpans/no_match
BenchmarkTwoSpans/no_match-16              	  611616	      1647 ns/op
BenchmarkTwoSpans/inner_field
BenchmarkTwoSpans/inner_field-16           	  708600	      1702 ns/op
BenchmarkTwoSpans/inner_field_both_spans
BenchmarkTwoSpans/inner_field_both_spans-16         	  561469	      1826 ns/op
BenchmarkHundredSpans
BenchmarkHundredSpans/no_processing
BenchmarkHundredSpans/no_processing-16              	   25196	     47882 ns/op
BenchmarkHundredSpans/set_status_code
BenchmarkHundredSpans/set_status_code-16            	   20630	     58651 ns/op
BenchmarkHundredSpans/hundred_statements
BenchmarkHundredSpans/hundred_statements-16         	    2444	    485470 ns/op
PASS

@anuraaga anuraaga requested review from a team and dmitryax January 3, 2022 05:51
package common // import "github.com/open-telemetry/opentelemetry-collector-contrib/processor/transformprocessor/internal/common"

import (
"github.com/alecthomas/participle/v2"
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Happy to replace with something like goyacc if this library causes any concerns, I appreciated how simple it was to do the parsing with it

@anuraaga anuraaga force-pushed the processor-expressions-v1 branch from ccde85f to 0c9f19c Compare January 3, 2022 05:54
@jpkrohling
Copy link
Member

@anuraaga, will you split this PR to make this easier to review? I'm also interested in hearing more about the relationship between this and the attributes processor (and potentially other similar processors).

@bogdandrutu
Copy link
Member

bogdandrutu commented Jan 3, 2022

I'm also interested in hearing more about the relationship between this and the attributes processor (and potentially other similar processors).

We decided long time ago in multiple SIG meetings that we will replace the current Attribute/Span/MetricsTransform/etc. with a more consistent "transform" (name was not decided) processor that uses a much simpler and consistent language across signals.

processor/transformprocessor/README.md Show resolved Hide resolved
- Path expressions: Fields within the incoming data can be referenced using expressions composed of the names as defined
in the OTLP protobuf definition. e.g., `status.code`, `attributes["http.method"]`. If the path expression begins with
`resource.` or `instrumentation_library.`, it will reference those values.
- Literals: Strings, ints, and floats can be referenced as literal values
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is nil an accepted literal value?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently it isn't - it will need to be added to allow for arbitrary functions. Curious what do you think the behavior of a function like set(name, nil) should be - should it set the name to empty string? Currently the code will ignore the call due to use of val.(string) type assertion

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should fail during "validation" of the config instead of "no-op". And I think is a good long-term plan, to fail fast in cases like this, especially because when you deploy the collector you want telemetry to be processed, not like in the case of an API/SDK where the telemetry is more or less a commodity.

Comment on lines +21 to +22
- `keep(target, string...)` - `target` is a path expression to a map type field. The map will be mutated to only contain
the fields specified by the list of strings. e.g., `keep(attributes, "http.method")`, `keep(attributes, "http.method", "http.route")`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will we not be able to use the same func to keep an entire "span"?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't quite understand what it means to "keep an entire span" could you describe? Do you mean the opposite of drop?

Copy link
Member

@bogdandrutu bogdandrutu Jan 5, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, so for example:

keep(span) where span.attribute["foo"] == "bar" -> equivalent to drop(span) where span.attribute["foo"] != "bar"

So do we decide on drop for spans with negative where statement, and keep with positive where statements for attributes?

Comment on lines +40 to +42
1) Set status code to OK for all spans with a path `/health`
2) Keep only `service.name`, `service.namespace`, `cloud.region` resource attributes
3) Set `name` to the `http.route` attribute if it is set.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To make sure I understand, the "where" statement applies only to the first operation, if that is the case maybe add a clarification to 2/3 like "for all spans".

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried clarifying in 7745311

transform:
queries:
- set(status.code, 1) where attributes["http.path"] == "/health"
- keep(resource.attributes, "service.name", "service.namespace", "cloud.region")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't remember the final decision about the design (1 processor per signal or 1 processor for all signals). If we have 1 processor for all signals as the name suggest ("transform") then this will apply for any telemetry type, but the first line would apply only to "traces" since the status is a span concept?

I believe would be hard to understand all these subtle things, so maybe we can have a per signal processor and reuse as much code as possible?

Not sure we need the final decision now, but when reading these lines I got confused.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was thinking of having just one processor for the multiple signals, with a future goal of even allowing the same set of queries to apply for cross-signal at some point.

In the current code, if a path expression is made with the wrong fields for what the process is applied to, then it will return an error at startup.

e.g.

transform:
  queries:
    - keep(resource.attributes, "service.name")

can be applied to any signal pipeline.

transform:
  queries:
    - set(status.code, 1)

Can be applied only to traces pipeline. If applied to metrics pipeline, it will return an error at startup (well at least this is the planned behavior when actually implementing ProcessMetrics). The general "best practice" would be to name it transform/traces for such processing.

I agree there is some subtlety but having a separate processor component for each signal seems like too blunt of a hammer given the processor works the same for all of them, it's only the allowed path expressions that change.

Copy link
Member

@bogdandrutu bogdandrutu Jan 5, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The general "best practice" would be to name it transform/traces for such processing.

If this is the case I don't see where the "too blunt of a hammer" comes from. But some more feedback (from more users) would be good.

Comment on lines +21 to +25
type Config struct {
config.ProcessorSettings `mapstructure:",squash"`

Queries []string `mapstructure:"queries"`
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be possible to validate syntax here, and actually resolve the final statement in factory.go when creating the traces processor, since we need to know which signal it's applied to for validating the paths. Currently, it is the factory that returns an error if the config is invalid, either for syntax error or bad path expressions.

Is there an advantage to using Config.Validate vs returning an error from the factory?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We are looking to implement a "--dry-run" and that would not be possible if you can return error only at "start" time.

@alolita alolita added the enhancement New feature or request label Jan 4, 2022
@anuraaga
Copy link
Contributor Author

anuraaga commented Jan 4, 2022

@jpkrohling I struggle to find a good way to split the PR without losing context of other components. But let me know if it would be beneficial to go with three PRs in order of internal/common, internal/traces, and transformprocessor itself.

@jpkrohling
Copy link
Member

that we will replace the current Attribute/Span/MetricsTransform/etc. with a more consistent "transform" (name was not decided) processor

Should this PR then add deprecation notices to the processors it replaces?

@jpkrohling
Copy link
Member

I struggle to find a good way to split the PR without losing context of other components.

Given that this is a new component, it should follow what we require from other contributors as outlined in #6926.

@anuraaga
Copy link
Contributor Author

anuraaga commented Jan 5, 2022

@jpkrohling Sent #7047 to split out the skeleton

@anuraaga anuraaga closed this Jan 5, 2022
@anuraaga anuraaga reopened this Jan 5, 2022
@anuraaga
Copy link
Contributor Author

anuraaga commented Jan 5, 2022

@bogdandrutu Sorry for the troubles, as I've split out #7047 let's continue discussions there I've copied over the current threads

@anuraaga anuraaga closed this Jan 5, 2022
animetauren pushed a commit to animetauren/opentelemetry-collector-contrib that referenced this pull request Apr 4, 2023
…tly to simplify usage (open-telemetry#6985)

Signed-off-by: Bogdan Drutu <bogdandrutu@gmail.com>

Signed-off-by: Bogdan Drutu <bogdandrutu@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants