-
Notifications
You must be signed in to change notification settings - Fork 897
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Turn off spans from a scope while retaining downstream spans and without breaking traces #3867
Comments
@jack-berg, do you think this issue has some relationship with this one? #3205 -- both issues seem to look for a way to remove noise intermediate spans without loosing the span tree. |
I propose we solve this problem by starting to more fully realize the potential of scope as a concept. Today, scope is just a piece of metadata attached to records because there is no SDK level configuration surface area involving scopes (besides views) which allow you to select or do anything special with a scope. I propose we introduce the the notion of scope level configuration for all three signals:
The more I think about #1588, the more I think that extending sampler to include access to resource and scope is a leaky abstraction. Sure a sampler could be used to change the behavior of scopes, but if you play that out, we'll end up with samplers with very complicated rules. It seems much cleaner to separate concerns, with a global sampler policy applicable to all scopes, and separate scope level configuration. We really need to start to realize the potential of the scope concept. This proposal would lay the groundwork for that by introducing the simplest possible scope level configuration (enabled | disabled) while solving a very concrete use case. |
@svrnm yes I do. I think right now we only have a hammer (i.e. sampler) so everything looks like a nail. If we had a scope level configuration, it would be a no brainer to add a severity level threshold option for logs, and we probably wouldn't nearly as interested in log sampling. |
Thanks, @jack-berg! Utilizing scope to accomplish that kind of "span-level sampling" (for a lack of a better word) is really an interesting concept. It would solve the issue(s) discussed in #3205 and probably a few more. I just try to wrap my head around with an example:
Is this a correct understanding? |
Yeah you're correct. In my experience most instrumentation produces a single tier of spans. I know the idea of a single instrumentation scope producing both server + internal spans, or internal + client spans exists, but maybe those instrumentations would rethink that position if the SDK gave them tools that gave them an incentive to do so. Or maybe there is a need for additional scope level configuration beyond just the enabled or disabled. I think adding the ability to enable / disable scopes is doing the simplest possible thing while addressing a glaring omission in SDK configuration. Its the low hanging fruit. After that, we can take stock and determine what else is missing. |
I think disable/enable is a great starting point, and I see another usecase: there are plenty of questions around disabling individual instrumentation libraries. Of course one could argue that not loading them in the first place is what you should be doing, but if you start with a Java Agent, or a Node JS metapackage that has ALL the instrumentation libraries in it, it can be easier to disable those you don't care about. Now, if an instrumentation library has a well-defined scope, you could simply turn that off, and you are done. The screenshot in open-telemetry/opentelemetry-demo#659 is a good example of this use case. Of course boiling this down to the 3 spans you "really" need is a good solution, but what if I suspect an issue in Slim... or Psr... or I am a library developer and want to get that level of insight, if I can turn those scopes on/off, I have that level of flexibility without compromising on span creation. |
It would be worth getting @lmolkova's input, she has been (and is) thinking about this problems. The problems she highlights are mostly focusing on nested client spans (a client span parenting another client span), but I think the solution proposed here could also apply to those cases.
Using the scope mechanism is a much more promising approach then inventing a new mechanism, as proposed in the OTEP linked above. |
I'd like to add that .NET OTel has an existing mechanism for it. Let's say instrumentation creates a scope (via Let's say library also creates a scope called If I want to enable all sources from a library, I can do it with The same mechanism is used for meters. Effectively source name acts like a logger name where you can control loggers by their namespaces. The changes in the spec that I'd like to see to generalize this approach
Nice-to-haves
Per-scope enablement vs verbosityAs I mentioned before per-scope enablement is like a logger configuration, but there are two modes: on and off. In the meantime we can just ask instrumentations to document their stuff and users to read docs. Distros would also need to pick good defaults. |
One more example where this could be extremely useful is messaging where we have
This creates problems for some users:
The suppression mechanism could allow users to decide if they want per-message tracing and enable it then. This brings one more small requirement (which is probably already covered, but maybe worth mentioning anyway):
|
This is a use case I'm dealing with right now where it could be very useful to disable/enable a specific library instrumentation dynamically. My use case involves remote configuration for Android applications where users can decide what to disable/enable after they've deployed their app, and since library instrumentations are added at compile time in Android, unloading an instrumentation at runtime is not an option, so I think this proposal can help addressing that use case in a clean way too. |
@svrnm |
I think this is also related to some stuff I'd been kicking around for years now, #2555 I planned to attempt again to find a solution in a rebooted Convenience SIG/WG. At least I think it would fall under that SIG/WG. Is there any interest for tackling this there? |
This approach sounds good to me. Note that this approach does not exclude Sampler-related mechanisms that might solve the problem of broken traces due to sampled-out intermediate spans. |
@tsloughter my immediate interest is to add the simplest possible scope level config mechanism across signals, by just providing a way to enable / disable scopes. I think that once a scope level config mechanism exists, its more straight forward to pursue something like #2555, but also, the need diminishes because at least part of the motivation is already accommodated by disabling scopes. I think that #3103 is somewhat related, but should be separately pursued. As for open-telemetry/oteps#186, there's definitely overlapping motivation, but I question the value of having all of the tracerprovider configuration options configurable at the tracer level. I think it makes sense to consider each property on a case by case basis. This will be easier to do once a scope level config mechanism exists. |
Fixes #3867. This introduces the concept of scope-level configuration for Tracer, Meter, and Logger. Notes: - TracerProvider, MeterProvider, and LoggerProvider have a scope config provider function, which accepts an `InstrumentationScope` and returns the relevant scope config. This function is invoked when a Tracer, Meter, Logger is created, and the resulting scope config dictates the behavior of resulting Tracers, Meters, Loggers. Computing the config on scope creation avoids doing extra work on the hot path. - If TracerProvider, MeterProvider, LoggerProvider supports updating configuration, the function can be updated, at which point the scope config of outstanding Tracer, Meter, Logger will be re-computed. This accommodates emerging dynamic config use cases. - A function as the mechanism to determine relevant scope config to maximize flexibility. However, implementations are encourages to provide syntactic sugar / helpers for accommodating popular use cases: select one or more scopes by name or with pattern matching, disable one or more selected scopes, disable all scopes by default and enable one or more selective scopes. - Scope config is simple at the moment, consisting of on the ability to enable / disable a particular Tracer, Meter, Logger. When a disabled, use the respective noop Tracer, Meter, Logger. - Scope config parameters are expected to evolve over time. When it makes sense, there should be consistency across signals. - Java prototype available for this proposal: open-telemetry/opentelemetry-java#6201 Leaving as a draft for now because its not ready to merge, but I am actively soliciting feedback. If / when we agree on direction, I'll go back and add the appropriate experimental indications.
…#4063) Builds off #4020 to extend the `Enabled` API to `Tracer` and metrics `Instrument`. Adds language to Log SDK, Metrics SDK, and Trace SDK for how scope config is used to resolve the `Enabled` operation. This was discussed in #3867 but punted on in #3877 as [discussed here](#3877 (comment)). --------- Co-authored-by: Robert Pająk <pellared@hotmail.com> Co-authored-by: Trask Stalnaker <trask.stalnaker@gmail.com> Co-authored-by: Reiley Yang <reyang@microsoft.com>
…ry#3877) Fixes open-telemetry#3867. This introduces the concept of scope-level configuration for Tracer, Meter, and Logger. Notes: - TracerProvider, MeterProvider, and LoggerProvider have a scope config provider function, which accepts an `InstrumentationScope` and returns the relevant scope config. This function is invoked when a Tracer, Meter, Logger is created, and the resulting scope config dictates the behavior of resulting Tracers, Meters, Loggers. Computing the config on scope creation avoids doing extra work on the hot path. - If TracerProvider, MeterProvider, LoggerProvider supports updating configuration, the function can be updated, at which point the scope config of outstanding Tracer, Meter, Logger will be re-computed. This accommodates emerging dynamic config use cases. - A function as the mechanism to determine relevant scope config to maximize flexibility. However, implementations are encourages to provide syntactic sugar / helpers for accommodating popular use cases: select one or more scopes by name or with pattern matching, disable one or more selected scopes, disable all scopes by default and enable one or more selective scopes. - Scope config is simple at the moment, consisting of on the ability to enable / disable a particular Tracer, Meter, Logger. When a disabled, use the respective noop Tracer, Meter, Logger. - Scope config parameters are expected to evolve over time. When it makes sense, there should be consistency across signals. - Java prototype available for this proposal: open-telemetry/opentelemetry-java#6201 Leaving as a draft for now because its not ready to merge, but I am actively soliciting feedback. If / when we agree on direction, I'll go back and add the appropriate experimental indications.
…open-telemetry#4063) Builds off open-telemetry#4020 to extend the `Enabled` API to `Tracer` and metrics `Instrument`. Adds language to Log SDK, Metrics SDK, and Trace SDK for how scope config is used to resolve the `Enabled` operation. This was discussed in open-telemetry#3867 but punted on in open-telemetry#3877 as [discussed here](open-telemetry#3877 (comment)). --------- Co-authored-by: Robert Pająk <pellared@hotmail.com> Co-authored-by: Trask Stalnaker <trask.stalnaker@gmail.com> Co-authored-by: Reiley Yang <reyang@microsoft.com>
Suppose you have a span hierarchy of ({tracerName}:{spanKind}:{spanName}):
And you decide that internal spans from
tracerB
are too noisy. You want to turn them off while:tracerC:client:grandchild
because it contains important information about calls to downstream systems.tracerC:client:grandchild
's parent should betracerA:server:parent
, nottracerB:internal:child
.How can you accomplish this? Well, you pretty much can't. A couple of problems prevent it:
DROP
fortracerB
and delegates toParentBased{root=AlwaysOn}
for other spans: Then all descendants oftracerB:internal:child
will be dropped as well. Not what we want.DROP
fortracerB
and delegates toParentBased{root=AlwaysOn,localParentNotSampled=AlwaysOn}
for other spans: Then we end up with a broken trace sincetracerC:client:grandchild
will assign its parent to betracerB:internal:child
, which is dropped. Not what we want.DROP
fortracerB
and delegates toParentBased{root=AlwaysOn,localParentNotSampled=AlwaysOn}
and instrumentation is aware and only sets spans in context ifSpan.isRecording()=true
: Then we achieve the desired affect.tracerB:internal:child
is dropped andtracerC:client:grandchild
will assign its parent to betracerA:server:parent
, so we have a complete trace.So there is no way to get the behavior we want today. And even if samplers were given access to scope, we'd still need: a complex / non-standard sampling policy and coordination with instrumentation (i.e. checking
Span.isRecording()==true
). Ouch.This is related to #530 and #1588, but the problem framing here is a little different so I think it warrants a separate issue.
Originally opened as open-telemetry/opentelemetry-java#6197 but think we should provide a spec level solution for this.
The text was updated successfully, but these errors were encountered: