Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Prototype disabling scope across traces, metrics, and logs #6201

Closed
wants to merge 2 commits into from

Conversation

jack-berg
Copy link
Member

@jack-berg jack-berg commented Feb 6, 2024

A prototype solution for #6197.

The idea is to start to more fully realize the potential of scope as a concept. Today, scope is just a piece of metadata attached to records because there is no SDK level configuration surface area involving scopes besides views which allow you to select on scope.

Updated 2/14/24 to reflect latest commit

This PR introduces the notion of scope level configuration for all three signals:

  • Each provider has the ability to register configuration which applies scope configuration to matching scopes
  • When a Tracer, Meter, Logger is created, the config is computed and stored as a local variable. This prevents needing to compute scope config on the hot path. Later, we can add the ability to dynamically update scope configuration, which would trigger the scope config to be recomputed for outstanding tracers, loggers, meters.
  • Scope configs manifest as holders of configuration options relevant to the particular signal. We have TracerConfig, MeterConfig, LoggerConfig.
  • For now, each scope config type only has the ability to specify that a scope is disabled. This capability is shared across all signals, but in the future, scope config may be extended to support signal level properties. For example, its common in logging to specify a log level threshold for a particular logger. This concept is not applicable to metrics.
  • When a scope is disabled, the tracer / meter / logger uses the behavior of the relevant noop tracer / meter / logger.
  • For tracing, the noop tracer returns Span.wrap(parentContext). This means that disabling a tracer is an effective solution to the broken trace problem discussed in Turn off spans from a scope while retaining downstream spans and without breaking traces #6197.
OpenTelemetrySdk sdk =
    OpenTelemetrySdk.builder()
        .setTracerProvider(
            SdkTracerProvider.builder()
                .addSpanProcessor(SimpleSpanProcessor.create(spanExporter))
                .setTracerConfigProvider(
                    applyToMatching(scopeNameEquals("scopeB"), TracerConfig.disabled()))
                .build())
        .setMeterProvider(
            SdkMeterProvider.builder()
                .registerMetricReader(metricReader)
                .setMeterConfigProvider(
                    applyToMatching(scopeNameEquals("scopeB"), MeterConfig.disabled()))
                .build())
        .setLoggerProvider(
            SdkLoggerProvider.builder()
                .addLogRecordProcessor(SimpleLogRecordProcessor.create(logRecordExporter))
                .setLoggerConfigProvider(
                    applyToMatching(scopeNameEquals("scopeB"), LoggerConfig.disabled()))
                .build())
        .build();


@AutoValue
@Immutable
public abstract class LoggerConfig {
Copy link
Contributor

@lmolkova lmolkova Feb 8, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just thinking out loud, not blocking:

I don't understand how otel combined with specific logging provider enablement would work. We'll have many ways to configure it, which one will be the source of truth?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah the pipeline is:

Log API (SLF4J) -> Log Framework (Log4j) -> Otel log bridge API -> Otel log SDK
  • If Log4j disables a logger, the logs will never make it to the otel log bridge API, so any intersecting Otel log SDK config is ignored
  • If log4j enables a logger, the Otel log SDK config has an additional opportunity
  • If there is overlapping responsibilities, and multiple ways to achieve the same affect, why include Otel log SDK logger config at all?
    • Events are also going to flow through the Otel log SDK, and they will not have any equivalent mechanism.
    • Symmetry with Tracer and Meter is important.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree, so as a user I'd use log4j config as the first filter that also does perf optimization. Then otel filter is an additional one and you might want to use it to narrow down what goes into otel.

Unrelated to this work, but eventually it'd be nice if OTel config could be used as a source of logging provider configuration.

TracerConfig getTracerConfig(InstrumentationScopeInfo instrumentationScopeInfo) {
for (ScopeSelector scopeSelector : tracerConfigMap.keySet()) {
if (scopeSelector.matchesScope(instrumentationScopeInfo)) {
return tracerConfigMap.get(scopeSelector);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm thinking about the places where we provide "customizers" for tracer/meter/logger provider builders, and I'm wondering how many troubles (if any) could happen if two or more customizers added config selectors that overlap, given that only one config will be picked. It definitely seems to be a tricky issue to solve though, but I still would like to see if there might be ways to reduce or eliminate the unexpected behaviors that this situation could bring, maybe by combining here all the matching configs (similar to what happens when multiple SpanProcessors are added to the same tracerprovider)? - Which is something that could work for the isEnabled config option use case though I guess it's difficult to ensure that it would work well for other types of config options that might be added in the future too, so that's why I believe it's a tricky thing to address.

Nevertheless, I don't think this is a blocker, I really like this approach the way it is now and it seems like it allows for dynamic configs as well so I'm looking forward to it. Just wanted to mention that potential issue, which I'm not even sure might actually happen, or happen enough times to become a big problem, just to check if I'm missing something here or if someone can come up with a better scalable alternative to the one I mentioned.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No you make a good point and its something I've thought about as well.

Consider someone who wants to have a distribution which disables all scopes by default, and wants to re-enable them on a case by case basis. This seems like a perfectly reasonable thing to do, but isn't possible with the simplistic first matching wins semantics I've prototyped here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it, I was just thinking further about it, what if the configs weren't tied to a scope and instead got the InstrumentationScopeInfo as a parameter on every config function?

So for example, we'd add configs like so:

SdkTracerProvider.builder()
                .addScopeConfig(TracerConfig.disabled())
                .build())

And then the config would look like the following:

public abstract class TracerConfig {
  //...
  public abstract boolean isEnabled(InstrumentationScopeInfo scope);
}

That way we could also be able to evaluate each config differently where it's read, depending on the case, e.g. for the isEnabled one, we could loop through all the available configs until we found one that returned false, making the span not to get created, or to move forward with the span creation in case no config had disabled the provided scope. And, for the case of the distribution use case, a distribution could provide a single config where it can change its decision dynamically to cover as many or as few (or zero) scopes within its isEnabled impl.

Copy link
Member Author

@jack-berg jack-berg Feb 9, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The evaluation of whether a tracer is enabled will happen on the creation of every span (i.e. the hot path), so we want it to be very fast. The idea you suggest requires evaluation of the rules on the hot path. I think we need to be able to evaluate the rules off the hot path, and be able to update each SdkTracer with a property indicating whether its enabled or disabled so that evaluation is fast.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, fair enough, I wasn't considering the performance penalty.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably having a setter like:

SdkTracerProvider.builder()
    .setTracerConfigProvider(TracerConfigProvider.getDefault())
    .build();

Should that be enough without having to provide a SdkTracerProvider#updateTracerConfigProvider too? - I'm mostly thinking about the dynamic config use-case and I think it should be enough to provide my own TracerConfigProvider once and change its response dynamically later, so maybe it's not needed to provide a setter for a new TracerConfigProvider after the TracerProvider is built?

For when the config provider changes its response and we'd like to update an existing Tracer that's using the old config, maybe we could create a pub/sub mechanism to notify the creator of said Tracer to create a new one with the new config.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should that be enough without having to provide a SdkTracerProvider#updateTracerConfigProvider too?

Not sure. Couple potential issues:

  • TracerConfigProvider would have to be an interface such that its possible to have an implementation which is dynamic.
  • I mentioned above that we would evaluate TracerConfigProvider#getConfig(InstrumentationScopeInfo) each time a tracer is created. So changing the implementation of TracerConfigProvider#getConfig(..) would only affect the Tracers affected after the update. We need some sort of signal that we need to recompute TracerConfigProvider#getConfig(..) for all existing Tracers. Calling a SdkTracerProvider#updateTracerConfigProvider is one way to provide that signal.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it, if I understood correctly, It seems like we're talking about the same type of solution composed by:

  • Single config provider.
  • Config evaluation on tracer/logger/meter creation.
  • Observer pattern to notify existing tracer/logger/meters when the config has changed.

But the implementations are different, the one I'm talking about I think could have TracerConfigProvider as an interface as you mentioned, and then an abstract implementation of it, maybe called ObservableTracerConfigProvider, where there would be an ObservableTracerConfigProvider#notifyConfigChange protected method which could serve as the signal to update the existing Tracers. This would place all the responsibility on the TracerConfigProvider impl to make the dynamic config work properly though, which I'm not sure if it's a good or a bad thing, and I guess it could be annoying having to check the type of the configured TracerConfigProvider to verify its the observable one before subscribing to config changes.

SdkTracerProvider#updateTracerConfigProvider works too though, it's just that it seems uncommon to have setters for already built tracer/meter/logger providers and I'm not sure if that "mutability" might cause some unexpected behavior scenarios, i.e. somebody can call GlobalOpenTelemetry.get().getTracerProvider().updateTracerConfigProvider, even from a library, and turn all the traces off without the host project being aware of it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SdkTracerProvider#updateTracerConfigProvider works too though, it's just that it seems uncommon to have setters for already built tracer/meter/logger providers

I think we've resisted these because up till this point we've punted on dynamic configuration, but once we cross that bridge we'll have no choice but to add setters / update methods on the provider classes.

somebody can call GlobalOpenTelemetry.get().getTracerProvider().updateTracerConfigProvider, even from a library, and turn all the traces off without the host project being aware of it.

Not possible (or rather difficult). GlobalOpenTelemetry obfuscates the underlying OpenTelemetrySdk instance such that you can't access it without using reflection. And if someone is willing to use reflection, then not having setters / update methods won't be an effective deterrent either.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we've resisted these because up till this point we've punted on dynamic configuration, but once we cross that bridge we'll have no choice but to add setters / update methods on the provider classes.

Gotcha.

Not possible (or rather difficult). GlobalOpenTelemetry obfuscates the underlying OpenTelemetrySdk instance such that you can't access it without using reflection. And if someone is willing to use reflection, then not having setters / update methods won't be an effective deterrent either.

Ah, I see! Yeah definitely if someone wanted to use reflection it wouldn't matter anyway. Now that you mentioned the obfuscation, I was taking a look at the current OTel impl and doing some dry runs where I needed to provide dynamic config changes, and I noticed that it seems like the [Singal]Providers are also obfuscated inside OpenTelemetrySdk, which would mean that even if I had my OTel instance handy (without having to get it from GlobalOpenTelemetry), I still wouldn't be able to cast the providers to later call their setters. So I guess the way to make the dynamic config work would require storing the [Signal]Providers somewhere else during the building of the OpenTelemetrySdk instance, and I just wanted to confirm if that's the expected behavior? I'm asking just to get as close as possible to the full picture of what we'd have to do in OTel Android to allow users and/or vendors to provide dynamic configs, since the OTel instance creation is done internally in the Android lib, I guess similarly to what's done in AutoConfiguredOpenTelemetrySdkBuilder, so, in this case, we'd have to provide the real [Signal]Providers somehow over there after the OTel instance is created. Although we currently have a builder option that receives a previously created OpenTelemetry instance, in case some users would like to control its creation and we later take it to send telemetry for them, though in this scenario it seems like the dynamic config would not be possible as is.

Btw I appreciate your patience with all my comments, I know they're a lot, is just that I'm trying to come up with everything that might look like a potential problem for my use case early, though hopefully all of them are false positives or have alternative solutions.

@jack-berg
Copy link
Member Author

I've pushed an update reflecting my current thoughts on the design:

  • Each config provider builder has a set{Signal}ConfigProvider(Function<InstrumentationScopeInfo, {Signal}Config>) (i.e. setTracerConfigProvider(Function<InstrumentationScopeInfo, TracerConfig)
  • A class called ScopeConfig contains helper functions for constructing Function<InstrumentationScopeInfo, {Signal}Config> functions to accommodate a variety of common use cases
  • These helper functions can be used across signals, so there is more familiarity in configuration.
  • These helper functions can allow you to change the default behavior. For example, the default behavior is to enable all scopes, and you can selectively disable scopes of your choosing. But maybe you want to do opposite, and disable all scopes and selectively turn them on.

/**
* Utilities for configuring scopes.
*/
public final class ScopeConfig {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the various Provider interfaces, we never call it Scope independently, but always instrumentationScope. Consider following that same convention here to avoid confusion from other types of scope.

@@ -147,14 +151,26 @@ public SdkTracerProviderBuilder addSpanProcessor(SpanProcessor spanProcessor) {
return this;
}

public SdkTracerProviderBuilder setTracerConfigProvider(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not blocking this PR, but I think we'll need to add addTracerConfigProvider at some point.

I'm trying to model how the following case can be supported (which I consider to be very common):

  • distro wants to enable certain scopes and disable others by default
  • end user comes and wants to enable/disable a few more scopes, but does not want to override the whole distro filter

I assume we'd introduce addTracerConfigProvider.
Another option that autoconfiguration would provide a customizer that would build composite provider. Or distros would be forced to provide some config API to let users customize the config provider.

Leaving autoconfiguration aside, with vanilla addTracerConfigProvider we'd need to come up with conflict resolution approach (one component enables the scope, another disables the same one).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a good point. We could say that if the SDK interprets a response of null from Function<InstrumentationScopeInfo, TracerConfig> as indicating that it has no preference, and to use the SDK system default. And when a user calls addTracerConfigProvider, we could compose a higher order function which calls previously added functions in order, and returns the first non-null value.

Leaving autoconfiguration aside, with vanilla addTracerConfigProvider we'd need to come up with conflict resolution approach (one component enables the scope, another disables the same one).

First config provider with an opinion wins? Then it becomes a distro problem to make sure that it registers its default scopes in the right order with respect to end user customizations.

}

public static TracerConfig enabled() {
return DEFAULT_CONFIG;
Copy link
Contributor

@lmolkova lmolkova Mar 13, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

really minor:

to improve debuggability and extensibility, it could make sense to have a 3-value state: default, enabled and disabled.

Then default state can be (in theory) configured. I assume distros might have different opinions on the default. Also I think there will be a lot of discussions on what default looks like in the spec.

Also we can identify conflicting cases: e.g. if someone explicitly enabled and then disabled scope looks more like an error comparing to someone changed scope config from default to disabled.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤔 the 3-value state makes the behavior harder to understand for users. I think we'd want to come up with a concrete list of use cases, and confirm its not possible to accommodate them with a 2-value state before considering the 3-value state. Simpler is best, but may need to accept additionally complexity if use cases demand it.

jack-berg added a commit to open-telemetry/opentelemetry-specification that referenced this pull request Apr 8, 2024
Fixes #3867.

This introduces the concept of scope-level configuration for Tracer,
Meter, and Logger. Notes:

- TracerProvider, MeterProvider, and LoggerProvider have a scope config
provider function, which accepts an `InstrumentationScope` and returns
the relevant scope config. This function is invoked when a Tracer,
Meter, Logger is created, and the resulting scope config dictates the
behavior of resulting Tracers, Meters, Loggers. Computing the config on
scope creation avoids doing extra work on the hot path.
- If TracerProvider, MeterProvider, LoggerProvider supports updating
configuration, the function can be updated, at which point the scope
config of outstanding Tracer, Meter, Logger will be re-computed. This
accommodates emerging dynamic config use cases.
- A function as the mechanism to determine relevant scope config to
maximize flexibility. However, implementations are encourages to provide
syntactic sugar / helpers for accommodating popular use cases: select
one or more scopes by name or with pattern matching, disable one or more
selected scopes, disable all scopes by default and enable one or more
selective scopes.
- Scope config is simple at the moment, consisting of on the ability to
enable / disable a particular Tracer, Meter, Logger. When a disabled,
use the respective noop Tracer, Meter, Logger.
- Scope config parameters are expected to evolve over time. When it
makes sense, there should be consistency across signals.
- Java prototype available for this proposal:
open-telemetry/opentelemetry-java#6201

Leaving as a draft for now because its not ready to merge, but I am
actively soliciting feedback. If / when we agree on direction, I'll go
back and add the appropriate experimental indications.
@jack-berg jack-berg mentioned this pull request Apr 10, 2024
@jack-berg
Copy link
Member Author

Closing in favor of #6375.

@jack-berg jack-berg closed this Apr 10, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants