-
Notifications
You must be signed in to change notification settings - Fork 897
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Composite Sampler #1844
Comments
/cc @carlosalberto |
I feel this is connected with the topic of "View configuration" in the metrics SDK: |
I probably know too little about metrics, but I cannot see any connection between |
People use "Sampler" to describe a standard mechanism for selecting which spans will be reported. People use "View" to describe a standard mechanism for selecting which metrics will be reported. |
After reading @iNikem's comment on open-telemetry/oteps#175, I will take a stronger position here. The term "non-probability sampler" is an oxymoron. Outside of OpenTelemetry, "Sampling" refers to a statistical process used to estimate something, and if there aren't probabilities involved, what you have may be called "Subjective analysis". We are stuck with the term "Sampling" in OTel being used to mean both filtering and excluding spans from the population (what I call "View construction") and also the statistical process of counting the non-excluded non-filtered spans (what I call "Probability sampling", a form of span aggregation). OTEPs 170 and 168 are trying to finish this second aspect of what "Sampling" means, and you are asking to refine the first aspect of what Sampling means. In that sense, your comment open-telemetry/oteps#175 (comment) makes sense. It's confusing that the document talks about composition but doesn't answer to the need for View construction. @iNikem asked above:
OTEP 170 specifies how to compose Samplers using logical-OR, which is useful in the sense that it allows composition of a probability sampler with a non-probability sampler (i.e., probability sampling plus subjective analysis). The concept of "zero adjusted count" is critical to understanding how we compose probability samplers and non-probability samplers. Combining probability sampling with AND- and short-circuiting goes against the goals of probability sampling. Any span that is excluded before a probability sampler sees it is effectively uncounted, so placing filters in front of a Sampler distorts the result. I'm not convinced there's a real need to specify AND- and short-circuit types of composition, that doesn't seem like a natural way to configure what (IMO) users want to configure. The form of sampler configuration we've seen from e.g., Jaeger remote sampling and AWS xRay look like a single Sampler with a list of rules that are internally followed, with short-circuiting. In pseudo-code you have a probability lookup like:
My understanding is that users would like a way to configure this sort of logic, but it won't be through composition of multiple samplers, it will be through configuring one sophisticated sampler. The reason we talk about composing a probability and non-probability sampler is that we are aware of adaptive schemes that combine the two (open-telemetry/oteps#168 (comment)). Please treat the review of open-telemetry/oteps#168 and open-telemetry/oteps#175 as an independent topic from more general forms of View construction for spans. It's possible in the distant future that OTel will evolve support for computing multiple samples in parallel with multiple independent head and tail probabilities. That will offer meaningfully more ways to sample and track down rare conditions in large-scale systems, but I don't think users will be asking for that soon. Besides, all of this discussion points to a larger problem with configuration in general; for now we do not expect to configure sophisticated samplers through the environment, we expect to use "jaeger_remote" or similar to select a module of code that performs the above logic and returns one of the built-in samplers. Please approve the two OTEPs so that we can proceed with necessary work on the built-in samplers. |
Thank you @jmacd for this explanation, I now understand current landscape much better. I can agree with your suggestion that non-probabilistic samplers don't need to be composed and my problem of span "filtering" can be solved with one exhaustive Sampler. This approach will resolve this current issue. @anuraaga, taking this into account, should we change https://github.com/open-telemetry/opentelemetry-java-contrib/blob/53e8a363afe1d5ec6cd7afdf4285e4bf8dbc63b3/contrib-samplers/src/main/java/io/opentelemetry/contrib/samplers/RuleBasedRoutingSampler.java#L38 to accept fallback decision instead of fallback sampler? @jmacd Do you think that we need to introduce a new entity, |
#2047 specifies the rules for composing samplers that I believe we need. However, we discovered a problem in the way the sampler APIs interact, discussed in #2179. I would like to see #2047 merge, after that point I believe we can implement a composite sampler based on the rules there. Please review and approve #2047 -- we are blocked. |
Let's imagine for a second that #1597 is resolved and we have a sampler that drops server spans based on the url. It is safe to assume that for any particular configured Otel SDK we still will need one more sampler, to decide the fate of all other spans. This is already a second example of "composite samplers", the first being existing ParentBased sampler.
This raises several questions:
OTEL_TRACES_SAMPLER
be expanded to allow for configuration of sampler composition?The text was updated successfully, but these errors were encountered: