Specify how to propagate consistent head sampling probability #168

jmacd · 2021-07-23T21:36:09Z

This OTEP specifies how to propagate head trace sampling probability, so that child spans can be counted on the fly. This specification would allow ParentBased samplers to output sampling probabilities in a way that supports span-to-metrics pipelines, for example.

This OTEP is paired with #170 which discusses additions to the span data model.
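To illustrate what "counted on the fly" could mean for a span-to-metrics pipeline, here is a minimal sketch. It assumes each sampled span carries the head sampling probability it was selected with, and that its adjusted count is the inverse of that probability. The function name `estimated_span_count` is hypothetical and does not come from the OTEP.

```python
from typing import Iterable


def estimated_span_count(sampled_probabilities: Iterable[float]) -> float:
    """Estimate how many spans existed before sampling.

    Each sampled span stands for 1/p spans of the original population,
    where p is the probability it was sampled with (assumed to be > 0).
    """
    return sum(1.0 / p for p in sampled_probabilities)


# Four sampled spans: one kept at 100%, one at 50%, two at 25%.
estimate = estimated_span_count([1.0, 0.5, 0.25, 0.25])
```

Here the four sampled spans represent an estimated 1 + 2 + 4 + 4 = 11 original spans.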

text/trace/0000-sampling-propagation.md

oertl · 2021-07-28T14:44:18Z

@jmacd

> This makes sense. It makes me wonder, though, if we have a way to generate the correct randomness, what's really keeping us from using random bits in TraceID? If we have random bits in the TraceID, the number-of-leading-zeros test can be applied directly, right?

Yes, if the trace ID is specified to be truly random, it would be possible to use it without hashing.

> ...and I gather that the answer is -- it requires less randomness with your proposal. In the version where we add a single byte to convey randomness, the expected number of random bits required is 2 in this case (for the reader, I believe we're talking about the mean of a geometric distribution, https://en.wikipedia.org/wiki/Geometric_distribution, with p=0.5). Can you confirm?

Yes, the given pseudocode generates a geometrically distributed random number with p=0.5 and takes just 2 random bits on average.
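The pseudocode referred to here is not reproduced in this thread. As a rough sketch of the idea (not the OTEP's own code), the value can be drawn by flipping fair bits until the first 1 appears and counting the zeros seen before it. The cap of 62 is taken from the later commit title "r now in [0, 62]"; the function names are hypothetical.

```python
import random


def geometric_r(rng: random.Random, max_r: int = 62) -> int:
    """Count zero bits drawn before the first one bit, capped at max_r.

    For k < max_r, P(r == k) = 2 ** -(k + 1), so P(r >= k) = 2 ** -k.
    """
    r = 0
    while r < max_r and rng.getrandbits(1) == 0:
        r += 1
    return r


def bits_consumed(r: int, max_r: int = 62) -> int:
    """Number of random bits geometric_r drew to produce r."""
    return r + 1 if r < max_r else max_r


rng = random.Random(0)  # seeded only to make the sketch repeatable
samples = [geometric_r(rng) for _ in range(20000)]
mean_bits = sum(bits_consumed(r) for r in samples) / len(samples)
```

With enough samples, `mean_bits` should land close to 2, which is the mean of a geometric distribution with p=0.5 counted in trials.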

What I was trying to say is that all samplers (AlwaysOn, AlwaysOff, Parent, and TraceIDRatio) can be implemented using the same logic as the TraceIDRatio sampler, provided that the sampling rate of the parent is propagated. They differ only in how the sampling rate is chosen for each span. For AlwaysOn it is 1, for AlwaysOff it is 0, for Parent it is the parent sampling rate, and for TraceIDRatio it is the fixed sampling rate. One could therefore say that a sampler should only be responsible for the selection of the sampling rate. The sampling decision is then always made according to the trace ID ratio logic.

To better reflect that in the interface, I would even redefine the sampler interface, which is currently a function that maps a span to a boolean sampling decision. If it were instead defined as a function that maps each individual span to the corresponding sampling rate, the sampler implementations would be very simple. For example, TraceIDRatio would just return its fixed sampling rate and Parent would return the sampling rate of its parent. The final sampling decision would be made outside of the sampler, following a specified and fixed logic based only on the shared random number (or trace ID) and the sampling rate.

The proposed redefined sampler interface would have the following benefits:

- Any sampling decision will be consistent, because the same logic is used to get a sampling decision based on the sampling rate and the shared random number (or trace ID).
- The sampling rate, and therefore the adjusted count, is known for all spans, as the interface enforces returning a sampling rate for every span.
- For estimation and further processing it is sufficient to know the sampling rate. It is not necessary to know which sampler was used to determine it. This makes recording the sampler name (which would introduce additional overhead and which was mentioned in "Probability sampling basics for telemetry events" #148 (comment)) obsolete.
- Furthermore, new samplers which, for example, choose sampling rates more dynamically than TraceIDRatio based on span attributes, are easy to implement and would not break estimation algorithms.
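The interface change described above could look roughly like the following sketch. None of these names exist in the OpenTelemetry API; they are hypothetical. The sketch assumes the shared random number is a value in [0, 1) that every span of a trace sees, and that a root span under the Parent sampler falls back to a rate of 1.0.

```python
from typing import Callable, Optional

# Hypothetical interface: given the parent's sampling rate (None for a root
# span), return the sampling rate in [0, 1] chosen for this span.
RateSampler = Callable[[Optional[float]], float]


def always_on(parent_rate: Optional[float]) -> float:
    return 1.0


def always_off(parent_rate: Optional[float]) -> float:
    return 0.0


def parent(parent_rate: Optional[float]) -> float:
    # Assumption: a root span has no parent rate, so fall back to 1.0.
    return 1.0 if parent_rate is None else parent_rate


def trace_id_ratio(rate: float) -> RateSampler:
    # The fixed rate is returned regardless of the parent.
    return lambda parent_rate: rate


def decide(rate: float, shared_random: float) -> bool:
    """Single, fixed decision logic applied outside of every sampler."""
    return shared_random < rate


def adjusted_count(rate: float) -> float:
    """Number of spans a sampled span represents (requires rate > 0)."""
    return 1.0 / rate


shared = 0.3  # one random value shared by the whole trace
decisions = {
    "always_on": decide(always_on(None), shared),
    "always_off": decide(always_off(None), shared),
    "ratio_0.25": decide(trace_id_ratio(0.25)(None), shared),
    "ratio_0.5": decide(trace_id_ratio(0.5)(None), shared),
}
```

Because every decision compares the same shared value against a rate, a span sampled at some rate is also sampled at every higher rate, which is the consistency property listed in the benefits.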

jmacd · 2021-07-28T19:47:17Z

@oertl Thank you. I definitely understand your proposal. You've described a simplification in the Sampler interface, which is an SDK specification topic that I was hoping to avoid. I agree with your summary, the overall Sampler outcome would be easier to reason about with the change you suggest, but I am not sure that legacy users of the API will agree. The amount of energy it will take to make this happen may exceed the benefit produced.

I am trying to keep these OTEPs separate:

- #170 is about the data model for Span (formerly known as #148)
- #168 is about uses of the W3C trace context (this PR)

Changing the Sampler API would be another issue entirely.

I will revise this PR based on the discussion in #168 (comment), which means combining the two pieces of additional information that go into the traceparent to both include the head probability and the randomly distributed value, as they are both apparently needed to complete a functional span-to-metrics pipeline.

yurishkuro · 2021-07-28T19:53:31Z

Personally, I would prefer to separate the propagation proposal from the W3C format discussion. Whatever happens with this OTEP will not have any (formal) impact on the W3C spec; that discussion would need to happen there. On the other hand, the actual functionality does not depend on changing the W3C format, because it can be achieved by using tracestate, which would be completely in the hands of OTEL and, if adopted and successful, would serve as a strong motivation for upgrading the traceparent spec.

jmacd · 2021-07-28T19:56:03Z

@oertl and @yurishkuro

> This makes recording the sampler name (which would introduce additional overhead and which was mentioned here -- #148) obsolete.

In my proposal, in the text of #170 (now), the sampler.name attribute is only a MUST when the adjusted count is unknown, which would only result from legacy trace contexts (if we succeed). I see no reason to record the sampler.name otherwise, especially if as you suggest the TraceIDRatio decision is used consistently.

jmacd · 2021-07-28T21:48:53Z

@yurishkuro Thank you for your guidance. I have revised this proposal only to use tracestate.

@oertl I have revised this to include your proposal with a syntax like:

tracestate: otelprob=PPRR

where PP is the base16 probability value and RR is the base16 randomness value. Please take another look.
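As a sketch of how a receiver might decode this draft syntax, the snippet below splits a tracestate header into its comma-separated members and reads the two base16 fields. It only decodes the numbers; how PP and RR are interpreted is defined by the OTEP text, not by this comment. The helper names `parse_otelprob` and `format_otelprob` are hypothetical, and the `otelprob=PPRR` form shown here is the draft from this revision, which may differ from what was finally merged.

```python
from typing import Tuple

HEX_DIGITS = "0123456789abcdefABCDEF"


def parse_otelprob(tracestate: str) -> Tuple[int, int]:
    """Return (PP, RR) decoded from an 'otelprob=PPRR' tracestate member."""
    for member in tracestate.split(","):
        key, sep, value = member.strip().partition("=")
        if key == "otelprob" and sep:
            # Expect exactly two hex digits per field.
            if len(value) != 4 or any(c not in HEX_DIGITS for c in value):
                raise ValueError(f"malformed otelprob value: {value!r}")
            return int(value[:2], 16), int(value[2:], 16)
    raise KeyError("otelprob")


def format_otelprob(pp: int, rr: int) -> str:
    """Encode two values in [0, 255] as an 'otelprob=PPRR' member."""
    if not (0 <= pp <= 0xFF and 0 <= rr <= 0xFF):
        raise ValueError("each field must fit in two hex digits")
    return f"otelprob={pp:02x}{rr:02x}"


pp, rr = parse_otelprob("vendor=abc,otelprob=031f")
```

In this example `pp` is 3 and `rr` is 31, and `format_otelprob(3, 31)` produces the member `otelprob=031f` again.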

text/trace/0168-sampling-propagation.md

bogdandrutu

One big question that this OTEP doesn't answer for me is whether this is a required feature. I feel that it needs to be required in order to work (especially when the trace traverses multiple customers), but I also feel it adds overhead and complicated logic for cases where the customer does not care about this.

text/trace/0168-sampling-propagation.md

bogdandrutu · 2021-09-15T07:18:12Z

@jmacd please fix the check-errors.

jmacd · 2021-09-15T18:43:30Z

> One big question that this OTEP doesn't answer for me is whether this is a required feature.

Discussed in "Default behavior". I'd prefer this to be on by default, but with the present tracestate solution it's expensive. Maybe after there's a W3C traceparent where this costs 1-3 bytes per context, this can be on by default. I'll accept off-by-default for now; Lightstep's OTel launchers would turn this on by default. I imagine "yet another" environment variable to say whether the TraceIDRatioBased sampler generates r or not.

bogdandrutu · 2021-09-16T10:31:59Z

@jmacd where should we discuss and agree on whether or not this is the default behavior?

carlosalberto · 2021-09-16T14:52:25Z

@bogdandrutu Just FYI, there's a SIG call in 8 minutes, although I don't know how packed you are today ;)

text/trace/0168-sampling-propagation.md

bogdandrutu

Unblocking, since @jmacd promised that the default behaviors will be discussed in specs.

jmacd · 2021-09-28T15:22:56Z

> default behaviors will be discussed in specs

Yes. FWIW, I've already changed this OTEP to state that the default will be opt-in. We're just going to debate the name and form of the option.

jmacd added 4 commits July 23, 2021 14:25

Specify how to propagate head sampling probability

14bd54e

edit

1d5d60a

version

c741f7e

links to OTEP 148 are TODOs

6adbd1a

jmacd requested review from a team July 23, 2021 21:36

jmacd mentioned this pull request Jul 23, 2021

Probability sampling basics for telemetry events #148

Closed

oertl reviewed Jul 24, 2021

text/trace/0000-sampling-propagation.md

carlosalberto reviewed Jul 26, 2021

text/trace/0000-sampling-propagation.md

carlosalberto reviewed Jul 26, 2021

text/trace/0000-sampling-propagation.md

jmacd mentioned this pull request Jul 27, 2021

Probability sampling: Encode Span's head-adjusted count #170

Merged

jmacd added 2 commits July 27, 2021 13:45

rename

11206d7

Add a tracestate variation

4085972

jmacd added 2 commits July 28, 2021 14:39

redraft using tracestate and two values

5cd3b9a

edits

5aedc9c

jmacd changed the title ~~Specify how to propagate head sampling probability~~ Specify how to propagate consistent head sampling probability Jul 28, 2021

jmacd added 2 commits July 28, 2021 15:01

Drop mention of inflationary

32544ea

detail about samplers

aa22609

yurishkuro reviewed Jul 28, 2021

text/trace/0168-sampling-propagation.md

yurishkuro reviewed Jul 28, 2021

text/trace/0168-sampling-propagation.md

yurishkuro reviewed Jul 28, 2021

text/trace/0168-sampling-propagation.md

yurishkuro reviewed Jul 28, 2021

text/trace/0168-sampling-propagation.md

oertl approved these changes Jul 29, 2021

edit

73f3b6f

jmacd mentioned this pull request Jul 29, 2021

Complete the TraceIdRatio specification open-telemetry/opentelemetry-specification#1826

Open

bogdandrutu requested changes Sep 15, 2021

jmacd mentioned this pull request Sep 15, 2021

Composite Sampler open-telemetry/opentelemetry-specification#1844

Open

jmacd added 2 commits September 15, 2021 11:33

lint

3097dcb

lint

0acc729

jmacd mentioned this pull request Sep 16, 2021

Introduce sampling score and propagate it with the trace #135

Closed

yurishkuro reviewed Sep 21, 2021

text/trace/0168-sampling-propagation.md

Remove log_head_adjusteed_count; remove the +1 bias for p-values; r now in [0, 62]

fa2ded1

bogdandrutu mentioned this pull request Sep 21, 2021

Randomness flag bit w3c/trace-context#467

Closed

jmacd and others added 7 commits September 21, 2021 10:39

Use 7/16

d119c57

Use 7/16

5ea047e

Use 7/16

28779fe

Merge branch 'main' into jmacd/traceprop

04b37e4

5%

32c384e

Merge branch 'jmacd/traceprop' of github.com:jmacd/oteps into jmacd/traceprop

efc4bb0

mention w3c trace context issue 467 (randomess bit); move issue 463 to default-on discussion

f6ffd02

bogdandrutu approved these changes Sep 28, 2021

whitespace

0a296b5

jmacd merged commit dee7389 into open-telemetry:main Sep 29, 2021

jmacd mentioned this pull request Sep 29, 2021

Add sampling constant w3c/trace-context#468

Draft

PeterF778 mentioned this pull request Jun 14, 2022

REQUEST: New membership for PeterF778 open-telemetry/community#1077

Closed

oertl mentioned this pull request Jun 14, 2022

REQUEST: New membership for @oertl open-telemetry/community#1078

Closed

6 tasks

carlosalberto pushed a commit to carlosalberto/oteps that referenced this pull request Oct 23, 2024

Specify how to propagate consistent head sampling probability (open-telemetry#168)

0255d7e

carlosalberto pushed a commit to carlosalberto/oteps that referenced this pull request Oct 23, 2024

Specify how to propagate consistent head sampling probability (open-telemetry#168)

9f55a26

carlosalberto pushed a commit to carlosalberto/oteps that referenced this pull request Oct 30, 2024

Specify how to propagate consistent head sampling probability (open-telemetry#168)

51e28d5

carlosalberto pushed a commit to open-telemetry/opentelemetry-specification that referenced this pull request Nov 8, 2024

Specify how to propagate consistent head sampling probability (open-telemetry/oteps#168)

1aa28f9
