Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Review approach & specify algorithm for TraceIdRatioBasedSampler (ProbabilitySampler) #1413

Open
Oberon00 opened this issue Feb 9, 2021 · 29 comments · May be fixed by #4166
Open

Review approach & specify algorithm for TraceIdRatioBasedSampler (ProbabilitySampler) #1413

Oberon00 opened this issue Feb 9, 2021 · 29 comments · May be fixed by #4166
Labels
area:sampling Related to trace sampling area:sdk Related to the SDK release:after-ga Not required before GA release, and not going to work on before GA sig-issue A specific SIG should look into this before discussing at the spec spec:trace Related to the specification/trace directory

Comments

@Oberon00
Copy link
Member

Oberon00 commented Feb 9, 2021

See also discussion on #1412 (comment)

The sampling alorithm for TraceIdRatioBasedSampler is unspecified. As a result, trace IDs that are sampled by some implementations might get non-sampled or re-sampled by SDKs in other languages, even though they have the same or a a higher probability than the parent.

TODO list for this issue:

  1. Is this a problem at all? Or is the ParentBased approach enough (in combination with any, not necessarily trace-id-based probability-based sampling of root spans)?
  2. If it is a problem, since trace IDs can come from untrusted, non-random sources, do we open up a DDoS/Security/performance issue when using trace IDs as sole, deterministic input for our sampling algorithm? Do we need to put a warning there? Let's assume it is no problem for this issue, this should be handled in Support restarting the trace with a different trace ID #1188.
  3. If applicable after checking the above we determine that we need a consistent algorithm, actually do specify it (maybe based on Add probability sampler details #331).
@Oberon00 Oberon00 added area:sdk Related to the SDK area:sampling Related to trace sampling spec:trace Related to the specification/trace directory release:after-ga Not required before GA release, and not going to work on before GA labels Feb 9, 2021
@bogdandrutu

This comment has been minimized.

@Oberon00

This comment has been minimized.

@bogdandrutu
Copy link
Member

bogdandrutu commented Feb 9, 2021

Is this a problem at all? Or is the ParentBased approach enough (in combination with any, not necessarily trace-id-based probability-based sampling of root spans)?

I've seen implementation that use "inflationary sampling probability", and the algorithm is important for them. For example if Service A (0.2% sampling) -> Service B (0.4% sampling) and you want to end up with about 0.2 full traces and another 0.4 (at Service B level, including the Service A traces).

The algorithm that I've seen (I think @jmacd was one of the author) is that if you have the same algorithm across Services (which means across languages as well) that has a deterministic implementation, and also ensures that every trace sampled at a lower rate will be sampled at a higher rate, then you can achieve that.

traceId1 = "..." // sampled at 0.2 rate
traceId2 = "..." // sampled at 0.4 rate, but not at 0.2
traceId3 = "..." // not sampled at 0.4 rate

ServiceASampler = new TraceIdBasedSampler(0.2):

traceId result
1 true
2 false
3 false

ServiceBSampler = new TraceIdBasedSampler(0.4):

traceId result
1 true
2 true
3 false

@jmacd
Copy link
Contributor

jmacd commented Mar 15, 2021

See open-telemetry/oteps#148

@jmacd
Copy link
Contributor

jmacd commented May 26, 2021

The algorithm that I've seen (I think @jmacd was one of the author)

FYI for the record this wasn't my doing. This "inflationary" sampling technique predated me on that project. 😀

@MrAlias
Copy link
Contributor

MrAlias commented Sep 12, 2022

The Go implementation of this algorithm has lead to an incompatibility Amazon XRay Trace IDs. The first 4 bytes of XRay Trace IDs are time based and the Go ratio sampler expects these bytes to be random.

It would be advantageous to the Go SIG, and likely others, if we could resolve this issue. That way, when we change our algorithm we will only need to do so once.

@MrAlias
Copy link
Contributor

MrAlias commented Sep 12, 2022

@Aneurysm9 had mentioned we switch to "hashing" the middle part of the trace ID to make a sampling decision.

@MrAlias
Copy link
Contributor

MrAlias commented Sep 12, 2022

Current Go implementation for reference:

func (ts traceIDRatioSampler) ShouldSample(p SamplingParameters) SamplingResult {
	psc := trace.SpanContextFromContext(p.ParentContext)
	x := binary.BigEndian.Uint64(p.TraceID[0:8]) >> 1
	if x < ts.traceIDUpperBound {
		return SamplingResult{
			Decision:   RecordAndSample,
			Tracestate: psc.TraceState(),
		}
	}
	return SamplingResult{
		Decision:   Drop,
		Tracestate: psc.TraceState(),
	}
}

@dyladan
Copy link
Member

dyladan commented Sep 13, 2022

@MrAlias that is why the w3c specification is actually moving to make it explicit which bytes should be random. You can see the draft here https://github.com/w3c/trace-context/blob/main/spec/20-http_request_header_format.md#trace-id

The reason we used 7 and not 8 bytes is that the 8th byte contains the sign bit. This is the same reason go has the method to generate a 63 bit random number.

@MrAlias
Copy link
Contributor

MrAlias commented Sep 13, 2022

@MrAlias that is why the w3c specification is actually moving to make it explicit which bytes should be random. You can see the draft here https://github.com/w3c/trace-context/blob/main/spec/20-http_request_header_format.md#trace-id

The reason we used 7 and not 8 bytes is that the 8th byte contains the sign bit. This is the same reason go has the method to generate a 63 bit random number.

Ah, super helpful! Thanks 🙏

@MrAlias
Copy link
Contributor

MrAlias commented Sep 13, 2022

Is the plan for OTel to just adopt this ^ when it lands?

@MrAlias
Copy link
Contributor

MrAlias commented Sep 13, 2022

@MrAlias that is why the w3c specification is actually moving to make it explicit which bytes should be random. You can see the draft here https://github.com/w3c/trace-context/blob/main/spec/20-http_request_header_format.md#trace-id

The reason we used 7 and not 8 bytes is that the 8th byte contains the sign bit. This is the same reason go has the method to generate a 63 bit random number.

@dyladan does this mean that the AWS XRay TraceID which uses a non psudo-random value for the left-most 4 bytes (based on the time) would not be W3C complaint?

Looking into switching the Go implementation to the mentioned algorithm, I think the XRay spans would continue to be sampled in a non-random manner given this static prefix.

cc @Aneurysm9

@Aneurysm9
Copy link
Member

Trace IDs are 16 bytes, so there should be no issue.

|  0 1 2 3 | 4 5 6 7 | 8 | 9 a b c d e f |
      A         B      C         D

X-Ray IDs will have non-random data in region A (bytes 0-3) and random data in regions B, C, and D (bytes 4-f). The w3c proposal is to guarantee that region D (bytes 9-f) contain random data. The current implementation uses regions A and B, shifted right by one place so it is effectively bytes 0-6:

x := binary.BigEndian.Uint64(p.TraceID[0:8]) >> 1

Instead, we could use region D directly:

x := binary.BigEndian.Uint64(p.TraceID[9:])

This would also make the sampler safe to use with 64-bit trace IDs still generated by some legacy systems.

@jmacd
Copy link
Contributor

jmacd commented Sep 14, 2022

I am expecting us to use the W3C trace context "random" flag to address this issue:
w3c/trace-context#474
@dyladan can you provide guidance?

@dyladan
Copy link
Member

dyladan commented Sep 14, 2022

We spoke about this at the w3c meeting yesterday actually. I was going to bring it up at the next maintainers meeting. The level 2 spec for trace context was delayed by some extended summer vacations but is about to go into wide review for publication as a recommendation. Obviously until something is an official recommendation the working group can't guarantee anything, but we do not expect any major changes. Here are the important points:

  1. The current spec requires that the rightmost 7 bytes MUST be random if the random bit is set. 7 Was chosen because 8 would have included the sign bit for some methods of generating random numbers.
  2. If the random bit is not set, there are no guarantees.
  3. The exact number of bytes is not expected to change, but is not guaranteed until the wide review is complete. During the review it is conceivable that someone challenges the number we have chosen (either to say 7 is too much and wastes resources of a RNG or to say that it is not enough to cover some usecase). The farther right in the ID you go, the "safer" the assumption is that the random bit will apply to the bits/bytes in question.
  4. The "rightmost" semantic is extremely unlikely to change due to the prior art of some tracing systems using the left bits for nonrandom data, and 0-padding short trace ids is done on the left side.

@dyladan does this mean that the AWS XRay TraceID which uses a non psudo-random value for the left-most 4 bytes (based on the time) would not be W3C complaint?

No. The randomness requirement only applies to traceparents where the random flag is set to 1 for backwards compatibility reasons. Also, the nonrandom bits for xray are the leftmost bits and the flag only applies to the rightmost bits. I am not sure how the rest of the ID is generated.

edit: Missed that @Aneurysm9 clarified the region in question is random.

@dyladan can you provide guidance?

I would feel safe using the rightmost 7 bytes as my random number, and the fewer rightmost bytes that are used the safer I would feel. Hashing the random part of the ID or the whole ID would be another way to guarantee safety, but also comes at the cost of implementation complexity (and ensuring all implementations are the same).

I would probably recommend to restrict to inverse power of 2 sampling probabilities (1/2, 1/4, 1/8, etc) which would allow you to use the minimum number of bits without hashing. For example, a 50% sampling rate only needs the single rightmost bit, a 25% sampling rate needs only the rightmost 2 bits, etc. This has also been discussed to have other benefits with respect to @jmacd's and @oertl's probability propagation proposal.

edit: alternative to power of 2 restriction would be restricting probability to a whole number percent. Only 7 bits are required to represent every whole number from 1 to 99

@jmacd
Copy link
Contributor

jmacd commented Sep 16, 2022

@dyladan Thank you. I am pleased to see that we are nearly ready to adopt the random bit for w3c traceparent. Those familiar with the current tracestate-based proposal for probability sampling may remember that we would be able to eliminate the r variable once we had a consistent random source. Meanwhile, the p variable would still compactly propagate the power-of-two sampling probabilities (for use in span-to-metrics pipelines), but that if we had "plenty" of random bits available we could easily extend that specification to accommodate propagating non-powers-of-two sampling probabilities.

If we have 7-bytes of randomness, it means we can agree to a consistent method to evaluate TraceIDRatioBased sampling policy like the OTel-Go example in #1413 (comment).

@dyladan
Copy link
Member

dyladan commented Sep 16, 2022

p value propagation I hope is next in level 3. for now it was agreed to leave it out of the level 2 spec because the r value was less controversial and less likely to get held up in wider review

@jmacd
Copy link
Contributor

jmacd commented Sep 22, 2022

We discussed this in the 9/22 Sampling SIG. The action items were loosely discussed and will continue in the next SIG (10/6).

My opinions, roughly, are:

  1. Assuming W3C gives us 56 bits of randomness, compute a sampling threshold T = S * 2^57 in the range [0, 2^57].
  2. When a W3C tracecontext w/o the new random flag is sampled, SDKs should use an unspecified hashing algorithm on the TraceID to construct 56 questionably-random bits. They should not expect consistent sampling, in this case, and should restrict usage to root-only sampling.
  3. Take the 56 bits of randomness and construct a 56-bit number R in a specified, consistent manner, return the sampling decision if R < T.

The same decision would be returned by a Consistent Probability Sampler as in the experimental specification here: https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/trace/tracestate-probability-sampling.md.

Moreover, the current experimental specification can be updated to rely on the W3C randomness bit, which is a huge improvement for us (thank you W3C TraceContext group, thank you @dyladan!) as follows:

  1. R-value is no-longer needed, can be dropped (assuming we do not support probability sampling for TraceIDs without the random flag). The smallest sampling probability supported becomes 2^-57
  2. P-value would be used when the threshold T equals a power-of-two (i.e., has one bit set)
  3. a new T-value would be used when the threshold T is not a power-of-two, encoded presumably in hexadecimal. To express a sampling rate like 3-in-4, we want a threshold like T=c0000000000000 (which makes me want to drop trailing zeros of the threshold, so 3-in-4 could be encoded as just T=c).

cc: @oertl @spencerwilson @PeterF778 @kalyanaj @dyladan

@dyladan
Copy link
Member

dyladan commented Sep 22, 2022

Thanks for the update. @jmacd mind sharing what the reasoning was behind using 56 bits? Is it just because that is the number the W3C already has in the draft spec or was there some need for sampling thresholds with that level of randomness?

@jmacd
Copy link
Contributor

jmacd commented Sep 22, 2022

I do not believe anyone requires 56-bits of sampling precision. I'm interested in what others think is a good value, maybe 16 or 20 bits will do.

@oertl
Copy link
Contributor

oertl commented Sep 23, 2022

@jmacd

I do not believe anyone requires 56-bits of sampling precision. I'm interested in what others think is a good value, maybe 16 or 20 bits will do.

I am not sure if 16 bits are enough if really arbitrary sample rates should be supported. With 16 bits, the smallest possible sampling rate would be 1/2^16 = 0.00001525878 and the second smallest possible sampling rate would be 2/2^16 = 0.00003051757. There is a large relative gap between these two sampling rates.

@oertl
Copy link
Contributor

oertl commented Sep 23, 2022

@jmacd

  1. When a W3C tracecontext w/o the new random flag is sampled, SDKs should use an unspecified hashing algorithm on the TraceID to construct 56 questionably-random bits. They should not expect consistent sampling, in this case, and should restrict usage to root-only sampling.

If the hashing algorithm is not specified, the SDKs can simply use any random number. Using the trace ID is not very useful in this case.

@PeterF778
Copy link

For an assessment on the lower end of needed probabilities, let's look at one example. Google handles about 100 billion requests daily. If we design long term storage for traces which decreases traces cardinality as the data gets older, and keeps only 1000 traces per day (for example, for data older than 5 years), we need probability of about 2^-28. So, I'd say, we need at least 31 random bits, and this is still playing with chances.

@kalyanaj
Copy link

kalyanaj commented Oct 6, 2022

When a W3C tracecontext w/o the new random flag is sampled, SDKs should use an unspecified hashing algorithm on the TraceID to construct 56 questionably-random bits.

When the new random flag is NOT set, couldn't we still require SDKs to use the SAME hashing algorithm (instead of an unspecified algorithm)? i.e., a best effort in treating that the same last set of bytes in traceID is random...

Reasoning: My understanding is that the TraceID generated by many systems today, though not required by the Level 1 of the W3C TraceContext spec, have their rightmost bytes randomly generated. So, shouldn't we do a "best effort" consistent probability sampling as adoption of the new flag can take a while (W3C TraceContext level 2 spec has to get to recommendation stage, implementations have to adopt it etc.).?

@jmacd
Copy link
Contributor

jmacd commented Oct 6, 2022

This was discussed by me @oertl @PeterF778 @kalyanaj and @kentquirk in the Sampling SIG today. Notes:

The W3C random flag is anticipated to go ahead, but we also expect it could be a year-or-so before it is deployed in OTel SDKs.

We debated whether 56-bits, 48-bits, or 32-bits of randomness would be preferred. There is not a strong preference between 48 and 56, but we think 32 bits is not sufficient.

There was a brief question of whether we might wish to reserve some (e.g., 8) bits of the TraceID for future/alternative use on the assumption that 16-bytes is more than sufficient for global uniqueness (provided 48 bits are truly-random bits). For example, we could directly encode today's powers-of-two-sampling r-value using 6 bits of the TraceID, which drastically reduces the amount of randomness required per trace for consistent probability sampling. (Why: the step to generate r-value uses 2 (expected) random bits per TraceID.). On the other hand, with W3C support we could simply append extra bytes to the traceparent header to reduce the cost of randomness by a logarithmic factor.

We discussed hashing approaches and were reminded why we don't like them (they're expensive, faulty, and not portable, see https://github.com/rurban/smhasher).

We discussed how to test for suitably random TraceIDs, it could follow this previous work: https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/trace/tracestate-probability-sampling.md#appendix-statistical-test-requirements

We arrived at the following recommendation to address this issue. Assume that the least-significant 7 bytes of the TraceID are random _as though the anticipated W3C random flag were set. Thus, we will generate a 56-bit number and follow exactly the recommendation made by @Aneurysm9, i.e.,

This is compatible with X-ray and we believe it is compatible with all existing OTel SDKs. The work remaining for this issue, if the proposal is accepted, will be to update the Trace SDK specification with details. The TraceIDRatioBasedSampler.ShouldSample() logic uses

traceThreshold := uint64(samplingProbability * (1 << 57))

func ShouldSample() bool {
   traceValue := binary.BigEndian.Uint64(p.TraceID[9:])
   return traceValue < traceThreshold
}

The group reasons that this is no worse than doing nothing at all, and assuming that the W3C proposal does not change this is also forward-compatible.

This traceThreshold shown above can be used to convey non-power-of-two adjusted counts. They could be propagated (as described in my comment above) using a tracestate t-value with an update to https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/trace/tracestate-probability-sampling.md.

@PeterF778
Copy link

During our group meeting on Oct 20, @PeterF778, @kalyanaj, @spencerwilson and @kentquirk identified some use cases which are easier to handle if sampling decisions are based on r-value generated independently from the trace-id. One is with tracking user sessions and the other is with linked traces.

Assuming non-instrumented browser, user sessions involve several requests to some backend, each generating a new trace. It is beneficial to keep all these traces consistently sampled. This can be achieved by generating the r-value for the first request as usual, but reusing it for all traces belonging to the same session. This requires a mapping from the session-id to the r-value, which can be technically challenging, but should be feasible.

Linked traces can be used in a number of ways, but one typical use case is when one trace leaves a message in queue to be picked up by another trace hours or days later. The root span of the consuming trace links itself to the producing trace. Again, it is beneficial to make the same sampling decisions for both traces. This can be helped by the consuming trace cloning the r-value from the producing trace.

@jmacd
Copy link
Contributor

jmacd commented Nov 17, 2022

@PeterF778 would you agree that would be possible for an SDK to do what you described by fixing the 7 bytes of random TraceID and then generating multiple correlated TraceIDs from the single random source?

Can you see any problems that might result from avoiding r-value, in that case?

@PeterF778
Copy link

@PeterF778 would you agree that would be possible for an SDK to do what you described by fixing the 7 bytes of random TraceID and then generating multiple correlated TraceIDs from the single random source?

Can you see any problems that might result from avoiding r-value, in that case?

In the SDK trace-id is generated automatically when the root span is created. There are no mechanisms that could be used to customize this behavior. In contrast, the r-values are created within calls to Sampler.shouldSample() where span attributes are available.

Reusing the 7 random bytes of trace id remains a theoretical possibility, but it would be very hard to implement. Even if we could somehow get it to work, such a change could break some vendors' features if they assume uniqueness of these 7 bytes.

jpkrohling added a commit to open-telemetry/opentelemetry-collector-contrib that referenced this issue Jan 31, 2024
…29720)

**Description:** This is the `pkg/sampling` portion of of
#24811.

**Link to tracking Issue:** 
#29738

open-telemetry/opentelemetry-specification#1413

**Testing:** Complete.

**Documentation:** New README added.

---------

Co-authored-by: Juraci Paixão Kröhling <juraci.github@kroehling.de>
Co-authored-by: Kent Quirk <kentquirk@gmail.com>
cparkins pushed a commit to AmadeusITGroup/opentelemetry-collector-contrib that referenced this issue Feb 1, 2024
…pen-telemetry#29720)

**Description:** This is the `pkg/sampling` portion of of
open-telemetry#24811.

**Link to tracking Issue:** 
open-telemetry#29738

open-telemetry/opentelemetry-specification#1413

**Testing:** Complete.

**Documentation:** New README added.

---------

Co-authored-by: Juraci Paixão Kröhling <juraci.github@kroehling.de>
Co-authored-by: Kent Quirk <kentquirk@gmail.com>
@austinlparker austinlparker added the sig-issue A specific SIG should look into this before discussing at the spec label Apr 23, 2024
@austinlparker
Copy link
Member

@jmacd Is this something for sampling SIG?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area:sampling Related to trace sampling area:sdk Related to the SDK release:after-ga Not required before GA release, and not going to work on before GA sig-issue A specific SIG should look into this before discussing at the spec spec:trace Related to the specification/trace directory
Projects
None yet
10 participants