How sampler.type=remote works #832

xihw · 2018-05-18T02:57:27Z

"Remote (sampler.type=remote, which is also the default) sampler consults Jaeger agent for the appropriate sampling strategy to use in the current service. This allows controlling the sampling strategies in the services from a central configuration in Jaeger backend, or even dynamically (see Adaptive Sampling)."

This is excerpted from the Jaeger docs, and I find it pretty confusing. Can you help me understand it by answering the following questions?

1. "consults Jaeger agent for the appropriate sampling strategy": As far as I know, there are two places to configure the sampling rate: jaeger-client and jaeger-collector. What role does the Jaeger agent play here?
2. "This allows controlling the sampling strategies in the services from a central configuration in Jaeger backend": Does "a central configuration in Jaeger backend" mean jaeger-collector?
3. What if we use zipkin-client + Jaeger backend (jaeger-collector + jaeger-ui + storage)? In this case we don't have jaeger-agent running, so how does the "remote consulting" work?
4. A follow-up question on 3: without jaeger-agent, how is batching handled? According to Zipkin's doc (https://zipkin.io/pages/architecture.html), I interpret "Transport" as "jaeger-agent". In the zipkin-client + Jaeger backend scenario, do we discard "Transport"?

yurishkuro · 2018-05-18T03:20:50Z

1. agent proxies the requests to the collector, so that the client does not need to know where collectors are located (agent is usually on the localhost)
2. yes, configuration (and soon adaptive calculations) come from the collectors, but clients receive them via agent
3. remotely controlled samplers are only supported by Jaeger clients, not Zipkin clients.
4. Not sure which "batch" you are referring to.
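
The first two answers can be pictured as a small in-process model. This is only a sketch: the classes `Collector`, `Agent` and `Client`, their methods, and the service name `checkout` are hypothetical stand-ins rather than Jaeger APIs, and the real components talk to each other over the network.

```python
from dataclasses import dataclass, field


@dataclass
class Collector:
    """Holds the central sampling configuration (hypothetical model)."""
    strategies: dict = field(default_factory=dict)
    default_rate: float = 0.001  # illustrative default, an assumption

    def get_sampling_strategy(self, service: str) -> float:
        return self.strategies.get(service, self.default_rate)


@dataclass
class Agent:
    """Runs next to the service and forwards requests to a collector."""
    collector: Collector

    def get_sampling_strategy(self, service: str) -> float:
        # The agent owns no strategy itself; it only proxies the call.
        return self.collector.get_sampling_strategy(service)


@dataclass
class Client:
    """Knows only its local agent, never the collector's address."""
    service: str
    agent: Agent

    def refresh_rate(self) -> float:
        return self.agent.get_sampling_strategy(self.service)


collector = Collector(strategies={"checkout": 0.2})
client = Client(service="checkout", agent=Agent(collector))
print(client.refresh_rate())  # 0.2
```

The point of the indirection is that only the agent needs to be configured with collector addresses; the client always talks to a fixed local endpoint.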

xihw · 2018-05-18T03:45:49Z

Okay, based on the two configurations mentioned by:
https://www.jaegertracing.io/docs/sampling/#ClientSamplingConfiguration
https://www.jaegertracing.io/docs/sampling/#CollectorSamplingConfiguration
and considering the following scenarios, can you tell me what percentage of spans is reported from the service to the agent? What percentage is sent from the agent to the collector? And what percentage is saved to storage by the collector?

a. ClientSamplingConfiguration says probabilistic 0.1, and CollectorSamplingConfiguration says probabilistic 0.2

b. ClientSamplingConfiguration says remote, and CollectorSamplingConfiguration says probabilistic 0.2

By "batch" I mean sending spans to the collector in batches to avoid heavy traffic. That's my understanding of the description here: https://www.jaegertracing.io/docs/architecture/ (search for 'batch'). With Zipkin + Jaeger backend, I don't think we have a mechanism to do the batching, and that's my concern.

black-adder · 2018-05-18T03:55:06Z

Any span that is reported by the service will be persisted, i.e. the decision is made once. In your example, the ClientSamplingConfiguration will be used instead of the CollectorSamplingConfiguration so the sampling probability will be 0.1. If you instead were to use sampler.type=remote in the ClientSamplingConfiguration, then the client will use the CollectorSamplingConfiguration of 0.2. (The client MUST be configured with sampler.type=remote in order for it to receive sampling rates from the collector; otherwise it will use the sampling rate provided by the service owner.)
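
A minimal sketch of the rule described above, applied to scenarios (a) and (b). The function `effective_rate` is a hypothetical helper, not part of any Jaeger client, and it assumes the remote sampler has successfully reached the backend.

```python
def effective_rate(client_type: str, client_param: float, collector_rate: float) -> float:
    """Return the probability the service actually samples with (simplified)."""
    if client_type == "remote":
        # Only a remote sampler asks the backend; the collector's value wins.
        return collector_rate
    if client_type == "probabilistic":
        # A locally configured sampler never consults the collector.
        return client_param
    raise ValueError(f"unsupported sampler type in this sketch: {client_type}")


# Scenario (a): client says probabilistic 0.1, collector says probabilistic 0.2.
print(effective_rate("probabilistic", 0.1, 0.2))  # 0.1
# Scenario (b): client says remote, collector says probabilistic 0.2.
print(effective_rate("remote", 1.0, 0.2))  # 0.2
```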

black-adder · 2018-05-18T03:56:17Z

The Jaeger clients are designed to always batch spans before sending them. If no jaeger-agent is present, the Go and Java Jaeger clients can be configured to send span batches over HTTP.
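
To illustrate what batching before sending means, here is a rough sketch of a size-based batching reporter. `BatchingReporter` and its `max_batch` parameter are hypothetical and do not mirror any client's real API; actual clients typically also flush on a timer.

```python
from typing import Callable, List


class BatchingReporter:
    """Buffers finished spans and hands them to `send` in batches."""

    def __init__(self, send: Callable[[List[str]], None], max_batch: int = 3):
        self._send = send
        self._max_batch = max_batch
        self._buffer: List[str] = []

    def report(self, span: str) -> None:
        self._buffer.append(span)
        if len(self._buffer) >= self._max_batch:
            self.flush()

    def flush(self) -> None:
        if self._buffer:
            batch, self._buffer = self._buffer, []
            self._send(batch)


sent_batches: List[List[str]] = []
reporter = BatchingReporter(sent_batches.append, max_batch=3)
for i in range(7):
    reporter.report(f"span-{i}")
reporter.flush()  # push out the final partial batch
print([len(b) for b in sent_batches])  # [3, 3, 1]
```

Because `send` is injected, the same buffering logic would work whether the batch goes to a local agent or directly to a collector over HTTP.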

xihw · 2018-05-18T04:02:10Z

> Any span that is reported by the service will be persisted

You mean persisted into storage?

> In your example, the ClientSamplingConfiguration will be used instead of the CollectorSamplingConfiguration so the sampling probability will be 0.1

Sorry, I'm still confused about when the 0.1 is applied: service -> agent, agent -> collector, or collector -> storage? Or all of them (in which case only 0.1 * 0.1 * 0.1 would end up stored in the DB, right)?

@black-adder

black-adder · 2018-05-18T04:07:46Z

The sampling rate is only used at the service, 0.1 of traces will be stored in the DB.
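
A small simulation can make the "decision is made once" point concrete. This is a sketch under the assumption that the agent, collector and storage forward everything they receive; `simulate` is a hypothetical helper and the seed is arbitrary.

```python
import random


def simulate(num_traces: int, rate: float, seed: int = 42) -> int:
    """Count traces that reach storage when only the service samples."""
    rng = random.Random(seed)
    stored = 0
    for _ in range(num_traces):
        sampled = rng.random() < rate  # the one and only sampling decision
        if not sampled:
            continue
        # service -> agent -> collector -> storage: no further sampling,
        # every hop forwards whatever it receives.
        stored += 1
    return stored


stored = simulate(num_traces=100_000, rate=0.1)
print(stored / 100_000)  # close to 0.1, not 0.1 * 0.1 * 0.1
```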

xihw · 2018-05-18T17:40:03Z

OK! So sampling only happens in the service, before spans are sent out.
One more question:

> configuration (and soon adaptive calculations) come from the collectors, but clients receive them via agent

What is the flow? From the service's standpoint, is it pull or push?
And when does it happen? Once when the service starts, or periodically?

black-adder · 2018-05-18T18:10:32Z

Services pull from the agent every minute; this is configurable: https://github.com/jaegertracing/jaeger-client-go/blob/master/config/config.go#L86

We haven't done this yet but I've always wanted to do push. It's on my personal road map.
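
The pull model described here can be sketched as a simple polling loop. Everything in this example is hypothetical (`poll_sampling_rate`, its parameters, and the injected `fetch`, `apply` and `sleep` callables); real clients run the equivalent in a background thread or goroutine.

```python
import time
from typing import Callable, Optional


def poll_sampling_rate(
    fetch: Callable[[], float],
    apply: Callable[[float], None],
    interval_seconds: float = 60.0,
    max_polls: Optional[int] = None,
    sleep: Callable[[float], None] = time.sleep,
) -> None:
    """Pull the sampling rate on a fixed interval and hand it to `apply`."""
    polls = 0
    while max_polls is None or polls < max_polls:
        apply(fetch())
        polls += 1
        if max_polls is None or polls < max_polls:
            sleep(interval_seconds)


# Usage with fakes so the example finishes instantly.
answers = iter([0.2, 0.2, 0.05])
seen = []
poll_sampling_rate(lambda: next(answers), seen.append, max_polls=3, sleep=lambda _: None)
print(seen)  # [0.2, 0.2, 0.05]
```

With a pull model, a change made on the collector takes up to one polling interval to reach each service.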

xihw · 2018-05-18T22:25:42Z

Can you also help me understand another question, about sampling propagation:
Will a service generate a span for an incoming request before deciding whether to sample it?
Will an unsampled span be propagated between services?

black-adder · 2018-05-19T00:09:47Z

Sampling and generation of a span happen at roughly the same time. Context is always propagated between services (even if unsampled).

xihw · 2018-05-19T01:02:27Z

If service B receives a request with a context saying something like {"span_a", "unsampled"}, B will still create a span as a child of "span_a" and keep propagating the context, but won't report it. Is that correct?

black-adder · 2018-05-19T01:39:00Z

yes
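
The behavior confirmed here can be sketched as follows. The `SpanContext` class and the `start_child` and `finish` helpers are hypothetical; the header layout `trace-id:span-id:parent-id:flags` with the lowest flag bit meaning "sampled" follows the Jaeger `uber-trace-id` format as I understand it, so treat that as an assumption too.

```python
import secrets
from dataclasses import dataclass
from typing import List

SAMPLED_FLAG = 0x01


@dataclass(frozen=True)
class SpanContext:
    trace_id: str
    span_id: str
    parent_id: str
    flags: int

    @property
    def sampled(self) -> bool:
        return bool(self.flags & SAMPLED_FLAG)

    def to_header(self) -> str:
        return f"{self.trace_id}:{self.span_id}:{self.parent_id}:{self.flags:x}"

    @staticmethod
    def from_header(value: str) -> "SpanContext":
        trace_id, span_id, parent_id, flags = value.split(":")
        return SpanContext(trace_id, span_id, parent_id, int(flags, 16))


def start_child(parent: SpanContext) -> SpanContext:
    """Create a child span context that inherits the parent's decision."""
    return SpanContext(parent.trace_id, secrets.token_hex(8), parent.span_id, parent.flags)


def finish(span: SpanContext, reported: List[SpanContext]) -> None:
    """Report the span only if the trace was sampled at its root."""
    if span.sampled:
        reported.append(span)


# Service B receives an unsampled context from service A.
incoming = SpanContext.from_header("abc123:1a2b:0:0")
child = start_child(incoming)
reported: List[SpanContext] = []
finish(child, reported)
outgoing_header = child.to_header()  # still sent to service C
print(child.trace_id, child.sampled, len(reported))  # abc123 False 0
```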

xihw · 2018-05-19T03:19:42Z

OK, so does that mean it's possible to trace every request by putting its span info into the log, even though we do sampling?
If so, do you have any resource showing how to do that?

black-adder · 2018-05-19T03:48:38Z

I'm not sure I understand the question. Are you asking if span logs are always persisted even if we do sampling?

xihw · 2018-05-19T04:34:54Z

I'm asking whether it is possible to use a logging mechanism like MDC (http://www.baeldung.com/mdc-in-log4j-2-logback) to log the trace ID for every single request, even if we do sampling.

black-adder · 2018-05-19T14:14:03Z

Yes, you can log the trace ID for every request, but since you're sampling, some logs will have trace IDs without a persisted trace.

black-adder · 2018-05-19T14:16:31Z

This is a Go example: https://github.com/jaegertracing/jaeger/blob/master/examples/hotrod/pkg/log/spanlogger.go

However, here we're doing more than just logging the trace ID; we're dual logging to both the log reporter and into the span.
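
For the narrower case of only attaching the trace ID to every log line (the MDC idea), a minimal Python sketch could look like this. It uses the standard `logging` and `contextvars` modules; `TraceIdFilter`, `handle_request`, the logger name and the trace IDs are hypothetical, and no tracing library is involved.

```python
import contextvars
import io
import logging

# Holds the trace ID of the request currently being handled (MDC-like).
current_trace_id = contextvars.ContextVar("current_trace_id", default="-")


class TraceIdFilter(logging.Filter):
    """Copies the current trace ID onto every log record."""

    def filter(self, record: logging.LogRecord) -> bool:
        record.trace_id = current_trace_id.get()
        return True


stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("%(levelname)s trace_id=%(trace_id)s %(message)s"))
handler.addFilter(TraceIdFilter())

logger = logging.getLogger("request-log-sketch")
logger.setLevel(logging.INFO)
logger.addHandler(handler)
logger.propagate = False


def handle_request(trace_id: str, sampled: bool) -> None:
    # The ID is logged whether or not the trace was sampled.
    current_trace_id.set(trace_id)
    logger.info("handling request (sampled=%s)", sampled)


handle_request("abc123", sampled=False)
handle_request("def456", sampled=True)
print(stream.getvalue(), end="")
```

Both lines carry a trace ID, but only the second one would have a matching trace in storage.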

black-adder · 2018-05-22T21:17:07Z

Closing the issue; feel free to reopen if you have more questions.

ecourreges-orange · 2019-09-20T16:47:10Z

> 1. agent proxies the requests to the collector, so that the client does not need to know where collectors are located (agent is usually on the localhost)

The agent proxies the config request to the collector through which connection? The TChannel or gRPC, whichever one is connected?
These docs don't really explain where the sampling config is sent through:
https://www.jaegertracing.io/docs/1.14/getting-started/#all-in-one
https://www.jaegertracing.io/docs/1.14/deployment/#collectors
Also it would be a nice improvement to know which are encryptable or encrypted by default.
This page does not detail which protocol between which component has encryption support:
#458

Thank you.

yurishkuro · 2019-09-20T16:59:45Z

> The agent proxies the config request to the collector through which connection? The TChannel or gRPC, whichever one is connected?

Whichever one you configure on the agent. We recommend gRPC.

> Also it would be a nice improvement to know which are encryptable or encrypted by default,

See #1718

> These docs don't really explain where the sampling config is sent through:

Can you elaborate what can be improved in the docs? If you're using remote sampler, then the sampling configuration is defined in the collectors, and is pulled by the clients periodically client<-agent<-collector

ecourreges-orange · 2019-09-20T17:26:08Z

Thanks, now it's all coming together through the information in the different referenced GitHub issues.

> Can you elaborate what can be improved in the docs? If you're using remote sampler, then the sampling configuration is defined in the collectors, and is pulled by the clients periodically client<-agent<-collector

For improvements to the docs, here are some ideas:

1. Add a column or an additional comment about TLS/encryption support to the port/protocol/function tables in Getting Started and Deployment (all-in-one and collector).
2. Mention in these tables that jaeger-agent doesn't just send spans through gRPC but also pulls config.
3. Add a different architecture diagram here:
https://www.jaegertracing.io/docs/1.14/architecture/
with more of a network view, showing the actual protocols and ports on the arrows. It currently shows the "control flow" in dotted red, which is not a network connection.
Or split it per component if it becomes a graphic nightmare.

Thanks.

vamshi67 · 2020-06-16T03:19:55Z

Does 'remote' sampling work with http-sender? In my aks cluster setup, I haven't configured 'jaeger-agent'.

yurishkuro · 2020-06-16T03:23:01Z

Sampler has nothing to do with Sender, it's an independent component. It can work with both the agent and the collector.

vamshi67 · 2020-06-16T03:54:24Z

Thanks Yuri for the quick response. I really appreciate your help on this.

I'm using the Jaeger Kubernetes operator and have the following sampling strategy in the ConfigMap:

apiVersion: v1
data:
  sampling: '{"default_strategy":{"operation_strategies":[{"operation":"/health","param":0,"type":"probabilistic"},{"operation":"/metrics","param":0,"type":"probabilistic"}],"param":0.1,"type":"probabilistic"}}'
kind: ConfigMap
metadata:
  creationTimestamp: "2020-06-15T23:42:14Z"
  labels:
    app: jaeger
    app.kubernetes.io/component: sampling-configuration
    app.kubernetes.io/instance: jaeger
    app.kubernetes.io/managed-by: jaeger-operator
    app.kubernetes.io/name: jaeger-sampling-configuration
    app.kubernetes.io/part-of: jaeger
  name: jaeger-sampling-configuration
  namespace: monitoring

Note: we're using the monitoring namespace instead of observability.

The client application has the following properties:

sampler.type=const
sampler.sampling-rate=1

Since these properties are defined in the application's properties file, I'm overriding them using Kubernetes environment variables. I have set sampler.type to remote. As I don't know what value should be given to sampling-rate when sampler.type is set to remote, I set it to 1.

With this, when I create the pod, every trace is being collected. I'm not sure why it is not honoring the remote configuration.

Am I missing anything?
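
As a sanity check on what the ConfigMap above is expected to yield once it is honored, here is a sketch that resolves the probability per operation from the quoted `sampling` JSON. `probability_for` is a hypothetical helper and `/checkout` is an invented operation name; the lookup assumes every entry is probabilistic and ignores any per-service strategies.

```python
import json

# The "sampling" value from the ConfigMap quoted above.
SAMPLING_JSON = (
    '{"default_strategy":{"operation_strategies":['
    '{"operation":"/health","param":0,"type":"probabilistic"},'
    '{"operation":"/metrics","param":0,"type":"probabilistic"}],'
    '"param":0.1,"type":"probabilistic"}}'
)


def probability_for(config: dict, operation: str) -> float:
    """Return the probabilistic rate for an operation (simplified lookup)."""
    default = config["default_strategy"]
    for strategy in default.get("operation_strategies", []):
        if strategy["operation"] == operation:
            return float(strategy["param"])
    return float(default["param"])


config = json.loads(SAMPLING_JSON)
print(probability_for(config, "/health"))    # 0.0
print(probability_for(config, "/checkout"))  # 0.1
```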

yurishkuro · 2020-06-16T05:41:07Z

The numeric value of 1 is treated as 100% default probability when the sampler cannot contact the backend. It's possible that in your deployment it cannot reach the backend and never gets the 0.1 probability. The sampler emits metrics about unsuccessful configuration pulls.
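
The fallback behavior described here can be sketched like this. `RemoteSamplerSketch` is a hypothetical class, not the real client implementation, and its `failed_pulls` counter merely stands in for the failure metric mentioned above.

```python
from typing import Callable


class RemoteSamplerSketch:
    """Starts with a local rate and replaces it after a successful pull."""

    def __init__(self, initial_rate: float):
        self.rate = initial_rate
        self.failed_pulls = 0  # stand-in for the failure metric

    def update(self, fetch: Callable[[], float]) -> None:
        try:
            self.rate = fetch()
        except OSError:
            # Backend unreachable: keep the current rate and count the failure.
            self.failed_pulls += 1


def unreachable() -> float:
    raise ConnectionError("cannot reach agent or collector")


sampler = RemoteSamplerSketch(initial_rate=1.0)
sampler.update(unreachable)
print(sampler.rate, sampler.failed_pulls)  # 1.0 1
sampler.update(lambda: 0.1)
print(sampler.rate, sampler.failed_pulls)  # 0.1 1
```

Under this model, a service that keeps sampling at 100% with `sampler.type=remote` and a param of 1 is the expected symptom of every pull failing.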

tcoln · 2020-08-20T10:05:00Z

> 1. agent proxies the requests to the collector, so that the client does not need to know where collectors are located (agent is usually on the localhost)
>
> 2. yes, configuration (and soon adaptive calculations) come from the collectors, but clients receive them via agent
>
> 3. remotely controlled samplers are only supported by Jaeger clients, not Zipkin clients.
>
> 4. Not sure which "batch" you are referring to.

Dear Yuri,
I have a question: if remotely controlled samplers are only supported via the agent, and the agent pulls config via gRPC + protobuf, then what is sampling.thrift for?

yurishkuro · 2020-08-20T16:24:10Z

Previously agent was using Thrift to retrieve sampling from collector. Now it uses protobuf, but the clients consume sampling as JSON, and that JSON is still generated from Thrift.

tcoln · 2020-08-23T14:48:40Z

> Previously agent was using Thrift to retrieve sampling from collector. Now it uses protobuf, but the clients consume sampling as JSON, and that JSON is still generated from Thrift.

You mean the sampling strategies were previously sent from the collector to agents via Thrift, but are now sent via protobuf + gRPC? I know the client gets the sampling JSON over HTTP on port 5778, so what I care about is how the collector sends them to the agent.

yurishkuro · 2020-08-23T15:25:28Z

collector to agent is grpc
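
For the client-facing leg mentioned in the question (JSON over HTTP on port 5778), a rough sketch of building the request and reading a probabilistic strategy follows. The `/sampling` path, the `service` query parameter and the response field names are assumptions about the endpoint's shape, and `sampling_url` and `parse_probability` are hypothetical helpers; no network call is made.

```python
import json
from urllib.parse import urlencode


def sampling_url(service: str, host: str = "localhost", port: int = 5778) -> str:
    """Build the URL a client would poll on the agent (path is assumed)."""
    return f"http://{host}:{port}/sampling?{urlencode({'service': service})}"


def parse_probability(body: str) -> float:
    """Extract the rate from a probabilistic strategy response (assumed shape)."""
    doc = json.loads(body)
    return float(doc["probabilisticSampling"]["samplingRate"])


# Hypothetical response body; no network call is made here.
example_body = '{"strategyType": "PROBABILISTIC", "probabilisticSampling": {"samplingRate": 0.2}}'
print(sampling_url("my service"))       # http://localhost:5778/sampling?service=my+service
print(parse_probability(example_body))  # 0.2
```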

tcoln · 2020-08-24T02:15:11Z

> collector to agent is grpc

Thanks, Yuri.

sharninder · 2021-10-31T14:25:28Z

> The numeric value of 1 is treated as 100% default probability when the sampler cannot contact the backend. It's possible that in your deployment it cannot reach the backend and never gets the 0.1 probability. The sampler emits metrics about unsuccessful configuration pulls.

I'm not sure this is completely correct, or else there is a bug in this code path. I'm setting the sampler type to remote and leaving the param unset, yet the param value is being set to 1 by default even when the remote actually has a param of 0.5. Seems like a bug to me.

yurishkuro added the question label May 18, 2018

black-adder closed this as completed May 22, 2018

yurishkuro mentioned this issue Sep 23, 2019

Some improvements to sampling mechanism description jaegertracing/documentation#303

Open
