Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SamplerProvider: add a Provider for configuring Samplers #2555

Closed
wants to merge 4 commits into from

Conversation

tsloughter
Copy link
Member

The SamplerProvider is used by the TracerProvider to get the Sampler
to use based on the SamplerProvider's configuration for choosing
a Sampler by InstrumentationScope name and/or version.

I put the SamplerProvider only in the SDK but maybe it should be defined in the API?

Changes

Adds the SamplerProvider as an interface for getting samplers in the TracerProvider.

@tsloughter tsloughter requested review from a team May 19, 2022 16:19
@reyang reyang added area:sampling Related to trace sampling spec:trace Related to the specification/trace directory labels May 24, 2022
@github-actions
Copy link

github-actions bot commented Jun 1, 2022

This PR was marked stale due to lack of activity. It will be closed in 7 days.

@github-actions github-actions bot added the Stale label Jun 1, 2022
@tsloughter
Copy link
Member Author

I've encountered another use case.

I have a database client instrumented and want to give the user flexibility on if spans are started for a query. The current solution has been an option on the query function the user can set to enable/disable span creation.

The issue there is now if a user is using the library not for a project they control but to create a library they then too have to make options the user can configure to ultimately pass to the client, and so on.

The Sampler Provider wouldn't in itself be able to make per-query span samplers, at least in the database client use-case I have, since the Scope would be per-connection pool, not per-query. But it would make it possible to use Samplers specific to the database client. Then I can provide the user with configurable Samplers from the database client library to make it easy for them to setup.

Technically it is possible to currently have a sampler that can be written to not sample based on something like the query string, but that would be a Sampler run for any Span in the system. Any sampler helper from the database client library must be able to compose with other samplers. The complexity vs just being able to have a Sampler specific to the database client scope feels like enough to warrant a solution like the SamplerProvider.

@github-actions github-actions bot removed the Stale label Jun 5, 2022
@github-actions
Copy link

This PR was marked stale due to lack of activity. It will be closed in 7 days.

@github-actions github-actions bot added the Stale label Jun 12, 2022
The SamplerProvider is used by the TracerProvider to get the Sampler
to use based on the SamplerProvider's configuration for choosing
a Sampler by InstrumentationScope name and/or version.
@tsloughter
Copy link
Member Author

Updated to support an argument to ShouldSample for those that a SamplerProvider would not be able to be done in a backwards compatible way.

@github-actions github-actions bot removed the Stale label Jun 13, 2022

* `InstrumentationScope`: If the SDK does not support `SamplerProviders` then it
MUST include the `InstrumentationScope` as an argument.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems like duplication of functionality. Once we have this optional argument, we do not need the notion of SamplerProvider at all, as we can provide our own Sampler which will delegate the sampling decision to different samplers based on the InstrumentationScope.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It requires the user provide a sampler that does that delegation The point of the SamplerProvider is the user is able to instead configure that Tracer X use always_off or some ratio, rather than having to write a sampler themselves that does this work.

Implementations that prefer the argument to Sampler instead of a SamplerProvider could provide that delegation sampler of course and essentially implement the same resulting end user experience I suppose.

@PeterF778
Copy link

For consistency and completeness, the section on Span creation (https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/trace/sdk.md#sdk-span-creation) needs a new point between steps 1 and 2 which would say that if the SDK supports SamplerProvider then it needs to query it to obtain the Sampler.

@PeterF778
Copy link

I feel there's a potential conflict between the old Sampler and the new SamplerProvider. Configurable SDKs will support setting a custom Sampler for TraceProvider, and they will also need to support setting or configuring SamplerProvider. This may lead to user confusion: which will take effect if she first configures SamplerProvider and then sets her own Sampler?

@tsloughter
Copy link
Member Author

@PeterF778 yea, to be backwards compatible existing implementations will need to still accept a Sampler and create a simple SamplerProvider, that just returns that Sampler, from it.

Implementations would have to document how their sampler configuration works, esp since some will have a SamplerProvider (and must decide if it takes precedence over a configured Sampler -- I think it should) and some a Scope passed to the Sample call.

Comment on lines +248 to +249
Alternatively, the `Sampler` `ShouldSample` function can be extended to accept
the `InstrumentationScope` as an additional argument.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should leave this option out; the reason for a SamplerProvider is that we do not want this to happen, right?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reason to add it is to keep backwards compatibility. My understanding from the discussion in the spec meeting was that some implementations will find it hard to impossible to move to a SamplerProvider and keep backwards compatibility.

Comment on lines +240 to +246
By default the builtin `SamplerProvider` MUST return the `Sampler`
`ParentBased(root=AlwaysOn)` for any `InstrumentationScope`. The builtin
provider MUST also support configuring a specific `Sampler` to be returned based
on the name or (name,version) pair of the `InstrumentationScope`. For example,
in an HTTP server the user may define an `InstrumentationScope` for each route
in their service and configure the `SamplerProvider` to return the `AlwaysOff`
sampler for the `/healthcheck` scope.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we add something to say that:

  • If a single Sampler is configured and not a SamplerProvider, then the SDK MUST use a SamplerProvider that returns the singleton Sampler.
  • If a Sampler is not configured, the default rule given above applies.
    The default sampler value ParentBased(root=AlwaysOn) is stated somewhere else, right? Can we consolidate the specs so that ParentBased(root=AlwaysOn) appears only here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I thought I replied tot his. But I think the desire for some to not add a SamplerProvider means not adding something like a MUST wrapping a single Sampler in a Provider.

@github-actions
Copy link

This PR was marked stale due to lack of activity. It will be closed in 7 days.

@github-actions github-actions bot added the Stale label Jun 24, 2022
@github-actions
Copy link

github-actions bot commented Jul 1, 2022

Closed as inactive. Feel free to reopen if this PR is still being worked on.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area:sampling Related to trace sampling spec:trace Related to the specification/trace directory Stale
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants