Documentation for sampling #882
Nice. Well written. We may want to expand on this once tail-based sampling gets implemented.
Can you add a way to better surface this documentation in the left-hand navigation? If that's already there, apologies, but could you illustrate it a little?
@@ -1,5 +1,5 @@
opentelemetry.trace.sampling
============================
Sampling Traces
Is this the right file to make these changes? One con I see is that documentation on traces won't show up in the top level search tree, so it'll be much harder to find.
Also if you look at the current API docs page, there'll be no information that there are more meaningful docs here:
https://opentelemetry-python.readthedocs.io/en/stable/api/api.html
It'll just say "opentelemetry.trace package".
I would advocate for something to bring this documentation into top-level nav, for easy discovery.
Agreed. Does it make sense to just move it up a level into the OpenTelemetry Python API package, or should it be alone in its own subsection somewhere? Or maybe this just means every package with actual documentation should have a non-default name?
IMO it's actually fine to leave this doc as a submodule of trace: it doesn't seem to be a "top-level" concept or a pillar of observability. We might have sampling for logging in the future, but the API surface and behaviour are different enough that I think we should have a separate opentelemetry.logging.sampling package for that when that day comes. Addressing @toumorokoshi's comment about the difficulty of finding the doc: this behaviour is the same for all the documentation that isn't in the top-level search tree, so why should sampling be any different? Unless we have a flat tree and place all the modules in the top level, there's going to be some searching involved to find these topics (I'm also against a flat tree because it is too complicated visually).
I guess for me, common intermediate-to-advanced use cases should be called out. In this case, I feel that configuring a sampler is a common use case.
@cnnradams if you want to make a final call here, this PR content is good regardless of whether it's exposed at the top level. So if you at least address my existing comments, I'll approve.
I think these docs are in a reasonable location if the names of the parent/sibling pages don't look like they are auto-generated (specifically, the OpenTelemetry Python API page makes every child look auto-generated). Common use cases like sampling probably should be described through root-level examples which link to the actual documentation.
either way, addressed your other comments.
"""
Trace data is often produced in large volumes, it is not only expensive to collect and store but also expensive to transmit.

In order to strike a balance between observability and expenses, traces are sampled. Sampling is the process by which a decision is made on whether to process/export a span or not.
In order to strike a balance between observability and expenses, traces are sampled. Sampling is the process by which a decision is made on whether to process/export a span or not.
In order to strike a balance between observability and expenses, traces can be sampled. Sampling is the process by which a decision is made on whether to process/export a span or not.
Also, it looks like sampling is an SDK concept? https://github.com/open-telemetry/opentelemetry-specification/blob/master/specification/trace/sdk.md#sampling This seems to mean that library authors do not have control over which sampler is finally used by the app user, and SDK authors will have to write their own samplers. I believe it is similar to EDIT: Created an issue [here](#906)
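For readers following the thread, the "decision on whether to process/export a span" can be illustrated with a small probability-based sketch in plain Python. This is a simplified stand-in under stated assumptions, not the opentelemetry-python implementation: `make_bound`, `should_sample`, and the choice to compare only the low 64 bits of the trace ID are hypothetical names and choices made for illustration.

```python
import random

# Hypothetical sketch; not the opentelemetry-python ProbabilitySampler.
TRACE_ID_BITS = 64  # assumption: only the low 64 bits of the trace ID are compared


def make_bound(rate: float) -> int:
    """Map a sampling rate in [0.0, 1.0] to an integer threshold."""
    if not 0.0 <= rate <= 1.0:
        raise ValueError("rate must be between 0.0 and 1.0")
    return round(rate * (1 << TRACE_ID_BITS))


def should_sample(trace_id: int, rate: float) -> bool:
    """Decide from the trace ID alone whether a trace is kept.

    The decision is deterministic, so every span that shares a trace ID
    gets the same answer.
    """
    low_bits = trace_id & ((1 << TRACE_ID_BITS) - 1)
    return low_bits < make_bound(rate)


if __name__ == "__main__":
    rng = random.Random(0)
    ids = [rng.getrandbits(128) for _ in range(10_000)]
    kept = sum(should_sample(trace_id, 0.25) for trace_id in ids)
    print(f"kept {kept} of {len(ids)} traces")
```

Deriving the decision from the trace ID rather than from a fresh random draw per span keeps a trace either fully sampled or fully dropped, which is the property the balance between observability and cost relies on.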
.. code:: python

from opentelemetry import trace
from opentelemetry.trace
Typo here?
fixed
)

# created spans will now be sampled by the ProbabilitySampler
with trace.get_tracer().start_as_current_span("Test Span"):
I think `get_tracer()` needs a module name.
fixed
Great, thanks!
Couldn't find any useful docs for Samplers when I was trying to figure out how they worked, so I added the documentation myself :)
Haven't added sampling to any of the examples in the docs since they seem to rely on traces always being printed to the console, but if anyone has any suggestions for which ones should have sampling I can add it.