Documentation for sampling #882

Merged 4 commits on Jul 14, 2020

cnnradams
Member

Couldn't find any useful docs for Samplers when I was trying to figure out how they worked, so I added the documentation myself :)

Haven't added sampling to any of the examples in the docs since they seem to rely on traces always being printed to the console, but if anyone has any suggestions for which ones should have sampling I can add it.

@cnnradams cnnradams requested a review from a team July 6, 2020 15:17
Contributor

@lzchen lzchen left a comment

Nice. Well written. We may want to expand on this once tail-based sampling gets implemented.

Member

@toumorokoshi toumorokoshi left a comment

Can you add a way to better surface this documentation in the left-hand navigation? Or, if that's already there, apologies, but could you illustrate that a little?

@@ -1,5 +1,5 @@
opentelemetry.trace.sampling
============================
Sampling Traces
Member

Is this the right file to make these changes? One con I see is that documentation on traces won't show up in the top level search tree, so it'll be much harder to find.

Also if you look at the current API docs page, there'll be no information that there are more meaningful docs here:

https://opentelemetry-python.readthedocs.io/en/stable/api/api.html

It'll just say "opentelemetry.trace package".

I would advocate for something to bring this documentation into top-level nav, for easy discovery.

Member Author

agreed - does it make sense to just move it up a level into the OpenTelemetry Python API package or should it be alone in its own subsection somewhere? Or maybe this just means every package with actual documentation should have a non-default name?

Contributor

IMO it's actually fine to leave this doc as a submodule of trace: it doesn't seem to be a "top-level" concept or a pillar of observability. We might have sampling for logging in the future, but the API surface and behaviour are different enough that I think we should have a separate opentelemetry.logging.sampling package for that when that day comes. Addressing @toumorokoshi's comment about the difficulty of finding the doc: this behaviour is the same for all the documentation that isn't in the top-level search tree, so why should sampling be any different? Unless we have a flat tree and place all the modules at the top level, there's going to be some searching involved to find these topics (I'm also against a flat tree because it is too complicated visually).

Member

I guess for me, common intermediate-to-advanced use cases should be called out. In this case, I feel that configuring a sampler is a common use case.

@cnnradams if you want to make a final call here, this PR content is good regardless of whether it's exposed at the top level. So if you at least address my existing comments, I'll approve.

Member Author

I think these docs are in a reasonable location if the names of the parent/sibling pages don't look like they are auto-generated (specifically, the OpenTelemetry Python API page makes every child look auto generated). Common use cases like sampling probably should be described through root level examples which link to the actual documentation.

either way, addressed your other comments.

"""
Trace data is often produced in large volumes, it is not only expensive to collect and store but also expensive to transmit.

In order to strike a balance between observability and expenses, traces are sampled. Sampling is the process by which a decision is made on whether to process/export a span or not.
Member

Suggested change
- In order to strike a balance between observability and expenses, traces are sampled. Sampling is the process by which a decision is made on whether to process/export a span or not.
+ In order to strike a balance between observability and expenses, traces can be sampled. Sampling is the process by which a decision is made on whether to process/export a span or not.
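
For readers following the thread, the process/export decision described in the docstring under review can be sketched with the standard library alone. This is a minimal illustration, not the code in this PR: `should_sample` and `CHECK_MASK` are hypothetical names, and the sketch assumes trace IDs are random integers and that only their low 64 bits feed the decision.

```python
import random

# Assumption: only the low 64 bits of the trace ID are used for the decision.
CHECK_MASK = (1 << 64) - 1


def should_sample(trace_id: int, rate: float) -> bool:
    """Return True if a span with this trace ID should be processed/exported.

    The decision is deterministic per trace ID, so all spans of one trace
    are kept or dropped together.
    """
    if not 0.0 <= rate <= 1.0:
        raise ValueError("rate must be in [0.0, 1.0]")
    # Trace IDs whose low 64 bits fall below this bound are sampled.
    bound = round(rate * (CHECK_MASK + 1))
    return (trace_id & CHECK_MASK) < bound


if __name__ == "__main__":
    rng = random.Random(0)
    trace_ids = [rng.getrandbits(128) for _ in range(10_000)]
    kept = sum(should_sample(t, 0.25) for t in trace_ids)
    print(f"kept {kept} of {len(trace_ids)} traces")
```

Keying the decision on the trace ID rather than on a fresh random draw per span is a design choice made for this sketch: it keeps a trace from being half-exported.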

@lzchen
Contributor

lzchen commented Jul 13, 2020

Also, it looks like sampling is an SDK concept? https://github.com/open-telemetry/opentelemetry-specification/blob/master/specification/trace/sdk.md#sampling

This looks like it means library authors do not have control over what sampler is finally used by the app user, and SDK authors will have to write their own samplers.

I believe it is similar to Resource: neither samplers nor resources are needed for instrumentation, so neither really belongs in the API.

EDIT: Created an issue [here](#906)
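
The split described in this comment, where instrumentation only creates spans and the application owner picks the sampler, can be sketched roughly as follows. Every name here (`set_sampler`, `start_span`, `library_function`, `exported`) is a hypothetical stand-in for illustration and is not part of the OpenTelemetry API or SDK.

```python
from typing import Callable, List

# Hypothetical stand-ins; none of these names come from OpenTelemetry.
Sampler = Callable[[str], bool]

_sampler: Sampler = lambda span_name: True  # default: keep everything
exported: List[str] = []


def set_sampler(sampler: Sampler) -> None:
    """SDK-side configuration, called by the application owner."""
    global _sampler
    _sampler = sampler


def start_span(span_name: str) -> None:
    """API-side call used by instrumentation; it never names a sampler."""
    if _sampler(span_name):
        exported.append(span_name)


def library_function() -> None:
    # Library code only creates spans.
    start_span("library.work")


# The application decides what gets sampled.
set_sampler(lambda span_name: not span_name.startswith("library."))
library_function()
start_span("app.request")
print(exported)
```

In this sketch the library's span is dropped without the library knowing a sampler exists, which is the sense in which library authors have no control over the final sampling decision.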

.. code:: python

from opentelemetry import trace
from opentelemetry.trace
Contributor

Typo here?

Member Author

fixed

)

# created spans will now be sampled by the ProbabilitySampler
with trace.get_tracer().start_as_current_span("Test Span"):
Contributor

I think get_tracer() needs a module name.

Member Author

fixed

Member

@toumorokoshi toumorokoshi left a comment

Great, thanks!

@codeboten codeboten merged commit 5cb01d6 into open-telemetry:master Jul 14, 2020