Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Exemplars prototype #936

Closed
wants to merge 6 commits into from
Closed

Conversation

cnnradams
Copy link
Member

This PR introduces exemplars into the SDK (for details on exemplars, see OTEP #113)

These change are (of course) experimental, as no major backend supports "statistical" exemplars and there is no concrete SDK spec for exemplars yet (only an OTEP). So any feedback is very much appreciated :)

Built off of the views API prototype: #596

@cnnradams cnnradams requested a review from a team July 23, 2020 20:01
@cnnradams cnnradams force-pushed the exemplars branch 6 times, most recently from ab2c1fa to 0be0964 Compare July 24, 2020 14:43
@cnnradams cnnradams changed the title WIP: Exemplars prototype Exemplars prototype Jul 27, 2020
Copy link

@jmacd jmacd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great!

My approval here should carry only so much weight because I'm not very familiar with the Python SDK but this code generally looks readable and feels familiar. I have been thinking about how precisely I would integrate sampling in the SDK, and came up with two possibilities. One of the possibilities is to implement exemplars in the Aggregators themselves, as you've done, an d another possibility is to implement something in the Accumulator that might support exemplars in a less Aggregator-dependent way. Of course, you're also showing why we want the Aggregator to see the Update, so that Histograms can select exemplars by bucket.

To get this behavior w/ the first approach is straight forward, though it requires the Aggregator API to receive the excluded labels (as you have done), whereas up until now the Aggregators only know about numbers and math--labels are orthogonal to Aggregator state.

I thought about the other possibility (that exemplars can be Accumulator functionality) because it would be nice if we could compose something with any other aggregator to make this work. Imagine a class, named ExemplarAggregator, that can be used to select Aggregators in parallel any other aggregation (e.g., Histogram, Exact, ...). This is less straight forward, because now there's some kind of dependency between exemplar selection and an independent aggregator, and that's why I'm approving your approach.

For the record though, I put up a draft PR that begins this other approach here: open-telemetry/opentelemetry-go#1023. The idea is that the Accumulator can be told which dimensions to include (thus which to exclude) for each metric instrument, and that it therefore knows when to call the sampling manager. See: https://github.com/open-telemetry/opentelemetry-go/pull/1023/files#diff-4c3de179542bac3c5ceacd2090215160R225. I haven't worked out the details.

@cnnradams cnnradams force-pushed the exemplars branch 2 times, most recently from 142c4ea to 654f185 Compare August 4, 2020 14:11
@cnnradams cnnradams changed the base branch from views to master August 4, 2020 14:14
- Exemplars will be picked to represent the input distribution, without unquantifiable bias
- A "sample_count" attribute will be set on each exemplar to quantify how many measurements each exemplar represents

See 'statistical_exemplars.ipynb' for the example (TODO: how do I link this?)
Copy link
Contributor

@lzchen lzchen Aug 4, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you not able to just link to the python file instead?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can also do something like this if you want it inlined in the docs.

.. literalinclude:: basic_trace.py
     :language: python
     :lines: 1-

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

did literalinclude of the python examples, would have preferred a link to the jupyter notebook but this will work for now

@@ -0,0 +1,84 @@
# Copyright The OpenTelemetry Authors
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: Can we rename basic_meter to metrics and put this folder under that in the examples?

docs/examples/exemplars/README.rst Outdated Show resolved Hide resolved
docs/examples/exemplars/statistical_exemplars.py Outdated Show resolved Hide resolved
docs/examples/exemplars/statistical_exemplars.py Outdated Show resolved Hide resolved
docs/examples/exemplars/statistical_exemplars.py Outdated Show resolved Hide resolved
self.sample_set[replace_index] = exemplar

def merge(self, set1, set2):
combined = set1 + set2
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking at the algorithm Josh linked, I don't think merging like this would give a uniform sample. In the extreme case, imagine k = 1 and you have two samplers; sampler1's rand_count = 1 (population size of 1) and sampler2 finishes with rand_count = 10. sampler1's single value would be weighted much more strongly than any of the values sampler2 saw.

E.g. sampler1 see value [0], sampler2 sees values [1, 2, 3, ..., 10] but keeps only k = 1 sample, say, [6]. Randomly sampling k = 1 values from [0, 6] would give [0] 50% of the time, even though it was one of 11 values sampled in the whole population.

I think there is a similar issue when arg_count > k. The wiki page says each item is sampled with probability k/n; if the samplers being merged saw different n, then you would end up giving more weight to the sampler with smaller n.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as discussed offline, just keeping the second set's exemplars with that assumption based on the current SDK implementation. Might need to change in the future but for now that keeps this much simpler

@aabmass
Copy link
Member

aabmass commented Aug 6, 2020

I love the Jupyter Notebook! I think it would be awesome to add nbsphinx to the docs setup so you could include the notebook directly in the docs, if others are cool with that.

@cnnradams
Copy link
Member Author

I love the Jupyter Notebook! I think it would be awesome to add nbsphinx to the docs setup so you could include the notebook directly in the docs, if others are cool with that.

I actually tried to use nbsphinx, and found that the notebook was awful looking inline, due to its margins being the full page but the docs being only half a page. Maybe if I put it on its own page it would be different?

@aabmass
Copy link
Member

aabmass commented Aug 6, 2020

I actually tried to use nbsphinx, and found that the notebook was awful looking inline, due to its margins being the full page but the docs being only half a page. Maybe if I put it on its own page it would be different?

Ya, not too sure, ive never used it myself. If nothing else works, you could probably render it to HTML then link or embed that somehow?

Copy link
Member

@aabmass aabmass left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few more small nits


def __init__(
self,
config: dict,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you know the exact keys and types, try a TypedDict

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't know the exact types - it may change depending on what aggregators need to know

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You mean for custom aggregators?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the values already used are int, str, bool, list, and there is no restriction on passing in something like a float to the config dict

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't follow you. Are you saying you would use the same key with different value types depending on the aggregator?

@cnnradams cnnradams force-pushed the exemplars branch 2 times, most recently from bc1b06b to e4e8c20 Compare August 7, 2020 17:13
@jmacd
Copy link

jmacd commented Aug 8, 2020

I left a lengthy remark on this topic here:
open-telemetry/opentelemetry-specification#617 (comment)

@cnnradams
Copy link
Member Author

Closing for now.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants