Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Recommend a default size of 1 for the SimpleFixedSizeExemplarReservoir #3670

Merged
merged 4 commits into from
Aug 24, 2023

Conversation

MrAlias
Copy link
Contributor

@MrAlias MrAlias commented Aug 21, 2023

This reservoir type is used for all aggregations other than a histogram with more than one bucket. Each attribute set the aggregation records will have an instance of this reservoir. Therefore, limiting this to a small value by default when enabled is preferable.

This does not address the way a user will configure this value. That is left for a future PR/Issue.

Part of #2421

Additional context

For similar aggregation types, OpenMetrics allows one exemplar.

It looks like OpenCensus expected libraries to "retain and export at least [one] exemplar". But it is a little unclear if this is comparable.

This reservoir type is used for all aggregations other than a histogram
with more than one bucket. Each attribute set the aggregation records
will have reservoir. Therefore, limiting this to a small value by
default when enabled is preferable.

This does not address the way a user will configure this value. That is
left for a future PR/Issue.

Part of open-telemetry#2421
@MrAlias MrAlias added the spec:metrics Related to the specification/metrics directory label Aug 21, 2023
@MrAlias MrAlias marked this pull request as ready for review August 21, 2023 21:14
@MrAlias MrAlias requested review from a team August 21, 2023 21:14
Copy link
Contributor

@jsuereth jsuereth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not fully convinced the limit of "1" by OpenMetrics is a great default. IIUC - It's more a limitation of their protocol (only one exemplar per "point").

Specifically -

SimpleFixedSizeExemplarReservoir is likely the right reservoir for ExponentialHistograms. For these, we should definitely encourage a larger pool size.

Limiting to 1 is fine for now, but with ExponentialHistograms moving towards stable I'd encourage us to call out a different default for those.

@MrAlias
Copy link
Contributor Author

MrAlias commented Aug 22, 2023

SimpleFixedSizeExemplarReservoir is likely the right reservoir for ExponentialHistograms. For these, we should definitely encourage a larger pool size.

Would it make sense in this case to use a default size equal to the exponential histogram's max size configuration?

I can update for that aggregator in a follow up PR if that makes sense @jsuereth.

@jsuereth
Copy link
Contributor

Would it make sense in this case to use a default size equal to the exponential histogram's max size configuration?

I actually think the exponential histogram's default may be a bit too large right now, at least if we also track exemplars there. I'd suggest limiting to a range of 5-20 exemplars (or some division of the exponential histogram's max size) purely from an overhead standpoint on tracking these points. I don't have hard numbers to back this up, just some anecdotes.

Here's some VERY lazy math with naive assumptions:

Explicit Bucket histograms

  • Default 15 boundaries, 16 buckets
  • 1:1 exemplar to bucket
  • Likely storage per bucket count = 8 bytes
  • Max, min, sum storage = 8*3 = 24 bytes
  • Memory overhead without Exemplars: 168 + 83 = 152 bytes
  • Likely storage per exemplar (best-case pointer to shared attributes, pointer to shared context + measurement + timestamp) = 32 bytes
  • Memory head just for Exemplars = 512 bytes
  • Total overhead / histogram = 664 bytes

Exponential Bucket Histogram

  • Default 320 buckets + overflow (Let's assume negative measurements don't exist)
  • Likely storage per bucket count = ~2 bytes (assuming we're using "adapting integer" algorithm where we scale arrays as counts grow, and we're in a sweet spot of measurement-counts)
  • max, min, sum storage = 8*3 bytes
  • Memory overhead without exemplars - ~640 + 24 = 664 bytes

Given the key size differential, I don't think we should be spending quite as much memory overhead on exemplars. Particularly because Exponential Buckets are intended to give us very high accuracy percentiles. I think we're better of attempting to grab lower number but higher entropy exemplars.

I can update for that aggregator in a follow up PR if that makes sense @jsuereth.

that's good with me!

@MrAlias
Copy link
Contributor Author

MrAlias commented Aug 24, 2023

I can update for that aggregator in a follow up PR [...]

Tracking with #3674

@reyang reyang merged commit 59455fc into open-telemetry:main Aug 24, 2023
7 checks passed
@MrAlias MrAlias deleted the default-fixed-size-res branch August 24, 2023 23:24
mateuszrzeszutek pushed a commit to mateuszrzeszutek/opentelemetry-specification that referenced this pull request Aug 25, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
spec:metrics Related to the specification/metrics directory
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants