output/cloudv2: Compact histogram #3169
Force-pushed from 3019add to 8657eeb
Codecov Report
@@            Coverage Diff             @@
##           master    #3169      +/-   ##
==========================================
- Coverage   72.83%   72.63%   -0.20%
==========================================
  Files         255      253       -2
  Lines       19611    19630      +19
==========================================
- Hits        14283    14259      -24
- Misses       4432     4469      +37
- Partials      896      902       +6
Force-pushed from 8657eeb to 5ba1008
The histogram now uses a more compact solution for storing the distribution. It tracks only the significant buckets.
Force-pushed from 5ba1008 to b7cb2b3
output/cloud/expv2/hdr.go
Outdated
// It does not include counters for the untrackable values,
// because they contain exception cases and require to be tracked in a dedicated way.
Buckets map[uint32]uint32

// Indexes keeps an ordered slice of unique-seen buckets' indexes.
// It allows to iterate the buckets in order. It uses an ascendent order.
Indexes []uint32
We may consider using a B-tree as an optimization, but I would like the protocol to stabilize first.
I wonder if we can't just sort the bucket indexes at the end 🤷. There is no need to constantly keep track of the order if we only want them to be in order 🤷
Yeah, it simplifies the code a lot, good suggestion.
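The suggestion above can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: the `histogram` type, `add`, and `sortedIndexes` are hypothetical names, and only the `Buckets map[uint32]uint32` field mirrors the snippet under review. The idea is to drop the continuously maintained `Indexes` slice and derive the ordered indexes from the map keys once, when the histogram is flushed.

```go
package main

import (
	"fmt"
	"sort"
)

// histogram is a simplified, hypothetical stand-in for the type under review.
type histogram struct {
	// Buckets maps a bucket index to its occurrence count.
	Buckets map[uint32]uint32
}

// add increments the counter of the bucket with the given index.
func (h *histogram) add(index uint32) {
	h.Buckets[index]++
}

// sortedIndexes returns the seen bucket indexes in ascending order.
// The ordering cost is paid once per call instead of on every insert.
func (h *histogram) sortedIndexes() []uint32 {
	indexes := make([]uint32, 0, len(h.Buckets))
	for idx := range h.Buckets {
		indexes = append(indexes, idx)
	}
	sort.Slice(indexes, func(i, j int) bool { return indexes[i] < indexes[j] })
	return indexes
}

func main() {
	h := &histogram{Buckets: make(map[uint32]uint32)}
	for _, idx := range []uint32{42, 7, 42, 19} {
		h.add(idx)
	}
	fmt.Println(h.sortedIndexes()) // [7 19 42]
}
```

The trade-off is one sort per flush against bookkeeping on every sample; which is cheaper in practice would need benchmarks.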
output/cloud/expv2/hdr.go
Outdated
	spans []*pbcloud.BucketSpan
)

// allocate only if at least one item is available
A dumb question: is it possible to have a histogram without values (and indexes)?
Since we only send the metrics observed in the aggregation time period, it should not be possible. The first time we see a metric, we create the histogram structure and add the first sample.
Without values, no. But Indexes and Buckets can be empty when there are only untrackable values, which are tracked in the special extreme buckets.
output/cloud/expv2/hdr.go
Outdated
if h.Buckets == nil {
	h.Buckets = make(map[uint32]uint32)
I think this should be done just when we make the histogram struct.
I moved it to a constructor.
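For illustration, a constructor along these lines removes the need for the `nil` check on every insert. The names `newHistogram`, `addToBucket`, and the `Count` field are assumptions made for this sketch and may not match what the PR ended up with.

```go
package main

import "fmt"

// histogram is a simplified, hypothetical stand-in for the type under review.
type histogram struct {
	// Buckets maps a bucket index to its occurrence count.
	Buckets map[uint32]uint32
	// Count is the total number of samples added (assumed field).
	Count uint32
}

// newHistogram returns a histogram whose map is already allocated,
// so callers never have to guard against a nil Buckets map.
func newHistogram() *histogram {
	return &histogram{Buckets: make(map[uint32]uint32)}
}

// addToBucket records one sample in the bucket with the given index.
// No lazy initialization is needed here thanks to the constructor.
func (h *histogram) addToBucket(index uint32) {
	h.Buckets[index]++
	h.Count++
}

func main() {
	h := newHistogram()
	h.addToBucket(3)
	h.addToBucket(3)
	h.addToBucket(8)
	fmt.Println(h.Count, len(h.Buckets)) // 3 2
}
```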
output/cloud/expv2/hdr.go
Outdated
}
// if the current and the previous indexes are not consecutive
// consider as closed the current on-going span and start a new one.
if diff := h.Indexes[i] - h.Indexes[i-1]; diff > 1 {
For better space usage this should likely be 3+, as adding one span costs at least 2 counters (offset and length).
So we can add at least 2 zeros before we break even. I say "at least" because I don't know whether the protobuf protocol adds more overhead for this.
Edit: this can be done after the original merge, as an optimization.
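The break-even argument can be made concrete with a small sketch. Everything here is an assumption for illustration: `span` is a hypothetical stand-in for `pbcloud.BucketSpan` (its real fields are not shown in this thread), `Offset` is taken to be the absolute index of the first bucket in the run, and `maxZeroFill` is an invented parameter. With `maxZeroFill = 0` the function behaves like the `diff > 1` check under review; with `maxZeroFill = 2` short gaps are padded with zero counters instead of opening a new span.

```go
package main

import "fmt"

// span is a hypothetical stand-in for pbcloud.BucketSpan: a run of
// consecutive buckets starting at Offset and covering Length buckets.
type span struct {
	Offset uint32
	Length uint32
}

// buildSpans encodes bucket indexes as spans plus a flat list of counters.
// indexes must be unique and sorted in ascending order.
// A gap of up to maxZeroFill empty buckets is padded with zero counters
// instead of opening a new span.
func buildSpans(indexes []uint32, buckets map[uint32]uint32, maxZeroFill uint32) ([]span, []uint32) {
	if len(indexes) == 0 {
		return nil, nil
	}
	spans := []span{{Offset: indexes[0], Length: 1}}
	counters := []uint32{buckets[indexes[0]]}
	for i := 1; i < len(indexes); i++ {
		// number of empty buckets between the previous and the current index
		zeros := indexes[i] - indexes[i-1] - 1
		if zeros > maxZeroFill {
			// the gap is too wide: close the current span and start a new one
			spans = append(spans, span{Offset: indexes[i], Length: 1})
		} else {
			// the gap is narrow: pad it with zeros and extend the current span
			for z := uint32(0); z < zeros; z++ {
				counters = append(counters, 0)
			}
			spans[len(spans)-1].Length += zeros + 1
		}
		counters = append(counters, buckets[indexes[i]])
	}
	return spans, counters
}

func main() {
	buckets := map[uint32]uint32{1: 4, 2: 1, 4: 7, 10: 2}
	indexes := []uint32{1, 2, 4, 10}
	for _, fill := range []uint32{0, 2} {
		spans, counters := buildSpans(indexes, buckets, fill)
		// each span carries two values (offset and length)
		fmt.Println(fill, len(spans), len(counters), 2*len(spans)+len(counters))
	}
}
```

For this input, strict splitting yields 3 spans and 4 counters (10 values), while padding gaps of up to 2 zeros yields 2 spans and 5 counters (9 values). The count ignores any additional per-message overhead in the protobuf encoding, which is the uncertainty raised in the comment.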
LGTM! But we can probably address some of the small fixes before merging.
The optimizations can likely wait a while, and maybe for benchmarks ;)
Force-pushed from 3a2f799 to 976a8ff
Great job! 🚀
What?
Implementation of a more compact Protobuf representation for the Histogram type. This new version only stores and pushes the histogram's significant buckets (non-zero).
Why?
The current implementation only trims the zeros at the edges of the histogram, which wastes a lot of memory when the histogram is sparse. The new version instead allocates only for non-zero buckets.
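To make the difference tangible, here is a small sketch comparing the two layouts. It is not taken from the codebase: `trimZeros` and `toSparse` are hypothetical helpers, and the 1000-bucket histogram is made-up input.

```go
package main

import "fmt"

// trimZeros mimics a dense layout that drops only the leading and
// trailing zero buckets; zeros between populated buckets are kept.
func trimZeros(dense []uint32) []uint32 {
	start, end := 0, len(dense)
	for start < end && dense[start] == 0 {
		start++
	}
	for end > start && dense[end-1] == 0 {
		end--
	}
	return dense[start:end]
}

// toSparse keeps only the non-zero buckets, keyed by their index.
func toSparse(dense []uint32) map[uint32]uint32 {
	sparse := make(map[uint32]uint32)
	for i, count := range dense {
		if count != 0 {
			sparse[uint32(i)] = count
		}
	}
	return sparse
}

func main() {
	// a sparse distribution: two populated buckets far apart
	dense := make([]uint32, 1000)
	dense[3] = 5
	dense[900] = 1

	fmt.Println(len(trimZeros(dense))) // 898
	fmt.Println(len(toSparse(dense)))  // 2
}
```

With only two populated buckets, trimming at the edges still keeps 898 counters, while the sparse layout keeps 2 entries.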
Related PR(s)/Issue(s)
Updates #3117