-
Notifications
You must be signed in to change notification settings - Fork 557
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Reduce memory for active series custom trackers #5853
Conversation
activeNativeHistogramBuckets uint32 // Number of buckets in active native histogram entries in this stripe. Only decreased during purge or clear. | ||
activeMatchingNativeHistogramBuckets []uint32 // Number of buckets in active native histogram entries in this stripe matching each matcher of the configured Matchers. | ||
active uint32 // Number of active entries in this stripe. Only decreased during purge or clear. | ||
activeMatching map[uint16]uint32 // Number of active entries in this stripe matching each matcher of the configured Matchers. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the past we didn't use a map because it would be significantly slower to access than slices, if there are few elements. I'm very curious to see the benchmarks for different scenarios (no custom trackers, few custom trackers, many custom trackers).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually in the past we did use a map, and then we switched to a slice. The reason behind this is that usually a series only matches between zero and one tracker (the trackers are usually the integration where they come from) with a few more in some use cases, but never many.
Edit: oh, I see we're not talking about specific series here but about the stripes, in which case you're right, an array is much faster to access, while the gain is probably not that big. A customer with 100 custom matchers would use 4 * 100 * 128 = 50Kb for all stripes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The motivation for this is #5318 (comment) and https://raintank-corp.slack.com/archives/C04AT2ANL90/p1690989928160289
Before @bboreham 's recent change in #5665 it was taking up a lot of memory, and even afterwards we are still tripling the amount needed as we are tracking native histogram series and buckets as well. Discussed with @krajorama and we do need to track those, and it makes sense to use the same set of custom matchers for float series and native histogram series, so we are exploring other ways to minimise the memory usage especially in cases where native histograms are not being used. So we could just implement the change for native histograms and leave the float active series tracking alone, but we thought we might as well explore the possibility of making them consistent.
Benchmark results:
compare_BenchmarkActiveSeries_UpdateSeries$.txt
compare_BenchmarkMatchesSeries.txt
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note in your benchmarks that UpdateSeries
is 5% slower in meaningful cases (when there are series, with or without matchers) while there's no memory improvement for the meaningful cases (there's only improvement when there are no series, saving 150Ki, same order of magnitude that I made above, but when there are 100K series the saving is 0.01% (because it's constant, doesn't depend on the series amount). There's also an increase in allocations, which will slow down the process more (as it generates more garbage to collect).
I wouldn't trade a 5% speed (which stretches with number of series) for 150Ki of constant memory.
2a852bb
to
278add9
Compare
Closing for now due to inactivity, but feel free to re-open it anytime if you plan to get back to this. Thanks!. |
What this PR does
Reduce memory used by active series custom trackers by using maps (sparse) instead of slices. Aim is to at least reduce it for native histograms, but experimenting with applying the same change to normal active series as well, benchmark results below.
Which issue(s) this PR fixes or relates to
Fixes #5812
Checklist
CHANGELOG.md
updated - the order of entries should be[CHANGE]
,[FEATURE]
,[ENHANCEMENT]
,[BUGFIX]