Primary caching 13: stats & memory panel integration for range queries #4785

teh-cmc · 2024-01-11T12:20:20Z

Title.

24-01-11_13.21.53.patched.mp4

Part of the primary caching series of PR (index search, joins, deserialization):

Checklist

I have read and agree to Contributor Guide and the Code of Conduct
I've included a screenshot or gif (if applicable)
I have tested the web demo (if applicable):
- Using newly built examples: app.rerun.io
- Using examples from latest main build: app.rerun.io
- Using full set of examples from nightly build: app.rerun.io
The PR title and labels are set such as to maximize their usefulness for the next release's CHANGELOG

Wumpf

looking great, but that global switch weirds me out 😄

Wumpf · 2024-01-12T16:13:39Z

crates/re_query_cache/src/cache_stats.rs

@@ -20,6 +20,19 @@ pub fn set_detailed_stats(b: bool) {
    ENABLE_DETAILED_STATS.store(b, std::sync::atomic::Ordering::Relaxed);
 }

+/// If `true`, will show stats about empty caches too, which likely indicates a bug (dangling bucket).


meaning empty caches should be removed automatically?

I find it a bit weird to encounter this atomic here. Isn't that just a property of the ui? We should put it on the MemoryPanel struct instead.
Personally, I'd just always enable this anyways doesn't seem to hurt 🤷

> meaning empty caches should be removed automatically?

Yes, or not exist to begin with: either way, empty buckets are a bug and we want to be able to see them.

> I find it a bit weird to encounter this atomic here. Isn't that just a property of the ui? We should put it on the MemoryPanel struct instead.

I just blindly followed how the other memory panel checkboxes are implemented 🤷‍♂️.
I'll see if I can easily make it a part of the panel.

> Personally, I'd just always enable this anyways doesn't seem to hurt 🤷

No, it's insanely costly with a real dataset.

thanks for upgrading all of them, liking this much better :)
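
For readers following the thread, the two designs under discussion can be sketched roughly as below. This is only an illustration: `SHOW_EMPTY_CACHES`, `set_show_empty_caches`, `show_empty_caches`, the `show_empty_caches` field and `visible_caches` are hypothetical names, and the code that was actually merged may look different.

```rust
use std::sync::atomic::{AtomicBool, Ordering};

// Option A: a process-wide flag, the same shape as the `ENABLE_DETAILED_STATS`
// atomic in the diff above. `SHOW_EMPTY_CACHES` is a hypothetical name.
static SHOW_EMPTY_CACHES: AtomicBool = AtomicBool::new(false);

pub fn set_show_empty_caches(b: bool) {
    SHOW_EMPTY_CACHES.store(b, Ordering::Relaxed);
}

pub fn show_empty_caches() -> bool {
    SHOW_EMPTY_CACHES.load(Ordering::Relaxed)
}

// Option B: the setting is plain UI state and is passed explicitly to
// whatever computes the stats.
#[derive(Default)]
pub struct MemoryPanel {
    pub show_empty_caches: bool,
}

/// Hypothetical stats filter over `(cache name, number of entries)` pairs:
/// empty caches are only listed when `show_empty` is set.
pub fn visible_caches<'a>(all: &'a [(&'a str, usize)], show_empty: bool) -> Vec<&'a str> {
    all.iter()
        .filter(|(_, len)| show_empty || *len > 0)
        .map(|(name, _)| *name)
        .collect()
}
```

With option B the stats code never touches global state; the panel would call something like `visible_caches(&stats, panel.show_empty_caches)`.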

_99% grunt work, the only somewhat interesting thing happens in `query_archetype`_

Our query model always operates with two distinct timestamps: the timestamp you're querying for (`query_time`) vs. the timestamp of the data you get back (`data_time`).

This is the result of our latest-at semantics: a query for a point at time `10` can return a point at time `2`. This is important to know when caching the data: a query at time `4` and a query at time `8` that both return the data at time `2` must share the same single entry or the memory budget would explode.

This PR just updates all existing latest-at APIs so they return the data time in their response. This was already the case for range APIs.

Note that in the case of `query_archetype`, which is a compound API that emits multiple queries, the data time of the final result is the most recent data time among all of its components.

A follow-up PR will use the data time to deduplicate entries in the latest-at cache.

---

Part of the primary caching series of PR (index search, joins, deserialization):
- #4592
- #4593
- #4659
- #4680
- #4681
- #4698
- #4711
- #4712
- #4721
- #4726
- #4773
- #4784
- #4785
- #4793
- #4800
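
The `query_time` vs. `data_time` distinction can be illustrated with a small sketch. It assumes a toy timeline stored in a `BTreeMap`; `Timeline`, `latest_at` and `compound_data_time` are hypothetical names and not the actual `re_query` API.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for one component's timeline: data_time -> value.
type Timeline = BTreeMap<i64, &'static str>;

/// Latest-at: returns the most recent entry at or before `query_time`,
/// together with the time that entry was actually logged at (`data_time`).
fn latest_at(timeline: &Timeline, query_time: i64) -> Option<(i64, &'static str)> {
    timeline
        .range(..=query_time)
        .next_back()
        .map(|(data_time, value)| (*data_time, *value))
}

/// Compound query over several components: the data time of the result is
/// the most recent data time among the components that returned something.
fn compound_data_time(timelines: &[&Timeline], query_time: i64) -> Option<i64> {
    timelines
        .iter()
        .filter_map(|timeline| latest_at(timeline, query_time))
        .map(|(data_time, _)| data_time)
        .max()
}
```

In this model, queries at times `4` and `8` against a timeline whose only earlier entry sits at time `2` both come back with `data_time == 2`.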

…ation (#4712)

Introduces the notion of cache deduplication: given a query at time `4` and a query at time `8` that both return data at time `2`, they must share a single cache entry.

I.e. starting with this PR, scrubbing through the OPF example will not result in more cache memory being used.
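
One way to picture the deduplication is a two-level lookup, where each query time maps to a data time and the payload is stored once per data time. This is a sketch under that assumption; `LatestAtCache` and its fields are hypothetical and do not mirror the real cache layout.

```rust
use std::collections::{BTreeMap, HashMap};
use std::sync::Arc;

/// Hypothetical latest-at cache: one shared entry per *data* time.
#[derive(Default)]
struct LatestAtCache {
    /// query_time -> data_time: cheap, one integer per distinct query time.
    per_query_time: HashMap<i64, i64>,
    /// data_time -> payload: the expensive part, stored exactly once.
    per_data_time: HashMap<i64, Arc<Vec<u8>>>,
}

impl LatestAtCache {
    /// `store` is a toy stand-in for the datastore: data_time -> payload.
    fn query(&mut self, store: &BTreeMap<i64, Vec<u8>>, query_time: i64) -> Option<Arc<Vec<u8>>> {
        // Fast path: this exact query time has been seen before.
        if let Some(data_time) = self.per_query_time.get(&query_time) {
            return self.per_data_time.get(data_time).cloned();
        }
        // Slow path: run the latest-at query, then file the result under its data time.
        let (data_time, payload) = store.range(..=query_time).next_back()?;
        self.per_query_time.insert(query_time, *data_time);
        let entry = self
            .per_data_time
            .entry(*data_time)
            .or_insert_with(|| Arc::new(payload.clone()));
        Some(Arc::clone(entry))
    }
}
```

Scrubbing through many query times that all resolve to the same data time only grows the small `per_query_time` map; the payload itself is shared through the `Arc`.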

Introduces a dedicated cache bucket for timeless data and properly forwards the information through all APIs downstream.

This implements cache invalidation via a `StoreSubscriber`.

We keep track of the timestamps to invalidate in the `StoreSubscriber`, but we only do the actual removal of components at query time.

This is similar to how we handle bucket sorting in the main store: doing it at query time has the benefit that the frame time effectively behaves as a natural micro-batching mechanism that vastly improves performance.
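
The record-now, remove-later pattern described above might look roughly like this. The `Cache` type, its fields and `on_store_event` are hypothetical, and the sketch makes the simplifying assumption that a store event at time `t` invalidates only the entry at exactly `t`; the real subscriber is wired through `StoreSubscriber` and is not reproduced here.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Hypothetical cache whose invalidation is recorded eagerly but applied lazily.
#[derive(Default)]
struct Cache {
    entries: BTreeMap<i64, String>,
    /// Timestamps reported by the store since the last query.
    pending_invalidations: BTreeSet<i64>,
}

impl Cache {
    /// Called from the store-event callback: cheap, it only records the timestamp.
    fn on_store_event(&mut self, time: i64) {
        self.pending_invalidations.insert(time);
    }

    /// Called at query time: applies every pending invalidation in one batch,
    /// then serves the query from whatever is left.
    fn query(&mut self, time: i64) -> Option<&String> {
        for t in std::mem::take(&mut self.pending_invalidations) {
            self.entries.remove(&t);
        }
        self.entries.get(&time)
    }
}
```

However many store events arrive between two frames, the removal work happens once per query, which is the micro-batching effect the description refers to.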

The primary cache now tracks memory statistics and displays them in the memory panel.

This immediately highlights a very stupid thing that the cache does: missing optional components that have been turned into streams of default values by the `ArchetypeView` are materialized as such :man_facepalming:
- #4779

https://github.com/rerun-io/rerun/assets/2910679/876b264a-3f77-4d91-934e-aa8897bb32fe

- Fixes #4730

**Prefer reviewing on a per-commit basis, stuff has moved around**

Range queries are back!... in the most primitive form possible.
No invalidation, no bucketing, no optimization, no nothing. Just putting everything in place.

https://github.com/rerun-io/rerun/assets/2910679/a65281e4-9843-4598-9547-ce7e45197995

All done!

… range queries (#4793)

Our low-level range APIs used to bake the latest-at results at `range.min - 1` into the range results, which is a big problem in a multi-tenant setting because `range(1, 10)` vs. `latestat(1) + range(2, 10)` are two completely different things.

Side-effect: a plot with a window of len 1 now behaves as expected:

https://github.com/rerun-io/rerun/assets/2910679/957ac367-35a6-4bea-9f40-59d51c556639

The most obvious and most important performance optimization when doing cached range queries: only upsert data at the edges of the bucket / ring-buffer.

This works because our buckets (well, singular, at the moment) are always dense.

- #4793

  ![image](https://github.com/rerun-io/rerun/assets/2910679/7246827c-4977-4b3f-9ef9-f8e96b8a9bea)
- #4800:

  ![image](https://github.com/rerun-io/rerun/assets/2910679/ab78643b-a98b-4568-b510-2b8827467095)
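
The edge-only upsert can be sketched as follows, assuming a single dense bucket backed by a `VecDeque` and a toy `BTreeMap` store. `RangeBucket` and `upsert` are hypothetical names; the real bucket layout is not shown in this thread.

```rust
use std::collections::{BTreeMap, VecDeque};

/// Hypothetical dense range bucket: it holds *every* store entry between its
/// first and last cached timestamp, sorted by time.
#[derive(Default)]
struct RangeBucket {
    entries: VecDeque<(i64, u32)>,
}

impl RangeBucket {
    /// Makes sure `[min, max]` is cached, reading from `store` only the parts
    /// that fall outside of what is already cached.
    /// Returns how many entries had to be read from the store.
    fn upsert(&mut self, store: &BTreeMap<i64, u32>, min: i64, max: i64) -> usize {
        if min > max {
            return 0;
        }
        // Time bounds of what is currently cached, if anything.
        let cached = match (self.entries.front(), self.entries.back()) {
            (Some(front), Some(back)) => Some((front.0, back.0)),
            _ => None,
        };
        let mut reads = 0;
        match cached {
            None => {
                for (time, value) in store.range(min..=max) {
                    self.entries.push_back((*time, *value));
                    reads += 1;
                }
            }
            Some((front, back)) => {
                if min < front {
                    // Front edge: walk newest-first so `push_front` keeps the order.
                    for (time, value) in store.range(min..front).rev() {
                        self.entries.push_front((*time, *value));
                        reads += 1;
                    }
                }
                if max > back {
                    // Back edge: everything strictly after the last cached entry.
                    for (time, value) in store.range(back + 1..=max) {
                        self.entries.push_back((*time, *value));
                        reads += 1;
                    }
                }
            }
        }
        reads
    }
}
```

Because the bucket is dense, anything strictly inside the cached bounds is known to be present already, so a query that falls entirely within them reads nothing from the store.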

Range queries used to A) return the frame at T-1, B) accumulate state starting at T-1 and then C) yield frames starting at T.

A) was a huge issue for many reasons, which #4793 took care of by eliminating both A) and B).

But we need B) for range queries to be context-free, i.e. to be guaranteed that `Range(5, 10)` and `Range(4, 10)` will return the exact same data for frame `5`.
This is crucial for multi-tenant settings where those 2 example queries would share the same cache.

It also is the nicer-nicer version of the range semantics that we wanted anyway, I just didn't realize back then that it would require so few changes, or I would've gone straight for that.

---

Part of the primary caching series of PR (index search, joins, deserialization):
- #4592
- #4593
- #4659
- #4680
- #4681
- #4698
- #4711
- #4712
- #4721
- #4726
- #4773
- #4784
- #4785
- #4793
- #4800
- #4851
- #4852
- #4853
- #4856
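
A toy model of the context-free semantics, for illustration only: state logged before the range start is accumulated without being yielded (B), and frames are yielded from the range start onwards (C). `Store`, `Frame` and `range` are hypothetical, and treating "state starting at T-1" as "everything logged before `min`" is an assumption of this sketch.

```rust
use std::collections::BTreeMap;

/// Hypothetical store: time -> component updates logged at that time.
type Store = BTreeMap<i64, Vec<(&'static str, i32)>>;
/// A frame is the accumulated value of every component seen so far.
type Frame = BTreeMap<&'static str, i32>;

fn range(store: &Store, min: i64, max: i64) -> Vec<(i64, Frame)> {
    let mut state = Frame::new();
    // B) accumulate everything logged before `min`, without yielding it.
    for (_, updates) in store.range(..min) {
        state.extend(updates.iter().copied());
    }
    // C) yield one frame per timestamp in `[min, max]`.
    let mut frames = Vec::new();
    if min <= max {
        for (time, updates) in store.range(min..=max) {
            state.extend(updates.iter().copied());
            frames.push((*time, state.clone()));
        }
    }
    frames
}
```

In this model the frame yielded for time `5` depends only on the store contents up to `5`, never on where the range started, which is what lets two differently bounded queries share cached frames.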

Simply add a timeless path for the range cache, and actually only iterate over the range the user asked for (we were still blindly iterating over everything until now).

Also some very minimal clean up related to #4832, but we have a long way to go...
- #4832

---

- Fixes #4821

Implement range invalidation and do a quality pass over all the size tracking stuff in the cache.

**Range caching is now enabled by default!**

- Fixes #4809
- Fixes #374

- Quick sanity pass over all the intermediary locks and refcounts to make sure we don't hold anything for longer than we need.
- Get rid of all static globals and let the caches live with their associated stores in `EntityDb`.
- `CacheKey` no longer requires a `StoreId`.

---

- Fixes #4815

teh-cmc added 🔍 re_query, 🚀 performance, do-not-merge, exclude from changelog labels Jan 11, 2024

teh-cmc force-pushed the cmc/primcache_13_range_stats branch 2 times, most recently from a9af62b to 77a0113 Compare January 12, 2024 07:37

This was referenced Jan 12, 2024

Primary caching 14: don't bake LatestAt(T-1) results into low-level range queries #4793

Merged

Primary caching 15: range read performance optimization #4800

Merged

Wumpf self-requested a review January 12, 2024 16:18

Wumpf previously requested changes Jan 12, 2024

teh-cmc force-pushed the cmc/primcache_12_barebone_range branch from 17b548b to 5fb6abd Compare January 15, 2024 12:26

Base automatically changed from cmc/primcache_12_barebone_range to main January 15, 2024 14:49

teh-cmc added 4 commits January 15, 2024 15:51

implement stats for range cache

16e9f79

improve timerange formatting

2d54599

turns out we _really_ need some scroll areas in there

52a1d27

toggable 'show_empty_caches'

a79d00d

teh-cmc force-pushed the cmc/primcache_13_range_stats branch from 77a0113 to a79d00d Compare January 15, 2024 14:51

teh-cmc removed the do-not-merge label Jan 15, 2024

out-of-band stats settings

53e051a

teh-cmc force-pushed the cmc/primcache_13_range_stats branch from 9ef1839 to 53e051a Compare January 15, 2024 15:15

teh-cmc merged commit bdae240 into main Jan 15, 2024
22 of 31 checks passed

teh-cmc deleted the cmc/primcache_13_range_stats branch January 15, 2024 15:17

This was referenced Jan 18, 2024

Primary caching 16: context-free range semantics #4851

Merged

Primary caching 17: timeless range #4852

Merged

Primary caching 18: range invalidation (ENABLED BY DEFAULT 🎊) #4853

Merged

Primary caching 19 (final): de-staticify cache globals #4856

Merged

teh-cmc added include in changelog and removed exclude from changelog labels Feb 6, 2024
