Invalidate hub-wide caches on deletions and overwrites #7525

teh-cmc · 2024-09-26T16:16:58Z

Hub-wide caches now subscribe to store events and invalidate accordingly in the face of deletions and overwrites.

This is a crutch to compensate for the lack of secondary caching, but a much-needed one: the Rerun Viewer can now effectively be used as a soft-realtime telemetry system.
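As a rough illustration of the subscription pattern described above, here is a minimal Python sketch. It is not the Viewer's actual implementation: `StoreEvent`, `Store`, and `DecodedImageCache` are hypothetical names invented for this example, and detecting an overwrite as "a new static row on the same entity" is an assumption made to keep the sketch small.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass(frozen=True)
class StoreEvent:
    """Hypothetical store event: a row was added to or deleted from the store."""

    kind: str  # "addition" or "deletion"
    entity_path: str
    row_id: int
    is_static: bool = False


class Store:
    """Toy store that forwards every event to its subscribers."""

    def __init__(self) -> None:
        self._subscribers: List[Callable[[StoreEvent], None]] = []

    def subscribe(self, callback: Callable[[StoreEvent], None]) -> None:
        self._subscribers.append(callback)

    def emit(self, event: StoreEvent) -> None:
        for callback in self._subscribers:
            callback(event)


class DecodedImageCache:
    """Toy hub-wide cache keyed by the row id the cached value was built from."""

    def __init__(self) -> None:
        # row_id -> (entity_path, decoded payload)
        self._entries: Dict[int, Tuple[str, bytes]] = {}

    def insert(self, entity_path: str, row_id: int, decoded: bytes) -> None:
        self._entries[row_id] = (entity_path, decoded)

    def __contains__(self, row_id: int) -> bool:
        return row_id in self._entries

    def __len__(self) -> int:
        return len(self._entries)

    def on_store_event(self, event: StoreEvent) -> None:
        if event.kind == "deletion":
            # The source row is gone, so the cached value is dead weight.
            self._entries.pop(event.row_id, None)
        elif event.kind == "addition" and event.is_static:
            # Assumption: a new static row overwrites older rows on that entity.
            stale = [
                rid
                for rid, (path, _) in self._entries.items()
                if path == event.entity_path and rid != event.row_id
            ]
            for rid in stale:
                del self._entries[rid]


# Usage: the cache subscribes once, then reacts to whatever the store emits.
store = Store()
cache = DecodedImageCache()
store.subscribe(cache.on_store_event)
cache.insert("image", row_id=1, decoded=b"decoded pixels")
store.emit(StoreEvent("addition", "image", row_id=2, is_static=True))
```

After the last line, the entry built from row 1 has been dropped, which is the behavior the repro scripts in the checklist below are meant to exercise.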

24-09-26_18.23.02.patched.mp4

Fixes #7404: `static=True` image logging still stores all the data

Checklist

`EncodedImage`

for _ in range(0, 100):
    rr.log("image", rr.EncodedImage(path=image_file_path), static=True)
    time.sleep(0.01) # give time for the viewer to query and cache it

Before: 🟥
After: 🟢

`Mesh3D`

for _ in range(0, 100):
    rr.log(
        "triangle",
        rr.Mesh3D(
            vertex_positions=np.tile(np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]), (33333, 1)),
            vertex_normals=[0.0, 0.0, 1.0],
            vertex_colors=[255, 0, 0],
        ),
        static=True,
    )
    time.sleep(0.01) # give time for the viewer to query and cache it

Before: 🟥
After: 🟢

`Asset3D`

for _ in range(0, 100):
    rr.log("world/asset", rr.Asset3D(path=sys.argv[1]), static=True)
    time.sleep(0.01) # give time for the viewer to query and cache it

Before: 🟥
After: 🟢

`TensorData`

for _ in range(0, 1000):
    rr.log("tensor", rr.Tensor(tensor, dim_names=("width", "height", "channel", "batch")), static=True)
    time.sleep(0.01) # give time for the viewer to query and cache it

Before: 🟥
After: 🟢

`AssetVideo`

frame_timestamps_ns = video_asset.read_frame_timestamps_ns()
rr.send_columns(
    "video",
    times=[rr.TimeNanosColumn("video_time", frame_timestamps_ns)],
    components=[rr.VideoFrameReference.indicator(), rr.components.VideoTimestamp.nanoseconds(frame_timestamps_ns)],
)

for _ in range(0, 100):
    rr.log("video", video_asset, static=True)
    time.sleep(0.01) # give time for the viewer to query and cache it

Before: 🟥
After: 🟢

Checklist

I have read and agree to the Contributor Guide and the Code of Conduct
I've included a screenshot or gif (if applicable)
I have tested the web demo (if applicable):
- Using examples from latest main build: rerun.io/viewer
- Using full set of examples from nightly build: rerun.io/viewer
The PR title and labels are set such as to maximize their usefulness for the next release's CHANGELOG
If applicable, add a new check to the release checklist!
I have noted any breaking changes to the log API in CHANGELOG.md and the migration guide

To run all checks from main, comment on the PR with @rerun-bot full-check.

Wumpf

👍 nice

looking forward to making the caches more of a store aware thing so we can streamline some of the quasi-copy-pasted code here, but this will do for now; at least it's flexible enough (via being very stupid/simple) to deal with fallbacks & blueprint shenanigans.
(that's pretty much what you mean when complaining about the lack of secondary caches, right?)

examples/python/face_tracking/face_tracking.py

teh-cmc · 2024-09-27T10:00:34Z

> that's pretty much what you mean when complaining about the lack of secondary caches, right?

A real secondary cache would be aware of query semantics, and any shadowing introduced by these semantics, which is way more powerful and capable than working with row-ids directly.

E.g.:

- it would know to evict temporal data that is shadowed by newly logged static data.
- it would know to evict data that is using latest-at semantics if the same index was relogged to.
- it would know to evict data if its index is shadowed by a (possibly recursive) clear.
- etc.

None of those are possible when working with row-ids directly.
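To make the contrast concrete, here is a toy Python sketch of the first two cases. It is only an illustration under simplifying assumptions, not a design for the real secondary cache: `Row`, `LatestAtCache`, and `on_row_added` are hypothetical names, time is a plain integer, and clears are left out entirely.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Row:
    """Hypothetical logged row; `time=None` stands for static data."""

    row_id: int
    entity_path: str
    time: Optional[int]


class LatestAtCache:
    """Toy cache of latest-at query results, keyed by (entity_path, query_time)."""

    def __init__(self) -> None:
        self._results: Dict[Tuple[str, int], Row] = {}

    def insert(self, entity_path: str, query_time: int, result: Row) -> None:
        self._results[(entity_path, query_time)] = result

    def __contains__(self, key: Tuple[str, int]) -> bool:
        return key in self._results

    def on_row_added(self, new: Row) -> None:
        stale: List[Tuple[str, int]] = []
        for (entity_path, query_time), cached in self._results.items():
            if entity_path != new.entity_path:
                continue
            if new.time is None:
                # Newly logged static data shadows whatever was cached here.
                stale.append((entity_path, query_time))
            elif cached.time is not None and cached.time <= new.time <= query_time:
                # A latest-at query at `query_time` would now return `new`
                # (this also covers relogging to the same index).
                stale.append((entity_path, query_time))
        for key in stale:
            del self._results[key]


cache = LatestAtCache()
cache.insert("points", 10, Row(row_id=1, entity_path="points", time=5))
cache.insert("points", 3, Row(row_id=0, entity_path="points", time=2))
cache.on_row_added(Row(row_id=2, entity_path="points", time=7))  # evicts only the query at t=10
cache.on_row_added(Row(row_id=3, entity_path="points", time=None))  # static: evicts the rest
```

Note that neither eviction is triggered by a deletion event for rows 0 or 1. The cache decides purely from what the query would return now, which is the information a row-id-keyed cache does not have.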

Co-authored-by: Andreas Reich <andreas@rerun.io>

Wumpf · 2024-09-27T10:31:53Z

query awareness is the keyword I was missing for my mental model, makes sense, thanks. Given what visualizer queries look like right now it sounds quite challenging.
I figure a good first/next step would be to have a more unified mechanism to store all data sources of a given query-entry (i.e. a list of rowids potentially from several stores (rowids are unique so the several-stores bit doesn't technically matter)) 🤔. Unfortunately that's not really enough since fallbacks may be generated using heuristic output that may change every frame. In that sense we can't make these caches "pure" secondary caches.
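A minimal Python sketch of that "list of rowids per entry" idea might look as follows. `MultiSourceCache` and its methods are hypothetical names made up for this illustration, row ids are plain integers, and, as noted above, values derived from per-frame heuristic fallbacks have no row id to track, so this scheme cannot invalidate them.

```python
from collections import defaultdict
from typing import Any, Dict, FrozenSet, Hashable, Iterable, Set


class MultiSourceCache:
    """Toy cache whose entries remember every row id they were built from."""

    def __init__(self) -> None:
        self._values: Dict[Hashable, Any] = {}
        self._sources: Dict[Hashable, FrozenSet[int]] = {}
        # Reverse index: row_id -> keys of the entries that depend on it.
        self._by_row: Dict[int, Set[Hashable]] = defaultdict(set)

    def insert(self, key: Hashable, value: Any, source_row_ids: Iterable[int]) -> None:
        self.remove(key)  # drop stale reverse-index links if the key is reused
        sources = frozenset(source_row_ids)
        self._values[key] = value
        self._sources[key] = sources
        for row_id in sources:
            self._by_row[row_id].add(key)

    def remove(self, key: Hashable) -> None:
        self._values.pop(key, None)
        for row_id in self._sources.pop(key, frozenset()):
            keys = self._by_row.get(row_id)
            if keys is not None:
                keys.discard(key)
                if not keys:
                    del self._by_row[row_id]

    def __contains__(self, key: Hashable) -> bool:
        return key in self._values

    def on_rows_deleted(self, row_ids: Iterable[int]) -> None:
        # An entry is invalid as soon as any one of its sources is gone.
        for row_id in row_ids:
            for key in list(self._by_row.get(row_id, ())):
                self.remove(key)


cache = MultiSourceCache()
cache.insert("mesh", "gpu mesh handle", source_row_ids=[1, 2])
cache.insert("image", "gpu texture handle", source_row_ids=[3])
cache.on_rows_deleted([2])  # drops "mesh", keeps "image"
```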

teh-cmc added 5 commits September 26, 2024 18:11

forward store events to hub-wide caches

72a90cf

invalidate image caches on deletions and overwrites

1b79ec1

invalidate mesh cache on deletions and overwrites

27e4d4b

invalidate tensor cache on deletions and overwrites

0734138

invalidate video cache on deletions and overwrites

718409b

teh-cmc added the 📺 re_viewer, 📉 performance, and include in changelog labels Sep 26, 2024

add --static flag to face_tracking example for future reference

33f53a2

teh-cmc marked this pull request as ready for review September 26, 2024 16:56

teh-cmc requested a review from Wumpf September 26, 2024 17:02

Wumpf approved these changes Sep 27, 2024

examples/python/face_tracking/face_tracking.py (outdated, resolved)

teh-cmc and others added 2 commits September 27, 2024 12:01

Update examples/python/face_tracking/face_tracking.py

09ad22a

Co-authored-by: Andreas Reich <andreas@rerun.io>

lint

7501d08

teh-cmc merged commit f5aa0a0 into main Sep 27, 2024
24 of 27 checks passed

teh-cmc deleted the cmc/viewer_cache_cleanup_2_subscriptions branch September 27, 2024 10:06
