Automatic removal of unreachable static chunks #7518

Merged
merged 2 commits into from
Sep 26, 2024

Conversation

teh-cmc
Member

@teh-cmc teh-cmc commented Sep 26, 2024

Unreachable static chunks are dead weight: no query can access their data (at least when using Rerun as a visualizer).

By automatically removing dangling chunks, we make it possible for users to use the Rerun Viewer as a soft-realtime telemetry system (provided we also properly invalidate our caches, which is the subject of an upcoming PR).

This raises the question of what should happen when using Rerun as a database: should this data be kept and made accessible?
If so, this behavior should probably be made configurable (e.g. when instantiating a ChunkStore in the SDK).
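
To make "unreachable" concrete, here is a minimal toy sketch in Python. It is not Rerun's actual implementation: `ToyStaticStore`, `insert_static`, and the component names `"blob"` and `"media_type"` are hypothetical, and the sketch simply assumes that the latest static write wins per (entity, component) pair, and that a chunk becomes unreachable once none of its pairs point at it anymore.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class ToyStaticStore:
    """Toy model (hypothetical): the latest static write per (entity, component) wins."""

    # chunk id -> set of (entity, component) pairs carried by that chunk
    chunks: dict[int, set[tuple[str, str]]] = field(default_factory=dict)
    # (entity, component) -> id of the chunk that currently answers queries for it
    latest: dict[tuple[str, str], int] = field(default_factory=dict)
    next_id: int = 0

    def insert_static(self, entity: str, components: list[str]) -> list[int]:
        """Insert a static chunk; return the ids of chunks that became unreachable."""
        chunk_id = self.next_id
        self.next_id += 1

        keys = {(entity, component) for component in components}
        self.chunks[chunk_id] = keys
        for key in keys:
            self.latest[key] = chunk_id

        # A chunk is reachable iff it is still the latest for at least one of its pairs.
        reachable = set(self.latest.values())
        dropped = sorted(cid for cid in self.chunks if cid not in reachable)
        for cid in dropped:
            del self.chunks[cid]
        return dropped


# Mirrors the shape of the test script: the same static data logged ten times.
store = ToyStaticStore()
for _ in range(10):
    store.insert_static("image", ["blob", "media_type"])
print(len(store.chunks))  # only the most recent chunk is left
```

Note that in this toy model a chunk that is only partially overwritten (say, only `"blob"` is logged again) stays alive, because its remaining pairs still point at it.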


Test:

from pathlib import Path

import rerun as rr

image_file_path = Path(__file__).parent / "ferris.png"

rr.init("rerun_example_encoded_image", spawn=True)

for _ in range(0, 10):
    rr.log("image", rr.EncodedImage(path=image_file_path), static=True)

Command:

RERUN_FLUSH_NUM_ROWS=0 python test.py

Before:
image

After:
image

Checklist

  • I have read and agree to the Contributor Guide and the Code of Conduct
  • I've included a screenshot or gif (if applicable)
  • I have tested the web demo (if applicable):
  • The PR title and labels are set such as to maximize their usefulness for the next release's CHANGELOG
  • If applicable, add a new check to the release checklist!
  • I have noted any breaking changes to the log API in CHANGELOG.md and the migration guide

To run all checks from main, comment on the PR with @rerun-bot full-check.

@teh-cmc teh-cmc added ⛃ re_datastore affects the datastore itself 🚀 performance Optimization, memory use, etc include in changelog 🔩 data model labels Sep 26, 2024
@Wumpf Wumpf self-requested a review September 26, 2024 12:50
Member

@Wumpf Wumpf left a comment

so but what if I have a static image+mediatype and then were to override only the image? Then I'd always keep the first image+mediatype chunk around?

@teh-cmc
Member Author

teh-cmc commented Sep 26, 2024

so but what if I have a static image+mediatype and then were to override only the image? Then I'd always keep the first image+mediatype chunk around?

This is the subject of the next PR, but spoiler: no. If you overwrite the buffer, you'll drop every media-type instantiation of that image.

@teh-cmc teh-cmc merged commit ca44717 into main Sep 26, 2024
38 checks passed
@teh-cmc teh-cmc deleted the cmc/viewer_cache_cleanup_1_static_overwrites branch September 26, 2024 15:19