
Tracking issue: end-to-end batching #1619

Closed
14 of 18 tasks
teh-cmc opened this issue Mar 20, 2023 · 2 comments · Fixed by #1985

Comments


teh-cmc commented Mar 20, 2023

  • Will create individual issues as the need arises.
  • Most likely an evolving document.

RFC


  • Move DataStore sanity checks and formatting tools to separate files
    store.rs is supposed to be the place where one can get an overview of all the data structures involved in the store, except it has slowly become a mess over time and is now pretty much unreadable.

  • Implement all the needed tests & benchmarks
    We need to be able to check for regressions at every step, so make sure we have all the tests and benchmarks we need for that.
    We should already be 95% of the way there at this point.

  • Replace MsgBundle & ComponentBundle with the new types (DataCell, DataRow, DataTable, EventId, BatchId...)
    No actual batching features nor behavior changes of any kind: just define the new types and use them everywhere.

  • Pass entity path as a column rather than as metadata
    Replace the current entity_path that is passed in the metadata map with an actual column. This will also require us to make EntityPath a proper arrow datatype (..datatype, not component!!).

  • Make sure implicit instance counts have been wiped everywhere #1892
    Issue created; not blocking for batching.

  • Eliminate legacy splats #1893
    Issue created; not blocking for batching.

  • Get rid of component buckets altogether
    Update the store implementation to remove component tables, remove the get APIs, introduce slicing on the write path, etc. Still no batching in sight!

  • SDK-side log batching #1880

  • Implement the coalescing/accumulation logic in the SDK
    Add the required logic/thread/timers/whatever-else in the SDKs to accumulate data and just send it all as many LogMsgs (i.e. no batching yet). A rough sketch of the idea follows this list.

  • Implement full-on batching
    End-to-end: transport, storage, the whole shebang.

  • Sort the batch before sending (by (event_id, entity_path))
    Keep that in its own PR to keep track of the benchmarks. (The sketch after this list includes the sort step.)

  • Implement new GC
    The complete implementation; should close all existing GC issues. (A second sketch after this list outlines the approach.)

  • Dump directly from the store into an rrd file
    No rebatching yet, just dump every event in its own LogMsg.

  • Remove LogMsgs from LogDb
    We shouldn't need to keep track of events outside the store past this point: clean it all up.
    Reminder: the timeline widget keeps track of timepoints directly, not events.

  • Rebatch aggressively while dumping the store to a stream of LogMsg #1894
    Issue created; not blocking for batching.

  • Make log_time column implicit and potentially introduce ingest_time #1891
    Issue created; not blocking for batching.

  • A Component's DataType should embed its metadata #1696
    Issue created; not blocking for batching.

  • re_datastore: replace anyhow::Error usage with a thiserror derived Error type #527
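
A minimal sketch of the SDK-side accumulation step mentioned above, using std-only stand-ins (`DataRow`, `Batcher`, and `send` are illustrative names here, not the actual SDK API). The real implementation would run this off a dedicated thread with proper timers; this version only checks the deadline on push, for brevity:

```rust
use std::time::{Duration, Instant};

struct DataRow {
    event_id: u64, // stand-in for the real 16-byte id
    entity_path: String,
    // component cells elided
}

struct Batcher {
    pending: Vec<DataRow>,
    last_flush: Instant,
    max_rows: usize,
    max_latency: Duration,
}

impl Batcher {
    fn new(max_rows: usize, max_latency: Duration) -> Self {
        Self {
            pending: Vec::new(),
            last_flush: Instant::now(),
            max_rows,
            max_latency,
        }
    }

    /// Called on every log call; flushes once either threshold is hit.
    fn push(&mut self, row: DataRow) {
        self.pending.push(row);
        if self.pending.len() >= self.max_rows || self.last_flush.elapsed() >= self.max_latency {
            self.flush();
        }
    }

    /// Sorts by `(event_id, entity_path)`, then ships everything as one batch.
    fn flush(&mut self) {
        if self.pending.is_empty() {
            return;
        }
        self.pending
            .sort_by(|a, b| (a.event_id, &a.entity_path).cmp(&(b.event_id, &b.entity_path)));
        let batch = std::mem::take(&mut self.pending);
        self.last_flush = Instant::now();
        send(batch);
    }
}

fn send(batch: Vec<DataRow>) {
    // Stand-in for serializing the whole batch into a single `LogMsg`.
    println!("sending {} rows", batch.len());
}
```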
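
And an equally hypothetical sketch of the "precise" part of the new GC (again, illustrative names only): since the store tracks each row's exact byte size at insertion time, the collector can drop rows oldest-first until it is back under budget, with no estimation involved:

```rust
use std::collections::VecDeque;

struct StoredRow {
    event_id: u64,
    num_bytes: u64, // exact size, tracked at insertion time
}

/// Drops the oldest rows until `used` fits within `budget`;
/// returns the ids of everything that was dropped.
fn collect_garbage(rows: &mut VecDeque<StoredRow>, used: &mut u64, budget: u64) -> Vec<u64> {
    let mut dropped = Vec::new();
    while *used > budget {
        let Some(row) = rows.pop_front() else { break };
        *used -= row.num_bytes;
        dropped.push(row.event_id);
    }
    dropped
}
```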

@teh-cmc teh-cmc added enhancement New feature or request 🐍 Python API Python logging API 🏹 arrow concerning arrow 🦀 Rust API Rust logging API 🎄 tracking issue issue that tracks a bunch of subissues ⛃ re_datastore affects the datastore itself 🚀 performance Optimization, memory use, etc labels Mar 20, 2023
@teh-cmc teh-cmc self-assigned this Mar 20, 2023

teh-cmc commented Apr 17, 2023

Copy-pasting the Discord thread regarding Data{Cell,Row,Table} and the new datastore, for posterity.

===

Hey folks, quick update on the data front: we've been putting a lot of effort into redesigning and reimplementing some of our core data structures and pipelines lately. The goal is to make them align better with the user-facing data model that we've been refining for the past year.

In practice, this already translates into very significant compute & memory performance improvements across the stack starting today, and paves the way for even more of those in the future (ingestion speed, query speed, memory usage, network bandwidth, garbage collection throughput & latency...).

These changes are available right now on latest main (starting with 925f531), and should ship as part of the next (0.5.0) release.
Note: this breaks compatibility for .rrd files, you'll have to regenerate those!


The first big chunk of this work was the introduction of new core data types to abstract over raw Arrow data: DataCell, DataRow & DataTable (#1634, #1636, #1673, #1679).

These new abstractions make it much more manageable to work efficiently with raw Arrow data across the entire stack (SDK, transport, datastore, query layer... all the way from the clients up to the renderer!), as well as guard against common Arrow pitfalls.
It is now easier to implement new data-centric features, one example of which is micro-batching: an upcoming feature (#1619) for our SDKs that will significantly improve network bandwidth and ingestion speeds.
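
For readers catching up, the three types nest as cell, then row, then table. A rough, self-contained sketch of the layering; these are plain Rust stand-ins, not the actual definitions, which live in re_log_types and wrap arrow2 arrays:

```rust
/// One component's worth of data for one row,
/// e.g. all the positions of a point cloud at one timestamp.
struct DataCell {
    component_name: String,
    values: Vec<u8>, // stand-in for a type-erased arrow2 array
}

/// One event: a single entity path and timepoint,
/// with one cell per logged component.
struct DataRow {
    row_id: u64, // the real id is 16 bytes (see the comment below)
    entity_path: String,
    num_instances: u32,
    cells: Vec<DataCell>,
}

/// A batch of rows, serialized and shipped as a single unit.
struct DataTable {
    table_id: u64,
    rows: Vec<DataRow>,
}
```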


Then comes the new datastore itself (#1727, #1735, #1739, #1785, #1791, #1795, #1801), which builds upon these new types and gets the store's internals closer to the overarching data model.

The result is much faster query speeds and drastically reduced memory usage.
This new store also comes with a precise garbage collector that should never miss a single byte, meaning you can now use our memory limit feature (https://www.rerun.io/docs/howto/limit-ram) to stream in never-ending workloads.

Applications that put the most stress on the store will of course be the ones benefiting the most from these changes.
Since the store's performance scales with the number of events (i.e. log calls) being stored rather than with the size of the data, applications that use a lot of scalars/plots, text logs, range queries (e.g. the Visible History feature), and other workloads of that nature (i.e. many small events rather than a few large ones) will see the most drastic improvements.


To demonstrate all of this we can use our official clocks example, coupled with the Visible History feature, which is the ultimate stress test for our datastore.

Running the simulation for 50'000 frames, then replaying it at 180x speed with 1000 frames of visible history buffer for the minute hand of the clock 👇

Before: ~15ms per frame / ~4.5GiB of RAM required:

23-04-13_144446.patched.mp4

After: ~7ms per frame / ~920MiB of RAM required:

23-04-13_144915.patched.mp4

So, roughly a ~2x improvement in frame times and ~5x in memory usage!


emilk commented Apr 17, 2023

The win from not logging log_time is quite small. The RowId is 16 bytes, and the log_time column is 8 bytes, so even for zero-sized components the memory wins will be at most 33%. I'm not sure that justifies the added complexity at this point.
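(Spelling out that bound: a zero-sized component still costs 16 + 8 = 24 bytes of fixed overhead per row, so removing the log_time column saves at most 8 / 24 ≈ 33%, and proportionally less once actual component data is attached.)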
