datastore: serialize back into a stream of MsgBundles #1527

Closed
teh-cmc wants to merge 8 commits

Conversation

teh-cmc commented Mar 7, 2023

This PR adds DataStore::as_msg_bundles(), which serializes a DataStore back into a stream of MsgBundles that is functionally equivalent to the original stream it was built from.

Some shortcuts are taken, which is why the output stream is functionally equivalent but not yet identical to the original input stream: in particular, autogenerated cluster keys are dumped as if they were user-defined. That's for another PR.
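To illustrate what "functionally equivalent but not identical" means here, the following is a minimal toy sketch. `ToyMsg`, `ToyStore`, `from_msgs` and `as_msgs` are hypothetical stand-ins invented for this illustration; they are not the real `MsgBundle` or `DataStore` types, and the actual `as_msg_bundles()` implementation may work quite differently. The idea is only that a store dumped in index order can yield a stream that differs from the input, yet rebuilds into an equal store.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for a message; the real `MsgBundle` is much richer.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct ToyMsg {
    pub entity: String,
    pub time: i64,
    pub payload: Vec<u8>,
}

/// Hypothetical stand-in for the store: payloads indexed by (entity, time).
#[derive(Debug, Default, PartialEq, Eq)]
pub struct ToyStore {
    rows: BTreeMap<(String, i64), Vec<Vec<u8>>>,
}

impl ToyStore {
    /// Inserts one message into the index.
    pub fn insert(&mut self, msg: &ToyMsg) {
        self.rows
            .entry((msg.entity.clone(), msg.time))
            .or_default()
            .push(msg.payload.clone());
    }

    /// Builds a store from a stream of messages, in arrival order.
    pub fn from_msgs(msgs: &[ToyMsg]) -> Self {
        let mut store = Self::default();
        for msg in msgs {
            store.insert(msg);
        }
        store
    }

    /// Dumps the store back into messages, in index order.
    /// This order need not match the order the messages arrived in.
    pub fn as_msgs(&self) -> impl Iterator<Item = ToyMsg> + '_ {
        self.rows.iter().flat_map(|((entity, time), payloads)| {
            payloads.iter().map(move |payload| ToyMsg {
                entity: entity.clone(),
                time: *time,
                payload: payload.clone(),
            })
        })
    }
}
```

In this toy model, feeding the dumped stream into `ToyStore::from_msgs` produces a store equal to the original one, even when the dumped stream is ordered differently from the input.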

This gets us most of the way towards #1394, although we'll still need another PR to integrate it all into the save-to-file logic.

This also fixes a nasty instability issue when sorting the dataframes used for testing, which cost me some hair while writing this and may well be the reason for the flaky gc_correct test we've seen for a while.
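For readers unfamiliar with this class of bug, here is a minimal sketch of how sort instability can make test comparisons flaky. It is not the actual change made to store_polars.rs; the `Row` type and both functions are hypothetical and use plain slices rather than dataframes. The point is that sorting on a partial key leaves tied rows in an unspecified order, whereas sorting on every column gives the same result whatever the input order.

```rust
/// Hypothetical row used for illustration only; not taken from the PR.
/// The derived `Ord` compares fields in declaration order:
/// entity, then time, then value.
#[derive(Debug, Clone, PartialEq, Eq, PartialOrd, Ord)]
pub struct Row {
    pub entity: String,
    pub time: i64,
    pub value: i64,
}

/// Sorts on a partial key. Rows that tie on `(entity, time)` may end up in
/// any relative order, because `sort_unstable_by` does not preserve the
/// order of equal elements. Comparing two such "sorted" tables can
/// therefore fail intermittently.
pub fn sort_partial_key(rows: &mut [Row]) {
    rows.sort_unstable_by(|a, b| (&a.entity, a.time).cmp(&(&b.entity, b.time)));
}

/// Sorts on every column. Two rows only compare equal if they are fully
/// identical, so the result is the same for any permutation of the input.
pub fn sort_total_key(rows: &mut [Row]) {
    rows.sort_unstable();
}
```

With `sort_total_key`, two permutations of the same rows always sort to the same sequence, which is what an equality check between two tables needs.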

teh-cmc added the "enhancement" (New feature or request) and "⛃ re_datastore" (affects the datastore itself) labels Mar 7, 2023

teh-cmc commented Mar 7, 2023

Just realized we'll ultimately want to be able to specify an optional time range (e.g. for saving a selection); this is trivial to do at this point and will come in due time in another PR.

teh-cmc changed the title from "re_datastore: serialize back into a stream of MsgBundles" to "datastore: serialize back into a stream of MsgBundles" Mar 7, 2023

teh-cmc commented Mar 8, 2023

Putting this on hold for now, see #1535 for rationale.

teh-cmc marked this pull request as draft March 8, 2023 16:49

teh-cmc commented Mar 9, 2023

Whether this is on hold or not, I need to backport the changes to store_polars.rs into main; this is very likely the fix for #894.

teh-cmc commented Mar 9, 2023

Fixed by #1535, which went the other route (#1397) instead; the reasons are explained in detail there.

The fixes related to sorting dataframes are backported via #1549.
