Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement safe storage handles #7934

Merged
merged 10 commits into from
Oct 31, 2024
Merged

Implement safe storage handles #7934

merged 10 commits into from
Oct 31, 2024

Conversation

teh-cmc
Copy link
Member

@teh-cmc teh-cmc commented Oct 29, 2024

This implements ChunkStoreHandle & QueryCacheHandle which, among other things, allow for streaming dataframes across FFI and network barriers.

/// A ref-counted, inner-mutable handle to a [`QueryCache`].
///
/// Cheap to clone.
///
/// It is possible to grab the lock behind this handle while _maintaining a static lifetime_, see:
/// * [`QueryCacheHandle::read_arc`]
/// * [`QueryCacheHandle::write_arc`]
#[derive(Clone)]
pub struct QueryCacheHandle(Arc<parking_lot::RwLock<QueryCache>>);

/// A ref-counted, inner-mutable handle to a [`ChunkStore`].
///
/// Cheap to clone.
///
/// It is possible to grab the lock behind this handle while _maintaining a static lifetime_, see:
/// * [`ChunkStoreHandle::read_arc`]
/// * [`ChunkStoreHandle::write_arc`]
#[derive(Clone)]
pub struct ChunkStoreHandle(Arc<parking_lot::RwLock<ChunkStore>>);

Those handles on their own are extremely problematic though: letting them loose all across the codebase effectively wraps the entire codebase in a semantically-unsafe{} block where every row interacting with these handles can lead to very nasty race conditions and even deadlocks.
That's why this PR also introduces the StorageEngine type, which makes using these handles actually safe in practice:

/// Keeps track of handles towards a [`ChunkStore`] and its [`QueryCache`].
///
/// A [`StorageEngine`] doesn't add any feature on top of what [`ChunkStoreHandle`] and
/// [`QueryCacheHandle`] already offer: the job of the [`StorageEngine`] is to leverage the type
/// system in order to protect against deadlocks and race conditions at compile time.
///
/// The handles stored within will never be publicly accessible past construction.
///
/// The underlying [`ChunkStore`] and [`QueryCache`] can be accessed through one of the
/// following methods:
/// * [`StorageEngine::read`]
/// * [`StorageEngine::read_arc`]
/// * [`StorageEngine::write`]
/// * [`StorageEngine::write_arc`]
#[derive(Clone)]
pub struct StorageEngine {
    store: ChunkStoreHandle,
    cache: QueryCacheHandle,
}

Balancing these safety guarantees with the flexibility we need (today... and tomorrow!) for all our corner use cases has proven to be a subtle art...

Checklist

  • I have read and agree to Contributor Guide and the Code of Conduct
  • I've included a screenshot or gif (if applicable)
  • I have tested the web demo (if applicable):
  • The PR title and labels are set such as to maximize their usefulness for the next release's CHANGELOG
  • If applicable, add a new check to the release checklist!
  • If have noted any breaking changes to the log API in CHANGELOG.md and the migration guide

To run all checks from main, comment on the PR with @rerun-bot full-check.

@teh-cmc teh-cmc added ⛃ re_datastore affects the datastore itself 🔍 re_query affects re_query itself include in changelog labels Oct 29, 2024
@teh-cmc teh-cmc marked this pull request as ready for review October 29, 2024 17:33
@teh-cmc teh-cmc force-pushed the cmc/storage_engine branch from 25ad40f to fc310cc Compare October 29, 2024 18:00
@teh-cmc
Copy link
Member Author

teh-cmc commented Oct 30, 2024

Quickly drafting to check whether there are some remnants from the first four iterations that have made it into the final thing and that could just be removed/simplified now, which is very possible because it's all a big blur at this point 🫠

@teh-cmc teh-cmc marked this pull request as draft October 30, 2024 07:49
@teh-cmc
Copy link
Member Author

teh-cmc commented Oct 30, 2024

Quickly drafting to check whether there are some remnants from the first four iterations that have made it into the final thing and that could just be removed/simplified now, which is very possible because it's all a big blur at this point 🫠

Nope, I mostly just ran into differently shaped walls.

@teh-cmc teh-cmc marked this pull request as ready for review October 30, 2024 08:32
@teh-cmc teh-cmc force-pushed the cmc/storage_engine branch from f417f9d to 2fc33cc Compare October 31, 2024 08:43
@teh-cmc
Copy link
Member Author

teh-cmc commented Oct 31, 2024

🙏

@teh-cmc teh-cmc merged commit 962b7de into main Oct 31, 2024
35 checks passed
@teh-cmc teh-cmc deleted the cmc/storage_engine branch October 31, 2024 09:02
Copy link
Contributor

@zehiko zehiko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I learned a lot reviewing this PR and I really appreciate nice documentation! I believe it's clear to me how this addresses use cases like streaming and why the protection is in place, but folks with more experience with the code base I'm sure can provide more valuable feedback.


// NOTE: None of these fields should ever be publicly exposed, either directly or through a method,
// as it is always possible to go back to an actual `RwLock` via `RwLockWriteGuard::rwlock`.
// Doing so would defeat the deadlock protection that the `StorageEngine` offers.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

perhaps it's worth documenting an example use case of doing the wrong thing without the protection that StorageEngine provides?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
include in changelog ⛃ re_datastore affects the datastore itself 🔍 re_query affects re_query itself
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Introduce cheaply clonable ChunkStoreHandle and QueryCacheHandle
2 participants