New ChunkStore APIs to facilitate data access #7284
pub fn range_query(
    &self,
    query: RangeQueryExpression,
    columns: Option<Vec<ColumnDescriptor>>,
) -> QueryHandle {
    todo!()
}

impl QueryHandle {
    pub fn get(row_range: RangeInclusive) -> RecordBatch; // <<< semantically

    // All returned RecordBatches have the same schema, which might lead to empty columns.
    pub fn get(row_range: RangeInclusive) -> impl Iterator<Item = RecordBatch>;

    let recbatch = get().collect();
}

No MVCC (yet :wink_wink:)
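To make the paginated `get` idea above concrete, here is a minimal, std-only sketch. It is not the `re_dataframe` implementation: `Batch` and `Handle` are hypothetical stand-ins for `RecordBatch` and `QueryHandle`, the page size is an assumed parameter, and cells are plain `Option<i64>` values rather than Arrow arrays. It only illustrates two points from the comment: the row range is inclusive, and every returned batch carries the same set of columns, so a column may be entirely empty (all `None`) within a page.

```rust
use std::ops::RangeInclusive;

/// Hypothetical stand-in for an Arrow `RecordBatch`, stored column-major.
#[derive(Debug, Clone, PartialEq)]
pub struct Batch {
    pub columns: Vec<Vec<Option<i64>>>,
}

/// Hypothetical stand-in for the `QueryHandle` discussed above.
pub struct Handle {
    /// Dense rows: each row holds one cell per column of the shared schema.
    rows: Vec<Vec<Option<i64>>>,
    /// Number of columns in the shared schema.
    num_columns: usize,
    /// Assumed maximum number of rows per returned batch.
    page_size: usize,
}

impl Handle {
    pub fn new(rows: Vec<Vec<Option<i64>>>, num_columns: usize, page_size: usize) -> Self {
        assert!(page_size > 0, "page_size must be non-zero");
        assert!(rows.iter().all(|row| row.len() == num_columns));
        Self { rows, num_columns, page_size }
    }

    /// Yields the rows in `row_range` (inclusive on both ends) as batches of
    /// at most `page_size` rows. Out-of-bounds ranges are clamped.
    pub fn get(&self, row_range: RangeInclusive<usize>) -> impl Iterator<Item = Batch> + '_ {
        let len = self.rows.len();
        let start = (*row_range.start()).min(len);
        // Convert the inclusive end into an exclusive one, then clamp it.
        let end = row_range.end().saturating_add(1).min(len).max(start);

        self.rows[start..end].chunks(self.page_size).map(move |page| {
            // Transpose the page from row-major to column-major. Every batch
            // gets all `num_columns` columns, even if one is all `None`.
            let columns: Vec<Vec<Option<i64>>> = (0..self.num_columns)
                .map(|c| page.iter().map(|row| row[c]).collect())
                .collect();
            Batch { columns }
        })
    }
}
```

With four rows, two columns and a page size of 2, `handle.get(0..=2)` yields two batches: the first holds rows 0 and 1, the second holds row 2 only. Converting the inclusive end to an exclusive one in a single place is a deliberate choice here, since that conversion is where off-by-one mistakes tend to creep in.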
This was referenced Sep 3, 2024
teh-cmc added a commit that referenced this issue on Sep 4, 2024:

All the boilerplate for the new `re_dataframe`.

Also introduces all the new types:
* `QueryExpression`, `LatestAtQueryExpression`, `RangeQueryExpression`
* `QueryHandle`, `LatestAtQueryHandle` (unimplemented), `RangeQueryHandle` (unimplemented)
* `ColumnDescriptor`, `ControlColumnDescriptor`, `TimeColumnDescriptor`, `ComponentColumnDescriptor`

No actual code logic, just definitions.

* Part of #7284

---

Dataframe APIs PR series:
- #7338
- #7339
- #7340
- #7341
- #7345
teh-cmc added a commit that referenced this issue on Sep 4, 2024:

Implements the latest-at dataframe API.

Examples:
```
cargo r --all-features -p re_dataframe --example latest_at -- /tmp/helix.rrd
cargo r --all-features -p re_dataframe --example latest_at -- /tmp/helix.rrd /helix/structure/scaffolding/**
```

```rust
use itertools::Itertools as _;

use re_chunk::{TimeInt, Timeline};
use re_chunk_store::{ChunkStore, ChunkStoreConfig, LatestAtQueryExpression, VersionPolicy};
use re_dataframe::QueryEngine;
use re_log_types::StoreKind;

fn main() -> anyhow::Result<()> {
    let args = std::env::args().collect_vec();

    // Print usage and exit if a required positional argument is missing.
    let get_arg = |i| {
        let Some(value) = args.get(i) else {
            eprintln!(
                "Usage: {} <path_to_rrd> <entity_path_expr>",
                args.first().map_or("$BIN", |s| s.as_str())
            );
            std::process::exit(1);
        };
        value
    };

    let path_to_rrd = get_arg(1);
    // The entity path expression is optional and defaults to everything.
    let entity_path_expr = args.get(2).map_or("/**", |s| s.as_str());

    let stores = ChunkStore::from_rrd_filepath(
        &ChunkStoreConfig::DEFAULT,
        path_to_rrd,
        VersionPolicy::Warn,
    )?;

    for (store_id, store) in &stores {
        // Only query recordings; skip any other kind of store.
        if store_id.kind != StoreKind::Recording {
            continue;
        }

        let cache = re_dataframe::external::re_query::Caches::new(store);
        let engine = QueryEngine {
            store,
            cache: &cache,
        };

        // Latest-at query on the log-time timeline, at the latest possible time.
        let query = LatestAtQueryExpression {
            entity_path_expr: entity_path_expr.into(),
            timeline: Timeline::log_time(),
            at: TimeInt::MAX,
        };

        let query_handle = engine.latest_at(&query, None /* columns */);
        let batch = query_handle.get();

        eprintln!("{query}:\n{batch}");
    }

    Ok(())
}
```

* Part of #7284

---

Dataframe APIs PR series:
- #7338
- #7339
- #7340
- #7341
- #7345
teh-cmc added a commit that referenced this issue on Sep 4, 2024:

Implements the paginated dense range dataframe APIs.

If there's no off-by-one anywhere in there, I will eat my hat. Getting this in the hands of people is the highest prio though, I'll add tests later.

![image](https://github.com/user-attachments/assets/e865ba62-21db-41c1-9899-35a0e7aea134)
![image](https://github.com/user-attachments/assets/32934ba8-2673-401a-aafc-409dfbe9b2c5)

* Fixes #7284

---

Dataframe APIs PR series:
- #7338
- #7339
- #7340
- #7341
- #7345
abey79 added a commit that referenced this issue on Sep 4, 2024:

### What

- Part of: #7279

This PR updates the dataframe query override UI as per the design in #7279, in particular adding PoV entity and component.

- updated UI layout
- time boundaries default to a `+∞`/`–∞` button which, when clicked, turns into an editable time drag value
- reset buttons to go back to the `∞` state
- auto-selection of PoV component based on PoV entity (picks a required component for one of the entity archetypes)

**Note**:
- This is a pure UI PR. The PoV entity/component are not yet used at all for the dataframe's content (that will be addressed in a follow-up PR currently blocked on #7284).
- Ignore the ugly "Time range table order" part, this will be cleaned up later (#7070)

https://github.com/user-attachments/assets/32151a1f-b0ca-4e99-99df-ea730451d4dc

### Checklist

* [x] I have read and agree to the [Contributor Guide](https://github.com/rerun-io/rerun/blob/main/CONTRIBUTING.md) and the [Code of Conduct](https://github.com/rerun-io/rerun/blob/main/CODE_OF_CONDUCT.md)
* [x] I've included a screenshot or gif (if applicable)
* [x] I have tested the web demo (if applicable):
  * Using examples from latest `main` build: [rerun.io/viewer](https://rerun.io/viewer/pr/7331?manifest_url=https://app.rerun.io/version/main/examples_manifest.json)
  * Using full set of examples from `nightly` build: [rerun.io/viewer](https://rerun.io/viewer/pr/7331?manifest_url=https://app.rerun.io/version/nightly/examples_manifest.json)
* [x] The PR title and labels are set such as to maximize their usefulness for the next release's CHANGELOG
* [x] If applicable, add a new check to the [release checklist](https://github.com/rerun-io/rerun/blob/main/tests/python/release_checklist)!
* [x] I have noted any breaking changes to the log API in `CHANGELOG.md` and the migration guide

- [PR Build Summary](https://build.rerun.io/pr/7331)
- [Recent benchmark results](https://build.rerun.io/graphs/crates.html)
- [Wasm size tracking](https://build.rerun.io/graphs/sizes.html)

To run all checks from `main`, comment on the PR with `@rerun-bot full-check`.
This proposal covers 3 newish things.

- `ComponentColumnDescriptor`, which is roughly inspired by our future vision of tagged components.
- `TimeColumnDescriptor`, as a way of talking about how temporal data will be materialized in the record-batches. Because the user has the ability to pass in the list of columns they are interested in, they can choose whether or not they want the query results to include a temporal column. This includes supporting use-cases like including the `log_time` column when executing a range query according to `frame` as the timeline.
- `RowJoinPolicy` in the range query, which determines the behavior. This policy seems fairly unambiguous, and the existence of the different options and the ability to switch between them seems like it has the potential to clarify for a user what's happening rather than making it always magic.

Proposed interfaces:
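For illustration only (these are not the interfaces proposed in this issue), the following std-only sketch shows how a caller-supplied column list could decide whether a temporal column ends up in the results. The `ColumnDescriptor` enum, its fields, and the helpers `resolve_columns` and `has_time_column` are all hypothetical simplifications; the real descriptor types are richer and `RowJoinPolicy` is not modeled here.

```rust
/// Hypothetical, simplified stand-in for the descriptor types named above.
#[derive(Debug, Clone, PartialEq)]
pub enum ColumnDescriptor {
    /// A temporal column, identified by timeline name (e.g. "log_time", "frame").
    Time { timeline: String },
    /// A component column, identified by entity path and component name.
    Component { entity_path: String, component: String },
}

/// Returns the columns to materialize: everything available when the caller
/// passes `None`, otherwise only the requested columns that actually exist.
pub fn resolve_columns(
    available: &[ColumnDescriptor],
    requested: Option<&[ColumnDescriptor]>,
) -> Vec<ColumnDescriptor> {
    match requested {
        None => available.to_vec(),
        Some(requested) => requested
            .iter()
            .filter(|column| available.contains(column))
            .cloned()
            .collect(),
    }
}

/// True if `columns` contains a temporal column for the given timeline.
pub fn has_time_column(columns: &[ColumnDescriptor], timeline: &str) -> bool {
    columns.iter().any(|column| {
        matches!(column, ColumnDescriptor::Time { timeline: t } if t.as_str() == timeline)
    })
}
```

Under these assumptions, a caller querying along `frame` can still ask for the `log_time` column by listing it explicitly, or leave out temporal columns entirely by listing only component columns.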