re_query: support for range queries (#653)
* get is supposed to return a row, not a [row]
* unwrap note
* the bench too
* self review
* doc test also
* and re_query ofc!
* slicing is _very_ slow, don't do it if you don't have to
* no more col_arrays in re_query
* there's actually no need for concatenating at all
* incrementally compute and cache bucket sizes
* cleaning up and documenting existing limitations
* introducing bucket retirement
* issue ref
* some more doc stuff
* self-review
* polars/fmt should always be there for tests
* streamlining batch support
* take list header into account
* it's fine
* self-review
* just something i want to keep around for later
* (un)wrapping lists is a bit slow... and slicing them is _extremely_ slow!
* merge cmc/datastore/get_a_single_row (#590)
* no more col_arrays in re_query
* introducing the notion of clustering key, thankfully breaking all tests by design
* making good use of that shiny new Instance component
* merge cmc/datastore/get_rid_of_copies (#584)
* missed one
* introducing arrow_util with is_dense_array()
* finding the clustering comp of the row... or creating it!
* rebasin'
* post rebase clean up
* addressing PR comments, I hope
* ensure that clustering components are properly sorted, failing the existing test suite
* build_instances now generate sorted ids, thus greenlighting the test suite
* missed a couple
* addressed PR comments
* going for the ArrayExt route
* completing the quadrifecta of checks
* the unavoidable typed error revolution is on its way, and it's early
* where we're going we don't need polars
* update everything for the new APIs
* error for unsupported clustering key types
* clean up and actually testing our error paths
* move those nasty internal tests into their own dirty corner
* finally some high-level tests in here
* i happen to like where this is going
* shuffling things
* demonstrating that implicit instances are somehow broken
* fully working implicit clustering keys, but demonstrating a sorting issue somewhere
* there is still something weird going on tho
* latest_at behaving as one would expect
* automatically cache generated cluster instances
* time to clean up en masse
* still want to put some stress on the bucketing
* make ArrayExt::is_dense a little more friendly, just in case...
* TimeType::format_range
* independent latest_at query and using appropriate types everywhere
* re_query: use polars/fmt in tests
* re_query: remove implicit instances
* fixing the u32 vs u64 instance drama
* really starting to like how this looks
* cluster-aware polars helpers :>
* cleanin up tests
* continuing cleanup and doc
* updating visuals for this brave new world
* docs
* self-review
* bruh
* bruh...
* ...
* outdated comment
* no reason to search for it multiple times
* polars_helpers => polars_util for consistency's sake
* addressing PR comments and a couple other things
* xxx
* post-merge fixes
* TimeInt should be nohash
* high-level polar range tools + making first half of range impl pass
* implement the streaming half
* finally defeated all demons
* still passes?
* it looks like we've made it out alive
* polars util: join however you wish
* fixed formatting
* point2d's PoVs working as expected
* passing full ranges
* docs and such part 1, the semantics are hell
* fixing the filtering mess in tests
* me stoopid
* polars docs
* addressing the clones
* xxx
* missed a gazillon conflict somehow
* polars util spring cleaning
* do indicate and demonstrate that range_components is _not_ a real streaming join
* fixed some comments
* bruh
* screw it, going for the real deal: full streaming joins
* YESgit sgit s FINALLY SEMANTICS I ACTUALLY LIKE
* yep yep i like this
* I hereby declare myself _satisfied_
* initiating the great cleanup
* add notes for upcoming terminology pr
* bringing IndexRowNr into the mix and slowly starting to fix terminology mess
* improving range_components ergonomics
* putting it all in self-reviewable state
* self-review
* add bench
* xxx
* addressing PR comments
* first impl
* ported simple_query() to simple_range
* doc and such
* added e2e example for range queries
* self-review
* support for new EntityView
* demonstrating nasty edge-case with streaming-joins
* update streaming-join merging rules to fix said edge case
* mimicking range_components' new merging rules
* implement PoV-less, always-yield lower-level API + adapt higher-level one
* addressing PR comments
* ported to new low-level APIs
* xxx
* addressed PR comments
* self and not-so-self reviews
* the future is quite literally here
1d28875
Rust Benchmark
| Benchmark suite | Current | Previous | Ratio |
|---|---|---|---|
| datastore/insert/batch/rects/insert | 170871 ns/iter (± 1144) | 172198 ns/iter (± 573) | 0.99 |
| datastore/latest_at/batch/rects/query | 726 ns/iter (± 1) | 732 ns/iter (± 4) | 0.99 |
| datastore/latest_at/missing_components/primary | 299 ns/iter (± 0) | 298 ns/iter (± 1) | 1.00 |
| datastore/latest_at/missing_components/secondaries | 384 ns/iter (± 0) | 383 ns/iter (± 0) | 1.00 |
| datastore/range/batch/rects/query | 45826 ns/iter (± 64) | 45707 ns/iter (± 84) | 1.00 |
| obj_mono_points/insert | 984012092 ns/iter (± 5857099) | 844107087 ns/iter (± 4108346) | 1.17 |
| obj_mono_points/query | 351515 ns/iter (± 1547) | 365166 ns/iter (± 3017) | 0.96 |
| obj_batch_points/insert | 95383856 ns/iter (± 513603) | 86515767 ns/iter (± 322674) | 1.10 |
| obj_batch_points/query | 11312 ns/iter (± 47) | 11345 ns/iter (± 25) | 1.00 |
| obj_batch_points_sequential/insert | 23146684 ns/iter (± 236515) | 22972935 ns/iter (± 163465) | 1.01 |
| obj_batch_points_sequential/query | 7909 ns/iter (± 19) | 7900 ns/iter (± 35) | 1.00 |
| mono_points_classic/generate_messages | 4381574 ns/iter (± 117545) | 4460557 ns/iter (± 98276) | 0.98 |
| mono_points_classic/encode_log_msg | 10687902 ns/iter (± 661819) | 10992254 ns/iter (± 490935) | 0.97 |
| mono_points_classic/encode_total | 15207799 ns/iter (± 956548) | 16087941 ns/iter (± 1247907) | 0.95 |
| mono_points_classic/decode_total | 36342832 ns/iter (± 949868) | 36034636 ns/iter (± 821158) | 1.01 |
| mono_points_arrow/generate_message_bundles | 27691863 ns/iter (± 1768002) | 28177100 ns/iter (± 1020898) | 0.98 |
| mono_points_arrow/generate_messages | 102563364 ns/iter (± 1046854) | 94086421 ns/iter (± 857424) | 1.09 |
| mono_points_arrow/encode_log_msg | 129161710 ns/iter (± 951038) | 120267805 ns/iter (± 994832) | 1.07 |
| mono_points_arrow/encode_total | 262380439 ns/iter (± 1762644) | 241927627 ns/iter (± 1419060) | 1.08 |
| mono_points_arrow/decode_log_msg | 146318512 ns/iter (± 969580) | 138261775 ns/iter (± 624091) | 1.06 |
| mono_points_arrow/decode_message_bundles | 59214198 ns/iter (± 923106) | 51319149 ns/iter (± 711426) | 1.15 |
| mono_points_arrow/decode_total | 201380477 ns/iter (± 1758582) | 187226736 ns/iter (± 2144223) | 1.08 |
| batch_points_classic/generate_messages | 3458 ns/iter (± 13) | 3399 ns/iter (± 23) | 1.02 |
| batch_points_classic/encode_log_msg | 393779 ns/iter (± 646) | 391060 ns/iter (± 1098) | 1.01 |
| batch_points_classic/encode_total | 400243 ns/iter (± 805) | 400926 ns/iter (± 1126) | 1.00 |
| batch_points_classic/decode_total | 745873 ns/iter (± 4368) | 748069 ns/iter (± 2272) | 1.00 |
| batch_points_arrow/generate_message_bundles | 243846 ns/iter (± 693) | 244852 ns/iter (± 382) | 1.00 |
| batch_points_arrow/generate_messages | 4550 ns/iter (± 18) | 4580 ns/iter (± 16) | 0.99 |
| batch_points_arrow/encode_log_msg | 264572 ns/iter (± 899) | 264407 ns/iter (± 741) | 1.00 |
| batch_points_arrow/encode_total | 542300 ns/iter (± 1701) | 535540 ns/iter (± 1998) | 1.01 |
| batch_points_arrow/decode_log_msg | 205247 ns/iter (± 444) | 206944 ns/iter (± 449) | 0.99 |
| batch_points_arrow/decode_message_bundles | 1662 ns/iter (± 3) | 1702 ns/iter (± 7) | 0.98 |
| batch_points_arrow/decode_total | 210884 ns/iter (± 527) | 210707 ns/iter (± 447) | 1.00 |
| arrow_mono_points/insert | 4066392993 ns/iter (± 12559278) | 3532698352 ns/iter (± 8983778) | 1.15 |
| arrow_mono_points/query | 1658457 ns/iter (± 23021) | 1657753 ns/iter (± 20414) | 1.00 |
| arrow_batch_points/insert | 1477736 ns/iter (± 5292) | 1480920 ns/iter (± 9062) | 1.00 |
| arrow_batch_points/query | 16874 ns/iter (± 31) | 15628 ns/iter (± 97) | 1.08 |
| obj_batch_points_sequential/Tuid::random | 37 ns/iter (± 0) | 38 ns/iter (± 0) | 0.97 |
This comment was automatically generated by workflow using github-action-benchmark.
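
Each benchmark entry above lists two timings followed by a ratio. The ratio appears to be the first timing divided by the second, rounded to two decimals; this reading is inferred from the numbers rather than stated in the comment. A minimal sketch that checks it against three entries, where the `ratio` helper is a hypothetical name and not part of github-action-benchmark:

```rust
/// Hypothetical helper: first timing divided by second, rounded to two
/// decimals the way the ratio values in the comment appear to be.
fn ratio(first_ns: f64, second_ns: f64) -> f64 {
    (first_ns / second_ns * 100.0).round() / 100.0
}

fn main() {
    // (benchmark name, first timing in ns/iter, second timing in ns/iter),
    // copied from the benchmark comment.
    let rows = [
        ("datastore/insert/batch/rects/insert", 170871.0, 172198.0),
        ("obj_mono_points/insert", 984012092.0, 844107087.0),
        ("arrow_batch_points/query", 16874.0, 15628.0),
    ];
    for (name, first, second) in rows {
        println!("{}: {:.2}", name, ratio(first, second));
    }
}
```

Under that reading, a ratio above 1.00 means the first timing is the slower of the two, so entries such as `obj_mono_points/insert` (1.17) and `arrow_mono_points/insert` (1.15) are the ones worth a closer look.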