Leaf node linked list for iteration #33

Merged
merged 21 commits into main from range-pointers
Sep 16, 2024

Conversation

declanvk
Owner

This PR adds a doubly linked list to the leaf nodes to implement iteration in a more natural way. This should make it easier to implement things like range or prefix and maybe even the drain + filter.

@Gab-Menezes
Collaborator

Hey @declanvk, can you explain how this works and why it makes things easier? None of the papers I read mentioned using a linked list in the leaves. Did you come up with this yourself?

@declanvk
Owner Author

Linking leaf nodes is a pretty common optimization in B+ trees (see https://en.wikipedia.org/wiki/B%2B_tree#Implementation), but I don't think I saw it anywhere in the ART paper or associated material.

> explain how this works

The linked list works from the base case of a single tree containing only a single leaf node. The previous and next pointers are both None in that case. On inserting a new leaf, we will always be able to find a nearest leaf that would be:

  • Before the new leaf -> we can insert the new leaf after the existing leaf in the linked list
  • After the new leaf -> we can insert the new leaf before the existing leaf in the linked list
  • Equal to the new leaf -> we can replace the existing leaf with the new leaf in the linked list

On delete, we can easily remove a leaf node from the linked list by pointing the previous and next leaf nodes at each other.
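
The insert and delete cases above can be sketched roughly as follows. This is only an illustration under some assumptions: it stores leaves in a `Vec` and uses indices for the sibling links, whereas the PR itself works with leaf node pointers. The names `Leaf`, `LeafList`, `push_unlinked`, `insert_after`, `insert_before`, `remove`, and `keys_from` are hypothetical and are not the crate's API. The "equal key" case (replacing an existing leaf) is left out for brevity.

```rust
/// A leaf stored in an arena. `prev`/`next` are arena indices standing in
/// for the sibling pointers (hypothetical layout, not the crate's).
#[derive(Debug)]
struct Leaf {
    key: u32,
    prev: Option<usize>,
    next: Option<usize>,
}

#[derive(Debug, Default)]
struct LeafList {
    leaves: Vec<Leaf>,
}

impl LeafList {
    /// Base case: a new leaf starts with no siblings.
    fn push_unlinked(&mut self, key: u32) -> usize {
        self.leaves.push(Leaf { key, prev: None, next: None });
        self.leaves.len() - 1
    }

    /// The existing leaf sorts before the new leaf: link `new` right after it.
    fn insert_after(&mut self, existing: usize, new: usize) {
        let old_next = self.leaves[existing].next;
        self.leaves[new].prev = Some(existing);
        self.leaves[new].next = old_next;
        self.leaves[existing].next = Some(new);
        if let Some(n) = old_next {
            self.leaves[n].prev = Some(new);
        }
    }

    /// The existing leaf sorts after the new leaf: link `new` right before it.
    fn insert_before(&mut self, existing: usize, new: usize) {
        let old_prev = self.leaves[existing].prev;
        self.leaves[new].next = Some(existing);
        self.leaves[new].prev = old_prev;
        self.leaves[existing].prev = Some(new);
        if let Some(p) = old_prev {
            self.leaves[p].next = Some(new);
        }
    }

    /// Delete: point the previous and next leaves at each other.
    fn remove(&mut self, idx: usize) {
        let (prev, next) = (self.leaves[idx].prev, self.leaves[idx].next);
        if let Some(p) = prev {
            self.leaves[p].next = next;
        }
        if let Some(n) = next {
            self.leaves[n].prev = prev;
        }
        self.leaves[idx].prev = None;
        self.leaves[idx].next = None;
    }

    /// Collect keys by following `next` links from `start`.
    fn keys_from(&self, start: usize) -> Vec<u32> {
        let mut out = Vec::new();
        let mut cur = Some(start);
        while let Some(i) = cur {
            out.push(self.leaves[i].key);
            cur = self.leaves[i].next;
        }
        out
    }
}
```

Each operation touches at most the new leaf and its two neighbours, so list maintenance adds a constant amount of work per insert or delete.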

> why it makes things easier

  • We can cache the minimum and maximum leaf nodes in a trie, which makes starting full-trie iteration very easy with those as the beginning and end pointers.
  • Range iteration is now very easy. The procedure is: (1) look up the minimum leaf node that satisfies the lower bound, (2) look up the maximum leaf node that satisfies the upper bound, (3) iterate from the lower-bound leaf node to the upper-bound leaf node.
  • We can optimize the IntoIter implementation by deallocating all the inner nodes in the trie, and then incrementally deallocating leaf nodes while iterating through the linked list (see Optimize IntoIter implementation #19)
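
A minimal sketch of the three-step range procedure, under loud assumptions: the two bound lookups are replaced by a linear scan along the list (the real implementation would search the trie), leaves live in a slice with index-based `next` links, and the names `Leaf`, `walk`, and `range_inclusive` are hypothetical.

```rust
/// A leaf with an index-based link to the next leaf in key order
/// (hypothetical layout for illustration only).
struct Leaf {
    key: u32,
    next: Option<usize>,
}

/// Step (3): walk the list from `first` to `last`, inclusive.
fn walk(leaves: &[Leaf], first: usize, last: usize) -> Vec<u32> {
    let mut out = Vec::new();
    let mut cur = Some(first);
    while let Some(i) = cur {
        out.push(leaves[i].key);
        if i == last {
            break;
        }
        cur = leaves[i].next;
    }
    out
}

/// Keys in `lo..=hi`, starting the search at the list head `head`.
fn range_inclusive(leaves: &[Leaf], head: usize, lo: u32, hi: u32) -> Vec<u32> {
    let mut first = None; // step (1): minimum leaf with key >= lo
    let mut last = None; // step (2): maximum leaf with key <= hi
    let mut cur = Some(head);
    // Stand-in for the two trie lookups: a single scan over the list.
    while let Some(i) = cur {
        let k = leaves[i].key;
        if k >= lo && first.is_none() {
            first = Some(i);
        }
        if k <= hi {
            last = Some(i);
        }
        cur = leaves[i].next;
    }
    match (first, last) {
        // Guard against an empty range where the bounds cross.
        (Some(f), Some(l)) if leaves[f].key <= leaves[l].key => walk(leaves, f, l),
        _ => Vec::new(),
    }
}
```

Once the two boundary leaves are known, the iteration itself never touches an inner node.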

@declanvk
Owner Author

I was also considering an implementation based on parent pointers (see #7 ), which is similar to how the Rust BTreeMap does it (see https://doc.rust-lang.org/1.80.1/src/alloc/collections/btree/node.rs.html#52), but I thought that the leaf siblings would be more direct, since iteration would just become traversing the linked list.

The downside is definitely the increase in space usage, if you see the tests based on dhat-rs:

I haven't tested what the impact on iteration speed looks like though, which is what I would be weighing the size increase against.

@declanvk
Owner Author

An ART implementation's biggest strength is definitely its compact size and excellent cache performance. I'm hoping that even though there is a significant increase in leaf node size and overall tree allocation size, that won't impact the search/insert/delete paths too much because the inner nodes have not changed in size.

However, if the benchmarks at the end show negative impacts to the main operations (get/insert/delete), that may outweigh the performance improvements to iteration/range/etc.

@Gab-Menezes
Collaborator

Makes a lot of sense. Please share your findings when you get there, and if there is anything I can help with, just let me know.

@declanvk declanvk force-pushed the range-pointers branch 2 times, most recently from 1a158fb to d5ddd4a Compare September 11, 2024 23:05
**Description**
 - Rename `LeafNode::new` to `LeafNode::with_no_siblings` to
   emphasize that the sibling pointers are uninit
 - Add `LeafNode` functions for inserting and removing from the list
 - Minor changes like:
    - Switch `panic!` to `unreachable!` where relevant
    - Type alias for optional leaf pointer, since the type was large
    - Add a helper type for the `DotPrinter` and make the settings
      non-exhaustive to allow future settings without breakage
    - Fix up comments and make some functions unsafe that were not
**Description**
 - Add steps to the insert and delete operations so that the leaf
   linked list will be maintained with the correct order
    - This is the next step before we're able to use the linked list in
      various iteration processes
    - There is a TODO left to make the `deep_clone` functions work
      with the linked list
 - Add public API to the TreeMap type for converting to and from
   opaque node pointers. These functions are useful when running
   tests on the tree internals
**Description**
 - Add two new error types for the WF checker that concern
   the contents and ordering of the leaf linked list
 - Add a FIXME note in `impl Visitable for NodePtr`;
   the current state is probably broken
@declanvk
Owner Author

Benchmark results from iai-callgrind:

iai_callgrind::bench_clone_group::bench_clone with_prefixes:...
  Instructions:            20112721|17225240        (+16.7631%) [+1.16763x]
  L1 Hits:                 27197743|23352051        (+16.4683%) [+1.16468x]
  L2 Hits:                    53342|45367           (+17.5789%) [+1.17579x]
  RAM Hits:                   53907|45682           (+18.0049%) [+1.18005x]
  Total read+write:        27304992|23443100        (+16.4735%) [+1.16473x]
  Estimated Cycles:        29351198|25177756        (+16.5759%) [+1.16576x]
iai_callgrind::bench_clone_group::bench_clone dictionary:...
  Instructions:            23817731|20773099        (+14.6566%) [+1.14657x]
  L1 Hits:                 32327675|28226392        (+14.5300%) [+1.14530x]
  L2 Hits:                    65728|60144           (+9.28438%) [+1.09284x]
  RAM Hits:                   64458|56238           (+14.6165%) [+1.14616x]
  Total read+write:        32457861|28342774        (+14.5190%) [+1.14519x]
  Estimated Cycles:        34912345|30495442        (+14.4838%) [+1.14484x]
iai_callgrind::bench_lookup_group::bench_lookup_single first_key:...
  Instructions:                 333|331             (+0.60423%) [+1.00604x]
  L1 Hits:                      425|431             (-1.39211%) [-1.01412x]
  L2 Hits:                        7|2               (+250.000%) [+3.50000x]
  RAM Hits:                      16|16              (No change)
  Total read+write:             448|449             (-0.22272%) [-1.00223x]
  Estimated Cycles:            1020|1001            (+1.89810%) [+1.01898x]
iai_callgrind::bench_lookup_group::bench_lookup_single last_key:...
  Instructions:                 398|396             (+0.50505%) [+1.00505x]
  L1 Hits:                      501|499             (+0.40080%) [+1.00401x]
  L2 Hits:                        0|0               (No change)
  RAM Hits:                      20|23              (-13.0435%) [-1.15000x]
  Total read+write:             521|522             (-0.19157%) [-1.00192x]
  Estimated Cycles:            1201|1304            (-7.89877%) [-1.08576x]
iai_callgrind::bench_lookup_group::bench_lookup_multiple dictionary:...
  Instructions:              882544|886649          (-0.46298%) [-1.00465x]
  L1 Hits:                  1176668|1191134         (-1.21447%) [-1.01229x]
  L2 Hits:                      501|467             (+7.28051%) [+1.07281x]
  RAM Hits:                      27|33              (-18.1818%) [-1.22222x]
  Total read+write:         1177196|1191634         (-1.21161%) [-1.01226x]
  Estimated Cycles:         1180118|1194624         (-1.21427%) [-1.01229x]
iai_callgrind::bench_remove_group::bench_remove_single first_key:...
  Instructions:            12169461|12263915        (-0.77018%) [-1.00776x]
  L1 Hits:                 17152728|17146932        (+0.03380%) [+1.00034x]
  L2 Hits:                    64606|56445           (+14.4583%) [+1.14458x]
  RAM Hits:                      88|94              (-6.38298%) [-1.06818x]
  Total read+write:        17217422|17203471        (+0.08109%) [+1.00081x]
  Estimated Cycles:        17478838|17432447        (+0.26612%) [+1.00266x]
iai_callgrind::bench_remove_group::bench_remove_single last_key:...
  Instructions:            12168579|12264033        (-0.77832%) [-1.00784x]
  L1 Hits:                 17151524|17147073        (+0.02596%) [+1.00026x]
  L2 Hits:                    64593|56432           (+14.4617%) [+1.14462x]
  RAM Hits:                      84|99              (-15.1515%) [-1.17857x]
  Total read+write:        17216201|17203604        (+0.07322%) [+1.00073x]
  Estimated Cycles:        17477429|17432698        (+0.25659%) [+1.00257x]
iai_callgrind::bench_remove_group::bench_remove_multiple dictionary:...
  Instructions:            12889135|12981136        (-0.70873%) [-1.00714x]
  L1 Hits:                 18125619|18117913        (+0.04253%) [+1.00043x]
  L2 Hits:                    63670|57370           (+10.9813%) [+1.10981x]
  RAM Hits:                    2011|179             (+1023.46%) [+11.2346x]
  Total read+write:        18191300|18175462        (+0.08714%) [+1.00087x]
  Estimated Cycles:        18514354|18411028        (+0.56122%) [+1.00561x]
iai_callgrind::bench_insert_group::bench_insert_single first_key:...
  Instructions:            12169577|12263993        (-0.76986%) [-1.00776x]
  L1 Hits:                 17152970|17147115        (+0.03415%) [+1.00034x]
  L2 Hits:                    64599|56433           (+14.4703%) [+1.14470x]
  RAM Hits:                      61|61              (No change)
  Total read+write:        17217630|17203609        (+0.08150%) [+1.00082x]
  Estimated Cycles:        17478100|17431415        (+0.26782%) [+1.00268x]
iai_callgrind::bench_insert_group::bench_insert_single last_key:...
  Instructions:            12170084|12264078        (-0.76642%) [-1.00772x]
  L1 Hits:                 17153620|17147214        (+0.03736%) [+1.00037x]
  L2 Hits:                    64604|56427           (+14.4913%) [+1.14491x]
  RAM Hits:                      63|59              (+6.77966%) [+1.06780x]
  Total read+write:        17218287|17203700        (+0.08479%) [+1.00085x]
  Estimated Cycles:        17478845|17431414        (+0.27210%) [+1.00272x]
iai_callgrind::bench_insert_group::bench_insert_multiple dictionary:...
  Instructions:            13550147|13591078        (-0.30116%) [-1.00302x]
  L1 Hits:                 19123270|19002336        (+0.63642%) [+1.00636x]
  L2 Hits:                    67108|58886           (+13.9626%) [+1.13963x]
  RAM Hits:                     229|105             (+118.095%) [+2.18095x]
  Total read+write:        19190607|19061327        (+0.67823%) [+1.00678x]
  Estimated Cycles:        19466825|19300441        (+0.86207%) [+1.00862x]
iai_callgrind::bench_iterator_group::bench_full_iterator dictionary:...
  Instructions:              196616|3610467         (-94.5543%) [-18.3630x]
  L1 Hits:                   196620|5036161         (-96.0958%) [-25.6137x]
  L2 Hits:                    32768|23587           (+38.9240%) [+1.38924x]
  RAM Hits:                       2|36              (-94.4444%) [-18.0000x]
  Total read+write:          229390|5059784         (-95.4664%) [-22.0576x]
  Estimated Cycles:          360530|5155356         (-93.0067%) [-14.2994x]
iai_callgrind::bench_iterator_group::bench_prefix_iterator empty:...
  Instructions:             4215954|4166861         (+1.17818%) [+1.01178x]
  L1 Hits:                  5805003|5792828         (+0.21017%) [+1.00210x]
  L2 Hits:                    56591|52438           (+7.91983%) [+1.07920x]
  RAM Hits:                      59|63              (-6.34921%) [-1.06780x]
  Total read+write:         5861653|5845329         (+0.27927%) [+1.00279x]
  Estimated Cycles:         6090023|6057223         (+0.54150%) [+1.00542x]
iai_callgrind::bench_iterator_group::bench_prefix_iterator specific_key:...
  Instructions:                1408|1402            (+0.42796%) [+1.00428x]
  L1 Hits:                     1777|1773            (+0.22561%) [+1.00226x]
  L2 Hits:                        2|1               (+100.000%) [+2.00000x]
  RAM Hits:                      27|28              (-3.57143%) [-1.03704x]
  Total read+write:            1806|1802            (+0.22198%) [+1.00222x]
  Estimated Cycles:            2732|2758            (-0.94271%) [-1.00952x]
iai_callgrind::bench_iterator_group::bench_prefix_iterator random_partial:...
  Instructions:                6426|6372            (+0.84746%) [+1.00847x]
  L1 Hits:                     8634|8645            (-0.12724%) [-1.00127x]
  L2 Hits:                       78|52              (+50.0000%) [+1.50000x]
  RAM Hits:                      43|38              (+13.1579%) [+1.13158x]
  Total read+write:            8755|8735            (+0.22896%) [+1.00229x]
  Estimated Cycles:           10529|10235           (+2.87250%) [+1.02872x]
iai_callgrind::bench_iterator_group::bench_fuzzy_iterator zero:...
  Instructions:                5084|5075            (+0.17734%) [+1.00177x]
  L1 Hits:                     6521|6507            (+0.21515%) [+1.00215x]
  L2 Hits:                        5|6               (-16.6667%) [-1.20000x]
  RAM Hits:                      39|43              (-9.30233%) [-1.10256x]
  Total read+write:            6565|6556            (+0.13728%) [+1.00137x]
  Estimated Cycles:            7911|8042            (-1.62895%) [-1.01656x]
iai_callgrind::bench_iterator_group::bench_fuzzy_iterator specific_key:...
  Instructions:            31519909|30199582        (+4.37200%) [+1.04372x]
  L1 Hits:                 41351228|39431391        (+4.86880%) [+1.04869x]
  L2 Hits:                    64555|56468           (+14.3214%) [+1.14321x]
  RAM Hits:                     248|256             (-3.12500%) [-1.03226x]
  Total read+write:        41416031|39488115        (+4.88227%) [+1.04882x]
  Estimated Cycles:        41682683|39722691        (+4.93419%) [+1.04934x]

Important bits:

iai_callgrind::bench_clone_group::bench_clone with_prefixes:...
...
  Estimated Cycles:        29351198|25177756        (+16.5759%) [+1.16576x]
...
iai_callgrind::bench_iterator_group::bench_full_iterator dictionary:...
...
  Estimated Cycles:          360530|5155356         (-93.0067%) [-14.2994x]
...

There are other minor regressions and improvements too. I was expecting more of an impact on the insert path, but I'm not seeing very much of it.
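
For reading the tables, here is a small sketch of how the two bracketed figures relate to the raw counts. It assumes each row is printed as `new|old` (this branch on the left, the baseline on the right), which is consistent with the numbers shown; the helper names `percent_change` and `factor` are made up for this example and are not part of iai-callgrind.

```rust
/// Relative change of `new` against the baseline `old`, in percent.
/// Negative means this branch used fewer of the measured events.
fn percent_change(new: u64, old: u64) -> f64 {
    (new as f64 - old as f64) / old as f64 * 100.0
}

/// Ratio between the larger and the smaller count. The sign carries the
/// direction: positive for a regression, negative for an improvement.
fn factor(new: u64, old: u64) -> f64 {
    if new >= old {
        new as f64 / old as f64
    } else {
        -(old as f64 / new as f64)
    }
}
```

With the "Estimated Cycles" rows from the important bits: `percent_change(360530, 5155356)` is about -93.0067 and `factor(360530, 5155356)` is about -14.2994 for the full iterator, while `percent_change(29351198, 25177756)` is about +16.5759 for the clone benchmark.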

@declanvk declanvk marked this pull request as ready for review September 15, 2024 23:51
@declanvk
Owner Author

Updated benchmarks after re-implementing the prefix iterators:

iai_callgrind::bench_clone_group::bench_clone with_prefixes:...
  Instructions:            20112723|17225240        (+16.7631%) [+1.16763x]
  L1 Hits:                 27197746|23352051        (+16.4683%) [+1.16468x]
  L2 Hits:                    53343|45367           (+17.5811%) [+1.17581x]
  RAM Hits:                   53907|45682           (+18.0049%) [+1.18005x]
  Total read+write:        27304996|23443100        (+16.4735%) [+1.16473x]
  Estimated Cycles:        29351206|25177756        (+16.5759%) [+1.16576x]
iai_callgrind::bench_clone_group::bench_clone dictionary:...
  Instructions:            23817733|20773099        (+14.6566%) [+1.14657x]
  L1 Hits:                 32327681|28226392        (+14.5300%) [+1.14530x]
  L2 Hits:                    65725|60144           (+9.27940%) [+1.09279x]
  RAM Hits:                   64459|56238           (+14.6182%) [+1.14618x]
  Total read+write:        32457865|28342774        (+14.5190%) [+1.14519x]
  Estimated Cycles:        34912371|30495442        (+14.4839%) [+1.14484x]
iai_callgrind::bench_lookup_group::bench_lookup_single first_key:...
  Instructions:                 331|331             (No change)
  L1 Hits:                      423|431             (-1.85615%) [-1.01891x]
  L2 Hits:                        7|2               (+250.000%) [+3.50000x]
  RAM Hits:                      16|16              (No change)
  Total read+write:             446|449             (-0.66815%) [-1.00673x]
  Estimated Cycles:            1018|1001            (+1.69830%) [+1.01698x]
iai_callgrind::bench_lookup_group::bench_lookup_single last_key:...
  Instructions:                 396|396             (No change)
  L1 Hits:                      499|499             (No change)
  L2 Hits:                        0|0               (No change)
  RAM Hits:                      20|23              (-13.0435%) [-1.15000x]
  Total read+write:             519|522             (-0.57471%) [-1.00578x]
  Estimated Cycles:            1199|1304            (-8.05215%) [-1.08757x]
iai_callgrind::bench_lookup_group::bench_lookup_multiple dictionary:...
  Instructions:              886638|886649          (-0.00124%) [-1.00001x]
  L1 Hits:                  1182811|1191134         (-0.69875%) [-1.00704x]
  L2 Hits:                      498|467             (+6.63812%) [+1.06638x]
  RAM Hits:                      28|33              (-15.1515%) [-1.17857x]
  Total read+write:         1183337|1191634         (-0.69627%) [-1.00701x]
  Estimated Cycles:         1186281|1194624         (-0.69838%) [-1.00703x]
iai_callgrind::bench_remove_group::bench_remove_single first_key:...
  Instructions:            12169409|12263915        (-0.77060%) [-1.00777x]
  L1 Hits:                 17152632|17146932        (+0.03324%) [+1.00033x]
  L2 Hits:                    64610|56445           (+14.4654%) [+1.14465x]
  RAM Hits:                      89|94              (-5.31915%) [-1.05618x]
  Total read+write:        17217331|17203471        (+0.08057%) [+1.00081x]
  Estimated Cycles:        17478797|17432447        (+0.26588%) [+1.00266x]
iai_callgrind::bench_remove_group::bench_remove_single last_key:...
  Instructions:            12168513|12264033        (-0.77886%) [-1.00785x]
  L1 Hits:                 17151416|17147073        (+0.02533%) [+1.00025x]
  L2 Hits:                    64596|56432           (+14.4670%) [+1.14467x]
  RAM Hits:                      82|99              (-17.1717%) [-1.20732x]
  Total read+write:        17216094|17203604        (+0.07260%) [+1.00073x]
  Estimated Cycles:        17477266|17432698        (+0.25566%) [+1.00256x]
iai_callgrind::bench_remove_group::bench_remove_multiple dictionary:...
  Instructions:            12839630|12981136        (-1.09009%) [-1.01102x]
  L1 Hits:                 18058901|18117913        (-0.32571%) [-1.00327x]
  L2 Hits:                    63654|57370           (+10.9535%) [+1.10953x]
  RAM Hits:                    2036|179             (+1037.43%) [+11.3743x]
  Total read+write:        18124591|18175462        (-0.27989%) [-1.00281x]
  Estimated Cycles:        18448431|18411028        (+0.20316%) [+1.00203x]
iai_callgrind::bench_insert_group::bench_insert_single first_key:...
  Instructions:            12169577|12263993        (-0.76986%) [-1.00776x]
  L1 Hits:                 17152966|17147115        (+0.03412%) [+1.00034x]
  L2 Hits:                    64605|56433           (+14.4809%) [+1.14481x]
  RAM Hits:                      59|61              (-3.27869%) [-1.03390x]
  Total read+write:        17217630|17203609        (+0.08150%) [+1.00082x]
  Estimated Cycles:        17478056|17431415        (+0.26757%) [+1.00268x]
iai_callgrind::bench_insert_group::bench_insert_single last_key:...
  Instructions:            12170118|12264078        (-0.76614%) [-1.00772x]
  L1 Hits:                 17153651|17147214        (+0.03754%) [+1.00038x]
  L2 Hits:                    64609|56427           (+14.5002%) [+1.14500x]
  RAM Hits:                      62|59              (+5.08475%) [+1.05085x]
  Total read+write:        17218322|17203700        (+0.08499%) [+1.00085x]
  Estimated Cycles:        17478866|17431414        (+0.27222%) [+1.00272x]
iai_callgrind::bench_insert_group::bench_insert_multiple dictionary:...
  Instructions:            13550836|13591078        (-0.29609%) [-1.00297x]
  L1 Hits:                 19123923|19002336        (+0.63985%) [+1.00640x]
  L2 Hits:                    67102|58886           (+13.9524%) [+1.13952x]
  RAM Hits:                     275|105             (+161.905%) [+2.61905x]
  Total read+write:        19191300|19061327        (+0.68187%) [+1.00682x]
  Estimated Cycles:        19469058|19300441        (+0.87364%) [+1.00874x]
iai_callgrind::bench_iterator_group::bench_full_iterator dictionary:...
  Instructions:              196616|3610467         (-94.5543%) [-18.3630x]
  L1 Hits:                   196621|5036161         (-96.0958%) [-25.6135x]
  L2 Hits:                    32768|23587           (+38.9240%) [+1.38924x]
  RAM Hits:                       1|36              (-97.2222%) [-36.0000x]
  Total read+write:          229390|5059784         (-95.4664%) [-22.0576x]
  Estimated Cycles:          360496|5155356         (-93.0074%) [-14.3007x]
iai_callgrind::bench_iterator_group::bench_prefix_iterator empty:...
  Instructions:              198164|4166861         (-95.2443%) [-21.0273x]
  L1 Hits:                   198560|5792828         (-96.5723%) [-29.1742x]
  L2 Hits:                    32777|52438           (-37.4938%) [-1.59984x]
  RAM Hits:                      23|63              (-63.4921%) [-2.73913x]
  Total read+write:          231360|5845329         (-96.0420%) [-25.2651x]
  Estimated Cycles:          363250|6057223         (-94.0030%) [-16.6751x]
iai_callgrind::bench_iterator_group::bench_prefix_iterator specific_key:...
  Instructions:                 400|1402            (-71.4693%) [-3.50500x]
  L1 Hits:                      486|1773            (-72.5888%) [-3.64815x]
  L2 Hits:                        2|1               (+100.000%) [+2.00000x]
  RAM Hits:                      25|28              (-10.7143%) [-1.12000x]
  Total read+write:             513|1802            (-71.5316%) [-3.51267x]
  Estimated Cycles:            1371|2758            (-50.2901%) [-2.01167x]
iai_callgrind::bench_iterator_group::bench_prefix_iterator random_partial:...
  Instructions:                 643|6372            (-89.9090%) [-9.90980x]
  L1 Hits:                      761|8645            (-91.1972%) [-11.3601x]
  L2 Hits:                       19|52              (-63.4615%) [-2.73684x]
  RAM Hits:                      26|38              (-31.5789%) [-1.46154x]
  Total read+write:             806|8735            (-90.7728%) [-10.8375x]
  Estimated Cycles:            1766|10235           (-82.7455%) [-5.79558x]
iai_callgrind::bench_iterator_group::bench_fuzzy_iterator zero:...
  Instructions:                5084|5075            (+0.17734%) [+1.00177x]
  L1 Hits:                     6520|6507            (+0.19978%) [+1.00200x]
  L2 Hits:                        5|6               (-16.6667%) [-1.20000x]
  RAM Hits:                      40|43              (-6.97674%) [-1.07500x]
  Total read+write:            6565|6556            (+0.13728%) [+1.00137x]
  Estimated Cycles:            7945|8042            (-1.20617%) [-1.01221x]
iai_callgrind::bench_iterator_group::bench_fuzzy_iterator specific_key:...
  Instructions:            31519909|30199582        (+4.37200%) [+1.04372x]
  L1 Hits:                 41351206|39431391        (+4.86875%) [+1.04869x]
  L2 Hits:                    64572|56468           (+14.3515%) [+1.14351x]
  RAM Hits:                     251|256             (-1.95312%) [-1.01992x]
  Total read+write:        41416029|39488115        (+4.88226%) [+1.04882x]
  Estimated Cycles:        41682851|39722691        (+4.93461%) [+1.04935x]
iai_callgrind::bench_iterator_group::bench_range_iterator full:...
  Instructions:              196763|196752          (+0.00559%) [+1.00006x]
  L1 Hits:                   196815|196801          (+0.00711%) [+1.00007x]
  L2 Hits:                    32773|32772           (+0.00305%) [+1.00003x]
  RAM Hits:                      14|12              (+16.6667%) [+1.16667x]
  Total read+write:          229602|229585          (+0.00740%) [+1.00007x]
  Estimated Cycles:          361170|361081          (+0.02465%) [+1.00025x]
iai_callgrind::bench_iterator_group::bench_range_iterator specific_key:...
  Instructions:                1235|703             (+75.6757%) [+1.75676x]
  L1 Hits:                     1540|849             (+81.3899%) [+1.81390x]
  L2 Hits:                       21|19              (+10.5263%) [+1.10526x]
  RAM Hits:                      35|31              (+12.9032%) [+1.12903x]
  Total read+write:            1596|899             (+77.5306%) [+1.77531x]
  Estimated Cycles:            2870|2029            (+41.4490%) [+1.41449x]
iai_callgrind::bench_iterator_group::bench_range_iterator middle_third:...
  Instructions:               66892|66382           (+0.76828%) [+1.00768x]
  L1 Hits:                    67216|66550           (+1.00075%) [+1.01001x]
  L2 Hits:                    10955|10951           (+0.03653%) [+1.00037x]
  RAM Hits:                      36|31              (+16.1290%) [+1.16129x]
  Total read+write:           78207|77532           (+0.87061%) [+1.00871x]
  Estimated Cycles:          123251|122390          (+0.70349%) [+1.00703x]

@declanvk
Owner Author

declanvk commented Sep 16, 2024

I think I'm going to update the changelog and then go ahead and merge.

**Description**
Cache the minimum and maximum leaf nodes as part of the tree
internal state so that it is easy to start tree iteration using the leaves
linked list. This also speeds up the min/max access functions.
**Description**
 - Create a new unsafe "iterator" that will traverse the leaf node
   linked list.
 - Use the leaf node linked list iterator to implement the `Iter`,
   `IterMut`, `Keys`, `Values`, and `ValuesMut` iterators

**Motivation**
The changes to use the linked list iterator should be faster to
iterate (needs benchmark) and less complex to implement.

**Testing Done**
`./scripts/full-test.sh nightly`
**Description**
Fixes #19

This commit modifies the `IntoIter` implementation to work in multiple
steps:
 1. Deallocate all the inner nodes of the trie, leaving only the
    leaf nodes
 2. Iterate through the leaf nodes via the linked list, deallocating
    each leaf node and returning its key-value pair.
 3. If there are any leaf nodes remaining on drop, deallocate all of
    them

This commit also moves the tree dealloc operation into a separate
module, and adds a couple variants of the dealloc operation to
support the `IntoIter` modifications.

The TreeMap fuzz tests are also updated to exercise the `.into_iter`
implementation in a way that should cover all 3 steps listed above.

**Motivation**
This change makes the `IntoIter` more efficient, since we don't have
to maintain a valid trie between each call to `next()`.

**Testing Done**
`./scripts/full-test.sh nightly`
**Description**
 - The previous implementation of `deallocate_leaves` was faulty
   because it would read all the way through the linked list of leaf
   nodes even if some leaves had been popped off the back by the
   iterator.
    - Fix this by changing `deallocate_leaves` to take an
      iterator with start and end, instead of just a start pointer.
 - Added some missing safety docs
 - Modified the `IntoIter` fuzz tests to iterate on the front and the
   back
 - Added new unit test to ensure drop count was correct

**Testing Done**
Ran fuzzer and full test script
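
To illustrate why the remaining leaves need both a start and an end, here is a rough sketch of a double-ended walk over the leaf list. It is written under assumptions: leaves sit in a slice with index-based `prev`/`next` links, nothing is actually deallocated, and `Leaf`, `LeafRange`, and `linked` are hypothetical names rather than the crate's types.

```rust
/// A leaf with index-based sibling links (hypothetical layout).
struct Leaf {
    key: u32,
    prev: Option<usize>,
    next: Option<usize>,
}

/// Double-ended cursor over the leaves that have not been yielded yet.
struct LeafRange<'a> {
    leaves: &'a [Leaf],
    /// `(front, back)` of the remaining leaves; `None` once exhausted.
    ends: Option<(usize, usize)>,
}

impl<'a> Iterator for LeafRange<'a> {
    type Item = u32;

    fn next(&mut self) -> Option<u32> {
        let (front, back) = self.ends?;
        // Stop once front and back meet instead of following `next` blindly.
        self.ends = if front == back {
            None
        } else {
            self.leaves[front].next.map(|n| (n, back))
        };
        Some(self.leaves[front].key)
    }
}

impl<'a> DoubleEndedIterator for LeafRange<'a> {
    fn next_back(&mut self) -> Option<u32> {
        let (front, back) = self.ends?;
        self.ends = if front == back {
            None
        } else {
            self.leaves[back].prev.map(|p| (front, p))
        };
        Some(self.leaves[back].key)
    }
}

/// Build a list where leaf `i` is linked to `i - 1` and `i + 1`.
fn linked(keys: &[u32]) -> Vec<Leaf> {
    (0..keys.len())
        .map(|i| Leaf {
            key: keys[i],
            prev: i.checked_sub(1),
            next: if i + 1 < keys.len() { Some(i + 1) } else { None },
        })
        .collect()
}
```

After one `next_back()` on a four-leaf list, only the span from the front through the third leaf remains. A cleanup pass that followed `next` links from the front alone would also reach the fourth leaf, which the iterator has already handed out.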
**Description**
 - Fix bug where range iteration could sometimes be off by one on the end bound
 - Add debug_assertion checks on all inner node range iterator impls to check
   for "illegal" bound cases. Also add unit tests on these checks

**Motivation**
This change supports the range iterators change, since the inner node iterators
are required.

**Testing Done**
Added unit tests, ran full nightly test
**Description**
 - Implement the basic range iterator which returns tuples of
   `(key, value)`.
 - Add the `TreeMap::range` function and a doctest example
 - Remove some dead code comments that were lying around

**Motivation**
This is the central feature that I wanted to implement, the ability
to query a sub-section of the `TreeMap` using the native key types.

This commit only implements the `Range` iterator, not the `RangeMut`
iterator so that the initial commit is devoid of `macro_rules` stuff.

This commit also includes some useful machinery that I think could
be applied to the `Prefix*` iterators, namely the
`find_terminating_node` function.

**Testing Done**
Full nightly test. I was not able to run a fuzzer because of some
difficulties setting it up.
**Description**
 - Fix another issue in the compressed inner node range iterator
   bounds handling and add some additional tests.
 - Factor out some common code in the range iterator lookup function
 - Fix a bug in the header `Debug` impl when it would read out of
   slice bounds when there were implicit bytes in the header.

**Motivation**
I made these changes while extending the range iterator to support
`RangeMut`, and found the test case from `(Unbounded, Included(128))`

**Testing Done**
Added some more unit tests, tested with full nightly
**Description**
 - Add the `RangeMut` iterator by generalizing the existing range
   iterator using a `macro_rules` macro.
 - Also add the `TreeMap::range_mut` method and docs + example

**Motivation**
This change is needed to reach feature parity with the `BTreeMap`
which has a `range_mut` method.

**Testing Done**
Added doc test and new unit test. The `RangeMut` test coverage is
pretty low, but I'm banking on the fact that the implementation is
largely the same as `Range`.
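
For reference, this is what the standard library API that these commits aim for parity with looks like. The sketch uses only `std::collections::BTreeMap`; whether `TreeMap::range` and `TreeMap::range_mut` accept exactly the same arguments is an assumption not confirmed by this page. The helper name `bump_up_to` is hypothetical, and the bounds mirror the `(Unbounded, Included(128))` case mentioned in an earlier commit.

```rust
use std::collections::BTreeMap;
use std::ops::Bound::{Included, Unbounded};

/// Increment every value whose key is `<= upper`, then return the
/// affected entries in key order.
fn bump_up_to(map: &mut BTreeMap<u8, u32>, upper: u8) -> Vec<(u8, u32)> {
    // `range_mut` yields `(&K, &mut V)` pairs within the bounds.
    for (_key, value) in map.range_mut((Unbounded, Included(upper))) {
        *value += 1;
    }
    // `range` yields `(&K, &V)` pairs; `..=upper` is the same range.
    map.range(..=upper).map(|(k, v)| (*k, *v)).collect()
}
```

Calling `bump_up_to(&mut map, 128)` on a map holding keys 1, 128, and 200 changes the first two values and leaves the entry at 200 untouched.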
**Description**
 - Add new concrete pointer type which only points to inner nodes
 - Swap equality condition in `TreeMap::eq` function

**Motivation**
 - The new concrete pointer type is helpful in cases where I want to
   statically know that the pointer cannot be a leaf node pointer.
   There are a couple cases of `LeafNode(_) => unreachable!()` in
   match statements that I'd like to remove later
 - The equality condition was checking the expensive part first,
   having the check on number of elements go first means it could
   easily short-circuit on maps of different size

**Testing Done**
`cargo test`
**Description**
The specialized clone based on `deep_clone` was removed because it
did not account for the leaf node linked list. This commit
re-implements `clone()` using a non-recursive algorithm that should
be more efficient than creating an empty tree and inserting elements.

This commit is needed; otherwise adding the leaf node linked list
would be a significant perf hit.

This commit also removes the `deep_clone` functions and associated
stuff.

**Testing Done**
`./scripts/full-test.sh nightly`
**Description**
 - Apply some clippy lints to cleanup the build
 - Fix some test cases for miri, and reduce their size
 - Convert `debug_assert` to `assert` to make sure the panic
   message in tests is consistent in release mode
**Description**
 - Add `criterion` and `iai-callgrind` benchmarks for the range
   iterator
 - Modify some existing `criterion` benches to use the
   `dictionary_tree` helper
**Description**
 - Re-implement the prefix iterators using the leaf node linked-list
   and some of the helper functions from the `range` module.
 - Remove the `prefix_keys`, `prefix_values`, `prefix_values_mut`
   functions and iterators
 - Add some additional unit tests

**Motivation**
 - The new implementation is faster and uses less code, since it can
   re-use the search function from the `range` module.
 - These iterator variants didn't add much value, aside from matching
   the existing `keys`, `values`, `values_mut` iterators.

**Testing Done**
`./scripts/full-test.sh nightly`
@declanvk declanvk merged commit ff12851 into main Sep 16, 2024
4 checks passed
@declanvk declanvk deleted the range-pointers branch September 16, 2024 17:27
@Gab-Menezes
Collaborator

I know you already merged and did everything, but anyway nice changes, ty

Development

Successfully merging this pull request may close these issues.

Optimize IntoIter implementation Implement range tree iterator
2 participants