Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BTreeMap navigation done safer & faster #68827

Merged
merged 3 commits into from
Feb 28, 2020

Conversation

ssomers
Copy link
Contributor

@ssomers ssomers commented Feb 4, 2020

It turns out that there was a faster way to do the tree navigation code bundled in #67073, by moving from edge to KV and from KV to next edge separately. It extracts most of the code as safe functions, and contains the duplication of handles within the short wrapper functions.

This somehow hits a sweet spot in the compiler because it reports boosts all over the board:

>cargo benchcmp pre3.txt posz4.txt --threshold 5
 name                                           pre3.txt ns/iter  posz4.txt ns/iter  diff ns/iter   diff %  speedup
 btree::map::first_and_last_0                   40                37                           -3   -7.50%   x 1.08
 btree::map::first_and_last_100                 58                44                          -14  -24.14%   x 1.32
 btree::map::iter_1000                          8,920             3,419                    -5,501  -61.67%   x 2.61
 btree::map::iter_100000                        1,069,290         411,615                -657,675  -61.51%   x 2.60
 btree::map::iter_20                            169               58                         -111  -65.68%   x 2.91
 btree::map::iter_mut_1000                      8,701             3,303                    -5,398  -62.04%   x 2.63
 btree::map::iter_mut_100000                    1,034,560         405,975                -628,585  -60.76%   x 2.55
 btree::map::iter_mut_20                        165               58                         -107  -64.85%   x 2.84
 btree::set::clone_100                          1,831             1,562                      -269  -14.69%   x 1.17
 btree::set::clone_100_and_clear                1,831             1,565                      -266  -14.53%   x 1.17
 btree::set::clone_100_and_into_iter            1,917             1,541                      -376  -19.61%   x 1.24
 btree::set::clone_100_and_pop_all              2,609             2,441                      -168   -6.44%   x 1.07
 btree::set::clone_100_and_remove_all           4,598             3,927                      -671  -14.59%   x 1.17
 btree::set::clone_100_and_remove_half          2,765             2,551                      -214   -7.74%   x 1.08
 btree::set::clone_10k                          191,610           164,616                 -26,994  -14.09%   x 1.16
 btree::set::clone_10k_and_clear                192,003           164,616                 -27,387  -14.26%   x 1.17
 btree::set::clone_10k_and_into_iter            200,037           163,010                 -37,027  -18.51%   x 1.23
 btree::set::clone_10k_and_pop_all              267,023           250,913                 -16,110   -6.03%   x 1.06
 btree::set::clone_10k_and_remove_all           536,230           464,100                 -72,130  -13.45%   x 1.16
 btree::set::clone_10k_and_remove_half          453,350           430,545                 -22,805   -5.03%   x 1.05
 btree::set::difference_random_100_vs_100       1,787             801                        -986  -55.18%   x 2.23
 btree::set::difference_random_100_vs_10k       2,978             2,696                      -282   -9.47%   x 1.10
 btree::set::difference_random_10k_vs_100       111,075           54,734                  -56,341  -50.72%   x 2.03
 btree::set::difference_random_10k_vs_10k       246,380           175,980                 -70,400  -28.57%   x 1.40
 btree::set::difference_staggered_100_vs_100    1,789             951                        -838  -46.84%   x 1.88
 btree::set::difference_staggered_100_vs_10k    2,798             2,606                      -192   -6.86%   x 1.07
 btree::set::difference_staggered_10k_vs_10k    176,452           97,401                  -79,051  -44.80%   x 1.81
 btree::set::intersection_100_neg_vs_10k_pos    34                32                           -2   -5.88%   x 1.06
 btree::set::intersection_100_pos_vs_100_neg    30                27                           -3  -10.00%   x 1.11
 btree::set::intersection_random_100_vs_100     1,537             613                        -924  -60.12%   x 2.51
 btree::set::intersection_random_100_vs_10k     2,793             2,649                      -144   -5.16%   x 1.05
 btree::set::intersection_random_10k_vs_10k     222,127           147,166                 -74,961  -33.75%   x 1.51
 btree::set::intersection_staggered_100_vs_100  1,447             622                        -825  -57.01%   x 2.33
 btree::set::intersection_staggered_100_vs_10k  2,606             2,382                      -224   -8.60%   x 1.09
 btree::set::intersection_staggered_10k_vs_10k  143,620           58,790                  -84,830  -59.07%   x 2.44
 btree::set::is_subset_100_vs_100               1,349             488                        -861  -63.83%   x 2.76
 btree::set::is_subset_100_vs_10k               1,720             1,428                      -292  -16.98%   x 1.20
 btree::set::is_subset_10k_vs_10k               135,984           48,527                  -87,457  -64.31%   x 2.80

The first_and_last ones are noise (they don't do iteration), the others seem genuine.
As always, approved by Miri.

Also, a separate commit with some more benchmarks of mutable behaviour (which also benefit).

r? @cuviper

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Feb 4, 2020
@ssomers
Copy link
Contributor Author

ssomers commented Feb 8, 2020

Last commit merely:

  • fixes commit message of the benchmark commit
  • rebases on the last master that allows benchmarking
  • adds pub to the safe navigate functions

@Dylan-DPC-zz
Copy link

r? @Amanieu

@rust-highfive rust-highfive assigned Amanieu and unassigned cuviper Feb 22, 2020
@Amanieu
Copy link
Member

Amanieu commented Feb 24, 2020

I'm really not familiar with the BTree code, but I believe @Mark-Simulacrum has expressed interest.

r? @Mark-Simulacrum

@Mark-Simulacrum
Copy link
Member

I will try to take a look soon, though given the relatively hairy nature of the BTreeMap code it'll likely be some time until I can allocate a big enough block of time to dig in sufficiently.

@ssomers
Copy link
Contributor Author

ssomers commented Feb 26, 2020

Had to move documentation because apparently a recent change in master causes these unused doc comment errors.

@Mark-Simulacrum
Copy link
Member

Could you provide a medium-level description of what this is doing? I likely don't have enough time to try to reconstruct that from the diff, which looks to be moving a bunch of stuff around (I guess splitting something into two?).

@ssomers
Copy link
Contributor Author

ssomers commented Feb 27, 2020

It's purely about how the code in navigate.rs is ordered and grouped into functions: more but simpler functions now. The previous code had functions moving from an edge to the next edge while also returning the KV in between, which for mutable trees implies juggling with 2 mutable handles into the same tree, i.e. functions that are unsafe and private and there to support exactly the same code as before #67073. This PR has separate functions taking half steps, from edge to next KV and from KV to bordering edge, sticking to a single handle like the basic navigation in node.rs, i.e. functions that are safe and exposed within btree. There's no intended change to the interface of any previously exposed and used functions, the heart of iterators. There's no actual change to their implementation in my mind, but it's a stupid mind and benchmarks contradict it. The "we have to do this last" code that #58431 repaired still happens last and Miri is still happy.

@ssomers
Copy link
Contributor Author

ssomers commented Feb 27, 2020

Last commit is to rename the new functions and squash.

Meanwhile my mind remembers that there is one "functional" change: the new function called next_leaf_edge does a .force(), i.e. reads the NodeRef::height attribute, to remember the type of node. The previous code didn't need this because it was inlined and aware of the type, before erasing that type with .forget_type(). You'd think this might have some negative impact on performance, but apparently the opposite is true. You'd think this cannot have a negative impact on behaviour, because we're already reading the node attribute, and I still believe that.

Copy link
Member

@Mark-Simulacrum Mark-Simulacrum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is looking good; left a few comments that I think will simplify things even more.

Ok(internal_kv) => return Ok(internal_kv.forget_node_type()),
Err(last_edge) => match last_edge.into_node().ascend() {
Ok(parent_edge) => parent_edge,
Err(root) => return Err(root.forget_type()),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we avoid duplicating this code? It looks like we should be able to, by starting out with let mut parent_edge = self;

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, indeed. Not quite starting from there but from:

 let mut edge = self.forget_node_type();

which has to be defined for edges too, but then it works. Benchmarking.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seriously? Another factor 2 boost? It's getting ridiculous.

unwrap_unchecked(next_level)
macro_rules! def_next_kv {
{ pub fn $name:ident : $adjacent_kv:ident } => {
/// Given a leaf edge handle, returns [`Result::Ok`] with a handle to the KV next to it,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
/// Given a leaf edge handle, returns [`Result::Ok`] with a handle to the KV next to it,
/// Given a leaf edge handle, returns [`Result::Ok`] with a handle to the neighboring KV (either left or right, see name of this function),

I'm not really happy about having to say things like this but I was pretty confused otherwise. It may make sense to lift the documentation out of the macro, if possible.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know if it's possible anymore. I had to move it there because of these unused doc comment errors.

But if the simplification above pays off, it may be worth while to inline the macro and repeat the code, like I did for next_leaf_edge and next_back_leaf_edge.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure. We may be able to also use generics instead of the macro, though that wouldn't solve the documentation issue.

Copy link
Contributor Author

@ssomers ssomers Feb 27, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can probably also pass identifiers/tokens to the macro and let it generate the right documentation each time. But when I do "x.py doc", the doc for that internal function appears nowhere. And who would look for it anyways.

src/liballoc/collections/btree/navigate.rs Show resolved Hide resolved
src/liballoc/collections/btree/navigate.rs Outdated Show resolved Hide resolved
*self = next_edge;
let kv = ptr::read(self).next_kv();
let kv = unwrap_unchecked(kv.ok());
*self = ptr::read(&kv).next_leaf_edge();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please also replace the *self = ... with a ptr::write(self, ...) because that way it'll do the right thing even if self gains a drop impl or so.

src/liballoc/collections/btree/navigate.rs Outdated Show resolved Hide resolved
@ssomers
Copy link
Contributor Author

ssomers commented Feb 27, 2020

Latest benchmarks:

cargo benchcmp old4.txt newer2.txt --threshold 5
 name                                           old4.txt ns/iter  newer2.txt ns/iter  diff ns/iter   diff %  speedup
 btree::map::find_seq_100                       20                18                            -2  -10.00%   x 1.11
 btree::map::first_and_last_100                 42                46                             4    9.52%   x 0.91
 btree::map::first_and_last_10k                 76                66                           -10  -13.16%   x 1.15
 btree::map::iter_1000                          8,988             960                       -8,028  -89.32%   x 9.36
 btree::map::iter_100000                        1,042,720         277,573                 -765,147  -73.38%   x 3.76
 btree::map::iter_20                            170               24                          -146  -85.88%   x 7.08
 btree::map::iter_mut_1000                      8,401             929                       -7,472  -88.94%   x 9.04
 btree::map::iter_mut_100000                    1,033,090         277,953                 -755,137  -73.09%   x 3.72
 btree::map::iter_mut_20                        166               21                          -145  -87.35%   x 7.90
 btree::set::clone_100                          1,812             1,623                       -189  -10.43%   x 1.12
 btree::set::clone_100_and_clear                1,844             1,585                       -259  -14.05%   x 1.16
 btree::set::clone_100_and_into_iter            1,904             1,564                       -340  -17.86%   x 1.22
 btree::set::clone_100_and_remove_all           4,567             3,869                       -698  -15.28%   x 1.18
 btree::set::clone_10k                          197,038           170,053                  -26,985  -13.70%   x 1.16
 btree::set::clone_10k_and_clear                187,530           168,030                  -19,500  -10.40%   x 1.12
 btree::set::clone_10k_and_into_iter            201,240           166,126                  -35,114  -17.45%   x 1.21
 btree::set::clone_10k_and_remove_all           502,770           459,700                  -43,070   -8.57%   x 1.09
 btree::set::difference_random_100_vs_100       1,764             617                       -1,147  -65.02%   x 2.86
 btree::set::difference_random_100_vs_10k       2,988             2,635                       -353  -11.81%   x 1.13
 btree::set::difference_random_10k_vs_100       110,336           45,705                   -64,631  -58.58%   x 2.41
 btree::set::difference_random_10k_vs_10k       244,187           156,345                  -87,842  -35.97%   x 1.56
 btree::set::difference_staggered_100_vs_100    1,780             643                       -1,137  -63.88%   x 2.77
 btree::set::difference_staggered_100_vs_10k    2,817             2,425                       -392  -13.92%   x 1.16
 btree::set::difference_staggered_10k_vs_10k    176,294           61,357                  -114,937  -65.20%   x 2.87
 btree::set::intersection_random_100_vs_100     1,537             333                       -1,204  -78.33%   x 4.62
 btree::set::intersection_random_100_vs_10k     2,723             2,262                       -461  -16.93%   x 1.20
 btree::set::intersection_random_10k_vs_100     2,837             2,382                       -455  -16.04%   x 1.19
 btree::set::intersection_random_10k_vs_10k     224,545           117,043                 -107,502  -47.88%   x 1.92
 btree::set::intersection_staggered_100_vs_100  1,440             358                       -1,082  -75.14%   x 4.02
 btree::set::intersection_staggered_100_vs_10k  2,559             2,101                       -458  -17.90%   x 1.22
 btree::set::intersection_staggered_10k_vs_10k  144,611           32,502                  -112,109  -77.52%   x 4.45
 btree::set::is_subset_100_vs_100               1,348             280                       -1,068  -79.23%   x 4.81
 btree::set::is_subset_100_vs_10k               1,751             1,276                       -475  -27.13%   x 1.37
 btree::set::is_subset_10k_vs_10k               134,618           26,777                  -107,841  -80.11%   x 5.03

Note that without an [inline] on the new replace function, the performance gain is annihilated.

@rust-highfive
Copy link
Collaborator

The job mingw-check of your PR failed (pretty log, raw log). Through arcane magic we have determined that the following fragments from the build log may contain information about the problem.

Click to expand the log.
2020-02-27T22:08:38.6739167Z ========================== Starting Command Output ===========================
2020-02-27T22:08:38.6742601Z [command]/bin/bash --noprofile --norc /home/vsts/work/_temp/0c0122db-32e7-4431-95c9-7d8efb756f22.sh
2020-02-27T22:08:38.6742967Z 
2020-02-27T22:08:38.6746951Z ##[section]Finishing: Disable git automatic line ending conversion
2020-02-27T22:08:38.6762051Z ##[section]Starting: Checkout rust-lang/rust@refs/pull/68827/merge to s
2020-02-27T22:08:38.6765020Z Task         : Get sources
2020-02-27T22:08:38.6765238Z Description  : Get sources from a repository. Supports Git, TfsVC, and SVN repositories.
2020-02-27T22:08:38.6765449Z Version      : 1.0.0
2020-02-27T22:08:38.6765593Z Author       : Microsoft
---
2020-02-27T22:08:39.6897962Z ##[command]git remote add origin https://github.com/rust-lang/rust
2020-02-27T22:08:39.6903823Z ##[command]git config gc.auto 0
2020-02-27T22:08:39.6910247Z ##[command]git config --get-all http.https://github.com/rust-lang/rust.extraheader
2020-02-27T22:08:39.6919946Z ##[command]git config --get-all http.proxy
2020-02-27T22:08:39.6925089Z ##[command]git -c http.extraheader="AUTHORIZATION: basic ***" fetch --force --tags --prune --progress --no-recurse-submodules --depth=2 origin +refs/heads/*:refs/remotes/origin/* +refs/pull/68827/merge:refs/remotes/pull/68827/merge
---
2020-02-27T22:13:43.2328572Z     Checking rustc-std-workspace-core v1.99.0 (/checkout/src/tools/rustc-std-workspace-core)
2020-02-27T22:13:43.2607603Z    Compiling hashbrown v0.6.2
2020-02-27T22:13:46.3032532Z     Checking cfg-if v0.1.10
2020-02-27T22:13:46.3432985Z     Checking alloc v0.0.0 (/checkout/src/liballoc)
2020-02-27T22:13:46.9935651Z error[E0599]: no method named `forget_node_type` found for struct `collections::btree::node::Handle<collections::btree::node::NodeRef<collections::btree::node::marker::Mut<'_>, K, V, collections::btree::node::marker::Leaf>, collections::btree::node::marker::KV>` in the current scope
2020-02-27T22:13:46.9937203Z     |
2020-02-27T22:13:46.9937203Z     |
2020-02-27T22:13:46.9937743Z 680 |                 handle: kv.forget_node_type(),
2020-02-27T22:13:46.9968520Z     |                            ^^^^^^^^^^^^^^^^ method not found in `collections::btree::node::Handle<collections::btree::node::NodeRef<collections::btree::node::marker::Mut<'_>, K, V, collections::btree::node::marker::Leaf>, collections::btree::node::marker::KV>`
2020-02-27T22:13:46.9970341Z    ::: src/liballoc/collections/btree/node.rs:782:1
2020-02-27T22:13:46.9970955Z     |
2020-02-27T22:13:46.9970955Z     |
2020-02-27T22:13:46.9971480Z 782 | pub struct Handle<Node, Type> {
2020-02-27T22:13:46.9972191Z     | ----------------------------- method `forget_node_type` not found for this
2020-02-27T22:13:46.9976340Z 
2020-02-27T22:13:47.0015778Z error[E0599]: no method named `forget_node_type` found for struct `collections::btree::node::Handle<collections::btree::node::NodeRef<collections::btree::node::marker::Mut<'_>, K, V, collections::btree::node::marker::Leaf>, collections::btree::node::marker::KV>` in the current scope
2020-02-27T22:13:47.0017429Z     |
2020-02-27T22:13:47.0017429Z     |
2020-02-27T22:13:47.0017976Z 743 |                 handle: kv.forget_node_type(),
2020-02-27T22:13:47.0019257Z     |                            ^^^^^^^^^^^^^^^^ method not found in `collections::btree::node::Handle<collections::btree::node::NodeRef<collections::btree::node::marker::Mut<'_>, K, V, collections::btree::node::marker::Leaf>, collections::btree::node::marker::KV>`
2020-02-27T22:13:47.0020637Z    ::: src/liballoc/collections/btree/node.rs:782:1
2020-02-27T22:13:47.0021065Z     |
2020-02-27T22:13:47.0021065Z     |
2020-02-27T22:13:47.0021525Z 782 | pub struct Handle<Node, Type> {
2020-02-27T22:13:47.0022235Z     | ----------------------------- method `forget_node_type` not found for this
2020-02-27T22:13:47.1497590Z     Checking rustc-demangle v0.1.16
2020-02-27T22:13:47.3916166Z     Checking panic_abort v0.0.0 (/checkout/src/libpanic_abort)
2020-02-27T22:13:47.5201805Z     Checking backtrace v0.3.44
2020-02-27T22:13:47.5825119Z error: aborting due to 2 previous errors
---
2020-02-27T22:13:47.6851237Z   local time: Thu Feb 27 22:13:47 UTC 2020
2020-02-27T22:13:48.2231992Z   network time: Thu, 27 Feb 2020 22:13:48 GMT
2020-02-27T22:13:48.2237597Z == end clock drift check ==
2020-02-27T22:13:49.2268342Z 
2020-02-27T22:13:49.2328430Z ##[error]Bash exited with code '1'.
2020-02-27T22:13:49.2351783Z ##[section]Finishing: Run build
2020-02-27T22:13:49.2389824Z ##[section]Starting: Checkout rust-lang/rust@refs/pull/68827/merge to s
2020-02-27T22:13:49.2393985Z Task         : Get sources
2020-02-27T22:13:49.2394271Z Description  : Get sources from a repository. Supports Git, TfsVC, and SVN repositories.
2020-02-27T22:13:49.2394537Z Version      : 1.0.0
2020-02-27T22:13:49.2394738Z Author       : Microsoft
2020-02-27T22:13:49.2394738Z Author       : Microsoft
2020-02-27T22:13:49.2395026Z Help         : [More Information](https://go.microsoft.com/fwlink/?LinkId=798199)
2020-02-27T22:13:49.2395364Z ==============================================================================
2020-02-27T22:13:49.5169374Z Cleaning any cached credential from repository: rust-lang/rust (GitHub)
2020-02-27T22:13:49.5203873Z ##[section]Finishing: Checkout rust-lang/rust@refs/pull/68827/merge to s
2020-02-27T22:13:49.5272396Z Cleaning up task key
2020-02-27T22:13:49.5273532Z Start cleaning up orphan processes.
2020-02-27T22:13:49.5417795Z Terminate orphan process: pid (3901) (python)
2020-02-27T22:13:49.5526230Z ##[section]Finishing: Finalize Job

I'm a bot! I can only do what humans tell me to, so if this was not helpful or you have suggestions for improvements, please ping or otherwise contact @TimNN. (Feature Requests)

Copy link
Member

@Mark-Simulacrum Mark-Simulacrum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please replace the merge commit with its parent(s) (i.e., we do not want merge commits, but e.g. a rebase or so is fine). Otherwise, this looks good, I will approve once that's done.

src/liballoc/collections/btree/navigate.rs Outdated Show resolved Hide resolved
@ssomers
Copy link
Contributor Author

ssomers commented Feb 28, 2020

Please replace the merge commit with its parent(s) (i.e., we do not want merge commits, but e.g. a rebase or so is fine).

Rebasing would make things a lot easier, but kills benchmarking. So here's one with the actual commits squashed again, and the commit merged in master cherry picked, still based on the last master that allowed benchmarking. If that's still not okay, I'll rebase.

@Mark-Simulacrum
Copy link
Member

Yeah, that works too. Just didn't want the merge commit :)

I'm surprised/amazed/confused by the benchmark deltas, but since they're not negative they seem fine.

Thanks!

@bors r+

@bors
Copy link
Contributor

bors commented Feb 28, 2020

📌 Commit 9f7b58f has been approved by Mark-Simulacrum

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Feb 28, 2020
@bors
Copy link
Contributor

bors commented Feb 28, 2020

⌛ Testing commit 9f7b58f with merge e2223c9...

@bors
Copy link
Contributor

bors commented Feb 28, 2020

☀️ Test successful - checks-azure
Approved by: Mark-Simulacrum
Pushing e2223c9 to master...

@bors bors added the merged-by-bors This PR was explicitly merged by bors. label Feb 28, 2020
@bors bors merged commit e2223c9 into rust-lang:master Feb 28, 2020
@ssomers ssomers deleted the btree_navigation_revisited branch March 9, 2020 08:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
merged-by-bors This PR was explicitly merged by bors. S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants