BTreeSet intersection, is_subset & difference optimizations #64820

ssomers · 2019-09-26T18:59:29Z

...based on the range of values contained; in particular, a massive improvement when these ranges are disjoint (or merely touching), like in the neg-vs-pos benchmarks already in liballoc. Inspired by #64383 but none of the ideas there worked out.

I introduced another variant in IntersectionInner and in DifferenceInner, because I couldn't find a way to initialize these iterators as empty if there's no empty set around.

Also, reduced the size of "large" sets in test cases - if Miri can't handle it, it was needlessly slowing down everyone.
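As a rough illustration of the range-based idea described above, here is a minimal sketch, not the code in this PR. The function name `intersection_sketch` and the use of `i32` elements are assumptions made for the example. It compares only the minimum and maximum of each set, answers immediately when the ranges are disjoint or merely touching, and otherwise falls back to the ordinary `intersection` iterator.

```rust
use std::cmp::Ordering;
use std::collections::BTreeSet;

/// Hypothetical helper (not part of liballoc): collects the intersection,
/// short-circuiting when the value ranges are disjoint or merely touching.
fn intersection_sketch(a: &BTreeSet<i32>, b: &BTreeSet<i32>) -> Vec<i32> {
    let (a_min, a_max) = match (a.iter().next(), a.iter().next_back()) {
        (Some(min), Some(max)) => (min, max),
        _ => return Vec::new(), // a is empty
    };
    let (b_min, b_max) = match (b.iter().next(), b.iter().next_back()) {
        (Some(min), Some(max)) => (min, max),
        _ => return Vec::new(), // b is empty
    };
    match (a_min.cmp(b_max), a_max.cmp(b_min)) {
        // a lies entirely above b, or entirely below it: nothing in common.
        (Ordering::Greater, _) | (_, Ordering::Less) => Vec::new(),
        // The ranges touch in exactly one value.
        (Ordering::Equal, _) => vec![*a_min],
        (_, Ordering::Equal) => vec![*a_max],
        // The ranges overlap: fall back to the general algorithm.
        _ => a.intersection(b).copied().collect(),
    }
}
```

In the disjoint and touching cases only the two ends of each set are inspected, which is where the neg-vs-pos style benchmarks would be expected to benefit.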

rust-highfive · 2019-09-26T18:59:39Z

r? @sfackler

(rust_highfive has picked a reviewer for you, use r? to override)

Centril · 2019-09-26T20:01:25Z

cc @scottmcm @bluss

ssomers

Meanwhile I tweaked the order in both match expressions to move first/min before last/max

ssomers · 2019-09-30T15:47:37Z

Property-based tests and performance comparison by Travis are now cleaned up and as complete as I can think of.

src/liballoc/collections/btree/set.rs

bluss · 2019-09-30T18:27:29Z

src/liballoc/collections/btree/set.rs

+                let mut other_iter = other.iter();
+                let other_min = other_iter.next().unwrap();
+                let other_max = other_iter.next_back().unwrap();
+                let mut self_iter = match (self_min.cmp(other_min), self_max.cmp(other_max)) {


In the previous method you use the Ord::cmp(x, y) style and here x.cmp(y). Either is fine but consistency is best.

I never noticed that. Let's count: we have 4 x Ord::cmp, 3 x cmp (counting pairs as one). Before I started messing about in this code, there was just 1 Ord::cmp and 2 cmp. Notice that cmp_opt acts as a replacement for Ord::cmp but uses cmp itself.

So I say, use the shorter, member cmp.
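For readers unfamiliar with the two spellings under discussion, a small sketch; the function name `compare_both_ways` is made up for the example. Both forms call the same `Ord::cmp` method and yield the same `Ordering`, so the choice is purely stylistic.

```rust
use std::cmp::Ordering;

/// Hypothetical demonstration: the path form and the method form of `cmp`
/// resolve to the same trait method.
fn compare_both_ways(x: &i32, y: &i32) -> (Ordering, Ordering) {
    let path_style = Ord::cmp(x, y); // fully qualified, trait-path style
    let method_style = x.cmp(y); // shorter, method-call style
    (path_style, method_style)
}
```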

bluss · 2019-09-30T18:30:44Z

Nice! Cool benchmark setup. I only had nitpicks to contribute to the review. Would love if there was a way to write this without .unwrap() (using discriminants for control flow instead), but it is clear enough that they can never panic here. r=me when nitpicks are fixed to taste

Centril · 2019-09-30T23:14:52Z

> Property-based tests and performance comparison by Travis are now cleaned up and as complete as I can think of.

Oh nice! -- Could we add the proptests to the test suite? cc @alexcrichton @nikomatsakis

ssomers · 2019-10-01T14:06:41Z

> write this without .unwrap

I tried several times, but always hit unsavory amounts of indentation, remote else clauses or eRFC 2497. But now I think I saw the light, resulting in a little less code that is more readable (mostly by dropping some of the micro-optimization). Peculiar indentation courtesy of cargo fmt.

r=bluss

ssomers · 2019-10-01T14:19:42Z

src/liballoc/collections/btree/set.rs

+        {
+            (other_min, other_max)
+        } else {
+            return false; // other is empty


This else-part cannot be reached, due to the performance shortcut on top. It's possible to:

- merge this let if with the if let above, but then it's not at all clear to the casual reader that it should return true
- write a panic! explaining this, better than raw unwrap I guess, but pointless extra code

unreachable!("message") is the panic for that, but since we don't need a panic - false is correct, it seems this works just as well.
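The pattern being discussed can be sketched as follows. This is a simplified illustration rather than the code under review: the name `is_subset_sketch`, the `i32` element type, and the plain `contains` loop at the end are assumptions for the example. It shows how `if let ... else { return ... }` obtains the bounds without `.unwrap()`, and why returning `false` in the second else branch is harmless even though the length check makes it unreachable.

```rust
use std::collections::BTreeSet;

/// Hypothetical, simplified subset check that avoids `.unwrap()`.
fn is_subset_sketch(a: &BTreeSet<i32>, b: &BTreeSet<i32>) -> bool {
    // Performance shortcut: a larger set can never be a subset.
    if a.len() > b.len() {
        return false;
    }
    let (a_min, a_max) =
        if let (Some(min), Some(max)) = (a.iter().next(), a.iter().next_back()) {
            (min, max)
        } else {
            return true; // a is empty, and the empty set is a subset of anything
        };
    let (b_min, b_max) =
        if let (Some(min), Some(max)) = (b.iter().next(), b.iter().next_back()) {
            (min, max)
        } else {
            // Not reachable after the length check above (a is non-empty here),
            // but `false` would still be the correct answer.
            return false;
        };
    // Range check: a must lie within the bounds of b.
    if a_min < b_min || a_max > b_max {
        return false;
    }
    // Simplified element check; the real code chooses between strategies.
    a.iter().all(|x| b.contains(x))
}
```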

ssomers · 2019-10-01T14:38:53Z

> Could we add the proptests to the test suite

I don't know what test suites there are, but seeing `if cfg!(miri) { // Miri is too slow` appear in the unit tests tells me not everyone would welcome proptests in the standard test suite. I could easily write a bunch of small unit tests covering every corner, but not in the current scheme with 1 test function testing every kind of intersection in 1 file covering everything about sets.
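For context, the `cfg!(miri)` pattern mentioned here can look roughly like the sketch below. The function names and the sizes 20 and 200 are invented for illustration and are not the values used in liballoc's tests. `cfg!(miri)` evaluates to `true` only when the code runs under Miri, so a test can pick a smaller input there.

```rust
use std::collections::BTreeSet;

/// Hypothetical size selector: a smaller "large" set when running under Miri.
fn large_set_len() -> usize {
    if cfg!(miri) {
        20 // Miri is too slow for the full size
    } else {
        200
    }
}

/// Builds the test set with whichever size applies to the current run.
fn build_large_set() -> BTreeSet<usize> {
    (0..large_set_len()).collect()
}
```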

bluss · 2019-10-01T18:48:44Z

@bors r+ rollup

Thanks!

bors · 2019-10-01T18:48:46Z

📌 Commit d132a70 has been approved by bluss


@ghost

Rollup of 7 pull requests

Successful merges:

- #63416 (apfloat: improve doc comments)
- #64820 (BTreeSet intersection, is_subset & difference optimizations)
- #64910 (syntax: cleanup param, method, and misc parsing)
- #64912 (Remove unneeded `fn main` blocks from docs)
- #64933 (Fixes #64919. Suggest fix based on operator precendence.)
- #64943 (Add lower bound doctests for `saturating_{add,sub}` signed ints)
- #64950 (Simplify interners)

Failed merges:

r? @ghost

@bluss

BTreeSet symmetric_difference & union optimized

No scalability changes, but:

- Grew the cmp_opt function (shared by symmetric_difference & union) into a MergeIter, with less memory overhead than the pairs of Peekable iterators now, speeding up ~20% on my machine (not so clear on Travis though, I actually switched it off there because it wasn't consistent about identical code). Mainly meant to improve readability by sharing code, though it does end up using more lines of code. Extending and reusing the MergeIter in btree_map might be better, but I'm not sure that's possible or desirable. This MergeIter probably pretends to be more generic than it is, yet doesn't declare to be an iterator because there's no need to, it's only there to help construct genuine iterators SymmetricDifference & Union.
- Compact the code of rust-lang#64820 by moving if/else into match guards.

r? @bluss
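To make the MergeIter idea in this follow-up description more concrete, here is a hedged sketch under stated assumptions. `MergeSketch`, `Peeked`, `nexts` and `union_sketch` are names chosen for the example, and the code is not claimed to match the implementation in that PR. The point it illustrates is that a merge of two sorted iterators needs to hold back at most one item in total, instead of one peeked item per side as with a pair of `Peekable` iterators.

```rust
use std::cmp::Ordering;
use std::collections::BTreeSet;

/// The single item held back from whichever side ran ahead.
enum Peeked<T> {
    A(T),
    B(T),
}

/// Hypothetical stand-in for the MergeIter described above.
/// Deliberately not an `Iterator`; it only helps build real iterators.
struct MergeSketch<I: Iterator> {
    a: I,
    b: I,
    peeked: Option<Peeked<I::Item>>,
}

impl<I: Iterator> MergeSketch<I>
where
    I::Item: Ord,
{
    fn new(a: I, b: I) -> Self {
        MergeSketch { a, b, peeked: None }
    }

    /// Returns the next item(s) in sorted order: one side only when that
    /// side is smaller, both sides when the items are equal.
    fn nexts(&mut self) -> (Option<I::Item>, Option<I::Item>) {
        let (mut a_next, mut b_next) = match self.peeked.take() {
            Some(Peeked::A(item)) => (Some(item), self.b.next()),
            Some(Peeked::B(item)) => (self.a.next(), Some(item)),
            None => (self.a.next(), self.b.next()),
        };
        let order = match (&a_next, &b_next) {
            (Some(x), Some(y)) => Some(x.cmp(y)),
            _ => None, // at least one side is exhausted
        };
        match order {
            // Hold back the larger item for the next call.
            Some(Ordering::Less) => self.peeked = b_next.take().map(Peeked::B),
            Some(Ordering::Greater) => self.peeked = a_next.take().map(Peeked::A),
            _ => {}
        }
        (a_next, b_next)
    }
}

/// Union built on top of the merge helper, collected into a Vec.
fn union_sketch(a: &BTreeSet<i32>, b: &BTreeSet<i32>) -> Vec<i32> {
    let mut merge = MergeSketch::new(a.iter(), b.iter());
    let mut out = Vec::new();
    loop {
        match merge.nexts() {
            (None, None) => break,
            (Some(x), _) => out.push(*x), // only in a, or equal in both
            (None, Some(y)) => out.push(*y), // only in b
        }
    }
    out
}
```

A symmetric difference could be built on the same helper by skipping the case where both sides return an item.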


rust-highfive assigned sfackler Sep 26, 2019

rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Sep 26, 2019

ssomers commented Sep 27, 2019


ssomers force-pushed the master branch from 0ed38b5 to f16fa72 Compare September 30, 2019 11:20

bluss reviewed Sep 30, 2019


bluss reviewed Sep 30, 2019


bluss reviewed Sep 30, 2019


BTreeSet intersection, difference & is_subnet optimizations

d132a70

ssomers force-pushed the master branch from f16fa72 to d132a70 Compare October 1, 2019 13:59

ssomers commented Oct 1, 2019


bluss changed the title ~~BTreeSet intersection, is_subnet & difference optimizations~~ BTreeSet intersection, is_subset & difference optimizations Oct 1, 2019

bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Oct 1, 2019

tmandry mentioned this pull request Oct 1, 2019

Rollup of 8 pull requests #64966

Closed

Centril mentioned this pull request Oct 1, 2019

Rollup of 8 pull requests #64971

Closed

Centril mentioned this pull request Oct 1, 2019

Rollup of 7 pull requests #64972

Merged

bors merged commit d132a70 into rust-lang:master Oct 2, 2019

ssomers mentioned this pull request Oct 8, 2019

BTreeSet symmetric_difference & union optimized #65226

Merged

ssomers mentioned this pull request Oct 19, 2019

Tracking issue for map_first_last: first/last methods on BTreeSet and BTreeMap #62924

Closed

6 tasks
