Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Switch to hashbrown's RawTable internally #131

Merged
merged 14 commits into from
Jul 18, 2020

Conversation

cuviper
Copy link
Member

@cuviper cuviper commented Jun 9, 2020

This switches IndexMapCore from its bespoke hash table to hashbrown::raw::RawTable<usize>, storing just the index into our ordered entries: Vec<Bucket<K, V>>. We lose the badge of having no unsafe code, but the overall implementation is much simpler, relying on the battle-hardened hashbrown for the tricky parts. I have also confirmed that the testsuite passes under cargo miri.

I'll post the benchmark comparison in a followup comment, but broadly speaking it appears faster for insertion and lookup, and slower for removal. I think that's a reasonable trade-off for this crate.

As a bonus, I also implemented a proper reserve and added shrink_to_fit, since they are now pretty straightforward.

cc @Amanieu -- thanks for hashbrown and for exposing RawTable to make this possible!

@cuviper cuviper requested a review from bluss June 9, 2020 23:38
@cuviper
Copy link
Member Author

cuviper commented Jun 9, 2020

Note that this does raise the MSRV to 1.32, but that's well more than a year old. Perhaps we should release 1.4.1 for other recent changes, then bump to 1.5.0 for this change. We can also upgrade to 2018 edition, but I didn't want to pollute this PR with that churn.

cargo benchcmp master hashbrown
 name                                         master ns/iter  hashbrown ns/iter  diff ns/iter   diff %  speedup
 entry_hashmap_150                            2,533           2,476                       -57   -2.25%   x 1.02
 entry_indexmap_150                           3,809           2,976                      -833  -21.87%   x 1.28
 few_retain_hashmap_100_000                   1,016,146       1,005,607               -10,539   -1.04%   x 1.01
 few_retain_indexmap_100_000                  999,829         490,861                -508,968  -50.91%   x 2.04
 grow_fnv_hashmap_100_000                     1,995,356       1,757,197              -238,159  -11.94%   x 1.14
 grow_fnv_indexmap_100_000                    2,959,801       2,424,241              -535,560  -18.09%   x 1.22
 half_retain_hashmap_100_000                  973,803         1,005,521                31,718    3.26%   x 0.97
 half_retain_indexmap_100_000                 1,131,965       900,842                -231,123  -20.42%   x 1.26
 hashmap_merge_shuffle                        494,258         467,592                 -26,666   -5.40%   x 1.06
 hashmap_merge_simple                         370,419         384,379                  13,960    3.77%   x 0.96
 indexmap_clone_for_sort_s                    370,587         377,810                   7,223    1.95%   x 0.98
 indexmap_clone_for_sort_u32                  4,987           19,638                   14,651  293.78%   x 0.25
 indexmap_merge_shuffle                       432,270         356,746                 -75,524  -17.47%   x 1.21
 indexmap_merge_simple                        361,191         269,597                 -91,594  -25.36%   x 1.34
 indexmap_simple_sort_s                       1,890,796       1,953,812                63,016    3.33%   x 0.97
 indexmap_simple_sort_u32                     646,615         631,954                 -14,661   -2.27%   x 1.02
 indexmap_sort_s                              1,597,762       1,591,000                -6,762   -0.42%   x 1.00
 indexmap_sort_u32                            520,058         526,665                   6,607    1.27%   x 0.99
 insert_hashmap_100_000                       2,179,030       2,364,903               185,873    8.53%   x 0.92
 insert_hashmap_10_000                        195,234         209,327                  14,093    7.22%   x 0.93
 insert_hashmap_150                           2,923           3,116                       193    6.60%   x 0.94
 insert_hashmap_int_bigvalue_10_000           229,529         232,528                   2,999    1.31%   x 0.99
 insert_hashmap_str_10_000                    194,549         196,252                   1,703    0.88%   x 0.99
 insert_hashmap_string_10_000                 970,037         944,748                 -25,289   -2.61%   x 1.03
 insert_hashmap_string_10_000                 878,069         923,558                  45,489    5.18%   x 0.95
 insert_hashmap_string_oneshot_10_000         929,238         941,393                  12,155    1.31%   x 0.99
 insert_indexmap_100_000                      2,125,932       2,246,446               120,514    5.67%   x 0.95
 insert_indexmap_10_000                       213,993         189,754                 -24,239  -11.33%   x 1.13
 insert_indexmap_150                          3,210           2,926                      -284   -8.85%   x 1.10
 insert_indexmap_int_bigvalue_10_000          282,849         262,640                 -20,209   -7.14%   x 1.08
 insert_indexmap_str_10_000                   226,982         207,124                 -19,858   -8.75%   x 1.10
 insert_indexmap_string_10_000                881,052         879,212                  -1,840   -0.21%   x 1.00
 insert_indexmap_string_10_000                936,610         882,226                 -54,384   -5.81%   x 1.06
 iter_black_box_hashmap_10_000                18,282          18,481                      199    1.09%   x 0.99
 iter_black_box_indexmap_10_000               3,028           3,007                       -21   -0.69%   x 1.01
 iter_sum_hashmap_10_000                      10,546          10,419                     -127   -1.20%   x 1.01
 iter_sum_indexmap_10_000                     2,045           2,016                       -29   -1.42%   x 1.01
 lookup_hashmap_100_000_inorder_multi         68,768          69,260                      492    0.72%   x 0.99
 lookup_hashmap_100_000_multi                 72,178          70,700                   -1,478   -2.05%   x 1.02
 lookup_hashmap_100_000_single                15              16                            1    6.67%   x 0.94
 lookup_hashmap_10_000_exist                  68,488          68,528                       40    0.06%   x 1.00
 lookup_hashmap_10_000_exist_string           122,444         123,332                     888    0.73%   x 0.99
 lookup_hashmap_10_000_exist_string_oneshot   101,518         101,856                     338    0.33%   x 1.00
 lookup_hashmap_10_000_noexist                66,051          66,800                      749    1.13%   x 0.99
 lookup_indexmap_100_000_inorder_multi        82,405          87,341                    4,936    5.99%   x 0.94
 lookup_indexmap_100_000_multi                124,984         121,523                  -3,461   -2.77%   x 1.03
 lookup_indexmap_100_000_single               25              24                           -1   -4.00%   x 1.04
 lookup_indexmap_10_000_exist                 119,710         89,257                  -30,453  -25.44%   x 1.34
 lookup_indexmap_10_000_exist_string          176,353         165,409                 -10,944   -6.21%   x 1.07
 lookup_indexmap_10_000_exist_string_oneshot  160,672         131,624                 -29,048  -18.08%   x 1.22
 lookup_indexmap_10_000_noexist               120,897         74,732                  -46,165  -38.19%   x 1.62
 many_retain_hashmap_100_000                  374,578         381,946                   7,368    1.97%   x 0.98
 many_retain_indexmap_100_000                 891,856         1,165,316               273,460   30.66%   x 0.77
 new_hashmap                                  5               5                             0    0.00%   x 1.00
 new_indexmap                                 4               6                             2   50.00%   x 0.67
 pop_indexmap_100_000                         1,079,201       1,611,619               532,418   49.33%   x 0.67
 shift_remove_indexmap_100_000_few            14,464,051      15,427,596              963,545    6.66%   x 0.94
 shift_remove_indexmap_2_000_full             2,188,620       2,642,177               453,557   20.72%   x 0.83
 swap_remove_indexmap_100_000                 3,587,036       4,523,900               936,864   26.12%   x 0.79
 with_capacity_10e5_hashmap                   562             556                          -6   -1.07%   x 1.01
 with_capacity_10e5_indexmap                  1,511           215                      -1,296  -85.77%   x 7.03

@cuviper
Copy link
Member Author

cuviper commented Jun 9, 2020

For future work, I've also been playing with parameterizing the index type, since rustc likes to use newtype-u32 indexes. We can extend to something like this without a breaking change:

pub struct IndexMap<K, V, S = RandomState, Idx = usize> { ... }
pub struct IndexSet<T, S = RandomState, Idx = usize> { ... }

Then internally we'd switch to RawTable<Idx>.

src/map.rs Show resolved Hide resolved
src/map.rs Show resolved Hide resolved
src/map_core.rs Outdated Show resolved Hide resolved
@bluss
Copy link
Member

bluss commented Jun 10, 2020

Really cool work.

So we lose the 32-bit index optimization, and still have some performance improvements, does that mean that there is more to gain?

We are giving up being implemented in safe Rust and can only do so if we show performance improvements that are above the noise. Some of the lookup benchmark cases do that, and the insert cases barely do it. The improved lookup benches are really encouraging.

I'll read RawTable and then come back to reviewing. Are you going to start experimenting with this version of indexmap in rustc, before it gets merged?

@cuviper
Copy link
Member Author

cuviper commented Jun 10, 2020

So we lose the 32-bit index optimization, and still have some performance improvements, does that mean that there is more to gain?

Maybe, but that answer depends on whether I understand that optimization. 🙂

AIUI, the benefit of the 32-bit index is that we can stuff a short hash in the other 32-bits, which means we can use the hash without another memory access to entries. That's used for:

  • Computing probe distance for Robin Hood hashing optimizations.
    • Hashbrown doesn't do Robin Hood, so this doesn't apply.
  • Comparing 32-bit hashes as a short-circuit before checking key equality.
    • Hashbrown already keeps 7 bits of the hash in its control word, which it matches by SIMD before key equality. Of course that's not as good as 32-bit, but it's something.
  • Using the short hash for reinsertion after resizing
    • We could potentially do this with hashbrown, but only if we're sure of the size class during every resizing operation.
    • In insert, hashbrown can avoid resizing even with zero capacity if the new value replaces a tombstone. We can't predict that from outside, so in order to control the size class, we'd have to force an unconditional reserve(1) beforehand.

Generally, I prefer the simplicity without those Pos hacks, especially since the performance holds. Also, that makes it easier to add in custom Idx types like I mentioned for future work.

Are you going to start experimenting with this version of indexmap in rustc, before it gets merged?

I do have local builds where I'm using this. So far it's a very slight improvement, less than 1%, but that's a good thing! For example, they reported a 21% slowdown in unused-warnings-check when switching to indexmap here, which I did confirm, but I see slightly improved performance with my new indexmap. That puts us in the realm of zero-cost abstractions, where FxIndexSet<T> performs similarly to a manual FxHashMap<T, Idx>, Vec<T> combination.

Here are those branches:
https://github.com/cuviper/indexmap/tree/shredded-potatoes-index-rustc
https://github.com/cuviper/rust/tree/indexmap

(You may need to adjust the rust Cargo.toml patch if you want to compile this yourself.)

src/map_core.rs Outdated Show resolved Hide resolved
@cuviper
Copy link
Member Author

cuviper commented Jun 18, 2020

@bluss where do you stand on this? Just waiting for time to review it?

@bluss
Copy link
Member

bluss commented Jun 19, 2020

This weekend should have some time where I can review

Cargo.toml Outdated Show resolved Hide resolved
src/map_core.rs Outdated Show resolved Hide resolved
src/map.rs Show resolved Hide resolved
src/map_core.rs Outdated Show resolved Hide resolved
src/map_core.rs Outdated Show resolved Hide resolved
@bluss
Copy link
Member

bluss commented Jun 20, 2020

There isn't much unsafe code, that's nice, but the questions of simultaneous reference and raw pointer validity stump me at the moment, so it's not trivial code.

@bluss
Copy link
Member

bluss commented Jun 28, 2020

Thanks all for your input on the discussion, and helping me find some solid footing again with the raw pointer code. Thanks cuviper for working on this - I'll work on helping you wrap this up as soon as you want.

There are two discomforting things left; I think it's something that's rather general to come up in this situation

  • We handle raw buckets (raw pointer wrappers) that are passed in structs and between function scopes

    • Usually in Rust, we use safe abstractions, maybe just a simple lifetime to help us with the book keeping when leaving local reasoning. Lifetimes often help us convert the problem into just using local reasoning instead of global reasoning.
      I'm not saying it's worth the trouble to wrap this in further abstractions here, to make it safer, but it's the things we try to do if it's possible.
  • We have internal functions that don't follow the usual safe/unsafe distinction

    • I believe it is best practice, that for example fn swap_remove_bucket(&mut self, raw_bucket: RawBucket) -> (usize, K, V),
      should be an unsafe method. The reason is that we must trust the caller to pass a correct raw bucket.
      I'll push a commit with this change.

    • We end up in a gray area here, since struct fields can not be made unsafe to assign (known issue), so we don't follow this completely stringently.

@bluss
Copy link
Member

bluss commented Jun 28, 2020

This might be the first time we raise the MSRV in 1.x, but with a good reason.

It looks like we might break serde_json, which builds CI with Rust 1.31? cc @dtolnay, when we raise ours to Rust 1.32 here.

Other top rdeps - toml, http, petgraph, they look fine from MSRV standpoint.

@cuviper
Copy link
Member Author

cuviper commented Jun 29, 2020

  • We handle raw buckets (raw pointer wrappers) that are passed in structs and between function scopes

    • Usually in Rust, we use safe abstractions, maybe just a simple lifetime to help us with the book keeping when leaving local reasoning. Lifetimes often help us convert the problem into just using local reasoning instead of global reasoning.
      I'm not saying it's worth the trouble to wrap this in further abstractions here, to make it safer, but it's the things we try to do if it's possible.

I'll play a little with abstracting this. It would be nice if we could contain unsafe even more tightly.

  • We have internal functions that don't follow the usual safe/unsafe distinction

    • I believe it is best practice, that for example fn swap_remove_bucket(&mut self, raw_bucket: RawBucket) -> (usize, K, V),
      should be an unsafe method. The reason is that we must trust the caller to pass a correct raw bucket.
      I'll push a commit with this change.

    • We end up in a gray area here, since struct fields can not be made unsafe to assign (known issue), so we don't follow this completely stringently.

Yeah, this is kind of like the tootsie-pop model, just like Vec's length and capacity are innocent-looking fields with important safety implications.

This might be the first time we raise the MSRV in 1.x, but with a good reason.

If we entertain the idea of indexmap 2.0, are there other changes you would want?

It looks like we might break serde_json, which builds CI with Rust 1.31?

I am reluctant to push MSRV changes on others, but at least that is only for their non-default preserve_order feature. Still, they have CI to test that on 1.31, so they'd have to at least hold that back in CI to keep working. Such downgrades are possible, but annoying to deal with.

@cuviper
Copy link
Member Author

cuviper commented Jun 29, 2020

I'll push a commit with this change.

That looks fine, thanks.

@bluss
Copy link
Member

bluss commented Jun 29, 2020

I don't think it's wrong to carefully raise the MSRV, it's the plan we have documented and promised since 1.0. However we can discuss it with those that depend on us.

@cuviper
Copy link
Member Author

cuviper commented Jun 29, 2020

I tried an abstraction to wrap the raw bucket with an appropriate lifetime, by bundling to a map reference. We really only need that in OccupiedEntry, and I found that this structure basically is that wrapper already, so trying to add a separate layer just adds confusing indirection.

I settled instead on just encapsulating this in its own module. I moved map_core to map::core, then added a further map::core::raw for anything that has to deal with raw buckets, which is the only unsafe code.

@bluss
Copy link
Member

bluss commented Jun 29, 2020

I agree

@bluss
Copy link
Member

bluss commented Jun 30, 2020

I can't really think of any breaking changes that are on the wishlist.

I definitely don't want to pile on work here, but a 2.0 version could include the parameterization by index type. Then we can update the Default impl to be fully generic without having to fear type inference regressions (and the other new* methods too?).

The "experimental" things - Equivalent trait works well, the MutableKeys trait I don't know, so those things seem like they could stay as they are for 2.0.

Which method to make the default .remove() - I still think the current decision is fine, no change from my end for 2.0.
(Good way to document this would be to say that "remove is the fastest way to remove an element, and if order should be preserved, ...")

src/map/core/raw.rs Outdated Show resolved Hide resolved
src/map/core/raw.rs Outdated Show resolved Hide resolved
src/map/core.rs Outdated Show resolved Hide resolved
@cuviper cuviper mentioned this pull request Jul 2, 2020
7 tasks
@cuviper
Copy link
Member Author

cuviper commented Jul 2, 2020

I opened #135 for 2.0 discussion.

@bluss
Copy link
Member

bluss commented Jul 8, 2020

I'd vote to merge this and go for indexmap 1.5 with this

@jonhoo
Copy link

jonhoo commented Jul 8, 2020

Not sure if my vote counts, but I agree :p

@cuviper
Copy link
Member Author

cuviper commented Jul 8, 2020

I'm fine with that.

For new APIs in the release notes, we have reverse from #128, and this PR's shrink_to_fit and improved reserve. Maybe clone_from in #125 is worth mentioning, just to know that you can now clone into an existing allocation.

cuviper and others added 13 commits July 16, 2020 14:16
These methods trust their caller to pass correct RawBucket values, so we
mark them unsafe to use the common safe/unsafe distinction. I used
allow(unused_unsafe) to write the functions in the (hopefully) future
style of internal unsafe blocks in unsafe functions.
It was an over-optimization to use `clear_no_drop`, which hurts the
possibility of using griddle as an alternate table implementation.
@cuviper
Copy link
Member Author

cuviper commented Jul 16, 2020

I've added the version bump and some release notes to this PR.

@Amanieu
Copy link

Amanieu commented Jul 16, 2020

insert_no_grow is now available in hashbrown 0.8.1.

@cuviper
Copy link
Member Author

cuviper commented Jul 16, 2020

Great! I've updated to use the new hashbrown methods.

@cuviper
Copy link
Member Author

cuviper commented Jul 18, 2020

Let's ship it!

@cuviper cuviper merged commit d293267 into indexmap-rs:master Jul 18, 2020
@cuviper cuviper deleted the shredded-potatoes branch July 23, 2020 21:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants