Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Slowdown when inserting elements in the iteration order of another FxHashMap #45

Open
FeldrinH opened this issue Jun 19, 2024 · 2 comments

Comments

@FeldrinH
Copy link

FeldrinH commented Jun 19, 2024

It seems that in 2.0.0 there has been a regression in the fix to #14.

In 1.2.0 it was possible to avoid the issue by using different seeds for different FxHashMaps (by manually setting the seed or using a randomized seed). This no longer works in 2.0.0. Even if you use two FxHashMaps/FxHashSets with different seeds you still get extremely bad performance when values are inserted in the iteration order of another FxHashMap/FxHashSet.

For example, try this:

fn main() {
    let mut rnd = rand::thread_rng();

    let t0 = Instant::now();
    let mut h1 = HashSet::with_hasher(FxSeededState::with_seed(9943));
    for _ in 0..10_000_000 {
        h1.insert(rnd.gen::<u64>());
    }
    let t1 = Instant::now();
    println!("build: {}", (t1 - t0).as_secs_f64());

    let t0 = Instant::now();
    let mut h2 = HashSet::with_hasher(FxSeededState::with_seed(1829319203));
    for &n in h1.iter() {
        h2.insert(n);
    }
    let t1 = Instant::now();
    println!("rebuild: {}", (t1 - t0).as_secs_f64());

    println!("{}", h2.len());
}

On my computer this takes 0.3 seconds for the inital build, but 2.6 seconds for the rebuild with 2.0.0. With 1.2.0 this takes 0.3 seconds for both build and rebuild.

Note that FxHashSetRand and FxHashMapRand have the same problem, since no matter what seeds they generate you get more or less the same 8-9x slowdown.

@FeldrinH FeldrinH changed the title Slowdown when building an FxHashMap by inserting elements from another FxHashMap Slowdown when inserting elements in the iteration order of another FxHashMap Jun 19, 2024
@orlp
Copy link
Contributor

orlp commented Jun 19, 2024

The problem is that the new hash is completely linear with the seed, which still runs into the accidentally-quadratic probing issue.

JuniMay added a commit to JuniMay/orzcc that referenced this issue Jul 13, 2024
Because of the deterministic property of `FxHasher` in rustc-hash crate,
the performance will be extremely poor when there are iteration +
reinsertion operations.

A more thorough explanation can be found in the articles below:
-
https://accidentallyquadratic.tumblr.com/post/153545455987/rust-hash-iteration-reinsertion
- https://morestina.net/blog/1843/the-stable-hashmap-trap

And related issue in rustc-hash:

- rust-lang/rustc-hash#45
@FeldrinH
Copy link
Author

FeldrinH commented Aug 16, 2024

Is there any hope that this issue will be fixed in the near term? If not then I think a warning should be added to the readme and/or documentation. This issue makes the current hashing algorithm a potential source of extreme slowdown in any use case where hahsmaps are serialized and deserialized.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants