Skip to content

Commit

Permalink
Replace ahash with foldhash
Browse files Browse the repository at this point in the history
  • Loading branch information
mo8it committed Oct 14, 2024
1 parent baeeff3 commit a675cb5
Show file tree
Hide file tree
Showing 4 changed files with 35 additions and 68 deletions.
86 changes: 27 additions & 59 deletions Cargo.lock

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

6 changes: 3 additions & 3 deletions Cargo.toml
Original file line number Diff line number Diff line change
Expand Up @@ -46,10 +46,10 @@ include = [
]

[dependencies]
ahash = { version = "0.8.11", default-features = false }
anyhow = "1.0.89"
clap = { version = "4.5.18", features = ["derive"] }
clap = { version = "4.5.20", features = ["derive"] }
crossterm = { version = "0.28.1", default-features = false, features = ["windows", "events"] }
foldhash = "0.1.3"
notify = { version = "6.1.1", default-features = false, features = ["macos_fsevent"] }
os_pipe = "1.2.1"
rustlings-macros = { path = "rustlings-macros", version = "=6.3.0" }
Expand All @@ -61,7 +61,7 @@ toml_edit.workspace = true
rustix = { version = "0.38.37", default-features = false, features = ["std", "stdio", "termios"] }

[dev-dependencies]
tempfile = "3.12.0"
tempfile = "3.13.0"

[profile.release]
panic = "abort"
Expand Down
2 changes: 1 addition & 1 deletion clippy.toml
Original file line number Diff line number Diff line change
Expand Up @@ -5,7 +5,7 @@ disallowed-types = [
]

disallowed-methods = [
# We use `ahash` instead of the default hasher.
# We use `foldhash` instead of the default hasher.
"std::collections::HashSet::new",
"std::collections::HashSet::with_capacity",
# Inefficient. Use `.queue(…)` instead.
Expand Down
9 changes: 4 additions & 5 deletions src/collections.rs
Original file line number Diff line number Diff line change
@@ -1,10 +1,9 @@
use ahash::AHasher;
use std::hash::BuildHasherDefault;
use foldhash::fast::FixedState;

/// DOS attacks aren't a concern for Rustlings. Therefore, we use `ahash` with fixed seeds.
pub type HashSet<T> = std::collections::HashSet<T, BuildHasherDefault<AHasher>>;
/// DOS attacks aren't a concern for Rustlings. Therefore, we use `foldhash` with a fixed state.

This comment has been minimized.

Copy link
@Nahor

Nahor Oct 14, 2024

Contributor

Why even bother with a non-default hash? It seems to only be used to track exercises and paths, i.e. very few unless someone is trying to hack the system, and I/O should be the limiting factor anyway.

This comment has been minimized.

Copy link
@mo8it

mo8it Oct 16, 2024

Author Contributor

Why not if you know that you don't need the security overhead of the default hasher? It is only one tiny crate without any dependencies.

This comment has been minimized.

Copy link
@Nahor

Nahor Oct 16, 2024

Contributor

you know that you don't need the security overhead

But that overhead is insignificant in this case, we are talking about microseconds in the worst case scenario (thousands of exercises/hashes), completely dwarfed by the cost of requesting the file info from the OS (itself dwarfed by the cost of querying the HDD/SSD if the data is not already in the OS file cache). It's just a micro-optimization, while that dependency still needs to be downloaded and compiled.

It is only one tiny crate without any dependencies.

The crate might be tiny, but so is the performance gain 😜 (and arguably, the performance gain is actually tinier 😜😜). And the crate might not have a dependency, but is itself a dependency.

More importantly IMHO, dependencies add risks, see the recent "xz incident" or more generally "supply chain attacks". Rustlings may not as big a target as systemd/openssh but it could still be an opportunity target or just collateral.

If Rust was providing two hashes and you chose the fastest (i.e. not external dependency), I would be with you. Everything else being equal, one might as well go with "fast". But adding a third-party dependency for it, is IMHO, much more costly (build time, attack surface, maintenance) than the (unnoticeable) performance gain.
Personally, I believe that dependencies should only be added if they provide some functionality missing from the standard library, or if they have a practical beneficial impact. Neither of which is the case here.

This comment has been minimized.

Copy link
@mo8it

mo8it Oct 17, 2024

Author Contributor

Fine, I removed it in 7e2f56f :)

This comment has been minimized.

Copy link
@Nahor

Nahor Oct 17, 2024

Contributor

I was not forcing you, I was just stating my opinion 😊, but thanks 🥰

pub type HashSet<T> = std::collections::HashSet<T, FixedState>;

#[inline]
pub fn hash_set_with_capacity<T>(capacity: usize) -> HashSet<T> {
HashSet::with_capacity_and_hasher(capacity, BuildHasherDefault::<AHasher>::default())
HashSet::with_capacity_and_hasher(capacity, FixedState::default())
}

0 comments on commit a675cb5

Please sign in to comment.