Unify std and no_std APIs for IndexMap and IndexSet #207
by using `DefaultHashBuilder` for the `S: BuildHasher` parameter.
For more info about the motivation behind this changeset, here's an example. Assume we start with this original code, which is std-only:

let mut my_map = indexmap::IndexMap::new();

Before this PR

Downstream crates require tedious changes to support both std and no_std environments that depend on

At the crate root:

#[cfg(feature = "std")]
type IndexMap<K, V> = indexmap::IndexMap<K, V>;
#[cfg(not(feature = "std"))]
type IndexMap<K, V> = indexmap::IndexMap<K, V, hashbrown::hash_map::DefaultHashBuilder>;

And then, we must make these changes everywhere indexmap is used:

#[cfg(feature = "std")]
let mut my_map = indexmap::IndexMap::new();
#[cfg(not(feature = "std"))]
let mut my_map = crate::IndexMap::with_hasher(hashbrown::hash_map::DefaultHashBuilder::new());

Such example crates include

After this PR

No changes are required to the original code at all. 🎉 |
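The same asymmetry exists in the standard library, so it can be sketched without extra dependencies: `HashMap::new()` is only defined for the default `RandomState` hasher, and any other hasher has to go through `with_hasher` (or `Default`). In the sketch below, `FixedState` and both function names are hypothetical stand-ins and are not part of indexmap or hashbrown.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::BuildHasherDefault;

/// Hypothetical stand-in for a hasher chosen in a no_std build.
pub type FixedState = BuildHasherDefault<DefaultHasher>;

/// With the default hasher, the short constructor is available.
pub fn with_default_hasher() -> HashMap<&'static str, u32> {
    let mut my_map = HashMap::new();
    my_map.insert("a", 1);
    my_map
}

/// With any other hasher, every construction site must name it explicitly.
pub fn with_explicit_hasher() -> HashMap<&'static str, u32, FixedState> {
    let mut my_map = HashMap::with_hasher(FixedState::default());
    my_map.insert("a", 1);
    my_map
}
```

Both functions build the same logical map; only the construction call differs, which is the part downstream crates end up duplicating behind `cfg` attributes.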
At a high level, I don't want to lock an external crate dependency in our public API, especially pre-1.0. That is,

#[cfg(has_std)]
pub use std::collections::hash_map::RandomState as DefaultHashBuilder;
#[cfg(not(has_std))]
pub struct DefaultHashBuilder(ahash::RandomState);
// It could privately use any no_std choice internally with forwarding impls,
// or just write our own simple hasher to avoid any dependency.

pub struct IndexMap<K, V, S = DefaultHashBuilder> { ... }

I'm not sure how safe it is for us to toggle the But it's weird because we detect I'll think about it, and give @bluss time to weigh in too... |
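A minimal sketch of what such a wrapper with forwarding impls could look like. It wraps std's `RandomState` only so that the example builds without extra crates; the inner type, the derives, and the `hash_with` helper are assumptions made for illustration, not the crate's actual design.

```rust
use std::collections::hash_map::{DefaultHasher, RandomState};
use std::hash::{BuildHasher, Hash, Hasher};

/// Newtype that hides the concrete hasher from the public API.
/// The inner type is an implementation detail and could be swapped later.
#[derive(Clone, Default)]
pub struct DefaultHashBuilder(RandomState);

impl BuildHasher for DefaultHashBuilder {
    type Hasher = DefaultHasher;

    // Forward to the wrapped builder.
    fn build_hasher(&self) -> DefaultHasher {
        self.0.build_hasher()
    }
}

/// Hypothetical helper: hash one value with the wrapper.
pub fn hash_with<T: Hash>(builder: &DefaultHashBuilder, value: &T) -> u64 {
    let mut hasher = builder.build_hasher();
    value.hash(&mut hasher);
    hasher.finish()
}
```

Because callers only ever name `DefaultHashBuilder`, the wrapped type does not leak into downstream signatures.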
I see, thanks for the detailed response! I'm definitely keen to consider your concerns and make any requested changes; for ex, a newtype/wrapper struct is totally fine with me. The good thing about this PR's current changeset is that it's not a breaking change, as it only offers a default type parameter that can still be customized (as I'm sure you know).
I'm not sure what would happen if both no_std and std are concurrently enabled, as I can't think of a way to make that happen, but it seems to be beyond the scope of this PR. I'm not 100% sure, but I think that cargo might fail during the crate resolution phase (?). |
Sure, users can pick their own, but that doesn't give us free rein to change our default. Someone might write:

let map: IndexMap<String, String, RandomState> = IndexMap::new();

... then that's a strong commitment to the default Or a weirder case, probably less likely, is to depend on it by coherence:

impl<K, V> MyTrait for IndexMap<K, V> {} // = DefaultHashBuilder ?
impl<K, V> MyTrait for IndexMap<K, V, RandomState> {}
impl<K, V> MyTrait for IndexMap<K, V, ahash::RandomState> {}

... that's only allowed if they are all distinct That's where I'm trying to explore if there's a way that our next release, with or without the |
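The coherence concern can be reproduced with std's `HashMap`, which also has a defaulted hasher parameter. This is a sketch only: `MyTrait`, `hasher_name`, and `OtherState` are hypothetical names standing in for a downstream crate's code.

```rust
use std::collections::hash_map::{DefaultHasher, RandomState};
use std::collections::HashMap;
use std::hash::BuildHasherDefault;

/// Hypothetical downstream trait.
pub trait MyTrait {
    fn hasher_name(&self) -> &'static str;
}

/// Hypothetical second hasher type, distinct from `RandomState`.
pub type OtherState = BuildHasherDefault<DefaultHasher>;

impl<K, V> MyTrait for HashMap<K, V, RandomState> {
    fn hasher_name(&self) -> &'static str {
        "RandomState"
    }
}

impl<K, V> MyTrait for HashMap<K, V, OtherState> {
    fn hasher_name(&self) -> &'static str {
        "OtherState"
    }
}

// Adding a third impl for `HashMap<K, V>` would be rejected with E0119
// (conflicting implementations), because the default makes it the same
// type as `HashMap<K, V, RandomState>`.
```

If a library changed what its default `S` resolves to, a set of impls like this could start or stop overlapping in downstream code.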
Thanks for the reply again! I'd like to understand things in detail so I can help with any required changes, but I also don't want to waste your time. So feel free to address my below responses if you want, but no pressure.
Apologies for my ignorance here, but I don't see why this could be problematic. Such a statement would still work unchanged before & after this PR, right? Please correct me if I'm wrong. Regarding the coherence issue, that's an interesting scenario you bring up, and while I agree it's unlikely, I do of course respect the need to consider potential issues there. I think your suggestion of using a newtype wrapper should avoid the issue of conflicting trait implementations, since it would be a new struct type that previously wasn't available to implement traits for. Another option, admittedly not my preference, is to add a feature to the crate that would only set a default type for |
@cuviper can't it still be a problem if a crate is written using indexmap and no_std, and then it has its environment shifted by someone depending on that crate and std? Then it suddenly does not compile. The situation in this PR is generally disallowed/unworkable, but there might be some tricky argument for why it works. Great, more tricks; that didn't work so well for us with the std detection 😅 When I have time I want to release indexmap 2 to be able to solve this properly. |
@bluss thanks for the response!
I don't believe it's possible to construct a scenario where this change causes a compilation failure, but I'm not 100% sure. If you have an example, perhaps we can arrive at a solution that addresses that.
Are you saying the use case I presented when introducing this PR isn't allowable, or that the situation @cuviper brought up earlier isn't feasible?
Sure. However, it'd be really nice to have this change there now; it'd relieve folks of the burden from changing hundreds of LoC that use |
To try to explain, this PR breaks the rule that cuviper mentioned
Generally, it would be a breaking change to have different S based on features, because one crate may depend on us with different features than another crate does, each expecting different default S, but that all gets unified by cargo.
and that means that the solution here is generally disallowed/unworkable unless a good explanation is found for why it would be allowed. (Edit: to clarify, here I'm assuming the requested change for a wrapper type is done; that's the easy part, and it doesn't solve the crate feature question.) This follows from the usual Cargo rule that crate features have to be purely additive (and std is not, with this change). rust-lang/cargo#4328
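To make the unification argument concrete, here is a toy model, not Cargo's actual resolver: feature requests from all dependents are merged into one set, and the default hasher is then picked from that merged set. The function names and the returned labels are made up for illustration.

```rust
use std::collections::BTreeSet;

/// Toy model of feature unification: the union of every dependent's request.
pub fn unify<'a>(requests: &[&[&'a str]]) -> BTreeSet<&'a str> {
    requests.iter().flat_map(|r| r.iter().copied()).collect()
}

/// Toy model of a feature-dependent default `S`.
pub fn default_hasher(features: &BTreeSet<&str>) -> &'static str {
    if features.contains("std") {
        "std RandomState"
    } else {
        "no_std default"
    }
}
```

In this model, a crate that requested no features and relied on the no_std default sees its default change as soon as any other crate in the same build requests `std`, which is why such a feature is not additive.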
@@ -5,6 +5,8 @@ pub use crate::rayon::set as rayon;

#[cfg(has_std)]
use std::collections::hash_map::RandomState;
#[cfg(not(has_std))]
use hashbrown::hash_map::DefaultHashBuilder;
As @cuviper said, we can't expose hashbrown types in our public API. This change would make hashbrown a public dependency and it would not be possible for us to bump that version without changing our own major version. As cuviper said, we could just solve that with a wrapper.
Hashbrown is not becoming our public dependency, because it causes problems for our own versioning story.
Agreed, thanks for the clarification.
Thanks for the clarification! I'm more than happy to make the change to enclose hashbrown's |
Yes, but the other question is the harder problem to crack. |
Hmm, yeah that's a plausible scenario where even a working |
Agreed. Thanks for all the excellent feedback and clarifications regarding this std detection issue. In the interim, while waiting for a major version bump to v2.0.0, would my earlier idea of gating the |
That unfortunately does not change the situation in formal terms since either the std feature or your new feature is non-additive after that. |
Hmm, ok. I suppose this'll have to wait for v2.0 then (?). Let me know if there's any way I can help. |
Hey, thanks for exploring this anyway. You could have found something that we had missed (I have not been searching around in this space, and yes I had ignored anything about using features this way since I consider non-additive features something that we can't use). Non-additive features won't be possible in 2.0 either, of course, but maybe there is something that can be done when removing the build script solution and if we can admit some minor breaking change to new or similar (?) |
No problem, and thanks for your kind words. |
FYI, the master branch is now on |
@@ -38,8 +38,7 @@ rayon = { version = "1.2", optional = true }

[dependencies.hashbrown]
version = "0.11"
default-features = false
features = ["raw"]
features = ["ahash", "raw"]
ahash pulls in a non-trivial number of dependencies. On Linux it pulls in libc, version_check, cfg-if, once_cell and getrandom.
Good find, I hadn't noticed that since I primarily build for a custom `no_std` target. If you're aware of other hashing crates with fewer dependencies, I could experiment with using them as the default. I'll look for others myself too, in the meantime.
If considering multiple hashing crates, we could gate the inclusion of each one behind a feature; this would allow `std` and `no_std` users alike to select different default hashers.
We can only do that if we make the features mutually exclusive, as a hard build error. You could have one part of a large dependency tree that wants `indexmap` one way, and another part wanting the other way, but Cargo unifies dependencies into one build so we can't satisfy both.
But that's only about default `S` and the few methods that assume that. Most of the API is perfectly fine with bring-your-own-hasher `S`, so each part of that dependency tree can fill their own needs.
Yeah, I had used that approach of pseudo-mutually-exclusive features for another project via the `compile_error!()` macro, but it's admittedly tedious and feels hacky.
I do hear what you're saying about multiple downstream crates using `indexmap` differently, and unfortunately there's no way around that.
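For reference, the `compile_error!()` approach mentioned here usually looks something like the sketch below. The feature names `hasher-a` and `hasher-b` are hypothetical (indexmap does not define them), and `selected_default` exists only to make the selection observable.

```rust
// Hypothetical feature names; enabling both turns the build into a hard error.
#[cfg(all(feature = "hasher-a", feature = "hasher-b"))]
compile_error!("features `hasher-a` and `hasher-b` are mutually exclusive");

/// Reports which hypothetical default the current build selected.
pub fn selected_default() -> &'static str {
    if cfg!(feature = "hasher-a") {
        "hasher-a"
    } else if cfg!(feature = "hasher-b") {
        "hasher-b"
    } else {
        "none (bring your own hasher)"
    }
}
```

The guard only reports the conflict; it does not resolve it, so two dependents that want different defaults still cannot share one build.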
Thanks for the heads up! I'll take a look at the changes and see if we can make this PR work for everyone. |
I think we'll still run into the fundamental issue, that a feature ("std") should not change the meaning of a type like |
In hashbrown, depending on whether the

/// Default hasher for HashMap.
#[cfg(feature = "ahash")]
pub type DefaultHashBuilder = ahash::RandomState;
/// Dummy default hasher for `HashMap`.
#[cfg(not(feature = "ahash"))]
pub enum DefaultHashBuilder {}

Would we consider something similar here?

#[cfg(feature = "std")]
type DefaultHashBuilder = std::collections::hash_map::RandomState;
#[cfg(not(feature = "std"))]
#[cfg(feature = "ahash")]
type DefaultHashBuilder = ahash::RandomState;
#[cfg(not(feature = "std"))]
#[cfg(not(feature = "ahash"))]
pub enum DefaultHashBuilder {}

This also means that to enable |
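A small sketch of how an uninhabited placeholder behaves as a default type parameter. `DummyHashBuilder`, `SketchSet`, and `build_with_explicit_hasher` are hypothetical names; the point is only that the placeholder implements neither `BuildHasher` nor `Default`, so the defaulted type can be named but not constructed through the hasher-bound methods.

```rust
use std::collections::hash_map::RandomState;
use std::hash::BuildHasher;

/// Uninhabited placeholder: it has no values and implements neither
/// `BuildHasher` nor `Default`.
pub enum DummyHashBuilder {}

/// Hypothetical container whose default hasher is the placeholder.
pub struct SketchSet<T, S = DummyHashBuilder> {
    items: Vec<T>,
    hash_builder: S,
}

impl<T, S: BuildHasher> SketchSet<T, S> {
    /// Only callable with a real hasher, never with the placeholder.
    pub fn with_hasher(hash_builder: S) -> Self {
        SketchSet { items: Vec::new(), hash_builder }
    }

    pub fn hasher(&self) -> &S {
        &self.hash_builder
    }

    pub fn len(&self) -> usize {
        self.items.len()
    }
}

/// Callers must supply a hasher explicitly.
pub fn build_with_explicit_hasher() -> SketchSet<u32, RandomState> {
    SketchSet::with_hasher(RandomState::new())
}
```

With this shape, code that omits `S` fails to compile at the construction site instead of silently picking a different hasher.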
Relatedly, would we consider a change from:

#[cfg(feature = "std")]
impl<K, V> IndexMap<K, V> {
    /// Create a new map. (Does not allocate.)
    #[inline]
    pub fn new() -> Self {
        ...
    }
    ...
}

To the following, like the

impl<K, V, S: BuildHasher + Default> IndexMap<K, V, S> {
    /// Create a new map. (Does not allocate.)
    #[inline]
    pub fn new() -> Self {
        ...
    }
    ...
}

This would mean that the

type IndexMap<K, V> = indexmap::IndexMap<K, V, hashbrown::hash_map::DefaultHashBuilder>; |
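A sketch of the generic-`new()` shape on a stand-in type, to show one consequence worth weighing: Rust does not fall back to a struct's default type parameter during inference, so a generic `new()` needs `S` to be pinned by an annotation or an explicit type argument. `SketchMap` and `annotated` are hypothetical names, and `RandomState` is used as the default only to keep the example std-only.

```rust
use std::collections::hash_map::RandomState;
use std::hash::BuildHasher;

/// Hypothetical stand-in for a map with a defaulted hasher parameter.
pub struct SketchMap<K, V, S = RandomState> {
    entries: Vec<(K, V)>,
    hash_builder: S,
}

impl<K, V, S: BuildHasher + Default> SketchMap<K, V, S> {
    /// Create a new map for any hasher that implements `Default`.
    pub fn new() -> Self {
        SketchMap { entries: Vec::new(), hash_builder: S::default() }
    }

    pub fn hasher(&self) -> &S {
        &self.hash_builder
    }

    pub fn len(&self) -> usize {
        self.entries.len()
    }
}

pub fn annotated() -> usize {
    // The annotation supplies `S` through the struct's default parameter.
    let map: SketchMap<u32, u32> = SketchMap::new();
    map.len()
}

// A bare `let map = SketchMap::new();` with nothing else fixing `S`
// is rejected with "type annotations needed".
```

A type alias that fixes `S`, like the one quoted above, is one way to provide that annotation once instead of at every call site.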
Note that Choosing different hashers based on features is still problematic, since features are unified by cargo. Two crates in the build tree may ask for Removing the default
|
Understood, thanks for the explanation and the links, really appreciate your reply and the article link. To summarise my understanding:
|
Note that |
Thanks for the update, appreciate it! |
Both `IndexMap` and `IndexSet` restrict access to their `new()` and `with_capacity()` methods behind the `has_std` feature gate.

Turns out, this isn't actually required because we can use `DefaultHashBuilder` from the `hashbrown` crate for the default value of the `S: BuildHasher` type parameter.

This unifies the API for downstream users of this crate that also support both std and no_std environments, which significantly improves ergonomics and simplifies code.
I have also modified the documentation to reflect these changes.
Note: I believe we could also remove the `#[cfg(has_std)]` restriction on the two macros `indexmap!()` and `indexset!()`, but I didn't include those changes here yet. I can add them for consistency's sake, if you'd like.