Remove Identity hash from table, add helper function and examples of alternatives #201

mriise · 2022-04-01T23:29:59Z

related to #196

closes #194

To ease the transition for downstream users, this adds a few examples of how to implement the identity hasher.

Still undecided is what to do with non-const-sized identity hashers. Some users may be OK with allocating in order to have larger identity hashes. This is a fairly remote edge case that could be covered, but it is debatable whether we should actively support it.
(update): We won't be supporting alloc-dependent hashes in this crate for now unless significant desire exists to add them.

~~Another thing introduced is the bytes crate which can do fun zero-copy things. This part was mostly for me to become more familiar with the crate.~~

vmx

Thanks for this nice set of examples.

Would you then suggest that the upgrade path for users of the identity feature is to use their own codec table?

examples/identity_hasher.rs

mriise · 2022-04-05T20:09:42Z

> Would you then suggest that the upgrade path for users of the identity feature is to use their own codec table?

Yes. A default can only cover one or maybe two use cases, so it is better for library users to decide how they want to handle it. A helper function might be useful, but again it depends on how they want to handle excess data, so it should be implemented by the user.

vmx

Please update the module's top-level documentation. It still mentions the identity hash. There should be some information on how to use the identity hash if you need it.

examples/identity_hasher.rs

vmx · 2022-08-11T09:00:54Z

@mxinden I'd like to get your approval before this one gets merged, as libp2p is one of the users of the identity hash.

mxinden · 2022-08-12T08:13:18Z

Thanks for the ping @vmx. I am not quite sure I understand the consequences of this change for rust-libp2p.

Do I understand correctly that we would need to special-case the identity hash parsing from a [u8] here?

https://github.com/libp2p/rust-libp2p/blob/a4110a2b6939d5b460f11df6dd158308f517becf/core/src/peer_id.rs#L71-L75

vmx · 2022-08-12T12:20:57Z

> Do I understand correctly that we would need to special-case the identity hash parsing from a [u8] here?

Yes that's correct. You'd also need to special case here:
https://github.com/libp2p/rust-libp2p/blob/a4110a2b6939d5b460f11df6dd158308f517becf/core/src/peer_id.rs#L60-L66

Alternatively you could implement your own Code enum, which only includes SHA2-256 and the identity hash (which is also what the new example from this PR does).
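
A minimal sketch of what such an application-specific table could look like. It is written in plain Rust rather than with this crate's derive macro so that it stands alone. `PeerCode`, `check_digest`, and `MAX_IDENTITY_LEN` are hypothetical names chosen for illustration. The values 0x00 and 0x12 are the multicodec codes for identity and SHA2-256, and the 64-byte cap is an assumption that mirrors the limit of the removed table entry.

```rust
use std::convert::TryFrom;

/// Multicodec code for the identity "hash".
pub const IDENTITY_CODE: u64 = 0x00;
/// Multicodec code for SHA2-256.
pub const SHA2_256_CODE: u64 = 0x12;
/// Assumed application limit for identity payloads (hypothetical).
pub const MAX_IDENTITY_LEN: usize = 64;

/// Hypothetical code table that only knows the two entries the application accepts.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum PeerCode {
    Identity,
    Sha2_256,
}

impl TryFrom<u64> for PeerCode {
    type Error = String;

    fn try_from(code: u64) -> Result<Self, Self::Error> {
        match code {
            IDENTITY_CODE => Ok(PeerCode::Identity),
            SHA2_256_CODE => Ok(PeerCode::Sha2_256),
            other => Err(format!("unsupported multihash code: {:#x}", other)),
        }
    }
}

/// Validates a (code, digest) pair parsed from untrusted bytes.
/// Oversized identity payloads are reported as errors instead of panicking.
pub fn check_digest(code: u64, digest: &[u8]) -> Result<PeerCode, String> {
    match PeerCode::try_from(code)? {
        PeerCode::Sha2_256 if digest.len() != 32 => Err(format!(
            "SHA2-256 digest must be 32 bytes, got {}",
            digest.len()
        )),
        PeerCode::Identity if digest.len() > MAX_IDENTITY_LEN => Err(format!(
            "identity payload of {} bytes exceeds {} bytes",
            digest.len(),
            MAX_IDENTITY_LEN
        )),
        code => Ok(code),
    }
}
```

Keeping the table this small means any other code is rejected at parse time, which is the special-casing described above made explicit in one place.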

mxinden · 2022-08-13T03:06:28Z

That sounds reasonable to me. Thanks @vmx.

BigLep · 2022-10-04T22:42:09Z

@mriise : are you able to incorporate comments here so this can be merged?

BigLep · 2022-11-22T23:35:25Z

@mriise : are you able to incorporate comments here, or should we close?

thomaseizinger

+1 for removing footguns.

thomaseizinger · 2022-12-20T01:34:14Z

src/multihash_impl.rs

-    // The following hashes are not cryptographically secure hashes and are not enabled by default
-    /// Identity hash (max. 64 bytes)
-    #[cfg(feature = "identity")]
-    #[mh(code = 0x00, hasher = crate::IdentityHasher::<64>)]
-    Identity,


Instead of directly removing it, may I suggest the use of #[deprecated] to make this a non-breaking change?

That's not possible, unfortunately. The footgun is that one might accidentally construct such a hasher, attempt to hash some data, then panic. This won't hit any "deprecated" warnings.

I am not sure I understand. If the user uses this code, adding #[deprecated] to it will flag it right?

We can point the user to anything in the deprecation note, including explaining the footgun.

To be clear, I am suggesting to add #[deprecated] to the enum variant.

The foot-gun is Code::try_from(number).digest(...): if the identity feature gets enabled (e.g., by some other crate in the dependency tree) and number happens to be 0, this will panic (assuming that the input buffer ... is too long).
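
The failure mode can be reproduced in isolation. The sketch below is a stand-in, not this crate's code: `digest_unchecked`, `digest_checked`, `DigestError`, and `CAPACITY` are hypothetical names, and the 64-byte capacity is taken from the removed table entry. It assumes only that the code number arrives at runtime from untrusted data and that the identity path copies into a fixed buffer.

```rust
/// Capacity of the stand-in identity buffer (the removed entry allowed 64 bytes).
pub const CAPACITY: usize = 64;

#[derive(Debug, PartialEq, Eq)]
pub enum DigestError {
    UnknownCode(u64),
    InputTooLong { len: usize, max: usize },
}

/// Copies `input` into a fixed buffer. Panics when `input` is longer than
/// `CAPACITY`, because the slice range runs past the end of `buf`.
fn identity_unchecked(input: &[u8]) -> Vec<u8> {
    let mut buf = [0u8; CAPACITY];
    buf[..input.len()].copy_from_slice(input);
    buf[..input.len()].to_vec()
}

/// The footgun: `code` is only known at runtime, and code 0 selects the
/// identity path, which panics on oversized input.
pub fn digest_unchecked(code: u64, input: &[u8]) -> Result<Vec<u8>, DigestError> {
    match code {
        0x00 => Ok(identity_unchecked(input)),
        other => Err(DigestError::UnknownCode(other)),
    }
}

/// A checked variant that reports oversized input as an error instead.
pub fn digest_checked(code: u64, input: &[u8]) -> Result<Vec<u8>, DigestError> {
    match code {
        0x00 if input.len() > CAPACITY => Err(DigestError::InputTooLong {
            len: input.len(),
            max: CAPACITY,
        }),
        0x00 => Ok(input.to_vec()),
        other => Err(DigestError::UnknownCode(other)),
    }
}
```

Nothing at the call site of `digest_unchecked` names the identity hasher, which is why a deprecation attribute on the variant alone would not be triggered by this path.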

Right, now I get you. Thanks for explaining!

We can still add code that will automatically trigger a deprecation warning once the identity feature gets enabled. I did similar stuff in rust-libp2p.

Can push a PoC in a bit.
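
One possible shape for such a feature-gated warning, shown as a sketch and not necessarily what the PoC does: attach `deprecated` through `cfg_attr` so the attribute only exists when the feature is on. The `Code` enum and `code_number` function here are simplified stand-ins, and the note text is made up for illustration.

```rust
/// Stand-in code table. When the `identity` Cargo feature is enabled, the
/// `cfg_attr` below turns into `#[deprecated(...)]`, so any downstream crate
/// that mentions `Code` gets a compiler warning carrying the note.
#[cfg_attr(
    feature = "identity",
    deprecated(
        note = "with the `identity` feature, hashing via a code parsed at runtime can panic on long input"
    )
)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Code {
    Sha2_256,
    /// Only present when the feature is enabled.
    #[cfg(feature = "identity")]
    Identity,
}

/// Internal uses opt out of the lint so the defining crate itself stays
/// warning-free regardless of the feature.
#[allow(deprecated)]
pub fn code_number(code: Code) -> u64 {
    match code {
        Code::Sha2_256 => 0x12,
        #[cfg(feature = "identity")]
        Code::Identity => 0x00,
    }
}
```

With the feature disabled the attribute disappears entirely, so users who never opt in see no warning.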

thomaseizinger · 2022-12-20T01:35:40Z

src/lib.rs

+/// Code reserved for Identity "hash"
+pub const IDENTITY_CODE: u64 = 0x0;


This doesn't necessarily need to be public, can we make it crate-private?

It's still useful imo, e.g. Multihash::wrap(multihash::IDENTITY_CODE, &bytes) or doing a match for it.

I see that. I am just wary about extending the public API with new items that are not essential. If we want to discourage the use of the identity hasher, it seems reasonable to have users declare that constant by themselves if they actually want to use it.

thomaseizinger · 2022-12-20T01:35:49Z

src/lib.rs

+pub fn identity_hash<const S: usize>(data: impl AsRef<[u8]>) -> Result<MultihashGeneric<S>> {
+    MultihashGeneric::wrap(IDENTITY_CODE, data.as_ref())
+}


Would it make sense to offer this as Multihash::identity instead?

thomaseizinger · 2022-12-20T01:38:01Z

examples/identity_hasher.rs

+    pub const IDENTITY_CODE: u64 = 0x0;
+    // blind copy of the input array
+    pub fn identity_hash_arr<const S: usize>(input: &[u8; S]) -> MultihashGeneric<S> {
+        MultihashGeneric::wrap(IDENTITY_CODE, input).unwrap()
+    }
+
+    // input is truncated to S size
+    pub fn identity_hash<const S: usize>(input: &[u8]) -> MultihashGeneric<S> {
+        let mut hasher = IdentityTrunk::<S>::default();
+        hasher.update(input);
+        MultihashGeneric::wrap(IDENTITY_CODE, hasher.finalize()).unwrap()
+    }


So which one is the recommended way of doing it? I am not sure it makes sense to add an example where the user is again left with two choices.

The problem we have here is that both are technically "valid": what to do with a buffer that is too small depends entirely on the requirements of the system you are writing for. Most of the time I would go with identity_hash_arr, since that forces users to fit data into a certain size before the library gets it.

These are examples anyway, so just using the function defined in lib.rs will be fine.
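
To make the trade-off concrete, here is a standalone sketch of the three policies discussed in this thread: exact-size input, erroring on oversized input, and truncation. It does not use this crate's types. `Digest`, `identity_exact`, `identity_checked`, and `identity_truncated` are hypothetical names, and `Digest` is only a stand-in for a fixed-capacity multihash digest.

```rust
/// Stand-in for a fixed-capacity digest: `S` bytes of storage, `len` bytes used.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Digest<const S: usize> {
    bytes: [u8; S],
    len: usize,
}

impl<const S: usize> Digest<S> {
    /// Returns only the bytes that were actually written.
    pub fn as_slice(&self) -> &[u8] {
        &self.bytes[..self.len]
    }
}

/// Policy 1: the caller must already hold exactly `S` bytes, so this cannot fail.
pub fn identity_exact<const S: usize>(input: &[u8; S]) -> Digest<S> {
    Digest { bytes: *input, len: S }
}

/// Policy 2: reject input longer than `S`; the error carries the offending length.
pub fn identity_checked<const S: usize>(input: &[u8]) -> Result<Digest<S>, usize> {
    if input.len() > S {
        return Err(input.len());
    }
    let mut bytes = [0u8; S];
    bytes[..input.len()].copy_from_slice(input);
    Ok(Digest { bytes, len: input.len() })
}

/// Policy 3: silently keep the first `S` bytes and drop the rest.
pub fn identity_truncated<const S: usize>(input: &[u8]) -> Digest<S> {
    let len = input.len().min(S);
    let mut bytes = [0u8; S];
    bytes[..len].copy_from_slice(&input[..len]);
    Digest { bytes, len }
}
```

The first variant pushes the size decision to the caller at compile time, the second surfaces it as a runtime error, and the third loses data without telling anyone, which is why it is the hardest of the three to recommend as a default.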

thomaseizinger · 2022-12-20T21:38:13Z

Cargo.toml

@@ -27,6 +27,7 @@ serde-codec = ["serde", "serde-big-array"]

 blake2b = ["blake2b_simd"]
 blake2s = ["blake2s_simd"]
+# identity feature is depricated


Suggested change

-# identity feature is depricated
+# identity feature is deprecated

thomaseizinger · 2022-12-20T21:38:27Z

src/lib.rs

 //!  - `sha1`: Enable SHA-1 hasher
 //!  - `sha2`: (default) Enable SHA-2 hashers
 //!  - `sha3`: (default) Enable SHA-3 hashers
 //!  - `strobe`: Enable Strobe hashers
+//!  - `identity`: A depricated feature for identity hashes.


Suggested change

-//! - `identity`: A depricated feature for identity hashes.
+//! - `identity`: A deprecated feature for identity hashes.

thomaseizinger · 2022-12-20T21:41:48Z

src/multihash_impl.rs

-    // The following hashes are not cryptographically secure hashes and are not enabled by default
-    /// Identity hash (max. 64 bytes)
-    #[cfg(feature = "identity")]
-    #[mh(code = 0x00, hasher = crate::IdentityHasher::<64>)]
-    Identity,


I am not sure I understand. If the user uses this code, adding #[deprecated] to it will flag it right?

We can point the user to anything in the deprecation note, including explaining the footgun.

thomaseizinger · 2022-12-20T21:45:11Z

src/lib.rs

+/// Will error if `data.len() > S`
+///
+/// See examples for a few other approaches if this doesn't fit your application
+pub fn identity_hash<const S: usize>(data: impl AsRef<[u8]>) -> Result<MultihashGeneric<S>> {
+    MultihashGeneric::wrap(IDENTITY_CODE, data.as_ref())
+}


It seems inconsistent to me to provide only this here when we are showing two ways to do it in the example.

Why have this at all? Seems trivial to implement so I am not sure what we are gaining here. At the same time, we are adding an item to our public API which in the future might cause a breaking change.

thomaseizinger · 2022-12-20T21:46:44Z

src/multihash_impl.rs

-    // The following hashes are not cryptographically secure hashes and are not enabled by default
-    /// Identity hash (max. 64 bytes)
-    #[cfg(feature = "identity")]
-    #[mh(code = 0x00, hasher = crate::IdentityHasher::<64>)]
-    Identity,


To be clear, I am suggesting to add #[deprecated] to the enum variant.

thomaseizinger

For the sake of making this a non-breaking change and working towards minimizing them in the future, can I ask that we:

- Not remove the Identity variant, but add #[deprecated] to it explicitly.
- Not add any new public API items. What we add here seems trivial to implement outside of the crate. It is my understanding that the IdentityHasher per se is not bad; users just need to be careful when using it. Having them write a bit more code seems like a good way of getting them to think about what they are doing there.
- Apply #260 (Issue deprecation warning on Code if identity feature is enabled) to warn users about the footgun in combination with the identity feature.

Thanks for considering :)

vmx · 2022-12-21T08:51:58Z

> Not add any new public API items.

I think that API comes from my request, when I said there needs to be a proper replacement for people that currently use it. I wanted a clean upgrade path. Though it can also be a well defined function in the (now existing ;) changelog, it doesn't have to be a public API (especially if we move the code table to a separate crate as it currently is discussed at #259).

thomaseizinger · 2023-01-09T06:25:31Z

> > Not add any new public API items.
>
> I think that API comes from my request, when I said there needs to be a proper replacement for people that currently use it. I wanted a clean upgrade path. Though it can also be a well defined function in the (now existing ;) changelog, it doesn't have to be a public API (especially if we move the code table to a separate crate as it currently is discussed at #259).

I think a recommendation on how to deal with the identity hash is a good idea. I have no strong opinion whether it is an example, rustdoc comment, test, changelog entry or something else :)

thomaseizinger · 2023-04-11T18:29:40Z

There is #289 now which is up-to-date with latest master.

vmx · 2023-05-31T09:05:22Z

As #289 was merged, I'll close this one.

mriise requested review from Stebalien and vmx April 1, 2022 23:30

add alternative identity examples

0221420

mriise force-pushed the ident-examples branch from 939a9b0 to 0221420 Compare April 1, 2022 23:32

vmx reviewed Apr 4, 2022

mriise and others added 4 commits April 5, 2022 13:40

remove non-applicable examples

1a3064e

Merge branch 'multiformats:master' into ident-examples

1c9de68

remove identity hash from table, add helper function for generating identity hashes

57c14f8

rustfmt

55dcbc1

mriise marked this pull request as ready for review August 11, 2022 06:42

mriise requested a review from vmx August 11, 2022 06:42

mriise changed the title ~~add alternative identity examples~~ Remove Identity hash from table, add helper function and examples of alternatives Aug 11, 2022

vmx requested changes Aug 11, 2022

thomaseizinger reviewed Dec 20, 2022

mriise added 2 commits December 20, 2022 09:45

use assert in example, mark identity feature as depricated

62232fb

fmt

fe16730

thomaseizinger reviewed Dec 20, 2022

thomaseizinger mentioned this pull request Dec 21, 2022

Issue deprecation warning on Code if identity feature is enabled #260

Closed

thomaseizinger reviewed Dec 21, 2022

thomaseizinger mentioned this pull request Apr 14, 2023

feat(codetable): remove identity hasher #289

Merged

vmx closed this May 31, 2023
