Commit
- str now emits a delimiter of its own length
- str and [u8] hash the same
- Hasher::delimiter customizes how a delimiter is handled

Add method `fn delimit(&mut self, len: usize)` to Hasher. This method makes the hasher emit a delimiter for a chunk of length `len`. For example, str and slices both emit a delimiter for their length during hashing.

The Hasher impl decides how to implement the delimiter. By default it emits the whole `usize` as data to the hashing stream.

SipHash will ignore the first delimiter and hash the others as data. Since it hashes in the total length, hashing all but one of the delimiters is equivalent to hashing all lengths.

For the next example, take something like farmhash, which is not designed for streaming hashing. It could be implemented like this:

- Every call to Hasher::write runs the whole hashing algorithm. The previous hash is xored together with the new result.
- Delimiters are ignored, since the length of each chunk to write is already hashed in.

Here is a sketch of how SipHash and Farmhash could work with this change:

When hashing a: &[u8]

- SipHash: `write(a); finish();`
- Farmhash: `hash = write(a); hash`

Both SipHash and Farmhash will hash just the bytes of a string in a single Hasher::write and a single Hasher::finish.

When hashing (a: &[u8], b: &[u8]):

- SipHash: `write(a); write(b.len()); write(b); finish();`
- Farmhash: `hash = write(a); hash ^= write(b); hash`
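
The delimiter behavior described above can be illustrated with a minimal, self-contained sketch. It does not touch the real `std::hash::Hasher`. The trait `DelimitHasher`, the types `Recorder` and `SkipFirst`, and the helpers `hash_slice` and `hash_pair` are all hypothetical names introduced for illustration. Both types only record the bytes they are fed rather than computing a hash, and the SipHash-like type leaves out the total-length mixing that a real `finish` would do. The delimiter is assumed to be emitted before the chunk's bytes, matching the `write(a); write(b.len()); write(b)` sequence above.

```rust
/// Hypothetical stand-in for `Hasher`, extended with the proposed hook.
trait DelimitHasher {
    fn write(&mut self, bytes: &[u8]);

    /// Default behavior: feed the whole `usize` into the stream as data.
    fn delimit(&mut self, len: usize) {
        self.write(&len.to_ne_bytes());
    }
}

/// Records every byte it is fed; uses the default `delimit`.
#[derive(Default)]
struct Recorder {
    stream: Vec<u8>,
}

impl DelimitHasher for Recorder {
    fn write(&mut self, bytes: &[u8]) {
        self.stream.extend_from_slice(bytes);
    }
}

/// SipHash-like policy (assumption: a real impl would also mix the
/// total length in `finish`): skip the first delimiter, hash the rest.
#[derive(Default)]
struct SkipFirst {
    stream: Vec<u8>,
    seen_delimiter: bool,
}

impl DelimitHasher for SkipFirst {
    fn write(&mut self, bytes: &[u8]) {
        self.stream.extend_from_slice(bytes);
    }

    fn delimit(&mut self, len: usize) {
        if self.seen_delimiter {
            self.write(&len.to_ne_bytes());
        }
        self.seen_delimiter = true;
    }
}

/// How a byte slice (or str) would hash itself: delimiter, then bytes.
fn hash_slice<H: DelimitHasher>(h: &mut H, s: &[u8]) {
    h.delimit(s.len());
    h.write(s);
}

/// Hashing a pair `(a, b)` hashes each element in order.
fn hash_pair<H: DelimitHasher>(h: &mut H, a: &[u8], b: &[u8]) {
    hash_slice(h, a);
    hash_slice(h, b);
}
```

Feeding `(b"ab", b"c")` through `hash_pair` with a `SkipFirst` yields the bytes of `a`, then `b.len()`, then the bytes of `b`, while a `Recorder` additionally prefixes `a.len()`. A farmhash-like hasher would override `delimit` with an empty body.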