-
-
Notifications
You must be signed in to change notification settings - Fork 2.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
std.crypto.hash.sha2: cleanup add add more docs #19744
Conversation
Replace `Sha512224`, `Sha512256`, and `Sha512T224` with `fn Sha512Truncated(digest_bits: comptime_int)`. This required refactoring `Sha2x64(comptime params)` to `Sha2x64(comptime iv: [8]u64, digest_bits: comptime_int)` for user-specified `digest_bits`. I left ziglang#19697 alone but added a compile-time check that digest_bits is divisible by 8. Remove docs which restate type name. Add module docs and reference where IVs come from.
looks good so long as the test pass, but the |
I've made them public and updated the OP. |
Do not do that. SHA512 truncated to 256 bits is widely used, especially with anything using NaCl, libsodium and derivatives. To be honest, I don't like these changes at all. They are misleading. SHA-512/256 is not SHA-512 truncated to 256 bits. It shouldn't be renamed to We probably shouldn't make the truncation a parameter either, and stick to what exists. IVs for NIST variants are only defined for a small set, so we can't invent them for arbitrary output sizes. |
Cleanups are great (check for possible performance changes, though), but if introducing Maybe we should also try to keep SHA-256 and SHA-512 similar.
Names such as |
I did not know that. That makes sense because the
Sadly it's a parameter in section 5.3.6 of FIPS 180 for arbitrary output sizes where Whether or not it's a commonly used parameter for sizes other than 224 and 256 is another question. While I can personally think of uses for other sizes, I'll defer to your experience.
I agree. Not only is it hard to read the pub const Sha224 = Sha2x32(iv224, 224);
pub const Sha256 = Sha2x32(iv256, 256);
pub const Sha384 = Sha2x64(iv384, 384);
pub const Sha512 = Sha2x64(iv512, 512);
/// This may be more performant than SHA224 and SHA256.
pub const IvStrategy = enum {
/// Change the IV per NIST FIPS 180 before truncation.
/// These are for the official SHA-512/t types published in 2011.
change,
/// Use the original IV and truncate the end.
/// Commonly used in implementations before 2011.
keep,
};
/// SHA-512 truncated to length `digest_bits`.
/// This may be more performant than SHA224 and SHA256.
pub fn Sha512Truncated(digest_bits: comptime_int, comptime iv_strategy: IvStrategy) type {
const iv = switch (iv_strategy) {
.change => sha512iv(digest_bits),
.keep => iv256,
};
return Sha2x64(iv, digest_bits);
} This will force truncated SHA2 users to carefully select their truncation strategy instead of flipping a coin and being frustrated when it doesn't match some other implementation.
You really know the history! Since that one is unambiguous it's easy to define: /// Truncated to 192 bytes per FIPS 800-208.
pub const Sha256_192 = Sha2x32(iv256, 192); |
I've never seen any actual protocol or application where other sizes are used. We should avoid allowing insecure parameters, or really uncommon parameters that will cause interoperability problems.
If someone really needs something weird such as a 480 bit output with SHA512, it's already possible without additional types and functions, by explicitly truncating the SHA512 result. Being explicit in such weird cases is not a bad thing. So, I'd suggest keeping the interface simple and only exposing types for constructions that are actually used: That's already a ton of choices. Adding details about IV strategy would be even more confusing. |
I can come up with some creative bit-packing use cases, but I defer to you.
Yes, by reimplementing truncation to the proper For either case, an author would probably find it easier to modify
The confusion exists either way between the If there was standardized notation I wouldn't argue further, but I've seen It's just a question of how to expose the complexity. I'd prefer a generic type and to trust the user with security but I understand why you'd prefer instantiated common variants and to not trust the user. I've amended this PR to do the latter. Please rereview. Benchmarks coming soon. |
Performance difference is within margin of error. Looks like rolling those 2 loops changed nothing but readability 🎉 . ~/src/poop(main)$ zig version
0.12.0-dev.3673+1fb238131
~/src/poop(main)$ zig build-exe ./sha.zig -O ReleaseFast -femit-bin=new --zig-lib-dir ../zig/lib
~/src/poop(main)$ zig build-exe ./sha.zig -O ReleaseFast -femit-bin=old
~/src/poop(main)$ ./zig-out/bin/poop ./old ./new
Benchmark 1 (323 runs): ./old
measurement mean ± σ min … max outliers delta
wall_time 15.4ms ± 317us 14.9ms … 17.2ms 9 ( 3%) 0%
peak_rss 0 ± 0 0 … 0 0 ( 0%) 0%
cpu_cycles 79.6M ± 828K 79.1M … 88.2M 39 (12%) 0%
instructions 340M ± 2.63 340M … 340M 34 (11%) 0%
cache_references 5.02K ± 25.8K 1.29K … 460K 34 (11%) 0%
cache_misses 428 ± 106 181 … 1.72K 39 (12%) 0%
branch_misses 310 ± 27.7 222 … 528 7 ( 2%) 0%
Benchmark 2 (324 runs): ./new
measurement mean ± σ min … max outliers delta
wall_time 15.4ms ± 282us 14.9ms … 16.6ms 13 ( 4%) + 0.2% ± 0.3%
peak_rss 0 ± 0 0 … 0 0 ( 0%) - nan% ± -nan%
cpu_cycles 79.6M ± 669K 79.2M … 84.2M 47 (15%) - 0.0% ± 0.1%
instructions 340M ± 2.66 340M … 340M 34 (10%) + 0.0% ± 0.0%
cache_references 4.20K ± 9.46K 1.28K … 143K 40 (12%) - 16.4% ± 59.7%
cache_misses 439 ± 101 258 … 1.07K 48 (15%) + 2.8% ± 3.7%
branch_misses 309 ± 26.4 214 … 476 16 ( 5%) - 0.3% ± 1.3% Edit: Here's sha.zig |
These inconsistent notations are annoying, but specifications would always refer to these as Go's standard library only exposes SHA224, SHA256, SHA384, SHA512, SHA512_224, SHA512_256 because that's enough to implement all standard protocols. JavaScript and Python even get away with SHA512_224 and SHA512_256. There's a gap between NIST documents, especially ancient ones, and what's used in the industry. If we wanted to properly support these documents, we should even support lengths with a bit granularity. Since then, NIST adopted SHAKE for hashing with variable-length output, so even in a pure NIST world, custom SHA2 truncation is going to stay a funny thing from an old specification. |
Great! |
I loved your doc comment so much I made it into a namespace. /// Truncating the output of SHA2-based hash functions improves security by mitigating...
pub const truncated = struct {
pub const Sha256T192 = Sha2x32(iv256, 256);
...
}; Happy to remove if you dislike it, I just think it should be featured in the docs. |
SHA-384 et al. are also truncated, so they should go there. But it may make them harder to find. Maybe we could just mention that in the sha2 namespace comment. |
Or mention in the documentation of SHA-256 and SHA-512 that unlike their truncated counterparts, they are vulnerable to length extension attacks. |
I moved the truncation comment to the module docs, dropped the overeager |
Some cleanup I made during changes in #19308 before #19697. This is no longer necessary for that branch, but it does allow arbitrary SHA512 digests and improves readability.