Added padding to Poseidon duplex #26

rozbb · 2021-08-24T06:30:40Z

Description

The Poseidon duplex construction in this crate was trivially insecure. I wrote a regression test, test_collision, that demonstrates two absorb inputs which produce the same squeeze output. The problem was that there was no padding applied to any absorbed messages.

As mentioned in the duplex paper a "sponge-compliant" padding scheme (Definition 1) is crucial to the security claims of the sponge and duplex constructions. So I went ahead and adapted the multi-rate padding function (Section 7) to a finite field setting. The only caveat is that this doesn't work when F = Z/2Z. I assume that's not an issue though. If it is, I can make another version that ends up being even less efficient, requiring all Poseidon rates to be at least 3 in order to function.

Two things that still need to be done (and see Zcash's impl for a reference):

Don't chunk the contents of absorb and feed it all in. As the paper specifies,absorb's input len should be limited to the maximum that can fit in a rate-sized block, after padding. The current auto-chunking trick makes it so that absorb(a || b) and absorb(a), absorb(b) yield the same state. This isn't good. Instead, the duplex needs to limit absorbs and force the user to figure out message packet system that disambiguates the things that they care about. STROBE disambiguates by prepending every new operation with a special message.
Don't allow squeeze to output more than rate bytes. This is also specified by the paper. I'm actually less confident about this suggestion because it seems fine to simply permute() when bytes run out. This is because, by the padding principle, the input will never collide with an actual message. Concretely, this mechanism still satisfies the injectivity claim proven by Lemma 4 in the paper. Also, if this functionality is removed, I'm not certain how one would get more bytes out of a stream.

Finally, I commented out test_poseidon_sponge_consistency because I couldn't figure out how to re-derive the new expected values for the output. That's an easy fix if you know which traits give you a base-10 representation of your field elements (which I do not).

Before we can merge this PR, please make sure that all the following items have been
checked off. If any of the checklist items are not applicable, please leave them but
write a little note why.

Targeted PR against correct branch (master)
Linked to Github issue with discussion and accepted design OR have an explanation in the PR that describes this work.
Wrote unit tests
Updated relevant documentation in the code
Added a relevant changelog entry to the Pending section in CHANGELOG.md
Re-reviewed Files changed in the Github PR explorer

weikengchen · 2021-08-24T15:10:44Z

We will need to examine the changes in this PR further and likely consider adding a Poseidon duplex wrapper that is less "raw" than what arkworks-rs currently has.

I agree that:

The original sponge is too raw, and if used inappropriately, it is not a secure duplex sponge. E.g., as you mention, H(a) = H(a || 0) and absorb(a || b) = absorb(a), absorb(b). These are not random.
With careful wrapping (mostly padding some length-related information), the properties above can be removed, and it can be used more securely.

The issue is this:

The new sponge does add a lot of overhead.
For a few upstream applications currently using Poseidon, such as Merkle tree and Marlin AHP, the original sponge works fine (since we provide fixed-length messages). But I agree this is risky as well. I am thinking about how large the performance impact it would cause, but should be manageable.

The second point you mentioned, regarding the limitation of the output, may be not necessary. Based on the paper, it seems that it limits the amount of bytes (in our case, field elements) that can be outputted "each round". But the sponge implementation in arkworks-rs permutes the state before outputting more bytes (so, the sponge enters the next round), so it should be fine---but padding is necessary to show that this round is different from other rounds, in that it has zero input.

rozbb · 2021-08-25T17:40:15Z

These are fair points. Some thoughts:

It's entirely unclear to me how to make any iteration of this hard to misuse for an end user. Forcing small message sizes is one thing, but this risks the user breaking up larger messages into smaller ones without adding any metadata (this is a collision). Or you could not pad anything until the last input block, in which case you get the same problem but at least you save the overhead you're talking about.
One immediate thing that could be done re padding is to have a separate padding function that the user defines. This is what the zcash impl does. In the constant-length case, the padding function would be a no-op (I think? check me on that)

rozbb · 2021-08-25T18:06:13Z

Also, st4d pointed me to a specification draft for this construction: C2SP/C2SP#3

hdevalence · 2021-10-04T19:16:21Z

We will need to examine the changes in this PR further and likely consider adding a Poseidon duplex wrapper that is less "raw" than what arkworks-rs currently has.

I would actually suggest that the problem with the current API is that it is both "too raw" and "not raw enough". On the one hand, it's not yet safe for general use, but on the other hand, it doesn't expose the underlying permutation in a way that would allow implementing different constructions (fixed-length hashing, duplex variants, etc) on top.

I think a path forward for this would be to split the current code into two levels, as suggested here: arkworks-rs/crypto-primitives#93

Then, the crate can provide a general-purpose, safe(ish) duplex API like this one, and allow users who want to optimize performance for fixed messages to use the underlying permutation directly.

rozbb added 4 commits August 23, 2021 23:37

Wrote collision test; it fails :O

b42c0a6

Poseidon now uses multirate padding

520227a

Poseidon constraints now use padding

1e1d64b

Updated changelog

cfb2f2b

Pratyush requested review from weikengchen and Pratyush August 24, 2021 07:56

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Added padding to Poseidon duplex #26

Added padding to Poseidon duplex #26

rozbb commented Aug 24, 2021

weikengchen commented Aug 24, 2021 •

edited

Loading

rozbb commented Aug 25, 2021

rozbb commented Aug 25, 2021

hdevalence commented Oct 4, 2021

Added padding to Poseidon duplex #26

Are you sure you want to change the base?

Added padding to Poseidon duplex #26

Conversation

rozbb commented Aug 24, 2021

Description

weikengchen commented Aug 24, 2021 • edited Loading

rozbb commented Aug 25, 2021

rozbb commented Aug 25, 2021

hdevalence commented Oct 4, 2021

weikengchen commented Aug 24, 2021 •

edited

Loading