Correctness issue: Change to batched entry upload to support witnessing #1067
cc @asraa @priyawadhwa @znewman01 for more discussion
An alternative to making breaking changes, while still increasing the batch size, would be to keep the current design and increase the MMD to O(minutes). Rekor already polls Trillian until the upload is complete, so this would only require a configuration change. The downside is that connections would be kept open for minutes, and the SLO for upload would have to be significantly increased. I assume this is not viable because it would fill up the connection pool, but maybe there's some way around this? If anyone has knowledge of Go + networking, please chime in! Also worth noting the current state:
I'm not sure I agree that witnessing and gossip protocols are necessary to mitigate split-view attacks. To maintain a long-term split view, the server would need to track individual users across IP addresses over long periods of time in order to target them. A split-view attack is impractical if you assume that clients/witnesses are relatively anonymous. There's plenty more work that can be done here to make these attacks even harder, but this is still pretty effective today.
I still think it's a good idea to make this change. It has some advantages architecturally: as the Merkle tree grows, each append gets slower, and batching helps, so we may want to batch for reasons other than security. EDIT: plus the current behavior is weirdly inconsistent with CT.
@rbehjati has also brought up points in favor of batched updates, in light of the CAP theorem; ideally we can make this a configurable parameter of the server.
I have written a proposal for supporting witnessing in Rekor: https://docs.google.com/document/d/1NFSrkcIvrwvqV-2Ax2NoE0FdKh6ccnZufdqawAetKmY/edit?resourcekey=0-p8eFLQub4klCf3wrOAFSTg (shared with sigstore-dev@)

tl;dr: We don't need any breaking changes, and we don't need to switch to batching entries. Rekor will publish a checkpoint periodically that witnesses will sign. As an artifact verifier, I will use this witnessed checkpoint (and the log size from that checkpoint) when computing inclusion proofs. The consequence is that artifacts aren't immediately verifiable when added to the log, but this is no different than if we went with a batched approach. This approach also comes with privacy improvements (k-anonymity), since checkpoints no longer uniquely identify an entry. Proposed changes to Rekor are to:
There will be changes to Sigstore clients to support this stronger online verification, and to witnesses to use the new API.
Currently, a log witness countersigns the latest checkpoint Rekor publishes. Rekor updates its checkpoint on every entry upload, which is extremely frequent. This means that two witnesses are very unlikely to countersign the same checkpoint. While gossiping, it will not be possible to reach a consensus on the same checkpoint, and therefore we can't mitigate split-view attacks. This change publishes a checkpoint, a "stable checkpoint", every 5 minutes (configurable) to Redis. This runs as a goroutine, with a Redis key derived from the current time rounded to the nearest 5 minutes. We use set-if-not-exist for Redis, meaning you can run replicated instances of Rekor, with all instances writing to the same Redis key. For a client that wants to gossip, this means waiting 5 minutes before a checkpoint is published that witnesses will countersign (note that this is an area of active development and research too). The stable checkpoint can be accessed with a query parameter. Fixes sigstore#1067. There is still value in batching in terms of reliability, but stable checkpoints solve the gossiping issue without a breaking change. Signed-off-by: Hayden Blauzvern <hblauzvern@google.com>
* Publish stable checkpoint periodically to Redis
* Use latest key to access latest checkpoint
* Add test
* Return early if key already exists
* Add comment explaining failure handling
* Apply suggestions from code review
* Fix goroutine leak, check if redis client is configured
* Add test for goroutine leak

Signed-off-by: Hayden Blauzvern <hblauzvern@google.com>
Signed-off-by: Hayden B <hblauzvern@google.com>
Co-authored-by: Bob Callaway <bobcallaway@users.noreply.github.com>
Witnesses monitor the consistency of the log, verifying that the log is append-only and immutable. Roughly, the verification process for a witness is: fetch the log's latest checkpoint, verify its signature, verify a consistency proof between the previously witnessed checkpoint and the new one, and countersign the new checkpoint.
Witnesses also help mitigate split-view attacks, where the log presents different views to different clients. As a consumer, I can use witnesses' countersignatures to verify that all witnesses have seen the same checkpoint. For a client that wants to verify an entry in the log, verify the log is consistent, and confirm no split-view attack is occurring, the client needs to: verify an inclusion proof for the entry against a checkpoint, and verify countersignatures over that same checkpoint from a quorum of witnesses.
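The quorum check above can be sketched as a small function. This is a hedged illustration of the idea, not Rekor's or any witness network's actual API; the types and threshold are made up for the example.

```go
package main

import "fmt"

// countersignature records which witness signed which root hash.
// In practice this would carry a real signature to verify; here we assume
// signatures have already been checked and only compare root hashes.
type countersignature struct {
	Witness  string
	RootHash string
}

// quorumRoot returns the root hash countersigned by at least threshold
// distinct witnesses, or false if no hash reaches quorum. When every witness
// sees a different per-entry checkpoint (the current Rekor behavior), no
// root ever reaches quorum and the split-view check cannot succeed.
func quorumRoot(sigs []countersignature, threshold int) (string, bool) {
	byRoot := map[string]map[string]bool{}
	for _, s := range sigs {
		if byRoot[s.RootHash] == nil {
			byRoot[s.RootHash] = map[string]bool{}
		}
		byRoot[s.RootHash][s.Witness] = true
	}
	for root, witnesses := range byRoot {
		if len(witnesses) >= threshold {
			return root, true
		}
	}
	return "", false
}

func main() {
	sigs := []countersignature{
		{Witness: "w1", RootHash: "abc"},
		{Witness: "w2", RootHash: "abc"},
		{Witness: "w3", RootHash: "def"},
	}
	root, ok := quorumRoot(sigs, 2)
	fmt.Println(root, ok) // abc true
}
```

The map keyed by root hash makes the failure mode obvious: if checkpoints rotate on every entry, each witness lands in its own singleton bucket and no bucket ever reaches the threshold.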
This quorum mechanism requires that witnesses have seen the same root hash, which means the root hash cannot update frequently. With the current state of Rekor, the root hash updates with almost every entry, because we've set Trillian's MMD (maximum merge delay, the time it takes for Trillian to process an entry for inclusion) to 0. This means that witnesses are very, very unlikely to ever witness the same checkpoint.
The proposed fix would be to move towards a model like CT, where entries are batched and processed periodically. Whatever this period is, we can set witnesses to poll more frequently than this period, giving them a chance to witness the same checkpoint. For example, if we set MMD to 5 minutes, we can have witnesses poll every minute or two.
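A quick sanity check of the polling arithmetic above, with illustrative helper names: if the checkpoint period (MMD) is P and the witness poll interval is I, each stable checkpoint is observed roughly P/I times, so any I < P gives every witness at least one look at each checkpoint.

```go
package main

import "fmt"

// pollsPerCheckpoint returns how many times a witness polling every
// pollMinutes observes a checkpoint that is stable for periodMinutes
// (integer division; assumes pollMinutes divides evenly enough to round down).
func pollsPerCheckpoint(periodMinutes, pollMinutes int) int {
	return periodMinutes / pollMinutes
}

func main() {
	fmt.Println(pollsPerCheckpoint(5, 1)) // 5: polling every minute sees each 5-minute checkpoint five times
	fmt.Println(pollsPerCheckpoint(5, 2)) // 2: polling every two minutes still sees each checkpoint at least twice
}
```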
The impact of this change is: