-
Notifications
You must be signed in to change notification settings - Fork 20
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add TAP introducing snapshot Merkle trees #125
Conversation
candidate-snapshot-merkle-tap.md
Outdated
internal nodes of a Merkle tree contain the hash of the leaf nodes. The exact | ||
algorithm for generating this Merkle tree (ie the order of leaves in the hash, | ||
how version information is encoded), is left to the implementer, but this | ||
algorithm should be documented in a POUF so that implementations can be |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
algorithm should be documented in a POUF so that implementations can be | |
algorithm should be documented in a [POUF](https://github.com/theupdateframework/taps/blob/master/tap11.md) so that implementations can be |
candidate-snapshot-merkle-tap.md
Outdated
longer valid. At this point, the repository may garbage collect all snapshot | ||
Merkle metadata files. | ||
|
||
# Security Analysis |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like the descriptions and analysis of these attacks, although I think it could be made clearer to non-experts by illustrating a short and concrete example. WDYT?
candidate-snapshot-merkle-tap.md
Outdated
forward attack recovery in a similar manner, by instructing clients to delete | ||
stored version information after a timestamp key replacement. | ||
|
||
## Mix and match attack |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, but I think it's important to clarify that although mix-and-match attacks by MitM attackers are not possible, they are possible to do by attackers who control timestamp and snapshot keys.
candidate-snapshot-merkle-tap.md
Outdated
|
||
# Abstract | ||
|
||
To optimize the snapshot metadata file size for large registries, registries |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
registry? is this the same as repository?
candidate-snapshot-merkle-tap.md
Outdated
* +TUF-Version: | ||
* +Post-History: | ||
|
||
# Abstract |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This chapter feels quite heavy on technical detail and light on the high level impact of this TAP
candidate-snapshot-merkle-tap.md
Outdated
could significantly impact the metadata overhead. For example, if a repository | ||
has 50,000,000 targets, the snapshot metadata will be about 380,000,000 bytes | ||
(https://docs.google.com/spreadsheets/d/18iwWnWvAAZ4In33EWJBgdAWVFE720B_z0eQlB4FpjNc/edit?ts=5ed7d6f4#gid=0). | ||
For this reason, it is necessary to create a more scalable solution for snapshot | ||
metadata that does not significantly impact the security properties of TUF. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The reasoning seems sound but the wording is misleading I think: the issue isn't the number of targets alone but the design decision of 1 target per metadata file? So it would be more accurate to say
For example, if a repository
has 10,000,000 target metadata files, the snapshot metadata will be about 380,000,000 bytes
or did I misunderstand? (it's entirely possible: I'm not able to understand most of the spreadsheet with the context I have)
candidate-snapshot-merkle-tap.md
Outdated
Merkle tree. As long as the client checks for an auditor’s verification, the | ||
client will not install the rolled-back version of the target. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this should be clear about how this proposal compares to current situation -- both when clients accept only 3rd party audited trees and when auditors are not used.
This actually applies to all security impact sections: Clear comparison to status quo would be great
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Completely agree, I would love to see a concluding paragraph for each of these security analysis that compares snapshot Merkle trees, both with and without auditors, to the current status quo.
Thanks for the reviews @trishankatdatadog and @jku! I pushed some new commits to address your comments, especially in the security analysis section. |
candidate-snapshot-merkle-tap.md
Outdated
This information will be included in the following metadata format: | ||
``` | ||
{ “leaf_contents”: {METAFILES}, | ||
“Merkle_path”: {INDEX:HASH} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: I think you meant merkle_path
.
candidate-snapshot-merkle-tap.md
Outdated
{ “leaf_contents”: {METAFILES}, | ||
“Merkle_path”: {INDEX:HASH} | ||
“path_directions”:{INDEX:DIR} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think is worthwhile to scope out how the hash algorithm is specified, in case we wanted to switch implementations over time, both for how a hash tree is computed (for example, I just found out about Verkle Trees), and the underlying hash algorithm itself.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The current design leaves the hash algorithm to the implementer (to be specified in a POUF). Although, it might be nice to include the algorithm in the timestamp metadata along with the root hash so that the algorithm could change over time without needing a major version release.
Switching from a Merkle tree to something like a Verkle tree might be trickier without a major version release as it changes the whole algorithm for computation rather than just the hash algorithm used.
candidate-snapshot-merkle-tap.md
Outdated
|
||
By replacing a single snapshot metadata file with individual snapshot Merkle | ||
metadata files, this TAP reduces the metadata overhead for repositories with | ||
large numbers of targets metadata files. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would be worth at least hinting at the security trade-offs and the need for an auditor here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good idea, I added a sentence below
candidate-snapshot-merkle-tap.md
Outdated
protect against mix-and-match attacks and rollback attacks. In order to provide | ||
these protections, snapshot metadata is responsible for keeping track of the | ||
version number of each targets metadata file, ensuring that all targets downloaded are | ||
from the same snapshot, and ensuring that no target file decreases its version |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
from the same snapshot, and ensuring that no target file decreases its version | |
from the same snapshot, and ensuring that no targets metadata file decreases its version |
candidate-snapshot-merkle-tap.md
Outdated
we develop must provide these same protections. | ||
|
||
A snapshot Merkle tree manages version information for each targets metadata file by including | ||
this information in each leaf node. By using a Merkle tree to store these nodes, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this information in each leaf node. By using a Merkle tree to store these nodes, | |
this information in a leaf node for each targets metadata file. By using a Merkle tree to store these nodes, |
candidate-snapshot-merkle-tap.md
Outdated
and ensure that the version information has not decreased for each leaf. | ||
Alternatively, the repository may provide auditors with information about the | ||
contents and ordering of leaf nodes so that the auditors can more efficiently | ||
verify the entire tree. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd appreciate some discussion here of what happens if an auditor starts watching a repository that has already gone through several Merkle tree generations. Is it a problem that the auditor hasn't validated all trees generated by the repository?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From context in the security analysis section below, I think an auditor is valid so long as it has the full Merkle tree since the last timestamp key rotation?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added a paragraph about fast forward attack recovery that describes which versions should be verified by new auditors
candidate-snapshot-merkle-tap.md
Outdated
TUF currently protects against rollback attacks by checking the current time | ||
signed by timestamp and ensuring that no version information provided by | ||
snapshot has decreased since the last update. With both of these protections, | ||
the client is secure against a rollback attack to any version released |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the client is secure against a rollback attack to any version released | |
a client that has a copy of trusted metadata is secure against a rollback attack to any version released |
somewhat implicit in the "before the previous update cycle" below, but possibly worth being explicit
candidate-snapshot-merkle-tap.md
Outdated
client verification and by third party auditors. If no keys are compromised, | ||
the timestamp keys protect against a rollback attack by ensuring a valid | ||
snapshot Merkle tree. If the timestamp key is compromised, the client | ||
verification of previous Merkle trees provides rollback protection for the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this only for a client with previously downloaded Merkle trees? Or are we suggesting that a client could (should?) download previous Merkle trees to ensure the targets metadata file they are downloading isn't susceptible to a rollback attack (I think so based on context below)? In which case, I'd like to see the TAP suggest when the download of previous snapshot Merkle trees might be performed and how.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added a description of the client previous tree verification to the Merkle tree verification
section.
Thanks for the review @joshuagl! I responded to your comments inline. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Apologies for taking so long to get back to this PR @mnm678, I've reviewed it this week with fresh eyes. I really like this idea, I think we would benefit from a few additional steps before we merge as a draft:
- minor edits of language for clarity, suggested in-line
- summarising each subsection of the security analysis with a concluding statement comparing snapshot Merkel trees, with and without an auditor, to the current status quo
- better quantifying how much less data clients will need to download under this model
- assign the TAP a number
Before the TAP comes out of draft, I think it would be good to also add:
- PoC implementation of an auditor
- PoC and recommendations for integrating an auditor into the client workflow
candidate-snapshot-merkle-tap.md
Outdated
metadata that does not significantly impact the security properties of TUF. | ||
|
||
We designed a new approach to snapshot that improves scalability while | ||
achieving similar security properties to the existing snapshot metadata |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we quantify this improved scalability a bit? It would help readers understand the motivation if we indicate how much snapshot metadata an average client will need to download on each update cycle compared to the 380,000,000 bytes above?
candidate-snapshot-merkle-tap.md
Outdated
To optimize the snapshot metadata file size for large repositories, repositories | ||
can use a snapshot Merkle tree to conceptually store version information about | ||
all images in a single snapshot without needing to distribute this entire | ||
snapshot to all clients. To do so, it puts version information for each targets | ||
metadata file into a leaf of the Merkle tree, and distributes snapshot Merkle | ||
files containing this leaf information and a path to the root of the Merkle tree. | ||
Clients can then compare this information with a Merkle root provided in the | ||
timestamp metadata. To prove that there has not been a reversion of the | ||
snapshot Merkle tree when downloading an image, the client and third-party | ||
auditors download the prior snapshot Merkle trees and check that the version | ||
numbers did not decrease at any point. | ||
|
||
By replacing a single snapshot metadata file with individual snapshot Merkle | ||
metadata files, this TAP reduces the metadata overhead for repositories with | ||
large numbers of targets metadata files. It maintains the security protections | ||
of the single snapshot metadata file in part through the use of third party | ||
auditors that check for rollback attacks anywhere in the Merkle tree. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From TAP 1 the purpose of the Abstract is to provide:
a short (~200 word) description of the technical issue being addressed
therefore, I tried to condense what you've written here into just a description of the issue being addressed:
To optimize the snapshot metadata file size for large repositories, repositories | |
can use a snapshot Merkle tree to conceptually store version information about | |
all images in a single snapshot without needing to distribute this entire | |
snapshot to all clients. To do so, it puts version information for each targets | |
metadata file into a leaf of the Merkle tree, and distributes snapshot Merkle | |
files containing this leaf information and a path to the root of the Merkle tree. | |
Clients can then compare this information with a Merkle root provided in the | |
timestamp metadata. To prove that there has not been a reversion of the | |
snapshot Merkle tree when downloading an image, the client and third-party | |
auditors download the prior snapshot Merkle trees and check that the version | |
numbers did not decrease at any point. | |
By replacing a single snapshot metadata file with individual snapshot Merkle | |
metadata files, this TAP reduces the metadata overhead for repositories with | |
large numbers of targets metadata files. It maintains the security protections | |
of the single snapshot metadata file in part through the use of third party | |
auditors that check for rollback attacks anywhere in the Merkle tree. | |
Snapshot metadata for large repositories, with a high number of targets | |
metadata files (through significant use of delegations), can become | |
prohibitively large. Due to the need to download the snapshot file on every | |
update cycle, very large snapshot metadata files can become a significant | |
overhead for TUF clients. | |
This TAP proposes a method for reducing the size of snapshot metadata a client | |
must download without significantly weakening the security properties of TUF. |
candidate-snapshot-merkle-tap.md
Outdated
Merkle tree. As long as the client checks for an auditor’s verification, the | ||
client will not install the rolled-back version of the target. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Completely agree, I would love to see a concluding paragraph for each of these security analysis that compares snapshot Merkle trees, both with and without auditors, to the current status quo.
candidate-snapshot-merkle-tap.md
Outdated
@@ -0,0 +1,293 @@ | |||
* TAP: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe this will become TAP 16, but let's be sure to assign the correct number (and rename the file) before we merge this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the review @joshuagl!
candidate-snapshot-merkle-tap.md
Outdated
sibling nodes needed to reconstruct the tree during verification (see diagram). | ||
In addition the path should contain direction information so that the client | ||
will know whether each node is a left or right sibling when reconstructing the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is a matter of perspective. Each node contains the hashes of it's child nodes, but to re-construct a node's parent, you need that node's sibling. I'll work on making that more clear.
candidate-snapshot-merkle-tap.md
Outdated
## Auditing Merkle trees | ||
|
||
In order to ensure the validity of all targets metadata version information in the | ||
Merkle tree, third-party auditors should validate the entire tree each time it |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, but the TUF spec is a bit inconsistent about capitalizing these, so I'm not sure whether to do so here.
@joshuagl This is ready for another review. I addressed everything except the auditor implementation in the PoC, which I will do before this moves out of the draft stage. |
I added a PoC auditor implementation. It doesn't yet include client/auditor interaction, but it demonstrates how an auditor could download and verify all snapshot merkle metadata. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for the continued iteration on this TAP Marina! I think this looks in a good shape to merge as a draft to the TAPs repository.
I appreciate you sharing a diff of the reference POUF. However, I don't think we should be merging that until the reference implementation is updated, could you remove that commit from this PR?
Aside: the updated POUF doesn't quite answer the questions I was expecting, either. From the TAP:
The exact algorithm for generating this Merkle tree (ie the order of nodes in the hash, how version information is encoded, etc.), is left to the implementer, but this
algorithm should be documented in a POUF
I currently need to read the PoC code and the POUF to understand how to generate the Merkle tree.
Following are some thoughts on considerations/work items once the PR is merged:
- A reader has to understand Merkle trees to implement this TAP, for example to understand how to compute the root hash from a leaf node. Is that a concern? Should we link to a good overview of Merkle trees? Should we specify how this is expected to work?
- are the metadata extensions for the Merkle tree truly abstract enough to support arbitrary (non-Merkle) tree algorithms? Should we PoC some other algorithms? perhaps in a different implementation?
- Further PoC development of auditor integrations?
- Recommendations on specifying the algorithms for a POUF
- The Specification section suggests the snapshot Merkle tree replaces the single snapshot metadata file – is there any reason we shouldn't generate both? If we generate a Merkle tree only, should integrations still have a snapshot role with associated key? Should we explore this a bit more?
- In places the TAP seems to suggest an auditor is required, whereas in others it indicates auditors are optional. Let's be sure to clarify.
0943761
to
fe1f4ac
Compare
Signed-off-by: marinamoore <mnm678@gmail.com>
The TUF spec uses the term repository to refer to the server that stores updates. Signed-off-by: marinamoore <mnm678@gmail.com>
Signed-off-by: marinamoore <mnm678@gmail.com>
Signed-off-by: marinamoore <mnm678@gmail.com>
Add attack descriptions and comparison to the existing TUF specification Signed-off-by: marinamoore <mnm678@gmail.com>
Signed-off-by: marinamoore <mnm678@gmail.com>
Add clarifications based on feedback on the pr. Signed-off-by: marinamoore <mnm678@gmail.com>
Co-authored-by: Joshua Lock <jlock@vmware.com>
This commit adds clarifications to: * The size of snapshot merkle metadata * Information included in the merkle metadata * Fast forward attack recovery Signed-off-by: Marina Moore <mnm678@gmail.com>
Signed-off-by: Marina Moore <mnm678@gmail.com>
Add a quick comparison to the existing specification to each section of the security analysis to make it clear what the differences are with and without auditors. Signed-off-by: Marina Moore <mnm678@gmail.com>
Credit to @joshuagl for helping with the text here. This commit makes the abstract shorter and more in line with the requirements of TAP 1. Signed-off-by: Marina Moore <mnm678@gmail.com>
Add a section that describes a few options for how clients can verify that a Merkle tree has been verified by an auditor. Signed-off-by: Marina Moore <mnm678@gmail.com>
542be51
to
b4f7f80
Compare
Thanks again for the review @joshuagl!
I moved the POUF changes to #133. I'll continue to iterate on this there.
I moved these to #134 for further discussion. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for the continued iteration on this TAP Marina. I think this is ready for merging as a draft.
This TAP introduces snapshot Merkle trees as a scalable solution to snapshot metadata for large repositories. This grew out of a discussion of metadata overheads associated with the Notary v2 effort.
There is also a draft of the reference implementation available at theupdateframework/python-tuf#1113.