Add TAP introducing snapshot Merkle trees #125

mnm678 · 2020-09-17T19:36:42Z

This TAP introduces snapshot Merkle trees as a scalable solution to snapshot metadata for large repositories. This grew out of a discussion of metadata overheads associated with the Notary v2 effort.

There is also a draft of the reference implementation available at theupdateframework/python-tuf#1113.

mnm678 · 2020-09-17T19:37:33Z

cc @JustinCappos @joshuagl

trishankatdatadog · 2020-09-18T03:11:12Z

candidate-snapshot-merkle-tap.md

+internal nodes of a Merkle tree contain the hash of the leaf nodes. The exact
+algorithm for generating this Merkle tree (ie the order of leaves in the hash,
+how version information is encoded), is left to the implementer, but this
+algorithm should be documented in a POUF so that implementations can be


Suggested change

algorithm should be documented in a POUF so that implementations can be

algorithm should be documented in a [POUF](https://github.com/theupdateframework/taps/blob/master/tap11.md) so that implementations can be

trishankatdatadog · 2020-09-18T03:12:47Z

candidate-snapshot-merkle-tap.md

+longer valid. At this point, the repository may garbage collect all snapshot
+Merkle metadata files.
+
+# Security Analysis


I like the descriptions and analysis of these attacks, although I think it could be made clearer to non-experts by illustrating a short and concrete example. WDYT?

trishankatdatadog · 2020-09-18T03:13:56Z

candidate-snapshot-merkle-tap.md

+forward attack recovery in a similar manner, by instructing clients to delete
+stored version information after a timestamp key replacement.
+
+## Mix and match attack


Yes, but I think it's important to clarify that although mix-and-match attacks by MitM attackers are not possible, they are possible to do by attackers who control timestamp and snapshot keys.

jku · 2020-09-18T07:08:12Z

candidate-snapshot-merkle-tap.md

+
+ # Abstract
+
+ To optimize the snapshot metadata file size for large registries, registries


registry? is this the same as repository?

jku · 2020-09-18T07:13:31Z

candidate-snapshot-merkle-tap.md

+* +TUF-Version:
+* +Post-History:
+
+ # Abstract


This chapter feels quite heavy on technical detail and light on the high level impact of this TAP

jku · 2020-09-18T07:31:06Z

candidate-snapshot-merkle-tap.md

+could significantly impact the metadata overhead. For example, if a repository
+has 50,000,000 targets, the snapshot metadata will be about 380,000,000 bytes
+(https://docs.google.com/spreadsheets/d/18iwWnWvAAZ4In33EWJBgdAWVFE720B_z0eQlB4FpjNc/edit?ts=5ed7d6f4#gid=0).
+For this reason, it is necessary to create a more scalable solution for snapshot
+metadata that does not significantly impact the security properties of TUF.


The reasoning seems sound but the wording is misleading I think: the issue isn't the number of targets alone but the design decision of 1 target per metadata file? So it would be more accurate to say

For example, if a repository
has 10,000,000 target metadata files, the snapshot metadata will be about 380,000,000 bytes

or did I misunderstand? (it's entirely possible: I'm not able to understand most of the spreadsheet with the context I have)

jku · 2020-09-18T07:45:15Z

candidate-snapshot-merkle-tap.md

+Merkle tree. As long as the client checks for an auditor’s verification, the
+client will not install the rolled-back version of the target.


I think this should be clear about how this proposal compares to current situation -- both when clients accept only 3rd party audited trees and when auditors are not used.

This actually applies to all security impact sections: Clear comparison to status quo would be great

Completely agree, I would love to see a concluding paragraph for each of these security analysis that compares snapshot Merkle trees, both with and without auditors, to the current status quo.

mnm678 · 2020-09-18T20:09:45Z

Thanks for the reviews @trishankatdatadog and @jku! I pushed some new commits to address your comments, especially in the security analysis section.

erickt · 2020-09-19T01:05:16Z

candidate-snapshot-merkle-tap.md

+This information will be included in the following metadata format:
+```
+{ “leaf_contents”: {METAFILES},
+  “Merkle_path”: {INDEX:HASH}


nit: I think you meant merkle_path.

erickt · 2020-09-19T15:25:34Z

candidate-snapshot-merkle-tap.md

+{ “leaf_contents”: {METAFILES},
+  “Merkle_path”: {INDEX:HASH}
+  “path_directions”:{INDEX:DIR}
+}


I think is worthwhile to scope out how the hash algorithm is specified, in case we wanted to switch implementations over time, both for how a hash tree is computed (for example, I just found out about Verkle Trees), and the underlying hash algorithm itself.

The current design leaves the hash algorithm to the implementer (to be specified in a POUF). Although, it might be nice to include the algorithm in the timestamp metadata along with the root hash so that the algorithm could change over time without needing a major version release.

Switching from a Merkle tree to something like a Verkle tree might be trickier without a major version release as it changes the whole algorithm for computation rather than just the hash algorithm used.

joshuagl · 2020-09-23T09:43:39Z

candidate-snapshot-merkle-tap.md

+
+By replacing a single snapshot metadata file with individual snapshot Merkle
+metadata files, this TAP reduces the metadata overhead for repositories with
+large numbers of targets metadata files.


I think it would be worth at least hinting at the security trade-offs and the need for an auditor here?

Good idea, I added a sentence below

joshuagl · 2020-09-23T09:48:09Z

candidate-snapshot-merkle-tap.md

+protect against mix-and-match attacks and rollback attacks. In order to provide
+these protections, snapshot metadata is responsible for keeping track of the
+version number of each targets metadata file, ensuring that all targets downloaded are
+from the same snapshot, and ensuring that no target file decreases its version


Suggested change

from the same snapshot, and ensuring that no target file decreases its version

from the same snapshot, and ensuring that no targets metadata file decreases its version

joshuagl · 2020-09-23T09:49:20Z

candidate-snapshot-merkle-tap.md

+we develop must provide these same protections.
+
+A snapshot Merkle tree manages version information for each targets metadata file by including
+this information in each leaf node. By using a Merkle tree to store these nodes,


Suggested change

this information in each leaf node. By using a Merkle tree to store these nodes,

this information in a leaf node for each targets metadata file. By using a Merkle tree to store these nodes,

candidate-snapshot-merkle-tap.md

joshuagl · 2020-09-23T10:12:10Z

candidate-snapshot-merkle-tap.md

+and ensure that the version information has not decreased for each leaf.
+Alternatively, the repository may provide auditors with information about the
+contents and ordering of leaf nodes so that the auditors can more efficiently
+verify the entire tree.


I'd appreciate some discussion here of what happens if an auditor starts watching a repository that has already gone through several Merkle tree generations. Is it a problem that the auditor hasn't validated all trees generated by the repository?

From context in the security analysis section below, I think an auditor is valid so long as it has the full Merkle tree since the last timestamp key rotation?

I added a paragraph about fast forward attack recovery that describes which versions should be verified by new auditors

joshuagl · 2020-09-23T10:18:49Z

candidate-snapshot-merkle-tap.md

+TUF currently protects against rollback attacks by checking the current time
+signed by timestamp and ensuring that no version information provided by
+snapshot has decreased since the last update. With both of these protections,
+the client is secure against a rollback attack to any version released


Suggested change

the client is secure against a rollback attack to any version released

a client that has a copy of trusted metadata is secure against a rollback attack to any version released

somewhat implicit in the "before the previous update cycle" below, but possibly worth being explicit

joshuagl · 2020-09-23T10:22:13Z

candidate-snapshot-merkle-tap.md

+client verification and by third party auditors. If no keys are compromised,
+the timestamp keys protect against a rollback attack by ensuring a valid
+snapshot Merkle tree. If the timestamp key is compromised, the client
+verification of previous Merkle trees provides rollback protection for the


Is this only for a client with previously downloaded Merkle trees? Or are we suggesting that a client could (should?) download previous Merkle trees to ensure the targets metadata file they are downloading isn't susceptible to a rollback attack (I think so based on context below)? In which case, I'd like to see the TAP suggest when the download of previous snapshot Merkle trees might be performed and how.

I added a description of the client previous tree verification to the Merkle tree verification section.

mnm678 · 2020-09-23T17:42:34Z

Thanks for the review @joshuagl! I responded to your comments inline.

joshuagl

Apologies for taking so long to get back to this PR @mnm678, I've reviewed it this week with fresh eyes. I really like this idea, I think we would benefit from a few additional steps before we merge as a draft:

minor edits of language for clarity, suggested in-line
summarising each subsection of the security analysis with a concluding statement comparing snapshot Merkel trees, with and without an auditor, to the current status quo
better quantifying how much less data clients will need to download under this model
assign the TAP a number

Before the TAP comes out of draft, I think it would be good to also add:

PoC implementation of an auditor
PoC and recommendations for integrating an auditor into the client workflow

candidate-snapshot-merkle-tap.md

joshuagl · 2021-01-20T21:38:05Z

candidate-snapshot-merkle-tap.md

+metadata that does not significantly impact the security properties of TUF.
+
+We designed a new approach to snapshot that improves scalability while
+achieving similar security properties to the existing snapshot metadata


Can we quantify this improved scalability a bit? It would help readers understand the motivation if we indicate how much snapshot metadata an average client will need to download on each update cycle compared to the 380,000,000 bytes above?

joshuagl · 2021-01-20T21:47:45Z

candidate-snapshot-merkle-tap.md

+To optimize the snapshot metadata file size for large repositories, repositories
+can use a snapshot Merkle tree to conceptually store version information about
+all images in a single snapshot without needing to distribute this entire
+snapshot to all clients. To do so, it puts version information for each targets
+metadata file into a leaf of the Merkle tree, and distributes snapshot Merkle
+files containing this leaf information and a path to the root of the Merkle tree.
+Clients can then compare this information with a Merkle root provided in the
+timestamp metadata. To prove that there has not been a reversion of the
+snapshot Merkle tree when downloading an image, the client and third-party
+auditors download the prior snapshot Merkle trees and check that the version
+numbers did not decrease at any point.
+
+By replacing a single snapshot metadata file with individual snapshot Merkle
+metadata files, this TAP reduces the metadata overhead for repositories with
+large numbers of targets metadata files. It maintains the security protections
+of the single snapshot metadata file in part through the use of third party
+auditors that check for rollback attacks anywhere in the Merkle tree.


From TAP 1 the purpose of the Abstract is to provide:

a short (~200 word) description of the technical issue being addressed

therefore, I tried to condense what you've written here into just a description of the issue being addressed:

Suggested change

To optimize the snapshot metadata file size for large repositories, repositories

can use a snapshot Merkle tree to conceptually store version information about

all images in a single snapshot without needing to distribute this entire

snapshot to all clients. To do so, it puts version information for each targets

metadata file into a leaf of the Merkle tree, and distributes snapshot Merkle

files containing this leaf information and a path to the root of the Merkle tree.

Clients can then compare this information with a Merkle root provided in the

timestamp metadata. To prove that there has not been a reversion of the

snapshot Merkle tree when downloading an image, the client and third-party

auditors download the prior snapshot Merkle trees and check that the version

numbers did not decrease at any point.

By replacing a single snapshot metadata file with individual snapshot Merkle

metadata files, this TAP reduces the metadata overhead for repositories with

large numbers of targets metadata files. It maintains the security protections

of the single snapshot metadata file in part through the use of third party

auditors that check for rollback attacks anywhere in the Merkle tree.

Snapshot metadata for large repositories, with a high number of targets

metadata files (through significant use of delegations), can become

prohibitively large. Due to the need to download the snapshot file on every

update cycle, very large snapshot metadata files can become a significant

overhead for TUF clients.

This TAP proposes a method for reducing the size of snapshot metadata a client

must download without significantly weakening the security properties of TUF.

joshuagl · 2021-01-20T22:06:05Z

candidate-snapshot-merkle-tap.md

+Merkle tree. As long as the client checks for an auditor’s verification, the
+client will not install the rolled-back version of the target.


Completely agree, I would love to see a concluding paragraph for each of these security analysis that compares snapshot Merkle trees, both with and without auditors, to the current status quo.

joshuagl · 2021-01-20T22:08:49Z

candidate-snapshot-merkle-tap.md

@@ -0,0 +1,293 @@
+* TAP:


I believe this will become TAP 16, but let's be sure to assign the correct number (and rename the file) before we merge this PR.

mnm678

Thanks for the review @joshuagl!

mnm678 · 2021-01-22T17:51:14Z

candidate-snapshot-merkle-tap.md

+sibling nodes needed to reconstruct the tree during verification (see diagram).
+In addition the path should contain direction information so that the client
+will know whether each node is a left or right sibling when reconstructing the


I think this is a matter of perspective. Each node contains the hashes of it's child nodes, but to re-construct a node's parent, you need that node's sibling. I'll work on making that more clear.

mnm678 · 2021-01-22T17:54:41Z

candidate-snapshot-merkle-tap.md

+## Auditing Merkle trees
+
+In order to ensure the validity of all targets metadata version information in the
+Merkle tree, third-party auditors should validate the entire tree each time it


Yes, but the TUF spec is a bit inconsistent about capitalizing these, so I'm not sure whether to do so here.

mnm678 · 2021-01-26T21:49:53Z

@joshuagl This is ready for another review. I addressed everything except the auditor implementation in the PoC, which I will do before this moves out of the draft stage.

mnm678 · 2021-02-01T21:17:53Z

I added a PoC auditor implementation. It doesn't yet include client/auditor interaction, but it demonstrates how an auditor could download and verify all snapshot merkle metadata.

joshuagl

Thank you for the continued iteration on this TAP Marina! I think this looks in a good shape to merge as a draft to the TAPs repository.

I appreciate you sharing a diff of the reference POUF. However, I don't think we should be merging that until the reference implementation is updated, could you remove that commit from this PR?
Aside: the updated POUF doesn't quite answer the questions I was expecting, either. From the TAP:

The exact algorithm for generating this Merkle tree (ie the order of nodes in the hash, how version information is encoded, etc.), is left to the implementer, but this
algorithm should be documented in a POUF

I currently need to read the PoC code and the POUF to understand how to generate the Merkle tree.

Following are some thoughts on considerations/work items once the PR is merged:

A reader has to understand Merkle trees to implement this TAP, for example to understand how to compute the root hash from a leaf node. Is that a concern? Should we link to a good overview of Merkle trees? Should we specify how this is expected to work?
- are the metadata extensions for the Merkle tree truly abstract enough to support arbitrary (non-Merkle) tree algorithms? Should we PoC some other algorithms? perhaps in a different implementation?
Further PoC development of auditor integrations?
Recommendations on specifying the algorithms for a POUF
The Specification section suggests the snapshot Merkle tree replaces the single snapshot metadata file – is there any reason we shouldn't generate both? If we generate a Merkle tree only, should integrations still have a snapshot role with associated key? Should we explore this a bit more?
In places the TAP seems to suggest an auditor is required, whereas in others it indicates auditors are optional. Let's be sure to clarify.

tap16.md

Signed-off-by: marinamoore <mnm678@gmail.com>

The TUF spec uses the term repository to refer to the server that stores updates. Signed-off-by: marinamoore <mnm678@gmail.com>

Signed-off-by: marinamoore <mnm678@gmail.com>

Add attack descriptions and comparison to the existing TUF specification Signed-off-by: marinamoore <mnm678@gmail.com>

Signed-off-by: marinamoore <mnm678@gmail.com>

Add clarifications based on feedback on the pr. Signed-off-by: marinamoore <mnm678@gmail.com>

Co-authored-by: Joshua Lock <jlock@vmware.com>

This commit adds clarifications to: * The size of snapshot merkle metadata * Information included in the merkle metadata * Fast forward attack recovery Signed-off-by: Marina Moore <mnm678@gmail.com>

Signed-off-by: Marina Moore <mnm678@gmail.com>

Add a quick comparison to the existing specification to each section of the security analysis to make it clear what the differences are with and without auditors. Signed-off-by: Marina Moore <mnm678@gmail.com>

@joshuagl

Credit to @joshuagl for helping with the text here. This commit makes the abstract shorter and more in line with the requirements of TAP 1. Signed-off-by: Marina Moore <mnm678@gmail.com>

Add a section that describes a few options for how clients can verify that a Merkle tree has been verified by an auditor. Signed-off-by: Marina Moore <mnm678@gmail.com>

mnm678 · 2021-02-03T18:38:52Z

Thanks again for the review @joshuagl!

I appreciate you sharing a diff of the reference POUF. However, I don't think we should be merging that until the reference implementation is updated, could you remove that commit from this PR?
Aside: the updated POUF doesn't quite answer the questions I was expecting, either. From the TAP:

The exact algorithm for generating this Merkle tree (ie the order of nodes in the hash, how version information is encoded, etc.), is left to the implementer, but this
algorithm should be documented in a POUF

I currently need to read the PoC code and the POUF to understand how to generate the Merkle tree.

I moved the POUF changes to #133. I'll continue to iterate on this there.

Following are some thoughts on considerations/work items once the PR is merged:

I moved these to #134 for further discussion.

joshuagl

Thank you for the continued iteration on this TAP Marina. I think this is ready for merging as a draft.

mnm678 mentioned this pull request Sep 17, 2020

PoC(tap): Snapshot merkle tree theupdateframework/python-tuf#1113

Closed

trishankatdatadog reviewed Sep 18, 2020

View reviewed changes

jku reviewed Sep 18, 2020

View reviewed changes

erickt reviewed Sep 19, 2020

View reviewed changes

joshuagl reviewed Sep 23, 2020

View reviewed changes

joshuagl reviewed Jan 22, 2021

View reviewed changes

mnm678 commented Jan 22, 2021

View reviewed changes

joshuagl reviewed Feb 3, 2021

View reviewed changes

tap16.md Show resolved Hide resolved

mnm678 force-pushed the snapshot-merkle branch from 0943761 to fe1f4ac Compare February 3, 2021 18:10

This was referenced Feb 3, 2021

Snapshot merkle POUF #133

Draft

Discussion of TAP 16: Snapshot Merkle trees #134

Open

mnm678 and others added 11 commits February 3, 2021 10:36

Add initial draft of tap introducing snapshot merkle trees

624bcba

Signed-off-by: marinamoore <mnm678@gmail.com>

Use repository instead of registry

99a3a79

The TUF spec uses the term repository to refer to the server that stores updates. Signed-off-by: marinamoore <mnm678@gmail.com>

Add link to tap 11

5d579ca

Signed-off-by: marinamoore <mnm678@gmail.com>

Snapshot holds targets metadata information, not targets information

1d22fba

Signed-off-by: marinamoore <mnm678@gmail.com>

Update security analysis for snapshot merkle TAP

e930054

Add attack descriptions and comparison to the existing TUF specification Signed-off-by: marinamoore <mnm678@gmail.com>

Simplify the abstract for snapshot merkle tap

db0c229

Signed-off-by: marinamoore <mnm678@gmail.com>

Clarify client verification and auditing

5655479

Add clarifications based on feedback on the pr. Signed-off-by: marinamoore <mnm678@gmail.com>

Apply edits for clarity from code review

f7ef3ab

Co-authored-by: Joshua Lock <jlock@vmware.com>

Add clarifications based on review

82891d5

This commit adds clarifications to: * The size of snapshot merkle metadata * Information included in the merkle metadata * Fast forward attack recovery Signed-off-by: Marina Moore <mnm678@gmail.com>

Assign TAP number to Snapshot Merke Tree TAP

83f989f

Signed-off-by: Marina Moore <mnm678@gmail.com>

Add summary to security analysis

2319f9c

Add a quick comparison to the existing specification to each section of the security analysis to make it clear what the differences are with and without auditors. Signed-off-by: Marina Moore <mnm678@gmail.com>

mnm678 added 2 commits February 3, 2021 10:37

Condense the abstract

32d214b

Credit to @joshuagl for helping with the text here. This commit makes the abstract shorter and more in line with the requirements of TAP 1. Signed-off-by: Marina Moore <mnm678@gmail.com>

Add section for client interaction with auditors

b4f7f80

Add a section that describes a few options for how clients can verify that a Merkle tree has been verified by an auditor. Signed-off-by: Marina Moore <mnm678@gmail.com>

mnm678 force-pushed the snapshot-merkle branch from 542be51 to b4f7f80 Compare February 3, 2021 18:37

joshuagl approved these changes Feb 15, 2021

View reviewed changes

mnm678 merged commit 49dcf1d into theupdateframework:master Mar 1, 2021

mnm678 deleted the snapshot-merkle branch March 1, 2023 15:14

	algorithm should be documented in a POUF so that implementations can be
	algorithm should be documented in a [POUF](https://github.com/theupdateframework/taps/blob/master/tap11.md) so that implementations can be


		# Abstract

		To optimize the snapshot metadata file size for large registries, registries

		Merkle tree. As long as the client checks for an auditor’s verification, the
		client will not install the rolled-back version of the target.

	from the same snapshot, and ensuring that no target file decreases its version
	from the same snapshot, and ensuring that no targets metadata file decreases its version

	this information in each leaf node. By using a Merkle tree to store these nodes,
	this information in a leaf node for each targets metadata file. By using a Merkle tree to store these nodes,

	the client is secure against a rollback attack to any version released
	a client that has a copy of trusted metadata is secure against a rollback attack to any version released

Add TAP introducing snapshot Merkle trees #125

Add TAP introducing snapshot Merkle trees #125

Conversation

mnm678 commented Sep 17, 2020

mnm678 commented Sep 17, 2020

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

jku Sep 18, 2020 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

mnm678 commented Sep 18, 2020

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

mnm678 commented Sep 23, 2020

joshuagl left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

mnm678 left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

mnm678 commented Jan 26, 2021

mnm678 commented Feb 1, 2021

joshuagl left a comment

Choose a reason for hiding this comment

mnm678 commented Feb 3, 2021

joshuagl left a comment

Choose a reason for hiding this comment

jku Sep 18, 2020 •

edited

Loading