Vulnerability in VerifyNamespace: Partial Absence Proofs #110

staheri14 · 2023-02-21T18:01:48Z

Problem

Identified as part of EPIC celestiaorg/celestia-app#1296.
The current implementation of VerifyNamespace is vulnerable to an issue in which a queried node can act inconsistently with the protocol specifications for when a queried namespace ID does not have corresponding items in the tree (i.e., absence proof). The attack/issue involves providing a partial absence proof for a namespace ID, while remaining undetected by the verification implementation. A partial proof in this case refers to a proof of an ancestor of a leaf to the root, rather than a proof of inclusion of the leaf to the tree. This is because the implementation cannot distinguish between a full or partial proof.

Note: Please note that although VerifyNamespace is unable to detect partial proofs, the soundness of the namespace proof is not compromised. In other words, a malicious node is not able to convince someone of the absence of a namespace ID that actually exists in the tree. However, the inability to distinguish this inconsistent behavior may lead to incorrect conclusions being drawn at the application layer about the underlying data represented by NMT. The impact of this issue depends on the context in which NMT is being used and may or may not be considered an attack. Nevertheless, it is important to address this issue properly as it currently causes inconsistency and ambiguity in the verification process.
Update: I suspect that the current implementation may be intentional to help with the non-interactive default rules for inclusion proofs of Blobs. Even if this is the case, the specs needs to be updated to make it clear whether partial absence proofs are considered valid or not.

Root Cause:
This is because VerifyNamespace does not consider the height of the tree or the total number of leaves during verification. Furthermore, the absence proof is based solely on the leaf hash and not on its underlying data. As a result, any internal node (non-leaf node) can be used as a leaf hash in the proof.

Example:
Below is an example of such a partial proof that can be successfully verified by the VerifyNamespace algorithm.
Description of the example: A correct absence proof for nID=1 would consist of the following: proof.start=1, proof.end=2, proof.leafHash=leaf1, and proof.nodes=[leaf0, hash2].
However, it is possible to construct a partial proof that would also be considered correct by the verifyNamespace function.
This alternate proof would have proof.start=0, proof.end=1, proof.leafHash=hash1, and proof.nodes=[hash2].

nID := namespace.ID{1}
// create a tree with 4 leaves, variables names match the figure above
n := New(sha256.New(), NamespaceIDSize(1))
d0 := append([]byte{0}, []byte("data0")...)
d1 := append([]byte{2}, []byte("data1")...)
d2 := append([]byte{3}, []byte("data2")...)
d3 := append([]byte{4}, []byte("data3")...)

n.Push(d0)
n.Push(d1)
n.Push(d2)
n.Push(d3)

// this will populate n.leafHashes with the hash of d0...d3
n.computeLeafHashesIfNecessary()

root := n.Root()
hash1 := n.computeRoot(0, 2)
hash2 := n.computeRoot(2, 4)
leaf0 := n.leafHashes[0]
leaf1 := n.leafHashes[1]

// attack scenario: create a partial proof from hash1 to the root instead of leaf1 to the root
partialProof := Proof{start: 0, end: 1, leafHash: hash1, nodes: [][]byte{hash2}, isMaxNamespaceIDIgnored: true}
// run VerifyNamespace for the fabricated absence proof and see if it can pass
if partialProof.VerifyNamespace(sha256.New(), nID, nil, root) {
    fmt.Println("partial proof is successfully verified")  // this will be executed
} else {
    fmt.Println("verification of the partial proof failed")
}

// correct scenario: create a full proof from leaf1 to the root
fullProof := Proof{start: 1, end: 2, leafHash: leaf1, nodes: [][]byte{leaf0, hash2}}
if fullProof.VerifyNamespace(sha256.New(), nID, nil, root) {
    fmt.Println("full proof is successfully verified")  // this will be executed
} else {
    fmt.Println("verification of the full proof failed") 
}

Proposed Solutions

One of the following options will solve this issue:

If the current behaviour does not cause serious issue in the entire system, then we shall consider the current behavior of the VerifyNamespace valid and include it in the specification.
For an absence proof, we should return the underlying leaf data instead of the leaf hash so that the internal nodes wont be acceptable in place of leaf nodes. Please see this comment for the rationale of this solution.
To prevent partial proofs from being successfully verified, we can modify the implementation of VerifyNamespace by adding a new parameter to take the size of the tree, i.e., the number of data items it represents. This new parameter would be used to adjust the verification process to account for the tree size. By doing so, we can ensure that partial proofs do not comply with the height of the tree, since they will include fewer nodes than what is required for a full proof.

The text was updated successfully, but these errors were encountered:

staheri14 · 2023-02-21T18:02:39Z

cc: @liamsi @evan-forbes Please let me know your thoughts on this, thanks!

liamsi · 2023-02-28T13:03:34Z

If the current behaviour does not cause serious issue in the entire system, then we shall consider the current behavior of the VerifyNamespace valid and include it in the specification.

Hmm, I fail to see if this actually causes any problems. But need to think about it further. The example you provided basically states that a root is sufficient to prove absence of a namespace (and is basically several instances of this case). In this case this is a subtree root but the base-case we should decide is: is it sufficient to provide only the root to prove that a namespace is not included?
Edit: actually, no. It’s not that simple. hash1 does not prove that ns 1 is outside of the namespace range of its subtree (hash2 does though). So that case is different from the base case I was talking about. The question instead boils down to: does hash1 sufficiently prove that no leaf with ns 1 is included in its subtree?

For an absence proof, we should return the underlying leaf data instead of the leaf hash so that the internal nodes wont be acceptable in place of leaf nodes.

Why is that? Don't the internal node equivalently prove the absence? If anything, the proof should consider the height as you suggested but I am not sure that is even necessary. Note that leaf and inner nodes are domain separated (like in rfc 6962) s.t. hashing a leaf and an inner node yields a different result. Though it sounds like in this specific questions/case is orthogonal to that.

liamsi · 2023-02-28T14:48:34Z

In other words, a malicious node is not able to convince someone of the absence of a namespace ID that actually exists in the tree.

I think as long as this property really holds we are good either way.

rootulp · 2023-02-28T15:46:29Z

The question instead boils down to: does hash1 sufficiently prove that no leaf with ns 1 is included in its subtree?

No. hash1 is 0 (minNamespaceID) | 2 (maxNamespaceID) | hash so it is possible that a node with namespaceID=1 is present in the subtree under hash1. Additional data is needed in the proof (e.g. the leafHash under that subtree). Question with regards to the leafHash godoc:

        // leafHash contains a tree leaf that 1) its namespace ID is the largest
	// namespace ID less than nid and 2) the namespace ID of the leaf to the
	// left of it is smaller than the nid 3) the namespace ID of the  leaf to
	// the right of it is larger than nid.

Why is statement 2 necessary? If statement 1 is true, statement 2 must be true because the assumption that leaves of the NMT are sorted.

staheri14 · 2023-02-28T17:48:03Z

For an absence proof, we should return the underlying leaf data instead of the leaf hash so that the internal nodes wont be acceptable in place of leaf nodes.
Why is that? Don't the internal node equivalently prove the absence? If anything, the proof should consider the height as you suggested but I am not sure that is even necessary. Note that leaf and inner nodes are domain separated (like in rfc 6962) s.t. hashing a leaf and an inner node yields a different result. Though it sounds like in this specific questions/case is orthogonal to that.

@liamsi One reason that VerifyNamespace accepts a partial proof is that it does not verify whether the supplied leafHash belongs to a leaf or an intermediate node. Due to this, any intermediate node e.g., hash1 in the preceding example can be sent in place of a leafHash. However, if the underlying data of the leafHash is supplied, then we can perform the HashLeaf operations ourselves (using the LeafPrefix 0x00) and make sure the node corresponds to a leaf (hence the supplied proof is not partial rather a full proof). This allows us to detect whether the provided proof is partial or full.

staheri14 · 2023-02-28T21:22:54Z

Why is statement 2 necessary? If statement 1 is true, statement 2 must be true because the assumption that leaves of the NMT are sorted.

@rootulp Right, I recall that we discussed it for the NMT specification, the spec was updated, however, the code documentation was not updated accordingly at that time. Thank you for bringing this to my attention. Please see the following PR that addresses your concrn as well #114.

staheri14 · 2023-02-28T21:36:14Z

The question instead boils down to: does hash1 sufficiently prove that no leaf with ns 1 is included in its subtree?

@liamsi
The absence of nID=1 cannot be proven by hash1, as noted in #110 (comment).
If we choose to implement the first proposed solution, which is to accept partial absence proof as valid, we could send the highest ancestor (or one of the ancestors) of the leafHash instead of the leafHash itself. This ancestor should be the one closest to the root on the branch connecting the leafHash to the root, and its range of namespace IDs should not overlap with the queried nID.

I want to clarify that I am not promoting this solution, but rather explaining how it can be implemented correctly.

…proof (#114) ## Overview This PR intends to bring the description of the `leafHash` field in the `Proof `data structure in line with the specification. Additionally, among the three criteria enumerated for the `leafHash` the third criterion is redundant since it is already covered by the first two. Although this has already been addressed in the specification, the code description needs to be updated, and this PR accomplishes that (this comment was originally raised in #110 (comment)). ## Checklist - [x] New and updated code has appropriate documentation - [x] New and updated code has new and/or updated testing - [x] Required CI checks are passing - [x] Visual proof for any user facing features like CLI or documentation updates - [x] Linked issues closed with keywords

liamsi · 2023-03-02T13:33:57Z

The absence of nID=1 cannot be proven by hash1, as noted in #110 (comment).

Yeah, I see. We need to fix this. Great find and write-up. Looking forward to the outcome of #118

staheri14 · 2023-06-07T02:32:06Z

Considering the results established in the following notion page we can consider this issue as completed.
I also created a follow-up issue to decide on whether to include the partial absence proof description in the NMT specs.

staheri14 self-assigned this Feb 21, 2023

staheri14 mentioned this issue Feb 28, 2023

chore: fixes the description of the index of leafHash in the absence proof #114

Merged

5 tasks

staheri14 mentioned this issue Feb 28, 2023

Namespace proof soundness can fail if leafHash namespace range isn't checked against queried namespace ID #115

Closed

staheri14 added the enhancement New feature or request label Mar 1, 2023

staheri14 mentioned this issue Mar 1, 2023

Assessment of Soundness for the NMT Partial Namespace Absence Proof #118

Closed

2 tasks

staheri14 mentioned this issue Mar 9, 2023

EPIC: Specify data root construction using rsmt2d, nmt, and the nmt wrapper celestiaorg/celestia-app#1296

Closed

2 tasks

staheri14 mentioned this issue Apr 4, 2023

NMT mainnet readiness #173

Closed

staheri14 mentioned this issue May 2, 2023

Verification functions do not check the consistency between the NID of the supplied leaves and the queried NID #183

Closed

staheri14 closed this as completed Jun 7, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Vulnerability in VerifyNamespace: Partial Absence Proofs #110

Vulnerability in VerifyNamespace: Partial Absence Proofs #110

staheri14 commented Feb 21, 2023 •

edited

Loading

staheri14 commented Feb 21, 2023

liamsi commented Feb 28, 2023 •

edited

Loading

liamsi commented Feb 28, 2023

rootulp commented Feb 28, 2023

staheri14 commented Feb 28, 2023 •

edited

Loading

staheri14 commented Feb 28, 2023 •

edited

Loading

staheri14 commented Feb 28, 2023 •

edited

Loading

liamsi commented Mar 2, 2023

staheri14 commented Jun 7, 2023

Vulnerability in VerifyNamespace: Partial Absence Proofs #110

Vulnerability in VerifyNamespace: Partial Absence Proofs #110

Comments

staheri14 commented Feb 21, 2023 • edited Loading

Problem

Proposed Solutions

staheri14 commented Feb 21, 2023

liamsi commented Feb 28, 2023 • edited Loading

liamsi commented Feb 28, 2023

rootulp commented Feb 28, 2023

staheri14 commented Feb 28, 2023 • edited Loading

staheri14 commented Feb 28, 2023 • edited Loading

staheri14 commented Feb 28, 2023 • edited Loading

liamsi commented Mar 2, 2023

staheri14 commented Jun 7, 2023

staheri14 commented Feb 21, 2023 •

edited

Loading

liamsi commented Feb 28, 2023 •

edited

Loading

staheri14 commented Feb 28, 2023 •

edited

Loading

staheri14 commented Feb 28, 2023 •

edited

Loading

staheri14 commented Feb 28, 2023 •

edited

Loading