Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Vulnerability in VerifyNamespace: Partial Absence Proofs #110

Closed
staheri14 opened this issue Feb 21, 2023 · 9 comments
Closed

Vulnerability in VerifyNamespace: Partial Absence Proofs #110

staheri14 opened this issue Feb 21, 2023 · 9 comments
Assignees
Labels
enhancement New feature or request

Comments

@staheri14
Copy link
Contributor

staheri14 commented Feb 21, 2023

Problem

Identified as part of EPIC celestiaorg/celestia-app#1296.
The current implementation of VerifyNamespace is vulnerable to an issue in which a queried node can act inconsistently with the protocol specifications for when a queried namespace ID does not have corresponding items in the tree (i.e., absence proof). The attack/issue involves providing a partial absence proof for a namespace ID, while remaining undetected by the verification implementation. A partial proof in this case refers to a proof of an ancestor of a leaf to the root, rather than a proof of inclusion of the leaf to the tree. This is because the implementation cannot distinguish between a full or partial proof.

Note: Please note that although VerifyNamespace is unable to detect partial proofs, the soundness of the namespace proof is not compromised. In other words, a malicious node is not able to convince someone of the absence of a namespace ID that actually exists in the tree. However, the inability to distinguish this inconsistent behavior may lead to incorrect conclusions being drawn at the application layer about the underlying data represented by NMT. The impact of this issue depends on the context in which NMT is being used and may or may not be considered an attack. Nevertheless, it is important to address this issue properly as it currently causes inconsistency and ambiguity in the verification process.
Update: I suspect that the current implementation may be intentional to help with the non-interactive default rules for inclusion proofs of Blobs. Even if this is the case, the specs needs to be updated to make it clear whether partial absence proofs are considered valid or not.

Root Cause:
This is because VerifyNamespace does not consider the height of the tree or the total number of leaves during verification. Furthermore, the absence proof is based solely on the leaf hash and not on its underlying data. As a result, any internal node (non-leaf node) can be used as a leaf hash in the proof.

Example:
Below is an example of such a partial proof that can be successfully verified by the VerifyNamespace algorithm.
Description of the example: A correct absence proof for nID=1 would consist of the following: proof.start=1, proof.end=2, proof.leafHash=leaf1, and proof.nodes=[leaf0, hash2].
However, it is possible to construct a partial proof that would also be considered correct by the verifyNamespace function.
This alternate proof would have proof.start=0, proof.end=1, proof.leafHash=hash1, and proof.nodes=[hash2].
Screenshot 2023-02-28 at 4 24 38 PM

nID := namespace.ID{1}
// create a tree with 4 leaves, variables names match the figure above
n := New(sha256.New(), NamespaceIDSize(1))
d0 := append([]byte{0}, []byte("data0")...)
d1 := append([]byte{2}, []byte("data1")...)
d2 := append([]byte{3}, []byte("data2")...)
d3 := append([]byte{4}, []byte("data3")...)

n.Push(d0)
n.Push(d1)
n.Push(d2)
n.Push(d3)

// this will populate n.leafHashes with the hash of d0...d3
n.computeLeafHashesIfNecessary()

root := n.Root()
hash1 := n.computeRoot(0, 2)
hash2 := n.computeRoot(2, 4)
leaf0 := n.leafHashes[0]
leaf1 := n.leafHashes[1]

// attack scenario: create a partial proof from hash1 to the root instead of leaf1 to the root
partialProof := Proof{start: 0, end: 1, leafHash: hash1, nodes: [][]byte{hash2}, isMaxNamespaceIDIgnored: true}
// run VerifyNamespace for the fabricated absence proof and see if it can pass
if partialProof.VerifyNamespace(sha256.New(), nID, nil, root) {
    fmt.Println("partial proof is successfully verified")  // this will be executed
} else {
    fmt.Println("verification of the partial proof failed")
}

// correct scenario: create a full proof from leaf1 to the root
fullProof := Proof{start: 1, end: 2, leafHash: leaf1, nodes: [][]byte{leaf0, hash2}}
if fullProof.VerifyNamespace(sha256.New(), nID, nil, root) {
    fmt.Println("full proof is successfully verified")  // this will be executed
} else {
    fmt.Println("verification of the full proof failed") 
}

Proposed Solutions

One of the following options will solve this issue:

  • If the current behaviour does not cause serious issue in the entire system, then we shall consider the current behavior of the VerifyNamespace valid and include it in the specification.
  • For an absence proof, we should return the underlying leaf data instead of the leaf hash so that the internal nodes wont be acceptable in place of leaf nodes. Please see this comment for the rationale of this solution.
  • To prevent partial proofs from being successfully verified, we can modify the implementation of VerifyNamespace by adding a new parameter to take the size of the tree, i.e., the number of data items it represents. This new parameter would be used to adjust the verification process to account for the tree size. By doing so, we can ensure that partial proofs do not comply with the height of the tree, since they will include fewer nodes than what is required for a full proof.
@staheri14 staheri14 self-assigned this Feb 21, 2023
@staheri14
Copy link
Contributor Author

cc: @liamsi @evan-forbes Please let me know your thoughts on this, thanks!

@liamsi
Copy link
Member

liamsi commented Feb 28, 2023

If the current behaviour does not cause serious issue in the entire system, then we shall consider the current behavior of the VerifyNamespace valid and include it in the specification.

Hmm, I fail to see if this actually causes any problems. But need to think about it further. The example you provided basically states that a root is sufficient to prove absence of a namespace (and is basically several instances of this case). In this case this is a subtree root but the base-case we should decide is: is it sufficient to provide only the root to prove that a namespace is not included?
Edit: actually, no. It’s not that simple. hash1 does not prove that ns 1 is outside of the namespace range of its subtree (hash2 does though). So that case is different from the base case I was talking about. The question instead boils down to: does hash1 sufficiently prove that no leaf with ns 1 is included in its subtree?

For an absence proof, we should return the underlying leaf data instead of the leaf hash so that the internal nodes wont be acceptable in place of leaf nodes.

Why is that? Don't the internal node equivalently prove the absence? If anything, the proof should consider the height as you suggested but I am not sure that is even necessary. Note that leaf and inner nodes are domain separated (like in rfc 6962) s.t. hashing a leaf and an inner node yields a different result. Though it sounds like in this specific questions/case is orthogonal to that.

@liamsi
Copy link
Member

liamsi commented Feb 28, 2023

In other words, a malicious node is not able to convince someone of the absence of a namespace ID that actually exists in the tree.

I think as long as this property really holds we are good either way.

@rootulp
Copy link
Collaborator

rootulp commented Feb 28, 2023

The question instead boils down to: does hash1 sufficiently prove that no leaf with ns 1 is included in its subtree?

No. hash1 is 0 (minNamespaceID) | 2 (maxNamespaceID) | hash so it is possible that a node with namespaceID=1 is present in the subtree under hash1. Additional data is needed in the proof (e.g. the leafHash under that subtree). Question with regards to the leafHash godoc:

        // leafHash contains a tree leaf that 1) its namespace ID is the largest
	// namespace ID less than nid and 2) the namespace ID of the leaf to the
	// left of it is smaller than the nid 3) the namespace ID of the  leaf to
	// the right of it is larger than nid.

Why is statement 2 necessary? If statement 1 is true, statement 2 must be true because the assumption that leaves of the NMT are sorted.

@staheri14
Copy link
Contributor Author

staheri14 commented Feb 28, 2023

For an absence proof, we should return the underlying leaf data instead of the leaf hash so that the internal nodes wont be acceptable in place of leaf nodes.
Why is that? Don't the internal node equivalently prove the absence? If anything, the proof should consider the height as you suggested but I am not sure that is even necessary. Note that leaf and inner nodes are domain separated (like in rfc 6962) s.t. hashing a leaf and an inner node yields a different result. Though it sounds like in this specific questions/case is orthogonal to that.

@liamsi One reason that VerifyNamespace accepts a partial proof is that it does not verify whether the supplied leafHash belongs to a leaf or an intermediate node. Due to this, any intermediate node e.g., hash1 in the preceding example can be sent in place of a leafHash. However, if the underlying data of the leafHash is supplied, then we can perform the HashLeaf operations ourselves (using the LeafPrefix 0x00) and make sure the node corresponds to a leaf (hence the supplied proof is not partial rather a full proof). This allows us to detect whether the provided proof is partial or full.

@staheri14
Copy link
Contributor Author

staheri14 commented Feb 28, 2023

Why is statement 2 necessary? If statement 1 is true, statement 2 must be true because the assumption that leaves of the NMT are sorted.

@rootulp Right, I recall that we discussed it for the NMT specification, the spec was updated, however, the code documentation was not updated accordingly at that time. Thank you for bringing this to my attention. Please see the following PR that addresses your concrn as well #114.

@staheri14
Copy link
Contributor Author

staheri14 commented Feb 28, 2023

The question instead boils down to: does hash1 sufficiently prove that no leaf with ns 1 is included in its subtree?

@liamsi
The absence of nID=1 cannot be proven by hash1, as noted in #110 (comment).
If we choose to implement the first proposed solution, which is to accept partial absence proof as valid, we could send the highest ancestor (or one of the ancestors) of the leafHash instead of the leafHash itself. This ancestor should be the one closest to the root on the branch connecting the leafHash to the root, and its range of namespace IDs should not overlap with the queried nID.

I want to clarify that I am not promoting this solution, but rather explaining how it can be implemented correctly.

staheri14 added a commit that referenced this issue Feb 28, 2023
…proof (#114)

## Overview

This PR intends to bring the description of the `leafHash` field in the
`Proof `data structure in line with the specification. Additionally,
among the three criteria enumerated for the `leafHash` the third
criterion is redundant since it is already covered by the first two.
Although this has already been addressed in the specification, the code
description needs to be updated, and this PR accomplishes that (this
comment was originally raised in
#110 (comment)).

## Checklist

- [x] New and updated code has appropriate documentation
- [x] New and updated code has new and/or updated testing
- [x] Required CI checks are passing
- [x] Visual proof for any user facing features like CLI or
documentation updates
- [x] Linked issues closed with keywords
@staheri14 staheri14 added the enhancement New feature or request label Mar 1, 2023
@liamsi
Copy link
Member

liamsi commented Mar 2, 2023

The absence of nID=1 cannot be proven by hash1, as noted in #110 (comment).

Yeah, I see. We need to fix this. Great find and write-up. Looking forward to the outcome of #118

@staheri14
Copy link
Contributor Author

Considering the results established in the following notion page we can consider this issue as completed.
I also created a follow-up issue to decide on whether to include the partial absence proof description in the NMT specs.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

3 participants