-
Notifications
You must be signed in to change notification settings - Fork 4
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Causal guarantees #2
Comments
The merkle tree construction is not inherently preserving causal ordering, I can freely choose the order in which I concatenate things before hashing. The trivial special case: If I have two messages A and B, I can create either the log A | B ( I can't do this permutation at arbitrary points, since the size of the trees that make up the log must be (non-strictly) decreasing. But whenever two trees of equal size are merged, I get to freely choose the ordering, violating causality if I choose to do so. To prevent this, you could use something like threaded authentication trees (section 5.3 of the paper), but as far as I'm aware hypercore doesn't do any of this. I'd be happy to be wrong about this. (The threaded authentication trees as presented in the paper have O(n log(n)) space overhead, but there's a fairly simple optimization to get it back to O(n) by adding additional edges to inner nodes and thus covering more leaves per edge. I can go into more detail if you are interested in this, but I haven't done a writeup I could link you to and I haven't seen this optimization in the literature either.) |
No, you cannot prepend to the Hypercore or reorder it, even in the even sized sub tree case
The treeHash is the hash the secret key signs (ie the entire ordered history) The important part here is peak.index as that encodes the position of the Merkle peak, making sure to lock that tree down in terms of causal time. (See the actual impl here https://github.com/mafintosh/hypercore-crypto/blob/master/index.js#L62-L76) For example let's say you have your example M3
/ \
/ \ / \
D0 D1 D2 D3 The tree hash (ie the hash the keypair signs) is
If you append D0' the tree looks like this
With the tree hash looking like this:
Let me know if that makes sense or if I'm missing something. Perhaps an example with actual data makes it easier to grok. The protocol paper def needs an update I can see, as that does not reflect the tree hashing stuff that has always been there (ETOOMANYDOCS :)) |
My concern are malicious authors. Are you kindly asking the author to use your code and to not lie about the index, or is there some verification logic to enforce it? I'll be satisfied when you can show me the code that rejects a hypercore where the indexes are inconsistent, i.e. the code that rejects going from
to
This is not the only reordering a malicious author could attempt in this step, there is one for each tree merge, i.e. (Probably obvious but still worth stating explicitly (the verification section of DEP-0002 doesn't): The verification code also has to check the sizes that the treeHash claims.) |
The replicating peer would reject your above example as the Merkle roots don't add up anymore. It already has M3, so when trying to build M'' it will compute a different hash rejecting the incoming replication.
The tree verification is run here https://github.com/mafintosh/hypercore/blob/master/index.js#L935 The treeHash is re-produced by the peer that is trying to replicate the data, no data is implicitly trusted from the author. |
That's looking reassuring @mafintosh. One last (hopefully) question before I'm happy: Suppose a malicious author Mal decides on eight data items D1, ..., D8 that should make up their log. Mal decides that the first item should be D1, the second D2, the fifth D5, the sixth D6, the seventh D7 and the eights D8. Mal's aim is to not commit to whether D3 and D4 are the third and fourth or the fourth and third item respectively, until it is absolutely necessary. A peer who has no prior information about Mal's log requests the fifth entry, so Mal supplies D5 and the paths through the merkle tree of eight leaf nodes. Does the additional data that Mal has to send to prove that claim already imply the ordering between D3 and D4? Or could Mal, if at a later point the same peer requested the fourth entry still decide whether to place D3 or D4 there? |
Let me try and sketch out the example you mention So the replicating peer, P will receive the following merkle tree from Mal on their initial request:
( It uses the top Now in an evil act, Mal reorders D3 and D4 as you mention. Let's say P asks for D3. Mal will send back the following tree
( Now in this case when it produces the top root hash it will compare this hash against the one it already produced from the previous request that it has stored. It won't match as D3 and D4 has been reordered changing the hash of that tree. Hence P rejects the data and drops the connection to Mal. Makes sense? :) |
Yup, I'm satisfied now. Not really happy - proving that hypercore's scheme provides these guarantees seems to be far more finicky than for antimonotone graphs - but I'll remove the claim about hypercore's guarantees being weaker from the readme :) Thanks for taking the time explaining this. Bamboo still has the better time complexity of O(1) per append and verify operation though :P Then again, hypercore achieves better constants for the O(log(n)) certificate size. |
Yep, pros and cons rule the world :) A final note is that hypercore uses the above design for maximum deduplication through the merkle trees. I'll make sure the docs are updated. |
Do you mind going into a bit deeper detail on why you consider the causal guarantees stronger with bamboo vs hypercore?
The text was updated successfully, but these errors were encountered: