-
Notifications
You must be signed in to change notification settings - Fork 75
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merkle Patricia Trie #10
Conversation
…ient into feature/patriciaTrie
…-client into feature/patriciaTrie
… was rejecting the uptate
…eFromStorage lists in put
…the node list on memory
…-client into feature/patriciaTrie
case (Nil, Some(value)) => LeafNode(Array.emptyByteArray, value, hashFn) | ||
case _ => node | ||
} | ||
case extensionNode@ExtensionNode(sharedKey, next, _) => |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
next
is unused
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed!
toDeleteFromStorage: Seq[Node] = Seq(), | ||
toUpdateInStorage: Seq[Node] = Seq()) | ||
|
||
private case class NodeRemoveResult(hasChanged: Boolean, maybeNewChild: Option[Node], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
NodeInsertResult
and NodeRemoveResult
are use internally in MerklePatriciaTree
, right? So maybe it would better to place them in the companion object?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, they are only used for the output of the remove and put functions. I agree, I'm currently fixing it.
def fromBytes(bytes: Array[Byte]): T | ||
} | ||
|
||
trait DataSource { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are those traits in a package object?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@rtkaczyk Because they are part of the MPT public API
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, we decided to put them there as they should be commonly implemented in order to use the trie (if any our implementations of them are not used). Do you think it would be better to put them elsewhere?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, but why package object
? Package objects are usually used for things that cannot be placed at a top-level of a package, like vals and defs. But traits can :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was proposed by @mcsherrylabs
…oveResult to companion object
@@ -1,7 +1,7 @@ | |||
package io.iohk.ethereum | |||
|
|||
import java.math.BigInteger | |||
import org.scalacheck.{Arbitrary, Gen} | |||
import org.scalacheck.{Arbitrary, Gen, _} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is equivalent to import org.scalacheck._
. Isn't it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are right. As we are only using org.scalacheck.{Arbitrary, Gen}
I'll change it to that.
…-client into feature/patriciaTrie
val keyNibbles = HexPrefix.bytesToNibbles(bytes = kSerializer.toBytes(key)) | ||
val root = getNode(rootId, dataSource) | ||
remove(root, keyNibbles) match { | ||
case NodeRemoveResult(true, Some(newRoot), nodesToRemoveFromStorage, nodesToUpdateInStorage) => |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
about second parameter: it is called here newRoot
but in case class this field is named maybeNewChild
https://github.com/input-output-hk/etc-client/pull/10/files#diff-2c1aab8e8ca156fbe430bbb586ecfc01R17
should it be that way?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In NodeInsertResult
and NodeRemoveResult
we named it that way as it is more general, the maybeNewChild
could be the an inner node of the trie, as is the case in the uses of both case classes in the put and remove function. Maybe we should rename maybeNewChild
as maybeNewNode
as the use of a the child word can be mixed with it meaning in the BranchNode.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe just newNode
, as it is in NodeInsertResult
?
I think maybe
part is implied by Option
type
What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree, I'll change the variable's name to newNode
.
…io.iohk.ethereum.mpt
…-client into feature/patriciaTrie
…/etc-client into feature/patriciaTrie
…-client into feature/patriciaTrie
|
assert(obtainedAfterDelete.isEmpty) | ||
} | ||
|
||
ignore("IODB test - Insert of the first 5000 numbers hashed and then remove half of them"){ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are these tests ignored?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We set some of the IODB and EthereumJ Compatibility tests to be ignored so that they are not run every time, as they take a while to run.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have several remarks about those tests:
- If we're ignoring tests in such manner it deserves at least a comment. Otherwise an unknowing developer might come at those tests and wonder what he has to do to fix them.
- For such long running tests I think a better solution would be to create them in a separate test configuration. Then in Circle CI we could easily run them conditionally with
sbt long-running:test
, e.g. only when merging to master.
Another reason for a separate config is that, unless I'm missing something, they don't look like regular unit tests. Are they stress tests? Does running them with 40000 numbers give us a big boost of confidence wrt correctness as compared to running with 4 numbers?
Lastly, if we don't run those tests automatically how are we going maintain them? Are we going to nominate a person who's going to run them occasionally to see if they're not broken? I certainly do not volunteer 😁 - I ran one of those tests:
"IODB Test - PatriciaTrie insert and get"
and it takes ~600 ms. I wouldn't say that's terribly slow. - Could we test
IodbDataSource
in a separate suite and without dependency onMerklePatriciaTrie
? It is my understanding MPT depends on a DataSource, not vice versa. More importantly though, what if we find out that IODB doesn't meet our expectations? We shouldn't have to change MPT tests when we replace the DataSource implementation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- I do agree it's confusing. That being said, I don't like (at least in general terms) to push a broken test marked as
ignore
. All tests should be working or get fixed. - Lol. yes, we can create a separate config that's a good idea. The goal of some of them is to maintain certain level of compatibility against other implementations and allow others to compare our client against ethereum wiki ones.
- No it's not too much atm, we can uningnore it
- Yes we can, but, based on item (2) there is a benchmark test that requires a non ephemeral trie so it does make sense to use them there.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good, then I would propose to name the configuration benchmark
, and later we can configure Circle to run it when merging to master or by using nightly builds.
Regarding IodbDataSource
it's fine if the tests utilise MPT when advantageous, especially in benchmark. However, a separate IodbDataSourceSuite
testing get/update
in isolation could be useful, even if it's really simple, wdyt?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
90 sec does not looks scary to me ;)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
90 sec does not looks scary to me ;)
So I'll have to wait 90 secs every time I save a file i a project ;)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@AlanVerbner let's use custom configs. Imho it's most convenient approach from user's point of view. Think of running sbt it:test
in SBT console instead of dealing with Tests.Argument("-l", "LongRunningTest")
settings.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@whysoserious deal. Will create a custom config called benchmark
(as @rtkaczyk suggested) and will move this tests over there.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
90 sec does not looks scary to me ;)
It may be scary once we accumulate more of such tests. I wouldn't want to wait over an hour for a PR to go green.
in order to finish this PR CircleCI will have to successfully run an additional step - sbt it:test
@whysoserious it looks that we need a proper circle.yml
file first (I thought Circle was smart enough to pick up travis.yml
but apparently it's just using some defaults for Scala). So I'd say that's a follow-up PR.
Anyway if those tests currently take just a few minutes then it's OK to require them for every PR, but ultimately I'd like that we run them conditionally: either when merging to master or during nightly builds.
…/etc-client into feature/patriciaTrie
…artitioning [MPT] Test partitioning in order not to ignore long running tests
One final remarks about the tests is that to putting those tests Anyway, for now I'm glad we have a separation of non-unit tests. Once we have more such tests and we can clearly categorise them we can think about further separation. |
LGTM 👍 @adamsmo wanna give a second approval? |
LGTM as well 👍 |
@rtkaczyk I definetly agree with your comment but what I do like most is
|
Description
This branch has the MPT implementation. It also has a DataSource implementation based on IODB.
We tried to have as most coverage as we could for this code because of it's critical nature so please keep special attention at it while reviewing.
Design choices
LSMStore
keySize
config.MerklePatriciaTrie
class could be immutable but as long as it needs to access a mutableDataSource
it's also mutable.ByteArraySerializable
that will be responsible for their byte array encoding. There is no specific guideline regarding how that should be done as long as the merkle root returns the same hash