-
Notifications
You must be signed in to change notification settings - Fork 63
Conversation
Huh, what a PR! Really looking forward to have a look into this, thanks so much! 🤓 📚 |
Just for my test preparation: this should also work with a fast-synced Geth DB to a post-Byzantium state, shouldn't it? |
Yes, it should be able to load all of the block headers, transactions and uncle headers from a fast-synced Geth DB, including post-Byzantium blocks. Something like this should let you iterate through the chain:
Also, here is an example of running the VM on a full or fast sync geth db after a specific block number:
I get a "tx has a higher gas limit than the block" error when attempting to run the code above. I'm not sure if this is due to a problem with the VM or loading the geth db. Will need to look into this further... |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks again for this PR, I'm realizing more and more what is all inside it and how much work this had to be.
I'm now once through line-by-line and I think I am getting most of it structurally, not able yet to make comments on the detail level though.
One thing to be aware of: while this is supporting the old constructors, it won't be possible with this to use an already written DB any more (this is correct, isn't it?). I think that's worth it, also tried to re-cap and I don't think that there are many users of the library who use it on more than a simulation level. Nevertheless I think this should be stated once.
Will continue tomorrow dig a bit deeper into the tests and also locally checkout your fork. Also hope to have a geth fast sync ready, can't wait to try this out! 😄
@vpulim this is awesome! Thanks @holgerd77 for putting in the effort to review these changes. This is a big PR! |
return { | ||
rawHead: this._headHeader, | ||
heads: this._heads, | ||
genesis: this._genesis |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The meta
getter
is not really returning the meta like in the previous version and as expected after reading your point 3) description point:
As a result, the meta field has been moved into a getter that generates the old meta info from other internal fields
Old format:
{ heads: {},
td: <BN: 400000000>,
rawHead: 'd4e56740f876aef8c010b86a40d5f56745a118d0906a34e69aec8c0db1cb8fa3',
height: 0,
genesis: 'd4e56740f876aef8c010b86a40d5f56745a118d0906a34e69aec8c0db1cb8fa3' }
New format:
{ rawHead: <Buffer d4 e5 67 40 f8 76 ae f8 c0 10 b8 6a 40 d5 f5 67 45 a1 18 d0 90 6a 34 e6 9a ec 8c 0d b1 cb 8f a3>,
heads: {},
genesis: <Buffer d4 e5 67 40 f8 76 ae f8 c0 10 b8 6a 40 d5 f5 67 45 a1 18 d0 90 6a 34 e6 9a ec 8c 0d b1 cb 8f a3> }
So types are different and td
and height
are missing. Is this intentional?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
while this is supporting the old constructors, it won't be possible with this to use an already written DB any more (this is correct, isn't it?)
Yes, this PR is not compatible with existing DBs and we should definitely make that very clear. I could create a script to migrate old DBs into geth format if there is enough demand for it.
So types are different and td and height are missing. Is this intentional?
The td and height were intentionally left out. The README doesn't mention a meta field and it refers to BlockChain Properties that don't exist so I was unsure whether removing meta would break the published interface. My preference would be to remove the meta field completely and explicitly expose certain properties (headHeader, headBlock, genesis) and async get methods such as getTd(cb) and getHeight(cb). The async methods are needed since computing td and height require db operations under the geth db design.
However, if we absolutely needed to keep the current meta interface and make these values available synchronously, it is possible but would require additional db calls to ensure these values are always up-to-date. My opinion is that the convenience of this doesn't outweigh the additional performance overhead of preemptively computing these values whenever there is a change to the blockchain (instead of computing them on-demand). As a compromise, I could fix the meta getters to return correct values (including pre-computing td and height), but also deprecate meta and add new properties and async get methods to the interface going forward.
Regarding the difference in types for meta.rawHead and meta.genesis, that was a mistake on my part! I can change the getter to return hex strings instead. Internally, I keep all hash values as Buffers until a conversion to String is absolutely necessary.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- DB compatibility
I would assume that it won't become necessary but it's good to know that there is this fallback solution with a migration script in case there are more people relying on this then realized. People can also still use thev2.1.0
version (for some time).
- meta
I would also say that we can drop themeta
"interface" completely. This was always something very implicit, just did a short GitHub search, within theethereumjs
ecosystem I found only two direct accesses on this from within the VM implementation which can be easily updated. I very much prefer your solution to expose these properties directly in the way you described above.
@@ -303,40 +369,63 @@ Blockchain.prototype._putBlock = function (block, cb, isGenesis) { | |||
/** | |||
*Gets a block by its hash | |||
* @method getBlock | |||
* @param {String|Buffer|Number} hash - the sha256 hash of the rlp encoding of the block | |||
* @param {Buffer|Number|BN} hash - the sha256 hash of the rlp encoding of the block |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, I'm unsure how to proceed with API documentation. This is a bit of a mess anyhow atm and we should switch to generated documentation
API docs from the code. My tendency is to not update the README on this with this PR and then do the autogeneration on a direct subsequent one and then switch to that and remove the current (already incomplete) API docs from the README.
What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(if you want to take on the extra work you can also just add the documentation
dependency to the dev dependencies, add a npm command like "build:docs": "documentation build ./index.js --format md --shallow > ./docs/index.md
or similar (would be cool to omit the _
functions, not sure if such a flag exists) and the do the documentation changes above)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I really like the idea of autogenerating docs. Once this PR is accepted, I'm happy to do another one implementing the approach you describe.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is already done in various of the other ethereumjs
libraries, e.g. in ethereumjs-block.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm now done going through the tests.
I have a strong tendency to give this a go, since both test coverage (including testing of existing functionality) has increased significantly and code is more readable/understandable then before.
I will leave this open over the weekend and then eventually approve on Tuesday or Wednesday next week. Everyone who wants to have another look at the code might do so in between.
We also might want to do at least some basic investigation/origin search about the VM "tx has a higher gas limit than the block" error (see this comment) and make sure this is actually originating in the VM code.
I would then release this as a new major |
Ok, have tested the iterator example, this works like a charm, will let this run through a bit... 🏇🏇🏇 Couple of minutes later, just passed the 100.000 mark and no signs of slowing down. Watched the memory a bit, stays relatively constantly around 4%. Pretty cool. 😄 |
Did a test-PR over on the VM running the tests with the changed I also tried to run the VM example, I came to the conclusion that this is a separate construction site which we can approach slowly/independently on top of this. Actually got the example running to some extend (I had to add Ok. I'll leave this open for another 24 hours for comments. |
@holgerd77 Awesome! Happy to hear that all tests passed :) Thanks again for all your work on reviewing/testing this PR. |
Could you do a review of ethereumjs/ethereumjs-block#44 since you have already looked into the commons library? |
Sure, I'll take a look at it today. |
Ok, will now merge this. Thanks once more @vpulim for this wonderful PR. Will do a subsequent PR with the docs changes and then maybe do a release tomorrow or the day after. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good.
Short note: So regarding documentation I'll stick to the conservative approach for now and just manually update the README, we can take on the above separately. Will also update the README with the first usage example you posted. |
…ity), added example code snippet
…ity), added example code snippet
Updated API docs (Geth compatibility PR #47)
I would like to note that we do not guarantee stability of the go-ethereum database schema. It can change without notice. You have been warned ;). |
@fjl Hehe. Thanks for letting us now, we'll keep this in mind. 😄 Will be useful for us anyhow, minimally for VM testing and development purposes. |
And any reason you didn't put the |
@holgerd77 That bit of code is a little confusing to read unfortunately, but yes that is the intention. Both of those callbacks feed their return values (in that order) to the lookupByHashAndNumber function which takes a hash as the first value and number as the second. In the first cb() call, blockTag is a hash and in the second call, blockTag is a number. So in both cases, cb() is being called with hash and number, in that order. |
@holgerd77 headHeader is just a hash value, not a full header object. So a db get operation must be made in order to retrieve the height (from the block number). |
This is a major refactoring of the code to support reading from geth leveldb databases. One major benefit of these changes is to allow testing of ethereumjs-vm on a blockchain previously synched by geth, including all of the various hardforks (the current ethereumjs-vm only supports the byzantium fork, however).
Apologies ahead of time for the large code diff, but I couldn't come up with a way to make smaller incremental changes since the current architecture makes heavy use of the doubly linked list from detailsDB which had to be removed.
This is a list of the major changes required for geth db compatibility:
Finally, the the ethereumjs-vm blockchain tests have been run on this PR and the number of passing tests remained the same as compared to the current HEAD (https://gist.github.com/vpulim/efbb864d5790643e06cf87b616036141)