Add Ethereum network indexer (phase 1: blocks only) #1383

Merged
merged 53 commits into from
Dec 30, 2019


Jannis
Contributor

@Jannis Jannis commented Nov 26, 2019

I consider this generally ready for review!

This PR implements phase 1 of #297, including the following features:

  1. A network indexer that indexes blocks from the selected network. (Transactions, logs, receipts, accounts, balances come in the next two phases.) This network indexer handles reorgs to an unlimited depth (bounded only by the memory of the machine the node runs on).

  2. A --network-subgraphs CLI flag to enable network subgraph indexing per network, e.g. with

    graph-node ... --network-subgraphs ethereum/mainnet ethereum/kovan

  3. A built-in GraphQL schema that allows Ethereum network subgraphs to be queried (or subscribed to) at /subgraphs/ethereum/mainnet. Apart from slightly different relationship fields, this schema is heavily based on Geth's GraphQL schema.

  4. Tests for basic indexing and reorg handling. These should be extended further by also verifying that the blocks are actually written to the store. Right now they only test the Revert and AddBlock events emitted by the indexer (which are only emitted after reverting/writing blocks successfully; however that doesn't mean the written data is correct).

Database size

Regarding the size of the indexed data: The most recent 4000 mainnet blocks resulted in a Postgres database size of 20MB. That makes ~65GB for all 9,000,000 blocks, assuming they are the same size on average. They are probably smaller on average (older blocks were less busy), so we're looking at maybe 50GB just for the blocks (or their headers, rather).

Review guide

I recommend reviewing commit by commit first. I've consolidated the PR into commits that mostly change only one thing at a time, so it's easier to follow what was added when.

It may help to read up on the terminology used across the network indexer here:

/// Terminology used in this component:
///
/// Head / head block:
/// The most recent block of a chain.
///
/// Local head:
/// The block that the network indexer is at locally.
/// We get this from the store.
///
/// Chain head:
/// The block that the network is at.
/// We get this from the Ethereum node(s).
///
/// Common ancestor (during a reorg):
/// The most recent block that two versions of a chain (e.g. the locally
/// indexed version and the latest version that the network recognizes)
/// have in common.
///
/// When handling a reorg, this is the block after which the new version
/// has diverged. All blocks up to and including the common ancestor
/// remain untouched during the reorg. The blocks after the common ancestor
/// are reverted and the blocks from the new version are added after the
/// common ancestor.
///
/// The common ancestor is identified by traversing new blocks from a reorg
/// back to the most recent block that we already have indexed locally.
///
/// Old blocks (during a reorg):
/// Blocks after the common ancestor that are indexed locally but are
/// being removed as part of a reorg. We collect these from the store by
/// traversing from the current local head back to the common ancestor.
///
/// New blocks (during a reorg):
/// Blocks between the common ancestor and the block that triggered the
/// reorg. After reverting the old blocks, these are the blocks that need
/// to be fetched from the network and added after the common ancestor.
///
/// We collect these from the network by traversing from the block that
/// triggered the reorg back to the common ancestor.
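
The terminology above can be illustrated with a small, self-contained sketch. This is not the PR's implementation: `Block`, `find_common_ancestor`, `collect_old_blocks`, `chain`, and the `HashMap`-based chains are hypothetical stand-ins for the store and the Ethereum node, and block hashes are plain string labels.

```rust
use std::collections::HashMap;

/// Minimal block header for illustration (hypothetical; the real indexer
/// works with full Ethereum blocks).
#[derive(Clone, Debug, PartialEq)]
pub struct Block {
    pub hash: &'static str,
    pub parent: Option<&'static str>,
}

/// Walks back from `incoming` through `network` (blocks known to the
/// network, keyed by hash) until it reaches a block that is already indexed
/// locally. Returns the common ancestor's hash and the new blocks in the
/// order in which they would be added after it.
pub fn find_common_ancestor(
    incoming: &'static str,
    network: &HashMap<&'static str, Block>,
    local: &HashMap<&'static str, Block>,
) -> Option<(&'static str, Vec<&'static str>)> {
    let mut new_blocks = Vec::new();
    let mut cursor = incoming;
    while !local.contains_key(cursor) {
        new_blocks.push(cursor);
        // Step to the parent; give up if the block is unknown or has no parent.
        cursor = network.get(cursor)?.parent?;
    }
    new_blocks.reverse();
    Some((cursor, new_blocks))
}

/// Collects the old blocks by walking from the local head back to (but not
/// including) the common ancestor.
pub fn collect_old_blocks(
    local_head: &'static str,
    ancestor: &'static str,
    local: &HashMap<&'static str, Block>,
) -> Option<Vec<&'static str>> {
    let mut old_blocks = Vec::new();
    let mut cursor = local_head;
    while cursor != ancestor {
        old_blocks.push(cursor);
        cursor = local.get(cursor)?.parent?;
    }
    Some(old_blocks)
}

/// Builds a chain map from `(hash, parent)` pairs.
pub fn chain(blocks: &[(&'static str, Option<&'static str>)]) -> HashMap<&'static str, Block> {
    blocks
        .iter()
        .map(|&(hash, parent)| (hash, Block { hash, parent }))
        .collect()
}
```

With a local chain `a, b, c, x` and a network chain `a, b, d, e, f`, the sketch reports `b` as the common ancestor, `x` and `c` as old blocks, and `d`, `e`, `f` as new blocks.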

The state machine for the network indexer is documented here:

/// State machine that handles block fetching and block reorganizations.
#[derive(StateMachineFuture)]
#[state_machine_future(context = "Context")]
enum StateMachine {
/// The indexer starts in an empty state and immediately moves on
/// to loading the local head block from the store.
#[state_machine_future(start, transitions(LoadLocalHead))]
Start,
/// This state waits until the local head block has been loaded from the
/// store. It then moves on to polling the chain head block.
#[state_machine_future(transitions(PollChainHead, Failed))]
LoadLocalHead { local_head: LocalHeadFuture },
/// This state waits until the chain head block has been polled
/// successfully.
///
/// Based on the (local head, chain head) pair, the indexer then moves
/// on to fetching and processing a range of blocks starting at
/// local head + 1, leading up to the chain head. This is done
/// in chunks of e.g. 100 blocks at a time for two reasons:
///
/// 1. To limit the number of blocks we keep in memory.
/// 2. To be able to re-evaluate the chain head and check for reorgs
/// frequently.
#[state_machine_future(transitions(ProcessBlocks, PollChainHead, Failed))]
PollChainHead {
local_head: Option<EthereumBlockPointer>,
chain_head: ChainHeadFuture,
},
/// This state takes the next block from the stream. If the stream is
/// exhausted, it transitions back to polling the chain head block
/// and deciding on the next chunk of blocks to fetch. If there is still
/// a block to read from the stream, it's passed on to vetting for
/// validation and reorg checking.
#[state_machine_future(transitions(VetBlock, PollChainHead, Failed))]
ProcessBlocks {
local_head: Option<EthereumBlockPointer>,
chain_head: LightEthereumBlock,
next_blocks: BlockStream,
},
/// This state vets incoming blocks with regard to two aspects:
///
/// 1. Does the block have a number and hash? This is a requirement for
/// indexing to continue. If not, the indexer re-evaluates the chain
/// head and starts over.
///
/// 2. Is the block the successor of the local head block? If yes, move
/// on to indexing this block. If not, we have a reorg.
///
/// Notes on the reorg handling:
///
/// By checking parent/child succession, we ensure that there are no gaps
/// in the indexed data (classic mathematical induction). So if the local
/// head is `x` and a block `f` comes in that is not a successor/child, it
/// must be on a different version/fork of the chain.
///
/// E.g.:
///
/// ```ignore
/// a---b---c---x
///     \
///      +--d---e---f
/// ```
///
/// In that case we need to do the following:
///
/// 1. Find the common ancestor of `x` and `f`, which is the block after
/// which the two versions diverged (in the above example: `b`).
///
/// 2. Collect old blocks between the common ancestor and (including)
/// the local head that need to be reverted (in the above example:
/// `c`, `x`).
///
/// 3. Fetch new blocks between the common ancestor and (including) `f`
/// that are to be inserted instead of the old blocks in order to
/// make the incoming block (`f`) the local head (in the above
/// example: `d`, `e`, `f`).
#[state_machine_future(transitions(FetchNewBlocks, AddBlock, PollChainHead, Failed))]
VetBlock {
local_head: Option<EthereumBlockPointer>,
chain_head: LightEthereumBlock,
next_blocks: BlockStream,
block: BlockWithUncles,
},
/// This state waits until all new blocks from the incoming block back to
/// the common ancestor are available. Identifying the common ancestor is
/// part of this process.
///
/// If successful, the indexer moves on to collecting old blocks and
/// reverting the indexed data to the common ancestor. If fetching the new
/// blocks fails, it discards any new information and re-evaluates the chain
/// head.
///
/// The new blocks that were fetched are prepended to the incoming blocks
/// stream, so that after reverting blocks the indexer can proceed with these
/// as if no reorg happened. It'll still want to vet these blocks so it wouldn't
/// be wise to just index the blocks without further checks.
///
/// Note: This state also carries over the incoming block stream to not lose
/// its blocks. This is because even if there was a reorg, the blocks following
/// the current block that made us detect it will likely be valid successors.
/// So once the reorg has been handled, the indexer should be able to
/// continue with the remaining blocks on the stream.
///
/// Only when going back to re-evaluating the chain head, the incoming
/// blocks stream is thrown away in the hope of receiving a better
/// chain head with different blocks leading up to it.
#[state_machine_future(transitions(RevertToCommonAncestor, PollChainHead, Failed))]
FetchNewBlocks {
local_head: Option<EthereumBlockPointer>,
chain_head: LightEthereumBlock,
next_blocks: BlockStream,
new_blocks: NewBlocksFuture,
},
/// This state collects and reverts old blocks in the store. If successful,
/// the indexer moves on to processing the blocks regularly (at this point,
/// the incoming blocks stream includes new blocks for the reorg, the
/// block that triggered the reorg and any blocks that were already in the
/// stream following the block that triggered the reorg).
///
/// After reverting, the local head is updated to the common ancestor.
///
/// If reverting fails at any block, the local head is updated to the
/// last block that we managed to revert to. Following that, the indexer
/// re-evaluates the chain head and starts over.
///
/// Note: failing to revert an old block locally may be something that
/// the indexer cannot recover from, so it may run into a loop at this
/// point.
#[state_machine_future(transitions(ProcessBlocks, PollChainHead, Failed))]
RevertToCommonAncestor {
local_head: Option<EthereumBlockPointer>,
chain_head: LightEthereumBlock,
next_blocks: BlockStream,
new_local_head: RevertBlocksFuture,
},
/// This state waits until a block has been written and an event for it
/// has been sent out. After that, the indexer continues processing the
/// next block. If anything goes wrong at this point, it's back to
/// re-evaluating the chain head and fetching (potentially) different
/// blocks for indexing.
#[state_machine_future(transitions(ProcessBlocks, PollChainHead, Failed))]
AddBlock {
chain_head: LightEthereumBlock,
next_blocks: BlockStream,
old_local_head: Option<EthereumBlockPointer>,
new_local_head: AddBlockFuture,
},
/// This is unused, the indexing never ends.
#[state_machine_future(ready)]
Ready(()),
/// State for fatal errors that cause the indexing to terminate. This should
/// almost never happen. If it does, it should cause the entire node to crash
/// and restart.
#[state_machine_future(error)]
Failed(Error),
}
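
The two decisions documented for `PollChainHead` and `VetBlock` can be sketched in isolation. This is only an illustration under simplifying assumptions: `BlockPtr`, `Vetted`, `next_chunk`, and `vet_block` are hypothetical names, hashes are plain integers, and the real state machine works with futures and streams rather than plain functions.

```rust
/// Pointer to a block: its number and hash (hypothetical, simplified
/// stand-in for `EthereumBlockPointer`).
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct BlockPtr {
    pub number: u64,
    pub hash: u64,
}

/// Outcome of vetting an incoming block against the local head.
#[derive(Debug, PartialEq)]
pub enum Vetted {
    /// The block is the direct successor of the local head: index it.
    AddBlock,
    /// The block's parent is not the local head: we have a reorg.
    Reorg,
}

/// Returns the inclusive range of block numbers to fetch next, capped at
/// `chunk_size` blocks, or `None` if the local head has caught up.
pub fn next_chunk(
    local_head: Option<BlockPtr>,
    chain_head: BlockPtr,
    chunk_size: u64,
) -> Option<(u64, u64)> {
    // Assumption: with no local head, indexing starts at block number 0.
    let start = local_head.map_or(0, |ptr| ptr.number + 1);
    if start > chain_head.number || chunk_size == 0 {
        return None;
    }
    let end = chain_head.number.min(start.saturating_add(chunk_size - 1));
    Some((start, end))
}

/// Compares the incoming block's parent pointer with the local head.
pub fn vet_block(local_head: Option<BlockPtr>, parent_of_incoming: Option<BlockPtr>) -> Vetted {
    if parent_of_incoming == local_head {
        Vetted::AddBlock
    } else {
        Vetted::Reorg
    }
}
```

Keeping the chunk small means the chain head is re-polled often, which is the second reason the documentation gives for chunking.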


While this is being reviewed, I'll work on improving the tests to test data correctness and potentially generalizing the network indexer across chains. I've done some initial thinking to identify how the indexer depends on Ethereum right now and I think we can abstract that away.

@Jannis Jannis force-pushed the jannis/block-explorer-phase-1-v1 branch from e1de997 to 5a69e25 Compare November 26, 2019 15:28
@Jannis Jannis requested a review from a team November 26, 2019 18:39
@Jannis Jannis self-assigned this Nov 26, 2019
@Jannis Jannis added chains/ethereum enhancement New feature or request labels Nov 26, 2019
@Jannis Jannis marked this pull request as ready for review November 26, 2019 18:43
@Jannis Jannis force-pushed the jannis/block-explorer-phase-1-v1 branch from ec19ad2 to 02e93a5 Compare November 26, 2019 18:46
@leoyvens
Collaborator

potentially generalizing the network indexer across chains

This is probably quite a diff, would it be done in this PR or a follow up?

@Jannis
Contributor Author

Jannis commented Nov 26, 2019

@leoyvens I'd be happy to make that a follow up, given the size of this PR.

@Jannis
Contributor Author

Jannis commented Nov 26, 2019

One thing I'll still do is put metrics back in. I added the utilities (Aggregate and .measure()) to the PR and was using them at some point. But I rewrote this code about three times and dropped the metrics along the way.

@Jannis Jannis force-pushed the jannis/block-explorer-phase-1-v1 branch from 25fd80e to b2232a9 Compare November 26, 2019 22:06
@Jannis
Contributor Author

Jannis commented Nov 28, 2019

I've added extensive instrumentation to enable Grafana dashboards like this one:

[image: Grafana dashboard screenshot]

@Jannis Jannis force-pushed the jannis/block-explorer-phase-1-v1 branch 2 times, most recently from ef11471 to ae8b81b Compare December 3, 2019 22:52
@Jannis
Contributor Author

Jannis commented Dec 4, 2019

Tests generally pass, just sometimes they hang on Travis. I'm not sure yet what causes it.

Collaborator

@leoyvens leoyvens left a comment

I haven't yet fully understood the reorg algorithm, but it seems complicated due to the need of finding the common ancestor. I wonder if we could do a simpler algorithm which is to revert a single block if the next block is not a child of the current one, and go back to the starting state. This would naturally find the common ancestor by reverting one block at a time until it is found. It would in theory be less efficient for large reorgs, but looking at Etherscan statistics, it seems that 95% of reorgs are 1 block deep, and I couldn't even find a 3 block deep reorg, those must happen only once every full moon. So the simpler algorithm could maybe be more efficient because it needs to do less work on 1 block reorgs which are the common case. My point being that the performance difference would not matter if there is any, so we should favor simplicity.
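
A minimal sketch of the simpler algorithm described in this comment, under stated assumptions: `Block`, `Step`, and `step` are hypothetical, the local chain is a `Vec` used as a stack, and `canonical` stands for the chain as the network currently sees it, indexed by block number with a parentless genesis block at index 0.

```rust
/// Simplified block: number, hash, and parent hash (hypothetical types).
#[derive(Clone, Debug, PartialEq)]
pub struct Block {
    pub number: u64,
    pub hash: &'static str,
    pub parent: Option<&'static str>,
}

/// Result of a single indexing step.
#[derive(Debug, PartialEq)]
pub enum Step {
    Added(&'static str),
    Reverted(&'static str),
    CaughtUp,
}

/// Performs one step: add the next block if it is a child of the local
/// head, otherwise revert a single block and let the caller start over.
pub fn step(local: &mut Vec<Block>, canonical: &[Block]) -> Step {
    let next_number = local.last().map_or(0, |head| head.number + 1);
    let next = match canonical.get(next_number as usize) {
        Some(block) => block,
        None => return Step::CaughtUp,
    };
    let local_head_hash = local.last().map(|head| head.hash);
    if next.parent == local_head_hash {
        // The next block is a child of the local head: index it.
        local.push(next.clone());
        Step::Added(next.hash)
    } else {
        // Not a child: revert one block. Repeating this eventually
        // reaches the common ancestor without searching for it.
        // Assumption: a mismatch only occurs when a local head exists.
        let reverted = local.pop().expect("a mismatch implies a local head");
        Step::Reverted(reverted.hash)
    }
}
```

For a one-block reorg, this costs one revert followed by one add.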

chain/ethereum/src/network_indexer/block_writer.rs

Box::new(
// Add the block entity
self.set_entity(block.as_ref(), Some(vec![("isOmmer", false.into())]))
Collaborator

It seems hacky that isOmmer is set here, probably reflecting the fact that this field was the last thing added. It would be nicer if we set that in a data structure when the block is fetched.

Contributor Author

Agreed. I've changed this so ommer blocks are wrapped in an Ommer newtype and the isOmmer flag is now set in the TryIntoEntity implementations for Ommer and BlockWithUncles.
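
A rough sketch of that approach, with the caveat that everything here is simplified and assumed: the `Block` fields, the map-based `Entity`, and the signature of `TryIntoEntity` are illustrative and are not taken from the PR.

```rust
use std::collections::BTreeMap;

/// Simplified block data (hypothetical; stands in for the real block types).
#[derive(Clone, Debug)]
pub struct Block {
    pub hash: String,
    pub number: u64,
}

/// A regular block together with its ommers (uncles).
pub struct BlockWithUncles {
    pub block: Block,
    pub uncles: Vec<Block>,
}

/// Newtype marking a block as an ommer.
pub struct Ommer(pub Block);

/// Entity as a flat map of field names to string values (a simplification).
pub type Entity = BTreeMap<&'static str, String>;

/// Hypothetical conversion trait; the real trait's signature may differ.
pub trait TryIntoEntity {
    fn try_into_entity(&self) -> Result<Entity, String>;
}

/// Shared conversion; only the `isOmmer` flag depends on the wrapper type.
fn block_entity(block: &Block, is_ommer: bool) -> Entity {
    let mut entity = Entity::new();
    entity.insert("id", block.hash.clone());
    entity.insert("number", block.number.to_string());
    entity.insert("isOmmer", is_ommer.to_string());
    entity
}

impl TryIntoEntity for BlockWithUncles {
    fn try_into_entity(&self) -> Result<Entity, String> {
        Ok(block_entity(&self.block, false))
    }
}

impl TryIntoEntity for Ommer {
    fn try_into_entity(&self) -> Result<Entity, String> {
        Ok(block_entity(&self.0, true))
    }
}
```

The flag then follows from the type of the value being written, so the block writer no longer has to pass it in.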

#[derive(Clone, Debug, Default, PartialEq)]
pub struct BlockWithUncles {
pub block: EthereumBlock,
pub uncles: Vec<Option<Block<H256>>>,
Collaborator

A Vec<Option<_>> is weird, what does None mean?

Contributor Author

The docs are not super clear; I think https://github.com/ethereum/wiki/wiki/JSON-RPC#eth_getUncleByBlockHashAndIndex and its link to https://github.com/ethereum/wiki/wiki/JSON-RPC#eth_getblockbyhash suggest that None means the uncle couldn't be found.

We can't allow different nodes to return different uncles though – we need them all for computing block rewards. So I think if an uncle (ommer) is not found, that's a serious reason to fail the network indexer. I'll drop the Option.
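
One way to drop the `Option` is to treat a missing ommer as an error at fetch time. The sketch below assumes a hypothetical `MissingOmmer` error and `require_all_ommers` helper; it only shows the conversion from `Vec<Option<T>>` to `Result<Vec<T>, _>`, not how the indexer reports the failure.

```rust
/// Error raised when a node does not return one of a block's ommers
/// (hypothetical error type for illustration).
#[derive(Debug, PartialEq)]
pub struct MissingOmmer {
    pub index: usize,
}

/// Turns the per-index lookup results into a complete list of ommers,
/// failing on the first ommer that could not be found.
pub fn require_all_ommers<T>(ommers: Vec<Option<T>>) -> Result<Vec<T>, MissingOmmer> {
    ommers
        .into_iter()
        .enumerate()
        .map(|(index, ommer)| ommer.ok_or(MissingOmmer { index }))
        .collect()
}
```

Collecting into a `Result` stops at the first `None`, so the rest of the code only ever sees a complete `Vec<T>`.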

Contributor Author

Updated

chain/ethereum/src/network_indexer/mod.rs
chain/ethereum/src/network_indexer/mod.rs
// Check whether we have a reorg (parent of the new block != our local head).
if block.inner().parent_ptr() != state.local_head {
let depth = block.inner().number.unwrap().as_u64()
- state.local_head.map_or(0u64, |ptr| ptr.number);
Collaborator

I don't see the purpose of this variable, it's only used in logs and metrics, and afaict it's either 0 if block is genesis or 1 otherwise, so not very meaningful.

Contributor Author

Eh, you're right, this will never report the real depth of the reorg. I do want that information, but I think I can only log it once we have found the common ancestor.

Contributor Author

Updated

let state = state.take();

transition!(PollChainHead {
local_head: state.old_local_head,
Collaborator

What if we went back to the starting state here? Then we wouldn't need to keep old_local_head.

Contributor Author

True, I like that.

Contributor Author

Done

chain/ethereum/src/lib.rs
@@ -224,6 +228,10 @@ impl<F: Future> FutureExtension for F {
on_cancel,
}
}

fn measure<C: FnOnce(&Self::Item, Duration)>(self, callback: C) -> Measure<Self, C> {
Measure::new(self, callback)
Collaborator

I think having this helper is unnecessary: it only has one caller from what I can tell, and with async/.await it won't be idiomatic because it's a variation of and_then.

Contributor Author

I can move it into the indexer code.

Contributor Author

And this is done also.

@Jannis
Contributor Author

Jannis commented Dec 13, 2019

@leoyvens I think I've addressed all comments with the appropriate changes. Could you take another look?

@Jannis Jannis force-pushed the jannis/block-explorer-phase-1-v1 branch from 127bdba to c072697 Compare December 13, 2019 12:17
@Jannis
Contributor Author

Jannis commented Dec 13, 2019

Rebased on top of master.

@Jannis
Contributor Author

Jannis commented Dec 13, 2019

Reorg handling was simplified as per @leoyvens's suggestion. Reduced the implementation by about 600 lines and made it a lot easier to follow.

@Jannis
Contributor Author

Jannis commented Dec 13, 2019

Left to do: Count consecutive reverts to capture and log reorg depths.
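
One possible shape for this, as a sketch only: `ReorgTracker` and its methods are hypothetical and not part of the PR. The idea is that a run of consecutive reverts followed by the first added block marks the end of a reorg, and the length of the run is its depth.

```rust
/// Tracks the depth of the reorg in progress by counting consecutive
/// reverts (hypothetical helper, not taken from the PR).
#[derive(Debug, Default)]
pub struct ReorgTracker {
    consecutive_reverts: u64,
}

impl ReorgTracker {
    /// Records that the local head was reverted by one block.
    pub fn block_reverted(&mut self) {
        self.consecutive_reverts += 1;
    }

    /// Records that a block was added. If this ends a run of reverts,
    /// returns the depth of the reorg that just completed and resets
    /// the counter.
    pub fn block_added(&mut self) -> Option<u64> {
        let depth = std::mem::take(&mut self.consecutive_reverts);
        if depth > 0 {
            Some(depth)
        } else {
            None
        }
    }
}
```

The returned depth could then be written to logs or metrics at the point where the first block after the reverts is added.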

Collaborator

@leoyvens leoyvens left a comment

Code is looking good, I only have a few minor comments. Once this is ready for merging I'll also give it a run locally.

difficulty: block.difficulty,
total_difficulty: block.total_difficulty,
seal_fields: block.seal_fields,
uncles: block.uncles,
Collaborator

Is it correct and worth it to assert that this is empty?

Contributor Author

Good question. I wouldn't think uncles ever report more uncles (that's not how it works). Asserting this might cause failures though. I'd rather have references to uncles of uncles in the resulting data that resolve to null.

chain/ethereum/src/network_indexer/network_indexer.rs
chain/ethereum/src/network_indexer/network_indexer.rs
@@ -0,0 +1,83 @@
""" Block is an Ethereum block."""
type Block @entity {
id: ID!
Collaborator

Before we start rolling this out, we should brush up the work I did on making it possible to make ID equivalent to Bytes for a subgraph so that these id's take only 20 instead of 40 (or 42) bytes to store. It will be transparent to the rest of the code, but requires that we pass a flag to create_subgraph that indicates whether ID should be a String or Bytes.

Collaborator

The code for this is in store::postgres::relational::Layout::new, but it's not exposed to callers, and instead capped at IdType::String - we should expose this up in the callstack as arguments so it can be set in create_subgraph and is stored in the database (maybe as a field on deployment_schemas)

Contributor Author

How much is there to do? And how much, if any, risk does that support introduce? Does it affect clients, filters, anything?

Collaborator

Besides passing IdType::Bytes or IdType::String through when creating a subgraph, the following needs to be done in the Store:

  • Fix up a handful of places in relational_queries.rs where we assume that the id is a String
  • Look at the type of the id column in information_schema.columns when starting a subgraph and decide whether it uses Bytes or String as the id

At the Entity layer, the block explorer would have to make sure to pass id as a Value::Bytes rather than a Value::String.

At the GraphQL layer, users have to pass the id as something that can be converted to Value::Bytes, and we'd have to do that conversion when coercing values. (I could make it so that we convert a Value::String into a Value::Bytes in relational_queries.rs, but that seems a bit hacky)

Collaborator

I looked a little more, and to avoid changing too much in the code base, I think the best course of action is to keep that distinction within the relational mapping code. That means that code that deals with entity ID's continues to use strings, and the conversion from string to bytea and vice versa all happens in relational_queries. For users of the storage layer, the main change is that they might get a new error when the id is not a string in the form 0xdeadbeef.

Those changes should be possible to do in a couple of days. The only change for the block explorer would be to pass IdType::Bytes when creating the schema.
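
To illustrate the string/bytes conversion described here, a standalone sketch follows. It is not the code in relational_queries.rs: `IdError`, `id_to_bytes`, and `bytes_to_id` are hypothetical names, and the error cases are assumptions about what "not a string in the form 0xdeadbeef" could cover.

```rust
/// Errors for IDs that cannot be stored as bytes (hypothetical).
#[derive(Debug, PartialEq)]
pub enum IdError {
    MissingPrefix,
    OddLength,
    InvalidDigit(char),
}

/// Converts an ID of the form `0xdeadbeef` into raw bytes.
pub fn id_to_bytes(id: &str) -> Result<Vec<u8>, IdError> {
    let hex = id.strip_prefix("0x").ok_or(IdError::MissingPrefix)?;
    if hex.len() % 2 != 0 {
        return Err(IdError::OddLength);
    }
    // Decode each character into a 4-bit value, failing on non-hex input.
    let digits = hex
        .chars()
        .map(|c| c.to_digit(16).map(|d| d as u8).ok_or(IdError::InvalidDigit(c)))
        .collect::<Result<Vec<u8>, IdError>>()?;
    // Combine each pair of 4-bit values into one byte.
    Ok(digits.chunks(2).map(|pair| (pair[0] << 4) | pair[1]).collect())
}

/// Converts raw bytes back into the `0x`-prefixed string form.
pub fn bytes_to_id(bytes: &[u8]) -> String {
    let mut id = String::from("0x");
    for byte in bytes {
        id.push_str(&format!("{:02x}", byte));
    }
    id
}
```

Each byte corresponds to two hex characters, which is where the storage saving of the bytes representation comes from.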

Contributor Author

Can we do this separately? We're not going to activate this feature right away.

Collaborator

Yes, totally; we just need to do this before the feature goes live. We can migrate the database after the fact, but it's likely to take long (maybe hours) and during that time, the block explorer would be unavailable.

When this goes into a release, we need to make sure it's behind a feature flag so that users who install that release don't wind up with this data in their database which we would have to migrate.

Collaborator

Opened #1414 to track this properly.

Contributor Author

👍

Contributor Author

It's already behind a --network-subgraphs CLI flag.

@Jannis Jannis force-pushed the jannis/block-explorer-phase-1-v1 branch 2 times, most recently from 99f3dcb to d085330 Compare December 16, 2019 10:24
Set it to `false` for regular blocks and to `true` for ommers.

If they all start with `Failed ...`, they are easier to grep for.

When writing blocks, set the `isOmmer` entity field based on whether the
block being written is an `Ommer` (true) or a `BlockWithOmmers` (false).

This is more idiomatic, apart from `LightEthereumBlock`, where a new
`format()` method is added because `LightEthereumBlock` is a foreign
type that we can't implement `Display` for without a wrapper.

Move these into `graph` so they can be used in other places as
well (like other chain integrations in the future).

This avoids dealing with `Option` blocks in the rest of the indexer and
therefore simplifies things a bit.

Since it's only used in one place right now (`track_future!` in the
network indexer), we can get away with something as simple as

```rust
let start_time = Instant::now();
...
    .inspect(move |_| {
        let duration = start_time.elapsed();
        ...
    })
```

Squashme: remove measure

Instead of collecting all old and new blocks to find the common
ancestor and revert old blocks, we simply revert the local head block
one block at a time and re-evaluate the situation (by polling the chain
head block again and deciding which blocks to look up next). Eventually,
this procedure will revert the local head back to the common ancestor.

For deep reorgs, this will be slow, but about 99% of Ethereum reorgs
have a depth of one, so this is something we can live with easily.
@Jannis Jannis force-pushed the jannis/block-explorer-phase-1-v1 branch from f3363e6 to f92ced3 Compare December 29, 2019 10:31
@Jannis
Copy link
Contributor Author

Jannis commented Dec 29, 2019

@leoyvens @Zerim I've made it so that the following routes / subgraph names are used:

The subgraph name itself becomes network/ethereum/mainnet, network/ethereum/kovan.

The routes become: /subgraphs/network/ethereum/mainnet and /subgraphs/network/ethereum/kovan.
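
The naming scheme above can be written down as a small sketch. The function names and the validation (exactly two non-empty, slash-separated parts) are assumptions made for illustration and are not taken from graph-node.

```rust
/// Builds the subgraph name for a network given as `<chain>/<network>`,
/// e.g. `ethereum/mainnet`. Returns `None` for any other shape
/// (this validation is an assumption of the sketch).
pub fn network_subgraph_name(network: &str) -> Option<String> {
    let mut parts = network.split('/');
    match (parts.next(), parts.next(), parts.next()) {
        (Some(chain), Some(name), None) if !chain.is_empty() && !name.is_empty() => {
            Some(format!("network/{}/{}", chain, name))
        }
        _ => None,
    }
}

/// Builds the route under which the network subgraph is served.
pub fn network_subgraph_route(network: &str) -> Option<String> {
    network_subgraph_name(network).map(|name| format!("/subgraphs/{}", name))
}
```

For `ethereum/kovan`, this yields the name `network/ethereum/kovan` and the route `/subgraphs/network/ethereum/kovan`.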

@Jannis Jannis merged commit 096dab9 into master Dec 30, 2019
Labels
chains/ethereum enhancement New feature or request
4 participants