Add timestamp to _meta.block #3060

Closed
schmidsi opened this issue Dec 13, 2021 · 5 comments · Fixed by #3738

@schmidsi
Member

schmidsi commented Dec 13, 2021

Do you want to request a feature or report a bug?

A feature

What is the current behavior?

If a subgraph consumer wants to know the freshness of their subgraph, they can send a query like:

{
  _meta {
    block {
      number
    }
  }
}

Now they know up to which block the subgraph is indexed, but without another source for the latest block number (a JSON-RPC endpoint such as Infura or Alchemy) they do not know how fresh the subgraph is.

What is the expected behavior?

It would be much simpler to also be able to query the timestamp of the latest block that the subgraph has indexed, like:

{
  _meta {
    block {
      number
      timestamp
    }
  }
}

With a timestamp, they can directly calculate the freshness by comparing it to the system time.
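
The comparison described above can be sketched as follows. This is only an illustration, written in Rust to match the graph-node snippets later in this thread; a front-end would do the same arithmetic in its own language. It assumes the proposed `timestamp` field is expressed in seconds since the Unix epoch, and `block_age` is a hypothetical helper name, not part of any existing API.

```rust
use std::time::{Duration, SystemTime, UNIX_EPOCH};

/// Age of a block relative to `now`, given the block timestamp in seconds
/// since the Unix epoch (an assumption about the proposed field's unit).
/// Returns zero if the block timestamp lies ahead of `now`, e.g. because
/// of clock skew between the chain and the local machine.
fn block_age(block_timestamp_secs: u64, now: SystemTime) -> Duration {
    let block_time = UNIX_EPOCH + Duration::from_secs(block_timestamp_secs);
    now.duration_since(block_time).unwrap_or(Duration::ZERO)
}
```

A consumer would call it with the system clock, for example `block_age(timestamp, SystemTime::now())`.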

schmidsi changed the title from "Add" to "Add timestamp to _meta.block" on Dec 13, 2021
@azf20
Contributor

azf20 commented Dec 15, 2021

Discussed yesterday with @lutter and @leoyvens: this is not a quick win, as it would require a call to get the timestamp from the block cache. Currently the _meta field just uses the hash and number, which are stored on the deployment.

@leoyvens
Collaborator

Actually, for a query by block hash, going to the block cache is always necessary anyway, so this wouldn't add overhead in that case, which is the relevant one in the network.

@schmidsi
Member Author

schmidsi commented May 6, 2022

@azf20 Circling back on this. This was an issue yesterday, when we did not receive new blocks from one of the chains. The front-ends cannot detect this unless they have a second source of block info to compare against. Do you still think this is not a quick win?

@mangas
Contributor

mangas commented Jul 11, 2022

For the hash constraint:
Currently it looks like for a hash we only call store.block_number, which returns Option<BlockNumber>. To introduce the timestamp we would need to either change the BlockNumber type or introduce a new trait function that gets either the entire Block or number + timestamp.

The ChainStore trait already has something similar, which I think probably needs a new type here in order to return more fields (instead of adding to the tuple):

    /// Find the block with `block_hash` and return the network name and number
    fn block_number(&self, hash: &BlockHash) -> Result<Option<(String, BlockNumber)>, StoreError>;
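
One way the "new type instead of a bigger tuple" idea could look is sketched below. Everything here is hypothetical: `BlockInfo`, `ChainStoreSketch`, `block_info` and `InMemoryStore` are names invented for illustration, and `BlockNumber`, `BlockHash` and `StoreError` are simplified stand-ins so the sketch compiles on its own. It is not the change that was eventually merged.

```rust
use std::collections::HashMap;

// Simplified stand-ins for the graph-node types of the same name
// (assumptions made so that this sketch is self-contained).
type BlockNumber = i32;
type BlockHash = Vec<u8>;

#[derive(Debug)]
struct StoreError;

/// Hypothetical named return type replacing the `(String, BlockNumber)` tuple.
#[derive(Debug, Clone, PartialEq)]
struct BlockInfo {
    network: String,
    number: BlockNumber,
    /// Seconds since the Unix epoch; `None` if no timestamp is stored.
    timestamp: Option<u64>,
}

/// Hypothetical trait method returning the richer type.
trait ChainStoreSketch {
    /// Find the block with `hash` and return network, number and timestamp.
    fn block_info(&self, hash: &BlockHash) -> Result<Option<BlockInfo>, StoreError>;
}

/// Toy store backed by a map, standing in for the block cache.
struct InMemoryStore {
    blocks: HashMap<BlockHash, BlockInfo>,
}

impl ChainStoreSketch for InMemoryStore {
    fn block_info(&self, hash: &BlockHash) -> Result<Option<BlockInfo>, StoreError> {
        // A missing block is `Ok(None)`, mirroring the existing signature.
        Ok(self.blocks.get(hash).cloned())
    }
}
```

A named struct lets further fields be added later without touching every call site that destructures the tuple.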

For number constraint:

It looks like we don't actually query anything here, so in this case the timestamp either cannot be provided or we'd need to add the call here as well.

    BlockConstraint::Number(number) => {
        check_ptr(state, number)?;
        // We don't have a way here to look the block hash up from
        // the database, and even if we did, there is no guarantee
        // that we have the block in our cache. We therefore
        // always return an all zeroes hash when users specify
        // a block number
        // See 7a7b9708-adb7-4fc2-acec-88680cb07ec1
        Ok(BlockPtr::from((web3::types::H256::zero(), number as u64)))
    }

Lastly, the issue seems to be figuring out the freshness of the last block being served specifically. If this is the case, adding a timestamp doesn't distinguish between the block not being ingested and the block not being produced by the chain (during an outage).

Perhaps just having meta fields for last_block_ingestion_ts and last_block_update_ts could solve the problem. Having these periodically updated against the latest ingestion could provide enough visibility and would be cheaper to calculate and update. Thoughts?

@schmidsi
Member Author

Thanks for the follow-up. Since the whole query result, including { _meta { block { ... } } }, needs to be the same across all indexers, it cannot contain information about which block an individual indexer saw last. To know this, teams usually query the indexer-status endpoint, which on the hosted service is here: https://api.thegraph.com/index-node/graphql

There you can use chainHeadBlock and latestBlock to check whether the subgraph has caught up. Still, this does not resolve the issue if the underlying JSON-RPC endpoint does not retrieve any new blocks even though the chain itself is still producing them.

So with this very simple tool, a consumer can always know how fresh its data is, regardless of the underlying complexities. Stable chains usually have more or less predictable block times, and some dapps are happy with data that is 2-10 blocks old. So they can do some simple math in the front-end, like "average block time" * "acceptable number of blocks behind", and then check the freshness and display a warning if the data is outdated.
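
The threshold math above can be sketched as below, again in Rust only to match the other snippets in this thread. `max_age_secs` and `is_outdated` are hypothetical helper names, and all values are assumed to be in seconds.

```rust
/// Maximum tolerated data age in seconds:
/// "average block time" * "acceptable number of blocks behind".
fn max_age_secs(avg_block_time_secs: u64, acceptable_blocks_behind: u64) -> u64 {
    avg_block_time_secs.saturating_mul(acceptable_blocks_behind)
}

/// True when the indexed block is older than the tolerated age, i.e. when
/// the front-end should display a warning. A block timestamp ahead of
/// `now_secs` counts as age zero.
fn is_outdated(block_timestamp_secs: u64, now_secs: u64, max_age: u64) -> bool {
    now_secs.saturating_sub(block_timestamp_secs) > max_age
}
```

For example, with an assumed average block time of 13 seconds and a tolerance of 10 blocks, data older than 130 seconds would trigger the warning.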

The idea is also that the front-end does not need to send a query to the indexer-status endpoint and to the graph-node endpoint to know about data freshness.
