Hydra Indexer unstable during runtime upgrade #4741
Comments
What runtime upgrade are you talking about here?
This would be ephesus to nara, spec version 2001 -> 2002. (The indexer logs, however, show the spec updating to 3000; that was the originally planned version, but we have since reverted to 2002.)
Upon further investigation, the failures to decode storage values happen for the one or more blocks that were being fetched just before the node goes through a runtime upgrade and the polkadot-js api detects it. These blocks are always before the runtime upgrade block.
The indexer uses a ...
While the indexer is happily indexing blocks and the node runtime is updated, the polkadot-js api internally detects this through metadata subscription updates. It then updates the default registry with the new metadata. I believe there is a data race that causes the decoration for the failing blocks to be either overridden/corrupted or not created for the correct version while the api is updating the default registry. (I think it is because the default registry was also being used for the decorated api.) I can't pinpoint it, but the polkadot-js core maintainers might be able to find it more easily.
A few things I tried:
Related issues:
Fixed in Joystream/hydra#534
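The suspected race from the comments above can be illustrated with a small, self-contained TypeScript model. This is only a sketch under stated assumptions: it does not use the polkadot-js API, it is not the fix from Joystream/hydra#534, and every name in it (TypeRegistrySketch, SharedRegistryDecoder, PerVersionDecoder, the "version:body" payload format, the block heights 605 and 606) is hypothetical. It only shows why decoding blocks that are still in flight against a single mutable default registry can fail once that registry is swapped to the new spec version, while a registry looked up per spec version keeps working.

```typescript
// Hypothetical model of a type registry tied to one spec version.
interface TypeRegistrySketch {
  specVersion: number;
  decode(payload: string): string;
}

// Hypothetical block: the payload is tagged with the spec version that encoded it.
interface IndexedBlock {
  height: number;
  specVersion: number;
  payload: string; // assumed format "<specVersion>:<body>"
}

function makeRegistry(specVersion: number): TypeRegistrySketch {
  return {
    specVersion,
    decode(payload: string): string {
      const sep = payload.indexOf(":");
      const encodedWith = Number(payload.slice(0, sep));
      // A registry can only decode data encoded with its own type layout.
      if (encodedWith !== specVersion) {
        throw new Error(
          `cannot decode v${encodedWith} payload with v${specVersion} registry`
        );
      }
      return payload.slice(sep + 1);
    },
  };
}

// Every decode goes through one shared, mutable "default" registry.
class SharedRegistryDecoder {
  private current: TypeRegistrySketch;
  constructor(specVersion: number) {
    this.current = makeRegistry(specVersion);
  }
  // Called when the upgrade is detected; replaces the registry for everyone.
  onRuntimeUpgrade(specVersion: number): void {
    this.current = makeRegistry(specVersion);
  }
  decode(block: IndexedBlock): string {
    return this.current.decode(block.payload);
  }
}

// Each block is decoded with a registry cached for that block's spec version.
class PerVersionDecoder {
  private cache = new Map<number, TypeRegistrySketch>();
  decode(block: IndexedBlock): string {
    let registry = this.cache.get(block.specVersion);
    if (registry === undefined) {
      registry = makeRegistry(block.specVersion);
      this.cache.set(block.specVersion, registry);
    }
    return registry.decode(block.payload);
  }
}

function tryDecode(decode: (b: IndexedBlock) => string, block: IndexedBlock): string {
  try {
    return decode(block);
  } catch (e) {
    return `error: ${(e as Error).message}`;
  }
}

// Blocks fetched just before the upgrade block, still waiting to be decoded.
const inFlight: IndexedBlock[] = [
  { height: 605, specVersion: 2001, payload: "2001:events-605" },
  { height: 606, specVersion: 2001, payload: "2001:events-606" },
];

const shared = new SharedRegistryDecoder(2001);
shared.onRuntimeUpgrade(2002); // upgrade detected before the in-flight blocks are decoded
const perVersion = new PerVersionDecoder();

for (const block of inFlight) {
  console.log(
    block.height,
    "| shared:", tryDecode((b) => shared.decode(b), block),
    "| per-version:", tryDecode((b) => perVersion.decode(b), block)
  );
}
```

In this model the shared decoder fails for both pre-upgrade blocks, because the default registry has already moved to 2002, while the per-version decoder still decodes them. Whether the real failure has exactly this shape is the open question raised in the comment above.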
Problem
When running the runtime upgrade integration tests, the runtime upgrade itself succeeds, but the test fails at the step that waits for the runtime upgrade proposal status to go from "ProposalStatusGracing" to "ProposalStatusExecuted". The QN seems not to be properly processing the runtime upgrade block that contains the "ProposalExecuted" event, and the processor stops processing events altogether.
The first few lines of the indexer logs after the runtime upgrade show an RPC-CORE error, which comes from polkadot-js/api, immediately after the upgrade.
If we allow the indexer to keep running, eventually:
Recovery
Although the indexer is observed to make progress fetching additional blocks, the processor behaves correctly, i.e. it doesn't make progress. That is a good thing (otherwise we would get inconsistent state).
When the indexer restarts, it continues processing from a couple of blocks before the runtime upgrade block, logging a few "index-builder:indexer Block N has already been indexed" lines, until it finally makes progress. The indexer can also be restarted manually (without resetting the db), and it "recovers" in the same way.
Some questions:
Relevant runtime type changes that are at the root cause of this behavior?
Some background about what type changes in the substrate runtime are possibly causing this.
Note that each block has a timestamp.set extrinsic added by the block producer. Therefore each block produced will have a system.extrinsicSuccess event, and therefore a DispatchInfo type.
Before Upgrade
After Upgrade
Another change in the new Substrate, probably not relevant but added just in case:
I think it only affects the storage_info annotation and how benchmarks count reads and writes to the storage key. The change in the new Substrate is that this is now an unbounded (git diff):
Indexer Logs
Runtime upgrade at block 607