Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[FIX] Return executed blocks sorted on execution error #772

Merged
merged 3 commits into from
Nov 4, 2020

Conversation

ntallar
Copy link

@ntallar ntallar commented Nov 3, 2020

Description

Fixes missing blocks on testnet, caused by:

  1. An invalid block (1505: 0b) was generated by the miner of the node (due to https://jira.iohk.io/browse/ETCM-301), probably caused by an expected race condition between the miner and the importing process
  2. The above block import required a reorg of more than one block. From [1502: 8c, 1503: 87, 1504: 6c] to [1502: 8a, 1503: 5a, 1504: 0b, 1505: 0b] (the list are: [block number: first two bytes of the hash])
  3. Failure in executing it caused the reorg to be reverted and BlockImport.revertChainReorganisation to be called with the executedBlocks=[1505: 0b, 1504: 0b, 1503: 5a, 1502: 8a]
  4. This caused the call: removeBlocksUntil(0b, 1502), meaning that all blocks from the db were deleted
  5. Eventually block 1506 had an ommer whose parent was deleted in the last step, meaning that it was marked as invalid by this node with reason OmmersNotValidError. As this was the chain followed by the two other miners, the testnet irrevocably forked

Simple scenario to reproduce:

Pre-requisites:

  • Due to https://jira.iohk.io/browse/ETCM-329 the mocked miner generates invalid blocks in some occasions, the ValidationAfterExecError(s"Block has invalid state root hash, expected ${Hex.toHexString(header.stateRoot.toArray)} but got ${Hex.toHexString(stateRootHash.toArray)}") validation has to be disabled
  • Patch Mantis to consider a block to be invalid, I did so by changing BlockExecution.executeAndValidateBlock so that all blocks with number 13 are considered invalid

Reproducing it

  1. Start a node and mine 12 blocks in a mocked way
  2. Mine 5 blocks having the block 8 as it's parent
  3. As the last mined block has number 13 it will be detected as invalid
  4. Query the blocks by number (through JSON RPC) and check that blocks 1-7 were deleted

Proposed Solution

Return the executedBlocks in proper order

Testing

  • Run the scenario to reproduce and see that it no longer causes any issues
  • No test was added due to the complexity of producing this exact scenario

@ntallar ntallar added the bug Something isn't working label Nov 3, 2020
@@ -139,7 +139,7 @@ class BlockExecution(
if (remainingBlocks.isEmpty) {
(executedBlocks.reverse, None)
} else if (error.isDefined) {
(executedBlocks, error)
(executedBlocks.reverse, error)
Copy link
Contributor

@mirkoAlic mirkoAlic Nov 4, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i'm struggling with this analysis in the code. Maybe as a work arround we could add a suffix at variables name level, in order to know if the list is in incremental or decremental order. WDYT?

ex.
remainingBlocks -> remainingBlocksIncOrder // [bn=1, bn=2, bn=3]
executedBlocks -> executedBlocksDecOrder // [bn=3, bn=2, bn=1]

And also add a comment regarding the order of the list returned by this function

@mirkoAlic
Copy link
Contributor

wdyt about adding a regression test at least at integration test level?

@ntallar ntallar force-pushed the fix-deleting-all-on-reverted-reorg branch from 9977286 to 64b1b41 Compare November 4, 2020 12:17
@ntallar ntallar marked this pull request as ready for review November 4, 2020 12:17
@ntallar
Copy link
Author

ntallar commented Nov 4, 2020

Regression test at an integration level wasn't that easy, I finally added one at unit level that was quite straightforward to add

@ntallar ntallar force-pushed the fix-deleting-all-on-reverted-reorg branch from 0a406e4 to 58ad5d5 Compare November 4, 2020 15:35
Copy link
Contributor

@mirkoAlic mirkoAlic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@ntallar ntallar merged commit b8e013d into develop Nov 4, 2020
@ntallar ntallar deleted the fix-deleting-all-on-reverted-reorg branch November 4, 2020 19:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants