Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Main] Automate database corruption fix caused by out of sync txdb. #405

Merged
merged 5 commits into from
Nov 24, 2017

Conversation

presstab
Copy link

@presstab presstab commented Nov 21, 2017

Summary:

  • Refactor the automated db corruption repair code to be based on comparing pcoinsTip's best block, with the best block loaded from block index db. If the wallet client is closed forcefully before it completes CheckBlock() it causes a state where the block index is stored to disk but the transaction data is not. The automated repair code will detect on initialization when there is a mismatch between the two databases. If the transaction database is behind the block index database, each block from the best height in the tx database up until the best block of the block index database will be parsed and all transactions recorded and flushed to disk.

  • On block 908000 a block hash checkpoint is added. If the validation process fails to validate block 908000 because it was previously marked invalid, the user will not need to use the reconsiderblock RPC command to clear the status. The code will now see that they are on block 908000, check if the blockhash matches the checkpoint, and reprocess the block.

  • On initial block download (IDB), the wallet client is not triggered to calculate the invalid outpoint map. The invalid outpoint map is used to calculate the accumulator checkpoint for block 908000, so if it never gets populated, then a checkpoint will be generated that does not match the blockheader's accumulator checkpoint, which will result in the block being rejected. Before calculating the checkpoint for 908000 the code now forces the population of the invalid outpoint map, avoiding this scenario.

  • Around 908000 many nodes have a zerocoinDB that is not in its proper state with accumulators properly calculated. If an accumulator checkpoint is rejected in close proximity to block 908000, then a reindexing of the accumulators will automatically be done, and the checkpoint recalculated in an attempt to fix this issue.

  • Deprecate spork_11 and replace it with a value called by Params()

Tested Scenarios:

  • Sync from scratch - works.
  • Reindex - works.
  • Blockchain snapshot that synced from scratch, but got stuck at block 907999 - fixed by running the new wallet.
  • Blockchain snapshot that was experiencing assert(hashPrevBlock == view.GetBestBlock()); from ConnectBlock() due to having a txdb that was behind the blockchain db : fixed by running the new wallet. Note: This is the issue that the majority of nodes that are stuck on block 907999 will have because the wallet froze in the middle of ConnectBlock() causing the txdb to not be flushed to disk.
  • Blockchain snapshot that was experiencing assert(hashPrevBlock == view.GetBestBlock()); from ConnectBlock() due to having a txdb that was ahead of the blockchain db : fixed by restarting wallet with -reindex.

@ghost ghost assigned presstab Nov 21, 2017
@ghost ghost added the review label Nov 21, 2017
@presstab presstab changed the title Automate database corruption fix caused by out of sync txdb. [Main] Automate database corruption fix caused by out of sync txdb. Nov 21, 2017
@presstab presstab assigned presstab and unassigned presstab Nov 21, 2017
Copy link

@Mrs-X Mrs-X left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if (Checkpoints::CheckBlock(pindex->nHeight - 1, block.hashPrevBlock, true)) {

pindex might be NULL if the block header is not yet known

@presstab
Copy link
Author

@Mrs-X good find. Changed to:
if (pindex && Checkpoints::CheckBlock(pindex->nHeight - 1, block.hashPrevBlock, true)) {

@presstab presstab removed the Bug label Nov 22, 2017
@Fuzzbawls
Copy link
Collaborator

confirmed that a fresh sync using a bootstrap file now properly handles the accumulator/supply recalculations at block 908000

if (pindex->GetBlockHeader().nVersion >= Params().Zerocoin_HeaderVersion())
nZerocoinStart = pindex->nHeight;
else if (nZerocoinStart)
break;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would have helped to know you were looping backwards here. (comment). Also would a "==" have worked?

// GetCheckpoint could have terminated due to a shutdown request. Check this here.
if (ShutdownRequested())
break;
return InitError(_("Failed to calculate accumulator checkpoint"));

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what would the reason be for failure here?

src/main.cpp Outdated
@@ -3196,6 +3198,59 @@ bool RecalculatePIVSupply(int nHeightStart)
return true;
}

bool ReindexAccumulators(list<uint256> listMissingCheckpoints, string& strError)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't list be a Const Reference here? Such as "const list<uint256> & listMissingCheckpoints" ? Otherwise the whole list is needlessly copied. If not a "const" input, then it would need to be a reference if it's an output, otherwise only modified internal to this routine

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The way I have it coded, it erases a checkpoint from listMissingCheckpoints once it finds/processes it. Unable to do that with const ref.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, then why not just regular reference?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will do

src/main.cpp Outdated
bool ReindexAccumulators(list<uint256> listMissingCheckpoints, string& strError)
{
// PIVX: recalculate Accumulator Checkpoints that failed to database properly
if (!listMissingCheckpoints.empty() && chainActive.Height() >= Params().Zerocoin_StartHeight()) {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Below looks like a cut-and-paste section from earlier code? Is it worthwhile making a function?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is exactly what this code is. You were commenting on the removed code above.

It was formerly not a function and just a one time procedure ran in init.cpp. Now it is accessible anytime with ReindexAccumulators()

}

string strError;
if (!ReindexAccumulators(listCheckpoints, strError) || !CalculateAccumulatorCheckpoint(pindex->nHeight, nCheckpointCalculated))

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it OK to pass a potentially empty "listCheckpoints" to ReindexAccumulators?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Empty list will skip the procedure within the function and return true.

} else {
return state.DoS(100, error("ConnectBlock() : failed to calculate accumulator checkpoint"));
}
}

if (nCheckpointCalculated != block.nAccumulatorCheckpoint) {
LogPrintf("%s: block=%d calculated: %s\n block: %s\n", __func__, pindex->nHeight, nCheckpointCalculated.GetHex(), block.nAccumulatorCheckpoint.GetHex());
return state.DoS(100, error("ConnectBlock() : accumulator does not match calculated value"));

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if possible more text plausing possible reasons for errors would be quite useful to other people looking at this code/error message, as it's difficult to reason through the various levels as to why something is happening

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good suggestion. Adding an error message that will feed into the reindex popup.

pull_405_popup

@@ -4665,9 +4758,20 @@ bool AcceptBlock(CBlock& block, CValidationState& state, CBlockIndex** ppindex,
if (mi == mapBlockIndex.end())
return state.DoS(0, error("%s : prev block %s not found", __func__, block.hashPrevBlock.ToString().c_str()), 0, "bad-prevblk");
pindexPrev = (*mi).second;
if (pindexPrev->nStatus & BLOCK_FAILED_MASK)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Repeated code below?

src/main.cpp Outdated
while (nSortedPos < vSortedByHeight.size()) {
CBlock block;
assert(ReadBlockFromDisk(block, pindex));

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would be good to have a pop-up or some explanation for user rather than just asserting and crashing

@Mrs-X
Copy link

Mrs-X commented Nov 24, 2017

utACK and merging...

@Mrs-X Mrs-X merged commit 8469261 into PIVX-Project:master Nov 24, 2017
Mrs-X added a commit that referenced this pull request Nov 24, 2017
…sync txdb.

8469261 Deprecate spork_11. (presstab)
15c355b Ask for sporks from 1st peer. Consider Verifying Blocks as IDB. (presstab)
43af66b Add message popup when db corrupt. Make ReindexAccumulators() use ref list. (presstab)
3eabaa2 Remove unused code and add null check. (presstab)
5654732 Automate database corruption fix caused by out of sync txdb. (presstab)

Tree-SHA512: 6c612d7a8e6a80c456514c402afb161932d947fa9761544ab6f53df6932969095bfd4022a2f426913bcf1456fc5369b654c58cbfd5065081ee29634843dfe7c4
@ghost ghost removed the review label Nov 24, 2017
@presstab presstab mentioned this pull request Nov 28, 2017
Fuzzbawls pushed a commit to Fuzzbawls/PIVX that referenced this pull request Nov 29, 2017
Fuzzbawls pushed a commit to Fuzzbawls/PIVX that referenced this pull request Nov 29, 2017
Fuzzbawls pushed a commit to Fuzzbawls/PIVX that referenced this pull request Nov 29, 2017
Fuzzbawls pushed a commit to Fuzzbawls/PIVX that referenced this pull request Nov 29, 2017
Fuzzbawls pushed a commit to Fuzzbawls/PIVX that referenced this pull request Nov 29, 2017
Github-Merge: PIVX-Project#405
Rebased-From: 8469261
meyer9 pushed a commit to phoreproject/Phore that referenced this pull request Feb 15, 2018
meyer9 pushed a commit to phoreproject/Phore that referenced this pull request Feb 15, 2018
meyer9 pushed a commit to phoreproject/Phore that referenced this pull request Feb 15, 2018
Remove invalid old invalid output logic from Dash fork.

Github-Merge: PIVX-Project#405
Rebased-From: 8469261
meyer9 pushed a commit to phoreproject/Phore that referenced this pull request Feb 20, 2018
meyer9 pushed a commit to phoreproject/Phore that referenced this pull request Feb 20, 2018
meyer9 pushed a commit to phoreproject/Phore that referenced this pull request Feb 20, 2018
Remove invalid old invalid output logic from Dash fork.

Github-Merge: PIVX-Project#405
Rebased-From: 8469261
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants