Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

core/snailchain.go: continue to iteration if ErrPrunedAncestor #19105

Closed
wants to merge 2 commits into from
Closed

core/snailchain.go: continue to iteration if ErrPrunedAncestor #19105

wants to merge 2 commits into from

Conversation

ywzqwwt
Copy link
Contributor

@ywzqwwt ywzqwwt commented Feb 15, 2019

if block is not nil and err is ErrPrunedAncestor,we should continue to WriteBlockWithState,just this can reorg if externTd.Cmp(localTd) > 0

@holiman
Copy link
Contributor

holiman commented Feb 15, 2019

Two things,

  • Any chain of blocks that make it into insertChain are contiguous, so that it'll always be a chain of parent, child, child2, child3 etc.
  • Thus, the case of childN having a pruned ancestor but parent does not is impossible,

So pruned ancestor should immediately throw the whole (side)chain into importSideChain.

Your fix tries to handle pruned ancestor in a place where that really should not occur. If it does, then something is wrong, so I'd be interested to hear why you proposed this change ?

@holiman
Copy link
Contributor

holiman commented Feb 15, 2019

Your PR also adds an extra goroutine which has a side-effect of collecting remote transasctions in memory (in channels). I'm not sure that's good, and should either way be split out into a separate PR.

Also, the transaction pools are highly complex and also very important, so we generally want testcases along with modifications in that area.

@ywzqwwt
Copy link
Contributor Author

ywzqwwt commented Feb 15, 2019

why i add err == consensus.ErrPrunedAncestor is that,for example a Canonical chain is 100 101 102 103,Sidechain is 100 101 102 103,and 99 is there common ancestor,but now Sidechain's td is bigger than Canonical, if do not add err == consensus.ErrPrunedAncestor,it can not replace the Canonical chain,because the err is ErrPrunedAncestor for the number 101.

@holiman
Copy link
Contributor

holiman commented Feb 15, 2019

So, in the case:
Canon: 200,
Side: 201,
Common ancestor: 99.
Pruned up to: 100.

What should happen is that the sideblocks get delivered and stored. At some point, during sidechain import, we notice that sidechain just went ahead of canon. At that point, we're here:

https://github.com/ethereum/go-ethereum/blob/master/core/blockchain.go#L1350

We will then read back as much as we have to, (in this case 0-99) here:

https://github.com/ethereum/go-ethereum/blob/master/core/blockchain.go#L1356

to obtain the state that was pruned. So we put together a chain of blocks, 0-99-201.We then feed it into insertChain in chunks of 2048 blocks here: https://github.com/ethereum/go-ethereum/blob/master/core/blockchain.go#L1396 .

So, the scenario you described, insertSideChain would collect 0-201 and feed them as on chunk of bocks into insertChain. There's no possibility of hitting an ErrPrunedAncestor that way. Or rather, if there is, please let us know how you did it.

@holiman
Copy link
Contributor

holiman commented Feb 15, 2019

cc @karalabe am I correct in assuming that the pruner will not eat away more state during the time this whole thing happens?

@ywzqwwt
Copy link
Contributor Author

ywzqwwt commented Feb 15, 2019

for your case, it will insertchain 100-201,100 verifybody will return err=nil,but 101's err would return pruner ancestor.

@ywzqwwt
Copy link
Contributor Author

ywzqwwt commented Feb 15, 2019

so the Side will not replace the Canon,because 101 not match err ==nil,https://github.com/ethereum/go-ethereum/blob/master/core/blockchain.go#L1356 will pass

@holiman
Copy link
Contributor

holiman commented Feb 15, 2019

for your case, it will insertchain 100-201,100 verifybody will return err=nil,but 101's err would return pruner ancestor.

So.. you mean that the first block is 100, which is already canon (but pruned) and then continues on 101 but finds that that one has a pruned ancestor?

If 100 is already canon, then that one should yield ErrKnownBlock here: https://github.com/ethereum/go-ethereum/blob/master/core/blockchain.go#L1172 .

Hm. I see, that's a problem. I don't think the way you did it is correct. Rather, after throwing away all the known blocks, we should invoke bc.insertSideChain if the first of the remaining is pruned. I'm not sure what the most elegant construct is.

@ywzqwwt
Copy link
Contributor Author

ywzqwwt commented Feb 17, 2019

no,ErrKnownBlock is just for his parent block is in Canon chain. here: https://github.com/ethereum/go-ethereum/blob/master/core/block_validator#L53 .but in this case the 100 is in SideChain.

@ywzqwwt
Copy link
Contributor Author

ywzqwwt commented Feb 17, 2019

for the case above:
Canon: 200,
Side: 201,
Common ancestor: 99.
Pruned up to: 100.

@holiman
Copy link
Contributor

holiman commented Feb 17, 2019

I'll make some testcases around this, then consider the best way to deal with it. It's not a serious issue for mainnet, but definitely should be handled properly.

@ywzqwwt
Copy link
Contributor Author

ywzqwwt commented Feb 17, 2019

the insertSidechain will call insertchain(100-201),but at https://github.com/ethereum/go-ethereum/blob/master/core/blockchain.go#L1356,the block 100's err is nil,the for loop will break,because 101 is ErrPrunedAncestor, if i add err == nil || err == consensus.ErrPrunedAncestor, the for loop will not break,the old chain 100-200 will be replaced by the new 100-201.

@ywzqwwt
Copy link
Contributor Author

ywzqwwt commented Feb 17, 2019

and i also tested,add this judge,the new chain replaced the old chain(if not add the old chain can not be replaced),and i also run the test testReorg(in blockchain_test.go),both err == nil || err == consensus.ErrPrunedAncestor and err == nil can pass.

@holiman
Copy link
Contributor

holiman commented Feb 17, 2019

Please take a look at #19113 , and see if it captures the scenario you described. It does capture the scenario I described, but I'm not sure we're talking about the same scenario(s).
If it does not, perhaps you can change the parameters of the testcase I made, to reflect what you're seeing.

@ywzqwwt
Copy link
Contributor Author

ywzqwwt commented Feb 19, 2019

I want to confirm the process with you.In the above example,the number 99 is their Common ancestor.Now Sidechain td is bigger than Canonchian,so it will print log.Info("Importing sidechain segment", "start", 100, "end", 201) here: https://github.com/ethereum/go-ethereum/blob/master/core/blockchain.go#L1395 . now call insertChain(blocks, false) and the blocks is 100-201. block, err := it.next() at : https://github.com/ethereum/go-ethereum/blob/master/core/blockchain.go#L1148 will return (the block which number is 100,nill) because the 100's parent 99 is in Canonchian.it will be case SideStatTy at: https://github.com/ethereum/go-ethereum/blob/master/core/blockchain.go#L1251 (because the new 100's td is lower than the old 100's td),when block, err = it.next() will return(the block will number is 101,ErrPrunedAncestor) at: https://github.com/ethereum/go-ethereum/blob/master/core/blockchain.go#L1189 and the for loop will break because now err is not nill.so the Sidechain 100-201 will not replace the old Canonchian 100-200.because in the case above when sidechain's number is 201,the td is bigger than the Canonchian 200.but when number is 101 the for loop is broke,so the Sidechain 100-201 will not have the chance to replace the old Canonchian 100-200.

@holiman
Copy link
Contributor

holiman commented Feb 19, 2019

So the scenario you describe:

Afaict, it works correctly, with the patch I made in that PR to remove (trimleft) already canon blocks from the beginning of the chain to import.

Did you see my testcase(s)? That testcase is pretty easy to extend to a different scenario, it may be set up your scenario in code rather than text.

@ywzqwwt
Copy link
Contributor Author

ywzqwwt commented Feb 20, 2019

yes,i have seen your testcase(s).but i think you did not catch my point of view.my point is that when the sidechain is try to instead the old chain.for the case above,when insertchain(100-201),when the number is 101,ValidateBody at: https://github.com/ethereum/go-ethereum/blob/master/core/block_validator.go#L51 will return an err,and the err is not nil,so the loop will break,here: https://github.com/ethereum/go-ethereum/blob/master/core/blockchain.go#L1189 ,so the Canonchian is also 100-200,not 100-201.

@holiman
Copy link
Contributor

holiman commented Feb 21, 2019

@ywzqwwt did you ever hit this issue for real -- or just by browsing the code? Because that scenario that's being handled by my fix should not happen if the findAncestor works correctly.

@karalabe karalabe closed this in 8577b5b Feb 21, 2019
@karalabe
Copy link
Member

Superseded by #19113

@ywzqwwt
Copy link
Contributor Author

ywzqwwt commented Feb 22, 2019

I've encountered it when i use more than ten machine to test this scene. and findAncestor will find 99 not 0,so insertsidechain will call insetchain(100-201),You misunderstood me.so you cut 0-99 in the new cold.

@ywzqwwt
Copy link
Contributor Author

ywzqwwt commented Feb 22, 2019

And when sidechain's td is bigger than Canonchian the method insertSidechain will call insertChain not InsertChain.

@holiman
Copy link
Contributor

holiman commented Feb 22, 2019

Right, so here's what is supposed to happen

  1. findAncestor will find 99 as common ancestor, not 0,so
  2. will call Insertchain(100-201) (Capital I)
  3. since parent(100) == 99 is pruned,
    3.1 ErrPrunedAncestor: will go into insertSideChain
    3.2 Will store blocks 100-201, until it detect that td will beat canon
    3.3 Will collect parents until it finds one where state is non-pruned (in this case genesis).
    3.4 Will call insertChain(0-201).

So what point above does not match up with what you observe?

@ywzqwwt
Copy link
Contributor Author

ywzqwwt commented Feb 22, 2019

3.since parent(100) == 99 is pruned, no, 100 is not pruned,because his parent 99 is exist. the error will return nil.

@holiman
Copy link
Contributor

holiman commented Feb 22, 2019

In that case, this is the scenario:

Right, so here's what is supposed to happen

  1. findAncestor will find 99 as common ancestor, not 0,so
  2. will call Insertchain(100-201) (Capital I)
  3. The first err is nil, it will fall directly down into // No validation errors for the first block,
    3.1 and loop through blocks 100 to 201.
    3.2 Eventually, the bc.writeBlockWithState call will do the reorg, when we hit block 201.

At no point should we hit ErrPrunedAncestor.

A pruned ancestor means that the state for the parent missing -- not that the block is missing, but that in order to (fully) process a child block, we need to iterate old blocks to regenerate the state.

So, same question again, what point does not match up with what you observe?

And btw, thanks a lot for your patience in this, I highly appreciate extra eyes on these things, since it's pretty intricate and very important to have right.

So what point above does not match up with what you observe?

kiku-jw pushed a commit to kiku-jw/go-ethereum that referenced this pull request Mar 29, 2019
…#19113)

* blockchain: more tests for sidechain import, fixes ethereum#19105

* core/blockchain: rework import of pruned canon blocks and canon-prepended sidechains

* core/blockchain: minor clarity change

* core/blockchain: remove unused method
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants