core/les: import header chains in batches #19456

Closed
wants to merge 1 commit into from

holiman
Contributor

@holiman holiman commented Apr 12, 2019

This PR tries to improve a couple of things in fast-sync.

  • Process headers in chains, not one-by-one
    • This means db writes will be batched,
    • Also means that there is less reading-back what we just wrote
  • A minor change in ethash uncle validation to do fewer lookups of ancestors' uncles (to not have to load the full ancestor block in most cases)
  • Makes HasHeader a part of the ChainReader interface. This can save a few lookups where we don't have to load from disk
  • Increase numberCache
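The first bullet can be illustrated with a minimal sketch. It is not the code from this PR: `store`, `batch`, `writeOneByOne` and `writeBatched` are hypothetical names, and the "database" is an in-memory map that only counts commits, so the difference between per-header writes and one batched write is visible.

```go
package main

import "fmt"

// store is a hypothetical in-memory stand-in for the chain database.
// It counts commits so the two write strategies can be compared.
type store struct {
	data    map[string][]byte
	commits int
}

func newStore() *store { return &store{data: make(map[string][]byte)} }

// put writes a single key and counts as one commit.
func (s *store) put(key string, val []byte) {
	s.data[key] = val
	s.commits++
}

// batch buffers writes in memory until flush is called.
type batch struct {
	s       *store
	pending map[string][]byte
}

func (s *store) newBatch() *batch {
	return &batch{s: s, pending: make(map[string][]byte)}
}

func (b *batch) put(key string, val []byte) { b.pending[key] = val }

// flush applies all buffered writes as a single commit.
func (b *batch) flush() {
	for k, v := range b.pending {
		b.s.data[k] = v
	}
	b.s.commits++
	b.pending = make(map[string][]byte)
}

// writeOneByOne commits every header separately.
func writeOneByOne(s *store, headers []string) {
	for _, h := range headers {
		s.put(h, []byte(h))
	}
}

// writeBatched buffers the whole chain segment and commits once.
func writeBatched(s *store, headers []string) {
	b := s.newBatch()
	for _, h := range headers {
		b.put(h, []byte(h))
	}
	b.flush()
}

func main() {
	headers := []string{"h1", "h2", "h3", "h4"}
	a, b := newStore(), newStore()
	writeOneByOne(a, headers)
	writeBatched(b, headers)
	fmt.Println(a.commits, b.commits) // prints "4 1"
}
```

Both stores end up with the same contents; only the number of commits differs, which is the effect the batching in this PR is after.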

Needs some general cleanup, and probably some tests will fail, and it needs some fixes for LES to keep working, but I'll post some charts here later on.

@holiman holiman requested a review from karalabe as a code owner April 12, 2019 15:06

holiman commented Apr 13, 2019

This chart is from this PR but with an additional fix that I later removed for this PR, where I also modded the db interface for state lookups -- which I think is unnecessary.

Screenshot_2019-04-13 Enterprise Benchmark Dashboard Pro+ Datadog

The linear section before the marker is when it downloads headers. The less linear section after is when it wraps up the receipts/bodies. This PR is markedly lower on db reads during this section.
When both were finished, these were the stats:

  • datadir size 158 GB (both)
  • db writes 1.65 TB vs 2.25 TB
  • db reads 1.48 TB vs 2.04 TB

I'll spin up a new benchmark with this PR.

holiman commented Apr 13, 2019

Ok, restarted on mon08/mon09.

holiman commented Apr 21, 2019

I increased the size of the numberCache from 2048 to 4096. That cache is used by HasHeader, and is very cheap (it's all uint64s) compared to a db lookup.
I also shaved some db lookups from the import of block bodies, on the assumption that if block N is not complete, then its child block is not complete either.
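A rough sketch of why a hash-to-number cache makes a HasHeader-style check cheap. The types below are hypothetical and not geth's: the cache evicts in simple insertion order rather than LRU, hashes are plain strings, and the database is a map with a read counter.

```go
package main

import "fmt"

// numberCache is a simplified stand-in for an LRU cache mapping header
// hashes to block numbers. It evicts in insertion (FIFO) order to stay short.
type numberCache struct {
	limit int
	order []string
	nums  map[string]uint64
}

func newNumberCache(limit int) *numberCache {
	if limit < 1 {
		limit = 1
	}
	return &numberCache{limit: limit, nums: make(map[string]uint64)}
}

// add records hash -> number, evicting the oldest entry when full.
func (c *numberCache) add(hash string, number uint64) {
	if _, ok := c.nums[hash]; !ok {
		if len(c.order) >= c.limit {
			delete(c.nums, c.order[0])
			c.order = c.order[1:]
		}
		c.order = append(c.order, hash)
	}
	c.nums[hash] = number
}

func (c *numberCache) contains(hash string) bool {
	_, ok := c.nums[hash]
	return ok
}

// chain pairs the cache with a hypothetical database (a plain map here)
// and counts how often the database has to be consulted.
type chain struct {
	cache   *numberCache
	db      map[string]uint64
	dbReads int
}

// hasHeader answers from the cache when possible and only falls back to
// the database on a miss.
func (c *chain) hasHeader(hash string) bool {
	if c.cache.contains(hash) {
		return true
	}
	c.dbReads++
	_, ok := c.db[hash]
	return ok
}

func main() {
	c := &chain{
		cache: newNumberCache(2),
		db:    map[string]uint64{"a": 1, "b": 2, "c": 3},
	}
	c.cache.add("a", 1)
	c.cache.add("b", 2)

	for _, hash := range []string{"a", "c", "z"} {
		found := c.hasHeader(hash)
		fmt.Println(hash, found, c.dbReads)
	}
}
```

Each cache entry is only a hash plus a uint64, so doubling the cache size costs little memory compared with the database lookups it can avoid.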

A remaining TODO is to see if we can port over lightchain to use the new format. Also, I believe I return the wrong integers from InsertHeaderChain.

holiman commented Apr 22, 2019

I fixed up LES so it uses the same mechanism. The only tangible change from before is that when a newly imported chain reorgs and takes over the old chain, the events would previously look like this, for example:
[side, side, side, side, canon, canon]. This PR changes it to [canon, canon, canon, canon, canon, canon], which IMO is not incorrect (?), but is still a change.

@zsfelfoldi PTAL

@@ -199,15 +199,24 @@ func (ethash *Ethash) VerifyUncles(chain consensus.ChainReader, block *types.Blo

number, parent := block.NumberU64()-1, block.ParentHash()
for i := 0; i < 7; i++ {
Contributor

It may not be the worst thing to turn this into a constant while you're here, such as maxKin.

Contributor Author

I'm not really a fan of that. The way it's written, it's very clear upon inspection what the limit is, without having to look up the constant in some params-file. Also, since we're not likely to change it ever, I don't see what the benefit would be in parameterizing it.

Contributor

No worries. It was not about parameterizing it, but about making it clear what the 7 means as some people might not be able to immediately know the kin relation for ommers.

core/headerchain.go
//
// Note: This method is not concurrent-safe with inserting blocks simultaneously
// into the chain, as side effects caused by reorganisations cannot be emulated
// without the real blocks. Hence, writing headers directly should only be done
// in two scenarios: pure-header mode of operation (light clients), or properly
// separated header/block phases (non-archive clients).
func (hc *HeaderChain) WriteHeader(header *types.Header) (status WriteStatus, err error) {
// Cache some values to prevent constant recalculation
func (hc *HeaderChain) WriteHeaders(headers []*types.Header, pwCallbackFn PWCallback) (ignored, imported int, status WriteStatus, lastHash common.Hash, lastHeader *types.Header, err error) {
Contributor

Nit: this feels like a lot of values to return from a single function. It makes me wonder if it is trying to do too much or if there is another way to split this up.

Contributor Author

yes, it's a bit silly. I'll see if I can shave off some of it

Contributor

The easy way out is to combine all of these into a WriteHeadersResult, since you've already given all the fields names. I'd still leave error separate if you decide to go that route.
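A minimal sketch of what that suggestion could look like. `WriteHeadersResult` is the name proposed in the comment; everything else is assumed for illustration: the pared-down `Header` type, string hashes, the `known` set standing in for the database, and the toy rule that any imported header yields `CanonStatTy` (no total-difficulty comparison is attempted).

```go
package main

import (
	"errors"
	"fmt"
)

// WriteStatus mirrors the status names seen in the diff; the values here
// are illustrative.
type WriteStatus int

const (
	NonStatTy WriteStatus = iota
	CanonStatTy
	SideStatTy
)

// Header is a pared-down, hypothetical header type.
type Header struct {
	Number uint64
	Hash   string
}

// WriteHeadersResult groups what were separate return values.
// The error stays a separate return value, as suggested.
type WriteHeadersResult struct {
	Ignored    int
	Imported   int
	Status     WriteStatus
	LastHash   string
	LastHeader *Header
}

var errNonContiguous = errors.New("non-contiguous headers")

// writeHeaders is a toy importer: headers already present in known are
// counted as ignored, the rest are recorded and counted as imported.
func writeHeaders(headers []*Header, known map[string]bool) (*WriteHeadersResult, error) {
	res := &WriteHeadersResult{Status: NonStatTy}
	for i, h := range headers {
		if i > 0 && h.Number != headers[i-1].Number+1 {
			return res, errNonContiguous
		}
		if known[h.Hash] {
			res.Ignored++
			continue
		}
		known[h.Hash] = true
		res.Imported++
		res.LastHash, res.LastHeader = h.Hash, h
	}
	if res.Imported > 0 {
		res.Status = CanonStatTy // simplification: no total-difficulty check
	}
	return res, nil
}

func main() {
	known := map[string]bool{"h1": true}
	headers := []*Header{
		{Number: 1, Hash: "h1"},
		{Number: 2, Hash: "h2"},
		{Number: 3, Hash: "h3"},
	}
	res, err := writeHeaders(headers, known)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(res.Ignored, res.Imported, res.LastHash) // prints "1 2 h3"
}
```

Callers then read named fields instead of matching six positional values, and adding a field later does not change the function signature.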

if len(headers) == 0 {
return ignored, imported, NonStatTy, lastHash, nil, err
}
ptd := hc.GetTd(headers[0].ParentHash, headers[0].Number.Uint64()-1)
Contributor

Nit: it was not immediately clear to me that ptd is parentTD just by looking at its name. You may want to consider a name change.

number = header.Number.Uint64()
lastNumber = headers[0].Number.Uint64() - 1 // Last successfully imported number
externTd *big.Int // TD of successfully imported chain
inserted []numberHash // Quick lookup of number/hash for the chain
Contributor

I was wondering if you could get away with using headers instead of inserted, since you will know lastNumber and have the hash and number of the first header at headers[0]. I'm not a big fan of method-scoped struct definitions personally.

Contributor Author

One problem with that is that I don't want to write the hashes until later, and if I don't stash the hashes here, I'll have to call Hash() on each header again.
types.Block.Hash() calls Header.Hash and remembers the hash internally, but headers don't store it internally.

Contributor

> but headers don't store it internally.

Perhaps this is something that could be done (and could potentially benefit the rest of the code base) (maybe in the future).
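For illustration, a sketch of what caching the hash on the header itself could look like, using an `atomic.Value` for memoization. The `header` type, its fields and the `computed` counter are hypothetical, and SHA-256 over a fixed byte layout stands in for the real header hash, which is computed differently.

```go
package main

import (
	"crypto/sha256"
	"encoding/binary"
	"fmt"
	"sync/atomic"
)

// header is a hypothetical, pared-down header that remembers its own hash.
type header struct {
	ParentHash [32]byte
	Number     uint64

	hash     atomic.Value // holds a [32]byte once computed
	computed int32        // number of real hash computations, for the demo
}

// Hash returns the cached hash if present, otherwise computes and stores it.
// SHA-256 over parent hash plus number is an assumption made for this sketch.
func (h *header) Hash() [32]byte {
	if v := h.hash.Load(); v != nil {
		return v.([32]byte)
	}
	var buf [40]byte
	copy(buf[:32], h.ParentHash[:])
	binary.BigEndian.PutUint64(buf[32:], h.Number)
	sum := sha256.Sum256(buf[:])
	atomic.AddInt32(&h.computed, 1)
	h.hash.Store(sum)
	return sum
}

func main() {
	h := &header{Number: 5}
	a := h.Hash()
	b := h.Hash()
	fmt.Println(a == b, h.computed) // prints "true 1"
}
```

One caveat with this approach: if a field is modified after the first call to `Hash()`, the cached value goes stale, so it only fits headers that are treated as immutable once hashed.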

whFunc := func(header *types.Header) error {
status, err := lc.hc.WriteHeader(header)

postWriteCallback := func(header *types.Header, status core.WriteStatus) {
Contributor

Consider moving this function declaration outside this function and giving it a name that describes what it's doing.

@holiman holiman changed the title [wip] core: import header chains in batches core/les: import header chains in batches Apr 23, 2019
@holiman holiman added this to the 1.9.0 milestone Apr 30, 2019
@holiman holiman modified the milestones: 1.9.0, 1.9.1 May 16, 2019

holiman commented May 17, 2019

I have now rebased this on top of latest master, and squashed it. @rjl493456442 @karalabe PTAL

}
log.Info("Imported new block headers", context...)

return 0, nil
return 0, status, nil
Member

Can we also return the last index of the inserted header instead of returning 0?

Contributor Author

The way I understood the return parameters (which are poorly documented) is that the first return value is the "index of the erroring header".
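A small sketch of the convention as described in that comment, where the integer identifies the offending header. It is not geth's code: `header` and `checkChain` are hypothetical, hashes are plain strings, and the only check performed is that each header links to its predecessor.

```go
package main

import "fmt"

// header is a hypothetical, pared-down header.
type header struct {
	Number     uint64
	Hash       string
	ParentHash string
}

// checkChain returns the index of the first header that does not extend
// its predecessor, together with an error. It returns (0, nil) when the
// whole slice links up.
func checkChain(headers []header) (int, error) {
	for i := 1; i < len(headers); i++ {
		prev, cur := headers[i-1], headers[i]
		if cur.Number != prev.Number+1 || cur.ParentHash != prev.Hash {
			return i, fmt.Errorf("header %d does not extend header %d", cur.Number, prev.Number)
		}
	}
	return 0, nil
}

func main() {
	good := []header{
		{Number: 1, Hash: "a", ParentHash: "g"},
		{Number: 2, Hash: "b", ParentHash: "a"},
	}
	bad := []header{
		{Number: 1, Hash: "a", ParentHash: "g"},
		{Number: 2, Hash: "b", ParentHash: "a"},
		{Number: 3, Hash: "c", ParentHash: "x"},
	}
	idx, err := checkChain(good)
	fmt.Println(idx, err)
	idx, err = checkChain(bad)
	fmt.Println(idx, err)
}
```

Since 0 is also a valid index, a caller of a function with this convention has to look at the error first; the integer alone cannot tell success apart from a failure at the first header.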

Member

@rjl493456442 rjl493456442 left a comment

Apart from the compilation error, LGTM.

hash = header.Hash()
number = header.Number.Uint64()
lastNumber uint64 // Last successfully imported number
lastHash common.Hash // Last successfully imported hash
Member

core/headerchain.go:148:3: lastHash redeclared in this block

@karalabe karalabe modified the milestones: 1.9.1, 1.9.2 Jul 23, 2019
@karalabe karalabe modified the milestones: 1.9.2, 1.9.3 Aug 13, 2019
@karalabe karalabe modified the milestones: 1.9.3, 1.9.4 Sep 4, 2019

holiman commented Oct 25, 2019

Doing another run with this one, combined with #20197. Idea being that this PR makes headers use batches, and may get a boost from larger batches.

holiman commented Oct 25, 2019

The first fifteen minutes or so. Experimental (this PR plus a larger batch size) is yellow; master is green.

experimental has same or lower write ops per second to disk, despite writing 1.5x the amount of data:

Screenshot_2019-10-25 Dual Geth - Grafana

And is also markedly ahead in header/receipts:

Screenshot_2019-10-25 Dual Geth - Grafana(1)

holiman commented Oct 25, 2019

Although they're pretty close to one another now, here's the most recent logs from papertrail:
Experimental:

mon08.ethdevops.io [10-25|10:51:53.685] Imported new block headers count=2048 elapsed=356.472ms number=5911574 hash=e83626…05cc91 age=1y3mo3w
mon08.ethdevops.io [10-25|10:52:00.649] Imported new block headers count=2048 elapsed=439.699ms number=5913622 hash=1df29c…a5b7b5 age=1y3mo3w
mon08.ethdevops.io [10-25|10:52:07.371] Imported new block headers count=2048 elapsed=377.544ms number=5915670 hash=0bdc50…776b50 age=1y3mo3w
mon08.ethdevops.io [10-25|10:52:15.037] Imported new block headers count=2048 elapsed=444.360ms number=5917718 hash=8b5080…650e70 age=1y3mo3w
mon08.ethdevops.io [10-25|10:52:21.647] Imported new block headers count=2048 elapsed=307.541ms number=5919766 hash=0bf2c3…b1c635 age=1y3mo3w
mon08.ethdevops.io [10-25|10:52:29.173] Imported new block headers count=2048 elapsed=342.597ms number=5921814 hash=f3f342…688bd2 age=1y3mo3w
mon08.ethdevops.io [10-25|10:52:34.689] Imported new block headers count=2048 elapsed=365.197ms number=5923862 hash=fe6772…c609af age=1y3mo3w
mon08.ethdevops.io [10-25|10:52:41.492] Imported new block headers count=2048 elapsed=529.440ms number=5925910 hash=9880eb…d83d60 age=1y3mo3w
mon08.ethdevops.io [10-25|10:52:46.829] Imported new block headers count=2048 elapsed=211.559ms number=5927958 hash=600c9e…db5b9e age=1y3mo3w
mon08.ethdevops.io [10-25|10:52:54.425] Imported new block headers count=2048 elapsed=360.680ms number=5930006 hash=deb89e…a543ad age=1y3mo3w
mon08.ethdevops.io [10-25|10:53:01.154] Imported new block headers count=2048 elapsed=382.476ms number=5932054 hash=b8cf26…48dcfa age=1y3mo3w
mon08.ethdevops.io [10-25|10:53:08.834] Imported new block headers count=2048 elapsed=351.325ms number=5934102 hash=4ad170…530ab4 age=1y3mo3w
mon08.ethdevops.io [10-25|10:53:15.610] Imported new block headers count=2048 elapsed=377.542ms number=5936150 hash=50284f…092495 age=1y3mo3w
mon08.ethdevops.io [10-25|10:53:23.997] Imported new block headers count=2048 elapsed=239.141ms number=5938198 hash=3d02c4…3b3d86 age=1y3mo3w
mon08.ethdevops.io [10-25|10:53:28.618] Imported new block headers count=2048 elapsed=370.648ms number=5940246 hash=4ca561…ca0860 age=1y3mo3w
mon08.ethdevops.io [10-25|10:53:35.996] Imported new block headers count=2048 elapsed=223.579ms number=5942294 hash=87929a…ef5d6c age=1y3mo3w

master:

mon09.ethdevops.io [10-25|10:51:54.216] Imported new block headers count=2048 elapsed=379.546ms number=5830656 hash=31884d…659095 age=1y4mo1w
mon09.ethdevops.io [10-25|10:52:01.063] Imported new block headers count=2048 elapsed=657.646ms number=5832704 hash=bd8778…4ef8de age=1y4mo1w
mon09.ethdevops.io [10-25|10:52:08.155] Imported new block headers count=2048 elapsed=768.018ms number=5834752 hash=2ad161…552798 age=1y4mo1w
mon09.ethdevops.io [10-25|10:52:17.825] Imported new block headers count=2048 elapsed=409.307ms number=5836800 hash=5cf3a2…1e6d51 age=1y4mo1w
mon09.ethdevops.io [10-25|10:52:28.847] Imported new block headers count=2048 elapsed=711.594ms number=5838848 hash=5706b6…089e52 age=1y4mo1w
mon09.ethdevops.io [10-25|10:52:33.757] Imported new block headers count=2048 elapsed=586.430ms number=5840896 hash=2b5e12…b5b9eb age=1y4mo1w
mon09.ethdevops.io [10-25|10:52:39.597] Imported new block headers count=2048 elapsed=580.768ms number=5842944 hash=ecd349…08de33 age=1y4mo1w
mon09.ethdevops.io [10-25|10:52:44.423] Imported new block headers count=2048 elapsed=597.074ms number=5844992 hash=138694…ff55f7 age=1y4mo1w
mon09.ethdevops.io [10-25|10:52:49.016] Imported new block headers count=2048 elapsed=418.781ms number=5847040 hash=482720…46a691 age=1y4mo1w
mon09.ethdevops.io [10-25|10:52:55.987] Imported new block headers count=2048 elapsed=616.988ms number=5849088 hash=fa22d2…532bb6 age=1y4mo1w
mon09.ethdevops.io [10-25|10:53:02.142] Imported new block headers count=2048 elapsed=916.186ms number=5851136 hash=a4b77f…78f581 age=1y4mo1w
mon09.ethdevops.io [10-25|10:53:09.027] Imported new block headers count=2048 elapsed=584.069ms number=5853184 hash=43b169…b92805 age=1y4mo6d
mon09.ethdevops.io [10-25|10:53:14.925] Imported new block headers count=2048 elapsed=684.613ms number=5855232 hash=3bd9ec…211164 age=1y4mo6d
mon09.ethdevops.io [10-25|10:53:20.080] Imported new block headers count=2048 elapsed=756.576ms number=5857280 hash=aad5a4…35cfed age=1y4mo5d
mon09.ethdevops.io [10-25|10:53:27.030] Imported new block headers count=2048 elapsed=562.049ms number=5859328 hash=99e129…86e523 age=1y4mo5d
mon09.ethdevops.io [10-25|10:53:33.941] Imported new block headers count=2048 elapsed=710.964ms number=5861376 hash=4ec280…4e2edf age=1y4mo5d
mon09.ethdevops.io [10-25|10:53:41.963] Imported new block headers count=2048 elapsed=562.819ms number=5863424 hash=ecaac0…59a96c age=1y4mo4d 

So, from the looks of it, experimental processes headers almost twice as fast (mean elapsed time per 2048-header batch, in ms, fractions truncated):

(356+439+377+444+307+342+365+529+211+360+382+351+377+239+370+223)/16 = 354.5

(379+657+768+409+711+586+580+597+418+616+916+584+684+756+562+710+562)/17 = 617.35
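The arithmetic can be reproduced with a few lines of Go. The values are the `elapsed` figures from the log excerpts, truncated to whole milliseconds as in the sums; the `mean` helper and variable names are only for this sketch.

```go
package main

import "fmt"

// Elapsed times per 2048-header batch, in whole milliseconds, taken from
// the log excerpts (fractions truncated).
var experimental = []float64{
	356, 439, 377, 444, 307, 342, 365, 529,
	211, 360, 382, 351, 377, 239, 370, 223,
}

var master = []float64{
	379, 657, 768, 409, 711, 586, 580, 597, 418,
	616, 916, 584, 684, 756, 562, 710, 562,
}

// mean returns the arithmetic mean of xs; xs must not be empty.
func mean(xs []float64) float64 {
	sum := 0.0
	for _, x := range xs {
		sum += x
	}
	return sum / float64(len(xs))
}

func main() {
	e, m := mean(experimental), mean(master)
	fmt.Printf("experimental=%.2f ms master=%.2f ms ratio=%.2f\n", e, m, m/e)
	// prints "experimental=354.50 ms master=617.35 ms ratio=1.74"
}
```

The ratio comes out at roughly 1.74, which matches the "almost twice as fast" reading for this particular window of the logs.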

holiman commented Oct 25, 2019

holiman commented Oct 27, 2019

It seemed kind of odd to me that faster writing of headers to disk doesn't really affect the fast-sync speed. I think the cause is that we're intentionally throttling the requests, so even increasing the number of peers won't affect it.

The x-axis on the graphs is a bit funky because of the daylight saving time change: 2am occurs twice, so take that into account...

Egress + throttle graph
Screenshot_2019-10-27 Single Geth - Grafana(1)

Ingress + throttle graph
Screenshot_2019-10-27 Single Geth - Grafana

I'm not saying we should remove the throttling, but it's definitely worth looking into if the throttling can be improved.
Memory + throttle graph:
Screenshot_2019-10-27 Single Geth - Grafana(2)

Memory goes down a lot when sync completes, at the second 2am, but block+receipt download finished 30 minutes earlier -- and when it did, there was no noticeable drop in mem usage.

There's also a very high throttle rate at the beginning of sync, while mem usage is still low. Maybe we should give the memory of in-flight requests less weight when deciding about throttling.

holiman commented Aug 21, 2020

Closing in favour of #21471

@holiman holiman closed this Aug 21, 2020