
core,eth: add rpc method to trie flush frequency #24785

Merged
merged 5 commits into from
Dec 9, 2022

Conversation

s1na
Contributor

@s1na s1na commented Apr 28, 2022

See #24720

Comment on lines 1226 to 1228
if mode := atomic.LoadInt32(&bc.gcmode); mode == archiveMode {
return triedb.Commit(root, false, nil)
} else {
Contributor

The upper clause returns; let's get rid of the else and un-indent the next clause

Contributor

And after that, maybe

	if current := block.NumberU64(); current <= TriesInMemory {
		return nil
	}

And we can un-indent some more

@holiman
Contributor

holiman commented Apr 29, 2022

Good start, but this way we'll only be able to toggle archive on or off. I think it would be neat to set it to a more linear value.

So by default, we have it set to "2 hours of block processing". Which is a nice UX. However, I'd like to be able to say "I want to always flush after 1024 blocks" or something like that, because then I know what the max reexec will be.

For that to happen,

if bc.gcproc > bc.cacheConfig.TrieTimeLimit

needs to be modified to something like

	// Find the next state trie we need to commit
	chosen := current - TriesInMemory
	doFlush := bc.gcproc > bc.cacheConfig.TrieTimeLimit // flush on time limits
	doFlush = doFlush || chosen-lastWrite > maxStatesInMemory // flush on explicit limit
	// flush an entire trie to disk
	if doFlush {

(and also possibly verifying so we don't wrap around on negatives or whatever)

@s1na
Contributor Author

s1na commented Apr 29, 2022

What about replacing TrieTimeLimit with MaxStatesInMemory? We can set the number of blocks so it ends up in the same ballpark as the current time limit. Makes the logic easier.

Edit: TrieTimeout is settable through the config file, so just removing it is a breaking change. We could infer the number of blocks from the timeout for mainnet and the testnets (targeting 15s) but not for custom networks. I'm gonna just add another parameter like you suggested.

@holiman
Contributor

holiman commented Apr 29, 2022

I think it's nice to have both. Deriving MaxStatesInMemory from the processing time of a few arbitrary blocks can make it err wildly, and if we make it into some sort of moving average, then it becomes pretty complex anyway.

@s1na
Contributor Author

s1na commented Apr 29, 2022

I combined maxStatesInMemory and gcmode in one parameter called trieFlushFreq:

  • freq == 1 => archive mode
  • freq > 1 => full mode, flush every freq blocks or after time limit whichever comes sooner
  • freq == 0 => full mode, flush after time limit
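That mapping can be sketched as a small decision function. This is a hypothetical illustration of the semantics listed above, not the PR's actual code; `overTimeLimit` stands in for the existing TrieTimeLimit check.

```go
package main

import "fmt"

// flushPolicy illustrates the trieFlushFreq semantics described above.
func flushPolicy(freq, blocksSinceFlush uint32, overTimeLimit bool) bool {
	switch {
	case freq == 1: // archive mode: flush every block
		return true
	case freq > 1: // full mode: flush every freq blocks or on time limit
		return blocksSinceFlush >= freq || overTimeLimit
	default: // freq == 0, full mode: flush only on time limit
		return overTimeLimit
	}
}

func main() {
	fmt.Println(flushPolicy(1, 0, false))  // archive: true
	fmt.Println(flushPolicy(4, 3, false))  // not yet: false
	fmt.Println(flushPolicy(0, 999, true)) // time limit hit: true
}
```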

snaps *snapshot.Tree // Snapshot tree for fast trie leaf access
triegc *prque.Prque // Priority queue mapping block numbers to tries to gc
gcproc time.Duration // Accumulates canonical block processing for trie dumping
sinceFlush uint32 // Accumulates number of canonical blocks since last trie flush
Contributor

Does this have to be a member on Blockchain? It can be just a variable inside that loop, no?

@@ -1250,6 +1262,7 @@ func (bc *BlockChain) writeBlockWithState(block *types.Block, receipts []*types.
triedb.Commit(header.Root, true, nil)
lastWrite = chosen
bc.gcproc = 0
bc.sinceFlush = 0
Contributor

Do you really need both sinceFlush and lastWrite? Isn't sinceFlush derivable from lastWrite?
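For illustration, the counter could indeed be derived on demand (a hypothetical helper, not the PR's code):

```go
package main

import "fmt"

// blocksSinceFlush derives the block count from lastWrite instead of
// maintaining a separate field. Purely illustrative.
func blocksSinceFlush(chosen, lastWrite uint64) uint64 {
	if chosen < lastWrite {
		return 0 // defensive: lastWrite should never be ahead of chosen
	}
	return chosen - lastWrite
}

func main() {
	fmt.Println(blocksSinceFlush(16138702, 16138690)) // 12
}
```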

eth/api.go Outdated
@@ -607,3 +607,13 @@ func (api *PrivateDebugAPI) GetAccessibleState(from, to rpc.BlockNumber) (uint64
}
return 0, fmt.Errorf("No state found")
}

type gcPolicy struct {
Archive bool
Contributor

It's nice to surface this to the user, so he/she can set `{ "archive": true }`, but not so nice to then just ignore archive and only care about FlushFrequency later on :trollface:

@s1na s1na changed the title core,eth: add rpc method to toggle archive mode core,eth: add rpc method to trie flush frequency May 3, 2022
core/blockchain.go Outdated (two review threads, resolved)
@holiman
Contributor

holiman commented May 30, 2022

We discussed this on last triage -- IIRC, @karalabe and @fjl thought that it should be switched back to be a time-based setting, as opposed to a block-based setting.
Meaning, what we set is the "amount of block processing time before flush", not wall-clock time.

@holiman
Contributor

holiman commented Oct 17, 2022

Meaning, what we set is the "amount of block processing time before flush", not wall-clock time.

Back to you, @s1na :)

@holiman
Contributor

holiman commented Nov 5, 2022

@s1na please don't forget about this one, it would be nice to have in several situations

@holiman
Contributor

holiman commented Nov 29, 2022

@s1na please don't forget about this one, it would be nice to have in several situations

^^ 🥺

@s1na
Contributor Author

s1na commented Nov 29, 2022

@holiman very sorry I've been putting this off. I'm gonna get started on it asap.

@s1na
Contributor Author

s1na commented Dec 4, 2022

It seems gcproc (which accumulates block processing time), which we use to detect whether it's time to flush, is no longer updated when catalyst is driving the blockchain. That's because we don't set head when first inserting the block. I have to find a work-around.

@s1na
Contributor Author

s1na commented Dec 7, 2022

I've rebased the PR and updated it so that you can configure it with block processing time. Now running it on a node for testing.

@s1na
Contributor Author

s1na commented Dec 8, 2022

For the normal case:

INFO [12-08|08:33:13.133] Deciding to flush                        interval=1h0m0s chosen=16,138,695 time=10m46.357989217s lastWrite=0
INFO [12-08|08:33:13.149] Imported new potential chain segment     number=16,138,823 hash=9008c1..260763 blocks=1 txs=143 mgas=12.913 elapsed=103.009ms    mgasps=125.362 dirty=1021.78MiB

// Now call debug.setTrieFlushInterval('10m47s')

// Ok interval updated
INFO [12-08|08:33:25.016] Deciding to flush                        interval=10m47s chosen=16,138,696 time=10m46.421472469s lastWrite=0
INFO [12-08|08:33:25.028] Imported new potential chain segment     number=16,138,824 hash=5aba19..72d9e8 blocks=1 txs=113 mgas=10.422 elapsed=125.296ms    mgasps=83.178 dirty=1022.18MiB

// Here it went over the interval and flushed
INFO [12-08|08:34:36.568] Deciding to flush                        interval=10m47s chosen=16,138,702 time=10m47.175393223s lastWrite=0
INFO [12-08|08:34:59.657] Persisted trie from memory database      nodes=3,859,148 size=910.81MiB time=32.773272419s    gcnodes=7,785,269 gcsize=3.15GiB gctime=37.463748875s livenodes=496,459 livesize=179.47MiB
INFO [12-08|08:34:59.657] Imported new potential chain segment     number=16,138,830 hash=ec63f0..8ebb58 blocks=1 txs=112 mgas=29.195 elapsed=23.345s      mgasps=1.251   dirty=225.66MiB

// And we see time was reset and lastWrite now indicates the correct block
INFO [12-08|08:35:04.063] Deciding to flush                        interval=10m47s chosen=16,138,703 time=154.97035ms      lastWrite=16,138,702
INFO [12-08|08:35:04.065] Imported new potential chain segment     number=16,138,831 hash=0be578..f9b398 blocks=1 txs=100 mgas=9.205  elapsed=520.668ms    mgasps=17.679  dirty=226.68MiB

@s1na
Contributor Author

s1na commented Dec 8, 2022

Meaning, what we set is the "amount of block processing time before flush", not wall-clock time.

I find it hard to choose a good parameter value to target a specific flushing interval, since there's quite a bit of variance in block processing times and the accumulated time differs from actual wall-clock time.

However, it's good if all you care about is relative behavior, i.e. if you know the default is 1h and you want twice as much flushing, you can set it to 30m, etc.

@holiman
Contributor

holiman commented Dec 8, 2022

I find it hard to choose a good parameter value to target

Well, the parameter to target is bc.gcproc. I don't understand the confusion ... (?)

Comment on lines +1335 to +1337
flushInterval := time.Duration(atomic.LoadInt64(&bc.flushInterval))
// If we exceeded time allowance, flush an entire trie to disk
if bc.gcproc > flushInterval {
Contributor

The diff is large mainly due to un-indenting an if-clause (which is good). But just to verify: this change right here is the only real diff, and the heart of the change, right?

We compare the gcproc with the configurable flushInterval instead of cacheConfig.TrieTimeLimit.

Contributor Author

Yes see this commit: bea73e5

@s1na s1na marked this pull request as ready for review December 8, 2022 12:07
Contributor

@holiman holiman left a comment

LGTM!

@s1na
Contributor Author

s1na commented Dec 8, 2022

Well, the parameters to target is bc.gcproc. I don't understand the confusion ... (?)

Yes, but if I set it to 1h it will actually take 70 hours to hit that target. That's what I mean by un-intuitive.

@holiman
Contributor

holiman commented Dec 9, 2022

Yes but if I set it to 1h it will actually take 70 hours to hit that target.

No, it will take one hour. As in:

  • If geth crashes, it will take up to one hour (not 70) to replay from the latest flush
  • If I want to recreate a state, it will take up to one hour (not 70) to do so

You are right that

  • I will have to wait 70 hours to see the log message saying it flushed.

But other than that, isn't it kind of "the most UX friendly of two bad options" ?

@s1na
Contributor Author

s1na commented Dec 9, 2022

Ah, I did notice that when the node is stopped and restarted the processing time goes by quicker. Right, I see that the other option isn't much better.

I think the PR is ready.

@holiman holiman merged commit 711afbc into ethereum:master Dec 9, 2022
@holiman holiman added this to the 1.11.0 milestone Dec 9, 2022
shekhirin pushed a commit to shekhirin/go-ethereum that referenced this pull request Jun 6, 2023
…ency (ethereum#24785)

This PR makes it possible to modify the flush interval time via RPC. On one extreme, `0s`, it would act as an archive node. If set to `1h`, means that after one hour of effective block processing time, the trie would be flushed. If one block takes 200ms, this means that a flush would occur every `5*3600=18000`  blocks -- however, if the memory size of the cached states grows too large, it will flush sooner. 

Essentially, this makes it possible to configure the node to be more or less "archive:ish", and without restarting the node while reconfiguring it.