trie: remove inconsistent trie nodes during sync in path mode #28595

Merged on Dec 8, 2023 (7 commits)

rjl493456442 (Member) commented Nov 24, 2023

This pull request fixes #28587


State sync recap

Geth's state sync mechanism has two steps: (1) snap sync and (2) state healing. In the first stage, Geth requests a batch of states (accounts and storage slots), verifies its correctness via a range proof, and constructs the internal Merkle trie nodes locally.

However, because the retrieved states may be incomplete, lacking either the head or the tail, the locally constructed Merkle trie nodes may be inconsistent with the correct ones. To avoid committing inconsistent trie nodes, the boundary trie nodes are filtered out; this logic is implemented in #28327.

After retrieving the states of the entire keyspace, the persistent states, along with their Merkle tree nodes, can still be inconsistent with each other due to differences in sync targets between different sync cycles. To fix this inconsistency, state healing is necessary.

State healing essentially traverses the entire Merkle trie from root to bottom, retrieving any inconsistent nodes and continuously expanding the trie until the entire trie aligns with the provided root node.


What issue is this pull request trying to fix?

Originally, in the state healing stage, the inconsistent trie nodes would be left in the database, awaiting overwriting. However, one corner case is not handled: What if the children of this inconsistent node are committed, and the sync cycle is aborted without overwriting this inconsistent one? The persistent state ends up in a weird situation where nodes on the same path are inconsistent with each other.

And, even worse, what if the target of the next sync cycle switches to one that matches the leftover inconsistent node? The entire subtrie won't be expanded because the state healer believes the subtrie must exist and be consistent with the sub-root node.

[Image: Screenshot 2023-11-27, 2:54:48 PM]

Here is an example to demonstrate the situation.

  • The original state root is 1
  • The current state root is 1'
  • The state healer detects the inconsistency and expands the node path from 1 to 6
  • The state healer commits 6' to the database
  • Healing is aborted
  • The next state root switches back to 1
  • The original root node 1 is present in the database, so healing finishes

However, 6' is left in the database, leaving the state corrupted.
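
The sequence above can be replayed with a toy model. This is only a sketch under stated assumptions: the database is a map from path to a node label, the trie is reduced to a single root-to-leaf path, and `healLegacy`, `consistent`, and the paths `""`, `"0"`, `"00"` are hypothetical names for illustration, not the real healer in trie/sync.go.

```go
package main

import "fmt"

// nodes maps a trie path to a label standing in for the node stored there.
type nodes map[string]string

// healLegacy models the pre-fix behaviour on a single root-to-leaf path.
// It walks top-down and stops at the first stored node that already matches
// the target, assuming the subtrie below it is complete. The mismatching
// nodes are then written children-first. budget caps the number of writes
// so that an aborted sync cycle can be simulated.
func healLegacy(db, target nodes, order []string, budget int) {
	var pending []string
	for _, path := range order {
		if db[path] == target[path] {
			break
		}
		pending = append(pending, path)
	}
	for i := len(pending) - 1; i >= 0 && budget > 0; i-- {
		db[pending[i]] = target[pending[i]]
		budget--
	}
}

// consistent reports whether every node on the path matches the target.
func consistent(db, target nodes, order []string) bool {
	for _, path := range order {
		if db[path] != target[path] {
			return false
		}
	}
	return true
}

func main() {
	order := []string{"", "0", "00"} // root, inner node, deepest node
	oldState := nodes{"": "1", "0": "3", "00": "6"}
	newState := nodes{"": "1'", "0": "3'", "00": "6'"}

	db := nodes{"": "1", "0": "3", "00": "6"} // the database holds the old state

	healLegacy(db, newState, order, 1) // aborted after committing only 6'
	healLegacy(db, oldState, order, 3) // the target switches back to root 1

	fmt.Println("db:", db)
	fmt.Println("consistent with root 1:", consistent(db, oldState, order))
}
```

In this model the second cycle returns immediately because root 1 is still present, so the stale 6' is never revisited.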

Although this corner case is unlikely to occur, it is entirely possible, and #28587 is an example.


How to fix the state healer to address this?

This fix is specific to path mode and is not relevant to hash mode at all.

In path mode, whenever the state healer encounters inconsistent nodes, these nodes must be marked as deleted, from top to bottom, consistent with the order of traversal.

  • If a few nodes are bypassed by a shortNode, should these nodes be marked as deleted?

Yes. In the picture above, node 3 is bypassed by node 2'. In this case, node 3 should also be deleted; otherwise the leftover 3 will be inconsistent with the nodes below it.

  • If a fullNode is replaced by a shortNode, do we need to delete the other branches?

No. These branches are no longer linked with the trie, and they are complete. It's totally safe to leave them on disk, and they can be re-linked if the state switches.

  • It might be possible that a node is deleted because of inconsistency and then re-written. The delete and write can happen in the same database batch. Do leveldb and pebble support this?

Yes, both database engines support mixing deletes and writes in the same batch, and there is a test case in the codebase to prove it.
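
A toy model of the deletion rule, including the bypassed-node case from the first question. This is a sketch under assumptions, not the code in this PR: the database is a map from path to a node label, deletions take effect before any write of the same cycle, and `healWithDeletes`, `equal`, `clone`, and the path strings are hypothetical names. Path `"00"` plays the role of node 3, which the shortNode 2' bypasses in the new state.

```go
package main

import "fmt"

// nodes maps a trie path to a label standing in for the node stored there.
type nodes map[string]string

// healWithDeletes walks one root-to-leaf path top-down. Every stored node
// that is stale, or that sits on a path the target trie bypasses, is removed
// before any replacement is written. Replacements are written children-first
// and budget caps the number of writes to simulate an aborted cycle.
func healWithDeletes(db, target nodes, order []string, budget int) {
	var pending []string
	for _, path := range order {
		want, wanted := target[path]
		if wanted && db[path] == want {
			break // present and matching: the subtrie below is complete
		}
		delete(db, path)
		if wanted {
			pending = append(pending, path)
		}
	}
	for i := len(pending) - 1; i >= 0 && budget > 0; i-- {
		db[pending[i]] = target[pending[i]]
		budget--
	}
}

// equal reports whether two node sets hold exactly the same entries.
func equal(a, b nodes) bool {
	if len(a) != len(b) {
		return false
	}
	for path, node := range a {
		if got, ok := b[path]; !ok || got != node {
			return false
		}
	}
	return true
}

// clone returns a copy of src.
func clone(src nodes) nodes {
	dst := make(nodes, len(src))
	for path, node := range src {
		dst[path] = node
	}
	return dst
}

func main() {
	order := []string{"", "0", "00", "000"}
	oldState := nodes{"": "1", "0": "2", "00": "3", "000": "6"}
	// In the new state, 2' is a shortNode that bypasses path "00".
	newState := nodes{"": "1'", "0": "2'", "000": "6'"}

	db := clone(oldState)
	healWithDeletes(db, newState, order, 1) // aborted after committing only 6'
	fmt.Println("after aborted cycle:", db)

	healWithDeletes(db, oldState, order, len(order)) // target switches back to root 1
	fmt.Println("healed back to root 1:", equal(db, oldState))
}
```

Because the stale root is gone after the aborted cycle, the next cycle cannot mistake the leftover 6' for part of a complete subtrie.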

holiman (Contributor) left a comment

Generally lgtm, bugfix-wise, although I think it could be improved upon.

Also, I know that we have pretty good sync-tests, but the sync-tests are notoriously not very good at simulating a trie that actually progresses, so it's a bit hard to write those kinds of tests.

trie/sync.go Outdated
trie/sync.go Outdated
@@ -214,11 +251,8 @@ func (s *Sync) AddSubTrie(root common.Hash, path []byte, parent common.Hash, par
	if root == types.EmptyRootHash {
		return
	}
	if s.membatch.hasNode(path) {
rjl493456442 (Member Author) commented Nov 27, 2023

This check is meaningless and no longer required.

Originally, in the hash scheme, a sub trie (storage trie) could be shared by different contracts, and the check here served as a de-duplication mechanism. The path scheme is totally different: it must keep a separate copy for each contract. Therefore, the path is used as the identifier for the duplication check here (originally it was the node hash).

Because the path uniquely identifies a trie node, it is not really possible for a node with a specific path to already be cached in the local batch, even in hash mode. The check turns out to be a completely useless operation.

Therefore, we can remove it.
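
A small sketch of the difference between the two keying schemes described above. It rests on assumptions: SHA-256 from the standard library stands in for the node hash, and `pathKey` and `store` are hypothetical names, not types from the trie package.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// pathKey identifies a node by the trie that owns it and its path inside
// that trie, which is how a path-keyed store tells two tries apart.
type pathKey struct {
	owner string
	path  string
}

// store saves the same storage-trie root node once per owner and returns
// how many entries a hash-keyed and a path-keyed store end up holding.
func store(owners []string, blob []byte) (hashEntries, pathEntries int) {
	byHash := make(map[[32]byte][]byte)
	byPath := make(map[pathKey][]byte)
	for _, owner := range owners {
		byHash[sha256.Sum256(blob)] = blob              // identical content collapses
		byPath[pathKey{owner: owner, path: ""}] = blob // one copy per owner
	}
	return len(byHash), len(byPath)
}

func main() {
	h, p := store([]string{"contractA", "contractB"}, []byte("shared storage root"))
	fmt.Println("hash-keyed entries:", h, "path-keyed entries:", p)
}
```

Two contracts with identical storage collapse to one entry when keyed by hash, but keep one entry each when keyed by owner and path.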

rjl493456442 force-pushed the fix-state-sync branch 3 times, most recently from ff2868c to d7c749b, November 27, 2023 07:13
trie/sync.go Outdated
Comment on lines 697 to 699
// Remove the inconsistent node before expanding the path.
if len(blob) != 0 {
	s.membatch.delNode(owner, path)
}
Contributor

This only deletes the top level, but doesn't drill down into the leafs, does it? What triggers the deeper deletion?

rjl493456442 (Member Author) commented Nov 27, 2023

It only deletes the node that is inconsistent with the one requested at the specific position.

It is not removed from the database immediately; the deletion is queued in the memory batch.

The relevant children will also be deleted as the trie traversal continues, from top to bottom.

All in all, the traversal order is top to bottom, left to right (the priority controls the order), and all inconsistent nodes are marked as deleted in this order.

Member Author

In this pull request, all state mutations are applied in FIFO order, e.g.

  1. DEL node at path []
  2. DEL node at path [1]
  3. DEL node at path [1,1]
  4. WRITE node at path [1,1]
  5. DEL node at path [1,2]
  6. WRITE node at path [1,2]
  7. WRITE node at path [1]
  8. WRITE node at path []

With this order, no matter when an interruption happens, we can make sure the inconsistent nodes at the top are removed before any children are written. This property supports the guarantee in state healing: if a node is present, the whole consistent sub trie is also present.
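
The guarantee can be checked on the eight operations listed above by replaying them one at a time and testing the invariant after every step. This is a sketch under assumptions: paths are written as strings (`""`, `"1"`, `"11"`, `"12"`), node labels are placeholders, and `op`, `state`, `subtrieOK`, `fifoOps`, and `firstViolation` are hypothetical names.

```go
package main

import (
	"fmt"
	"strings"
)

// op is one queued mutation: a delete, or a write of node at path.
type op struct {
	del        bool
	path, node string
}

var paths = []string{"", "1", "11", "12"}

// state builds a complete toy trie whose node labels carry the given suffix.
func state(suffix string) map[string]string {
	s := make(map[string]string)
	for _, p := range paths {
		s[p] = "node[" + p + "]" + suffix
	}
	return s
}

// subtrieOK checks the healing invariant: every stored node must be the root
// of a complete subtrie of the state that node belongs to.
func subtrieOK(db map[string]string, states ...map[string]string) bool {
	for p, n := range db {
		for _, s := range states {
			if s[p] != n {
				continue
			}
			for q, want := range s {
				if strings.HasPrefix(q, p) && db[q] != want {
					return false
				}
			}
		}
	}
	return true
}

// fifoOps returns the eight mutations from the list above, in order.
func fifoOps(newState map[string]string) []op {
	return []op{
		{true, "", ""}, {true, "1", ""},
		{true, "11", ""}, {false, "11", newState["11"]},
		{true, "12", ""}, {false, "12", newState["12"]},
		{false, "1", newState["1"]}, {false, "", newState[""]},
	}
}

// firstViolation replays ops on a database holding oldState and returns the
// number of applied ops at which the invariant first breaks, or -1 if none.
func firstViolation(ops []op, oldState, newState map[string]string) int {
	db := make(map[string]string)
	for p, n := range oldState {
		db[p] = n
	}
	for i, o := range ops {
		if o.del {
			delete(db, o.path)
		} else {
			db[o.path] = o.node
		}
		if !subtrieOK(db, oldState, newState) {
			return i + 1
		}
	}
	return -1
}

func main() {
	oldState, newState := state(""), state("'")
	fifo := fifoOps(newState)
	var writesOnly []op // the same writes without the preceding deletes
	for _, o := range fifo {
		if !o.del {
			writesOnly = append(writesOnly, o)
		}
	}
	fmt.Println("FIFO with deletes:", firstViolation(fifo, oldState, newState))
	fmt.Println("writes only:", firstViolation(writesOnly, oldState, newState))
}
```

A result of -1 means the invariant held after every prefix of the queue. The writes-only variant is included for contrast with the behaviour before this change.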

Contributor

One thing I don't like, is that deletions will be added to the membatch from within methods that used to be free of side effects. With the changes in this PR, hasNode now must be called to clean up the database, and it is no longer simply a check whether the node exists. It is usually better to structure things such that side effects are performed explicitly based on the outcome of idempotent checks. I need to look into the PR more to understand how that could be done here.

Contributor

Adding a mutation of the membatch here could also introduce a race because we run hasNode in a new goroutine.

Member Author

Right, good catch
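
One possible shape for the separation discussed in this thread, as a sketch only: the lookups run concurrently and are read-only, and the goroutine that owns the batch decides which paths to delete. `status`, `check`, and `staleNodes` are hypothetical names, and this is not claimed to be how the PR resolved the issue.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

type status int

const (
	missing status = iota
	consistent
	inconsistent
)

// check is a read-only lookup: it reports how the stored node relates to the
// wanted one and never touches shared mutable state.
func check(db map[string]string, path, want string) status {
	got, ok := db[path]
	switch {
	case !ok:
		return missing
	case got == want:
		return consistent
	default:
		return inconsistent
	}
}

type result struct {
	path string
	st   status
}

// staleNodes runs the lookups concurrently and collects, on the calling
// goroutine only, the paths whose stored node must be deleted.
func staleNodes(db, target map[string]string) []string {
	results := make(chan result, len(target))
	var wg sync.WaitGroup
	for path, want := range target {
		wg.Add(1)
		go func(path, want string) {
			defer wg.Done()
			results <- result{path, check(db, path, want)}
		}(path, want)
	}
	wg.Wait()
	close(results)

	var stale []string
	for r := range results {
		if r.st == inconsistent {
			stale = append(stale, r.path)
		}
	}
	sort.Strings(stale)
	return stale
}

func main() {
	db := map[string]string{"": "1", "0": "3", "00": "6"}
	target := map[string]string{"": "1'", "0": "3", "00": "6'"}
	fmt.Println("paths to delete:", staleNodes(db, target))
}
```

The goroutines only read the map, which is safe in Go as long as nothing writes to it concurrently; all mutation is left to the caller.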

holiman (Contributor) left a comment

LGTM as far as I can tell

rjl493456442 (Member Author)

INFO [11-27|20:45:50.603] State is complete accounts=226,655,097 slots=1,081,610,296 codes=37,170,588 elapsed=8h17m56.677s

At least the state is complete after the snap sync.

fjl self-requested a review November 30, 2023 09:30
holiman added this to the 1.13.6 milestone Dec 1, 2023
fjl changed the title from "core, trie: remove inconsistent trie node in path mode" to "trie: remove inconsistent trie nodes during sync in path mode" Dec 8, 2023
fjl merged commit e206d3f into ethereum:master Dec 8, 2023
2 of 3 checks passed
GrapeBaBa pushed a commit to optimism-java/shisui that referenced this pull request Dec 10, 2023
…um#28595)

This fixes a database corruption issue that could occur during state healing.
When sync is aborted while certain modifications were already committed, and a
reorg occurs, the database would contain incorrect trie nodes stored by path.
These nodes need to be detected and deleted in order to obtain a complete and fully correct state
after state healing.

---------

Co-authored-by: Felix Lange <fjl@twurst.com>
holiman added a commit that referenced this pull request Dec 21, 2023
Original problem was caused by #28595, where we made it so that as soon as we start to sync, the root of the disk layer is deleted. That is not wrong per se, but another part of the code uses the "presence of the root" as an init-check for the pathdb. And, since the init-check now failed, the code tried to re-initialize it which failed since a sync was already ongoing.

The total impact being: after a state-sync has begun, if the node for some reason is shut down, it will refuse to start up again, with the error message: `Fatal: Failed to register the Ethereum service: waiting for sync.`.

This change also modifies how `geth removedb` works, so that the user is prompted for two things: `state data` and `ancient chain`. The former includes both the chaindb as well as any state history stored in ancients.

---------

Co-authored-by: Martin HS <martin@swende.se>
arcivanov pushed a commit to arcivanov/go-ethereum that referenced this pull request Dec 22, 2023
…#28718)

Doozers pushed a commit to kilnfi/pgeth that referenced this pull request Dec 22, 2023
…um#28595)

Doozers pushed a commit to kilnfi/pgeth that referenced this pull request Dec 22, 2023
…#28718)

Dergarcon pushed a commit to specialmechanisms/mev-geth-0x2mev that referenced this pull request Jan 31, 2024
…um#28595)

Dergarcon pushed a commit to specialmechanisms/mev-geth-0x2mev that referenced this pull request Jan 31, 2024
…#28718)

maoueh pushed a commit to streamingfast/go-ethereum that referenced this pull request Jun 14, 2024
* cmd, core, trie: verkle-capable `geth init` (ethereum#28270)

This change allows the creation of a genesis block for verkle testnets. This makes for a chunk of code that is easier to review and still touches many discussion points.

* eth/tracers/js: fix isPush for push0 (ethereum#28520)

Fixes so that `push0` opcode is correctly reported as `true` by the `IsPush` function

---------

Co-authored-by: Martin Holst Swende <martin@swende.se>

* trie: spelling - fix comments in hasher (ethereum#28507)

Co-authored-by: VM <arimas@foxmail.com>

* tests/fuzzers: move fuzzers into native packages (ethereum#28467)

This PR moves our fuzzers from tests/fuzzers into whatever their respective 'native' package is.

The historical reason why they were placed in an external location is that when they were based on go-fuzz, they could not be "hidden" via the _test.go suffix. So in order to shove them away from the go-ethereum "production code", they were put aside.

But now we've rewritten them to be based on golang testing, and thus can be brought back. I've left (in tests/) the ones that are not production (bls128381), require non-standard imports (secp requires btcec, bn256 requires gnark/google/cloudflare deps).

This PR also adds a fuzzer for precompiled contracts, because why not.

This PR utilizes a newly rewritten replacement for go-118-fuzz-build, namely gofuzz-shim, which utilises the inputs from the fuzzing engine better.

* tests: skip tests on windows 32bit CI (ethereum#28521)

tests: skip half the blockchain- and state-tests on windows 32bit CI-tests

* cmd/geth: more special cases logging tests (ethereum#28527)

adds logging tests for errors and custom fmt.Stringer-types which output strings that need to be quoted/escaped.

* accounts,cmd,console,les,metrics:  refactor some errors checked by (ST1005) go-staticcheck (ethereum#28532)

fix: fix some (ST1005)go-staticcheck

* miner: run tests in parallel (ethereum#28506)

Changes many of the tests in the miner package to run in parallel

* internal/jsre/deps: fix typo in jsdoc (ethereum#28511)

minor typo fix

* accounts/abi: improve readability of method-to-string conversion  (ethereum#28530)

refactor: improve readability of NewMethod print

* all: replace some cases of strings.SplitN with strings.Cut (ethereum#28446)

* ethdb/memorydb, trie: reduced allocations (ethereum#28473)

* trie: use pooling of iterator states in iterator

The node iterator burns through a lot of memory while iterating a trie, and a lot of
that can be avoided by using a fairly small pool (max 40 items).

name        old time/op    new time/op    delta
Iterator-8    6.22ms ± 3%    5.40ms ± 6%  -13.18%  (p=0.008 n=5+5)

name        old alloc/op   new alloc/op   delta
Iterator-8    2.36MB ± 0%    1.67MB ± 0%  -29.23%  (p=0.008 n=5+5)

name        old allocs/op  new allocs/op  delta
Iterator-8     37.0k ± 0%     29.8k ± 0%     ~     (p=0.079 n=4+5)

* ethdb/memorydb: avoid one copying of key

By making the transformation from []byte to string at an earlier point,
we save an allocation which otherwise happens later on.

name           old time/op    new time/op    delta
BatchAllocs-8     412µs ± 6%     382µs ± 2%   -7.18%  (p=0.016 n=5+4)

name           old alloc/op   new alloc/op   delta
BatchAllocs-8     480kB ± 0%     490kB ± 0%   +1.93%  (p=0.008 n=5+5)

name           old allocs/op  new allocs/op  delta
BatchAllocs-8     3.03k ± 0%     2.03k ± 0%  -32.98%  (p=0.008 n=5+5)

* Dockerfile: update Go to 1.21 (ethereum#28538)

* cmd/evm: validate blockchain tests poststate account storage (ethereum#28443)

This PR verifies the accounts' storage as specified in a blockchain test's postState field

The expect-section really only checks that the test works. It's meant for the test author to verify that "if the test does what it's supposed to, then the nonce of X should be 2, and the slot Y at Z should be 0x123".

    This expect-section is not exhaustive (not full post-state)
    It is also not auto-generated, but put there manually by the author.

We can still check it, as a test-sanity-check, in geth

* signer: run tests in parallel (ethereum#28536)

marks tests as parallel-safe in package signer

* accounts, cmd: fix typos (ethereum#28526)

* core/txpool/legacypool: respect nolocals-setting (ethereum#28435)

This change adds a check to ensure that transactions added to the legacy pool are not treated as 'locals' if the global locals-management has been disabled. 

This change makes the pool enforce the --txpool.pricelimit setting.

* cmd: run tests in parallel (ethereum#28546)

* core/state/snapshot: print correct error from trie iterator (ethereum#28560)

* cmd/evm: capitalize evm commands (ethereum#28569)

* standard:fix for a unified standard

* standard:fix more as a complements

---------

Co-authored-by: haotian <haotian@haotiandeMacBook-Air.local>

* accounts/abi: context info on unpack-errors (ethereum#28529)

adds contextual information to errors returned by unpack

* core, trie, rpc: speed up tests (ethereum#28461)

* rpc: make subscription test faster

reduces time for TestClientSubscriptionChannelClose
from 25 sec to < 1 sec.

* trie: cache trie nodes for faster sanity check

This reduces the time spent on TestIncompleteSyncHash
from ~25s to ~16s.

* core/forkid: speed up validation test

This takes the validation test from > 5s to sub 1 sec

* core/state: improve snapshot test run
brings the time for TestSnapshotRandom from 13s down to 6s

* accounts/keystore: improve keyfile test

This removes some unnecessary waits and reduces the
runtime of TestUpdatedKeyfileContents from 5 to 3 seconds

* trie: remove resolver
* trie: only check ~5% of all trie nodes

* ethdb/pebble: don't double-close iterator inside pebbleIterator (ethereum#28566)

Adds 'released' flag to pebbleIterator to avoid double closing cockroachdb/pebble.Iterator as it is an invalid operation.

Fixes ethereum#28565

* eth/filters: reuse error msg for invalid block range (ethereum#28479)

* core/types: make 'v' optional for DynamicFeeTx and BlobTx (ethereum#28564)

This fixes an issue where transactions would not be accepted when they have only
'yParity' and not 'v'.

* rpc: improve performance of subscription notification encoding (ethereum#28328)

It turns out that encoding json.RawMessage is slow because
package json basically parses the message again to ensure it is valid.
We can avoid the slowdown by encoding the entire RPC notification once,
which yields a 30% speedup.

* cmd/utils: validate pre-existing genesis in --dev mode (ethereum#28468)

geth --dev can be used with an existing data directory and genesis block. Since
dev mode only works with PoS, we need to verify that the merge has happened.

Co-authored-by: Felix Lange <fjl@twurst.com>

* cmd/geth: add support for --dev flag in dumpgenesis (ethereum#28463)


Co-authored-by: Felix Lange <fjl@twurst.com>
Co-authored-by: lightclient <lightclient@protonmail.com>

* les/vflux: run tests in parallel (ethereum#28524)

* cmd/{geth,utils}: add cmd to export preimages in snap enumeration order (ethereum#28256)

Adds a subcommand: `geth snapshot export-preimages`, to export preimages of every hash found during a snapshot enumeration: that is, it exports _only the active state_, and not _all_ preimages that have been used but are no longer part of the state. 

This tool is needed for the verkle transition, in order to distribute the preimages needed for the conversion. Since only the 'active' preimages are exported, the output is shrunk from ~70GB to ~4GB.

The order of the output is the order used by the snapshot enumeration, which avoids database thrashing. However, it also means that storage-slot preimages are not deduplicated.

* cmd/geth: fix build error (ethereum#28585)

* cmd/devp2p/internal/ethtest: undo debug-hack (ethereum#28588)

cmd/devp2p/internal/ethtest: remove a debug-hack flaw which prevented certain tests from running

* params: update discV5 bootnodes (ethereum#28562)

update discV5 bootnodes from https://github.com/eth-clients/eth2-networks/blob/master/shared/mainnet/bootstrap_nodes.txt

* cmd, les, tests: remove light client code (ethereum#28586)

* cmd, les, tests: remove light client code

This commit removes the light client (LES) code.
Since the merge the light client has been broken and
it is hard to maintain it alongside the normal client.
We decided it would be best to remove it for now and
maybe rework and reintroduce it in the future.

* cmd, eth: remove some more mentions of light mode

* cmd: re-add flags and mark as deprecated

* cmd: warn the user about deprecated flags

* eth: better error message

* eth, internal/ethapi: drop some weird indirection (ethereum#28597)

* trie: fix random test generator early terminate (ethereum#28590)

This change fixes a minor bug in the `randTest.Generate` function, which caused the `quick.Check` to be a no-op.

* eth/gasestimator, internal/ethapi: move gas estimator out of rpc (ethereum#28600)

* go.mod: update uint256 to v1.2.4 (ethereum#28612)

* eth/catalyst, eth/downloader: expose more sync information (ethereum#28584)

This change exposes more information from sync module internally

* light: remove package light (ethereum#28614)

This change removes the package 'light', which is currently unused.

* cmd/evm, core/state: fix post-exec dump of state (statetests, blockchaintests) (ethereum#28504)

There were several problems related to dumping state. 

- If a preimage was missing, even if we had set the `OnlyWithAddresses` to `false`, to export them anyway, the way the mapping was constructed (using `common.Address` as key) made the entries get lost anyway. Concerns both state- and blockchain tests. 
- Blockchain test execution was not configured to store preimages.

This change makes it so that the block test executor takes a callback, just like the state test executor already does. This callback can be used to examine the post-execution state, e.g. to aid debugging of test failures.

* ethereum: remove TODO comment about subscription (ethereum#28609)

* eth/tracers/js: fix type inconsistencies (ethereum#28488)

This change fixes two type-inconsistencies in the JS tracer:

- In most places we return byte arrays as a `Uint8Array` to the tracer. However it seems we missed doing the conversion for `ctx` fields which are passed to the tracer during `result`. They are passed as simple arrays. I think Uint8Arrays are more suitable and we should change this inconsistency. Note: this will be a breaking-change. But I believe the effect is small. If we look at our tracers we see that these fields (`ctx.from`, `ctx.to`, etc.) are used in 2 ways. Passed to `toHex` which takes both array or buffer. Or the length was measured which is the same for both types.
- `slice` takes `int, int` params, whereas `memory.slice` takes `int64, int64` params. I suggest changing the `slice` types to `int64`. This should have almost no effect in any case.

* crypto/secp256k1: fix 32-bit tests when CGO_ENABLED=0 (ethereum#28602)

* consensus: verify the nonexistence of shanghai- and cancun-specific header fields (ethereum#28605)

* eth/gasestimator: allow slight estimation error in favor of less iterations (ethereum#28618)

* eth/gasestimator: early exit for plain transfer and error allowance

* core, eth/gasestimator: hard guess at a possible required gas

* internal/ethapi: update estimation tests with the error ratio

* eth/gasestimator: I hate you linter

* graphql: fix gas estimation test

---------

Co-authored-by: Oren <orenyomtov@users.noreply.github.com>

* all: replace log15 with slog (ethereum#28187)

This PR replaces Geth's logger package (a fork of [log15](https://github.com/inconshreveable/log15)) with an implementation using slog, a logging library included as part of the Go standard library as of Go1.21.

Main changes are as follows:
* removes any log handlers that were unused in the Geth codebase.
* Json, logfmt, and terminal formatters are now slog handlers.
* Verbosity level constants are changed to match slog constant values.  Internal translation is done to make this opaque to the user and backwards compatible with existing `--verbosity` and `--vmodule` options.
* `--log.backtraceat` and `--log.debug` are removed.

The external-facing API is largely the same as the existing Geth logger.  Logger method signatures remain unchanged.

A small semantic difference is that a `Handler` can only be set once per `Logger` and not changed dynamically.  This just means that a new logger must be instantiated every time the handler of the root logger is changed.

----
For users of the `go-ethereum/log` module. If you were using this module for your own project, you will need to change the initialization. If you previously did 
```golang
log.Root().SetHandler(log.LvlFilterHandler(log.LvlInfo, log.StreamHandler(os.Stderr, log.TerminalFormat(true))))
```
You now instead need to do 
```golang
log.SetDefault(log.NewLogger(log.NewTerminalHandlerWithLevel(os.Stderr, log.LevelInfo, true)))
```
See more about reasoning here: ethereum#28558 (comment)

* core/state: make stateobject.create selfcontain (ethereum#28459)

* trie/triedb/hashdb: take lock around access to dirties cache (ethereum#28542)

Add read locking of db lock around access to dirties cache in hashdb.Database to prevent
data race versus hashdb.Database.dereference which can modify the dirties map by deleting
an item.

Fixes ethereum#28541

---------

Co-authored-by: Gary Rong <garyrong0905@gmail.com>

* accounts/abi/bind: fix typo (ethereum#28630)

* slog: faster and less memory-consumption (ethereum#28621)

These changes improve the performance of the non-coloured terminal formatting, _quite a lot_.

```
name               old time/op    new time/op    delta
TerminalHandler-8    10.2µs ±15%     5.4µs ± 9%  -47.02%  (p=0.008 n=5+5)

name               old alloc/op   new alloc/op   delta
TerminalHandler-8    2.17kB ± 0%    0.40kB ± 0%  -81.46%  (p=0.008 n=5+5)

name               old allocs/op  new allocs/op  delta
TerminalHandler-8      33.0 ± 0%       5.0 ± 0%  -84.85%  (p=0.008 n=5+5)
```

I tried to _somewhat_ organize the commits, but it might still be a bit chaotic. Some core insights:

- The function `terminalHandler.Handl` uses a mutex, and writes all output immediately to 'upstream'. Thus, it can reuse a scratch-buffer every time. 
- This buffer can be propagated internally, making all the internal formatters either write directly to it,
- OR, make  use of the `tmp := buf.AvailableBuffer()` in some cases, where a byte buffer "extra capacity" can be temporarily used. 
- The `slog` package  uses `Attr` by value. It makes sense to minimize operating on them, since iterating / collecting into a new slice, iterating again etc causes copy-on-heap. Better to operate on them only once. 
- If we want to do padding, it's better to copy from a constant `space`-buffer than to invoke `bytes.Repeat` every single time.

* eth/tracers: tx-level state in debug_traceCall (ethereum#28460)

* cmd/evm: fix Env struct json tag (ethereum#28635)

* accounts/abi/bind: fixed typos (ethereum#28634)

* Update auth.go

* Update backend.go

* Update bind.go

* Update bind_test.go

* eth/fetcher: fix invalid tracking of received at time for block (ethereum#28637)

eth/fetcher: fix invalid tracking of received at time

* accounts: run tests in parallel (ethereum#28544)

* eth/tracers/logger: make structlog/json-log stack hex again (ethereum#28628)

* common/hexutil: define hex wrappers for uint256.Int

* eth/tracers/logger: make structlog/json-log stack hex again

* common/hexutil: goimports

* log: remove lazy, remove unused interfaces, unexport methods (ethereum#28622)

This change 

- Removes interface `log.Format`, 
- Removes method `log.FormatFunc`, 
- unexports `TerminalHandler.TerminalFormat` formatting methods (renamed to `TerminalHandler.format`)
- removes the notion of `log.Lazy` values


The lazy handler was useful in the old log package, since it
could defer the evaluation of costly attributes until later in the
log pipeline: thus, if the logging was done at 'Trace', we could
skip evaluation if logging only was set to 'Info'.

With the move to slog, this way of deferring evaluation is no longer
needed, since slog introduced 'Enabled': the caller can thus do
the evaluate-or-not decision at the callsite, which is much more
straight-forward than dealing with lazy reflect-based evaluation.

Also, lazy evaluation would not work with 'native' slog, as in, these
two statements would be evaluated differently:

```golang
  log.Info("foo", "my lazy", lazyObj)
  slog.Info("foo", "my lazy", lazyObj)
```

* .github: use github actions to run 32-bit linux tests (ethereum#28549)

use github actions to run 32-bit linux tests

* ethdb/pebble: remove a dependency (ethereum#28627)

The dependency was not really used anyway, so we can get rid of it.

Co-authored-by: Felix Lange <fjl@twurst.com>

* tests/fuzzers/bls12381: deactivate BLS fuzzer when CGO_ENABLED=0 (ethereum#28653)

tests/fuzzers/bls12381: deactivate fuzzer when CGO_ENABLED=0

* build: upgrade -dlgo version to Go 1.21.5 (ethereum#28648)

* rpc: fix ns/µs mismatch in metrics (ethereum#28649)

The rpc/duration/all meter was in nanoseconds, the individual meter in microseconds.
This PR changes it so both of them use nanoseconds.

* cmd/evm: fix dump after state-test exec (ethereum#28650)

The dump after state-test didn't work; the problem was an error, "Already committed", which was silently ignored.

This change re-initialises the state, so the dumping works again.

* beacon/light: add CommitteeChain (ethereum#27766)

This change implements CommitteeChain which is a key component of the beacon light client. It is a passive data structure that can validate, hold and update a chain of beacon light sync committees and updates, starting from a checkpoint that proves the starting committee through a beacon block hash, header and corresponding state. Once synced to the current sync period, CommitteeChain can also validate signed beacon headers.

* cmd/utils, eth: disallow invalid snap sync / snapshot flag combos (ethereum#28657)

* eth: prevent startup in snap mode without snapshots

* cmd/utils: try to fix bad flag combos wrt snap sync and snapshot generation

* trie: remove inconsistent trie nodes during sync in path mode (ethereum#28595)

This fixes a database corruption issue that could occur during state healing.
When sync is aborted while certain modifications were already committed, and a
reorg occurs, the database would contain incorrect trie nodes stored by path.
These nodes need to be detected and deleted in order to obtain a complete and fully correct state
after state healing.
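
The idea of detecting and deleting a stale path-keyed node can be sketched with a toy store. Everything here is an assumption made for illustration: `nodeStore` and `heal` are hypothetical, hashes are plain strings, and the real logic in the trie sync code is considerably more involved.

```go
package main

import "fmt"

// nodeStore is a toy path-keyed store mapping a node path to a node hash.
// It is a hypothetical stand-in, not go-ethereum's database layout.
type nodeStore map[string]string

// heal compares the node stored at path with the hash its parent expects.
// It reports whether the node must be fetched, and deletes a mismatching
// entry so that it cannot survive an aborted sync cycle.
func (s nodeStore) heal(path, want string) (fetch bool) {
	got, ok := s[path]
	if ok && got == want {
		return false // already consistent, nothing to do
	}
	if ok {
		delete(s, path) // stale node left over from an earlier sync target
	}
	return true
}

func main() {
	store := nodeStore{"": "root-v2", "0": "child-v1"}

	fmt.Println(store.heal("", "root-v2"))   // prints false
	fmt.Println(store.heal("0", "child-v2")) // prints true

	_, stillThere := store["0"]
	fmt.Println(stillThere) // prints false
}
```

Deleting the mismatching entry right away, rather than leaving it to be overwritten later, means an interrupted cycle leaves a gap to fill instead of a wrong node under a valid path.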

---------

Co-authored-by: Felix Lange <fjl@twurst.com>

* cmd/utils: fix HTTPHost, WSHost flag priority (ethereum#28669)


Co-authored-by: Felix Lange <fjl@twurst.com>

* eth/protocols/eth: fix typos in comments (ethereum#28652)

* core/txpool : small cleanup refactors (ethereum#28654)

* eth/fetcher, eth/gasestimator: fix typos in comments (ethereum#28675)

* all: fix typos in comments (ethereum#28662)


Co-authored-by: Felix Lange <fjl@twurst.com>

* miner: eliminate the dead loop possibility for `newWorkLoop` and `mainLoop` (ethereum#28677)

discard the intervalAdjust message if the channel is full
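
Discarding a message when the channel is full is usually written as a non-blocking send. The following is a generic sketch of that pattern, assuming a buffered channel of `int`; `trySend` is a hypothetical helper and not the miner's actual code.

```go
package main

import "fmt"

// trySend delivers v without blocking. If the channel buffer is full,
// v is dropped and false is returned, so the sender can never stall.
func trySend(ch chan<- int, v int) bool {
	select {
	case ch <- v:
		return true
	default:
		return false
	}
}

func main() {
	ch := make(chan int, 1)
	fmt.Println(trySend(ch, 1)) // prints true
	fmt.Println(trySend(ch, 2)) // prints false, buffer is full so 2 is discarded
}
```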

* all: fix typos in comments (ethereum#28682)

chore(core,eth):fix a couple of typos

* p2p/discover: add liveness check in collectTableNodes (ethereum#28686)

* p2p/discover: add liveness check in collectTableNodes

* p2p/discover: fix test

* p2p/discover: rename to appendLiveNodes

* p2p/discover: add dedup logic back

* p2p/discover: simplify

* p2p/discover: fix issue found by test

* internal/flags: add missing flag types for auto-env-var generation (ethereum#28692)

Certain flags, such as `--rpc.txfeecap`, currently do not have an env-var auto-generated for them. This change adds three missing cli flag types to the auto env-var helper function to fix this.
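
As a rough illustration of what auto-generating an env-var name from a flag name could look like, consider the sketch below. The `envVarName` helper, the `GETH` prefix, and the replacement rules are assumptions made for this example, not the helper function touched by this change.

```go
package main

import (
	"fmt"
	"strings"
)

// envVarName derives an environment variable name from a CLI flag name:
// leading dashes are stripped, "." and "-" become "_", and the result is
// upper-cased behind the given prefix. The naming scheme is an assumption.
func envVarName(prefix, flag string) string {
	name := strings.TrimLeft(flag, "-")
	name = strings.NewReplacer(".", "_", "-", "_").Replace(name)
	return prefix + "_" + strings.ToUpper(name)
}

func main() {
	fmt.Println(envVarName("GETH", "--rpc.txfeecap")) // prints GETH_RPC_TXFEECAP
}
```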

* cmd/evm:  default to mirror mainnet forks enabled (ethereum#28691)

cmd/evm: default to using dev chain config (all mainnet HFs activated at block/timestamp 0)

* cmd/evm, cmd/clef, cmd/bootnode: fix / unify logging (ethereum#28696)

This change fixes a problem with our non-core binaries: evm, clef, bootnode.

First of all, they failed to convert from legacy loglevels 1 to 5, to the new slog loglevels -4 to 4.

Secondly, the logging was actually set up in the init phase, and then overridden in the main. This is not needed for evm, since it used the same flag name as the main geth verbosity. Better to let internal/flags handle the logging init.

* cmd/evm: t8n support custom tracers (ethereum#28557)

This change implements the ability for the `evm t8n` tool to use custom tracers: either 'native' golang tracers or javascript tracers.

* params: release go-ethereum v1.13.6 stable

* Fix build errors

* Fix test-integration

---------

Co-authored-by: Guillaume Ballet <3272758+gballet@users.noreply.github.com>
Co-authored-by: Sina Mahmoodi <1591639+s1na@users.noreply.github.com>
Co-authored-by: Martin Holst Swende <martin@swende.se>
Co-authored-by: VM <112189277+sysvm@users.noreply.github.com>
Co-authored-by: VM <arimas@foxmail.com>
Co-authored-by: jwasinger <j-wasinger@hotmail.com>
Co-authored-by: Zoro <40222601+BabyHalimao@users.noreply.github.com>
Co-authored-by: Håvard Anda Estensen <haavard.ae@gmail.com>
Co-authored-by: aliening <128203330+aliening@users.noreply.github.com>
Co-authored-by: Halimao <1065621723@qq.com>
Co-authored-by: danceratopz <danceratopz@gmail.com>
Co-authored-by: levisyin <150114626+levisyin@users.noreply.github.com>
Co-authored-by: jp-imx <109574657+jp-imx@users.noreply.github.com>
Co-authored-by: rjl493456442 <garyrong0905@gmail.com>
Co-authored-by: Haotian <51777534+tmelhao@users.noreply.github.com>
Co-authored-by: haotian <haotian@haotiandeMacBook-Air.local>
Co-authored-by: Marius van der Wijden <m.vanderwijden@live.de>
Co-authored-by: Maciej Kulawik <10907694+magicxyyz@users.noreply.github.com>
Co-authored-by: ucwong <ucwong@126.com>
Co-authored-by: Mario Vega <marioevz@gmail.com>
Co-authored-by: Delweng <delweng@gmail.com>
Co-authored-by: Felix Lange <fjl@twurst.com>
Co-authored-by: lightclient <lightclient@protonmail.com>
Co-authored-by: Mikel Cortes <45786396+cortze@users.noreply.github.com>
Co-authored-by: Péter Szilágyi <peterke@gmail.com>
Co-authored-by: Ng Wei Han <47109095+weiihann@users.noreply.github.com>
Co-authored-by: lightclient <14004106+lightclient@users.noreply.github.com>
Co-authored-by: Shivam Sandbhor <shivam.sandbhor@gmail.com>
Co-authored-by: Jakub Freebit <49676311+jakub-freebit@users.noreply.github.com>
Co-authored-by: Oren <orenyomtov@users.noreply.github.com>
Co-authored-by: BorkBorked <107079055+BorkBorked@users.noreply.github.com>
Co-authored-by: ddl <dengdiliang@gmail.com>
Co-authored-by: Manav Darji <manavdarji.india@gmail.com>
Co-authored-by: Marius Kjærstad <sandakersmann@users.noreply.github.com>
Co-authored-by: Felföldi Zsolt <zsfelfoldi@gmail.com>
Co-authored-by: Ford <153042616+guerrierindien@users.noreply.github.com>
Co-authored-by: Ursulafe <152976968+Ursulafe@users.noreply.github.com>
Co-authored-by: Elias Rad <146735585+nnsW3@users.noreply.github.com>
Co-authored-by: FletcherMan <fanciture@163.com>
Co-authored-by: alex <152680487+bodhi-crypo@users.noreply.github.com>
Co-authored-by: Sebastian Stammler <seb@oplabs.co>
This pull request was closed.
Development

Successfully merging this pull request may close these issues.

Unexpected trie node error occurs after initial snap sync