core: add code to witness when state object is accessed #30698

lightclient · 2024-10-30T00:24:13Z

I think the core code should generally be agnostic about the witness and the statedb layer should determine what elements need to be included in the witness. Because code is accessed via GetCode, and GetCodeLength, the statedb will always know when it needs to add that code into the witness.

The edge case is block hashes, so we continue to add them manually in the implementation of BLOCKHASH.

It probably makes sense to refactor statedb so we have a wrapped implementation that accumulates the witness, but this is a simpler change that makes #30078 less aggressive.

jwasinger

One edge case I can think of where this could unnecessarily include code in the witness is accessing the balance of an account via the BALANCE opcode. There may be others, but I'm not sure.

That being said, the motivation behind this makes sense to me, it just seems like it may require a bit more refactoring within the statedb to get it right.

jwasinger · 2024-10-30T08:07:09Z

I guess we don't need to attach code unless it's executed or the length is needed. The codehash can be trusted because we already trust the parent root.

karalabe · 2024-10-30T08:29:39Z

I'm not sure this change is all that good. The original code does the minimum work and gathers the minimum data. This PR on one hand collects more data than needed (e.g. @jwasinger's case) and also will hit this code load path on every storage slot modification (we "get" the account as the first step).

karalabe · 2024-10-30T08:30:50Z

Note, I'm happy to move the collector methods form the opcodes into the respective statedb calls, just not this high up the stack imo.

namiloh · 2024-10-30T08:56:51Z

Did you try putting it in a separate layer? Curious how that would work...

lightclient · 2024-10-30T14:13:21Z

Ah right I put the collection way too high up in the stack. I've move it to GetCode and GetCodeSize.

Did you try putting it in a separate layer? Curious how that would work...

I started doing this and then realized I would need to come up with a more comprehensive interface for statedb since core/blockchain simply operates on *state.StateDB. It still seems possible to make it as a wrapper, but at least to fix separation concerns that were affecting my other PR, this change here should be enough.

rjl493456442

sgtm

holiman

These changes looks good to me and I approve.

It's a bit annoying, however, that remaining little witness-invocation inside the instructions.

I took a stab at removing it, got this far:

diff --git a/core/evm.go b/core/evm.go
index 5d3c454d7c..f700c54c9c 100644
--- a/core/evm.go
+++ b/core/evm.go
@@ -22,6 +22,7 @@ import (
 	"github.com/ethereum/go-ethereum/common"
 	"github.com/ethereum/go-ethereum/consensus"
 	"github.com/ethereum/go-ethereum/consensus/misc/eip4844"
+	"github.com/ethereum/go-ethereum/core/stateless"
 	"github.com/ethereum/go-ethereum/core/tracing"
 	"github.com/ethereum/go-ethereum/core/types"
 	"github.com/ethereum/go-ethereum/core/vm"
@@ -65,7 +66,7 @@ func NewEVMBlockContext(header *types.Header, chain ChainContext, author *common
 	return vm.BlockContext{
 		CanTransfer: CanTransfer,
 		Transfer:    Transfer,
-		GetHash:     GetHashFn(header, chain),
+		GetHash:     GetHashFn(header, chain, nil),
 		Coinbase:    beneficiary,
 		BlockNumber: new(big.Int).Set(header.Number),
 		Time:        header.Time,
@@ -91,12 +92,11 @@ func NewEVMTxContext(msg *Message) vm.TxContext {
 }
 
 // GetHashFn returns a GetHashFunc which retrieves header hashes by number
-func GetHashFn(ref *types.Header, chain ChainContext) func(n uint64) common.Hash {
+func GetHashFn(ref *types.Header, chain ChainContext, witness *stateless.Witness) func(n uint64) common.Hash {
 	// Cache will initially contain [refHash.parent],
 	// Then fill up with [refHash.p, refHash.pp, refHash.ppp, ...]
 	var cache []common.Hash
-
-	return func(n uint64) common.Hash {
+	getter := func(n uint64) common.Hash {
 		if ref.Number.Uint64() <= n {
 			// This situation can happen if we're doing tracing and using
 			// block overrides.
@@ -127,6 +127,16 @@ func GetHashFn(ref *types.Header, chain ChainContext) func(n uint64) common.Hash
 		}
 		return common.Hash{}
 	}
+	// If we need to build a witness: wrap the getter and also invoke the witness.
+	// Note: this method is only invoked with santized values in the allowed
+	// range of block numbers.
+	if witness != nil {
+		return func(n uint64) common.Hash {
+			witness.AddBlockHash(n)
+			return getter(n)
+		}
+	}
+	return getter
 }
 
 // CanTransfer checks whether there are enough funds in the address' account to make a transfer.
diff --git a/core/vm/instructions.go b/core/vm/instructions.go
index 45139e8217..c5dda9e4d5 100644
--- a/core/vm/instructions.go
+++ b/core/vm/instructions.go
@@ -445,9 +445,6 @@ func opBlockhash(pc *uint64, interpreter *EVMInterpreter, scope *ScopeContext) (
 	}
 	if num64 >= lower && num64 < upper {
 		res := interpreter.evm.Context.GetHash(num64)
-		if witness := interpreter.evm.StateDB.Witness(); witness != nil {
-			witness.AddBlockHash(num64)
-		}
 		num.SetBytes(res[:])
 	} else {
 		num.Clear()
diff --git a/core/vm/interface.go b/core/vm/interface.go
index 9229f4d2cd..3fb5dc6ae7 100644
--- a/core/vm/interface.go
+++ b/core/vm/interface.go
@@ -20,7 +20,6 @@ import (
 	"math/big"
 
 	"github.com/ethereum/go-ethereum/common"
-	"github.com/ethereum/go-ethereum/core/stateless"
 	"github.com/ethereum/go-ethereum/core/tracing"
 	"github.com/ethereum/go-ethereum/core/types"
 	"github.com/ethereum/go-ethereum/params"
@@ -94,8 +93,6 @@ type StateDB interface {
 	AddLog(*types.Log)
 	AddPreimage(common.Hash, []byte)
 
-	Witness() *stateless.Witness
-
 	// Finalise must be invoked at the end of a transaction
 	Finalise(bool)
 }

If we drop that one, we can drop Witness from the interface. However, what's lacking in my code above is that when we construct the GetHash function, we don't have access to the witness. We typically construct the block context prior to instantiating a statedb.

However, wouldn't it be logical to have the block witness be a part of the block context?

I don't want to just graft a thing onto another thing, to make it work, but actually make sure to place it where it is sensible. WDYT? Can we take this PR one step further?

I think the core code should generally be agnostic about the witness and the statedb layer should determine what elements need to be included in the witness. Because code is accessed via `GetCode`, and `GetCodeLength`, the statedb will always know when it needs to add that code into the witness. The edge case is block hashes, so we continue to add them manually in the implementation of `BLOCKHASH`. It probably makes sense to refactor statedb so we have a wrapped implementation that accumulates the witness, but this is a simpler change that makes ethereum#30078 less aggressive.

I think the core code should generally be agnostic about the witness and the statedb layer should determine what elements need to be included in the witness. Because code is accessed via `GetCode`, and `GetCodeLength`, the statedb will always know when it needs to add that code into the witness. The edge case is block hashes, so we continue to add them manually in the implementation of `BLOCKHASH`. It probably makes sense to refactor statedb so we have a wrapped implementation that accumulates the witness, but this is a simpler change that makes #30078 less aggressive.

lightclient requested review from karalabe, holiman and rjl493456442 as code owners October 30, 2024 00:24

jwasinger reviewed Oct 30, 2024

View reviewed changes

core: add code to witness when state object is accessed

1b91ab4

lightclient force-pushed the statedb-witness-add-code branch from 0edfb0f to 1b91ab4 Compare October 30, 2024 14:09

rjl493456442 approved these changes Oct 31, 2024

View reviewed changes

rjl493456442 added this to the 1.14.12 milestone Oct 31, 2024

jwasinger approved these changes Oct 31, 2024

View reviewed changes

holiman approved these changes Oct 31, 2024

View reviewed changes

fjl assigned holiman, rjl493456442, karalabe and jwasinger and unassigned holiman and rjl493456442 Oct 31, 2024

karalabe merged commit 9afb18d into ethereum:master Oct 31, 2024
3 checks passed

BrewTestBot mentioned this pull request Nov 19, 2024

ethereum 1.14.12 Homebrew/homebrew-core#198233

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

core: add code to witness when state object is accessed #30698

core: add code to witness when state object is accessed #30698

lightclient commented Oct 30, 2024 •

edited

Loading

jwasinger left a comment •

edited

Loading

jwasinger commented Oct 30, 2024

karalabe commented Oct 30, 2024

karalabe commented Oct 30, 2024

namiloh commented Oct 30, 2024

lightclient commented Oct 30, 2024

rjl493456442 left a comment

holiman left a comment

core: add code to witness when state object is accessed #30698

core: add code to witness when state object is accessed #30698

Conversation

lightclient commented Oct 30, 2024 • edited Loading

jwasinger left a comment • edited Loading

Choose a reason for hiding this comment

jwasinger commented Oct 30, 2024

karalabe commented Oct 30, 2024

karalabe commented Oct 30, 2024

namiloh commented Oct 30, 2024

lightclient commented Oct 30, 2024

rjl493456442 left a comment

Choose a reason for hiding this comment

holiman left a comment

Choose a reason for hiding this comment

lightclient commented Oct 30, 2024 •

edited

Loading

jwasinger left a comment •

edited

Loading