core, core/rawdb, eth/sync: no tx indexing during snap sync #28703

rjl493456442 · 2023-12-19T03:56:48Z

This pull request simplifies the logic for indexing transactions and improves the UX when a transaction is not found by returning more information to users.

Originally, Geth constructed the transaction indexes while the node was still in snap sync. This approach ensures that all required transaction indexes are available once the sync finishes. However, it adds a lot of complexity to indexing transactions correctly.

Issue #28673 reports that when a node is initialized with an external ancient store and runs a snap sync with history.transactions = 0, the indexes for the ancient part are lost; in other words, these ancient blocks are never indexed.

In order to simplify the entire process, this pull request changes the strategy to construct transaction indexes only once the snap sync is finished.

We lose the guarantee that transaction indexes are available immediately once the snap sync finishes. To reduce confusion, Geth can instead return a message in the RPC response when the specified transaction is not found and transaction indexing is still in progress.

Besides, transaction indexing is now considered part of the initial sync: eth.syncing will still be true if the background transaction indexing has not finished yet. API consumers can follow the syncing status to determine whether the node is usable.

For example, while the node is still syncing, the transaction indexing progress is also attached:

To exit, press ctrl-d or type exit
> eth.syncing
{
  currentBlock: 19060318,
  healedBytecodeBytes: 0,
  healedBytecodes: 0,
  healedTrienodeBytes: 0,
  healedTrienodes: 0,
  healingBytecode: 0,
  healingTrienodes: 0,
  highestBlock: 19060400,
  startingBlock: 19060298,
  syncedAccountBytes: 36505370223,
  syncedAccounts: 169452356,
  syncedBytecodeBytes: 5977956986,
  syncedBytecodes: 871976,
  syncedStorage: 788911086,
  syncedStorageBytes: 175632499638,
  txIndexFinishedBlocks: 0,
  txIndexRemainingBlocks: 2350000
}

> eth.syncing
{
  currentBlock: 19060773,
  healedBytecodeBytes: 261340,
  healedBytecodes: 43,
  healedTrienodeBytes: 193350007,
  healedTrienodes: 666738,
  healingBytecode: 0,
  healingTrienodes: 0,
  highestBlock: 19060684,
  startingBlock: 19060682,
  syncedAccountBytes: 50474316442,
  syncedAccounts: 233879357,
  syncedBytecodeBytes: 8108706003,
  syncedBytecodes: 1164773,
  syncedStorage: 1104622696,
  syncedStorageBytes: 246053347118,
  txIndexFinishedBlocks: 2347665,
  txIndexRemainingBlocks: 2335
}
> eth.syncing
false
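A consumer could gate traffic on this status. The sketch below decodes the "result" field of a raw eth_syncing JSON-RPC response, which is either false or an object. It assumes the two indexing counters arrive on the wire as 0x-prefixed hex quantities (the console above renders them as plain numbers), and syncStatus, parseQuantity and parseSyncing are hypothetical names for illustration, not part of Geth.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strconv"
	"strings"
)

// syncStatus is a hypothetical, trimmed-down view of an eth_syncing result.
type syncStatus struct {
	Syncing                bool
	TxIndexFinishedBlocks  uint64
	TxIndexRemainingBlocks uint64
}

// parseQuantity decodes a 0x-prefixed hex quantity such as "0x91f".
// An empty string (field absent) is treated as zero.
func parseQuantity(s string) (uint64, error) {
	if s == "" {
		return 0, nil
	}
	if !strings.HasPrefix(s, "0x") {
		return 0, fmt.Errorf("quantity %q lacks 0x prefix", s)
	}
	return strconv.ParseUint(s[2:], 16, 64)
}

// parseSyncing interprets the "result" field of an eth_syncing response.
func parseSyncing(raw []byte) (syncStatus, error) {
	// Case 1: a bare boolean, i.e. false once the node is fully synced.
	var flag bool
	if err := json.Unmarshal(raw, &flag); err == nil {
		return syncStatus{Syncing: flag}, nil
	}
	// Case 2: a progress object; only the indexing counters are read here.
	var obj struct {
		Finished  string `json:"txIndexFinishedBlocks"`
		Remaining string `json:"txIndexRemainingBlocks"`
	}
	if err := json.Unmarshal(raw, &obj); err != nil {
		return syncStatus{}, err
	}
	finished, err := parseQuantity(obj.Finished)
	if err != nil {
		return syncStatus{}, err
	}
	remaining, err := parseQuantity(obj.Remaining)
	if err != nil {
		return syncStatus{}, err
	}
	return syncStatus{Syncing: true, TxIndexFinishedBlocks: finished, TxIndexRemainingBlocks: remaining}, nil
}

func main() {
	for _, raw := range []string{
		`{"txIndexFinishedBlocks":"0x23d291","txIndexRemainingBlocks":"0x91f"}`,
		`false`,
	} {
		st, err := parseSyncing([]byte(raw))
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("syncing=%v finished=%d remaining=%d\n",
			st.Syncing, st.TxIndexFinishedBlocks, st.TxIndexRemainingBlocks)
	}
}
```

A load balancer built along these lines would keep a node out of rotation while Syncing is true, regardless of whether the remaining work is state download or transaction indexing.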

I have tested indexing performance on Ethereum mainnet: indexing the entire chain takes roughly 2 hours, and the default setting (2.35M blocks) takes about 20 minutes. This means that for up to ~2 hours after the initial sync, transactions might not be available from RPC.

Transaction index performance

Indexing 2.35M blocks takes about 20 minutes.

INFO [12-20|05:34:00.558] Indexed transactions blocks=2,350,000 txs=345,795,538 tail=16,474,969 elapsed=20m26.970s

Indexing the first 16M blocks takes 1h35m.

INFO [12-20|09:31:10.757] Indexed transactions blocks=16,475,763 txs=1,850,658,458 tail=0 elapsed=1h35m48.611s

Indexing the entire mainnet chain takes ~2h.
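For a rough sense of throughput, the two log lines above can be turned into blocks per second and transactions per second. This is only back-of-the-envelope arithmetic over the logged counts and elapsed times; rate is a hypothetical helper, and the figures say nothing about other hardware or chain ranges.

```go
package main

import (
	"fmt"
	"time"
)

// rate returns items processed per second, given a count and an elapsed
// value formatted as in the log lines (e.g. "20m26.970s").
func rate(count uint64, elapsed string) (float64, error) {
	d, err := time.ParseDuration(elapsed)
	if err != nil {
		return 0, err
	}
	if d <= 0 {
		return 0, fmt.Errorf("non-positive duration %q", elapsed)
	}
	return float64(count) / d.Seconds(), nil
}

func main() {
	// Values copied from the two "Indexed transactions" log lines.
	runs := []struct {
		blocks, txs uint64
		elapsed     string
	}{
		{2_350_000, 345_795_538, "20m26.970s"},
		{16_475_763, 1_850_658_458, "1h35m48.611s"},
	}
	for _, r := range runs {
		bps, err := rate(r.blocks, r.elapsed)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		tps, err := rate(r.txs, r.elapsed)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%d blocks in %s: %.0f blocks/s, %.0f txs/s\n", r.blocks, r.elapsed, bps, tps)
	}
}
```

The older range indexes more blocks per second, which is consistent with early mainnet blocks carrying fewer transactions each.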

Index progress info when tx is not found

> eth.getTransaction("0x6d82842cf80d3c1a35cb03e5e52b98c1968983223422d01d1b571b5ef0983b19")
Error: transaction indexing is in progress
	at web3.js:6367:9(39)
	at send (web3.js:5101:62(29))
	at <eval>:1:19(3)

> debug.traceTransaction("0x6d82842cf80d3c1a35cb03e5e52b98c1968983223422d01d1b571b5ef0983b19")
Error: transaction indexing is in progress
	at web3.js:6367:9(39)
	at send (web3.js:5101:62(29))
	at <eval>:1:23(3)

> eth.getTransactionReceipt("0x6d82842cf80d3c1a35cb03e5e52b98c1968983223422d01d1b571b5ef0983b19")
Error: transaction indexing is in progress
	at web3.js:6367:9(39)
	at send (web3.js:5101:62(29))
	at <eval>:1:26(3)

> eth.getRawTransaction("0x681ce931a6897452e27c1fe3754e046991c7834b93c77038e4ae534c50854da7")
Error: transaction indexing is in progress
	at web3.js:6367:9(39)
	at send (web3.js:5101:62(29))
	at <eval>:1:22(3)

holiman

Aside from one minor flaw (?), this looks good to me. I have only skimmed it, not actively tested it.

core/blockchain.go

ryanschneider · 2023-12-20T21:16:10Z

eth/api_backend.go

-	return tx, blockHash, blockNumber, index, nil
+	lookup, tx, err := b.eth.blockchain.GetTransactionLookup(txHash)
+	if err != nil {
+		return nil, common.Hash{}, 0, 0, fmt.Errorf("tx is not existent or not indexed, %w", err)


Note that this is a behavioral change for the eth_getTransactionByHash RPC, which previously returned null in this case. This has a couple of ramifications that should be considered:

I imagine a fair number of upstream projects will be "surprised" by this RPC now potentially returning errors. For example, I'm pretty sure most wallets use this method to determine whether a newly broadcast tx has reached the public mempool.

The error "leaks" configuration settings of the node (the index head and tail).

Good point, @fjl do you have any suggestion about the RPC behavioral change?

For example I'm pretty sure most wallets use this method to determine whether a newly broadcast tx has reached the public mempool.

If the tx is in the txpool, then it will be found for sure,
If the tx is not in the txpool yet, the error will be returned instead of returning nil, as you said.

I will ask team about their opinions.

Yeah we can't have this error. getTransactionByHash has to return null when the tx is not known to the node. It can only return an error if there is an internal error accessing the database or something.

I mean, it is an internal error from one perspective: the indexing is not done so the results are not reliable. From that perspective, the error message feels correct.

Tho I wouldn't add that much info since it's hard to interpret. Probably "transactions still indexing" or something along those lines is enough.

One possible solution we could do is to make indexing part of the sync. I.e. after snap sync is "done", we could have a next phase with indexing, whilst eth_syncing does not return false.

Still, the quirk is that the current block currently kind of means that everything that's needed has been done. On the other hand, IMO a syncing node should not be just dropped into prod serving RPC.

The main point is, we can't just return an error when the tx doesn't exist. We can return an error while we are still indexing, but the code here doesn't do that. If we decide to add an error during indexing, it would have to be identifiable by error code. This is a decision affecting all API consumers and also proxy/LB implementations like Infura's. They all need to change their behavior to retry the lookup later or on a different node.

I have fixed the code.

The error is only returned if the transactions are not fully indexed.

it would have to be identifiable by error code

It's on my todo list.

What I still don't like about it, is that this would be a geth-specific error that all API consumers must pay attention to if they want to react to it properly. It really only makes sense if we can make eth_syncing return a sync status at the same time.

ryanschneider · 2024-01-03T17:03:11Z

FYI I confirmed here that this fixes the original issue: #28673 (comment)

eth/api_backend.go

holiman · 2024-01-16T13:51:22Z

Triage discussion

We should make it return an error iff the tx is not indexed, and indexing is ongoing
We should delay setting syncing=false until indexing is finished
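The second decision can be modelled as a simple predicate. This is a minimal sketch, not Geth's implementation: progress is a hypothetical snapshot type, and the real node's notion of "chain synced" is more involved than comparing two block numbers.

```go
package main

import "fmt"

// progress is a hypothetical snapshot of sync state used only in this sketch.
type progress struct {
	CurrentBlock           uint64
	HighestBlock           uint64
	TxIndexRemainingBlocks uint64
}

// stillSyncing reports true while either the chain is behind or the
// background transaction indexer still has blocks left to process.
func (p progress) stillSyncing() bool {
	return p.CurrentBlock < p.HighestBlock || p.TxIndexRemainingBlocks > 0
}

func main() {
	// Chain has caught up, but 2335 blocks remain to be indexed
	// (numbers taken from the second eth.syncing output in the description).
	indexing := progress{CurrentBlock: 19060773, HighestBlock: 19060684, TxIndexRemainingBlocks: 2335}
	// Chain has caught up and indexing is complete.
	done := progress{CurrentBlock: 19060773, HighestBlock: 19060684}

	fmt.Println(indexing.stillSyncing())
	fmt.Println(done.stillSyncing())
}
```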

core/blockchain_reader.go

holiman

LGTM

ryanschneider

lgtm!

…#28703) This change simplifies the logic for indexing transactions and enhances the UX when transaction is not found by returning more information to users. Transaction indexing is now considered as a part of the initial sync, and `eth.syncing` will thus be `true` if transaction indexing is not yet finished. API consumers can use the syncing status to determine if the node is ready to serve users.

islishude · 2024-08-14T11:30:01Z

internal/ethapi/api.go

 	return tx.MarshalBinary()
 }

 // GetTransactionReceipt returns the transaction receipt for the given transaction hash.
 func (s *TransactionAPI) GetTransactionReceipt(ctx context.Context, hash common.Hash) (map[string]interface{}, error) {
-	tx, blockHash, blockNumber, index, err := s.b.GetTransaction(ctx, hash)
-	if tx == nil || err != nil {


Could geth team reconsider this change?

It breaks hardhat.

An unexpected error occurred: ProviderError: transaction indexing is in progress
    at HttpProvider.request (/workspace/node_modules/hardhat/src/internal/core/providers/http.ts:96:21)
    at processTicksAndRejections (node:internal/process/task_queues:95:5)
    at async EIP1193JsonRpcClient.getTransactionReceipt (/workspace/node_modules/@nomicfoundation/ignition-core/src/internal/execution/jsonrpc-client.ts:569:22)
    at async Promise.all (index 1)
    at async monitorOnchainInteraction (/workspace/node_modules/@nomicfoundation/ignition-core/src/internal/execution/future-processor/handlers/monitor-onchain-interaction.ts:110:28)
    at async FutureProcessor.processFuture (/workspace/node_modules/@nomicfoundation/ignition-core/src/internal/execution/future-processor/future-processor.ts:116:9)
    at async ExecutionEngine._executeBatch (/workspace/node_modules/@nomicfoundation/ignition-core/src/internal/execution/execution-engine.ts:153:30)
    at async ExecutionEngine.executeModule (/workspace/node_modules/@nomicfoundation/ignition-core/src/internal/execution/execution-engine.ts:114:25)
    at async Deployer.deploy (/workspace/node_modules/@nomicfoundation/ignition-core/src/internal/deployer.ts:194:25)
    at async SimpleTaskDefinition.action (/workspace/node_modules/@nomicfoundation/hardhat-ignition/src/index.ts:302:24)

I'm also unhappy with this, and there have been quite a few reports from across the ecosystem where this PR caused subtle issues. However, we ultimately decided to go ahead with the PR because we also changed eth_syncing to take indexing into account. Not sure if that works for hardhat...

A better way is to return nil as before.

We must distinguish two different scenarios here:

The requested transaction/receipt is existent but not indexed yet

The requested transaction/receipt is unknown to node

Therefore, we explicitly return an error if the background indexing is still in progress
and the requested transaction is not indexed yet.

I don't think so.

We must

It's a breaking change, geth is a dominant ethereum client, every breaking change needs to consider backward compatibility.

The requested transaction/receipt is existent but not indexed yet

The code just checks whether the indexer is running, which means geth can't know whether the transaction is being indexed or not. What did I miss?

tx, blockHash, blockNumber, txIndex := rawdb.ReadTransaction(bc.db, hash)
if tx == nil {
	progress, err := bc.TxIndexProgress()
	if err != nil {
		return nil, nil, nil
	}
	// The transaction indexing is not finished yet, returning an
	// error to explicitly indicate it.
	if !progress.Done() {
		return nil, nil, errors.New("transaction indexing still in progress")
	}
	// The transaction is already indexed, the transaction is either
	// not existent or not in the range of index, returning null.
	return nil, nil, nil
}

In both scenarios mentioned, no transaction index will be found. The error will simply state, Please try the RPC later once background indexing is fully complete.

Since Geth is a dominant Ethereum client, every breaking change must consider backward compatibility.

That's why we've modified the eth_syncing endpoint. It now takes transaction indexing into account and only reports the node as synced once all indexing is complete.

rjl493456442 force-pushed the fix-tx-indexer branch from 8235485 to 156e633 Compare December 19, 2023 06:17

rjl493456442 marked this pull request as ready for review December 20, 2023 12:48

rjl493456442 requested review from karalabe and holiman as code owners December 20, 2023 12:48

rjl493456442 mentioned this pull request Dec 20, 2023

bug: history.transactions flag not honored during resync (keeping ancients) #28673

Closed

holiman approved these changes Dec 20, 2023

ryanschneider reviewed Dec 20, 2023

rjl493456442 force-pushed the fix-tx-indexer branch from 9e73805 to 81f6cef Compare December 21, 2023 08:22

fjl added this to the 1.13.10 milestone Jan 10, 2024

holiman modified the milestones: 1.13.10, 1.13.11 Jan 12, 2024

holiman reviewed Jan 12, 2024

karalabe added the status:triage label Jan 16, 2024

holiman removed the status:triage label Jan 16, 2024

rjl493456442 requested a review from s1na as a code owner January 18, 2024 08:19

rjl493456442 added 5 commits January 18, 2024 16:20

core, core/rawdb, eth/sync: no tx indexing during snap sync

8b1664d

core, eth, internal: return tx indexing progress

076ba50

core: remove useless test

93d7029

core: track the processed ancient blocks

0f3c1d0

eth, core, internal, graphql: return error only if indexing is not fi…

64196f9

…nished

rjl493456442 force-pushed the fix-tx-indexer branch from 91cca24 to 64196f9 Compare January 18, 2024 08:20

holiman reviewed Jan 18, 2024

rjl493456442 added 6 commits January 19, 2024 14:06

internal, eth, core: return RPC error

4e0eb8b

core, eth, graphql, interfaces: improve eth.syncing

a4bb72e

core, eth: more fixes

811ed92

core, eth: minor fixes

aa21c1f

eth/downloader: better ux

9f13d29

internal/jsre/deps: fix web3 resolver

8f25093

holiman approved these changes Jan 22, 2024

holiman merged commit 78a3c32 into ethereum:master Jan 22, 2024
3 checks passed

ryanschneider reviewed Jan 22, 2024

dnhn mentioned this pull request Jan 24, 2024

ethereum 1.13.11 Homebrew/homebrew-core#160811

Merged

This was referenced Jan 25, 2024

core: reset tx lookup cache if necessary #28865

Merged

time to push first transaction in the network has increased since Monday #28877

Closed

tobidae-cb mentioned this pull request Jan 26, 2024

Transaction lookup using eth_getTransactionReceipt returns the wrong information after reorg #28885

Closed

rjl493456442 mentioned this pull request Feb 1, 2024

Waiting for a transaction fails on Geth 1.13.11 ethereum/web3.py#3212

Closed

islishude reviewed Aug 14, 2024

ryanschneider Dec 20, 2023

rjl493456442 Dec 21, 2023

rjl493456442 Dec 21, 2023

fjl Jan 3, 2024

karalabe Jan 16, 2024

karalabe Jan 16, 2024

fjl Jan 16, 2024

rjl493456442 Jan 18, 2024

fjl Jan 18, 2024

rjl493456442 Jan 19, 2024

islishude Aug 14, 2024

fjl Aug 14, 2024

islishude Aug 15, 2024

rjl493456442 Aug 15, 2024

islishude Aug 15, 2024

rjl493456442 Aug 15, 2024
