core, core/rawdb, eth/sync: no tx indexing during snap sync #28703

Merged
merged 11 commits into from
Jan 22, 2024

Member

@rjl493456442 rjl493456442 commented Dec 19, 2023

This pull request simplifies the transaction indexing logic and improves the UX when a transaction is not found by returning more information to users.

Originally, Geth constructed the transaction indexes while the node was still in snap sync. This approach ensures that all the required transaction indexes are available once the sync is finished. However, it introduces a lot of complexity to index transactions correctly.

In ticket #28673, an issue was found: when a node is initialized with an external ancient store and runs a snap sync with history.transactions = 0, the indexes of the ancient part are lost. In other words, these ancient blocks are never indexed.


In order to simplify the entire process, this pull request changes the strategy to construct transaction indexes only once the snap sync is finished.

Although we lose the guarantee that transaction indexes are available immediately once the snap sync is finished, Geth can reduce confusion by returning a message in the RPC response when the specified transaction is not found and transaction indexing is still in progress.

Besides, transaction indexing is considered a part of the initial sync: eth.syncing will still be true if the background transaction indexing is not finished yet. API consumers can follow the syncing status to determine if the node is usable.

E.g. if the node is still syncing, the transaction indexing progress is also attached.

To exit, press ctrl-d or type exit
> eth.syncing
{
  currentBlock: 19060318,
  healedBytecodeBytes: 0,
  healedBytecodes: 0,
  healedTrienodeBytes: 0,
  healedTrienodes: 0,
  healingBytecode: 0,
  healingTrienodes: 0,
  highestBlock: 19060400,
  startingBlock: 19060298,
  syncedAccountBytes: 36505370223,
  syncedAccounts: 169452356,
  syncedBytecodeBytes: 5977956986,
  syncedBytecodes: 871976,
  syncedStorage: 788911086,
  syncedStorageBytes: 175632499638,
  txIndexFinishedBlocks: 0,
  txIndexRemainingBlocks: 2350000
}

> eth.syncing
{
  currentBlock: 19060773,
  healedBytecodeBytes: 261340,
  healedBytecodes: 43,
  healedTrienodeBytes: 193350007,
  healedTrienodes: 666738,
  healingBytecode: 0,
  healingTrienodes: 0,
  highestBlock: 19060684,
  startingBlock: 19060682,
  syncedAccountBytes: 50474316442,
  syncedAccounts: 233879357,
  syncedBytecodeBytes: 8108706003,
  syncedBytecodes: 1164773,
  syncedStorage: 1104622696,
  syncedStorageBytes: 246053347118,
  txIndexFinishedBlocks: 2347665,
  txIndexRemainingBlocks: 2335
}
> eth.syncing
false
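
For API consumers, the progress fields shown above could be read roughly as follows. This is a minimal sketch, not Geth code: `SyncStatus`, `parseSyncing` and `parseHex` are hypothetical names, and it assumes the raw JSON-RPC result encodes the counters as hex strings (the console output above shows them already converted to decimal).

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"strconv"
	"strings"
)

// SyncStatus is the subset of the eth_syncing result used in this sketch.
type SyncStatus struct {
	Syncing   bool
	Finished  uint64 // txIndexFinishedBlocks
	Remaining uint64 // txIndexRemainingBlocks
}

// IndexedPercent reports how much of the configured range has been indexed.
func (s SyncStatus) IndexedPercent() float64 {
	total := s.Finished + s.Remaining
	if total == 0 {
		return 100
	}
	return 100 * float64(s.Finished) / float64(total)
}

// parseHex decodes a 0x-prefixed quantity; a missing field counts as zero.
// Assumption: the node encodes these counters as hex strings.
func parseHex(s string) (uint64, error) {
	if s == "" {
		return 0, nil
	}
	return strconv.ParseUint(strings.TrimPrefix(s, "0x"), 16, 64)
}

// parseSyncing decodes the "result" value of an eth_syncing response,
// which is either the literal false or an object with progress fields.
func parseSyncing(result []byte) (SyncStatus, error) {
	if bytes.Equal(bytes.TrimSpace(result), []byte("false")) {
		return SyncStatus{}, nil
	}
	var obj struct {
		Finished  string `json:"txIndexFinishedBlocks"`
		Remaining string `json:"txIndexRemainingBlocks"`
	}
	if err := json.Unmarshal(result, &obj); err != nil {
		return SyncStatus{}, fmt.Errorf("unexpected eth_syncing result: %w", err)
	}
	finished, err := parseHex(obj.Finished)
	if err != nil {
		return SyncStatus{}, fmt.Errorf("txIndexFinishedBlocks: %w", err)
	}
	remaining, err := parseHex(obj.Remaining)
	if err != nil {
		return SyncStatus{}, fmt.Errorf("txIndexRemainingBlocks: %w", err)
	}
	return SyncStatus{Syncing: true, Finished: finished, Remaining: remaining}, nil
}
```

A caller would treat the node as usable only once `parseSyncing` reports `Syncing == false`, and could log `IndexedPercent()` in the meantime.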

I have tested the indexing performance: indexing the entire Ethereum mainnet chain takes roughly 2 hours, and indexing the default range (2.35M blocks) takes about 20 minutes. This means that for up to ~2h after the initial sync, transactions might not be available over RPC.


Transaction index performance

Indexing 2.35M blocks takes about 20 minutes.

INFO [12-20|05:34:00.558] Indexed transactions blocks=2,350,000 txs=345,795,538 tail=16,474,969 elapsed=20m26.970s

Indexing the first 16M blocks takes about 1h35m.

INFO [12-20|09:31:10.757] Indexed transactions blocks=16,475,763 txs=1,850,658,458 tail=0 elapsed=1h35m48.611s

Indexing the entire mainnet chain takes ~2h.
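
To compare the two measured runs, the average throughput can be derived from the `blocks` and `elapsed` values in the log lines above. The helper below is only an illustrative sketch; `ratePerSecond` is a hypothetical name and not part of Geth.

```go
package main

import (
	"fmt"
	"time"
)

// ratePerSecond returns count divided by the elapsed time in seconds.
// elapsed uses the same format as the log lines, e.g. "20m26.970s".
func ratePerSecond(count uint64, elapsed string) (float64, error) {
	d, err := time.ParseDuration(elapsed)
	if err != nil {
		return 0, fmt.Errorf("bad elapsed value %q: %w", elapsed, err)
	}
	if d <= 0 {
		return 0, fmt.Errorf("elapsed must be positive, got %s", d)
	}
	return float64(count) / d.Seconds(), nil
}
```

With the numbers from the logs, `ratePerSecond(2350000, "20m26.970s")` comes out at about 1,915 blocks per second and `ratePerSecond(16475763, "1h35m48.611s")` at about 2,866 blocks per second.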


Index progress info when tx is not found

> eth.getTransaction("0x6d82842cf80d3c1a35cb03e5e52b98c1968983223422d01d1b571b5ef0983b19")
Error: transaction indexing is in progress
	at web3.js:6367:9(39)
	at send (web3.js:5101:62(29))
	at <eval>:1:19(3)

> debug.traceTransaction("0x6d82842cf80d3c1a35cb03e5e52b98c1968983223422d01d1b571b5ef0983b19")
Error: transaction indexing is in progress
	at web3.js:6367:9(39)
	at send (web3.js:5101:62(29))
	at <eval>:1:23(3)

> eth.getTransactionReceipt("0x6d82842cf80d3c1a35cb03e5e52b98c1968983223422d01d1b571b5ef0983b19")
Error: transaction indexing is in progress
	at web3.js:6367:9(39)
	at send (web3.js:5101:62(29))
	at <eval>:1:26(3)

> eth.getRawTransaction("0x681ce931a6897452e27c1fe3754e046991c7834b93c77038e4ae534c50854da7")
Error: transaction indexing is in progress
	at web3.js:6367:9(39)
	at send (web3.js:5101:62(29))
	at <eval>:1:22(3)

Contributor

@holiman holiman left a comment

Aside from one minor flaw (?), this looks good to me. I have only eyed through it, not actively tested it.

core/blockchain.go (resolved)
return tx, blockHash, blockNumber, index, nil
lookup, tx, err := b.eth.blockchain.GetTransactionLookup(txHash)
if err != nil {
return nil, common.Hash{}, 0, 0, fmt.Errorf("tx is not existent or not indexed, %w", err)
Contributor

Note that this is a behavioral change for the eth_getTransactionByHash RPC, which previously returned null in this case. This has a couple of ramifications that should be considered:

  • I imagine a fair number of upstream projects will be "surprised" by this RPC now potentially returning errors. For example I'm pretty sure most wallets use this method to determine whether a newly broadcast tx has reached the public mempool.
  • The error "leaks" configuration settings of the node (the index head and tail).

Member Author

Good point. @fjl, do you have any suggestions about the RPC behavioral change?

Member Author

For example I'm pretty sure most wallets use this method to determine whether a newly broadcast tx has reached the public mempool.

If the tx is in the txpool, it will definitely be found.
If the tx is not in the txpool yet, an error will be returned instead of nil, as you said.

I will ask the team for their opinions.

Contributor

Yeah we can't have this error. getTransactionByHash has to return null when the tx is not known to the node. It can only return an error if there is an internal error accessing the database or something.
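
The distinction drawn here, a null result versus a JSON-RPC error, is what a consumer has to branch on. The sketch below shows one way a client might classify a raw eth_getTransactionByHash response. The names `Outcome`, `classify`, `rpcResponse` and `rpcError` are hypothetical and only illustrate the contract under discussion; they are not taken from Geth or any client library.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
)

// rpcError mirrors the error object of a JSON-RPC 2.0 response.
type rpcError struct {
	Code    int    `json:"code"`
	Message string `json:"message"`
}

// rpcResponse keeps the result undecoded so null can be told apart
// from a transaction object.
type rpcResponse struct {
	Result json.RawMessage `json:"result"`
	Error  *rpcError       `json:"error"`
}

// Outcome is the consumer-side interpretation of a lookup response.
type Outcome int

const (
	TxUnknown      Outcome = iota // result is null: the node does not know the tx
	TxFound                       // result carries a transaction object
	TxLookupFailed                // the node (or the transport) reported an error
)

// classify inspects a raw JSON-RPC response body.
func classify(body []byte) (Outcome, error) {
	var resp rpcResponse
	if err := json.Unmarshal(body, &resp); err != nil {
		return TxLookupFailed, fmt.Errorf("malformed JSON-RPC response: %w", err)
	}
	if resp.Error != nil {
		return TxLookupFailed, fmt.Errorf("rpc error %d: %s", resp.Error.Code, resp.Error.Message)
	}
	if len(resp.Result) == 0 || bytes.Equal(bytes.TrimSpace(resp.Result), []byte("null")) {
		return TxUnknown, nil
	}
	return TxFound, nil
}
```

A wallet polling for a freshly broadcast transaction would keep polling on `TxUnknown`, while `TxLookupFailed` needs its own handling, which is exactly the extra burden this comment is concerned about.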

Member

I mean, it is an internal error from one perspective: the indexing is not done so the results are not reliable. From that perspective, the error message feels correct.

Tho I wouldn't add that much info since it's hard to interpret. Probably "transactions still indexing" or something along those lines is enough.

Member

One possible solution we could do is to make indexing part of the sync. I.e. after snap sync is "done", we could have a next phase with indexing, whilst eth_syncing does not return false.

Still, the quirk is that the current block currently kind of means that everything that's needed has been done. On the other hand, IMO a syncing node should not be just dropped into prod serving RPC.

Contributor

The main point is, we can't just return an error when the tx doesn't exist. We can return an error while we are still indexing, but the code here doesn't do that. If we decide to add an error during indexing, it would have to be identifiable by error code. This is a decision affecting all API consumers and also proxy/LB implementations like Infura's. They all need to change their behavior to retry the lookup later or on a different node.
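
The "retry the lookup later" behavior that consumers would need could look roughly like the sketch below. Since no dedicated error code is established in this thread, it falls back to matching the message text seen in the console output of the PR description, which is brittle and purely an assumption. `lookupFunc`, `rpcErr`, `isIndexingError` and `lookupWithRetry` are hypothetical names.

```go
package main

import (
	"strings"
	"time"
)

// indexingMsg is the message shown in the PR description's console output.
// Matching on text is an assumption made because no error code is confirmed.
const indexingMsg = "transaction indexing is in progress"

// rpcErr is a stand-in for an error surfaced by an RPC client.
type rpcErr string

func (e rpcErr) Error() string { return string(e) }

// isIndexingError reports whether err looks like the indexing error.
func isIndexingError(err error) bool {
	return err != nil && strings.Contains(err.Error(), indexingMsg)
}

// lookupFunc is a hypothetical stand-in for a transaction lookup over RPC.
type lookupFunc func(hash string) (found bool, err error)

// lookupWithRetry retries only while the node reports that indexing is
// still running; any other outcome is returned to the caller right away.
func lookupWithRetry(lookup lookupFunc, hash string, attempts int, wait time.Duration) (bool, error) {
	if attempts <= 0 {
		return false, rpcErr("attempts must be positive")
	}
	var err error
	for i := 0; i < attempts; i++ {
		var found bool
		found, err = lookup(hash)
		if !isIndexingError(err) {
			return found, err
		}
		if i < attempts-1 {
			time.Sleep(wait)
		}
	}
	return false, err
}
```

A proxy or load balancer could apply the same check but send the retry to a different backend instead of sleeping.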

Member Author

I have fixed the code.

The error is only returned if the transactions are not fully indexed.

it would have to be identifiable by error code

It's on my todo list.

Contributor

What I still don't like about it, is that this would be a geth-specific error that all API consumers must pay attention to if they want to react to it properly. It really only makes sense if we can make eth_syncing return a sync status at the same time.

Member Author

fixed

@ryanschneider
Contributor

FYI I confirmed here that this fixes the original issue: #28673 (comment)

@fjl fjl added this to the 1.13.10 milestone Jan 10, 2024
@holiman holiman modified the milestones: 1.13.10, 1.13.11 Jan 12, 2024
eth/api_backend.go (outdated, resolved)
@holiman
Contributor

holiman commented Jan 16, 2024

Triage discussion

  • We should make it return an error iff the tx is not indexed and indexing is ongoing
  • We should delay setting syncing=false until indexing is finished
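
The two triage rules can be captured in a small toy model. This is a sketch for illustration only: `node`, `indexProgress`, `lookup` and `syncing` are hypothetical and do not mirror Geth's actual types.

```go
package main

import "errors"

// errIndexing is returned only while background indexing is unfinished.
var errIndexing = errors.New("transaction indexing is in progress")

// indexProgress tracks how many blocks are indexed and how many remain.
type indexProgress struct {
	Indexed, Remaining uint64
}

// Done reports whether background indexing has finished.
func (p indexProgress) Done() bool { return p.Remaining == 0 }

// node is a toy model of the relevant node state.
type node struct {
	chainSynced bool
	progress    indexProgress
	index       map[string]uint64 // tx hash -> block number
}

// lookup applies rule one: return an error if and only if the tx is not
// indexed and indexing is still ongoing; otherwise "not found" is not an error.
func (n *node) lookup(hash string) (blockNumber uint64, found bool, err error) {
	if num, ok := n.index[hash]; ok {
		return num, true, nil
	}
	if !n.progress.Done() {
		return 0, false, errIndexing
	}
	return 0, false, nil
}

// syncing applies rule two: keep reporting a sync in progress until both
// the chain sync and the transaction indexing have finished.
func (n *node) syncing() bool {
	return !n.chainSynced || !n.progress.Done()
}
```

In this model a consumer that waits for `syncing()` to return false never observes `errIndexing`.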

core/blockchain_reader.go (outdated, resolved)
Contributor

@holiman holiman left a comment

LGTM

@holiman holiman merged commit 78a3c32 into ethereum:master Jan 22, 2024
3 checks passed
Contributor

@ryanschneider ryanschneider left a comment

lgtm!

Dergarcon pushed a commit to specialmechanisms/mev-geth-0x2mev that referenced this pull request Jan 31, 2024
…#28703)

This change simplifies the logic for indexing transactions and enhances the UX when transaction is not found by returning more information to users.

Transaction indexing is now considered as a part of the initial sync, and `eth.syncing` will thus be `true` if transaction indexing is not yet finished. API consumers can use the syncing status to determine if the node is ready to serve users.
	return tx.MarshalBinary()
}

// GetTransactionReceipt returns the transaction receipt for the given transaction hash.
func (s *TransactionAPI) GetTransactionReceipt(ctx context.Context, hash common.Hash) (map[string]interface{}, error) {
	tx, blockHash, blockNumber, index, err := s.b.GetTransaction(ctx, hash)
	if tx == nil || err != nil {
Contributor

Could geth team reconsider this change?

It breaks hardhat.

An unexpected error occurred:

ProviderError: transaction indexing is in progress
    at HttpProvider.request (/workspace/node_modules/hardhat/src/internal/core/providers/http.ts:96:21)
    at processTicksAndRejections (node:internal/process/task_queues:95:5)
    at async EIP1193JsonRpcClient.getTransactionReceipt (/workspace/node_modules/@nomicfoundation/ignition-core/src/internal/execution/jsonrpc-client.ts:569:22)
    at async Promise.all (index 1)
    at async monitorOnchainInteraction (/workspace/node_modules/@nomicfoundation/ignition-core/src/internal/execution/future-processor/handlers/monitor-onchain-interaction.ts:110:28)
    at async FutureProcessor.processFuture (/workspace/node_modules/@nomicfoundation/ignition-core/src/internal/execution/future-processor/future-processor.ts:116:9)
    at async ExecutionEngine._executeBatch (/workspace/node_modules/@nomicfoundation/ignition-core/src/internal/execution/execution-engine.ts:153:30)
    at async ExecutionEngine.executeModule (/workspace/node_modules/@nomicfoundation/ignition-core/src/internal/execution/execution-engine.ts:114:25)
    at async Deployer.deploy (/workspace/node_modules/@nomicfoundation/ignition-core/src/internal/deployer.ts:194:25)
    at async SimpleTaskDefinition.action (/workspace/node_modules/@nomicfoundation/hardhat-ignition/src/index.ts:302:24)

Contributor

I'm also unhappy with this, and there have been quite a few reports from across the ecosystem where this PR caused subtle issues. However, we ultimately decided to go ahead with the PR because we also changed eth_syncing to take indexing into account. Not sure if that works for hardhat...

Contributor

A better way is to return nil as before.

Member Author

We must distinguish two different scenarios here:

  • The requested transaction/receipt is existent but not indexed yet
  • The requested transaction/receipt is unknown to the node

Therefore, we explicitly return an error if the background indexing is still in progress and the requested transaction is not indexed yet.

Contributor

I don't think so.

We must

It's a breaking change. Geth is a dominant Ethereum client, so every breaking change needs to consider backward compatibility.

The requested transaction/receipt is existent but not indexed yet

The code only checks whether indexing is still running, which means geth can't know whether the requested transaction is actually waiting to be indexed. What did I miss?

	tx, blockHash, blockNumber, txIndex := rawdb.ReadTransaction(bc.db, hash)
	if tx == nil {
		progress, err := bc.TxIndexProgress()
		if err != nil {
			return nil, nil, nil
		}
		// The transaction indexing is not finished yet, returning an
		// error to explicitly indicate it.
		if !progress.Done() {
			return nil, nil, errors.New("transaction indexing still in progress")
		}
		// The transaction is already indexed, the transaction is either
		// not existent or not in the range of index, returning null.
		return nil, nil, nil
	}

Member Author

In both scenarios mentioned, no transaction index will be found. The error simply tells the caller to try the RPC again later, once background indexing is fully complete.

Since Geth is a dominant Ethereum client, every breaking change must consider backward compatibility.

That's why we've modified the eth_syncing endpoint. It now takes transaction indexing into account and will only report "synced" once all indexing is complete.


6 participants