Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

internal/ethapi: optimize & clean up EstimateGas #27710

Merged
merged 1 commit into from
Aug 18, 2023

Conversation

roberto-bayardo
Copy link
Contributor

@roberto-bayardo roberto-bayardo commented Jul 12, 2023

Optimizations:

  • Currently if a transaction is reverting, EstimateGas will exhibit worst-case behavior and binary search up to the max gas limit (~40 state-clone + tx executions). This PR allows EstimateGas to return after only a single unconstrained execution in this scenario.
  • Uses the gas used from the unconstrained execution to bias the remaining binary search towards the likely solution in a simple way that doesn't impact the worst case. For a typical contract-invoking transaction, this reduces the median number of state-clone+executions from 25 to 18 (28% reduction).

Cleanup:

  • added & improved function + code comments
  • correct the EstimateGas documentation to clarify the gas limit determination is at latest block, not pending, if the blockNr is unspecified.

The performance analysis was conducted over 2300 contract-invoking transactions randomly sampled (via tx hash starting with "00") from the mempool over the past 24 hours. These txs were put through the new implementation & old implementation with the exact same state when the tx hit the mempool, and the # of executions for each recorded. The output from both calls was also compared to ensure they were equivalent. There were a few (<< 1%) instances where the values differed, upon further investigation these were due to transactions that monitor gas-remaining and attempt to adjust computation accordingly (such as batch NFT minting, example).

// If the error is not nil(consensus error), it means the provided message
// call or transaction will never be accepted no matter how much gas it is
// assigned. Return the error directly, don't struggle any more.
if mid > lo*2 {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is likely room for further average-case improvement by tweaking (lowering) the constant multiplier here, but this would worsen the worst-case and also introduces new boundary conditions that would need to be checked.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is more of an acceptance that there are a much larger number of transactions on the low size, 21k - 500k gas?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

bit of both I guess: most txs don't need much higher gas limit than their gas used, and most txs don't require near the full block limit of gas. I guess this trick takes advantage of both facts. (FWIW EstimateGas doesn't seem to be called for simple 21k gas transfers by wallets like MetaMask so I excluded those from the analysis)

@roberto-bayardo roberto-bayardo marked this pull request as ready for review July 12, 2023 20:55
internal/ethapi/api.go Outdated Show resolved Hide resolved
internal/ethapi/api.go Outdated Show resolved Hide resolved
// probably don't want to return a lowest possible gas limit for these cases anyway.
lo = gasUsed - 1

// Binary search for the smallest gas limit that allows the tx to execute successfully.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not really so much binary search if you adjust the mid point yeah?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a pure binary search after 1-2 doublings of 'lo' in the typical case, lg(n) in the worst. Wasn't sure it's worth clarifying that detail, but I certainly could if it helps.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's still basically a binary search, it divides up the range of possible outcomes every iteration, and discards one of the two subranges. The trick here is that the point where the range is divided is not on on the arithmetic middle.

Some clarifying might be good, e.g. what you wrote below, something like

most txs don't need much higher gas limit than their gas used, and most txs don't require near the full block limit of gas, so the selection of where to bisect the range is skewed to favour the low side.

Copy link
Contributor Author

@roberto-bayardo roberto-bayardo Aug 10, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added, though at the line where the midpoint is adjusted rather than here.

// If the error is not nil(consensus error), it means the provided message
// call or transaction will never be accepted no matter how much gas it is
// assigned. Return the error directly, don't struggle any more.
if mid > lo*2 {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is more of an acceptance that there are a much larger number of transactions on the low size, 21k - 500k gas?

internal/ethapi/api.go Outdated Show resolved Hide resolved
@karalabe
Copy link
Member

Whilst I understand that this code might make things a bit faster, but they also make it a lot more complicated. Do we have a specific use case where we need that extra drop of speed?

@karalabe
Copy link
Member

Another interesting optimization we could consider is whether it's worth it to hone in on the exact gas usage needed, or if something "close enough" is enough.

I.e. By binary searching until hi == lo + 1, we're wasting a lot of executions to get the exact to-the-gas accuracy. Most wallets will just add some random number on top to account for network conditions. And I don't think it's relevant in the grand scheme of things when trying to fit txs into a block.

Lowering to accuracy from 1 gas to 1024 gas for example would save 10 iterations of the binary search.

@roberto-bayardo
Copy link
Contributor Author

roberto-bayardo commented Jul 13, 2023

Whilst I understand that this code might make things a bit faster, but they also make it a lot more complicated. Do we have a specific use case where we need that extra drop of speed?

My main goal was to improve the handling around handling tx revert, which contributed to a recent issue I encountered that ending up fixed here.

Bulk of the change to this end is around moving the unconstrained execution up front instead leaving it to the end, which I felt makes the impl more understandable.

The additional complexity I did end up adding is ~4 LOC (L1228 initialization of lo, and the the midpoint adjustment in the main loop) -- seemed worth the savings given this is a very expensive call. Figured RPC providers that make use of this would benefit.

@roberto-bayardo
Copy link
Contributor Author

Another interesting optimization we could consider is whether it's worth it to hone in on the exact gas usage needed, or if something "close enough" is enough.

I.e. By binary searching until hi == lo + 1, we're wasting a lot of executions to get the exact to-the-gas accuracy. Most wallets will just add some random number on top to account for network conditions. And I don't think it's relevant in the grand scheme of things when trying to fit txs into a block.

Lowering to accuracy from 1 gas to 1024 gas for example would save 10 iterations of the binary search.

Agree that the "lowest possible" aspect of this function is probably unnecessary.


// ErrorCode returns the JSON error code for a revertal.
// See: https://github.com/ethereum/wiki/wiki/JSON-RPC-Error-Codes-Improvement-Proposal
func (e *revertError) ErrorCode() int {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The ErrorCode and ErrorData methods are used by the RPC server and will be returned as part of the JSON-RPC response. So they need to be kept here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ahah, restored.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ahah, restored.

Is it? I don't see it??

Copy link
Contributor Author

@roberto-bayardo roberto-bayardo Aug 10, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

weird, I wonder if I undid it when I merged with upstream. let me try again.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I got it this time :/

@fjl fjl changed the title optimize & clean up EstimateGas internal/ethapi: optimize & clean up EstimateGas Jul 18, 2023
@fjl
Copy link
Contributor

fjl commented Jul 18, 2023

It'd be nice to have the script you used to sample the transactions. That way we could also repeat your analysis.

@roberto-bayardo roberto-bayardo force-pushed the estimate-gas branch 2 times, most recently from 54f14c3 to aac3cdf Compare July 18, 2023 13:53
@roberto-bayardo
Copy link
Contributor Author

roberto-bayardo commented Jul 18, 2023

It'd be nice to have the script you used to sample the transactions. That way we could also repeat your analysis.

there is no standalone script unfortunately. the analysis was over logs from a hacked client which invoked both old & new implementations for each call and logged the # of iterations of both. I then added a hook in the txpool to call EstimateGas for each sampled transaction, and let it run over night before extracting the log data and dumping into a spreadsheet.

@antonydenyer
Copy link

antonydenyer commented Jul 31, 2023

There's a reasonably good test in besu that covers some funny edge cases with the 1/64th rule TestDepth.sol

    depth(1) == 45554
    depth(2) == 47387
    depth(3) == 49249
    depth(4) == 51141
    depth(5) == 53063
    depth(10) == 63139
    depth(65) == 246462

https://github.com/hyperledger/besu/blob/main/acceptance-tests/tests/src/test/java/org/hyperledger/besu/tests/acceptance/jsonrpc/EthEstimateGasAcceptanceTest.java#L54

Copy link
Contributor

@holiman holiman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, some minor nits, and I wonder if @fjl's comments were really addressed?

Comment on lines 1174 to 1175
err = overrides.Apply(state)
if err != nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
err = overrides.Apply(state)
if err != nil {
if err := overrides.Apply(state); err != nil {

I nkow you just lifted it up, but still...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

// probably don't want to return a lowest possible gas limit for these cases anyway.
lo = gasUsed - 1

// Binary search for the smallest gas limit that allows the tx to execute successfully.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's still basically a binary search, it divides up the range of possible outcomes every iteration, and discards one of the two subranges. The trick here is that the point where the range is divided is not on on the arithmetic middle.

Some clarifying might be good, e.g. what you wrote below, something like

most txs don't need much higher gas limit than their gas used, and most txs don't require near the full block limit of gas, so the selection of where to bisect the range is skewed to favour the low side.

@fjl fjl removed the status:triage label Aug 15, 2023
@fjl
Copy link
Contributor

fjl commented Aug 15, 2023

My concerns were adressed. I can't really comment on the binary search thing, seems alright though.

@holiman holiman added this to the 1.13.0 milestone Aug 18, 2023
Copy link
Contributor

@holiman holiman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@holiman holiman merged commit 950ccdd into ethereum:master Aug 18, 2023
1 check passed
devopsbo3 pushed a commit to HorizenOfficial/go-ethereum that referenced this pull request Nov 10, 2023
Optimizations:

- Previously, if a transaction was reverting, EstimateGas would exhibit worst-case behavior and binary search up to the max gas limit (~40 state-clone + tx executions). This change allows EstimateGas to return after only a single unconstrained execution in this scenario.
- Uses the gas used from the unconstrained execution to bias the remaining binary search towards the likely solution in a simple way that doesn't impact the worst case. For a typical contract-invoking transaction, this reduces the median number of state-clone+executions from 25 to 18 (28% reduction).

Cleanup:

- added & improved function + code comments
- correct the EstimateGas documentation to clarify the gas limit determination is at latest block, not pending, if the blockNr is unspecified.
devopsbo3 added a commit to HorizenOfficial/go-ethereum that referenced this pull request Nov 10, 2023
devopsbo3 added a commit to HorizenOfficial/go-ethereum that referenced this pull request Nov 10, 2023
sduchesneau pushed a commit to streamingfast/go-ethereum that referenced this pull request Feb 26, 2024
 Conflicts:
  internal/ethapi/api.go:
We modified call to DoCall, upstream moved it to an internal function,
kept our modifications in the new place.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants