Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

eth/tracers: abort evm execution when trace is aborted #23580

Merged
merged 1 commit into from
Sep 16, 2021

Conversation

holiman
Copy link
Contributor

@holiman holiman commented Sep 15, 2021

This PR tries to fix #23397. Geth goes OOM after
heavy transaction tracing.

The tracer has a Stop method, which sets a flag. This flag causes the tracer to stop doing work, as in no longer invoking
the javascript engine for each opcode. However, it does not actually stop the evm execution.

I tested with a little traceCall where we just run in a loop, burning 10M gas:

> a=eth.accounts[0];for(i=0;i<10; i++){try{debug.traceCall({from:a, input: "0x5b600056", gas:"0x989680"}, "latest", {"tracer": "callTracer", "timeout": "1ms"})  }catch(e){}}

If I run this on master, with some added printout when the trace ends:

WARN [09-15|10:17:23.124] Served debug_traceCall                   reqid=63 t=62.092057ms err="execution timeout"
INFO [09-15|10:17:23.168] Trace done                               elapsed=37.435613ms err=0xbd92a0
WARN [09-15|10:17:23.170] Served debug_traceCall                   reqid=64 t=44.964981ms err="execution timeout"
INFO [09-15|10:17:23.213] Trace done                               elapsed=37.365314ms err=0xbd92a0
WARN [09-15|10:17:23.215] Served debug_traceCall                   reqid=65 t=44.787581ms err="execution timeout"
INFO [09-15|10:17:23.258] Trace done                               elapsed=37.015968ms err=0xbd92a0
WARN [09-15|10:17:23.259] Served debug_traceCall                   reqid=66 t=44.329423ms err="execution timeout"
INFO [09-15|10:17:23.303] Trace done                               elapsed=37.243615ms err=0xbd92a0
WARN [09-15|10:17:23.305] Served debug_traceCall                   reqid=67 t=44.622817ms err="execution timeout"
INFO [09-15|10:17:23.349] Trace done                               elapsed=38.016897ms err=0xbd92a0
WARN [09-15|10:17:23.350] Served debug_traceCall                   reqid=68 t=45.465781ms err="execution timeout"
INFO [09-15|10:17:23.395] Trace done                               elapsed=38.352121ms err=0xbd92a0
WARN [09-15|10:17:23.396] Served debug_traceCall                   reqid=69 t=45.750501ms err="execution timeout"
INFO [09-15|10:17:23.440] Trace done                               elapsed=37.428625ms err=0xbd92a0
WARN [09-15|10:17:23.442] Served debug_traceCall                   reqid=70 t=44.716995ms err="execution timeout"
INFO [09-15|10:17:23.486] Trace done                               elapsed=37.261927ms err=0xbd92a0
WARN [09-15|10:17:23.488] Served debug_traceCall                   reqid=71 t=45.757262ms err="execution timeout"
INFO [09-15|10:17:23.534] Trace done                               elapsed=39.421814ms err=0xbd92a0
WARN [09-15|10:17:23.535] Served debug_traceCall                   reqid=72 t=46.78345ms  err="execution timeout"
INFO [09-15|10:17:27.567] Trace done                               elapsed=38.01212ms  err=0xbd92a0
WARN [09-15|10:17:27.568] Served debug_traceCall                   reqid=74 t=60.048715ms err="execution timeout"
INFO [09-15|10:17:27.612] Trace done                               elapsed=37.354918ms err=0xbd92a0
WARN [09-15|10:17:27.614] Served debug_traceCall                   reqid=75 t=44.806464ms err="execution timeout"
INFO [09-15|10:17:27.659] Trace done                               elapsed=38.760404ms err=0xbd92a0
WARN [09-15|10:17:27.660] Served debug_traceCall                   reqid=76 t=46.200019ms err="execution timeout"
INFO [09-15|10:17:27.705] Trace done                               elapsed=37.843866ms err=0xbd92a0
WARN [09-15|10:17:27.706] Served debug_traceCall                   reqid=77 t=45.441603ms err="execution timeout"
INFO [09-15|10:17:27.752] Trace done                               elapsed=39.506866ms err=0xbd92a0
WARN [09-15|10:17:27.754] Served debug_traceCall                   reqid=78 t=47.262987ms err="execution timeout"
INFO [09-15|10:17:27.798] Trace done                               elapsed=37.453001ms err=0xbd92a0
WARN [09-15|10:17:27.799] Served debug_traceCall                   reqid=79 t=44.974827ms err="execution timeout"
INFO [09-15|10:17:27.843] Trace done                               elapsed=37.502113ms err=0xbd92a0
WARN [09-15|10:17:27.845] Served debug_traceCall                   reqid=80 t=45.10805ms  err="execution timeout"
INFO [09-15|10:17:27.891] Trace done                               elapsed=39.174589ms err=0xbd92a0
WARN [09-15|10:17:27.892] Served debug_traceCall                   reqid=81 t=46.773815ms err="execution timeout"
INFO [09-15|10:17:27.936] Trace done                               elapsed=37.509592ms err=0xbd92a0
WARN [09-15|10:17:27.937] Served debug_traceCall                   reqid=82 t=45.046033ms err="execution timeout"
INFO [09-15|10:17:27.982] Trace done                               elapsed=38.273ms    err=0xbd92a0
WARN [09-15|10:17:27.983] Served debug_traceCall                   reqid=83 t=45.614921ms err="execution timeout"

So even if we call Stop() after ~1ms, it takes ~40ms for the execution to complete.
Whereas running it on this PR (with the added printout):

INFO [09-15|10:12:39.352] Trace done                               elapsed=1.037024ms
WARN [09-15|10:12:39.354] Served debug_traceCall                   reqid=37 t=18.875244ms err="execution timeout"
INFO [09-15|10:12:48.072] Trace done                               elapsed=1.176966ms
WARN [09-15|10:12:48.074] Served debug_traceCall                   reqid=38 t=18.597476ms err="execution timeout"
INFO [09-15|10:12:48.853] Trace done                               elapsed=1.202925ms
WARN [09-15|10:12:48.855] Served debug_traceCall                   reqid=39 t=16.350088ms err="execution timeout"
INFO [09-15|10:12:49.550] Trace done                               elapsed=1.175699ms
WARN [09-15|10:12:49.552] Served debug_traceCall                   reqid=40 t=22.687372ms err="execution timeout"
INFO [09-15|10:14:12.953] Trace done                               elapsed=1.156763ms
WARN [09-15|10:14:12.954] Served debug_traceCall                   reqid=41 t=17.97693ms  err="execution timeout"
INFO [09-15|10:14:12.962] Trace done                               elapsed=1.238879ms
WARN [09-15|10:14:12.964] Served debug_traceCall                   reqid=42 t=8.683642ms  err="execution timeout"
INFO [09-15|10:14:12.972] Trace done                               elapsed=1.213763ms
WARN [09-15|10:14:12.974] Served debug_traceCall                   reqid=43 t=10.025599ms err="execution timeout"
INFO [09-15|10:14:12.982] Trace done                               elapsed=1.206566ms
WARN [09-15|10:14:12.984] Served debug_traceCall                   reqid=44 t=9.40188ms   err="execution timeout"
INFO [09-15|10:14:12.991] Trace done                               elapsed=1.200257ms
WARN [09-15|10:14:12.993] Served debug_traceCall                   reqid=45 t=9.11482ms   err="execution timeout"
INFO [09-15|10:14:13.003] Trace done                               elapsed=1.197946ms
WARN [09-15|10:14:13.004] Served debug_traceCall                   reqid=46 t=9.383057ms  err="execution timeout"
INFO [09-15|10:14:13.011] Trace done                               elapsed=1.210468ms
WARN [09-15|10:14:13.013] Served debug_traceCall                   reqid=47 t=8.535531ms  err="execution timeout"
INFO [09-15|10:14:13.020] Trace done                               elapsed=1.184903ms
WARN [09-15|10:14:13.022] Served debug_traceCall                   reqid=48 t=8.600543ms  err="execution timeout"
INFO [09-15|10:14:13.030] Trace done                               elapsed=1.163352ms
WARN [09-15|10:14:13.032] Served debug_traceCall                   reqid=49 t=9.858417ms  err="execution timeout"
INFO [09-15|10:14:13.042] Trace done                               elapsed=1.173591ms
WARN [09-15|10:14:13.043] Served debug_traceCall                   reqid=50 t=11.201256ms err="execution timeout"
INFO [09-15|10:14:37.894] Trace done                               elapsed=1.025216ms
WARN [09-15|10:14:37.896] Served debug_traceCall                   reqid=52 t=18.575947ms err="execution timeout"
INFO [09-15|10:14:37.903] Trace done                               elapsed=1.22123ms
WARN [09-15|10:14:37.905] Served debug_traceCall                   reqid=53 t=9.035504ms  err="execution timeout"
INFO [09-15|10:14:37.913] Trace done                               elapsed=1.178614ms
WARN [09-15|10:14:37.914] Served debug_traceCall                   reqid=54 t=8.956603ms  err="execution timeout"
INFO [09-15|10:14:37.922] Trace done                               elapsed=1.20833ms
WARN [09-15|10:14:37.923] Served debug_traceCall                   reqid=55 t=8.753149ms  err="execution timeout"
INFO [09-15|10:14:37.931] Trace done                               elapsed=1.212588ms
WARN [09-15|10:14:37.932] Served debug_traceCall                   reqid=56 t=8.385952ms  err="execution timeout"
INFO [09-15|10:14:37.940] Trace done                               elapsed=1.215262ms
WARN [09-15|10:14:37.942] Served debug_traceCall                   reqid=57 t=9.881463ms  err="execution timeout"
INFO [09-15|10:14:37.950] Trace done                               elapsed=1.202438ms
WARN [09-15|10:14:37.952] Served debug_traceCall                   reqid=58 t=9.149201ms  err="execution timeout"
INFO [09-15|10:14:37.959] Trace done                               elapsed=1.201389ms
WARN [09-15|10:14:37.960] Served debug_traceCall                   reqid=59 t=8.370293ms  err="execution timeout"
INFO [09-15|10:14:37.968] Trace done                               elapsed=1.209285ms
WARN [09-15|10:14:37.970] Served debug_traceCall                   reqid=60 t=9.475413ms  err="execution timeout"
INFO [09-15|10:14:37.978] Trace done                               elapsed=1.186982ms
WARN [09-15|10:14:37.979] Served debug_traceCall                   reqid=61 t=8.945892ms  err="execution timeout"

With the call to env.Cancel, the evm actually aborts after max 1000 more steps. If the cause of the OOMs are
lingering evm executions piling up, then this PR should fix it (or at least make it a lot more robust).

@holiman
Copy link
Contributor Author

holiman commented Sep 15, 2021

Actually, while I think this change is correct and good, I don't think 'lingering' evm executions can happen. The RPC call will only ever return once the evm execution is finished.

Copy link
Member

@MariusVanDerWijden MariusVanDerWijden left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SGTM, this change is probably good regardless of whether it fixes the underlying problem

@holiman holiman merged commit 7ada89d into ethereum:master Sep 16, 2021
@holiman holiman added this to the 1.10.9 milestone Sep 16, 2021
maoueh pushed a commit to streamingfast/go-ethereum that referenced this pull request Jan 31, 2022
* eth/tracers: implement debug.intermediateRoots (ethereum#23594)

This PR implements a new debug method, which I've talked briefly about to some other client developers. It allows the caller to obtain the intermediate state roots for a block (which might be either a canon block or a 'bad' block).
Signed-off-by: wenbiao <delweng@gmail.com>

* core, rpc: disable memory output by default in traces (ethereum#23558)

* core: cmd: invert disableMemory

* core: fix missed inversion

* cmd/evm: preserve Flags but change default value

* Apply suggestions from code review

Co-authored-by: Martin Holst Swende <martin@swende.se>

Co-authored-by: Martin Holst Swende <martin@swende.se>
Signed-off-by: wenbiao <delweng@gmail.com>

* eth/tracers: abort evm execution when trace is aborted (ethereum#23580)

Signed-off-by: wenbiao <delweng@gmail.com>

* eth/tracers: avoid unsyncronized mutations on trie database (ethereum#23632)

This PR fixes an issue in traceChain, where the statedb Commit operation was performed asynchronously with dereference-operations agains the underlying trie.Database instance. Due to how the reference counting works within the trie database (where parent count is recursively updated when new parents are added), doing dereferencing in the middle of Commit can cause the refcount to become wrong, leading to an inconsistent state. 

This was fixed by doing Commit/Deref from the same routine.  
Signed-off-by: wenbiao <delweng@gmail.com>

* core,eth: call frame tracing (ethereum#23087)

This change introduces 2 new optional methods; `enter()` and `exit()` for js tracers, and makes `step()` optiona. The two new methods are invoked when entering and exiting a call frame (but not invoked for the outermost scope, which has it's own methods). Currently these are the data fields passed to each of them:

    enter: type (opcode), from, to, input, gas, value
    exit: output, gasUsed, error

The PR also comes with a re-write of the callTracer. As a backup we keep the previous tracing script under the name `callTracerLegacy`. Behaviour of both tracers are equivalent for the most part, although there are some small differences (improvements), where the new tracer is more correct / has more information.

Signed-off-by: wenbiao <delweng@gmail.com>

* eth/tracers: re-write of 4byte tracer using enter/exit (ethereum#23622)

* eth/tracers: add re-write of 4byte tracer using enter/exit

* eth/tracers: fix 4byte indent
Signed-off-by: wenbiao <delweng@gmail.com>

* eth/tracers: tx.BaseFee not implemented

Signed-off-by: wenbiao <delweng@gmail.com>

* eth/tracers: do the JSON serialization via .js to capture C faults

Signed-off-by: wenbiao <delweng@gmail.com>

* eth/tracers: fix callTracer fault handling (ethereum#23667)

* eth/tracers: fix calltracer fault handling

* eth/tracers: fix calltracer indentation
Signed-off-by: wenbiao <delweng@gmail.com>

* eth/tracers: invoke enter/exit on 0-value calls to inex accounts (ethereum#23828)

Signed-off-by: wenbiao <delweng@gmail.com>

* eth: make traceChain avoid OOM on long-running tracing (ethereum#23736)

This PR changes long-running chain tracing, so that it at some points releases the memory trie db, and switch over to a fresh disk-backed trie.
Signed-off-by: wenbiao <delweng@gmail.com>

* eth/tracers: expose contextual infos (block hash, tx hash, tx index)

Signed-off-by: wenbiao <wenbiao.zheng@ambergroup.io>

* eth/tracers: redefine Context

Signed-off-by: wenbiao <wenbiao.zheng@ambergroup.io>

* eth/tracers: support for golang tracers + add golang callTracer (ethereum#23708)

* eth/tracers: add basic native loader

* eth/tracers: add GetResult to tracer interface

* eth/tracers: add native call tracer

* eth/tracers: fix call tracer json result

* eth/tracers: minor fix

* eth/tracers: fix

* eth/tracers: fix benchTracer

* eth/tracers: test native call tracer

* eth/tracers: fix

* eth/tracers: rm extra make

Co-authored-by: Martin Holst Swende <martin@swende.se>

* eth/tracers: rm extra make

* eth/tracers: make callFrame private

* eth/tracers: clean-up and comments

* eth/tracers: add license

* eth/tracers: rework the model a bit

* eth/tracers: move tracecall tests to subpackage

* cmd/geth: load native tracers

* eth/tracers: minor fix

* eth/tracers: impl stop

* eth/tracers: add native noop tracer

* renamings

Co-authored-by: Martin Holst Swende <martin@swende.se>

* eth/tracers: more renamings

* eth/tracers: make jstracer non-exported, avoid cast

* eth/tracers, core/vm: rename vm.Tracer to vm.EVMLogger for clarity

* eth/tracers: minor comment fix

* eth/tracers/testing: lint nitpicks

* core,eth: cancel evm on nativecalltracer stop

* Revert "core,eth: cancel evm on nativecalltracer stop"

This reverts commit 01bb908.

* eth/tracers: linter nits

* eth/tracers: fix output on err

Co-authored-by: Martin Holst Swende <martin@swende.se>
Signed-off-by: wenbiao <wenbiao.zheng@ambergroup.io>

* eth/tracers: make native calltracer default (ethereum#23867)

Signed-off-by: wenbiao <wenbiao.zheng@ambergroup.io>

* eth/tracers: package restructuring (ethereum#23857)

* eth/tracers: restructure tracer package

* core/vm/runtime: load js tracers

* eth/tracers: mv bigint js code to own file

* eth/tracers: add method docs for native tracers

* eth/tracers: minor doc fix

* core,eth: cancel evm on nativecalltracer stop

* core/vm: fix failing test

Co-authored-by: Sina Mahmoodi <itz.s1na@gmail.com>
Signed-off-by: wenbiao <wenbiao.zheng@ambergroup.io>

* eth/tracers: ethapi.TransactionArgs was not merged

Signed-off-by: wenbiao <wenbiao.zheng@ambergroup.io>

* eth/tracers: fix the api_test with ErrInsufficientFunds to ErrInsufficientFundsForTransfer

Signed-off-by: wenbiao <wenbiao.zheng@ambergroup.io>

* eth/tracers: check posa before statedb.Prepare in IntermiateRoots api

Signed-off-by: wenbiao <wenbiao.zheng@ambergroup.io>

* eth/tracers: make js calltracer default, compatible with old version

Signed-off-by: wenbiao <wenbiao.zheng@ambergroup.io>

* eth/tracers: fix the default callTrace name of callTracerJs

Signed-off-by: wenbiao <wenbiao.zheng@ambergroup.io>

* Revert "eth/tracers: fix the default callTrace name of callTracerJs"

This reverts commit 62a3bc215d9f07e422a4c659289bb3ba4f9ed2fa.

Signed-off-by: wenbiao <wenbiao.zheng@ambergroup.io>

* Revert "eth/tracers: make js calltracer default, compatible with old version"

This reverts commit 85ef42c0ea651f0b228d4209b1b2598b24e12f1f.

Signed-off-by: wenbiao <wenbiao.zheng@ambergroup.io>

* eth/tracers: fix the variable race condition

Signed-off-by: wenbiao <wenbiao.zheng@ambergroup.io>

Co-authored-by: Martin Holst Swende <martin@swende.se>
Co-authored-by: Marius van der Wijden <m.vanderwijden@live.de>
Co-authored-by: Sina Mahmoodi <1591639+s1na@users.noreply.github.com>
Co-authored-by: Péter Szilágyi <peterke@gmail.com>
Co-authored-by: Sina Mahmoodi <itz.s1na@gmail.com>
yongjun925 pushed a commit to DODOEX/go-ethereum that referenced this pull request Dec 3, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
2 participants