erigon sepolia crashes with "fatal error: traceback did not unwind completely" #8517

Closed
ldeffenb opened this issue Oct 18, 2023 · 5 comments

@ldeffenb

System information

Erigon version: erigon version 2.53.0-dev-9e42b705

OS & Version: Ubuntu

Commit hash: 9e42b70

Erigon Command (with flags/config):

./github/erigon/build/bin/erigon --chain=sepolia \
	--datadir=./sepolia \
	--http=true --http.addr=0.0.0.0 \
	--http.port=8545 \
	--http.api=net,eth,web3 \
	--private.api.addr 0.0.0.0:8546 \
	--internalcl \
	--snapshots=true \
	--torrent.download.slots=4 --torrent.download.rate=128mb --torrent.upload.rate=4mb \
	--torrent.download.slots=4 \
	--torrent.download.rate=128mb \
	--torrent.upload.rate=4mb \
	--verbosity trace \
	--log.console.verbosity trace \
	--torrent.verbosity 5 \
	--port 36363 \
	--metrics \
	--metrics.addr 0.0.0.0 \
	--metrics.port 6060

Consensus Layer: internal

Consensus Layer Command (with flags/config): n/a

Chain/Network: sepolia

Expected behaviour

Erigon keeps running.

Actual behaviour

Erigon crashes at seemingly random times with "fatal error: traceback did not unwind completely".

Steps to reproduce the behaviour

Not sure.

Backtrace

Captured with 2>&1 | tee erigon-sepolia-44.out
Two log files with Debug verbosity:
erigon-sepolia-41.log
erigon-sepolia-42.log

Two log files with Trace verbosity:
erigon-sepolia-43.log
erigon-sepolia-44.log

@ldeffenb
Author

This is still happening on erigon version 2.53.0-dev-ec59be22. I was hoping that #8562 might have fixed it.
erigon-sepolia-101.txt

@AskAlexSharov
Collaborator

AskAlexSharov commented Oct 24, 2023

It's a Golang bug. Upgrade Golang:
golang/go#62182

@ldeffenb
Author

Interesting. I have two Ubuntu machines running this. The one that works has go version go1.20.1 linux/amd64; the one that fails has go version go1.21.1 linux/amd64. I just copied the working erigon build over to the failing machine and (knock on wood) it is working so far. I'll update again when it either completes the sync or fails.

@ldeffenb
Author

erigon built with go version go1.20.1 linux/amd64 has successfully finished syncing sepolia on the machine that was failing.

I have now installed go version go1.21.3 linux/amd64, rebuilt with make erigon, and am running that build on the machine that was failing and has recently completed the sync.

@ldeffenb
Author

Apparently @AskAlexSharov was correct. This issue was caused by golang/go#62182 even though that issue describes openbsd/arm64 and I'm running on ubuntu/amd64. It's some sort of timing or native code issue as mentioned in that golang issue:

"This causes the runtime to crash when an assembly function that modifies SP grows the stack or is preempted for GC. This should be pretty rare, but the crash is difficult to work around."

Given that, I can see why the CI tests for #8288 and #8008 did not show the problem.

But as long as you use golang 1.21.3 or later, apparently this issue is fixed.
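If you want a build or startup script to warn about this, the following is a rough sketch of a toolchain check. It is not anything erigon ships. The function names are hypothetical, and the affected range is an assumption: this thread only shows a crash on go1.21.1 and clean runs on go1.20.1 and go1.21.3, so treating every 1.21 release below 1.21.3 as suspect is a guess.

```go
package main

import (
	"fmt"
	"runtime"
	"strconv"
	"strings"
)

// parseGoVersion extracts major, minor and patch from strings such as
// "go1.21.1" or "go1.21". It reports ok=false for anything else,
// for example development or release-candidate builds.
func parseGoVersion(v string) (int, int, int, bool) {
	if !strings.HasPrefix(v, "go") {
		return 0, 0, 0, false
	}
	parts := strings.Split(strings.TrimPrefix(v, "go"), ".")
	if len(parts) < 2 || len(parts) > 3 {
		return 0, 0, 0, false
	}
	nums := make([]int, 3) // a missing patch component defaults to 0
	for i, p := range parts {
		n, err := strconv.Atoi(p)
		if err != nil {
			return 0, 0, 0, false
		}
		nums[i] = n
	}
	return nums[0], nums[1], nums[2], true
}

// suspectToolchain reports whether v falls in the range assumed here to be
// affected: Go 1.21 releases before 1.21.3. Only 1.21.1 was actually seen
// crashing in this thread; the rest of the range is an assumption.
func suspectToolchain(v string) bool {
	major, minor, patch, ok := parseGoVersion(v)
	if !ok {
		return false // unknown format: make no claim
	}
	return major == 1 && minor == 21 && patch < 3
}

func main() {
	v := runtime.Version()
	if suspectToolchain(v) {
		fmt.Printf("%s may hit golang/go#62182; consider go1.21.3 or later\n", v)
		return
	}
	fmt.Printf("%s is outside the range treated as suspect here\n", v)
}
```

Note that runtime.Version() reports the toolchain that compiled the running program, so the check has to be compiled into the binary you care about, or adapted to read the version from that binary instead.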


2 participants