Rejected header marked as bad #11473
Comments
Are you running thorax/erigon:2.60.5-amd64 by any chance? We had this issue as well and found out that the integration cmd is no longer built in. Maybe someone here can explain the background of this change? @AskAlexSharov
I am running [erigon_3.0.0-alpha1_linux_arm64.tar.gz]
Does this still happen with v3.0.0-alpha2?
Yes, and it's a duplicate of #10734
…12404)

relates to #11387, #11473, #10734

Tried to simulate the OOM using #11799.

What I found was an infinitely growing allocation of headers when receiving new header messages in sentry's `blockHeaders66` handler (see the screenshot below). This appears to happen because, in the case of a bad child header, we delete it from the `links` map, but its parent link still holds a reference to it, so the deleted link and header never get gc-ed. Furthermore, if new similar bad hashes arrive after the deletion, they get appended to their parent header's link, and the children of that link can grow indefinitely ([here](https://github.com/erigontech/erigon/blob/main/turbo/stages/headerdownload/header_algos.go#L1085-L1086)).

Confirmed with debug logs (note that the link at 13450415 has 140124 children):

```
DBUG[10-21|18:18:05.003] [downloader] InsertHeader: Rejected header parent info hash=0xb488d67deaf4103880fa972fd72a7a9be552e3bc653f124f1ad9cb45f36bcd07 height=13450415 children=140124
```

The solution is to remove the bad link from its parent's child list [here](https://github.com/erigontech/erigon/blob/main/turbo/stages/headerdownload/header_algos.go#L544) so that 1) it gets gc-ed and 2) the children list does not grow indefinitely.

![oom-heap-profile2](https://github.com/user-attachments/assets/518fa658-c199-48b6-aa2d-110673264144)
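To illustrate the leak and the fix described in the commit message above, here is a minimal, self-contained Go sketch. It is not Erigon's code: the `Link`, `Graph`, `Add`, and `RemoveBad` names and fields are hypothetical simplifications, and only the idea (detach a bad link from its parent's child list when deleting it from the `links` map) is taken from the description.

```go
package main

import "fmt"

// Link is a hypothetical, simplified stand-in for a header link.
// The real type in Erigon has different fields.
type Link struct {
	Hash     string
	Parent   *Link
	Children []*Link
}

// Graph keeps links by hash, loosely mirroring the `links` map.
type Graph struct {
	links map[string]*Link
}

func NewGraph() *Graph {
	return &Graph{links: make(map[string]*Link)}
}

// Add registers a link and appends it to its parent's child list
// when the parent is already known.
func (g *Graph) Add(hash, parentHash string) *Link {
	l := &Link{Hash: hash}
	if p, ok := g.links[parentHash]; ok {
		l.Parent = p
		p.Children = append(p.Children, l)
	}
	g.links[hash] = l
	return l
}

// RemoveBad deletes a link from the map AND detaches it from its parent.
// Without the detach step the parent keeps the link reachable, so it is
// never garbage collected and the child list grows on every re-arrival.
func (g *Graph) RemoveBad(hash string) {
	l, ok := g.links[hash]
	if !ok {
		return
	}
	delete(g.links, hash)
	if p := l.Parent; p != nil {
		for i, c := range p.Children {
			if c == l {
				// Swap-remove: child order is not preserved in this sketch.
				last := len(p.Children) - 1
				p.Children[i] = p.Children[last]
				p.Children[last] = nil // drop the reference held by the backing array
				p.Children = p.Children[:last]
				break
			}
		}
		l.Parent = nil
	}
}

func main() {
	g := NewGraph()
	g.Add("parent", "")
	// The same bad header arrives repeatedly and is rejected each time.
	for i := 0; i < 3; i++ {
		g.Add("bad", "parent")
		g.RemoveBad("bad")
	}
	fmt.Println(len(g.links["parent"].Children)) // prints 0
}
```

If the detach loop in `RemoveBad` is removed, the same run leaves three entries in the parent's child list, which is the unbounded growth pattern the debug log shows at a much larger scale.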
Currently facing the following:
The node had been running for several days without issues until today.
It keeps spamming the following logs, and restarting the node does not help.
After reading previous issues, I attempted to run:
integration state_stages --unwind=10 --datadir="/root/erigon/.local/share/erigon/bor-mainnet" --chain="bor-mainnet"
However, 'integration' is not recognized as a command.