Blocking messages in mpool might cause lagging synchronization #10518
Comments
Let me add some details. I now have two nodes. The bigger one has a chain dir of about 7TB, but I still don't want to re-import because that makes the service unavailable for up to hours; I just restart it after renaming/moving the chain dir to a chainold dir. The other node was imported from a brand-new snapshot. Let me call them nodeA and nodeB.

I moved the miner (still growing power) to nodeB. Without hotstore GC, nodeB experiences more severe and more frequent sync lag (dozens of times per hour, about 1-3 min behind, sometimes with 50-500 msgs in the mpool). With hotstore GC, nodeA is worse: it falls up to 40 min behind, while nodeB is significantly less affected at about 2-6 min behind.

I saw the discussion from the team. I think the impact of "msgs in mpool" may come from the miner's calls to State.

By the way, I track sync lag the same way lotus-miner info does, and I push a message to my phone when the node is behind.
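For anyone who wants to reproduce that kind of lag tracking, here is a minimal Python sketch. It assumes you already have the head tipset's minimum block timestamp (Unix seconds) from your node. The 30 s block delay is the mainnet epoch length; the 1.5x and 5x thresholds are an assumption modeled on how lotus-miner info labels sync status, not a copy of its code. The function name `classify_sync_lag` is hypothetical.

```python
import time

# Filecoin mainnet epoch duration in seconds.
BLOCK_DELAY_SECS = 30


def classify_sync_lag(head_min_timestamp, now=None, block_delay=BLOCK_DELAY_SECS):
    """Return (status, seconds_behind) for a chain head timestamp.

    Thresholds are an assumption for illustration:
      - less than 1.5 block delays behind -> "ok"
      - less than 5 block delays behind   -> "slow"
      - otherwise                         -> "behind"
    """
    if now is None:
        now = int(time.time())
    behind = now - head_min_timestamp
    if behind < block_delay * 3 // 2:
        return "ok", behind
    if behind < block_delay * 5:
        return "slow", behind
    return "behind", behind


if __name__ == "__main__":
    # Example: pretend the head tipset is 200 seconds old.
    status, behind = classify_sync_lag(int(time.time()) - 200)
    print(f"sync {status}: {behind}s behind")
```

A monitoring loop could call this once per epoch and send a phone notification whenever the status is not "ok".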
I have the same problem.
@SBudo observed the same behaviour: when he pushed 223 commit messages to the node and the messages got stuck due to the base fee, the node's sync froze or slowed down.
Getting the logs, will attach them here shortly. The issue might not be related to this one, so I will open a separate one (we can merge them later if needed).
Checklist
Latest release, the most recent RC (release candidate) for the upcoming release or the dev branch (master), or have an issue updating to any of these.
Lotus component
Lotus Version
Repro Steps
...
Describe the Bug
Since the last upgrade, I have run into some sync lag issues.
I'm using an older version of the datastore, and it takes up about 7TB of space.
The daemon node suddenly couldn't keep up with the chain. I thought the problem was that the database was too large, so I was ready to resync from a snapshot import.
At the same time I noticed the splitstore feature. I used to think it was immature, but after all this time it should be ready to use, and by design it maintains a smaller hotstore, so I might get two advantages.
So I turned on the splitstore feature with the following configuration
After restarting, sync seemed to get better, but the nodes still lagged behind from time to time.
I searched for a lot of solutions and ended up adding the following configuration
At the same time, I also found that when some messages in the mpool are blocked, the sync lag gets worse. I suspect that the large number of messages blocked in the mpool by the rising base fee also affects chain sync.
Therefore, I deliberately dedicated a single node to the growing miner so that I could confirm that messages in the mpool also affect sync. (In my testing, nodes with about 50-500 local messages blocked in the mpool experience more frequent and more severe sync lag.)
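To put a number on "blocked" messages, one rough approach is to count pending messages whose gas fee cap is below the current base fee, since such messages cannot be included until the base fee drops. The Python sketch below assumes you have already fetched the pending messages and the base fee from your node; the `PendingMessage` type, its fields, and `blocked_by_base_fee` are hypothetical names for illustration, not part of any Lotus API.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PendingMessage:
    # Hypothetical, simplified view of a pending mpool message.
    sender: str
    nonce: int
    gas_fee_cap: int  # attoFIL per gas unit


def blocked_by_base_fee(pending: List[PendingMessage], base_fee: int) -> List[PendingMessage]:
    """Return the messages whose fee cap is below the current base fee."""
    return [m for m in pending if m.gas_fee_cap < base_fee]


if __name__ == "__main__":
    # Made-up example values, for illustration only.
    pending = [
        PendingMessage("f01", 10, gas_fee_cap=500),
        PendingMessage("f01", 11, gas_fee_cap=2_000),
        PendingMessage("f02", 3, gas_fee_cap=900),
    ]
    stuck = blocked_by_base_fee(pending, base_fee=1_000)
    print(f"{len(stuck)} of {len(pending)} pending messages are below the base fee")
```

Logging this count next to the sync lag over time would make it easier to check whether the two actually move together.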
Chain sync lag had been greatly reduced until I hit hotstore GC, which almost killed my miner because the daemon fell 30+ min behind during GC.
@ZenGround0 provided an experimental patch, #10392; I will try it if hotstore GC causes the sync problem again.
@ZenGround0 also asked me to share my problem so that someone who knows more about that part of the system can try to work out the relationship between sync performance and blocked mpool messages.
Logging Information