unreasonably high memory usage (without crash) and won't shut down #10821
How did you collect the memory stats shown in the CSV?
So it's RSS, perfect! :) The number of pending txs is pretty high, is that a normal amount in your setup?
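For anyone wanting to reproduce this kind of log: the thread does not say exactly how the original CSV was produced, but a minimal sketch of collecting per-process RSS into a CSV on Linux (assuming `/proc/<pid>/status` as the source, which matches the RSS figures discussed here) could look like this:

```python
import csv
import time

def read_rss_kb(pid):
    """Return the resident set size (VmRSS) of `pid` in kilobytes,
    read from /proc/<pid>/status on Linux."""
    with open(f"/proc/{pid}/status") as f:
        for line in f:
            if line.startswith("VmRSS:"):
                return int(line.split()[1])  # the value is reported in kB
    raise RuntimeError(f"VmRSS not found for pid {pid}")

def log_rss(pid, path, interval_s=60, samples=3):
    """Append `samples` timestamped RSS readings for `pid` to a CSV file."""
    with open(path, "a", newline="") as f:
        writer = csv.writer(f)
        for _ in range(samples):
            writer.writerow([int(time.time()), read_rss_kb(pid)])
            f.flush()
            time.sleep(interval_s)
```

Pointing `log_rss` at the parity PID with a 60 s interval would produce a timestamp/RSS series comparable to the attached memlog.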
Yeah, I'm parsing pending transactions into my DB. Check my start parameters :)
Other than staying in sync, what is the node doing? I.e. what kind of RPC traffic is it used for?
I'm using these RPCs for every new block:
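The specific RPC list was lost from the comment above, so as illustration only, here is a sketch of the kind of per-block JSON-RPC traffic such a setup typically generates (the method names and the `127.0.0.1:8545` endpoint are assumptions, not taken from the thread):

```python
import json
import urllib.request

RPC_URL = "http://127.0.0.1:8545"  # assumed default HTTP RPC endpoint

def rpc_payload(method, params=None, req_id=1):
    """Build a JSON-RPC 2.0 request body."""
    return json.dumps({"jsonrpc": "2.0", "id": req_id,
                       "method": method, "params": params or []})

def rpc_call(method, params=None):
    """POST a single JSON-RPC request and return the decoded `result`."""
    req = urllib.request.Request(
        RPC_URL, data=rpc_payload(method, params).encode(),
        headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["result"]

# Hypothetical per-block usage, e.g.:
#   block = rpc_call("eth_getBlockByNumber", ["latest", True])
#   then one eth_getTransactionReceipt per transaction in the block.
```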
I've been running a recent master build with your params now for ~6h and memory usage seems stable. While it's possible that this has been fixed in master, it is more probable that the leak is somewhere in the RPC layer. I need to set up some kind of load testing script to debug this further.
@iFA88 Do you have the possibility to confirm my findings by running a node without RPC traffic, just to check that it is indeed the RPC layer causing issues? Also, if you have a load testing script or something similar already written, that'd be helpful too ofc. Thanks!
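The load-testing script asked about here isn't in the thread; a minimal sketch of one, which simply fans a callable out over a thread pool and counts outcomes, could look like this (the callable it drives is a placeholder for any single RPC request):

```python
import concurrent.futures

def hammer(call, requests=1000, workers=8):
    """Issue `requests` invocations of `call()` across `workers` threads;
    return a (successes, failures) tuple."""
    ok = err = 0
    with concurrent.futures.ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(call) for _ in range(requests)]
        for fut in futures:
            try:
                fut.result()
                ok += 1
            except Exception:
                err += 1
    return ok, err

# Usage sketch: pass any callable that performs one RPC round-trip,
# e.g. hammer(lambda: some_rpc_call("eth_blockNumber"), requests=10_000)
```

Watching the node's RSS while such a loop runs would help confirm or rule out a leak in the RPC layer.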
In the log I see you must be running with |
I have run the node without any RPC calls, but the memory has increased continuously. Here is the log, but please ignore the peer and pending TX values: https://www.fusionsolutions.io/doc/memlog2.tar.gz Without any RPC calls the shutdown works very fast.
You are right, tracing was
Ok, not a problem. It explains why I couldn't repeat it. I'd have to slow-sync the whole chain to reproduce it now, I think, so I'm going to try using the Goerli testnet and see if I can see the issue there. If you have the means to do so, it would be great if you could try on Goerli as well using 2.4.x. Thanks!
@dvdplm Sadly not, but if you wish I can set some trace parameters.
Is your node synched? |
Ofc, and you have seen that in the logs. |
Yeah, still synching Kovan here with traces. Goerli is synched and after 12+ hours shows no signs of memory leaks.
@dvdplm are you testing this on macOS? The problem could be related to heapsize, which uses jemallocator only on macOS. |
@ordian yes, and yes it is possible that this is a platform issue, but we'll see. For now I'm trying to rule out the obvious stuff. I'm not sure how long it takes to slow-sync mainnet with tracing on, but judging how long it takes on Kovan I think it could take weeks so I was hoping to find an easier way to reproduce this. |
@ordian I will upgrade my parity to https://github.com/paritytech/parity-ethereum/releases/tag/v2.4.9 as I see that this build has the commit. I need to SIGKILL the process, because it doesn't shut down.
@ordian That's not the commit?:
@iFA88 you're comparing v2.4.9 with master, so it shows you the difference, i.e. the commits that are in master and not in 2.4.9. |
@ordian Yes, I was wrong! If you can build the current master branch for Linux, then I can use that; sadly I don't have any build tools right now.
@iFA88 I think you can download a recent nightly from here (click the "Download" button on the right). It would be great if you could repeat the problem using that. An update on my end: Goerli is synched and does not leak any memory. Kovan is still synching (and has been really stable, but that is irrelevant here). |
@dvdplm Alright, I ran that binary now. I don't know why, but the classic chain works flawlessly. I have a trace of the shutdown, please look at it:
You mean running with
That is 2.4.6 so the latest fixes for shutdown problems are not included. Best would be to debug this further using the latest releases (or master builds). For shutdown issues it'd be good to enable |
@dvdplm Yes, I have a classic node which runs in archive trace mode and the RES usage does not go up past ~1.3 GB. I have left the shutdown trace parameter on now and am running
Sadly the new parity ( |
Ok, and just to be clear: you ran it on mainnet with tracing on just like before, same settings except for shutdown logging? Did you also experience shutdown problems with
@dvdplm yes and yes :( |
Ok, so @ordian, this tells us that this is not related to jemalloc, do you agree?
Ouch, that doesn't sound good. When you say "crashed", do you mean that the process hung in some way or did it actually crash? I mean, you write that you could still query the node over RPC, right? I am still synching mainnet, am about half-way through, but I anticipate it'll take a long while still. I wonder if there's any way you could share your database with us to speed up the investigation?
I called it crashed because the logging and the syncing had stopped. Maybe the main thread hung? Yes, I queried the block number to check whether the sync was working or not.
I would be glad to help, but I don't see any way we can speed this up. If you wish I can set some parameters for the parity node. If you have any ideas, share them.
Is there anything I can do? The two parity nodes which run on the main network eat all of my RAM after 1-2 days. A daily restart is not the best solution :(
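As a slightly less blunt stopgap than a fixed daily restart, a watchdog could restart the node only once its RSS crosses a threshold. A minimal sketch of the check (assuming Linux and `/proc`; the restart action itself is left to the operator and not shown):

```python
def needs_restart(pid, limit_kb):
    """Return True if the VmRSS of `pid` exceeds `limit_kb` kilobytes,
    reading the value from /proc/<pid>/status (Linux only)."""
    with open(f"/proc/{pid}/status") as f:
        for line in f:
            if line.startswith("VmRSS:"):
                return int(line.split()[1]) > limit_kb
    return False  # no VmRSS line, e.g. the process is a zombie

# A cron job or loop could call needs_restart(parity_pid, 12 * 1024 * 1024)
# and trigger a restart only when the ~12 GB ceiling is actually breached.
```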
I am facing similar issues with the latest Parity releases. I used to be able to sync easily and run other applications; now, however, after an hour or two of syncing it consumes all my RAM and running other applications is not possible, and even Parity alone causes the computer to lock up. Parity used to be faster to sync and lighter on RAM than Geth, but since I can control the RAM usage in Geth, I am looking to switch back.
I suspect the shutdown problem occurs when I send a shutdown signal to the node but the node still accepts RPC calls, and that prevents the shutdown process.
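One way to test this hypothesis would be to time the shutdown while deliberately keeping the RPC interface busy. A hedged sketch (the `poke` callable is a placeholder for a single RPC request; nothing here is Parity-specific):

```python
import signal
import time

def timed_shutdown(proc, poke=None, timeout_s=60):
    """Send SIGTERM to `proc` (a subprocess.Popen handle), optionally
    calling `poke()` in a loop while waiting, e.g. to keep the RPC
    interface busy. Return seconds until exit, or None on timeout."""
    start = time.monotonic()
    proc.send_signal(signal.SIGTERM)
    while time.monotonic() - start < timeout_s:
        if proc.poll() is not None:
            return time.monotonic() - start
        if poke is not None:
            try:
                poke()  # ignore failures once the RPC server is down
            except Exception:
                pass
        time.sleep(0.1)
    return None
```

Comparing the measured time with and without a `poke` callable would show whether in-flight RPC traffic really delays or blocks the shutdown.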
I have discovered that when I don't use the
Hey @dvdplm ! Can you please check my last comment with the |
@iFA88 Apologies for the late answer. I have not been able to reproduce the problem with RAM usage and
@dvdplm Do we have any command to get cache statuses (usable/limit) or any debug level/trace? |
No, not that I know of. It would be quite useful. |
I'm running now with |
The node now uses 9150 MB RES after 2 days with the above parameters.
So I think I'm seeing something similar here: omitting the |
I cannot measure the importing speed because every block has very different EVM calls. I will try now using the
Ok, it seems the issue is somehow solved. When I'm using
Greetings, sadly my
Parity-Ethereum/v2.4.6-stable-94164e1-20190514/x86_64-linux-gnu/rustc1.34.1
node eats an unreasonably high amount of memory. Node log and process statistics in CSV: https://www.fusionsolutions.io/doc/memlog.tar.gz
Start parameters are:
The memory usage should not be higher than 12 GB.
At 16:20:20 I killed the process with SIGKILL; this is the only way that I can shut down the process. I am glad to help with any trace parameters or statistics.
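The SIGKILL-only shutdown described above can at least be automated as an escalation, so the hard kill becomes a timed last resort rather than the default. A sketch using Python's `subprocess` (the process handle stands in for however the node is actually launched and supervised):

```python
import subprocess

def stop_or_kill(proc, grace_s=30):
    """Ask `proc` (a subprocess.Popen handle) to exit cleanly via
    SIGTERM; escalate to SIGKILL only after `grace_s` seconds,
    mirroring the hard kill the reporter had to fall back on."""
    proc.terminate()  # SIGTERM: give the node a chance to shut down
    try:
        proc.wait(timeout=grace_s)
    except subprocess.TimeoutExpired:
        proc.kill()   # SIGKILL as last resort; data may need repair
        proc.wait()
    return proc.returncode
```

Note that a SIGKILL can leave the database in a state that needs recovery on the next start, which is why the grace period comes first.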