broken parity 2.0.5-stable - refuses to accept new transactions after ~500 tx [possibly a multi-threading issue, see new comments] #9582
@drandreaskrueger I think there is the same issue with geth too, can you check it? |
No, there is not. Or rather, more precisely: no, I did not see a similar issue with geth. |
@drandreaskrueger Have you checked that geth purges transactions on the mining node's txpool? Here is my issue for stale transactions on geth ethereum/go-ethereum#17679 It can happen when sending a lot of transactions from another node to the mining node |
Thanks for bumping up my issue @eosclassicteam, but ... for the second time now: you tend to create confusion and chaos with your posts in other people's issue threads. Your post is unrelated to this issue here. I don't have to "purge all transactions on local txpool" because I am always starting from scratch again. |
@drandreaskrueger 😂 Sorry for bothering you, do you have any contact info? Are you online on discord or Gitter? |
@tomusdrw any ideas? |
Also tried with the currently newest stable
and then a communication test (deploying contract, reading it, writing a state change):
which worked fine. Then the benchmarking viewer:
and the send-20000-transactions task:
Good news: It starts off considerably faster than any of the older v1.x.y parity versions (see e.g. run15-run17). See below. Bad news: But then it fails to digest any transaction after ~2000 transactions:
|
Similar with the currently newest beta
|
In contrast, v1.11.11 keeps on accepting all transactions: "run 13" https://gitlab.com/electronDLT/chainhammer/blob/master/parity.md#run-13
|
I have new information for you. I have tried again with your v2.2.0-nightly of 3 days ago. It is actually working fine, but only when single-threaded hammering at the node. It is not working (as described above) when multi-threaded hammering at the node. Will post more details soon. |
|
For how to replicate this, see the post above. No-problem vs. problem = single-threaded versus multi-threaded. Compare the following two runs: (1) single-threaded RPC calls ... are actually working fine !!! All 20k transactions enter the chain if the chainhammer is started with
See:
But: (2) multi-threaded RPC calls with 23 workers ... i.e. if the chainhammer is started with
then that leads to a refusal after a few thousand transactions:
i.e. after 5529 transactions ... no new ones can enter the chain. Afterwards the node does not accept anything, also not if I restart my experiments. The only thing that helps is Ctrl-C and restart. Then e.g. the single-threaded chainhammering works fine again; I did a 2nd attempt, and again all 20k transactions go through. (Just run my stuff yourself, it takes you only a few minutes.) Hope that helps to give you an idea how to fix this. |
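The single-threaded vs. multi-threaded submission patterns contrasted above can be sketched roughly as follows. This is a minimal stdlib-only illustration, not the actual chainhammer code: `send_raw` is a hypothetical stand-in for the HTTP POST to the node's RPC port, and the addresses are placeholders; only the worker count 23 comes from the run described.

```python
import json
from concurrent.futures import ThreadPoolExecutor

def make_payload(rpc_id, nonce):
    # one eth_sendTransaction JSON-RPC payload (fields abbreviated)
    return json.dumps({
        "jsonrpc": "2.0",
        "id": rpc_id,
        "method": "eth_sendTransaction",
        "params": [{"from": "0xaaaa", "to": "0xbbbb", "nonce": hex(nonce)}],
    })

def send_raw(payload):
    # placeholder for the real HTTP POST to the node;
    # here we just pretend the node accepted the transaction
    return {"result": "0x" + "00" * 32}

def hammer_sequential(n):
    # one request at a time: requests (and nonces) reach the node strictly in order
    return [send_raw(make_payload(i, i)) for i in range(n)]

def hammer_threaded(n, workers=23):
    # 23 concurrent workers: requests may reach the node out of nonce order,
    # which is the load pattern that triggered the refusal after ~5500 tx
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda i: send_raw(make_payload(i, i)), range(n)))
```

The only difference between the two runs is the dispatch loop; the payloads are identical, which is what points at a concurrency issue on the node side rather than at the transactions themselves.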
P.S.: The problem did not exist yet in parity v1; that is why I have run all my recent experiments with v1.11.11 |
@ddorgan - you had seen this too when I visited you in the Berlin office. Any new ideas how to fix this? Then we could finally move up from v1.11.11 to parity v2.x.y ... ? |
does this happen with 2.2.3? cc @tomusdrw |
chainhammer v55: You can now try it all yourself; the newest version is so automated that you can run & analyze a whole experiment with one or two CLI lines. That should help you to find the flaw. Because it is of more general interest, and useful for several open issues, I have created a new issue: |
Thanks for sharing. I would advise using random accounts to hammer the chain, this will prevent nonce hiccups. |
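The "nonce hiccups" this advice is about can be illustrated with a small sketch (stdlib only, assumed in-process counters rather than real chain state): when many threads send from one account they all share a single nonce sequence, so one dropped transaction leaves a gap that stalls every higher nonce; random accounts give each sender its own short, independent sequence.

```python
import random
import threading

class NonceTracker:
    """Hands out consecutive nonces per account; a gap stalls all later nonces."""
    def __init__(self):
        self._lock = threading.Lock()
        self._next = {}

    def next_nonce(self, account):
        with self._lock:
            nonce = self._next.get(account, 0)
            self._next[account] = nonce + 1
            return nonce

tracker = NonceTracker()

# one account: every transaction extends the same nonce sequence,
# so a single stuck tx blocks all that come after it
single = [tracker.next_nonce("0xaaaa") for _ in range(5)]   # [0, 1, 2, 3, 4]

# random accounts: each sequence is short and independent,
# so one stuck transaction cannot stall the rest of the load
accounts = ["0x%04x" % random.randrange(2**16) for _ in range(5)]
spread = [tracker.next_nonce(a) for a in accounts]
```

Whether to benchmark with one account or many is exactly the trade-off debated in the next comment.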
You're welcome. If my work can help to fix bugs, and make parity faster, that would be great.
Yes, for some cases this might be the way to go, for others not. However, IMHO it makes sense to benchmark the general case, not the special edge case. And geth or quorum don't need such a workaround.
Perhaps reopen this issue, until the seen problem (of parity v2.x.y just refusing to accept any more transactions after a few thousand tx) is actually fixed? |
@drandreaskrueger perhaps if it is still apparent after #10375 is merged? |
@joshua-mir yes and esp the title of #10344 really sounds like what I had been seeing there. |
@drandreaskrueger great to hear about a new chainhammer version, will check it out. |
Yes, it is muuuuuuch more automated. Looking forward to your feedback. I ran it a few times yesterday with parity v2.2.3, and parity refused to accept new transactions even when firing just single-threaded (with my "sequential", not the "threaded2 20" algo). So ... completely switching back to v1.11.11 for now: drandreaskrueger/chainhammer@f8f2760. But with the new chainhammer version v55, I am optimistic that one of you will find the cause, and can run enough tests easily, to get rid of this #9582 issue soon. Good luck ;-) |
I still have issues with hung TXs. I need to investigate more to gather useful information, but from a user perspective: if I set too much parallelism then I get multiple timeouts for everything (even for read-only eth_call!). Thanks to this thread, I'm going to try 1.11.11 |
@Pzixel have you checked out the recommendations here: #10382 (comment) |
Yep, I've bookmarked them. In a nutshell
|
@Pzixel There is no scheduler for the calls, so if you spam the node with huge amount of requests most of them will simply timeout on the client side. |
I actually didn't, because when I was writing this code it wasn't supported; I could check it with the new version. However, I don't really think 1-5k rps is a big amount that should remain responseless for 20 seconds. I may be wrong, so I'd like to hear your opinion on whether parity can handle this rps. |
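Given the comment above that there is no server-side scheduler, one client-side mitigation is to cap the number of in-flight requests with a semaphore, so excess requests queue locally instead of timing out at the node. A minimal sketch, assuming a hypothetical `post_rpc` stand-in for the real HTTP call; the limit of 50 is an arbitrary example, to be tuned against the node:

```python
import threading
from concurrent.futures import ThreadPoolExecutor

MAX_IN_FLIGHT = 50          # example value: tune to what the node handles comfortably
gate = threading.Semaphore(MAX_IN_FLIGHT)

def post_rpc(payload):
    # stand-in for the real HTTP POST to the node's RPC port
    return {"result": "ok"}

def throttled_send(payload):
    # at most MAX_IN_FLIGHT requests hit the node at once;
    # the rest block here, client-side, instead of piling up at the node
    with gate:
        return post_rpc(payload)

# 200 workers generate load, but the semaphore caps concurrency at 50
with ThreadPoolExecutor(max_workers=200) as pool:
    results = list(pool.map(throttled_send, range(1000)))
```

This does not make the node faster; it just converts server-side timeouts into client-side waiting, which is usually the better failure mode.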
Hey @Pzixel are you still having that issue? I am experiencing something similar in my 4-node network with this configuration: #10382 (comment) If I spam the nodes with 1k req/s (simple RPC eth_sendTransaction over HTTP) I see a big degradation of the throughput, and a lot of requests simply get ignored. @tomusdrw Do you have any updates on that? |
Under huge load parity eventually stops processing transactions. It just takes as many as specified in the max queue parameter in the toml and then hangs; that is, it never generates a block with those transactions, while new txs get rejected with a "queue limit exceeded" error message. We didn't find any better approach than just rebooting all the mining nodes when this happens. Parity Tech seems to be focused on substrate, so parity itself doesn't get much attention or support. Considering how bad ethereum is, it leads to a pretty sad development process. |
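For the transient case where the queue fills up but the node keeps mining, a client can back off and retry instead of dropping transactions. A hedged sketch: `fake_submit` simulates a node whose queue frees up after two rejections, and everything except the quoted error string is hypothetical. Note this only helps with temporary queue-full rejections; it cannot fix the permanent hang described above, where only a reboot helps.

```python
import time

class QueueLimitExceeded(Exception):
    """Raised when the node rejects a tx with 'queue limit exceeded'."""

def submit_with_backoff(submit, payload, retries=5, base_delay=0.01):
    # retry a rejected submission with exponential backoff
    for attempt in range(retries):
        try:
            return submit(payload)
        except QueueLimitExceeded:
            time.sleep(base_delay * (2 ** attempt))
    raise QueueLimitExceeded("still rejected after %d retries" % retries)

# fake node whose queue frees up after two rejections
state = {"rejections_left": 2}
def fake_submit(payload):
    if state["rejections_left"] > 0:
        state["rejections_left"] -= 1
        raise QueueLimitExceeded("queue limit exceeded")
    return "0xtxhash"

print(submit_with_backoff(fake_submit, {}))  # → 0xtxhash
```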
I had to downgrade to parity v1.11.11 because parity 2.0.5 is broken.
Please see the two runs summarized below.
actual behavior / v2.0.5-"stable"
https://gitlab.com/electronDLT/chainhammer/blob/master/parity.md#run-11-on-amazon-t2large
There are 19495 more transactions hammered at the node, but none of them go through.
expected behavior / v1.11.11
https://gitlab.com/electronDLT/chainhammer/blob/master/parity.md#how-to-downgrade-to-a-stable-stable-version
v1.11.11 works just fine.
reproduce
Reproduce this with my public Amazon AMI image ... quickly in ~10 minutes:
https://gitlab.com/electronDLT/chainhammer/blob/master/reproduce.md#readymade-amazon-ami