accelerate parity aura TPS #10382
Hey @drandreaskrueger,

Chainhammer installation issues (Ubuntu 18.04)

There are a bunch of things missing in the installation script; afair it was:
That later caused issues killing containers, because of apparmor, so I disabled it without investigation.
Results?

Test machine was Scaleway's START1-M with Ubuntu 18.04. ALL tests were run with 50k transactions and
Note the results were just run once, and have no statistical significance whatsoever.

How?

Parity flags:
Chainhammer modifications:
Aura block time:

Explanations
Opinion

The results are pretty similar between multiple implementations, because clearly transaction signing is the limiting factor. So the benchmark currently is able to measure how fast modern processors can sign secp256k1, and a little bit of RPC performance. Note that since all transactions come from a single sender, importing to the queue has to be done sequentially for all clients (Quorum might be doing some batch importing thanks to their

Suggestions for improvements

I'd say that it might be worth specifying more detailed objectives of what we want to compare between implementations.

Make sure that blocks are always full

Currently the results are heavily dependent on how the transactions get distributed in the blocks and how long the block time is.

Import pre-signed transactions

To avoid testing the signing speed we can prepare pre-signed transactions and submit them via

Send transactions from multiple accounts

This will emulate the real world a bit more; that's also what transaction pools are optimized for (note that Parity was failing for you due to incorrect

Consider testing network behaviour, not the local node's pool and RPC

The idea for this test is to:
Consider submitting transactions to multiple nodes

At some point the RPC might become a bottleneck; to test the nodes' communication it might be better to issue requests to multiple nodes and see how they consolidate the transaction pools.

Consider running the test code on a separate machine than the test network

Currently the test code might affect the execution (spawning 20 threads); it might be worth separating the two.
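Combining two of the suggestions above (pre-signed transactions, submitted to multiple nodes), a minimal stand-alone sketch could look like this. The node URLs are placeholders, and presign() is only a hashing stand-in for real secp256k1 signing (which would use a library such as eth_account); eth_sendRawTransaction is the standard JSON-RPC method for submitting pre-signed transactions:

```python
import hashlib
import itertools
import json

def presign(tx: dict) -> str:
    # Hypothetical stand-in for real secp256k1 transaction signing;
    # we only hash here, to keep the sketch dependency-free.
    raw = json.dumps(tx, sort_keys=True).encode()
    return "0x" + hashlib.sha256(raw).hexdigest()

def prepare_batch(sender: str, n: int) -> list:
    """Phase 1: sign all transactions up front, outside the timed run."""
    return [presign({"from": sender, "nonce": i, "value": 1}) for i in range(n)]

def make_rpc_payloads(raw_txs: list, node_urls: list) -> dict:
    """Phase 2: spread eth_sendRawTransaction calls round-robin over nodes."""
    nodes = itertools.cycle(node_urls)
    batches = {url: [] for url in node_urls}
    for i, raw in enumerate(raw_txs):
        payload = {"jsonrpc": "2.0", "method": "eth_sendRawTransaction",
                   "params": [raw], "id": i}
        batches[next(nodes)].append(payload)
    return batches

raw_txs = prepare_batch("0xSENDER", 6)
batches = make_rpc_payloads(raw_txs,
                            ["http://node1:8545", "http://node2:8545"])
```

This way the timed part of the benchmark only measures queue import and pool consolidation, not signing speed.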
Closing since I believe there are no issues on our side.
Hooray, you did it! Congrats! I am very happy. That you found a way - and that I kept on believing in parity. In all those 185 days since I submitted this problem for the first time, I could not believe that parity should be slower than geth; that is why I was so persistent. And now you have finally proven the point. Thanks a million. Very good. In the next posts, I am going through your suggestions. I think you have mainly solved it, but there are some remaining - hopefully minor - problems.
install on Ubuntu
Thanks a lot. Those issues must be Ubuntu-related, because on Debian, I have never seen them. For now, I have simply put a note to your instructions into docs/FAQ.md#install-on-Ubuntu - please click. Perhaps you could contribute those scripts? Thanks.

your results

Great work, well done!
That is looking very good. I knew it ... So parity is just not configured optimally when run out-of-the-box.

your settings merged into a code branch

Your suggested code & CLI changes are now in this branch "issues/parity10382" https://github.com/drandreaskrueger/chainhammer/compare/issues/parity10382 - please have a look.

your CLI switches
Thanks. THAT is exactly what I had been hoping for, in the last 6 months. Great. Well done! Those switches, together with parity version v2.3.4, seem to do the trick that parity keeps on accepting transactions, so #9582 is probably solved. Most of the time, though not always (on a 1-CPU machine, I have still seen runs where it got stuck).

your CLI switches with instantseal

Two issues remain when trying instantseal with "threaded2 20": Once it happened that parity instantseal was stalling again, even with your new CLI switches. Plus, I got a serious new problem now: not all transactions ended up in the chain! Repeatedly send.py reported this:
Is instantseal perhaps not sealing the very last block? When you debug that, you can use the terminator script - then you always also see the output. For now, I keep instantseal with "sequential" in run-all_large.sh and run-all_small.sh, but for the aura runs it is now changed to "threaded2 20".

unlock

I assume
The syntax of the I have now introduced this parity-specific part into the unlockAccount() call. Thanks for that.
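For reference, a hedged sketch of the client difference (assumption: geth's personal_unlockAccount takes the unlock duration as a plain integer of seconds, while Parity wants null - unlock until the node exits - or a hex quantity, and may reject a bare integer):

```python
import json

def unlock_payload(address: str, passphrase: str, client: str) -> str:
    """Build a personal_unlockAccount JSON-RPC request.

    The duration handling below is the assumed client difference:
    geth takes seconds as an int; Parity expects null (or a hex string).
    """
    duration = 3600 if client == "geth" else None  # Parity: null, not int
    return json.dumps({"jsonrpc": "2.0",
                       "method": "personal_unlockAccount",
                       "params": [address, passphrase, duration],
                       "id": 1})
```

A client-aware wrapper like this avoids the failing unlock call that earlier kept Parity's sender account locked during the runs.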
Explanations

Thanks a lot! Very helpful.
I tried without it, but it did not work, so I have re-introduced it, keeping all your switches, but adding
Yes. Afri told me to always use that switch.
Yes, and apart from other problems that I saw today when omitting it ... I also would not be able to get the final 10 empty blocks, for better plotting of the diagrams.
oh, oops. I suggest you change that part in your aura algorithm then.
I see. Better find a way to uncouple the good effects of that switch from those unintended side effects.
Nice one, thanks.
Do you really think that would make a huge difference? (And if 10s is better, then why not directly go for e.g. 30s?) You can now easily run a whole batch of experiments, to dis/prove that point; have a look here, I made that for you: scripts/run-parity-vary-blocktime.sh - I tried a bit, but quickly ran into issues with gas-full blocks, so the gas limit has to be increased, which in turn seems to cause problems when running with short blocktimes (and ddorgan's parity-deploy provides no option to simply configure a different gas limit, so it needs patching) ... but that new script ... (Me, I have no time for testing & debugging all that right now ... sorry.)
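The batch idea can be sketched as follows; all numbers and the linear gas-limit scaling rule are illustrative assumptions, not what the script actually does:

```python
# One experiment per Aura block time; scale the per-block gas limit with
# the block time so that short block times do not overflow with gas-full
# blocks. BASE_GAS_LIMIT and the scaling rule are hypothetical choices.
BLOCK_TIMES = [1, 2, 5, 10, 30]       # seconds
BASE_GAS_LIMIT = 8_000_000            # gas per 1-second block (assumed)

def experiment_plan(block_times, base_gas_limit):
    """Return one config dict per experiment in the batch."""
    return [{"block_time_s": bt, "gas_limit": base_gas_limit * bt}
            for bt in block_times]

plan = experiment_plan(BLOCK_TIMES, BASE_GAS_LIMIT)
for run in plan:
    print(run)
```

Each config would then be handed to the network-setup step before one full chainhammer run.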
With 2000 transactions that is a considerable difference, but with 20000 or 50000 transactions submitted, i.e. many blocks in that experiment, the difference should become negligible, right? Plus the other clients (quorum, geth) are facing the exact same situation anyway, right?

Suggestions
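A quick back-of-envelope check of that claim, with an assumed (hypothetical) capacity of 200 transactions per block: the worst case is a final block holding a single transaction, and its relative cost shrinks as the total transaction count grows:

```python
def wasted_fraction(total_tx: int, tx_per_block: int) -> float:
    """Share of block capacity lost because the last block is partial."""
    blocks = -(-total_tx // tx_per_block)          # ceiling division
    return (blocks * tx_per_block - total_tx) / (blocks * tx_per_block)

# 2001 vs 50001 transactions: both leave a final block with a single tx,
# but the relative waste differs by more than an order of magnitude.
small = wasted_fraction(2001, 200)
large = wasted_fraction(50001, 200)
print(round(small, 4), round(large, 4))
```

So with many blocks per experiment the partial-last-block effect indeed becomes negligible, supporting the point above.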
Great, thanks. I have linked to it in docs/TODO.md#other-peoples-suggestions so that it won't get forgotten.
Now even though you have found this optimization, I still do not think that you want to change the default behaviour of parity, right? What about this instead:

idea: Profiles
Your 8 CLI switches to configure parity optimally ... are just too many, IMHO. And finding this exact combination among the ~100 CLI parity switches ... is almost impossible for an end consumer who just wants to run a fast parity. Even though geth and quorum can perhaps be optimized further, they already run fast "out of the box", without any such clever CLI switches (and without waiting 185 days, lol). So, my suggestion for you: What about creating "profiles" that combine many different CLI switches? Example: I would get all your 8 switches enabled in one go, if I simply type this:
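A sketch of how such a profile mechanism could work; the --profile switch, the profile name, and the flag values are hypothetical (the flag names are only meant to evoke parity-style switches, not the actual optimal combination):

```python
# Hypothetical profile table: one short name expands into many switches.
PROFILES = {
    "fast-tps": [
        "--tx-queue-size=16384",           # assumed value, for illustration
        "--jsonrpc-server-threads=8",      # assumed value, for illustration
        # ... the remaining switches of the optimal combination
    ],
}

def expand(argv: list) -> list:
    """Replace a hypothetical --profile=NAME switch with its flag list."""
    out = []
    for arg in argv:
        if arg.startswith("--profile="):
            out.extend(PROFILES[arg.split("=", 1)[1]])
        else:
            out.append(arg)
    return out

cmdline = expand(["parity", "--profile=fast-tps", "--chain=dev"])
print(cmdline)
```

The default behaviour stays untouched; only users who opt into a profile get the whole switch bundle at once.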
Then you can leave the default setup of parity as it currently is, but additionally provide a quickstart for people who want to run parity with the fastest possible TPS setup. What do you think about that?
accelerate parity aura
Chainhammer v55 is fully automated now = benchmark parity with two lines of CLI commands !
The parity aura TPS are still not satisfying, and I am optimistic you will find a better combination of CLI switches for parity, to speed it up. Why? In Q2, I am going to publish a comparison paper, and it would be nice to have better results by then, no? Please help now with finding better CLI settings for parity. Thanks.
your questions:
actual
slower than some other clients
expected behavior
comparable TPS
versions
steps to reproduce
Dependencies:
Spinning up a t2.medium machine on AWS, using my newest AMI, is for sure the safer & easier way. Or alternatively
and accept each step of the installation script (complex, not recommended. Use the AMI.).
Run a whole experiment & analyze results
Then just wait (only perhaps watch the logfile: tail -n 10 logs/network.log). If all goes well, you are told when the experiment has ended, and you will then have a summary file in results/runs/ - which includes time series diagrams, and TPS estimates.

Variations
You first want to read the script run.sh to be able to understand which (eight or) 10 steps are executed when running one whole experiment. Then:
parity v2.x.y
... should be a bit faster than v1.11.11.
multi-threading
The above "sequential" is hammering transactions at parity in a simple for loop, non-async. Obviously, that is not the fastest possible way. Unfortunately, parity v2.x.y has an unsolved issue with multi-threaded sending of transactions, but you can try this with v1.11.11, where it always worked:
It uses a queue with 20 concurrent multi-threading workers, and --> should result in higher TPS than the "sequential" approach above.
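The queue-plus-workers pattern - a shared queue drained by 20 worker threads - can be sketched stand-alone like this (send_transaction is a placeholder for the real JSON-RPC call, not chainhammer's actual code):

```python
import queue
import threading

NUM_WORKERS = 20          # the "20" in the "threaded2 20" mode
NUM_TX = 1000             # demo size; the real runs use 50k transactions

q = queue.Queue()
sent = []                 # stands in for transactions accepted by the node
lock = threading.Lock()

def send_transaction(tx_id: int) -> None:
    # Placeholder for the real eth_sendTransaction RPC call.
    with lock:
        sent.append(tx_id)

def worker() -> None:
    while True:
        tx_id = q.get()
        if tx_id is None:          # poison pill: no more work
            q.task_done()
            break
        send_transaction(tx_id)
        q.task_done()

threads = [threading.Thread(target=worker) for _ in range(NUM_WORKERS)]
for t in threads:
    t.start()
for i in range(NUM_TX):
    q.put(i)
for _ in threads:
    q.put(None)                    # one pill per worker
q.join()
for t in threads:
    t.join()
```

With 20 workers, many transactions are in flight at once instead of one RPC round-trip at a time, which is why this mode beats the sequential loop.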
v2.x.y and multi-threading - warning & helper script
When you try to start the latter with v2.2.3 instead of v1.11.11, it might never reach its planned end, because parity very often just stops accepting new transactions, usually after a few thousand TX - the above-mentioned issue.
Then when you are out of patience, and interrupt the experiment manually, you will end up with dangling processes. This script here helps:
Warning: It is rather radical, and e.g. removes all docker containers from that system, so (a) first read the script, and (b) only run it on a disposable virtualbox or cloud machine. Plus it is not 100% complete yet, so keep your eyes open for which other processes might have survived when you manually end the experiment before it fully ran through.
comparison with other clients
Have a look at the new "whole laboratory in one command" scripts
run-all_large.sh
and run-all_small.sh, and the instructions in docs/reproduce.md#how-to-replicate-the-results.

also see these files in chainhammer
README.md#install-and-run
docs/cloud.md#readymade-amazon-ami
networks/parity-start.sh, networks/parity-stop.sh, and networks/parity-clean.sh.
and perhaps there are remaining parity.md issues that can now be solved too, with chainhammer v55?
Hope this helps. Please keep me posted. Thanks.