60 TPS ? (parity aura v1.11.11) #9393
There is definitely a lot of room for improvement :)
There are common options to help here. They include:
And also scaling verification via:
What's the block gas limit and aura block time? Please share config and chain spec.
Fantastic, thanks for all the hints (and the tweet ;-) ). Answers are all in here, probably mainly below here. The author/issue-answerer of that parity-poa-playground seems grumpy, so I am happy that the parity-deploy.sh team is really responsive & helpful. Will try that again instead, next week. Back in the office on Tuesday. Really looking forward to an optimized run. Have a good weekend, everyone.
Ok, I see, this is pretty much your answer ^ You have three authorities with a one-second block time and a gas floor target of 6 billion. This is a very good configuration to test TPS; however, it starts with a lower block gas limit and moves slowly up toward the target. Did you consider running it for an extended period of time (hours, days), or simply modifying the network configuration to start with a very high block gas limit, yet?
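That slow ramp-up can be sketched in a few lines. Assuming Ethereum's usual gas-limit voting rule, where each block may move the limit by at most parent_limit // 1024, here is a rough simulation (the start and target numbers are illustrative, not the actual chain spec values) of how many blocks it takes to reach the floor target:

```python
def blocks_until_target(start_limit, target, max_frac=1024):
    """Simulate the bounded per-block gas-limit ramp. Assumes the
    client raises the limit by at most parent_limit // max_frac per
    block (Ethereum's gas-limit voting bound); check your client's
    exact rule."""
    limit, blocks = start_limit, 0
    while limit < target:
        limit += max(limit // max_frac, 1)
        blocks += 1
    return blocks

# e.g. from an 8M genesis gas limit up to a 6B floor target:
print(blocks_until_target(8_000_000, 6_000_000_000))
```

With one-second Aura blocks, several thousand blocks means the target is only reached after a couple of hours of running, which is why starting the chain spec at a high genesis gasLimit is the quicker route for a benchmark.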
Great. Is that a parity-only setting, or can geth do that too?
I think that is already set high, no?
(*) Actually, back then it was the energywebfoundation "tobalaba" fork of parity. By the way, I think they left some issues unanswered; perhaps any of you guys has ideas. After all, it is parity 1.8.0, right?
don't use the ewf client. parity ethereum now supports chain tobalaba
Great. Did I lose time on benchmarking their outdated client then? Tobalaba was one big hiccup, until they fixed that. Not sure I will get the time now to repeat all the Tobalaba benchmarking. But feel free to do that yourself, chainhammer is not difficult to use. (Then please open a pull request, and I will include that in chainhammer. Thanks.)
Is it completely integrated into parity now, with all its EWF added functionality? Is tobalaba PoA also Aura?
It always has been. They just rebranded the client.
Yes
Good. Still, if you know a better setup than that, I am happy to try that next week.
Not sure about that. (1) (2) But I might not understand all those parameters yet; sorry if so. What can I read about which parameters influence TPS? Most importantly, as my time is limited:
Feel free to (simply run chainhammer yourself, or) send me any other configuration that you think will perform better. Thanks a lot.
... please send me any other configuration (authority.toml & chain.json) that you think will perform better. Thanks.
New tests: I have tried your suggestions, but no acceleration! See https://gitlab.com/electronDLT/chainhammer/blob/master/parity.md#run2 and below. 63 TPS, that is slow. ... New ideas please. Thanks.
New run6 with some more parameters added; see the description of the run in https://gitlab.com/electronDLT/chainhammer/blob/master/parity.md#run6 --> 65 TPS
Just replicating your setup now. But maybe a --gas-floor-target of something more realistic would be a good idea ... e.g. maybe 20m ... Also, moving the stepDuration to about 3 to 4 seconds would be needed in a real-life situation, when not just benchmarking everything on one host.
Great, I am happy. Thanks a lot for your time, @ddorgan @5chdn and @Tbaut. Let's find out how we can get this faster. And? Got it running? Which rates are you able to see? Or: need help?
Thanks. Explanation:
Please tell me about all those parameters which might be able to accelerate the current setup. A warning: this whole benchmarking is admittedly not a "realistic" setup telling you which rates to expect when running a network of nodes distributed across the planet. Internet bandwidth, ping times, etc. will always slow it down; I am looking at it more as an attempt to identify the current upper limit. Any realistic setup will always be slower than what I have measured. However, if you want to, you can just as well create a network with nodes on each continent, and then use chainhammer to benchmark that. So:
Yes. However, I have tested most of the other systems with a fast block rate too; my initial focus was on quorum, and raft consensus has sub-second block times, and quorum IBFT runs without problems with 1-second block times - without the internet, of course. Parity, however, is not even adhering to its own parameter - I suppose this is a target blocktime of 2 seconds, right? - and the run with 4 nodes then results in a 4-8 second blocktime! So aren't we already at the blockspeed that you are suggesting? These are the parameters that I have been adding to the standard https://github.com/paritytech/parity-deploy
because I found them somewhere in an issue about speed. Which of them are irrelevant?
65 TPS. Settings & log of run7: https://gitlab.com/electronDLT/chainhammer/blob/master/parity.md#run7 - results diagrams: https://gitlab.com/electronDLT/chainhammer/blob/master/parity.md#result-run-7 - There is a new README.md --> quickstart now ... so if you have any intuition or knowledge how to accelerate this, please replicate my setup, and then start modifying the parameters of the network of parity nodes, e.g. with parity-deploy - until you get to better TPS rates. Then please tell us how you did it. Thanks.
I found a new upper limit of 69 TPS, but with instantseal and only 1 node.
Around June (I don't exactly recall which parity version), we made some scripts to transfer ether from one account to another. This script pushed transactions to a 5-server setup (across different geographical regions), and we were able to achieve a maximum TPS of 5001.
These results were obtained from not one but multiple blocks. Environment:
Other than this, there were no more changes to the setup. Most likely the low TPS that you are seeing is because of contract transactions. Cause: contract transactions are heavier in terms of the amount of gas used, so a few transactions can completely fill up the block. Maybe increasing the gas limit would allow you to get a higher TPS. Regarding instantSeal: this may not be required when benchmarking any ethereum client, as it is not built to be decentralised (the core idea of blockchain). Even if one person was able to get a higher TPS on instantSeal, this blockchain would not be of much use to others willing to be part of that network.
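The gas argument can be made concrete with back-of-the-envelope math: the TPS ceiling is the block gas limit divided by gas per transaction, divided by the block time. The numbers below are illustrative (21000 gas is the fixed cost of a plain value transfer; the storage-write cost is an assumed round figure, not a measured chainhammer value):

```python
def max_tps(block_gas_limit, gas_per_tx, block_time_s):
    """Upper bound on TPS if every block were packed full of
    identical transactions."""
    return (block_gas_limit // gas_per_tx) / block_time_s

# 8M block gas limit, 5s average block time (illustrative numbers):
transfer = max_tps(8_000_000, 21_000, 5)  # plain value transfer
storage  = max_tps(8_000_000, 45_000, 5)  # contract write, assumed cost
print(transfer, storage)
```

So the same chain that can in theory do 76 transfer TPS tops out at roughly half that for a contract write, purely from gas accounting, before any client-side bottleneck comes into play.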
I don't have time to reproduce this myself right now @drandreaskrueger - but 65 sounds wrong by orders of magnitude. what @AyushyaChitransh reports sounds more realistic.
Your TPS benchmarking sounds impressive. Please (someone) publish the exact commands - perhaps in a dockerized form, so that we can replicate it easily within a few minutes. If it can be replicated, I am happy to include it here in my repo.
Yes to "lower" but no to "low" - because I am putting geth, quorum, etc. through the exact same task. And: simple money transfer is not relevant for our use case. I am always using the same storage transaction, on geth, quorum, parity, etc. - so the TPS values are comparable, right? I have never benchmarked simple money transfer, because that is not what we do. Instead we need to choose the fastest client for smart contract transactions, and that is currently quorum IBFT (with over 400 TPS) or geth Clique (with over 300 TPS), and they were measured in exactly the same way as I measured parity. Please see quorum-IBFT.md and geth.md. It is a pity, because for years we have always preferred parity, and it would mean that we have to revise quite a bit of our inhouse code - but we simply cannot ignore a 6-times-faster TPS. @AyushyaChitransh please repeat your benchmarking with the simplest smart contract transaction - storing one value in a contract (or doing one multiplication and one addition - I still want to do that, but haven't had the time yet, see TODO.md). See the call and the contract.
Done that, been there. Please have a look at the bottom right diagram in each of my measurements; there you can see it. Thanks for all your answers, but please spend some time looking at my stuff first, thanks.
No, it doesn't. I left mine set to 2 seconds, but it almost always ended up being 4-8 seconds.
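The observed 4-8 second spacing is easy to verify from consecutive block timestamps (as returned by eth_getBlockByNumber); a minimal sketch with made-up timestamps:

```python
def avg_block_time(timestamps):
    """Average spacing of consecutive block timestamps, in seconds."""
    gaps = [b - a for a, b in zip(timestamps, timestamps[1:])]
    return sum(gaps) / len(gaps)

# illustrative timestamps from five consecutive blocks:
print(avg_block_time([100, 104, 112, 118, 126]))  # -> 6.5
```

Comparing that average against the configured stepDuration makes the mismatch measurable rather than anecdotal.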
That is a more realistic setup, to include the effects of the internet. For now, I am benchmarking the client itself, and all my 4-7 nodes are running on the same machine, a 2016 desktop. But the CPU, for example, is not maxing out; it stays around 50% during the whole benchmarking.
I know. Of course. But it is the simplest thing I can ask parity to do, and I have still seen less than 70 TPS. Please, anyone, now replicate that setup of run 8, with the new quickstart manual of chainhammer. Or -EDIT- follow the exact prescription below. (@AyushyaChitransh, please publish your experimental setup in a similar way, with the exact command line commands to execute, so that others can replicate your 3k-5k TPS. Thanks.)
I get that, @5chdn. We are all busy. And as you can see in my TODO.md list cited above, I also have some unfinished tasks with this. However, until you or someone else disproves my findings, it already looks as if we might have our (yes, still preliminary) results: for our purposes
Yes, and I am surprised about that myself. The whole intention of all these interactions here is to get anyone who is more knowledgeable about parity than me to help me find the problem - and fix it. If your team is too small, what about employing more people? Or perhaps there is someone else out there who would work for no pay? Please help, thanks.
chainhammer
Actually, today I tried this again - tested on and optimized for a Debian AWS machine (
How to replicate the results
toolchain
log out and log back in, to enable those usergroup changes
new terminal:
chainhammer
new terminal
or:
everything below here is not necessary
new terminal ( * )
geth
( * ) I do not want to install
Please help me with ^ this, thanks. Until that is sorted, I simply install
logout, log back in
Please now try this. And about "not having the time": these 2.5 hours happened on my FREE DAY. I must convince them now that I can take those hours off again.
Thanks for the idea, @gituser. It might explain why I have seen faster rates with the Tobalaba fork (~ similar to parity version 1.8.0). And how? In https://gitlab.com/electronDLT/chainhammer/blob/master/parity.md#run-14 it could be done here:
BUT
they have deleted all older versions. Hello parity team - could you please re-create the docker hub image of the most stable 1.8.x version? Thanks a lot.
weird, from github repository (just cloned freshly):
There are multiple pages in that link you sent! Check - and yes there is no
Oh fantastic, that should make it easy to test. Pagination, sigh, had not expected that.
so, all older versions are still on dockerhub, that is perfect:
thanks! have a good weekend.
I would probably try
Did v1.7 already have aura? Did it have
And how? In https://gitlab.com/electronDLT/chainhammer/blob/master/reproduce.md#parity change
or (because Tobalaba)
or (because stable)
we don't delete anything. Just check out the tag directly, either on github or on docker
yes.
yes. thanks.
hello gituser, Re: your comment above
I had a lot of hope when you said that. But I have tried some older versions now: (run15) v1.7.13 and instantseal, https://gitlab.com/electronDLT/chainhammer/blob/master/parity.md#run-15 - not faster.
would be surprised if the speed varies across versions
yes, me too. But for lack of any other substantial suggestions, and as a test of his comment
I had to try that.
I got in contact with CodyBorn from Microsoft; he answered my tweets. I have summarized what he revealed about his approach here: https://gitlab.com/electronDLT/chainhammer/blob/master/codyborn.md
What is clear already: he is "too far out" to be applicable to my simple benchmarking. I don't feel like creating hundreds of sender accounts just to bypass the nonce lookup, and then signing my own transactions. And even if that made parity faster, for me it would simply mean that paritytech should revisit that part of the parity code, to accelerate it - and I would repeatedly re-run my chainhammer, to notice when you got it faster. Because his approach with
Possibly the most important hint for paritytech: it is probably something outside of sendRawTransaction() but within the parity code base which is slowing down transactions by a factor of more than 500%.
The most important experiment that CodyBorn could do: replicate his exact approach but instead of
EDIT: ... the latter there contains my best result so far: 524 TPS on average when sending 20k tx into 1 node of a quorum-crux-IBFT network of 4 nodes, on an Amazon c5.4xlarge instance. That's it from me, for now - until anyone makes any better suggestions for parity.
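For what it's worth, the nonce-lookup part of CodyBorn's trick can be approximated even with a single sender: cache the nonce locally instead of asking the node (eth_getTransactionCount) before every transaction. This is a hypothetical helper, not chainhammer's actual code, and it assumes no other process submits transactions from the same account:

```python
import itertools

class NonceCache:
    """Hand out nonces locally instead of querying the node before
    every send. Sketch only: assumes a single sender process and no
    competing transactions from the same account elsewhere."""
    def __init__(self, start_nonce):
        # start_nonce would come from one initial
        # eth_getTransactionCount call in a real sender
        self._counter = itertools.count(start_nonce)

    def next(self):
        return next(self._counter)

nc = NonceCache(start_nonce=7)  # 7 stands in for the node's answer
print([nc.next() for _ in range(3)])  # -> [7, 8, 9]
```

If the nonce lookup is really part of the slowdown, this kind of cache should show up directly in the TPS numbers, without needing hundreds of accounts.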
parity v2.2.0 (single-threaded) seems slightly faster than v1.11.11 (multi-threaded); see these brand new results: #9582 (comment)
Hey @5chdn @Tbaut @ddorgan @AyushyaChitransh @gituser - greetings from Berlin, web3summit. Actually, who of you is in Berlin now? We should meet up this week!
So I just did a test on a c5.xlarge but only using the --geth extra option and I'm seeing this:
Still seems quite single-thread heavy though; will try some options to speed that up.
Great results. Looks very promising. We have to find out how to prevent this #9582, to always get to those results, consistently. Please let it run to the 20k end - perhaps even with more than 20k transactions, see config.py. And: keep me in the loop on whatever you find out together with Tomek. Thanks again for your time on Thursday in the parity office. Great working with you! Greetings from the train to Prague.
just found this elsewhere:
is that also how parity is internally digesting multi-threaded transaction requests?
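On the client side, fanning transaction submissions out over threads looks roughly like the sketch below (the network call is stubbed out; whether parity digests such concurrent requests in parallel internally is exactly the open question above):

```python
from concurrent.futures import ThreadPoolExecutor

def send_tx(i):
    """Stub for one JSON-RPC submission. A real sender would POST an
    eth_sendTransaction / eth_sendRawTransaction request here."""
    return i

# submit many transactions concurrently, as a threaded sender would;
# pool.map preserves submission order in its results
with ThreadPoolExecutor(max_workers=8) as pool:
    results = list(pool.map(send_tx, range(100)))
print(len(results))  # -> 100
```

With the stub replaced by a real HTTP POST, comparing max_workers=1 against max_workers=8 would show whether the node actually benefits from concurrent submission.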
@drandreaskrueger let me come back to you on this. The bottleneck is most probably the signing itself. I think for geth you are basically pre-signing them because the web3 library is too slow, right? I may make a change to your script to do the same for parity, so that transactions are signed before being submitted. This would line up with the geth process, right?
Yes, for low (two digit) TPS it does not make a big difference, only 20% faster. But when I get into the hundreds of TPS, I see considerable gains (twice as fast) when bypassing web3 completely. Please have a quick look at these old experiments: https://github.com/drandreaskrueger/chainhammer/blob/master/log.md#sending-via-web3-versus-sending-via-rpc
Not sure we are talking about the same thing, actually. Even when bypassing the web3.py library, I am still using RPC. Have a look at these two code pieces:
via web3: in https://github.com/drandreaskrueger/chainhammer/blob/93c40384a4d178bdb00cea491d15b14046471b72/send.py#L73-L93
while via RPC: in https://github.com/drandreaskrueger/chainhammer/blob/93c40384a4d178bdb00cea491d15b14046471b72/send.py#L106-L183
choice: I switch between those two routes here https://github.com/drandreaskrueger/chainhammer/blob/93c40384a4d178bdb00cea491d15b14046471b72/send.py#L201
choice constant
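For reference, the "via RPC" route ultimately boils down to POSTing a JSON-RPC 2.0 envelope to the node's HTTP endpoint. A minimal sketch of building such an envelope (a simplification of what send.py does; the field values are placeholders, and eth_sendTransaction is the standard method name):

```python
import json

def rpc_payload(method, params, req_id=1):
    """Build a JSON-RPC 2.0 request body, as the raw-RPC route would
    POST it to the node. Sketch only; the real send.py differs."""
    return json.dumps({"jsonrpc": "2.0", "method": method,
                       "params": params, "id": req_id})

# placeholder addresses; 0x5208 is hex for 21000 gas
body = rpc_payload("eth_sendTransaction",
                   [{"from": "0xabc...", "to": "0xdef...",
                     "gas": "0x5208", "data": "0x"}])
print(body)
```

Bypassing web3.py just means building and POSTing this string yourself, skipping the library's per-call formatting and validation overhead.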
No, no difference between the two. As long as
EDIT: nicer formatting now here in FAQ.md, plus I raised an issue with the web3.py guys ...
see above. Plus then:
I suggest you compare the tx signing part of the geth go code with the parity rust code.
chainhammer v55: The newest version is fully automated - you run & analyze a whole experiment with one or two CLI lines. I am optimistic that you will now find a clever combination of parity CLI switches to speed it up. Good luck. Because of general interest, I have created this new issue:
Thanks for sharing 👍
So you want to track possible speed improvements in the new issue. Yes, that makes sense.
I am benchmarking Ethereum based PoA chains, with my toolbox chainhammer.
My initial results for a dockerized network of parity aura v1.11.8 nodes ...
... leave room for improvement :-)
Initial -unoptimized- run:
More details here: https://gitlab.com/electronDLT/chainhammer/blob/master/parity.md#benchmarking
Please help ...
... by suggesting what we could try, to get this faster than 60 TPS.
(Ideally approx 8 times faster, to beat quorum IBFT.)
Thanks a lot!
Andreas