Benchmark variations #24
Default settings + AIO turned ON and OFF, using Citrine:

--path "/plaintext" --arg "-e=epoll" --arg "-t=1" --arg "-s=false" --arg "-r=false" --arg "-c=true" --arg "-a=false"
--path "/plaintext" --arg "-e=epoll" --arg "-t=1" --arg "-s=false" --arg "-r=false" --arg "-c=true" --arg "-a=true"

---

--path "/plaintext" --arg "-e=epoll" --arg "-t=1" --arg "-s=false" --arg "-r=false" --arg "-c=false" --arg "-a=true"

---
Is this pipelined? Using non-pipelined gives the most visibility into transport changes.

---
--path "/plaintext" --arg "-e=epoll" --arg "-t=1" --arg "-s=true" --arg "-r=true" --arg "-c=true" --arg "-a=true"
--path "/plaintext" --arg "-e=epoll" --arg "-t=1" --arg "-s=false" --arg "-r=true" --arg "-c=true" --arg "-a=true"
--path "/plaintext" --arg "-e=epoll" --arg "-t=1" --arg "-s=true" --arg "-r=false" --arg "-c=true" --arg "-a=true"
---

I currently cannot test

---

Where can pipelining be configured? I am using the default settings.

---

Btw, if I increase the thread count the numbers get better:

--path "/plaintext" --arg "-e=epoll" --arg "-t=12" --arg "-s=false" --arg "-r=false" --arg "-c=true" --arg "-a=false"

---

By default

With thread count 1, the thread is the bottleneck. Adding more threads gives a higher request rate.

---
JSON:

--path "/json" --arg "-e=epoll" --arg "-t=1" --arg "-s=false" --arg "-r=false" --arg "-c=true" --arg "-a=false"
--path "/json" --arg "-e=epoll" --arg "-t=1" --arg "-s=false" --arg "-r=false" --arg "-c=true" --arg "-a=true"
Which makes me think that when I am using

@sebastienros how can I force the client to use pipelining when using a custom:

--repository https://github.com/tmds/Tmds.LinuxAsync.git --project-file "test/web/web.csproj" --path "/plaintext"

---
@adamsitnik can you run additional scenarios for JSON with AIO off and on, where we vary ThreadCount and ConnectionCount?

---
--path "/json" --arg "-e=epoll" --arg "-t=1" --arg "-s=false" --arg "-r=false" --arg "-c=true" --arg "-a=true" --connections 256
--path "/json" --arg "-e=epoll" --arg "-t=1" --arg "-s=false" --arg "-r=false" --arg "-c=true" --arg "-a=true" --connections 16384
--path "/json" --arg "-e=epoll" --arg "-t=14" --arg "-s=false" --arg "-r=false" --arg "-c=true" --arg "-a=true" --connections 256
--path "/json" --arg "-e=epoll" --arg "-t=14" --arg "-s=false" --arg "-r=false" --arg "-c=true" --arg "-a=true" --connections 16384
--path "/json" --arg "-e=epoll" --arg "-t=28" --arg "-s=false" --arg "-r=false" --arg "-c=true" --arg "-a=true" --connections 256
--path "/json" --arg "-e=epoll" --arg "-t=28" --arg "-s=false" --arg "-r=false" --arg "-c=true" --arg "-a=true" --connections 16384
---
Thanks, Adam. Can you also run them without AIO?

---
perf has 12 cores and citrine 28, if you need to know how many threads to use. This can also be done from the command line. Changing the branch and repos should not have any impact on these settings. Two weeks ago we ran io_uring benchmarks successfully on the "perf" machines. 16K connections is very high; is it reasonable? TE doesn't go anywhere near that number.

---
Yes, thanks!

--arg "-e=epoll" --arg "-t=1" --arg "-s=false" --arg "-r=false" --arg "-c=true" --arg "-a=false" --connections 256
--path "/json" --arg "-e=epoll" --arg "-t=1" --arg "-s=false" --arg "-r=false" --arg "-c=true" --arg "-a=false" --connections 16384
--path "/json" --arg "-e=epoll" --arg "-t=14" --arg "-s=false" --arg "-r=false" --arg "-c=true" --arg "-a=false" --connections 256
--path "/json" --arg "-e=epoll" --arg "-t=14" --arg "-s=false" --arg "-r=false" --arg "-c=true" --arg "-a=false" --connections 16384
--path "/json" --arg "-e=epoll" --arg "-t=28" --arg "-s=false" --arg "-r=false" --arg "-c=true" --arg "-a=false" --connections 256
--path "/json" --arg "-e=epoll" --arg "-t=28" --arg "-s=false" --arg "-r=false" --arg "-c=true" --arg "-a=false" --connections 16384
---
Thanks! Now I am going to try the plaintext with -p ScriptName=pipeline -p PipelineDepth=16.

---
Pipelined: AIO on and off with a single thread:

--path "/plaintext" --arg "-e=epoll" --arg "-t=1" --arg "-s=false" --arg "-r=false" --arg "-c=true" --arg "-a=false" --connections 256 -p ScriptName=pipeline -p PipelineDepth=16
--path "/plaintext" --arg "-e=epoll" --arg "-t=1" --arg "-s=false" --arg "-r=false" --arg "-c=true" --arg "-a=true" --connections 256 -p ScriptName=pipeline -p PipelineDepth=16
There is almost no difference, which is very surprising to me. We should probably add an option to run the "platform" benchmark and use the bare minimum, which would reduce the ASP.NET noise and allow us to reach maximum throughput.
---

@sebastienros thanks! I've tried to run it on the

---

@tmds Is

---
Putting the results in a table:
AIO batching relies on epoll giving it multiple events for different sockets, so there need to be enough sockets on the same epoll for this to happen. The CC vs TC for these benchmarks is:
We see two effects in the benchmarks:
With pipelining, only 1 in 16 requests actually needs to read from/write to the socket. So 3996922 HTTP requests is roughly 249807 network requests. This number is far lower than the number of requests we're making in the JSON benchmark, so there is also a lower chance of batching.
---

@adamsitnik

---
As a reference benchmark, we can use JSON with 256 connections and 1 thread (as long as that gives us
Yes, 5.5 has the interesting feature set; I'd use it as the minimal version. Especially

---
@adamsitnik can we run benchmarks where we check the effect of DeferSends/DeferReceives with AIO, while keeping dispatch continuations at the default (dispatching to the threadpool)?

---
Here you go:

--path "/json" --arg "-e=epoll" --arg "-t=1" --arg "-s=false" --arg "-r=false" --arg "-c=true" --arg "-a=true" --connections 256
--path "/json" --arg "-e=epoll" --arg "-t=1" --arg "-s=true" --arg "-r=false" --arg "-c=true" --arg "-a=true" --connections 256
--path "/json" --arg "-e=epoll" --arg "-t=1" --arg "-s=false" --arg "-r=true" --arg "-c=true" --arg "-a=true" --connections 256
--path "/json" --arg "-e=epoll" --arg "-t=1" --arg "-s=true" --arg "-r=true" --arg "-c=true" --arg "-a=true" --connections 256
---
Thanks, Adam. The single epoll thread is a bottleneck (

---
Sure! BTW, I was unable to get 100% CPU; 99% was the max.

--path "/json" --arg "-e=epoll" --arg "-t=12" --arg "-s=false" --arg "-r=false" --arg "-c=true" --arg "-a=true" --connections 256
--path "/json" --arg "-e=epoll" --arg "-t=12" --arg "-s=true" --arg "-r=false" --arg "-c=true" --arg "-a=true" --connections 256
--path "/json" --arg "-e=epoll" --arg "-t=12" --arg "-s=false" --arg "-r=true" --arg "-c=true" --arg "-a=true" --connections 256
--path "/json" --arg "-e=epoll" --arg "-t=12" --arg "-s=true" --arg "-r=true" --arg "-c=true" --arg "-a=true" --connections 256
---
99% is fine too. The previous benchmarks had values as low as 39%.
How much do you get with a value of t=2/t=3?

---
> How much do you get with a value of t=2/t=3?

--path "/json" --arg "-e=epoll" --arg "-t=2" --arg "-s=true" --arg "-r=true" --arg "-c=true" --arg "-a=true" --connections 256
--path "/json" --arg "-e=epoll" --arg "-t=3" --arg "-s=true" --arg "-r=true" --arg "-c=true" --arg "-a=true" --connections 256
---
Thanks!
Those numbers are far lower still. I assume increasing the thread count further won't help, because that will reduce the number of connections per socket per thread even more. The TE JSON benchmark's max connections is 512.

---
@adamsitnik I wonder how much performance differs if we batch receives by a. running the receive loop on the epoll thread using

Can you run the benchmarks, and also share the perftrace files? (

---
--path "/json" --arg "-e=epoll" --arg "-t=1" --arg "-s=false" --arg "-r=false" --arg "-c=true" --arg "-a=true" --arg "-w=false" --connections 256
--path "/json" --arg "-e=epoll" --arg "-t=4" --arg "-s=false" --arg "-r=false" --arg "-c=false" --arg "-a=true" --arg "-w=false" --connections 256
--path "/json" --arg "-e=epoll" --arg "-t=4" --arg "-s=false" --arg "-r=true" --arg "-c=true" --arg "-a=true" --arg "-w=false" --connections 256
Do you have any preference for file shares? I have none. Would gdrive be OK?

---
Thank you! gdrive works for me.

---
@adamsitnik I'd like to know the impact of completing on the threadpool (

---
@tmds here you go!

--path "/json" --arg "-e=epoll" --arg "-t=1" --arg "-s=false" --arg "-r=false" --arg "-c=false" --arg "-a=true" --arg "-w=false" --connections 256
--path "/json" --arg "-e=epoll" --arg "-t=2" --arg "-s=false" --arg "-r=false" --arg "-c=false" --arg "-a=true" --arg "-w=false" --connections 256
--path "/json" --arg "-e=epoll" --arg "-t=3" --arg "-s=false" --arg "-r=false" --arg "-c=false" --arg "-a=true" --arg "-w=false" --connections 256
--path "/json" --arg "-e=epoll" --arg "-t=4" --arg "-s=false" --arg "-r=false" --arg "-c=false" --arg "-a=true" --arg "-w=false" --connections 256
--path "/json" --arg "-e=epoll" --arg "-t=1" --arg "-s=false" --arg "-r=false" --arg "-c=true" --arg "-a=true" --arg "-w=false" --connections 256
--path "/json" --arg "-e=epoll" --arg "-t=2" --arg "-s=false" --arg "-r=false" --arg "-c=true" --arg "-a=true" --arg "-w=false" --connections 256
--path "/json" --arg "-e=epoll" --arg "-t=3" --arg "-s=false" --arg "-r=false" --arg "-c=true" --arg "-a=true" --arg "-w=false" --connections 256
--path "/json" --arg "-e=epoll" --arg "-t=4" --arg "-s=false" --arg "-r=false" --arg "-c=true" --arg "-a=true" --arg "-w=false" --connections 256
---
Thank you!
---
--path "/json" --arg "-e=epoll" --arg "-t=1" --arg "-s=false" --arg "-r=false" --arg "-c=false" --arg "-a=false" --arg "-w=false" --connections 256
--path "/json" --arg "-e=epoll" --arg "-t=2" --arg "-s=false" --arg "-r=false" --arg "-c=false" --arg "-a=false" --arg "-w=false" --connections 256
--path "/json" --arg "-e=epoll" --arg "-t=3" --arg "-s=false" --arg "-r=false" --arg "-c=false" --arg "-a=false" --arg "-w=false" --connections 256
--path "/json" --arg "-e=epoll" --arg "-t=4" --arg "-s=false" --arg "-r=false" --arg "-c=false" --arg "-a=false" --arg "-w=false" --connections 256
---
@adamsitnik Based on what we've benchmarked so far, I propose we look at these benchmarks and figure out from traces whether there are some things we can optimize further.
I set

---
--path "/json" --arg "-e=epoll" --arg "-t=1" --arg "-s=false" --arg "-r=false" --arg "-c=true" --arg "-a=true" --arg "-w=false" --connections 256
--path "/json" --arg "-e=epoll" --arg "-t=2" --arg "-s=false" --arg "-r=false" --arg "-c=true" --arg "-a=true" --arg "-w=false" --connections 256
--path "/json" --arg "-e=epoll" --arg "-t=3" --arg "-s=false" --arg "-r=false" --arg "-c=true" --arg "-a=true" --arg "-w=false" --connections 256
--path "/json" --arg "-e=epoll" --arg "-t=4" --arg "-s=false" --arg "-r=false" --arg "-c=true" --arg "-a=true" --arg "-w=false" --connections 256
--path "/json" --arg "-e=epoll" --arg "-t=5" --arg "-s=false" --arg "-r=false" --arg "-c=true" --arg "-a=true" --arg "-w=false" --connections 256
--path "/json" --arg "-e=epoll" --arg "-t=6" --arg "-s=false" --arg "-r=false" --arg "-c=true" --arg "-a=true" --arg "-w=false" --connections 256
--path "/json" --arg "-e=epoll" --arg "-t=1" --arg "-s=false" --arg "-r=false" --arg "-c=false" --arg "-a=true" --arg "-w=false" --connections 256
--path "/json" --arg "-e=epoll" --arg "-t=2" --arg "-s=false" --arg "-r=false" --arg "-c=false" --arg "-a=true" --arg "-w=false" --connections 256
--path "/json" --arg "-e=epoll" --arg "-t=3" --arg "-s=false" --arg "-r=false" --arg "-c=false" --arg "-a=true" --arg "-w=false" --connections 256
--path "/json" --arg "-e=epoll" --arg "-t=4" --arg "-s=false" --arg "-r=false" --arg "-c=false" --arg "-a=true" --arg "-w=false" --connections 256
--path "/json" --arg "-e=epoll" --arg "-t=5" --arg "-s=false" --arg "-r=false" --arg "-c=false" --arg "-a=true" --arg "-w=false" --connections 256
--path "/json" --arg "-e=epoll" --arg "-t=6" --arg "-s=false" --arg "-r=false" --arg "-c=false" --arg "-a=true" --arg "-w=false" --connections 256
---
@tmds I've uploaded the traces to a subfolder called "19_03_2020".

---
Thank you! Let's make a graph. With AIO, continue on threadpool (

So when we batch and defer to the threadpool after receive, we see the best performance at t=1. Adam, can you also run these benchmarks without AIO?
---
--path "/json" --arg "-e=epoll" --arg "-t=1" --arg "-s=false" --arg "-r=false" --arg "-c=true" --arg "-a=false" --arg "-w=false" --connections 256
--path "/json" --arg "-e=epoll" --arg "-t=2" --arg "-s=false" --arg "-r=false" --arg "-c=true" --arg "-a=false" --arg "-w=false" --connections 256
--path "/json" --arg "-e=epoll" --arg "-t=3" --arg "-s=false" --arg "-r=false" --arg "-c=true" --arg "-a=false" --arg "-w=false" --connections 256
--path "/json" --arg "-e=epoll" --arg "-t=4" --arg "-s=false" --arg "-r=false" --arg "-c=true" --arg "-a=false" --arg "-w=false" --connections 256
--path "/json" --arg "-e=epoll" --arg "-t=5" --arg "-s=false" --arg "-r=false" --arg "-c=true" --arg "-a=false" --arg "-w=false" --connections 256
--path "/json" --arg "-e=epoll" --arg "-t=6" --arg "-s=false" --arg "-r=false" --arg "-c=true" --arg "-a=false" --arg "-w=false" --connections 256
--path "/json" --arg "-e=epoll" --arg "-t=1" --arg "-s=false" --arg "-r=false" --arg "-c=false" --arg "-a=false" --arg "-w=false" --connections 256
--path "/json" --arg "-e=epoll" --arg "-t=2" --arg "-s=false" --arg "-r=false" --arg "-c=false" --arg "-a=false" --arg "-w=false" --connections 256
--path "/json" --arg "-e=epoll" --arg "-t=3" --arg "-s=false" --arg "-r=false" --arg "-c=false" --arg "-a=false" --arg "-w=false" --connections 256
--path "/json" --arg "-e=epoll" --arg "-t=4" --arg "-s=false" --arg "-r=false" --arg "-c=false" --arg "-a=false" --arg "-w=false" --connections 256
--path "/json" --arg "-e=epoll" --arg "-t=5" --arg "-s=false" --arg "-r=false" --arg "-c=false" --arg "-a=false" --arg "-w=false" --connections 256
--path "/json" --arg "-e=epoll" --arg "-t=6" --arg "-s=false" --arg "-r=false" --arg "-c=false" --arg "-a=false" --arg "-w=false" --connections 256
---
All in one graph:

The green line is what is most similar to corefx Socket*. As you can see, when continuing on the epoll thread, there need to be enough threads.

*: corefx Socket does the socket operation on the threadpool, while we continue on the threadpool after the socket operation was performed on the epoll thread.

cc @antonfirsov

---
Closing. New benchmark round coming up: #78.

---
Default values for benchmarks:
- ThreadCount: 1
- DeferReceives/DeferSends: false
- DispatchContinuations: true

Ordered list:
cc @adamsitnik