libvma performance issues with haproxy #877

Open
scarlet-storm opened this issue Jan 30, 2020 · 1 comment
scarlet-storm commented Jan 30, 2020

I have observed bandwidth and latency improvements with libvma for small TCP message sizes in benchmark applications like sockperf and iperf. Hence, I am trying to evaluate the performance improvement libvma gives haproxy, to analyse the use of libvma in layer 7 load balancing. My setup is two machines running nginx servers, with haproxy configured in http mode doing round-robin load balancing. I am using wrk from another machine on the network as a load generator to benchmark the haproxy setup. Without libvma, wrk reports the following for the given test:

wrk --latency http://10.48.114.100:8025 -t 10 -c 12 -d 30
Running 30s test @ http://10.48.114.100:8025
  10 threads and 12 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency   339.48us  284.01us  12.93ms   97.42%
    Req/Sec     3.09k   217.10     4.18k    71.79%
  Latency Distribution
     50%  303.00us
     75%  357.00us
     90%  436.00us
     99%  789.00us
  925118 requests in 30.10s, 740.22MB read
Requests/sec:  30735.44
Transfer/sec:     24.59MB

Running with libvma:

LD_PRELOAD=libvma.so haproxy -- /etc/haproxy/haproxy.cfg
wrk --latency http://10.48.114.100:8025 -t 10 -c 12 -d 30
Running 30s test @ http://10.48.114.100:8025
  10 threads and 12 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency    20.81ms  116.18ms   1.03s    96.55%
    Req/Sec     2.71k   742.46     7.98k    85.83%
  Latency Distribution
     50%  215.00us
     75%  593.00us
     90%    1.02ms
     99%  767.08ms
  771434 requests in 30.10s, 617.25MB read
Requests/sec:  25629.47
Transfer/sec:     20.51MB

Also running with VMA_SPEC=latency:

VMA_SPEC=latency LD_PRELOAD=libvma.so haproxy -- /etc/haproxy/haproxy.cfg
wrk --latency http://10.48.114.100:8025 -t 10 -c 12 -d 30
Running 30s test @ http://10.48.114.100:8025
  10 threads and 12 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency     2.26ms    3.64ms  75.16ms   84.54%
    Req/Sec     1.07k   380.69     3.46k    77.74%
  Latency Distribution
     50%  329.00us
     75%    3.30ms
     90%    7.67ms
     99%   12.13ms
  291558 requests in 30.03s, 233.29MB read
  Socket errors: connect 0, read 0, write 0, timeout 1
Requests/sec:   9708.40
Transfer/sec:      7.77MB

Both average latency and total bandwidth are worse with libvma. I have tried following the tuning guide by binding the process to the same NUMA node as the NIC and pinning it to cores, but the results are still worse. This behaviour is strange, as vma_stats shows all packets as offloaded.
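For reference, this is roughly how I pinned the process and checked offload; the NUMA node, core list and vma_stats invocation below are just examples for my machine, not a recommendation:

# pin haproxy to NUMA node 0 and cores 2-5 (values are machine-specific)
numactl --cpunodebind=0 --membind=0 taskset -c 2-5 \
    env LD_PRELOAD=libvma.so haproxy -- /etc/haproxy/haproxy.cfg

# inspect per-socket offload counters for the running process
vma_stats -p $(pidof haproxy)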

Are there any tips for tuning libvma parameters to increase performance for this particular workload?
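For what it is worth, this is the kind of thing I have been experimenting with on top of VMA_SPEC=latency; the variables are from the VMA parameters list, but the values are guesses on my part rather than a verified configuration:

# busy-poll the RX and epoll paths and pin VMA's internal thread to core 0
# (values are guesses for this workload, not verified)
VMA_RX_POLL=-1 VMA_SELECT_POLL=-1 VMA_INTERNAL_THREAD_AFFINITY=0 \
LD_PRELOAD=libvma.so haproxy -- /etc/haproxy/haproxy.cfg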

The haproxy config file, for reference:

global
    user root
    group root
    daemon

    # Default SSL material locations
    ca-base /etc/ssl/certs
    crt-base /etc/ssl/private

    # Default ciphers to use on SSL-enabled listening sockets.
    # For more information, see ciphers(1SSL). This list is from:
    #  https://hynek.me/articles/hardening-your-web-servers-ssl-ciphers/
    # An alternative list with additional directives can be obtained from
    #  https://mozilla.github.io/server-side-tls/ssl-config-generator/?server=haproxy
    ssl-default-bind-ciphers ECDH+AESGCM:DH+AESGCM:ECDH+AES256:DH+AES256:ECDH+AES128:DH+AES:RSA+AESGCM:RSA+AES:!aNULL:!MD5:!DSS
    ssl-default-bind-options no-sslv3
    nosplice
    # TUNING
    #tune.h2.initial-window-size 1048576

defaults
    timeout connect 50000
    timeout client  500000
    timeout server  500000
    errorfile 400 /etc/haproxy/errors/400.http
    errorfile 403 /etc/haproxy/errors/403.http
    errorfile 408 /etc/haproxy/errors/408.http
    errorfile 500 /etc/haproxy/errors/500.http
    errorfile 502 /etc/haproxy/errors/502.http
    errorfile 503 /etc/haproxy/errors/503.http
    errorfile 504 /etc/haproxy/errors/504.http

    http-reuse safe

# My Configuration
frontend fe
    mode http
    bind *:8025
    default_backend be

backend be
    mode http
    balance roundrobin
    #option http-keep-alive
    server s0 10.48.34.122:80
    server s2 10.48.34.125:80

Config:
VMA_VERSION: 8.9.5-0
OFED Version: MLNX_OFED_LINUX-4.7-3.2.9.0
System: 4.9.0-9-amd64
Architecture: x86_64
NIC: ConnectX-5 EN network interface card

igor-ivanov (Collaborator) commented May 12, 2020

Hello @LeaflessMelospiza,
Networking benchmarks cannot always reproduce real-world application behaviour. haproxy has its own specifics, and it has not been studied well enough for us to have a recommended optimal VMA configuration.
You can try compiling VMA with the --enable-tso configuration option.
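For example, with the standard autotools build of libvma (adjust the source path and install prefix for your environment):

git clone https://github.com/Mellanox/libvma.git
cd libvma
./autogen.sh
./configure --enable-tso
make
sudo make install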
