Note: this may be related to #9786 or #8836; I am not sure.
System information
Type | Version/Name
--- | ---
Distribution Name | Ubuntu
Distribution Version | 20.04 focal (development)
Linux Kernel | 5.4.0-14-generic #17-Ubuntu
Architecture | amd64
ZFS Version | 0.8.3-1ubuntu3
SPL Version | 0.8.3-1ubuntu3
System CPU | AMD EPYC 7401P 24-Core Processor
System Cores | 24
System Threads | 48
System Memory | 256G
AES-NI | Yes
Describe the problem you're observing
Six-disk raidz1 of NVMe Enterprise 960GB SSDs. Each SSD can do about 700MB/s with a single thread or 3000MB/s with 16 threads (which is itself weird to me, but this is my first time using NVMe on Linux). A single-disk example run is included in the full test output linked below.
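For reference, the single-disk baseline looked roughly like the following (the device name here is an assumption; the exact invocation is in the testing script gist linked below):

```sh
# Hypothetical single-disk baseline run; /dev/nvme0n1 is an assumed name,
# and writing to the raw device is destructive, so this happens before
# the pool is created.
fio --name=single-disk-baseline --filename=/dev/nvme0n1 \
    --ioengine=libaio --iodepth=64 --rw=rw --bs=64k \
    --numjobs=1 --runtime=60
```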
In the raidz1 pool (using the whole disks) I created child datasets for each configuration, as sketched below, and then tested tank/none (baseline), tank/comp (compression), tank/enc (encryption), and tank/both (compression and encryption). I repeated each test with both 1 process and 16 processes, then reformatted the NVMe drives and repeated the whole run with ashift=12, 13, 14, 15, and 16.
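Roughly like this (device names, the compression algorithm, and the key format are assumptions; the testing script gist below has the exact commands):

```sh
# Assumed device names and property values; see the testing script gist
# for the commands actually used.
zpool create -o ashift=12 tank raidz1 \
    /dev/nvme0n1 /dev/nvme1n1 /dev/nvme2n1 \
    /dev/nvme3n1 /dev/nvme4n1 /dev/nvme5n1
zfs create tank/none
zfs create -o compression=lz4 tank/comp
zfs create -o encryption=on -o keyformat=passphrase tank/enc
zfs create -o compression=lz4 -o encryption=on -o keyformat=passphrase tank/both
```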
Test settings: iodepth=64, ioengine=libaio, rw=rw, numjobs={1,16}, bs=64k, runtime=60.
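Per dataset, that works out to an fio invocation along these lines (the job name, target directory, and file size are assumptions; numjobs is switched between 1 and 16):

```sh
# Hypothetical per-dataset run matching the settings above; --size is assumed.
fio --name=zfs-rw-test --directory=/tank/none --size=8g \
    --ioengine=libaio --iodepth=64 --rw=rw --bs=64k \
    --numjobs=16 --runtime=60 --group_reporting
```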
The full result tables are in the gist linked below; the headline numbers: 1099MB/s using just compression across a six-disk NVMe raidz1 on a 24-core machine, when a single disk can do 3000MB/s, and 35MB/s (!) using encryption. I am also confused as to why a single process can't seem to max out even single-disk bandwidth when there is CPU to spare.
Also confusing to me is why a six-disk raidz1 of 3000MB/s disks seems to top out at 2700MB/s.
Result summaries for each of ashift=12 through ashift=16, plus the complete log output (the numbers above are just a summary), can be found at:
https://gist.github.com/838611b86ac54a138f68f59ba44f10fb
Describe how to reproduce the problem
Precise steps to reproduce are located in this testing script:
https://gist.github.com/sneak/3e14a2a3fa04fdc73ca51c72a105a884
And the full test output:
https://gist.github.com/838611b86ac54a138f68f59ba44f10fb
Include any warning/errors/backtraces from the system logs
Nothing anomalous except the following:
[ 5364.173793] perf: interrupt took too long (2501 > 2500), lowering kernel.perf_event_max_sample_rate to 79750
[ 8798.394144] perf: interrupt took too long (3128 > 3126), lowering kernel.perf_event_max_sample_rate to 63750
[14382.183488] perf: interrupt took too long (3945 > 3910), lowering kernel.perf_event_max_sample_rate to 50500
Request
Further troubleshooting steps are appreciated. Should I be building/testing against master? I need to return this box to the colo if I can't get it in shape relatively soon.
I am happy to run any additional tests or experiments that anyone can think of.
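If master is the way to go, I assume the usual from-source flow applies; a rough sketch on Ubuntu (build dependencies omitted here; the OpenZFS build documentation is authoritative):

```sh
# Rough sketch, not authoritative: build and load OpenZFS from master.
git clone https://github.com/openzfs/zfs.git
cd zfs
sh autogen.sh
./configure
make -s -j$(nproc)
sudo make install
sudo depmod -a
sudo modprobe zfs
```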