Increase default volblocksize from 8KB to 16KB. #12406
Conversation
Increasing this default makes good sense to me. Let's just make sure to review the logic in the default_volblocksize() function and confirm it still works as intended. That code was added as part of the dRAID changes to ensure a reasonable default volblocksize is used. It looks like it should be fine, but I didn't do any actual manual testing.
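Since no manual testing was done, a quick spot-check of the new default might look like the following (a minimal sketch; the pool name tank and volume name blocktest are hypothetical, and this assumes a host with the patched zfs utilities installed):

```sh
# Create a small test zvol without specifying volblocksize,
# so the default chosen by default_volblocksize() applies.
zfs create -V 1G tank/blocktest

# Expect 16K after this change (previously 8K).
zfs get -H -o value volblocksize tank/blocktest

# Clean up the test volume.
zfs destroy tank/blocktest
```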
@behlendorf I don't see how this change can make
It makes sense to me that most zvols should use volblocksize=16K or more. As you probably know, we originally defaulted to volblocksize=8K, compression=off, and refreservation=~volsize to minimize surprises when using a zvol as a replacement for existing volume managers. volblocksize=8K matched the page size of the hardware (SPARC) and the default block size of the predominant filesystem (Solaris UFS) that might be used on zvols. The most important use cases for zvols have changed a lot since then. My understanding is that they are now used primarily either for iSCSI (or FC?) targets or for (local) VM disk images, and the space is thin-provisioned. For this use case you probably want a sparse/thin-provisioned volume, with compression. My only concern about this change is the potentially surprising impact of changing the default. I think that changing it as proposed here is OK, but maybe we could do even better by leaving the existing default in place and instead steering users toward larger volblocksize where appropriate.
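For that modern iSCSI/VM use case, creation would look something like this (a sketch; the pool name tank and volume name vmdisk0 are made up, and lz4 is just one reasonable compression choice):

```sh
# Sparse (thin-provisioned) zvol with compression, as described above.
# -s skips the refreservation, so space is consumed only as data is written.
zfs create -s -V 100G \
    -o volblocksize=16K \
    -o compression=lz4 \
    tank/vmdisk0
```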
@ahrens keeping the non-modern variant as the default has a major drawback: you need to know more than you should have to.
If someone really wants a small volblocksize, they should just set it. IMHO nowadays 8K doesn't have any pros over 16K, even on NVMe. The best we can do for safety is to run bare benchmarks comparing the 8K and 16K volblocksizes.
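A bare comparison along those lines might look like this (a sketch, not a rigorous methodology; the pool and volume names are hypothetical, and the fio parameters would need tuning for the hardware under test):

```sh
# Two otherwise-identical zvols differing only in volblocksize.
zfs create -V 20G -o volblocksize=8K  tank/bench8k
zfs create -V 20G -o volblocksize=16K tank/bench16k

# Random-write test against each; repeat with other I/O patterns as needed.
# (libaio is Linux-specific; use --ioengine=posixaio on FreeBSD.)
for dev in /dev/zvol/tank/bench8k /dev/zvol/tank/bench16k; do
    fio --name=randwrite --filename="$dev" --direct=1 \
        --ioengine=libaio --rw=randwrite --bs=16k --iodepth=32 \
        --numjobs=4 --runtime=60 --time_based --group_reporting
done
```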
@ahrens I agree with @gmelikov on all points that it would be overkill. Sure, we have 8KB mentioned in many materials published over the years, but I don't think that is a good enough reason to go that complicated and publish more materials. On top of that, I am not aware of any software in the mentioned iSCSI/FC/VM realm of TrueNAS that strongly depends on an 8KB "physical sector size" report and would not allow 16KB. Solaris on SPARC was pretty unique in its page size. There are some that prefer 4KB (MS SQL, partially VMware), from which we have to hide the truth to make them live happily, but nothing I know of prefers 8KB as more than an "optimization".
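And for those workloads that do prefer a specific block size, a per-volume override stays a one-liner regardless of the default (names hypothetical; 4K is used here only because it is what those workloads prefer):

```sh
# Explicit small volblocksize for a block-size-sensitive workload,
# e.g. an MS SQL data disk exported over iSCSI.
zfs create -V 500G -o volblocksize=4K tank/mssql-data
```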
Many things have changed since the previous default was set many years
ago. Nowadays 8KB does not allow adequate compression or even decent
space efficiency on many pools due to 4KB disk physical block rounding,
especially on RAIDZ and dRAID. It effectively limits write throughput
to only 2-3GB/s (250-350K blocks/s) due to sync thread, allocation,
vdev queue, and other block-rate bottlenecks. It keeps L2ARC expensive
despite many optimizations, and makes dedup just unrealistic.
In FreeNAS/TrueNAS we have for years defaulted to at least 16KB
volblocksize for mirror pools and even bigger (32-64KB) for RAIDZ, and
so far we have found very few scenarios (outside synthetic benchmarks)
in which smaller blocks show significant benefits.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Closes openzfs#12406
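To make the space-efficiency claim concrete, here is a back-of-the-envelope illustration (not measured data), assuming a RAIDZ2 vdev of 4KB-sector (ashift=12) disks and the usual rule that RAIDZ allocations are padded up to a multiple of parity+1 sectors: an 8KB block takes 2 data sectors + 2 parity sectors = 4 sectors, padded to 6, so 24KB is allocated for 8KB of data (33% efficiency), while a 16KB block takes 4 data + 2 parity = 6 sectors with no padding, so 24KB holds 16KB (67% efficiency). The throughput figure follows from similar arithmetic: at roughly 300K blocks/s, 8KB blocks move about 2.4GB/s through the same block-rate bottlenecks, whereas 16KB blocks would move about 4.8GB/s.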
This was discussed at today's OpenZFS meeting and got no objections.