-
Notifications
You must be signed in to change notification settings - Fork 1.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
lz4fast compression #5927
lz4fast compression #5927
Conversation
@n1kl, thanks for your PR! By analyzing the history of the files in this pull request, we identified @behlendorf, @edillmann and @mkjorling to be potential reviewers. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would suggest using separate enums for the blkptr_t
's compression field, and the property value's compression field, as is done in #3908
.TE | ||
|
||
\fBlz4fast\fR adds an acceleration n factor to \fBlz4\fR, | ||
reducung the time to compress by about 1.03^n. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
typo: reducing
{ "lz4fast-70", ZIO_COMPRESS_LZ4FAST_70 }, | ||
{ "lz4fast-80", ZIO_COMPRESS_LZ4FAST_80 }, | ||
{ "lz4fast-90", ZIO_COMPRESS_LZ4FAST_90 }, | ||
{ "lz4fast-100", ZIO_COMPRESS_LZ4FAST_100 }, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this mean that not all values of N
between 1 and 100 are supported for compression=lz4fast-N
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After N=20 I have only supported values in steps of 10 because of the low impact on compression ratio. I can extend the table when required. Keeping values up to 100 will give more flexibility when tuning #define COMPRESSIONLEVEL 12
for CPUs with big L1 cache.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I only looked at dsl_dataset code. It looks good with a couple of nits.
dsl_dataset_actv_lz4fast_compress_sync, (void *)ddname, 0, | ||
ZFS_SPACE_CHECK_RESERVED); | ||
|
||
if (error == EALREADY) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it possible to hit this condition? It seems like this might unnecessary as I don't see dsl_dataset_actv_lz4fast_compress_check() returning EALREADY.
char *ddname = (char *)arg; | ||
dsl_pool_t *dp = dmu_tx_pool(tx); | ||
dsl_dataset_t *ds; | ||
spa_feature_t f = SPA_FEATURE_LZ4FAST_COMPRESS; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit. It might be better to use SPA_FEATURE_LZ4FAST_COMPRESS instead of introducing the variable f. It's helpful for tools like grok that might show a line. It's also consistent with the use of SPA_FEATAURE_* in this file.
@allanjude may be interested in this |
@n1kl will You remove conflicts and make suggested corrections? We would be happy to see lz4fast in ZFS. |
I noticed that this PR hasn’t been updated in some time. We would like to see this feature added, but it seems that it isn’t quite complete, so we’re going to close it for now. When you have time to revisit this work, feel free to reopen this PR or open a new one. Thanks for your contribution to OpenZFS and please continue to open PR’s to address issues that you encounter! |
As part of my master thesis at University of Hamburg I have targeted to improve ZFS through compression. Now I would like to share my 3 feature branches with the community.
This patch updates the lz4 *1 code to version 1.7.3 to make use of lz4 fast compression.
The lz4 code is based on a seperate project for updating lz4 inside the linux kernel.
There a few changes were made for an clean implementation and to improve speed that are currently in review *2.
*1: https://github.com/lz4/lz4
*2: https://patchwork.kernel.org/patch/9574745/
Description
LZ4-fast capability is now available.
zfs set compression=lz4fast-[1-20,30,+10*n,100]
Higher values result in improved compression speed and less ratio.
Motivation and Context
Lz4 fast trades in compression ratio for speed. This gives us more flexibility in environments with either low computational power or fast and many SSDs/HDDs where the lz4 is the limiting factor.
Autocompression and qos can also be improved by adding lz4fast algorithms.
How Has This Been Tested?
Checksums were made to proof full compatibility between the old and new lz4 compressed files.
Benchmark
Copy file from Tempfs to ZFS (ZFS also in Tempfs for high disk throughput simulation).
Types of changes
Branch overlapping changes (feature, compress values)
The patch has read-only backward compatibility by using the new SPA_FEATURE_LZ4FAST_COMPRESS feature. The feature activation procedure is equivalent to my other code branches.
Regarding the limited namespace of BP_GET_COMPRESS() (128 values), the
zio_compress enum's first part is for block pointer & dataset values, the second part for dataset values only. Or should I make use of a new property? This is an alternative suggestion to #3908.
Checklist: