[WIP] gzip compression heuristic #7481
Conversation
Codecov Report
@@            Coverage Diff             @@
##           master    #7481      +/-   ##
==========================================
- Coverage   76.58%   76.57%   -0.01%
==========================================
  Files         336      335       -1
  Lines      107137   107107      -30
==========================================
- Hits        82047    82020      -27
+ Misses      25090    25087       -3
Continue to review full report at Codecov.
@allanjude may be interested in this
```c
	d_len = s_len - 1;
} else {
	d_len = s_len;
}
```
For this to be worthwhile we need to save at least a full sector since that's what it will be padded out to. Let's go ahead and assume a 4k sector size since the actual ashift isn't available, and this is only a heuristic.
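To make the arithmetic concrete, here is a minimal sketch of that sector-count comparison under the 4K assumption; the constant and function names are illustrative, not existing ZFS symbols.

```c
#include <stddef.h>

/* Placeholder for the assumed 4K sector size (not a real ZFS constant). */
#define ASSUMED_SECTOR_SIZE	4096u

/* Round a byte count up to whole assumed sectors. */
static size_t
sectors_used(size_t len)
{
	return ((len + ASSUMED_SECTOR_SIZE - 1) / ASSUMED_SECTOR_SIZE);
}

/*
 * Compression is only worthwhile if the compressed length occupies at
 * least one fewer on-disk sector than the source length.
 */
static int
saves_a_full_sector(size_t s_len, size_t c_len)
{
	return (sectors_used(c_len) < sectors_used(s_len));
}
```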
Making the assumption of 4K sectors would mean that gzip is mostly disabled on small-block files/zvols, right? That seems like an unnecessary regression in functionality. Could we somehow have the spa_t accessible here, so that we could check the spa_min_ashift?
That would be good, and after a quick glance it does look like it's available in all the callers so it can be passed in.
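If the spa is passed through, the same comparison could be driven by the pool's minimum ashift instead of a constant. A rough sketch, assuming a min_ashift argument is plumbed in (the helper name and interface are hypothetical):

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Sector-savings check parameterized by the pool's minimum ashift
 * (e.g. derived from spa_min_ashift) rather than assuming 4K.
 */
static int
saves_a_full_sector_ashift(size_t s_len, size_t c_len, uint64_t min_ashift)
{
	size_t sector = (size_t)1 << min_ashift;
	size_t s_sectors = (s_len + sector - 1) >> min_ashift;
	size_t c_sectors = (c_len + sector - 1) >> min_ashift;

	return (c_sectors < s_sectors);
}
```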
@behlendorf I'm not sure this is a safe assumption. If the data contains all zeros, it might be possible to compress data larger than one block down to a very small size, small enough to fit in the block pointer via the embedded_data feature. If we don't add an additional check to avoid this, we completely break the embedded_data feature with the check you propose, right?
@RubenKelevra If the data is all zeros, it will be caught by the zio_compress_zeroed_cb() check on line 112, and written as a hole rather than as data.
The point @behlendorf was trying to make is that the SAVINGS needs to be at least one sector, else doing any compression is a waste, since it won't actually consume any less space on disk.
@kpande I don't feel comfortable with your replies anymore. Just stop.
@allanjude well, my point was: if the data is just one sector, no full sector can ever be saved, so the check will always report zero sectors saved. If the data compresses very well but is also very small uncompressed, it will no longer be compressed at all, and thus the embedded_data feature won't be triggered.
@kpande: I think that @RubenKelevra is engaging in a technical discussion about the code in question. One needn't be an expert to ask questions and try to understand the code.
Everyone is welcome to contribute to the technical discussion as long as it's done respectfully.
@RubenKelevra the embedded block pointers do complicate things slightly. For small data blocks there may be a benefit in doing the compression even if a full sector isn't saved. This is actually also true for metadata blocks. My comment, which @allanjude clarified nicely, was only about the common medium-to-large block case. There are quite a few cases here which need to be considered.
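One possible way to reconcile the two concerns is to exempt results small enough for an embedded block pointer from the full-sector rule. A rough sketch, with both thresholds treated as placeholder assumptions rather than the real ZFS constants:

```c
#include <stddef.h>

/* Stand-in for the embedded-data payload limit (~100 bytes); assumed value. */
#define EMBEDDED_PAYLOAD_MAX	112u
/* Assumed sector size, as discussed above. */
#define ASSUMED_SECTOR_SIZE	4096u

static int
compression_worthwhile(size_t s_len, size_t c_len)
{
	/* Tiny results may fit in the block pointer itself. */
	if (c_len <= EMBEDDED_PAYLOAD_MAX)
		return (1);

	/* Otherwise require at least one full sector of savings. */
	size_t s_sectors = (s_len + ASSUMED_SECTOR_SIZE - 1) / ASSUMED_SECTOR_SIZE;
	size_t c_sectors = (c_len + ASSUMED_SECTOR_SIZE - 1) / ASSUMED_SECTOR_SIZE;
	return (c_sectors < s_sectors);
}
```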
This is an interesting idea. I wonder how we might make it more generic so it can be used for zstd and future compression algorithms as well. Like an 'is_slow' flag on any compression type/level that is slower than 100 MB/s on an average core, or something.
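As a sketch of what such a flag might look like, here is a hypothetical per-algorithm table; the struct, field names, and entries are illustrative and not the existing zio_compress_info_t:

```c
#include <stdint.h>

/*
 * Hypothetical per-algorithm entry: algorithms marked "slow" (roughly
 * under ~100 MB/s per core) would be gated behind the lz4 pre-test.
 */
typedef struct compress_entry {
	const char	*ce_name;
	int		ce_level;
	uint8_t		ce_slow;	/* 1 = gate behind the lz4 heuristic */
} compress_entry_t;

static const compress_entry_t compress_table[] = {
	{ "lz4",	 0, 0 },
	{ "gzip-6",	 6, 1 },
	{ "gzip-9",	 9, 1 },
	{ "zstd-19",	19, 1 },	/* example entry only */
};
```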
@allanjude maybe we should test all compression algorithms on load and flag those which are slow on the system in use? GZIP should be fast when using an extension card; I wonder whether those tests are useful in that case.
All, thanks for the helpful comments. I'm trying some new ways of building my benchmark test environment for this and it's been taking a while to get it all working. @behlendorf I'm hoping some of the benchmarking will help tease out a good lz4 threshold to use. The squash compression benchmark https://quixdb.github.io/squash-benchmark/ has some interesting comparisons between algorithms. If you choose the "x-ray" file and the "s-desktop" processor, it shows that lz4 level 1 gets a compression ratio of 1.00 at a speed over 700 MB/s, while gzip level 6 gets a ratio of 1.40 at a speed of 20 MB/s. It also lists an lz4f algorithm; I'm not sure how that compares to zfs's lz4.

Based on that and other data on the squash site, I think there are 3 cases to consider for the heuristic, based on how well gzip is able to compress a file:

- "Very compressible" case. If gzip can compress a given file beyond a ratio of, say, 1.8 (a rough guess for now), lz4 should also be able to compress the file (though less than gzip). The heuristic will decide the file is "very compressible" and invoke gzip to do full compression. Users see full gzip compression and a minimal speed reduction compared to not using the heuristic.
- "Incompressible" case. If gzip is unable to compress a file to a ratio higher than the zfs minimum of 1.14 (i.e. 87.5%), lz4 will likely provide no compression on the file at all. In this case, the heuristic will skip gzip altogether and save significant CPU time. Users see a significant speedup compared to not using the heuristic. This case is the big win and the reason to use the heuristic.
- "Somewhat compressible" case. These files have a gzip compression ratio between 1.14 and 1.8. Ideally, we would find an lz4 compression ratio, say greater than 1.01, that predicts that gzip would achieve better than the zfs minimum of 1.14. If the lz4 ratio we use is too high compared to the ideal, users get faster and faster speed with less compression as we increase the ratio. If the ratio is too low, users get slower and slower speed (due to more invocations of gzip) but with maximal compression. If we find that tuning the heuristic just doesn't work very well for these somewhat-compressible files, we could consider putting it back on the user and giving them "gzip6" and "gzip6+lz4" compression options. Or we could try to find a better predictor of gzip compressibility.
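Put in code form, the cutoff logic for these cases might look like the following sketch; the 1.01 cutoff is the rough guess above and the function is illustrative, not part of this PR:

```c
#include <stddef.h>
#include <stdbool.h>

/*
 * Decide from a trial lz4 result whether gzip is likely to clear ZFS's
 * 12.5% minimum (ratio ~1.14).  Below the cutoff the block is treated as
 * effectively incompressible and gzip is skipped; above it, gzip runs and
 * the existing 12.5% check discards marginal results.
 */
static bool
lz4_predicts_gzip_worthwhile(size_t s_len, size_t lz4_len)
{
	double lz4_ratio = (double)s_len / (double)lz4_len;

	return (lz4_ratio > 1.01);	/* rough cutoff from the discussion */
}
```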
@rbrewer123 I agree with your analysis. Can you gather some data on the "somewhat compressible" cases that could help us choose the lz4 ratio cutoff, or show us that there is no reasonable cutoff value?
@ahrens, life has been getting in the way lately, but I do intend to collect some data to inform what to do next.
I like your approach to fine-tuning those values, but why not just autotune them based on the actual data? We could start with your numbers on boot, then keep a sliding average of the best value for the compress/no-compress threshold by simply picking some random blocks, say 1 in 10,000, and doing both an lz4 and a gzip compression to gather additional data points.
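As a very rough illustration of that autotuning idea, the sample-and-average step could look something like the sketch below; the sampling rate check, EMA weight, and update rule are all assumptions made up for illustration:

```c
#include <stdlib.h>

/* Starting cutoff guess taken from this thread. */
static double lz4_cutoff = 1.01;

/*
 * Called for blocks where both an lz4 and a gzip trial were run.  Roughly
 * 1 in 10,000 blocks are sampled; the observed ratios nudge the cutoff
 * via an exponential moving average.
 */
static void
maybe_update_cutoff(double lz4_ratio, double gzip_ratio)
{
	if ((rand() % 10000) != 0)
		return;		/* only sample ~0.01% of blocks */

	/*
	 * If gzip cleared the 12.5% minimum (ratio ~1.14), pull the cutoff
	 * toward the lz4 ratio that predicted it; otherwise push it a bit
	 * above that lz4 ratio.
	 */
	double target = (gzip_ratio >= 1.14) ? lz4_ratio : lz4_ratio + 0.01;

	lz4_cutoff = 0.99 * lz4_cutoff + 0.01 * target;
}
```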
What's happening with this? I'd like to use this feature. How can we move this forward?
@softwarecreations in the meanwhile, you should watch the ZSTD PR (#9024), which might be really close these days.
@softwarecreations @ColinIanKing and @softwarecreations
@Ornias1993 I had the same idea already, but never implemented it since zstd alone is much faster than gzip, and lz4 would just waste CPU time here.
Checking whether a block is incompressible is also not that hard without lz4; a statistical approach, for instance checking for a Gaussian distribution, works too.
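For example, a byte-histogram entropy estimate along those lines might look like this sketch; the 7.9-bit cutoff is an arbitrary placeholder and the function is not part of this PR:

```c
#include <math.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Estimate the Shannon entropy of the block's byte histogram and treat
 * near-random data (close to 8 bits/byte) as incompressible.
 */
static int
looks_incompressible(const uint8_t *buf, size_t len)
{
	size_t counts[256] = { 0 };
	double entropy = 0.0;

	for (size_t i = 0; i < len; i++)
		counts[buf[i]]++;

	for (int b = 0; b < 256; b++) {
		if (counts[b] == 0)
			continue;
		double p = (double)counts[b] / (double)len;
		entropy -= p * log2(p);
	}

	return (entropy > 7.9);	/* placeholder threshold */
}
```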
@BrainSlayer I wasn't suggesting anything, just pointing them to a changed link ;) That being said:
Okay, is this PR actually still relevant? Simply put: ZSTD is able to give GZIP-level compression at 50-75% of LZ4 speed; adding any kind of LZ4 heuristic on top would be useless. I think this can quite safely be closed.
IMHO it's fair to treat gzip as deprecated for purposes other than interop.
A note that QAT-accelerated gzip likely has similar problems to ZSTD re: recompression. If a block was originally compressed with software gzip, QAT gzip may compress it differently and result in a different checksum. This may cause L2ARC issues (with compressed_arc disabled), but also cause problems with nopwrite etc. It is likely it would "inherit" the solution we come up with for ZSTD future-proofing.
@allanjude This PR has nothing to do with GZIP in general; please add your note elsewhere, because it's not relevant to this PR at all.
I noticed that this PR hasn't been updated in some time. It seems like this might still have some benefit even with zstd? But it seems that it isn't quite complete, so we're going to close it for now. When you have time to revisit this work, feel free to reopen this PR or open a new one. Thanks for your contribution to OpenZFS and please continue to open PRs to address issues that you encounter!
Description
This change uses lz4 as a heuristic to decide if a block is worth compressing with gzip levels 6-9. If lz4 provides any compression at all, gzip is attempted. Otherwise, the block is assumed incompressible and gzip is skipped.
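In outline, the heuristic reduces to a single predicate on the trial lz4 result. A minimal sketch, assuming the caller has already run a trial lz4 pass (the function name is illustrative, not the code in this PR):

```c
#include <stddef.h>
#include <stdbool.h>

/*
 * `lz4_len` is the length returned by a trial lz4 pass over the block;
 * by convention the ZFS compress callbacks return s_len when they
 * achieve nothing.  The caller would only run gzip 6-9 when this
 * returns true.
 */
static bool
worth_gzipping(size_t s_len, size_t lz4_len)
{
	/*
	 * If lz4 could not shrink the block at all, assume gzip will not
	 * reach ZFS's required 12.5% savings either and skip it.
	 */
	return (lz4_len < s_len);
}
```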
Motivation and Context
The goal of this change is to improve gzip compression speed for incompressible data, without significantly affecting performance on compressible data. #7373 provides additional discussion and rationale.
How Has This Been Tested?
I haven't tested this yet. I will add benchmarks here as I test this. I encourage others with particular workloads to post their own benchmarks.
This change does not properly deal with the QAT hardware compression yet. It might be that, if this heuristic check is viable, it should go in gzip_compress(). The issue with that is that d_len has already been reduced by 12.5%. I could use s_len instead, but it feels error prone. For now I'm posting this for benchmarking purposes.
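For reference, the s_len-based alternative mentioned above would amount to recomputing the 12.5% target from the original size; a minimal sketch (the helper is hypothetical):

```c
#include <stddef.h>

/*
 * Recompute the maximum acceptable compressed size from the original
 * block size, i.e. require at least 12.5% savings: s_len - s_len/8.
 */
static size_t
required_compressed_size(size_t s_len)
{
	return (s_len - (s_len >> 3));
}
```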