[WIP] Add Adaptive Compression (Rework/Revamp) #11002
Conversation
Linux-next 20200929 builds and works fine.
@AndyLavr Yeah, I expected that much, but I need it to pass all tests on the whole test suite (so also all OS combinations) before I'm going to add the additional tests to the test suite. I already have the new tests ready to be pushed, I just need the old ones to pass first. @behlendorf got some buildbot issues over here...
@Ornias1993 I sorted out some CI issues yesterday, please go ahead and force-update the PR and it should run.
I'm open to suggestions for a fix on FreeBSD. In:
For reference, the only test that is failing due to adaptive compression is: @AndyLavr
Signed-off-by: Sebastian Gottschall <s.gottschall@dd-wrt.com>
@gmelikov So, TL;DR: this would not be compatible with dedup, right?
Okay, with adaptive compression added to the test suite, it still only fails on one test:
Sounds like a performance bottleneck to me if everything is compressed before dedup takes place.
Looks like the same blocks with different compression methods will be different for dedup, if that's what you mean.
Unfortunately, it's not that simple. For example, pros of checksums on processed data:
Cons:
So ZFS is better at writes; it looks like not a bad compromise to me. But dedup does have this problem, yes. And it's not the main bottleneck for dedup now :)
Yes, it would only be slightly problematic with updating (although not major), and it would just mean adaptive compression is not really well suited for dedup, neither of which is a big issue IMHO.
@BrainSlayer I've added removal of the feature flag to the TODO list :) Can someone give me some insight into why that one test is so problematic, btw?
Mmh, I'm using compression and dedup :-). My system has a lot of source code stored, and I compile these source trees in parallel for different CPU architectures, so I have a lot of very big cloned source trees. A perfect application for dedup and compression. Just my two cents.
That's NOT what I'm saying.
@Ornias1993 Updates might be less compatible, but this affects only newly written blocks, and all newly written blocks will dedup in the same way as before (except if there is a reference to older blocks, of course). The reason I'm bringing this up is just the thought of whether there is any solution for that problem.
What solution is possible if the same block can be written (hashed) in multiple ways depending on the phase of the moon? It just means that dedup efficiency will be much lower and unpredictable. @Ornias1993 is right.
@IvanVolosyuk That means that any(!) compression algorithm that is dynamic will inherently not be very compatible with dedup. Which is fine: it's not a requirement for merging that everything should be fully compatible with dedup.
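For illustration only (this is not ZFS code and not part of this PR): ZFS checksums a block as it is written, i.e. after compression, so the same payload compressed at two different levels typically yields two different on-disk blocks that will not dedup against each other. A minimal standalone sketch using zlib:

```c
/*
 * Standalone illustration (not ZFS code): compress the same buffer at two
 * different levels and compare the outputs. If the compressed streams
 * differ -- as they typically do across levels -- the on-disk checksums
 * differ too, and dedup will not match the blocks.
 * Build with: cc dedup_demo.c -lz
 */
#include <stdio.h>
#include <string.h>
#include <zlib.h>

int
main(void)
{
	unsigned char src[4096];
	unsigned char out_fast[8192], out_slow[8192];
	uLongf len_fast = sizeof (out_fast), len_slow = sizeof (out_slow);
	size_t i;

	/* Identical logical data for both "writes". */
	for (i = 0; i < sizeof (src); i++)
		src[i] = (unsigned char)(i % 64);

	(void) compress2(out_fast, &len_fast, src, sizeof (src), 1);
	(void) compress2(out_slow, &len_slow, src, sizeof (src), 9);

	printf("level 1: %lu bytes, level 9: %lu bytes, identical: %s\n",
	    (unsigned long)len_fast, (unsigned long)len_slow,
	    (len_fast == len_slow &&
	    memcmp(out_fast, out_slow, len_fast) == 0) ? "yes" : "no");
	return (0);
}
```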
@BrainSlayer I squashed your changes and the tests I added into the first commit. I also went ahead and tried to cut the feature-flag-related code; as this is not my specialty, could you check whether I cut too much or too little, or made any mistakes there? (It's in the last commit.) It's also rebased on master (again).
@RubenKelevra You know this code best. I know you don't have a lot of time, but would you mind writing a short one- or two-line comment for every function in compress_adaptive.c to explain what it's doing? It doesn't have to be much or perfect; I'm perfectly fine with fixing it up a little, but it would help a lot.
Yes, but later. I'm working on another project right now; my head is full of assembly at the moment.
Good luck! 👍
By @BrainSlayer
Well, you can happily continue to use this combination. Adaptive compression is just not well suited for your use case, because there's no guarantee that it will select the same compression level for both blocks with the same data. That said, it might hunt for the highest level of compression all the time, since the lookup for dedup will delay the write to the disk and thus reduce the throughput. So it might be more compatible than it seems at first glance.
By @Ornias1993
Yes, I have some free time tomorrow; I'll look into this within the next ~24 hours.
The point of the feature flag was, IIRC, that the 'adaptive compression' setting value needs to be understood by ZFS, or it might map to a different compression algorithm on an older version, or to an out-of-range value. I think we need to decide between backwards compatibility and complexity here, since there are many different old versions of ZFS out there. I don't know how they would all react to an out-of-range value, so we need to be cautious not to make everything unreadable when loaded with an old version of ZFS. But on the other hand, when we move from GZIP to ZSTD, only very recent versions could open it anyway, since we would require the ZSTD feature for operation. I think using a feature flag is the safe way here, since we lose very little compatibility anyway without a 'fallback' and no feature flag, because ZSTD is a requirement anyway. So KISS. ;)
I would add all levels of ZSTD; this way the algorithm can choose the optimal compression setting and doesn't 'jump' between two levels if the optimal one is right in the middle. There's nothing to lose by adding more compression methods, just a tiny amount of memory used to keep track of them. Switching between compression levels shouldn't touch that much of the ZSTD code either, so cache misses (cache thrashing) on the CPU from constantly switching compression levels are unlikely.
This test could be difficult, since it's not guaranteed that adaptive compression will ever choose a different compression algorithm. A simple way to (nearly) guarantee this could be to store some
Awesome! 👍
If it maps to the default, that wouldn't be an issue. If we figure this out, it wouldn't matter, I think.
Nope, that's wrong IMHO.
My solution would be (having done extensive testing on it and selecting the three levels supported in the TrueNAS GUI): That being said:
I agree that the switching isn't an issue for ZSTD to handle.
Well, I don't think it's impossible (although it's not a priority at this stage). One would need to create two tests:
To be clear: I think this is something to add at the latest stage of the project... Not really relevant, but I think there are at least some things we can look into. All this considered, the biggest TODO for now would be:
I'll try to keep this all organized and rebased in the meantime :)
- This commit rebases and squashes openzfs#7560
- Add tests for adaptive compression
- Add some commenting to compress_adaptive

Co-authored-by: n1kl (bunge) <n1kl@users.noreply.github.com>
Co-authored-by: Ruben Kelevra <RubenKelevra@users.noreply.github.com>
Co-authored-by: Andy Lavr <andy.lavr@gmail.com>
Co-authored-by: Sebastian Gottschall <s.gottschall@dd-wrt.com>
Co-authored-by: Kjeld Schouten-Lebbing <kjeld@schouten-lebbing.nl>
Signed-off-by: n1kl (bunge) <n1kl@users.noreply.github.com>
Signed-off-by: RubenKelevra <ruben@vfn-nrw.de>
Signed-off-by: Andy Lavr <andy.lavr@gmail.com>
Signed-off-by: Sebastian Gottschall <s.gottschall@dd-wrt.com>
Signed-off-by: Kjeld Schouten-Lebbing <kjeld@schouten-lebbing.nl>
This removes the adaptive compression feature flag.

Signed-off-by: Kjeld Schouten-Lebbing <kjeld@schouten-lebbing.nl>
I'm currently still reading the code, it's been a while :)
Yeah, but only if we can guarantee that older versions really don't screw things up.
I know that they are very CPU-intensive; my point was that the more levels the algorithm can choose from, the closer it can stay to the current write delay. When you give it only one option which is way too slow and one which is way too fast, it will choose both and switch between them all the time to meet the average delay in the middle. You don't overload your system with adaptive compression, since if there's too much load, the write performance will be hurt and the algorithm will reduce the compression ratio. That's the whole point of using it. Additionally, I don't think there's a CPU out there which can keep up with the usual write speeds on something like ZSTD-15, so it won't ever be chosen; that's why it's safe to include it in the list.
Well, in this case I'm insane ;D I just don't care if system updates are slow, since I'm away from the system anyway. And decompression speed is fast on all compression levels.
So, following my arguments above, there should be no issue with just including all ZSTD levels like we included all GZIP levels before; the performance should even be better, since there's no constant switching between gzip-2 and lz4.
Thanks for your work. :)
I'm going to close this as it's basically STALE and the people originally working on/interested in this have all but disappeared by now.
Intro
(By @Ornias1993)
This is a rework of #7560 by @RubenKelevra, which itself is a successor of the auto compression PR (#5928) by @n1kl.
Adaptive compression has been long due for a rebase and some basic rework.
While I don't personally have the skill to carry this, I hope this enables someone with the right skillset to finish it, based on ZSTD.
Description
(by @RubenKelevra)
I did some performance measurements and it wasn't meeting my expectations. So I tweaked the algorithm and excluded 'off' (no compression) as an option, since the algorithm isn't able to determine the additional latency resulting from the larger data size when no compression is applied. I added gzip-2 to gzip-9 as options for the algorithm to choose from.
The algorithm should adapt to different CPU load situations, since it measures the latency introduced over the last 1000 compression cycles (one cycle per block). If the load of the system changes over time, it might choose different compression algorithms.
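As a rough illustration of the mechanism described above (a minimal sketch with made-up names such as `adapt_state_t` and `adapt_pick_level()`, not the code in this PR): keep a rolling window of per-block compression latencies and step the selected level up or down so the window average stays near a target delay.

```c
/*
 * Illustrative sketch only -- the names and the step-wise policy are
 * assumptions, not the PR's actual implementation.
 */
#include <stdint.h>

#define	ADAPT_WINDOW	1000			/* last 1000 compression cycles */

typedef struct adapt_state {
	uint64_t	samples[ADAPT_WINDOW];	/* per-block latency (ns) */
	int		pos;			/* next slot to overwrite */
	uint64_t	sum;			/* running sum of the window */
	int		level;			/* currently selected level */
} adapt_state_t;

/* Record the measured compression latency of one block. */
static void
adapt_record(adapt_state_t *as, uint64_t latency_ns)
{
	as->sum -= as->samples[as->pos];
	as->samples[as->pos] = latency_ns;
	as->sum += latency_ns;
	as->pos = (as->pos + 1) % ADAPT_WINDOW;
}

/* Step the level toward the target average delay. */
static int
adapt_pick_level(adapt_state_t *as, uint64_t target_ns, int min_lvl, int max_lvl)
{
	uint64_t avg = as->sum / ADAPT_WINDOW;

	if (avg > target_ns && as->level > min_lvl)
		as->level--;	/* too slow: compress less aggressively */
	else if (avg < target_ns && as->level < max_lvl)
		as->level++;	/* headroom left: compress harder */
	return (as->level);
}
```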
In light of zstd, the adaptive compression keyword might be a good choice for a future adaptive zstd mode, selecting different zstd compression levels and relying on the same mechanism to select them.
Motivation and Context
(By @n1kl)
Which compression algorithm is best for high throughput? The answer to this depends on the type of hardware in use.
If compression takes too long, the disk remains idle. If compression is faster than the writing speed of the disk, the CPU remains idle, since compression and writing to the disk happen in parallel.
Auto compression tries to keep both as busy as possible.
The disk load is observed through the vdev queue. If the queue is empty, a fast compression algorithm like lz4 with a low compression ratio is used; if the queue is full, gzip-[1-9] can spend more CPU time for higher compression ratios.
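To make the idea concrete, here is a hedged sketch of the queue-driven selection. The enum values and thresholds are invented for illustration; the actual PR reads the vdev queue state through ZFS internals.

```c
/*
 * Illustrative sketch only: map the observed vdev queue fill level to a
 * compression choice. An empty queue means the disk is waiting on the
 * CPU, so pick the cheapest algorithm; a full queue means the disk is
 * the bottleneck, so more CPU time can be spent on compression.
 */
typedef enum {
	PICK_LZ4,
	PICK_GZIP_1,
	PICK_GZIP_5,
	PICK_GZIP_9
} pick_t;

static pick_t
pick_by_queue_fill(unsigned int queued, unsigned int queue_max)
{
	unsigned int fill_pct = (queue_max == 0) ? 0 : (queued * 100) / queue_max;

	if (fill_pct < 25)
		return (PICK_LZ4);	/* disk mostly idle: stay cheap */
	else if (fill_pct < 50)
		return (PICK_GZIP_1);
	else if (fill_pct < 75)
		return (PICK_GZIP_5);
	else
		return (PICK_GZIP_9);	/* disk saturated: trade CPU for ratio */
}
```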
The already existing zio_dva_throttle might conflict with the concept described above. Therefore it is recommended to deactivate zio_dva_throttle.
TODO list
Done
To Be Done
How Has This Been Tested?
(by @RubenKelevra)
I ran a simple benchmark on a single HDD with different scenarios:
All ZFS settings were set to defaults, except for checksum, which was edonr.
System specs:
I understand that these test results might not be valid for a typical server application, but they should be a good measurement for an average notebook user: a use case for ZFS where latency and throughput are important too.
The workload scenario was a synthetic, purely CPU/memory-bound user-space program, with one thread per logical CPU core. The program used for this was BOINC, with SETI@home work units.
The load of the system was measured 75 seconds into the copy (on runs which completed in less than 75 s, the load value is somewhat inaccurate). Overall, this value isn't really hard proof that one test result is better than another. I just wanted to show that the load of the system doesn't skyrocket when using adaptive instead of lz4 or a gzip level.
In the original PR the author explained that dva_throttle might interfere with this adaptive compression algorithm selection. I can confirm this: it might result in a slightly lower compression ratio, but I could not find a distinct drop in I/O performance that would hinder inclusion into master. Furthermore, with all compression algorithms, the performance impact was mixed with and without dva_throttle.
Overall, those performance numbers for adaptive compression often look pretty good. @RubenKelevra wasn't expecting performance better than plain LZ4 compression, but it performed better in some scenarios.
I'd also like to point out that he used the filesystems without any parameters, natively on the zvols. In my test the physical sector size is set by ZFS to the recordsize, so the filesystems are aware of this (ext4 at least) and might use some (automatic) optimizations for those large physical sector sizes. This might lead to different results than in VMs, where the physical block size is usually 512 or 4096 bytes for the filesystems inside the VM.
adaptive compression stats.pdf
adaptive compression stats zvol.pdf
Types of changes
Checklist:
Signed-off-by
Signed-off-by: Kjeld Schouten-Lebbing <kjeld@schouten-lebbing.nl>