
[WIP] Add Adaptive Compression (Rework/Revamp) #11002

Conversation

@PrivatePuffin (Contributor) commented Sep 29, 2020

Intro

(By @Ornias1993)
This is a rework of #7560 by @RubenKelevra, which itself is a successor of the auto-compression PR (#5928) by @n1kl.
Adaptive compression has long been due for a rebase and some basic rework.

While I don't personally have the skills to carry this, I hope it enables someone with the right skill set to finish it, based on ZSTD.

Description

(by @RubenKelevra)
I did some performance measurements, and it wasn't meeting my expectations. So I tweaked the algorithm and excluded "off" as a compression option, since the algorithm isn't actually able to determine the additional latency resulting from the larger data size when no compression is applied. I added gzip-2 to gzip-9 as options for the algorithm to choose from.

The algorithm should adapt to different CPU load situations, since it measures the latency introduced over the last 1000 compression cycles (one cycle = one block). If the load of the system changes over time, it might choose different compression algorithms.

In light of zstd, the adaptive compression keyword might be a good fit for a future adaptive zstd mode, selecting different zstd compression levels and relying on the same mechanism to choose among them.
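
To make the mechanism concrete, here is a minimal sketch in C (all names are hypothetical; this is not the PR's actual code) of latency-windowed level selection: keep the per-block compression latency of the last 1000 cycles and step the level up or down against a latency budget.

#include <stdint.h>

#define	ADAPTIVE_WINDOW	1000

typedef struct adaptive_state {
	uint64_t	as_samples[ADAPTIVE_WINDOW];	/* per-block latency (ns) */
	uint64_t	as_sum;				/* running sum of the window */
	int		as_next;			/* next slot to overwrite */
	int		as_level;			/* currently selected level */
	int		as_level_min;
	int		as_level_max;
} adaptive_state_t;

/* Feed one measured compression latency, return the level to use next. */
static int
adaptive_update(adaptive_state_t *as, uint64_t latency_ns, uint64_t target_ns)
{
	/* Replace the oldest sample and keep the running sum current. */
	as->as_sum -= as->as_samples[as->as_next];
	as->as_samples[as->as_next] = latency_ns;
	as->as_sum += latency_ns;
	as->as_next = (as->as_next + 1) % ADAPTIVE_WINDOW;

	uint64_t avg = as->as_sum / ADAPTIVE_WINDOW;	/* diluted during warm-up */

	/* Plenty of headroom: compress harder.  Over budget: back off. */
	if (avg < target_ns / 2 && as->as_level < as->as_level_max)
		as->as_level++;
	else if (avg > target_ns && as->as_level > as->as_level_min)
		as->as_level--;

	return (as->as_level);
}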

Motivation and Context

(By @n1kl)
Which compression algorithm is best for high throughput? The answer to this depends on the type of hardware in use.
If compression takes too long, the disk remains idle. If compression is faster than the writing speed of the disk, the CPU remains idle, as compression and writing to the disk happen in parallel.
Auto compression tries to keep both as busy as possible.
The disk load is observed through the vdev queue. If the queue is empty, a fast compression algorithm like lz4 with a low compression ratio is used; if the queue is full, gzip-[1-9] can spend more CPU time for higher compression ratios.
The already existing zio_dva_throttle might conflict with the concept described above; therefore it is recommended to deactivate zio_dva_throttle.
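
For the queue-based idea specifically, a rough sketch of the selection rule might look like the following (the thresholds, the enum, and the function name are invented for illustration; they are not the PR's actual mapping):

/*
 * Illustrative only: map vdev queue occupancy (0-100 %) to a compression
 * choice.  An empty queue means the disk is waiting on us, so spend little
 * CPU; a full queue means the disk is the bottleneck, so compress harder.
 */
typedef enum example_compress {
	EX_COMPRESS_LZ4,
	EX_COMPRESS_GZIP_1,
	EX_COMPRESS_GZIP_5,
	EX_COMPRESS_GZIP_9
} example_compress_t;

static example_compress_t
example_pick_compression(int queue_fill_pct)
{
	if (queue_fill_pct < 25)
		return (EX_COMPRESS_LZ4);
	if (queue_fill_pct < 50)
		return (EX_COMPRESS_GZIP_1);
	if (queue_fill_pct < 75)
		return (EX_COMPRESS_GZIP_5);
	return (EX_COMPRESS_GZIP_9);
}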

TODO list

Done

To Be Done

  • FreeBSD compatibility
  • Add ZSTD support
  • Replace GZIP with ZSTD (Make it scale between a few levels of zstd, instead of gzip and lz4)
  • Add comments where there are none
  • Make this tuneable: WIP: Adaptive compression [was: auto compression] #7560 (comment)
  • Add a test to test if adaptiveness is actually working
  • Backwards compatibility: Importing an adaptive-compression dataset into an older pool (without adaptive compression) should revert to the default compression level (needs to be checked)

How Has This Been Tested?

(by @RubenKelevra)
I ran a simple benchmark on a single HDD with different scenarios:

  • with and without load
  • with some common block sizes
  • with dva_throttle on and off
  • for xfs and ext4 on zvols

My corpus is /usr/lib from my system (5.9 G with 117,920 files in 17,777 folders), copied with cp -ax from an SSD to an HDD.

All ZFS settings were set to default, except for checksum, which was set to edonr.

System specs:

  • Intel i3 2330M @ 2.20GHz (2 physical / 4 logical cores)
  • 12 G DDR3 memory
  • 2.5" 750 G Samsung HDD (as destination)
  • Intel SSD 320 (as source)

I understand that these test results might not be valid for a typical server application, but they should be a good measurement for an average notebook user: a use case for ZFS where latency and throughput are important too.

The workload scenario was a synthetic, purely CPU/memory-bound user-space program with one thread per logical CPU core. The program used for this was BOINC, with SETI@home work units.

The load of the system was measured 75 seconds into the copy (on runs which completed in less than 75 s the load value is somewhat inaccurate). Overall this value isn't really hard proof that one test result is better than another. I just wanted to show that the load of the system doesn't skyrocket when using adaptive instead of lz4 or a gzip level.

In the original PR the author explained that dva_throttle might interfere with this adaptive compression algorithm selection. I can confirm this: it might result in a slightly worse compression ratio, but I cannot find a distinct drop in I/O performance that would hinder inclusion into master. Furthermore, with all compression algorithms, the performance impact was mixed with and without dva_throttle.

Overall, those performance numbers for adaptive compression often look pretty good. @RubenKelevra wasn't expecting performance better than plain LZ4 compression, but it performed better in some scenarios.

I would also like to point out that he used the filesystems without any parameters natively on the zvols. In my test the physical sector size is set by ZFS to the recordsize, so the filesystems are aware of this (ext4 at least) and might use some (automatic) optimizations for those large physical sector sizes. This might lead to different results than in VMs, where the physical block size is usually 512 or 4096 bytes for the filesystems inside the VM.

adaptive compression stats.pdf

adaptive compression stats zvol.pdf

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Performance enhancement (non-breaking change which improves efficiency)
  • Code cleanup (non-breaking change which makes code smaller or more readable)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Documentation (a change to man pages or other documentation)

Checklist:

Signed-off-by: Kjeld Schouten-Lebbing kjeld@schouten-lebbing.nl

@PrivatePuffin PrivatePuffin force-pushed the adaptive_compression_rebased2 branch 2 times, most recently from b5a9bc7 to dbc6d3e, on September 29, 2020 13:48
@PrivatePuffin PrivatePuffin force-pushed the adaptive_compression_rebased2 branch 3 times, most recently from 0915b74 to 8c2ba80, on September 29, 2020 14:57

@PrivatePuffin PrivatePuffin force-pushed the adaptive_compression_rebased2 branch 5 times, most recently from 9d138e9 to 7109bc3, on September 29, 2020 17:24
@PrivatePuffin PrivatePuffin changed the title from "[WIP] try and rebase adaptive compression" to "[WIP] Add Adaptive Compression (Rework/Revamp)" on Sep 29, 2020
@PrivatePuffin PrivatePuffin force-pushed the adaptive_compression_rebased2 branch 6 times, most recently from ac971ee to bf1d54f, on September 29, 2020 18:35
@AndyLavr

Linux-next 20200929 builds and works fine.

# zpool get all zdata
NAME   PROPERTY                       VALUE                          SOURCE
zdata  size                           83G                            -
zdata  capacity                       72%                            -
zdata  altroot                        -                              default
zdata  health                         ONLINE                         -
zdata  guid                           10843200388275897646           -
zdata  version                        -                              default
zdata  bootfs                         -                              default
zdata  delegation                     on                             default
zdata  autoreplace                    off                            default
zdata  cachefile                      -                              default
zdata  failmode                       wait                           default
zdata  listsnapshots                  off                            default
zdata  autoexpand                     off                            default
zdata  dedupratio                     1.00x                          -
zdata  free                           22.6G                          -
zdata  allocated                      60.4G                          -
zdata  readonly                       off                            -
zdata  ashift                         12                             local
zdata  comment                        -                              default
zdata  expandsize                     -                              -
zdata  freeing                        0                              -
zdata  fragmentation                  51%                            -
zdata  leaked                         0                              -
zdata  multihost                      off                            default
zdata  checkpoint                     -                              -
zdata  load_guid                      1869430285257613056            -
zdata  autotrim                       off                            default
zdata  feature@async_destroy          enabled                        local
zdata  feature@empty_bpobj            active                         local
zdata  feature@lz4_compress           active                         local
zdata  feature@multi_vdev_crash_dump  enabled                        local
zdata  feature@spacemap_histogram     active                         local
zdata  feature@enabled_txg            active                         local
zdata  feature@hole_birth             active                         local
zdata  feature@extensible_dataset     active                         local
zdata  feature@embedded_data          active                         local
zdata  feature@bookmarks              enabled                        local
zdata  feature@filesystem_limits      enabled                        local
zdata  feature@large_blocks           enabled                        local
zdata  feature@large_dnode            active                         local
zdata  feature@sha512                 enabled                        local
zdata  feature@skein                  enabled                        local
zdata  feature@edonr                  enabled                        local
zdata  feature@userobj_accounting     active                         local
zdata  feature@encryption             enabled                        local
zdata  feature@project_quota          active                         local
zdata  feature@device_removal         enabled                        local
zdata  feature@obsolete_counts        enabled                        local
zdata  feature@zpool_checkpoint       enabled                        local
zdata  feature@spacemap_v2            active                         local
zdata  feature@allocation_classes     enabled                        local
zdata  feature@resilver_defer         enabled                        local
zdata  feature@bookmark_v2            enabled                        local
zdata  feature@redaction_bookmarks    enabled                        local
zdata  feature@redacted_datasets      enabled                        local
zdata  feature@bookmark_written       enabled                        local
zdata  feature@log_spacemap           active                         local
zdata  feature@livelist               enabled                        local
zdata  feature@device_rebuild         enabled                        local
zdata  feature@zstd_compress          active                         local
zdata  feature@compress_adaptive      enabled                        local
# zpool get feature@compress_adaptive zdata
NAME   PROPERTY                   VALUE                      SOURCE
zdata  feature@compress_adaptive  enabled                    local

@PrivatePuffin (Contributor, Author) commented Sep 30, 2020

@AndyLavr Yeah, I expected as much, but I need it to pass all tests in the whole test suite (so all OS combinations too) before I add the additional tests to the test suite. I already have the new tests ready to be pushed; I just need the old ones to pass first.

@behlendorf we've got some buildbot issues over here...

@behlendorf (Contributor)

@Ornias1993 I sorted out some CI issues yesterday, please go ahead and force update the PR and it should run.

@PrivatePuffin PrivatePuffin force-pushed the adaptive_compression_rebased2 branch 2 times, most recently from df275c7 to 82b2fa3, on September 30, 2020 17:19
@PrivatePuffin (Contributor, Author)

I'm open to suggestions for fixing this on FreeBSD:
cast from 'const char *' to 'void *' drops const qualifier

In:
int
dsl_dataset_activate_compress_adaptive(const char *ddname)
{
	int error;

	error = dsl_sync_task(ddname, dsl_dataset_actv_compress_adaptive_check,
	    dsl_dataset_actv_compress_adaptive_sync, (void *)ddname, 0,
	    ZFS_SPACE_CHECK_RESERVED);

	return (error);
}
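
One possible direction (a sketch only, not necessarily the right fix for this PR): pass the name through a small argument struct, as many dsl_sync_task() callers do, so the const pointer never has to be cast to void *. The check and sync callbacks would then read the name from the struct. The struct name below is hypothetical.

/*
 * Hypothetical sketch: wrap the dataset name in an args struct instead of
 * casting the const pointer to void *.  The check and sync callbacks would
 * need to be adjusted to read ddcaa_ddname from this struct.
 */
typedef struct dsl_dataset_actv_compress_adaptive_arg {
	const char *ddcaa_ddname;
} dsl_dataset_actv_compress_adaptive_arg_t;

int
dsl_dataset_activate_compress_adaptive(const char *ddname)
{
	dsl_dataset_actv_compress_adaptive_arg_t ddcaa = {
		.ddcaa_ddname = ddname,
	};

	/*
	 * dsl_sync_task() waits for the sync task to complete, so a
	 * stack-allocated argument struct should be safe here.
	 */
	return (dsl_sync_task(ddname,
	    dsl_dataset_actv_compress_adaptive_check,
	    dsl_dataset_actv_compress_adaptive_sync, &ddcaa, 0,
	    ZFS_SPACE_CHECK_RESERVED));
}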

@PrivatePuffin (Contributor, Author) commented Sep 30, 2020

For reference, the only test that is failing due to adaptive compression is:
zpool_create_encrypted

@AndyLavr
So we currently have two issues that need work most urgently:

  • The test failure I listed above
  • FreeBSD not building (as explained earlier)

Both should be relatively easy to fix.

@PrivatePuffin PrivatePuffin force-pushed the adaptive_compression_rebased2 branch 3 times, most recently from 0fb43f7 to fe3c78e, on September 30, 2020 21:44
PrivatePuffin referenced this pull request in BrainSlayer/zfs Oct 1, 2020
@PrivatePuffin (Contributor, Author)

@gmelikov So, TL;DR: this would not be compatible with dedupe, right?

@PrivatePuffin (Contributor, Author)

Okay, with adaptive compression added to the test suite, it still only fails on one test:
FAIL cli_root/zpool_create/zpool_create_encrypted (expected PASS)

@BrainSlayer (Contributor)

JFYI, we compress before checksum generation (https://github.com/openzfs/zfs/blob/master/module/zfs/zio.c#L4788), so any compression change is very invasive.

Sounds like a performance bottleneck to me if everything is compressed before dedup takes place.

@gmelikov (Member) commented Oct 2, 2020

@gmelikov So, TL;DR: this would not be compatible with dedupe, right?

Looks like the same blocks with different compression methods will be different for dedup, if that's what you mean.

Sounds like a performance bottleneck to me if everything is compressed before dedup takes place.

Unfortunately, it's not that simple. For example:

Pros from checksums on processed data:

  • scrub
  • read and check data: we won't even try to decompress/etc. an invalid block
  • encrypted data doesn't need a key to be scrubbed

Cons:

  • the same block may not dedup after being written with different compression
  • you can't just check the uncompressed raw data

So ZFS is better at writes; it looks like not a bad compromise to me. But dedup does have this problem, yes. And it's not the main bottleneck for dedup now :)

@PrivatePuffin (Contributor, Author)

And it's not the main bottleneck for dedup now :)

Yes, it would only be slightly problematic with updating (though not a major issue), and it would just mean adaptive compression is not really well suited for dedupe; neither is a big issue IMHO.

@PrivatePuffin (Contributor, Author) commented Oct 2, 2020

@BrainSlayer I've added removal of the feature flag to the TODO list :)

Can someone give me some insight into why that one test is so problematic, btw?

@BrainSlayer (Contributor)

And it's not the main bottleneck for dedup now :)

Yes, it would only be slightly problematic with updating (though not a major issue), and it would just mean adaptive compression is not really well suited for dedupe; neither is a big issue IMHO.

Mmh, I'm using compression and dedup :-). My system stores a lot of source code, and I compile these source trees in parallel for different CPU architectures, so I have a lot of very big cloned source trees. A perfect application for dedup and compression. Just my two cents.

@PrivatePuffin (Contributor, Author)

And it's not the main bottleneck for dedup now :)

Yes, it would only be slightly problematic with updating (though not a major issue), and it would just mean adaptive compression is not really well suited for dedupe; neither is a big issue IMHO.

Mmh, I'm using compression and dedup :-). My system stores a lot of source code, and I compile these source trees in parallel for different CPU architectures, so I have a lot of very big cloned source trees. A perfect application for dedup and compression. Just my two cents.

That's NOT what I'm saying.
I'm not saying dedupe and compression aren't compatible.
I'm saying THIS PR isn't very compatible with dedupe, AND ZSTD updates are LESS compatible.

@BrainSlayer (Contributor)

@Ornias1993 Updates might be less compatible, but this affects only newly written blocks, so all newly written blocks will dedup in the same way as before (except where there is a reference to older blocks, of course). The reason I'm bringing this up is just to think about whether there is any solution for that problem.

@IvanVolosyuk (Contributor)

What solution is possible if the same block can be written (hashed) in multiple ways depending on the phase of the moon? It just means that dedup efficiency will be much lower and unpredictable. @Ornias1993 is right.

@PrivatePuffin (Contributor, Author)

What solution is possible if the same block can be written (hashed) in multiple ways depending on the phase of the moon?

@IvanVolosyuk That means that any(!) compression algorithm that is dynamic will inherently not be very compatible with dedupe. Which is fine: It's not a requirement for merging that everything should be fully compatible with dedupe.

@PrivatePuffin PrivatePuffin force-pushed the adaptive_compression_rebased2 branch 2 times, most recently from 59c2a87 to 4286ec5, on October 7, 2020 23:17
@PrivatePuffin (Contributor, Author) commented Oct 7, 2020

@BrainSlayer I squashed your changes and the tests I added into the first commit.

I also went ahead and tried to cut the feature-flag-related code, as this is not my specialty. Could you check whether I cut too much or too little, or if I made any mistakes there? (It's in the last commit.)

It's also rebased on master (again).


Edit

  • It does build on FreeBSD and Linux now, which is an improvement.
  • Completely reworked the PR description (including the past descriptions and tests by @RubenKelevra and @n1kl )

@PrivatePuffin (Contributor, Author)

@RubenKelevra You know this code best. I know you don't have a lot of time, but would you mind writing a short one- or two-line comment for every function in compress_adaptive.c explaining what it does? It doesn't have to be much or perfect; I'm perfectly fine with fixing it up a little, but it helps a lot.

@PrivatePuffin PrivatePuffin force-pushed the adaptive_compression_rebased2 branch from 718e013 to d45acd6, on October 8, 2020 13:26
@BrainSlayer (Contributor)

@BrainSlayer I squashed your changes and the tests I added into the first commit.

I also went ahead and tried to cut the feature-flag-related code, as this is not my specialty. Could you check whether I cut too much or too little, or if I made any mistakes there? (It's in the last commit.)

It's also rebased on master (again).

Edit

  • It does build on FreeBSD and Linux now, which is an improvement.
  • Completely reworked the PR description (including the past descriptions and tests by @RubenKelevra and @n1kl )

Yes, but later. I'm working on another project right now; my head is full of assembly right now.

@PrivatePuffin (Contributor, Author) commented Oct 8, 2020

my head is full of assembly right now

Good luck! 👍


edit
Current failure on FreeBSD:

16:39:31.81 link_elf_obj: symbol compress_calc_Bps undefined
16:39:31.81 linker_load_file: /boot/modules/openzfs.ko - unsupported file type

@RubenKelevra commented Oct 8, 2020

By @BrainSlayer

And it's not the main bottleneck for dedup now :)

Yes, it would only be slightly problematic with updating (though not a major issue), and it would just mean adaptive compression is not really well suited for dedupe; neither is a big issue IMHO.

Mmh, I'm using compression and dedup :-). My system stores a lot of source code, and I compile these source trees in parallel for different CPU architectures, so I have a lot of very big cloned source trees. A perfect application for dedup and compression. Just my two cents.

Well, you can happily continue to use this combination. Adaptive compression is just not well suited for your use case, because there's no guarantee that it will select the same compression level for two blocks with the same data.

That said, it might hunt for the highest level of compression all the time, since the lookup for dedup will delay the write to the disk and thus reduce the throughput.

So it might be more compatible than it seems at first glance.

By @Ornias1993

@RubenKelevra You know this code best. I know you don't have a lot of time, but would you mind writing a short one- or two-line comment for every function in compress_adaptive.c explaining what it does? It doesn't have to be much or perfect; I'm perfectly fine with fixing it up a little, but it helps a lot.

Yes, I have some free time tomorrow; I'll look into this within the next 24 hours or so.

  • Backwards compatibility: Importing an adaptive-compression dataset into an older pool (without adaptive compression) should revert to the default compression level (needs to be checked)

The point of the feature flag was, IIRC, that the 'adaptive compression' setting value needs to be understood by ZFS, or it might map to a different compression algorithm on an older version, or to an out-of-range value.

I think we need to decide between backwards compatibility and complexity here, since there are many different old versions of ZFS out there. I don't know how they would all react to an out-of-range value, so we need to be cautious not to make everything unreadable when loaded with an old version of ZFS.

But on the other hand, when we move from GZIP to ZSTD, only very recent versions could open it anyway, since we would require the ZSTD feature for operation.

I think using a feature flag is the safe way here, since we lose very little compatibility anyway without a 'fallback' and no feature flag, since ZSTD is a requirement anyway. So KISS. ;)

  • Replace GZIP with ZSTD (Make it scale between a few levels of zstd, instead of gzip and lz4)

I would add all levels of ZSTD; this way the algorithm can choose the optimal compression setting and doesn't 'jump' between two levels if the optimal one is right in the middle.

There's nothing to lose here by adding more compression methods; just a tiny amount of memory is used to keep track of them. Switching between compression levels shouldn't touch that much of the ZSTD code either, so cache misses (cache thrashing) on the CPU from constantly switching compression levels are unlikely.

  • Add a test to test if adaptiveness is actually working

This test could be difficult, since it's not guaranteed that adaptive compression will ever choose a different compression algorithm.

A simple way to (nearly) guarantee this could be to store some /dev/urandom data and then some /dev/zero data. It's extremely unlikely that both types of data will be compressed equally fast, so a variation in the compression level is nearly guaranteed.
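
As a minimal sketch of that idea (the paths and sizes below are hypothetical; the real test would live in the ZFS test suite), one could write an incompressible file from /dev/urandom and a highly compressible all-zero file to the same dataset and then compare the levels chosen:

#include <fcntl.h>
#include <string.h>
#include <unistd.h>

int
main(void)
{
	static char buf[1 << 20];	/* 1 MiB staging buffer */
	int fd;

	/* Incompressible data: fill the buffer from /dev/urandom. */
	fd = open("/dev/urandom", O_RDONLY);
	(void) read(fd, buf, sizeof (buf));
	(void) close(fd);
	fd = open("/testpool/adaptive/random.bin", O_WRONLY | O_CREAT, 0644);
	for (int i = 0; i < 256; i++)	/* ~256 MiB of random data */
		(void) write(fd, buf, sizeof (buf));
	(void) close(fd);

	/* Highly compressible data: all zeros. */
	(void) memset(buf, 0, sizeof (buf));
	fd = open("/testpool/adaptive/zero.bin", O_WRONLY | O_CREAT, 0644);
	for (int i = 0; i < 256; i++)	/* ~256 MiB of zeros */
		(void) write(fd, buf, sizeof (buf));
	(void) close(fd);

	return (0);
}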

@PrivatePuffin (Contributor, Author) commented Oct 8, 2020

Yes, I have some free time tomorrow; I'll look into this within the next 24 hours or so.

Awesome! 👍

I don't know how they would all react to an out-of-range value, so we need to be cautious not to make everything unreadable when loaded with an old version of ZFS.

If it maps to the default, that wouldn't be an issue. If we can figure that out, it wouldn't matter, I think, which would be KISS in itself.

I would add all levels of ZSTD; this way the algorithm can choose the optimal compression setting and doesn't 'jump' between two levels if the optimal one is right in the middle.

Nope, that's wrong IMHO.

  • Some levels of zstd create an unrealistically high load; we only added them because maybe in the far future they could be used. ZSTD-11 is about as fast as GZIP-9; you really don't want to push much beyond it.
  • The ZSTD compression ratio doesn't really scale reliably from one level to the next; "the one in the middle" might actually be just about the same as one level lower. There is a lot of testing proving this. Basically, one level lower or higher often falls within the margin of error.

My solution would be (having done extensive testing on it and having selected the 3 levels supported in the TrueNAS GUI):
1, 3, 5, 7, 9, 11 (and possibly 13 if you want to go really insane).

That being said:
We can try both options in practice as soon as we get it implemented; changing the few rows of code needed to switch would be no fuss :)
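
If it helps, the two candidate level sets could sit behind a single table, so switching between them really is only a few lines (the names and sets here are hypothetical, not anything already in the PR):

/* Hypothetical: the adaptive selector would step through one of these. */
static const int zstd_adaptive_levels_reduced[] = { 1, 3, 5, 7, 9, 11 };
static const int zstd_adaptive_levels_full[] = {
	1, 2, 3, 4, 5, 6, 7, 8, 9, 10,
	11, 12, 13, 14, 15, 16, 17, 18, 19
};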

Switching between compression levels shouldn't touch that much of the ZSTD code either, so cache misses (cache thrashing) on the CPU from constantly switching compression levels are unlikely.

I agree that the switching isn't an issue for ZSTD to handle.

This test could be difficult, since it's not guaranteed that adaptive compression will ever choose a different compression algorithm.

Well, I don't think it's impossible (although it's not a priority at this stage). One would need to create two tests:

  1. Test if the latency change measurement is done right
  2. Create a codepath to inject a fake "override" latency change measurement

To be clear: I think this is something to add at the latest stage of the project... It's not really relevant right now, but I think there are at least some things we can look into.


All this considered:
Adaptive compression passed all Linux compression tests!
We do need someone with FreeBSD experience to look into the error on the FreeBSD side, though...

Biggest TODO for now would be:

  • Add ZSTD support
  • Add comments where there are none

I'll try to keep this all organized and rebased in the meantime :)

@PrivatePuffin PrivatePuffin force-pushed the adaptive_compression_rebased2 branch 2 times, most recently from accec0d to 109bb95, on October 8, 2020 22:05
AndyLavr and others added 2 commits October 9, 2020 00:11
- This Commit rebases and squashes openzfs#7560
- Add tests for Adaptive compression
- Add some commenting to compress_adaptive

Co-authored-by:  n1kl (bunge) <n1kl@users.noreply.github.com>
Co-authored-by:  Ruben Kelevra <RubenKelevra@users.noreply.github.com>
Co-authored-by: Andy Lavr <andy.lavr@gmail.com>
Co-authored-by: Sebastian Gottschall <s.gottschall@dd-wrt.com>
Co-authored-by:  Kjeld Schouten-Lebbing <kjeld@schouten-lebbing.nl>
Signed-off-by: n1kl (bunge) <n1kl@users.noreply.github.com>
Signed-off-by: RubenKelevra <ruben@vfn-nrw.de>
Signed-off-by: Andy Lavr <andy.lavr@gmail.com>
Signed-off-by: Sebastian Gottschall <s.gottschall@dd-wrt.com>
Signed-off-by: Kjeld Schouten-Lebbing <kjeld@schouten-lebbing.nl>
This removes the adaptive compression feature flag

Signed-off-by: Kjeld Schouten-Lebbing <kjeld@schouten-lebbing.nl>
@PrivatePuffin PrivatePuffin force-pushed the adaptive_compression_rebased2 branch from 109bb95 to 3f8bd88, on October 8, 2020 22:14
@RubenKelevra

Yes, I have some free time tomorrow; I'll look into this within the next 24 hours or so.

Awesome!

I'm currently still reading the code, it's been a while :)

I don't know how they would all react to an out-of-range value, so we need to be cautious not to make everything unreadable when loaded with an old version of ZFS.

If it maps to the default, that wouldn't be an issue. If we can figure that out, it wouldn't matter, I think, which would be KISS in itself.

Yeah, but only if we can guarantee that older versions really don't screw things up.

I would add all levels of ZSTD; this way the algorithm can choose the optimal compression setting and doesn't 'jump' between two levels if the optimal one is right in the middle.

Nope, that's wrong IMHO.

  • Some levels of zstd create an unrealistically high load; we only added them because maybe in the far future they could be used. ZSTD-11 is about as fast as GZIP-9; you really don't want to push much beyond it.
  • The ZSTD compression ratio doesn't really scale reliably from one level to the next; "the one in the middle" might actually be just about the same as one level lower. There is a lot of testing proving this. Basically, one level lower or higher often falls within the margin of error.

I know that they are very CPU-intensive; my point was that the more levels the algorithm can choose from, the closer it can stay to the current write delay. When you give it only one option which is way too slow and one which is way too fast, it will choose both and switch between them all the time to meet the average delay in the middle.

You don't overload your system with the adaptive compression, since if there's too much load, the write performance will be hurt and the algorithm will reduce the compression ratio. That's the whole point of using it.

Additionally, I don't think there's a CPU out there which can keep up with the usual write speeds on something like ZSTD-15, so it won't ever be chosen; that's why it's safe to include it in the list.

My solution would be (having done extensive testing on it and having selected the 3 levels supported in the TrueNAS GUI):
1, 3, 5, 7, 9, 11 (and possibly 13 if you want to go really insane).

Well, in this case I'm insane ;D

I just don't care if system updates are slow, since I'm away from the system anyway. And decompression speed is fast on all compression levels.

$ mount -l | grep 'on / '
/dev/sda1 on / type btrfs (rw,noatime,thread_pool=1,compress-force=zstd:15,ssd_spread,space_cache=v2,subvolid=256,subvol=/@) [root]

That being said: We can try both options in practice as soon as we get it implemented; changing the few rows of code needed to switch would be no fuss :)

Switching between compression levels shouldn't touch that much of the ZSTD code either, so cache misses (cache thrashing) on the CPU from constantly switching compression levels are unlikely.

I agree that the switching isn't an issue for ZSTD to handle.

So, following my arguments above, there should be no issue with just including all ZSTD levels like we included all GZIP levels before; the performance should even be better since there's no constant switching between gzip-2 and lz4.

This test could be difficult, since it's not guaranteed that adaptive compression will ever choose a different compression algorithm.

Well, I don't think it's impossible (although it's not a priority at this stage). One would need to create two tests:

  1. Test if the latency change measurement is done right
  2. Create a codepath to inject a fake "override" latency change measurement

To be clear: I think this is something to add at the latest stage of the project... It's not really relevant right now, but I think there are at least some things we can look into.

All this considered:
Adaptive compression passed all Linux compression tests!
We do need someone with FreeBSD experience to look into the error on the FreeBSD side, though...

Biggest TODO for now would be:

  • Add ZSTD support
  • Add comments where there are none

I'll try to keep this all organized and rebased in the meantime :)

Thanks for your work. :)

@PrivatePuffin (Contributor, Author)

I'm going to close this, as it's basically stale and the people originally working on or interested in this have all but disappeared by now.
