auto compression #5928
This is a neat idea. I don't see how it conflicts with zio_dva_throttle. The allocation throttle should be configured to send at least enough i/os to each device to fill its queue ( I would suggest using separate enums for the |
man/man5/zpool-features.5
Outdated
pool.

This feature becomes \fBactive\fR as soon as it is used on one dataset and will
return to being \fBenabled\fB once all filesystems that have ever had their compression set to
typo, should be enabled\fR
df94bbc to 7cd574f
@ahrens For performance it is important that the vdev queue never gets empty.
If the compression factor is >4 there would have to be more than 100 ZIOs in the queue, the zio_dva_throttle becomes active and we would stick to lz4. An alternative approach is to monitor |
Is this a feature similar to smart compression (https://reviews.csiden.org/r/266/)? By the way, does anyone know why that was discarded?
@jumbi77 From reading the description, smart compression turns compression off for a while when data is classified as incompressible, to save computational resources. That is a further improvement that could also be added to this feature in the future.
Thanks for taking the time and effort to implement this. I bet it will greatly improve the usability of compression on desktop computers. Did you test whether it improves system load when saving files while the system is busy? That is one of the major blockers for compression, IMHO. Does it support more than gzip-1 at the moment? gzip-5 or even gzip-7 often gives much better compression on things like text files or software libraries. These questions are just out of curiosity and should not block merging once the rebase is completed.
So, I dug into your code to answer the question myself. We have support for gzip compression offloading, so there may be machines out there that can handle gzip compression much faster than before, and gzip-1 might not fully utilize the cards. Is it possible and reasonable to add more than one gzip compression level? What do you think about renaming this feature? I feel auto compression might need more explanation than, for example, adaptive compression. What do you (all) think about changing the default for new pools to enable this feature? It should improve performance without hurting the system much, CPU-wise. What is your opinion on this, @behlendorf?
Superseded by #7560
As part of my master's thesis at the University of Hamburg I set out to improve ZFS through compression. Now I would like to share my 3 feature branches with the community.
Description
This patch adds auto as a ZFS compression type.
zfs set compression=auto
Motivation and Context
Which compression algorithm is best for high throughput? The answer to this depends on the type of hardware in use.
If compression takes too long, the disk sits idle. If compression is faster than the disk's write speed, the CPU sits idle, because compression and writing to the disk happen in parallel.
Auto compression tries to keep both as busy as possible.
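The trade-off above can be made concrete with a simplified two-stage pipeline model. This is only a sketch: the function name `sketch_write_throughput` and the model itself are assumptions for illustration and are not taken from the patch. It treats compression and disk writes as fully overlapped stages, so the slower stage limits the rate at which uncompressed data is accepted.

```c
#include <assert.h>

/*
 * Simplified model (an assumption, not code from the patch):
 * compression and disk writes run in parallel, so the slower
 * stage bounds the overall rate.
 *
 * compress_mbps: rate at which the CPU consumes uncompressed data (MB/s)
 * disk_mbps:     raw write speed of the disk (MB/s)
 * ratio:         uncompressed size divided by compressed size
 *
 * Returns the sustained rate of uncompressed data in MB/s.
 */
static double
sketch_write_throughput(double compress_mbps, double disk_mbps, double ratio)
{
	assert(compress_mbps > 0 && disk_mbps > 0 && ratio > 0);

	/* The disk absorbs "ratio" times more uncompressed data per second. */
	double disk_limit = disk_mbps * ratio;

	return (compress_mbps < disk_limit ? compress_mbps : disk_limit);
}
```

With made-up numbers for a 100 MB/s disk: a fast compressor at 500 MB/s with ratio 2 yields 200 MB/s and is disk-bound (CPU partly idle), while a slow compressor at 50 MB/s with ratio 4 yields 50 MB/s and is CPU-bound (disk partly idle).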
The disk load is observed through the vdev queue. If the queue is empty, a fast algorithm with a low compression ratio, such as lz4, is used; if the queue is full, gzip-[1-9] can spend more CPU time to achieve a higher compression ratio.
The already existing zio_dva_throttle might conflict with the concept described above. Therefore it is recommended to deactivate zio_dva_throttle.
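The queue-driven selection described in this section could look roughly like the sketch below. All names (`sketch_pick_compress`, `SKETCH_LZ4`) are hypothetical, and the linear mapping from queue fill to gzip level is an assumption made for illustration; the patch may choose levels differently.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical return values: 0 means lz4, 1..9 mean gzip-1..gzip-9. */
#define	SKETCH_LZ4	0
#define	SKETCH_GZIP_MAX	9

/*
 * Pick a compression level from the vdev queue fill state.
 * An empty queue means the disk is starving, so use the fastest
 * algorithm. The fuller the queue, the more CPU time can be spent.
 * The linear scaling is an assumption, not the patch's policy.
 */
static int
sketch_pick_compress(size_t queued, size_t queue_max)
{
	assert(queue_max > 0);

	if (queued == 0)
		return (SKETCH_LZ4);
	if (queued >= queue_max)
		return (SKETCH_GZIP_MAX);

	/* Map 1..queue_max-1 onto gzip-1..gzip-8. */
	return ((int)(1 + (queued * 8) / queue_max));
}
```

For a hypothetical queue limit of 10, an empty queue selects lz4, a half-full queue selects gzip-5, and a full queue selects gzip-9.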
Benchmark
Copy file from tmpfs to ZFS
8 Cores:
1 Core:
Types of changes
Branch overlapping changes (feature, compress values)
The patch has read-only backward compatibility through the newly introduced SPA_FEATURE_COMPRESS_AUTO feature. The feature activation procedure is the same as in my other code branches.
Because BP_GET_COMPRESS() has a limited namespace (128 values), the zio_compress enum is split in two: the first part holds values valid for both block pointers and datasets, the second part holds values valid for datasets only. This is an alternative suggestion to #3908.
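A minimal sketch of such a two-part enum follows. The identifiers are hypothetical stand-ins and the member list is shortened for illustration; it does not reproduce the real zio_compress enum or its values. A marker entry separates the values that may appear in a block pointer from the dataset-only values, and a compile-time check guards the 128-value limit.

```c
#include <assert.h>
#include <stdbool.h>

/* Number of values the block pointer field can encode (from the text). */
#define	SKETCH_BP_COMPRESS_VALUES	128

/* Hypothetical, shortened stand-in for the zio_compress enum. */
enum sketch_zio_compress {
	/* Part 1: valid in block pointers and as dataset property values. */
	SKETCH_COMPRESS_OFF = 0,
	SKETCH_COMPRESS_LZ4,
	SKETCH_COMPRESS_GZIP_1,
	SKETCH_COMPRESS_BP_FUNCTIONS,	/* marker: end of part 1 */

	/* Part 2: dataset property values only, never stored on disk. */
	SKETCH_COMPRESS_AUTO,
	SKETCH_COMPRESS_FUNCTIONS	/* marker: end of enum */
};

/* Part 1 must fit into the block pointer's compress field. */
_Static_assert(SKETCH_COMPRESS_BP_FUNCTIONS <= SKETCH_BP_COMPRESS_VALUES,
    "block pointer compress values exceed the available namespace");

/* True if the value may be written into a block pointer. */
static bool
sketch_compress_fits_bp(int c)
{
	return (c >= 0 && c < SKETCH_COMPRESS_BP_FUNCTIONS);
}
```

Under this layout, a dataset-only value such as the hypothetical `SKETCH_COMPRESS_AUTO` would always be resolved to a part 1 value before a block is written.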