-
-
Notifications
You must be signed in to change notification settings - Fork 686
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Integrate modern optimized zlib #416
Comments
Also pigz (a parallel version of zlib): |
I've heard good things about pigz (Chris Rorden is a big fan). I'm initially a bit concerned that the source code is only available as a tarball from 2 years ago. It also feels like it may work well in posix environments, but I don't see how well it is supported across platforms. |
You might find the my comparison of different GZ compression strategies relevant. For that test I intentionally used a slow hard disk. With zlib you can write a compressed file direct to disk. Traditionally, with pigz we need to write the raw data to disk and then pigz loads this file and compresses it. Therefore, one needs to wonder if the parallel performance of pigz can be offset by the disk IO. The link also shows that if you have a modern version of pigz on Unix you can use a named-pipe to avoid writing the whole file to a slow disk. In my experience, pigz works fine on Windows, though I don't think you can use the named-pipe trick. The fact that it has not changed much in the last years may just be a sign it is mature. The original deflate/gz format dates back to a time when memory was limited and multiple cores was exotic. It has achieved widespread use, but innovations will really come from new standards that leverage modern computers. zstd is extremely impressive, in particular for medical datasets when it is combined with a byte-shuffling filter. In my testing, I always found Cloudflare's zlib faster than zlib-ng. However, this may have changed or may have been due to compiler settings. Cloudflare seems stuck at an older version of zlib (1.2.8). |
Discussion regarding cloudflare's version compared to zlib-ng: zlib-ng/zlib-ng#42 |
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. |
Ping, still relevant |
@gdevenyi the page you cite compares zlib-ng to Cloudflare. It is useful, but I think it is a little out of date. More recent patches to Cloudflare provide good MacOS support. When I looked into this (perhaps a year ago) the benefit of Cloudflare was that it is faster. However, Cloudflare has two limitations: it will fail on old CPUs (more than 10 years old) and it has different licensing. Specifically, to get optimal performance from CloudFlare you must build with a a bit of [GPL code]. (https://github.com/cloudflare/zlib/blob/gcc.amd64/contrib/amd64/crc32-pclmul_asm.S) which makes it incompatible with some projects. If you must retain the pure zlib license, you should stick with zlib-ng, build CloudFlare without requiring crc32-pclmul_asm.S or make a clean-room replacement for that code. The final option would not be too hard, as Intel provided open source sample code available on Github when they introduced these instructions. |
Indeed @neurolabusc zlib-ng intends to retain the zlib licence, and pulls from all available zlib forks where licencing allows (intel, cloudflare, chromium, others?) |
This issue has been automatically marked as stale because it has not had recent activity. Thank you for your contributions. |
@gdevenyi I found that the Chrome BSD-licensed CRC SIMD code works at least as well as the GPL-license CRC. If you want, you can test my CloudFlare zlib clone where the GPL code has been replaced. If others validate this, I will make a pull request for the main repository. I think this will provide the performance of Cloudflare without the restrictive GPL license. This would allow us to use these libraries in many non-GPL tools in our field, like FSL. I looked at zlib-ng, and it now also includes a SIMD CRC implementation. So I think it is time to do another comparison to see whether zlib-ng has pulled ahead of CloudFlare zlib. This benefits all gz compression, but the effects are most profound in pigz, since CRC must be done in serial, the parallel pigz is fighting Amdahl's law. |
Wow this is great. The zlib-ng project is definitely putting together infrastructure for a nice clean modern zlib with multi-arch support and continuous integration. I haven't had a chance to look at your implementation details @neurolabusc, but I did dig up zlib-ng's CRC implementation commit, zlib-ng/zlib-ng@3684659 It looks like the implementation comes from an intel patched version: https://github.com/jtkukunas/zlib but they're both based on the same whitepaper. I think the most important thing to check is if the intel version works properly on AMD... |
I'm going to fork/update this: https://github.com/jsnell/zlib-bench add your version, and generate some comparisons. |
Results:
|
Followup: |
@gdevenyi - curious, the E5645 should be the first generation to support CLMUL instructions. I had realized that my CMake needed a bit more work (e.g. 32-bit operating systems), but your system is perfect as it is right on the cusp of supporting these functions. Since my solution uses the Chrome browser code, I will look at their code to detect this. I agree, zlib-ng has a great CMake. At the very least, we should improve the CloudFlare CMake, though it may prove that recent updates make zlib-ng a better overall choice than Cloudflare. 1.) Can you include the output of the zlib-ng CMake on your E5645, in particular the following (I am interested to see if it enables PCLMULQDQ). Also, can you tell me if your E5645 operating system is set up as 32 or 64-bit:
|
|
|
|
Has the same illegal instruction issue.
|
I would have thought Westmere will report failure to the last line. |
What's the timeframe for 5.2? |
Weeks. Currently scheduled for Jan 29th. |
@gdevenyi we are working on the first ITK 5.2 release candidate, but we will have multiple release candidates -- it is reasonable to integrate this for the 5.2 release. |
I haven't started trying yet, but this will probably bug us out as well, zlib-ng/zlib-ng#827 |
This issue has been automatically marked as stale because it has not had recent activity. Thank you for your contributions. |
Perhaps this is addressed by:
|
Yes, should probably try to investigate this again since 2.0.4 has been released. |
After digging into this today, the zlib module in ITK uses the git subtree feature to track changes in a such a way where it is beyond my ability to successfully switch to zlib-ng. I always end up at:
I suspect this might be due to the current ITK zlib being a diverged fork of the official zlib while zlib-ng is a different diverged fork. |
Perhaps we should create a new root commit for |
Yes, if the project has a disjoint Git commit history from the original zlib. |
@gdevenyi please push your current changes to |
There's essentially nothing done, but here it is: |
That's a good start:
|
I had tried that and it hadn't sorted the overlaps :)
Good catch 👍
This is probably what actually solves the overlaps problem :)
What exactly is the purpose of purging all these extra files? I know zlib-ng has now arch-specific files and such that I can't directly test the need for building, but may mean ITK can't build this properly on other arches. Thanks very much for your help! |
We exclude those to minimize the size of the imported content. We aren't developing zlib-ng or building it for external distribution, so none of the associated files is needed. Tests and documentation are not needed. Packaging files are not needed. Etc. |
An exclude option for the UpdateFromUpstream sounds pretty nice right now :) |
With third party modules, do we typically build with tests? |
We typically remove tests entirely when importing a third-party library (as Brad King suggested above). So we don't test third party libraries, we assume they are fine. |
Related to #348 zlib-ng is a project attempting to modernize the zlib codebase:
https://github.com/zlib-ng/zlib-ng
The text was updated successfully, but these errors were encountered: