-
Notifications
You must be signed in to change notification settings - Fork 4.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add zlib-ng library, optimized for ARM64 #61883
Conversation
Tagging subscribers to this area: @dotnet/area-system-io-compression Issue Detailshttps://github.com/zlib-ng/zlib-ng From the readme:
Note that Mark Adler is one of the top contributors to that repo: https://github.com/zlib-ng/zlib-ng/graphs/contributors MotivationWe currently consume two zlib versions:
LicenseThe zlib-ng library is provided with the zlib License: https://github.com/zlib-ng/zlib-ng/blob/develop/LICENSE.md TestingI tested my changes in both my x64 and arm64 machines. PerformanceThe perf comparison in my arm64 machine zlib-ng is faster than zlib for most cases: DeflateStream
GZipStream
ZLibStream
We didn't have a ZLibStream benchmark. I created one so I could use it for this change. dotnet/performance#2152
|
Need help getting the 3rd commit 88b80e5 reviewed closely, since I had to make some changes in cmake, Also, I am including their license file, the original readme file from the zlib-ng repo. I am imitating one of the other two zlibs, since we also included some non-compilable files. |
Some of these changes look like a good candidate for upstreaming into the zlib-ng repo. |
Can we delete this copy? The readme for zlib-ng says that it includes all Intel optimizations. |
You shouldn't need zlib as a fallback. Zlib-ng has optimizations for several architectures (including x86/x64/arm/aarch64). We have runtime cpu feature detection and use non-optimized functions if the hardware doesn't support it. In other words, zlib-ng has its fallbacks for all other architectures. |
This is windows only, right? i.e. on Unix, we still use zlib development package from package management? Please add a version file similar to:
|
It would be better to use zlib-ng on all platforms because then you get the speed benefits on all platforms. |
Taking this a step further, this is now the third set of zlib files we have in libraries. Can we delete the other two and just use this one?
Historically we shied away from this, preferring to use the zlib inbox; we had to ship it for Windows because Windows didn't have one, but the Linux distros we target mostly do. However, at this point, with the improvements from Intel and CloudFlare, covering both x64 and ARM, with more platform targets varying in what they provide, and with the consistency benefits that come from doing the same thing across platforms, we should consider changing course and having the sources built into S.IO Compression.Native on all relevant platforms.
+1. In addition to the general goodness of contributing back, for our own maintainability benefits we want to minimize our diff from upstream. |
@@ -65,6 +65,13 @@ if (${CLR_CMAKE_HOST_ARCH} STREQUAL "x86") | |||
add_compile_options(/Gz) | |||
endif () | |||
|
|||
if(${CLR_CMAKE_HOST_ARCH} STREQUAL arm OR ${CLR_CMAKE_HOST_ARCH} STREQUAL arm64) | |||
add_compile_options(-DZLIB_COMPAT=0) # Must be explicitly set |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't want a zlib compatible API? What does this actually affect?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@@ -0,0 +1,19 @@ | |||
(C) 1995-2013 Jean-loup Gailly and Mark Adler |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Presumably we also need to update https://github.com/dotnet/runtime/blob/main/THIRD-PARTY-NOTICES.TXT ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice work so far.
@@ -10,7 +10,33 @@ if (GEN_SHARED_LIB) | |||
include (GenerateExportHeader) | |||
endif() | |||
|
|||
if(${CLR_CMAKE_HOST_ARCH} STREQUAL x86 OR ${CLR_CMAKE_HOST_ARCH} STREQUAL x64) | |||
if(${CLR_CMAKE_HOST_ARCH} STREQUAL arm OR ${CLR_CMAKE_HOST_ARCH} STREQUAL arm64) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This commit should probably be broken into two. One that only adds the zlib-ng files. And another that connects dotnet runtime build system to use them. It would make it easier to review because then we can see what build system changes are needed for zlib-ng. But I realize at this point it would be difficult to split it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point. I'll do that.
zlib-ng/gzlib.c | ||
zlib-ng/gzread.c | ||
zlib-ng/gzwrite.c | ||
# Exclude files to fix error LNK2019: unresolved external symbol 'gzFile' open referenced in functions gz_open, gz_close, gz_load, gz_comp |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should find out why it is giving that linker error. If dotnet does not use any of the gz_* functions then you don’t need those sources and can also set -DWITH_GZFILEOP=0.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not just use add_subdirectory to add our CMake project?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm looking into this again now that we're back from the Thanksgiving holidays.
I'm exploring the use of add_subdirectory
and have some questions, but I'd like to first rebase this PR with the latest bits in main, because there was a big refactoring of these folders. I'll ask you the questions when I refresh the PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
first rebase this PR with the latest bits in main
Maybe you can put zlib-ng under src/native/external as part of this rebase as discussed in
#61883 (comment), so that we do not have to move it later?
|
||
/* Added ZLIB_COMPAT check to choose between the two available zconf files */ | ||
#if defined(ZLIB_COMPAT) | ||
# include "zconf.h.in" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe this should be zconf.h. The zconf.h.in file is input to CMake and configure.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We typically don’t conditionally include zconf.h. Instead we conditionally include zlib.h/zlib-ng.h in our projects based on ZLIB_COMPAT.
Bypassing zlib-ng's own build system will disable most of the architecture-specific optimizations. This pull request is also missing most of the essential files. |
It's great to see the improvements in your Arm64 numbers, but some of the regressions are significant. For Optimal | TestDocument.pdf, is the output size the same (not sure whether the compression level corresponding to Optimal is part of the spec or not). If not, can we repro this regression without .NET in the picture - perhaps that would be worth an issue opened in the upstream repo. If we become a consumer of it, we should certainly try to help improve it. |
It is worth noting that the compression levels used in zlib-ng are not 1-1 with zlib. This is because zlib-ng uses different/Intel compression algorithms that are not available in zlib. In some cases speed is favored over compression as in level 1. Here are some zlib-ng benchmarks for anybody who is interested: As @mtl1979 mentioned above, that by bypassing zlib-ng's CMake, it might not have been built with some optimizations configured. |
Preferred way of including external CMake projects is using Mixing "static" libraries and "dynamic" libraries is discouraged as on most Unix-systems those use different compiler and linker flags and on Windows, the linker can find matching symbols in unexpected places. This also means that order of the libraries (static vs. dynamic/shared) passed to linker is significant. Build system of zlib-ng currently doesn't support compiling static library as position-independent code (PIC), but the compiler parameters can be passed manually if needed. This works only if no other library in the project (or parent projects) includes or requires zlib-ng. |
Marking this PR as draft while I address the feedback. @mtl1979 @nmoinvaz thank you for your help. I am not very experienced in cmake, so your expertise is extremely valuable here so we can get it to build properly. |
@carlossanlop If you need something added upstream to zlib-ng, you can ask me or @nmoinvaz as I have been contributor for zlib-ng since early days and @nmoinvaz has reviewed a lot of my commits and vice versa... |
In PR #61958, I was made aware of a zlib implementation located in @lambdageek, @akoeplinger, do you know if it's possible making mono consume the source code related to this PR? |
@carlossanlop Like I said earlier, when a library is own project or target, it can be consumed by multiple dependent targets. cmake will automatically add dependency and link to correct files when using the target name instead of library name or source files. At runtime, the full path to the shared/dynamic library must be known by the library or program requiring it. If not known at runtime, the default library search path is traversed to find matching library. For static libraries, runtime search obviously is not necessary. |
@carlossanlop I think it should be fine to replace the one in the mono build with zlib-ng. as far as I know (@akoeplinger please correct me) we just have some old copy of upstream zlib that we update at irregular intervals. |
Thanks for confirming, @lambdageek. My next question: Is |
It should be src/native (after the other PR merges). |
@carlossanlop, basically it's am11@4cbed64; which will shave off ~11K lines from the repo. |
I am wondering whether it would make sense to put all vendored unmanaged libraries in dedicated directory, for example src/native/external. It is a common practice in projects out there and it would make reusing zlib in mono a bit more natural. |
Makes sense to put all vendor code under a dedicated directory. We have brotli, libunwind, rapidjson, and zlib in the runtime repo (and perhaps few others in runtimelab). If it is not urgent, I can take a look once this PR is completed. |
Draft Pull Request was automatically closed for inactivity. Please let us know if you'd like to reopen it. |
I'd like us to explore the possibility of consuming zlib-ng, either as one of our zlib implementations, or as the sole one.
https://github.com/zlib-ng/zlib-ng
From the readme:
Note that Mark Adler is one of the top contributors to that repo: https://github.com/zlib-ng/zlib-ng/graphs/contributors
Motivation
We currently consume two zlib versions:
The zlib-ng library is optimized for arm and arm64, so we can take advantage of it, and leave the zlib original library as fallback for all other architectures.
License
The zlib-ng library is provided with the zlib License: https://github.com/zlib-ng/zlib-ng/blob/develop/LICENSE.md
It's similar to the Booster licenses from madler/zlib (original zlib flavor): https://github.com/madler/zlib/contrib/dotzlib/LICENSE_1_0.txt
And from jtkukunas/zlib (intel): https://github.com/jtkukunas/zlib/contrib/dotzlib/LICENSE_1_0.txt
Testing
I tested my changes in both my x64 and arm64 machines.
All unit test passed.
Performance
The perf comparison in my arm64 machine zlib-ng is faster than zlib for most cases:
DeflateStream
GZipStream
ZLibStream
We didn't have a ZLibStream benchmark. I created one so I could use it for this change. dotnet/performance#2152