-
-
Notifications
You must be signed in to change notification settings - Fork 4
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support for RISC-V Vector Extension #3
base: main
Are you sure you want to change the base?
Conversation
The long awaited RISC-V Vector Extension (RVV) is finally in ratification.[1] With this we add support for RVV optimized png filter types to libpng. Development and performance evaluation of the filter type implementations (up, sub, avg and paeth) was done on Allwinner D1 with RVV 0.7.1 using RVVRadar. Details on that can be found in an article published on github.[2] The integration in libpng was done very similarly as for other architectures (ARM NEON, MIPS MSA, ...): There exists a configure switch "riscv-vector" which can be set to no/off, check, api, yes/on. And there is also run-time checking code (like for ARM NEON, MIPS MSA, ...). More important are the differences: 1. We introduce a separate configure switch "riscv-vector-compat" 2. We use inline assembler instead of compiler intrinsics Both is done for the same reason: To support multiple RVV versions. As mentioned above, RVV 1.0 is under ratification now. But there are earlier draft versions which are sporadically already implemented in real existing hardware. One example is the Allwinner D1 with RVV 0.7.1 which was used for performance evaluation and optimization of these filter implementations.[2] We therefore decided to support the following RVV versions: 0.7.1, 0.8, 0.9, 0.10 and 1.0. Unfortunately, there is no easy way to detect the RVV version supported by the toolchain. It was therefore necessary to add an additional configure switch "riscv-vector-compat" with following behavior: * not-set/off(default): RVV release 1.0 * set to "0.10": RVV draft 0.10 * set to "0.9": RVV draft 0.9 * set to "0.8": RVV draft 0.8 * set to "0.7.1": RVV draft 0.7.1 Furthermore compiler intrinsics were not available for all supported RVV versions. For this reason we had to use inline assembler instead of compiler intrinsics. The filter implementations for all supported RVV versions were tested in the following setups: RVV | Branch of | Test system Version | riscv-gnu-toolchain[3] | --------------------------------------------------------------------- 1.0 | "basic-rvv" | qemu_sifive [4] 0.10 | "rvv-intrinsic" | qemu_sifive [4] 0.9 | "rvv-0.9.x" | Allwinner D1 (binary compatible) 0.8 | "rvv-0.8.x" | Allwinner D1 (binary compatible) 0.7.1 | "rvv-0.7.1" | Allwinner D1 (RVV 0.7.1) References: ----------- [1] https://riscv.org/announcements/2021/12/riscv-ratifies-15-new-specifications/ [2] https://github.com/mschlaegl/libpng_rvv-doc/blob/main/README.md [3] https://github.com/riscv-collab/riscv-gnu-toolchain [4] https://github.com/sifive/qemu/tree/rvv-1.0-upstream-v10
This generally looks good to me. I have some questions. What kind of deprecation strategy should we (if any) employ for the older RVV implementations? Is it logical to support the draft RVV implementations - i.e. what's the expected usage? Since there is a maintenance overhead, let's try to focus on the most value for the least effort? Regarding run-time detection, I'd prefer if we make it as easy as possible (automatic) for users to get the best performance. I recently merged a change to do this for NEON since this now appears to be standardised (compared to when it was first introduced several years ago). How much configuration do we need and can we make the default "just work"? Is there any way we can introduce CI testing for this architecture? This repo doesn't have any CI setup yet, I plan to do it over the next few days, but the number of permutations we need to test every configuration might be extensive. |
Thanks! The general motivation for libpng_rvv is to somehow push RVV in FOSS by showing its benefits. RVV is generally still very young. Support for it is not yet included in vanilla toolchains or kernels, but there are already patches waiting to be integrated. It seems that every project is waiting for another to integrate it. So the idea was to start somewhere where it brings benefits. :-) Regarding your questions:
(Finally, in case it's not clear yet, I of course plan to continue taking care of the RVV part of libpng in the future.) |
Update/Correction: I took a closer look at 1fd891b (Add compile-time detection of neon instruction set.) BTW: Your neon change only considers cmake, but there are no corresponding changes in the rest of the build system e.g. autotools (configure.ac, ...) or am I missing something? |
I need to think about this a bit more, the immediate need was I don't think runtime I would like to think about it from the user's point of view. Most users want a default for (1) development build (optimisations off) (2) deployment build (optimisations on). Most users don't care about the specific optimisations being performed, but usually there are some flags to control the specifics. But for a "library" I think we want a convenient default for both (1) and (2) above. Neon is established enough that handling (1) and (2) above should be fairly straight forward. By the way (2) shouldn't mean "turn on all the optimisations" just the way For people building the library, for a given platform on a given OS, we should be able to do reasonably good defaults for a high performance Regarding your changes:
|
@@ -163,6 +163,30 @@ if(CMAKE_SYSTEM_PROCESSOR MATCHES "mipsel*" OR | |||
endif() | |||
endif() | |||
|
|||
# Set definitions and sources for RISC-V. | |||
if(CMAKE_SYSTEM_PROCESSOR MATCHES "^riscv*") | |||
set(PNG_RISCV_VECTOR_POSSIBLE_VALUES check on off) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think adding a detect option here makes sense - detect the best way for the current compiler, and it should be the default. If it's not possible, I accept we can't do it. But for users, it won't be easy.
It's not clear why this hasn't been merged. RISC-V is an important development in processor architecture and failing to support it is a significant failure of libpng; why is that? This issue has been sitting in the queue for 18 months now, yes, I understand code correctness is important, but honestly, why is this taking so long? All three (or more) build systems are there and are used, so they have to be supported! |
@jbowler do you have time to review this PR? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a source-code only review; I have not even done a "gh pr".
This is ONLY a review of the build system changes and is based on my understanding of glennrp/libpng; not libpng. As such the changes seem correct and complete (or, at least, sufficiently complete). Assuming existing CI tests pass (i.e. other architectures) the changes should be safe insofar as they won't break anything other than RISC-V itself.
The PR includes generated files (configure, Makefile.in). These should not be included (the maintainer regenerates them).
The PR does not include changes to scripts/pnglibconf.dfa. These should be provided to allow the hardware stuff to be disabled at compile (build/configure) time via pngusr.dfa or DFA_XTRA
riscv/filter_vector_intrinsics.c includes <stdio.h> and <stdint.h> <stdint.h> is not an ANSI-C header so must not be there (see the recent change to the MIPS files) and stdio.h is not available on standalone (unhosted) systems. Both seem to be spurious.
Reading /proc/cpuinfo sucks; it's not "widely available", the whole body of code is going to be #ifdef disabled on Android (or maybe it will crash; that is what used to happen), MacOS, all the BSDs all legacy Unix systems etc.
Apparently the code won't work at all on most operating systems and, worse, it looks like the commit itself will cause compilation of libpng to fail on a RISC-V BSD. If my analysis is correct this is a MUST fix.
Bottom line: any commit should not break systems which currently build and any system must build without additional configuration. So hardware optimizations must build correctly on every system where they default to be on.
EDIT: stuff might build if linux is included in pngpriv.h, but I'm not sure whether this would be acceptable to the maintainer and it's still the case that the "option" setting would cause a #error if I'm right.
There are a few things in there I didn't notice before, so I requested changes. I'm not sure about my analsysis; RISC-V automated CI tests would help as they would allow automatic test of RISC-V builds on supported operating systems, well, Linux, BSD (I assume RISC-V supports both), Windows (if supported), MacOS (if supported). As it is unless you are running qemu for check-in tests yourself it's only possible to do a cross-build (which would detect the issues I identified). |
Does GitHub Actions CI support RISC-V? - Is it easy to run CI in QEMU? |
From the link you found and from https://docs.github.com/en/actions/automating-builds-and-tests/about-continuous-integration it looks like RISC-V is a work-in-progress but that "self hosted" runners might be provided by RISC-V guys pretty soon now. @mschlaegl might care to comment. I've never implemented any of this stuff; I've just seen what happens to my own pull requests when it is implemented and it struck me as pretty good news from a maintainer point of view. Sometimes tests fail for reasons apparently unrelated to the pull request itself but maintainers can ignore the failures if necessary (so far as I could see from the requestor viewpoint.) I've seen CI tests set up for a mixture of operating systems, Ubuntu, Windows (MinGW), MacOS. I don't know about the underlying architectures used, for example I couldn't tell if MacOS was ARM or x86 and I didn't see documentation of the Ubuntu arch. Ubuntu is certainly supported on multiple architectures that have hardware optimizations in libpng; x86 ARM, ARM64, apparently Riscv64 (not 32), possibly not MIPS.
I've assumed that it is docker-like, so there always is an emulator there (it's not running directly on real hardware). QEMU is just relevant for local testing, docker would work too (I"ve run docker on my x86 NAS to host an ARM implementation of a database). Automating the stuff, however, is pretty much essential; when I was doing libpng development I was also doing testing on a subset of the systems and that was, effectively, a full time job. At that time Glenn was still alive and he was full time too. I don't know how much Cosmin has automated but it doesn't use the github "continuous integration" stuff. |
I am strongly in support of CI, it's more a matter of how we can implement it for RISC-V. |
If you might find it useful, I were able to build GitHub runner for RISC-V. Step command as well as actions which depend on Node.js 20.x works fine (tried |
(corresponds to
https://sourceforge.net/p/png-mng/mailman/message/37402426/ on the png-mng-implement mailing list and
pnggroup/libpng#405 pull request glennrp/libpng)
Hello,
The long awaited RISC-V Vector Extension (RVV) is finally in
ratification.1 This pull request includes two changes:
applying the main patch
It supports RVV versions: 0.7.1, 0.8, 0.9, 0.10 and 1.0.
Development and performance evaluation of the filter type
implementations (up, sub, avg and paeth) was done on Allwinner D1 with
RVV 0.7.1 using RVVRadar. Details on that can be found in an article
published on github 2.
The integration in libpng was done very similarly as for other
architectures (ARM NEON, MIPS MSA, ...): There exists a configure
switch "riscv-vector" which can be set to no/off, check, api, yes/on.
And there is also run-time checking code (like for ARM NEON,
MIPS MSA, ...).
More important are the differences:
Both is done for the same reason: To support multiple RVV versions.
As mentioned above, RVV 1.0 is under ratification now. But there are
earlier draft versions which are sporadically already implemented in
real existing hardware. One example is the Allwinner D1 with RVV 0.7.1
which was used for performance evaluation and optimization of these
filter implementations.2 We therefore decided to support the
following RVV versions: 0.7.1, 0.8, 0.9, 0.10 and 1.0.
Unfortunately, there is no easy way to detect the RVV version
supported by the toolchain. It was therefore necessary to add an
additional configure switch "riscv-vector-compat" with following
behavior:
Furthermore compiler intrinsics were not available for all supported
RVV versions. For this reason we had to use inline assembler instead
of compiler intrinsics.
The filter implementations for all supported RVV versions were tested
in the following setups:
References:
Footnotes
https://riscv.org/announcements/2021/12/riscv-ratifies-15-new-specifications/ ↩
https://github.com/mschlaegl/libpng_rvv-doc/blob/main/README.md ↩ ↩2
https://github.com/riscv-collab/riscv-gnu-toolchain ↩
https://github.com/sifive/qemu/tree/rvv-1.0-upstream-v10 ↩ ↩2