RISC-V's outreach has moved forward at a blinding pace over the past month. In June, we saw the opening of the inaugural RISC-V Summit in Europe, RISC-V Day in Tokyo, the return of SiFive's tour in mainland China, as well as the Chinese Academy of Sciences' first RuyiSDK Open Day. We would also like to take this moment to congratulate Dr. Qiu Ji 邱吉 at the PLCT Lab for becoming a RISC-V Ambassador at RISC-V International.
Still, we are reminded of RISC-V founder Dr. Krste's comment that the software ecosystem is of utmost importance to the success of an instruction set architecture. We must recognise that, despite the unstoppable momentum of open source software, commercial and industrial software support for RISC-V still leaves much room for further development. To realise this development, we will bring together more talent and resources. In the next three to five years, we plan to port and optimize the top 1000 software solutions from 18 foundational and key industries. Let's work together and make it happen.
For those who have been with PLCT for over a year, June marked a moment of farewell. Starting in late June, the PLCT Lab will encourage and guide young engineers to find new venues for further self-development. The organizational adjustment will affect about half of our employees and will last into Q3 2023. While there may be pain and sadness, this adjustment also marks, for many, a ticket into a promising new future.
- Chinese: Successful Conclusion to the Inaugural "RuyiSDK Open Day" and the Launch of RuyiSDK V1
- Chinese: Key Progress Realized for the Institute of Software's Box64 RISC-V Port
- Chinese: Congratulations! Dr. Qiu Ji Becomes RISC-V International's New Ambassador
- Chinese: People | Liu Xin: My Experience as Release Manager for openEuler RISC-V
- Updates from the upstream.
- 4576325: [riscv][builtins] Split CallApiCallback into generic and optimized variants | https://chromium-review.googlesource.com/c/v8/v8/+/4576325
- 4593016: [riscv][wasm-gc] Inlining into JS: Lower traps to conditional jump to trap call | https://chromium-review.googlesource.com/c/v8/v8/+/4593016
- 4583609: [riscv] Fix pointer compression | https://chromium-review.googlesource.com/c/v8/v8/+/4583609
- 4596402: [riscv][simulator] Fix test error of vfwredosum in riscv32 | https://chromium-review.googlesource.com/c/v8/v8/+/4596402
- 4606688: [riscv][builtins] Port HandleApiCall to CSA | https://chromium-review.googlesource.com/c/v8/v8/+/4606688
- 4608527: [riscv][heap] Move age from BytecodeArray into SharedFunctionInfo | https://chromium-review.googlesource.com/c/v8/v8/+/4608527
- 4612048: [riscv][compiler] Add Adapter template argument to InstructionSelector | https://chromium-review.googlesource.com/c/v8/v8/+/4612048
- 4624977: [riscv][SFI] Fix the store size of SharedFunctionInfo::Age field | https://chromium-review.googlesource.com/c/v8/v8/+/4624977
- 4630556: [riscv][compiler] Generalize InstructionSelectorT for Turboshaft (part 1) | https://chromium-review.googlesource.com/c/v8/v8/+/4630556
- New features.
- 4653674: [sandbox][riscv] Port sandbox | https://chromium-review.googlesource.com/c/v8/v8/+/4653674
- 4323697: [riscv32]Implement simd for liftoff and turbofan | https://chromium-review.googlesource.com/c/v8/v8/+/4323697
- Reviewed jdk-mainline pull requests.
- openjdk/jdk#14189 (8308977: gtest:codestrings fails on riscv)
- openjdk/jdk#14138 (8308817: RISC-V: Support VectorTest node for Vector API)
- openjdk/jdk#14166 (8308915: RISC-V: Improve temporary vector register usage avoiding the use of v0)
- openjdk/jdk#14197 (8308997: RISC-V: Sign extend when comparing 32-bit value with zero instead of testing the sign bit)
- openjdk/jdk#14203 (8308765: RISC-V: Expand size of stub routines for zgc only)
- openjdk/jdk#14214 (8303417: RISC-V: Merge vector instructs with similar match rules)
- openjdk/jdk#14256 (8309254: Implement fast-path for ASCII-compatible CharsetEncoders on RISC-V)
- openjdk/jdk#14279 (8309332: RISC-V: Improve PrintOptoAssembly output of vector nodes)
- openjdk/jdk#14299 (8309405: RISC-V: is_deopt may produce unaligned memory read)
- openjdk/jdk#14288 (8308726: RISC-V: avoid unnecessary slli in the vectorized arraycopy stubs for bytes)
- openjdk/jdk#14308 (8309419: RISC-V: Relax register constraint for AddReductionVF & AddReductionVD nodes)
- openjdk/jdk#14309 (8309418: RISC-V: Make use of vl1r_v & vfabs_v pseudo-instructions where appropriate)
- riscv/riscv-crypto#327 (Fix typo in riscv-crypto-vector-element-groups.adoc)
- Reviewed/Merged backported patches for the riscv-port-jdk17u repo.
- openjdk/riscv-port-jdk17u#56 (8307651: RISC-V: stringL_indexof_char instruction has wrong format string)
- openjdk/riscv-port-jdk17u#57 (8307446: RISC-V: Improve performance of floating point to integer conversion)
- openjdk/riscv-port-jdk17u#58 (8308277: RISC-V: Improve vectorization of Math.sqrt() on floats)
- openjdk/riscv-port-jdk17u#59 (8301628: RISC-V: c2 fix pipeline class for several instructions)
- openjdk/riscv-port-jdk17u#60 (8301852: RISC-V: Optimize class atomic when order is memory_order_relaxed)
- openjdk/riscv-port-jdk17u#61 (8301153: RISC-V: pipeline class for several instructions is not set correctly)
- openjdk/riscv-port-jdk17u#62 (8301818: RISC-V: Factor out function mvw from MacroAssembler)
- openjdk/riscv-port-jdk17u#63 (8305008: RISC-V: Factor out immediate checking functions from assembler_riscv.inline.hpp)
- openjdk/riscv-port-jdk17u#64 (8302289: RISC-V: Use bgez instruction in arraycopy_simple_check when possible)
- openjdk/riscv-port-jdk17u#65 (8305728: RISC-V: Use bexti instruction to do single-bit testing)
- openjdk/riscv-port-jdk17u#66 (8301033: RISC-V: Handle special cases for MinI/MaxI nodes for Zbb)
- openjdk/riscv-port-jdk17u#67 (8308997: RISC-V: Sign extend when comparing 32-bit value with zero instead of testing the sign bit)
- openjdk/riscv-port-jdk17u#68 (8309427: [riscv-port-jdk17u] Remove unused RoundDoubleModeV C2 node)
- JDK committer nomination (Call for Vote):
- https://mail.openjdk.org/pipermail/jdk-dev/2023-June/007916.html (CFV: New JDK Committer: Feilong Jiang)
- https://mail.openjdk.org/pipermail/jdk-dev/2023-June/007917.html (CFV: New JDK Committer: Yadong Wang)
- Submitted and merged JDK-mainline patches.
- openjdk/jdk#14256 | (8309254: Implement fast-path for ASCII-compatible CharsetEncoders on RISC-V)
- openjdk/jdk#14309 | (8309418: RISC-V: Make use of vl1r.v & vfabs.v pseudo-instructions where appropriate)
- openjdk/jdk#14535 | (8310276: RISC-V: Make use of shadd macro-assembler function when possible)
- Backport jdk17u:
- openjdk/riscv-port-jdk17u#63 | (8305008: RISC-V: Factor out immediate checking functions from assembler_riscv.inline.hpp)
- openjdk/riscv-port-jdk17u#65 | (8305728: RISC-V: Use bexti instruction to do single-bit testing)
- openjdk/riscv-port-jdk17u#66 | (8301033: RISC-V: Handle special cases for MinI/MaxI nodes for Zbb)
- Other items:
- -XX:+UseUnalignedAccesses performance study: https://cr.openjdk.org/~dzhang/TestUseUnalignedAccesses/
- Submitted and merged JDK-mainline patches.
- openjdk/jdk#14279 | 8309332: RISC-V: Improve PrintOptoAssembly output of vector nodes
- openjdk/jdk#14308 | 8309419: RISC-V: Relax register constraint for AddReductionVF & AddReductionVD nodes
- openjdk/jdk#14510 | 8310192: RISC-V: Merge vector min & max instructs with similar match rules
- openjdk/jdk#14702 | 8311074: RISC-V: Fix -Wconversion warnings in some code header files
- Backport jdk17u:
- openjdk/riscv-port-jdk17u#68 | 8309427: [riscv-port-jdk17u] Remove unused RoundDoubleModeV C2 node
- openjdk/riscv-port-jdk17u#67 | 8308997: RISC-V: Sign extend when comparing 32-bit value with zero instead of testing the sign bit
- Upstreamed patches.
- [ValueTracking] Guaranteed well-defined if parameter has a dereferenceable_or_null attribute https://reviews.llvm.org/D153945
- [InstCombine] Add !noundef to match behavior of violating assume https://reviews.llvm.org/D153400
- [CVP] Don't process sext or ashr if value state including undef https://reviews.llvm.org/D152774
- [SCCP] Replace new value's value state with removed value's https://reviews.llvm.org/D152337
- [SCCP] Skip computing intrinsics if one of its args is unknownOrUndef https://reviews.llvm.org/D152499
- [LoopIdiom] Freeze BitPos if !isGuaranteedNotToBeUndefOrPoison https://reviews.llvm.org/D151690
- [CodeGenPrepare][RISCV] Remove asserting VH references before erasing the dead GEP https://reviews.llvm.org/D153194
- [RISCV] Add support for XCVmac extension in CV32E40P https://reviews.llvm.org/D152821
- [RISCV] Add support for XCVbitmanip extension in CV32E40P https://reviews.llvm.org/D152915
- [RISCV] Add a pass to merge moving parameter registers instructions for Zcmp https://reviews.llvm.org/D150415
- [RISCV] Fold special case (xor (setcc constant, y, setlt), 1) -> (setcc y, constant + 1, setlt) https://reviews.llvm.org/D152128
- [flang] Rename remaining __Fortran_PPC_intrinsics to __ppc_intrinsics https://reviews.llvm.org/D153703
- [openmp] remove initializeRewriteSymbolsLegacyPassPass https://reviews.llvm.org/D153704
- [Flang] Map ieee_fma intrinsic to llvm.fma https://reviews.llvm.org/D151872
You may find more of our code review by searching under the names of the authors above.
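The setcc fold listed above (D152128) rests on a small integer identity: inverting "constant < y" gives "y <= constant", which equals "y < constant + 1" as long as constant + 1 does not overflow. The following is a minimal sketch that checks this by brute force over 8-bit signed values. The function names and the 8-bit model are assumptions made for illustration; the patch's actual preconditions are not restated in this report.

```python
INT8_MIN, INT8_MAX = -128, 127


def wrap8(x):
    """Model 8-bit two's-complement wraparound (illustrative assumption)."""
    return ((x + 128) % 256) - 128


def original(c, y):
    # (xor (setcc c, y, setlt), 1): invert the 0/1 result of "c < y".
    return int(c < y) ^ 1


def folded(c, y):
    # (setcc y, c + 1, setlt), with c + 1 computed in 8 bits.
    return int(y < wrap8(c + 1))


def mismatches(c_values):
    """Return every (c, y) pair where the two forms disagree."""
    return [
        (c, y)
        for c in c_values
        for y in range(INT8_MIN, INT8_MAX + 1)
        if original(c, y) != folded(c, y)
    ]


# No disagreement when c + 1 cannot overflow.
print(len(mismatches(range(INT8_MIN, INT8_MAX))))
# Every y disagrees once c + 1 wraps around.
print(len(mismatches([INT8_MAX])))
```

In this model the two forms agree for every constant except the maximum value, which suggests why such a fold would need to guard against overflow of constant + 1.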
Awaiting next round of review.
- Created repository under the RuyiSDK organization, maintaining GCC 10/13 and Binutils 2.35/2.36/2.40, tracking updates to each extension version:
- Attempted to implement RVV 0.7 instructions in Binutils 2.40 with compatibility for RVV 1.0. Users may switch between versions by specifying the V extension version: -march=rv64gcv0p7 selects RVV 0.7 and -march=rv64gcv1p0 selects RVV 1.0. The current implementation passes all instruction generation tests.
- Resent Zc* patches for GCC and Binutils. The GCC patches have already passed review; the Binutils patches are pending revision.
- Submitted GCC patch for bf16 support, currently under review.
- Submitted patch for RVV GCC tuple type fp16 class support and tests.
- Assembled documentation (Chinese) on RVV Intrinsics changes between RVV 0.7 and RVV 1.0.
- Slide decks from RISC-V GNU Toolchain Bi-Weekly Meetings.
- (6.15) https://docs.google.com/presentation/d/1R9y4EmKi7BrHVLXCy0MaFlszQMJzByKblK3wlZ3Mo6Q/edit?usp=sharing
- (6.29) https://docs.google.com/presentation/d/1yoJKSUrK3bQLFkK6gfKnKBubgII0uupWTBfhABLlbmQ/edit?usp=sharing
- (6.29) https://docs.google.com/presentation/d/1vMQR-EpE-cvJl8CW4-zzCIfOvWwPZrCYo2avfi_nBSk/edit?usp=sharing
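The -march values mentioned above (rv64gcv0p7 and rv64gcv1p0) encode the V extension version as "v", a major number, "p", and a minor number. The following is a minimal sketch of how a build script might read that version out of the string. The helper names are hypothetical and are not part of Binutils or GCC; the sketch only inspects text and does not validate the rest of the ISA string.

```python
import re

# Hypothetical helper: matches a versioned V extension such as "v0p7" or "v1p0".
# The "v" in "rv64" is skipped because "64" is not followed by "p".
_V_VERSION = re.compile(r"v(\d+)p(\d+)")


def v_extension_version(march):
    """Return (major, minor) of the V extension in a -march value, or None.

    An unversioned "v" (for example "rv64gcv") also yields None in this sketch.
    """
    match = _V_VERSION.search(march)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def is_rvv_0p7(march):
    """True when the string explicitly requests RVV 0.7."""
    return v_extension_version(march) == (0, 7)


for flag in ("-march=rv64gcv0p7", "-march=rv64gcv1p0", "-march=rv64gc"):
    print(flag, v_extension_version(flag))
```

A wrapper script could use a check like is_rvv_0p7 to decide which set of intrinsics headers or test cases to select for a given build.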
Stats: 7843/18808, 41.70% (https://whale.plctlab.org/riscv/support-statistics/)
- A total of 58 keywording commits (including those from non-PLCT team members): https://whale.plctlab.org/riscv/stats/2023_06.txt
- dev-libs/rinutils: Keyword 0.10.2-r1 riscv gentoo/gentoo@51bd477
- dev-python/tox: Keyword 4.6.2 riscv gentoo/gentoo@cc6a9fa
- dev-ruby/immutable-ruby: Keyword 0.1.0 riscv gentoo/gentoo@c32624b
- dev-util/cvise: Keyword 2.8.0 riscv gentoo/gentoo@5849d65
- net-im/telegram-desktop: Keyword 4.8.3 riscv gentoo/gentoo@186d9f8
- libopus: enable intrinsics only on supported platforms NixOS/nixpkgs#237486
- apcupsd: properly set configureFlags, fix cross compilation NixOS/nixpkgs#238388
- FIL-plugins: fix cross compilation NixOS/nixpkgs#238390
- assimp: fix build for riscv NixOS/nixpkgs#238393
- lrs: set CC, fix cross compilation NixOS/nixpkgs#238394
- blktrace: fix cross compilation NixOS/nixpkgs#238913
- dex: fix cross compilation, set strictDeps NixOS/nixpkgs#238915
- dhcpdump: rework packaging, fix cross compilation NixOS/nixpkgs#238918
- fastJson: cleanup, use autoreconfHook NixOS/nixpkgs#238919
- fbterm: fix cross compilation NixOS/nixpkgs#238920
- gamescope: fix cross compilation, set strictDeps NixOS/nixpkgs#238923
- gnomeExtensions.buildShellExtension: fix cross compilation NixOS/nixpkgs#238924
- libxdg_basedir: 1.2.0 -> 1.2.3, fix cross compilation NixOS/nixpkgs#239137
- Fixed several sporadic failures, https://phabricator.services.mozilla.com/rELMe75f2469605782e7b784569ddf95024bce6514aa
- Enable wasm baseline compiler, https://phabricator.services.mozilla.com/D180186
This month, we worked predominantly on Box64 and therefore did not submit any new pull requests. Those that were submitted in May were reviewed and merged.
Up to this point, the basic infrastructure has reached preliminary completion. The next step will be working on utility functions for the code emission components. In July, we will redouble our efforts on DynamoRIO to realise our goal from back in June: running a Hello World program on DynamoRIO.
The following are the aforementioned pull requests:
- ksco
[WIP] Automated OpenCV Universal Intrinsic code migrator for the RVV backend.
- Detects Universal Intrinsic types.
- Supports rewriting vector lengths and overloaded operators in vector types.
- Git repository: https://github.com/hanliutong/rewriter
The first patch generated by the code migrator has been accepted by the upstream, opencv/opencv#23885
- Upstream work.
- Revised patches.
- [libcxx] <experimental/simd> Add ABI tags, class template simd/simd_mask implementations. Add related simd traits and tests.
- [libcxx] <experimental/simd> Added simd width functions, simd_size traits and related tests
- [libcxx] <experimental/simd> Added aligned flag types, traits is_simd_flag_type[_v], memory_alignment[_v] and related tests
- Other items.
- Created a test libcxx-simd repository for Compiler Explorer. Users may start testing by including the header, see LibCxx SIMD: Single Header Library
- Implemented tests for internal interfaces.
- Rebased code against the LLVM upstream and resolved issues in clang-tidy code.
- Implemented simd/simd_mask classes and explicit type conversion interfaces for its internal storage types.
JIT is now largely functional; it passes 505 of 508 cases in LuaJIT/LuaJIT-test-cleanup, on par with LuaJIT on other platforms. Programs like Minetest, Scimark, and Sysbench are reported to run without any issues.
There are still some bugs to be fixed. For instance, LuaRocks complains about invalid table metamethods, and Neovim may segfault during its build because of a malformed string pointer access.
Additionally, the program doesn't currently have debug information for unwinding. These ought to be fixed soon.
- Interpreter
- JIT
- Fix asm_sparejump_use
- Fix emit_loadk32 corner case handling
- Drop emit_loadk20
- Optimize emit_loadu64
- Fix emit_loadu64 regression
- Fix asm_patchexit with unconditional jump
- Fix asm_tointg
- FFI Callback dispatcher bringup
- Add bcsave definition, include disassembler by Milos Poletanovic from Syrmia.com
- Fix float to int type conversion rounding
No update this month.
- Merged: Add support for BF16 extensions
- Merged: Fix bugs in disassembling code for cm.mva01s/mvsa01 instructions
- Accepted: target/riscv: Fix mstatus related problems
- Merged: target/riscv: Fix initialized value for cur_pmmask
- Accepted: target/riscv: Fix the xlen for data address when MPRV=1
- Under Review: target/riscv: Add support for BF16 extensions
- Under Review: target/riscv: Remove redundant check in pmp_is_locked
- Implemented RVV extension support for the TCG backend, https://github.com/plctlab/plct-qemu/tree/plct-riscv-backend-rvv
This month, we continued to improve the dynamic recompiler for the RV64 JIT backend and continued work on library wrappers. We fixed several complicated bugs, added support for more opcodes, and wrapped more libraries and functions (mainly GUI-related).
Another highlight is that we added experimental support for Wine's WoW64 mode. With newer Wine versions, you should now be able to run simple 32-bit Windows applications.
Upstreamed pull requests:
- xctan
- ksco
- Handling of STll struct in FILD/FISTP
- Fixed A0 MOV AL,Ob
- Added support for 32bits
- Added basic 32 bits RV64 support
- Fixed a typo
- Small optim to FLAGS_ADJUST_TO11
- Added more symbols for nss3
- Added more opcodes
- Added more gtk functions
- Added more symbols for openssl wrapper
- Rework on libharfbuzz wrapper
- Added more libicu wrapped functions (for #829)
- Added more opcodes
- Added 66 0F 38 37 PCMPGTQ opcode
- Added some more libc wrappers
- Fixed 9x SETcc opcodes
- Fixed call_c issues
- Fixed native_fprem
- Fixed PUSH rsp when double pushing
- Small optim on GETGX/GETGM helper macros
- Fixed 6B IMUL opcode
- Fixed geted32
- Updated CMO extension support in SAIL, riscv/sail-riscv#137
- Updated CMO extension support for ACT.
Shi Ningning (史宁宁) continues to compile the OpenArkCompiler Weekly, which just published its 167th issue.
You may find new weekly issues of the OpenArkCompiler Weekly on Sundays on...
- GitHub: https://github.com/isrc-cas/arkcompiler-materials
- Zhihu: https://zhuanlan.zhihu.com/openarkcompiler
- Bilibili: https://www.bilibili.com/read/readlist/rl199373
- Mailing list and other channels: https://gitee.com/openarkcompiler/OpenArkCompiler/issues/I1EWAX
- MLIR RVV Support: Exploring approaches to MLIR compiler implementation for RVV.
- Will setting different lmul result in obvious performance difference? chipsalliance/t1#237
- Errors occur when modify axpy-masked.mlir in some ways chipsalliance/t1#232
- MLIR Sparsity: Surveying support status for MLIR Sparsity and beginning implementation of RVV support.
- Read SparsificationAndBufferization Pass source code: https://blog.sh1mar.in/post/mlir/sparse-compiler-pass-sparsification-and-bufferization-pass/
- Add FuseTensorRewrite example buddy-compiler/buddy-mlir#161
- Add sparse_tensor.add example: buddy-compiler/buddy-mlir#166
- Add sparse tensor pack example: buddy-compiler/buddy-mlir#167
- Add sparse_tensor.number_of_entries: buddy-compiler/buddy-mlir#168
- Add example for sparse_tensor.coordinates: buddy-compiler/buddy-mlir#170
- Add initial example for binary operation: buddy-compiler/buddy-mlir#172
- Compiler Technologies in Deep Learning Co-Design: A Survey - https://spj.science.org/doi/full/10.34133/icomputing.0040
- From the Press: How computers and artificial intelligence evolve together - https://www.eurekalert.org/news-releases/994272
- Buddy Compiler Homepage - https://buddy-compiler.github.io/
- Buddy Compiler's OSPP 2023 Project Home - https://summer-ospp.ac.cn/org/orgdetail/8d995d4c-b188-4690-9a53-c022dc7c19e3?lang=zh
- Buddy Compiler As A Service(Buddy-CAAS)- https://buddy.isrc.ac.cn/
buddy-mlir
Code repository: https://github.com/buddy-compiler/buddy-mlir
- Polish DAP/DIP dialects, passes, and interfaces.
- [Blog] reading mlir-opt source code: https://blog.sh1mar.in/post/mlir/mlir-opt/
- [Blog] Every problem I met when bumping LLVM version to mainline for buddy-mlir: https://blog.sh1mar.in/post/nix/bump-vector-llvm/
- [Blog] Understanding tensor and bufferization: https://blog.sh1mar.in/post/mlir/tensor/
buddy-benchmark
Code repository: https://github.com/buddy-compiler/buddy-benchmark
- Add efficientnet-quantized benchmark.
- Add validation framework for audio processing cases.
- Add initial Gemmini benchmark.
No update this month.
No update this month.
- lib: sbi: Align system suspend errors with spec
- Introduce and use simple heap allocator
- sbi_console improvements.
- OpenSBI logo and copyright update
- OpenSBI v1.3 Released
- Add no-map property for reserved RAM.
- firmware: Fix find hart index
- lib: reset: Move fdt_reset_init into generic_early_init
- lib: sbi: Try to make each domain have boot hart
- lib: sbi: check A2 register in ecall_dbcn_handler
No update this month.
Following up on last month's updates, we surveyed the effect of common optimization flags on SPEC CPU2017 benchmark results, especially on RISC-V platforms. -flto and -ffast-math were found to be the most commonly used flags (in combination with -O3). Below is the performance uplift we observed on x86:
We also briefly tested jemalloc's performance on x86 and RISC-V platforms. While the performance gains for memory operations on RISC-V are not significant, we observed a 25% uplift on x86.
We also explored other flags, see our detailed report.
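As background on why -ffast-math is usually reported separately from -O3: among other things, it allows the compiler to reassociate floating-point arithmetic, and floating-point addition is not associative. The following is a minimal sketch of that underlying effect using ordinary double-precision values. The values and function names are chosen purely for illustration and are unrelated to our SPEC CPU2017 measurements.

```python
def sum_left_to_right(a, b, c):
    """Evaluate (a + b) + c, the order written in the source."""
    return (a + b) + c


def sum_reassociated(a, b, c):
    """Evaluate a + (b + c), an order a reassociating compiler may choose."""
    return a + (b + c)


# Illustrative values: 1e16 is large enough that adding 1.0 to it is lost
# to rounding in double precision.
a, b, c = 1e16, -1e16, 1.0

print(sum_left_to_right(a, b, c))
print(sum_reassociated(a, b, c))
```

The two orders give different answers for the same inputs, which is why results from builds with -ffast-math may need separate validation.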
Here are some updated SPEC CPU2017 figures obtained from Unmatched (GCC 13.1.0, -O3).
An error may occur while running 657.xz_s due to memory depletion, as the benchmark requires 16GiB of RAM and Unmatched ships with the same amount.
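A pre-flight check can flag this situation before a long benchmark run starts. The following is a minimal sketch that parses /proc/meminfo-style text and compares MemTotal with the 16 GiB requirement. The function names and the sample MemTotal value are invented for illustration; on Linux, MemTotal reports usable RAM, which is somewhat less than the installed amount.

```python
KIB_PER_GIB = 1024 * 1024


def mem_total_kib(meminfo_text):
    """Extract the MemTotal value (in KiB) from /proc/meminfo-style text."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemTotal:"):
            return int(line.split()[1])
    raise ValueError("MemTotal not found")


def has_enough_ram(meminfo_text, required_gib=16):
    """True when usable RAM meets the benchmark's stated requirement."""
    return mem_total_kib(meminfo_text) >= required_gib * KIB_PER_GIB


# Hypothetical sample for a board with 16 GiB installed; not a measured value.
SAMPLE_MEMINFO = "MemTotal:       16370000 kB\nMemFree:         1200000 kB\n"

print(has_enough_ram(SAMPLE_MEMINFO))
```

On a real system the text would come from reading /proc/meminfo; a False result suggests skipping the benchmark or adding swap before running it.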
- Implemented and fine-tuned performance benchmarks for LAVA jobs running on an Unmatched board with openEuler riscv64.
- Surveyed the libmicro benchmark and assembled a report
- Completed writing and fine-tuning test cases for unixbench, stream, libmicro, fio, lmbench, and netperf. These test cases now run properly on LAVA, see https://gitlab.com/jean9823/lava-testcase
- LTP syscalls test cases now run as LAVA jobs, see https://gitlab.com/jean9823/lava-testcase/-/blob/main/lava_job/unmatched/sifive-unmatched-ltp-syscalls-test.yaml
- Developed and tested ruyibuild and wrote sample build scripts for QEMU.