[Bug] Unsupported CPU on SpacemiT K1 Octa-core X60 (RV64GCVB), RVA22 #17508

Closed
JieGH opened this issue Nov 8, 2024 · 3 comments

JieGH commented Nov 8, 2024


Expected Behavior

After building TVM 0.18.0 with LLVM 19.1.3, I expect TVM to generate RISC-V compatible code that executes without errors related to unsupported CPU types. The build should allow the execution of a basic TVM Python example on a Banana Pi K1 board, with the riscv64-linux-gnu target specified in the configuration.

Actual Behavior

Upon running a simple TVM example with LLVM 19.1.3 and TVM 0.18.0 on the Banana Pi K1, I encounter the following error message:

Unsupported CPU type!
UNREACHABLE executed at /home/jlei/llvm-project/llvm/lib/ExecutionEngine/RuntimeDyld/RuntimeDyldELF.cpp:1080!

In the TVM logs, there is also a warning that native vector bits are set to 128 for RISC-V, which could be relevant to the issue. The error persists despite multiple rebuilds of both LLVM and TVM, with adjusted configurations and target-specific flags to ensure compatibility with the RISC-V architecture on this board.

The error appears to stem from LLVM’s RuntimeDyldELF.cpp file, and recent threads, such as LLVM Issue #58652 and Halide Issue #7078, mention related problems that were resolved in newer LLVM releases, motivating my decision to upgrade from LLVM 15.0.7 to 19.1.3.

Environment

•	Operating System: Banana Pi K1 OS (version 1.X, latest)
•	LLVM Version: 19.1.3 (Default target: riscv64-linux-gnu; Host CPU: generic-rv64)
•	TVM Version: 0.18.0
•	Target Triple Configuration in TVM: "llvm -mtriple=riscv64-linux-gnu -mcpu=generic-rv64"
•	Architecture Flags: -march=rv64gc -mabi=lp64d
•	Other Configuration Flags:
	•	USE_LLVM set to "llvm-config --ignore-libllvm --link-static"
	•	GPU backends like CUDA, Vulkan, and OpenCL disabled.
	•	Set USE_TVM_RUNTIME ON, USE_PROFILER ON, USE_GRAPH_RUNTIME ON.
	•	Profiling, graph runtime, and relevant libraries enabled; unnecessary libraries like MKL and NNPACK disabled.
	•	Builds attempted with both RelWithDebInfo and Release build types.

Steps to Reproduce

1.	Compile LLVM 19.1.3 with the following configurations:
	•	Ensure the riscv64-linux-gnu target is specified explicitly during the build.
	•	Build LLVM with optimized settings, assertions enabled, and set default and target-specific flags for RISC-V compatibility.
2.	Configure and build TVM 0.18.0:
	•	Specify the target triple as riscv64-linux-gnu.
	•	Set the architecture flags -march=rv64gc -mabi=lp64d.
	•	Disable unnecessary backends and enable LLVM and RISC-V-specific configurations.
	•	Ensure no additional RISC-V flags are set in the LLVM configuration to isolate any unsupported flag issues.
3.	Run a simple TVM Python example (like matrix multiplication or a basic compute test) on the Banana Pi K1 with the above setup to trigger the CPU error.
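
The steps above can be condensed into a minimal reproduction sketch. The original script was not posted, so the vector-add workload, the function name `run_vector_add`, and the size `n` are illustrative assumptions; only the target string is taken from the Environment section. It assumes TVM's Python package and NumPy are installed on the board.

```python
# Minimal reproduction sketch. Assumption: an element-wise add stands in for
# the "simple TVM example" mentioned in step 3.

# Target string as listed under "Environment" in this report.
TARGET = "llvm -mtriple=riscv64-linux-gnu -mcpu=generic-rv64"


def run_vector_add(target: str = TARGET, n: int = 1024) -> None:
    """Build an element-wise add for `target` and execute it in-process."""
    # Imported lazily so this file can be loaded on hosts without TVM.
    import numpy as np
    import tvm
    from tvm import te

    A = te.placeholder((n,), name="A", dtype="float32")
    B = te.placeholder((n,), name="B", dtype="float32")
    C = te.compute((n,), lambda i: A[i] + B[i], name="C")

    # Lower the compute definition to a PrimFunc and compile it.
    func = te.create_prim_func([A, B, C])
    lib = tvm.build(func, target=target)

    dev = tvm.cpu(0)
    a = tvm.nd.array(np.random.rand(n).astype("float32"), dev)
    b = tvm.nd.array(np.random.rand(n).astype("float32"), dev)
    c = tvm.nd.array(np.zeros(n, dtype="float32"), dev)

    # The compiled function is loaded by the JIT and executed here.
    lib(a, b, c)
    np.testing.assert_allclose(c.numpy(), a.numpy() + b.numpy(), rtol=1e-5)
```

Calling `run_vector_add()` on the board exercises the same build-then-execute path as the steps above; passing a different `target` string makes it easy to compare configurations.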

Additional Notes and Troubleshooting

•	I have attempted multiple builds of LLVM and TVM with minimal changes each time to pinpoint the issue.
•	Cross-referencing with related issues, like [LLVM Issue #58652](https://github.com/llvm/llvm-project/issues/58652), suggests this might be linked to incomplete support for specific RISC-V targets or configurations.
•	Despite the “Unsupported CPU” error, Python finishes the TVM script execution, but the generated LLVM code fails to execute.
•	Notably, the error does not occur when using an older LLVM version (15.0.7), although it cannot produce LLVM code properly for the required RISC-V target.
@JieGH JieGH added needs-triage PRs or issues that need to be investigated by maintainers to find the right assignees to address it type: bug labels Nov 8, 2024
@cbalint13 cbalint13 self-assigned this Nov 12, 2024
Contributor

cbalint13 commented Nov 12, 2024

@JieGH ,

I will look into this; let's fix it.

In the meantime, a recent LLVM is a must for RISC-V targets (18.x or 19.x is fine). Could you also please enable the orcjit executor in TVM (it is experimental, and enabled with the -jit=orcjit flag) by defining your target like one of those below:

  • without RVV extension:
    target = tvm.target.Target("llvm -jit=orcjit -mtriple=riscv64-linux-gnu -mcpu=generic-rv64 -mattr=+a,+c,+d,+f,+m")
  • with RVV extension:
    target = tvm.target.Target("llvm -jit=orcjit -mtriple=riscv64-linux-gnu -mcpu=generic-rv64 -mattr=+a,+c,+d,+f,+m,+v")
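
As a small sketch, the two target strings above can be assembled by a helper so that the RVV variant differs only by a flag. The function name `riscv64_target` and its parameters are hypothetical names introduced for illustration, not part of TVM.

```python
# Sketch of a helper that assembles the target strings shown above.
# `riscv64_target`, `rvv`, and `jit` are hypothetical names, not TVM API.

BASE_ATTRS = ("+a", "+c", "+d", "+f", "+m")


def riscv64_target(rvv: bool = False, jit: str = "orcjit") -> str:
    """Return a TVM target string for a generic RV64 CPU."""
    # Append the vector extension only when requested.
    attrs = BASE_ATTRS + (("+v",) if rvv else ())
    return (
        f"llvm -jit={jit} -mtriple=riscv64-linux-gnu "
        f"-mcpu=generic-rv64 -mattr={','.join(attrs)}"
    )


without_rvv = riscv64_target()
with_rvv = riscv64_target(rvv=True)
```

Either string can then be passed to `tvm.target.Target(...)` exactly as in the two lines above.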

Author

JieGH commented Nov 12, 2024

Hi @cbalint13, thanks for the miracle you bring. It works now. I have attached the target I used and the versions of my TVM and LLVM below. I will test TVM more extensively later.
I had suspected the issue came from a flag passed to LLVM, and it turns out that all I needed was one more flag.

Previous error message:

Unsupported CPU type!
UNREACHABLE executed at /home/USER/llvm-project/llvm/lib/ExecutionEngine/RuntimeDyld/RuntimeDyldELF.cpp:1080!
Aborted

Solution: enable LLVM’s Orc JIT (On-Request Compilation) engine

target = "llvm -jit=orcjit -mtriple=riscv64-linux-gnu -mcpu=generic-rv64 -mattr=+a,+c,+d,+f,+m"
Target kind: llvm
Target options: {"mtriple": "riscv64-linux-gnu"}
LLVM config path: /usr/local/bin/llvm-config

llc --version output:
 LLVM (http://llvm.org/):
  LLVM version 19.1.3
  Optimized build with assertions.
  Default target: riscv64-linux-gnu
  Host CPU: generic-rv64

  Registered Targets:
    riscv32 - 32-bit RISC-V
    riscv64 - 64-bit RISC-V
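
For anyone who wants to check an LLVM build programmatically, the `llc --version` output above can be parsed with a few regular expressions. This is a sketch: `parse_llc_version` is a hypothetical helper, and the sample text is copied from the output above.

```python
import re

# Sample text copied from the `llc --version` output above.
SAMPLE = """\
LLVM (http://llvm.org/):
  LLVM version 19.1.3
  Optimized build with assertions.
  Default target: riscv64-linux-gnu
  Host CPU: generic-rv64

  Registered Targets:
    riscv32 - 32-bit RISC-V
    riscv64 - 64-bit RISC-V
"""


def parse_llc_version(text: str) -> dict:
    """Extract version, default target, host CPU, and registered targets."""
    version = re.search(r"LLVM version (\d+)\.(\d+)\.(\d+)", text)
    default = re.search(r"Default target:\s*(\S+)", text)
    host = re.search(r"Host CPU:\s*(\S+)", text)
    # Registered targets are the indented "name - description" lines
    # that follow the "Registered Targets:" header.
    _, _, tail = text.partition("Registered Targets:")
    targets = re.findall(r"^[ \t]+(\S+)[ \t]+-[ \t]", tail, flags=re.MULTILINE)
    return {
        "version": tuple(int(p) for p in version.groups()) if version else None,
        "default_target": default.group(1) if default else None,
        "host_cpu": host.group(1) if host else None,
        "targets": targets,
    }


info = parse_llc_version(SAMPLE)
# The thread recommends LLVM 18.x or 19.x for RISC-V targets.
recent_enough = info["version"] is not None and info["version"][0] >= 18
```

On a real system the text could come from `subprocess.run(["llc", "--version"], capture_output=True, text=True).stdout` instead of the embedded sample.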

@JieGH JieGH closed this as completed Nov 12, 2024
Contributor

cbalint13 commented Nov 12, 2024

Hi @JieGH ,

Thanks a lot for your time and the feedback!

It is not a miracle, but I will open a PR proposing to promote orcjit to TVM's default LLVM executor, in place of the current, deprecated mcjit.

Please let me know the results of any performance tests; you are welcome to report them here, as I am personally interested in RISC-V targets. On my personal task list there is an RVV tensorization proposal for metaschedule/autoschedule; a preliminary integration, with benchmarks for the v0.7.1 and v1.0 RVV variants, is here: https://github.com/cbalint13/rvv-kernels
