zzmatu edited this page Mar 30, 2017 · 26 revisions

Code generator patches to LLVM for HPC extensions of Fujitsu SPARC64

DO NOT USE THIS FOR SERIOUS APPLICATIONS

This patch adds support for the SIMD extensions of SPARC64 VIIIfx and IXfx (K Computer and Fujitsu FX10). It is for LLVM-3.9.1. Specify the target CPU as "-mcpu=s64fx8". It includes small patches to CLANG so that it accepts the new CPU names (see lib/Target/Sparc). Do not use this for SPARC V9, because the modifications are expected to conflict with the original code.

Build

The procedure is in: https://github.com/pf-aics-riken/llvm-sparc64fx/wiki/build

Defects

  • ctors/dtors (or initializers specified by attributes) do not work.
  • C++ exceptions do not work. The patch disables generation of CFI (Call Frame Information), because the assembler (of FX10) does not accept pseudo-instructions of CFI.
  • Floating-point comparisons by FCMP (generating a result in registers) do not properly handle orderedness (ordered versus unordered results, that is, comparisons where an operand is NaN).
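
To make the orderedness defect concrete, here is a minimal sketch in C of comparisons whose results differ only when an operand is NaN. The function names are hypothetical and not part of the patch. The page does not say which comparisons are miscompiled, so this is just one way such cases might be probed.

```c
#include <math.h>

/* Hypothetical probe functions; the names are illustrative only. */

/* Ordered less-than: must be false when either operand is NaN. */
int less_ordered(double a, double b)
{
    return a < b;
}

/* Unordered-or-less-than: must be true when either operand is NaN. */
int less_or_unordered(double a, double b)
{
    return !(a >= b);
}

/* True only when neither operand is NaN (isunordered is a C99 macro). */
int is_ordered(double a, double b)
{
    return !isunordered(a, b);
}
```

On a conforming implementation, less_ordered(NAN, 1.0) is 0 while less_or_unordered(NAN, 1.0) is 1. A mismatch when compiling with "-mcpu=s64fx8" would point to this defect.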

LNT (nightly test) Result Summary

The results are in report.simple.txt, as of 2017-03-09. The text is very wide. A "*" marks a failure: the "CC" column is for compilation and the "Exec" column is for execution. About 100 of 500 tests fail. This is VERY BAD, but it is partly due to the defects in C++ constructors and exceptions.

TODO

  • Make all LNT tests pass (except for ctors/dtors and exceptions).
  • Code cost model.
  • Code reciprocals.
  • Test 4-SIMD instructions.
  • Code masked stores (STFR, STDFR).
  • Code element swap of SIMD FMA.
  • Code integer SIMD instructions.
  • Support SIMD intrinsics.
  • Make the original SPARC V9 work by removing conflicts with the modifications.
  • Assign DWARF numbers to the extended registers. Note that the assignments are still incomplete for SPARC V9, and that V8 and V9 have incompatible assignments in GCC.

Restrictions

  • The processor must be specified as "-mcpu=s64fx8". Code generation for SPARC V9 is expected to be broken by conflicts with the modifications.
  • Vector loads/stores are rarely generated because of the strict alignment requirement (aligned to the vector length). Pointers may need alignment annotations for them to be generated.
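
As an illustration of a pointer annotation, the sketch below uses the GCC/Clang builtin __builtin_assume_aligned to promise the optimizer that both arrays are 16-byte aligned. The function name is hypothetical, and the 16-byte figure is an assumption (two doubles for 2-SIMD). Whether this is enough for the patched backend to emit vector loads/stores has not been verified here.

```c
#include <stddef.h>

/* Hypothetical helper: y[i] += a * x[i].
   Both pointers are promised to be 16-byte aligned, which is ASSUMED here
   to match the vector length of a 2-SIMD pair of doubles. */
void daxpy_aligned16(size_t n, double a, const double *x, double *y)
{
    /* __builtin_assume_aligned returns its first argument and lets the
       optimizer assume the stated alignment. Passing a pointer that is
       not actually aligned is undefined behavior. */
    const double *xa = __builtin_assume_aligned(x, 16);
    double *ya = __builtin_assume_aligned(y, 16);

    for (size_t i = 0; i < n; i++) {
        ya[i] += a * xa[i];
    }
}
```

Callers have to provide storage that is really aligned, for example arrays declared with _Alignas(16) or memory obtained from aligned_alloc.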

References

In summary, the Fujitsu HPC-ACE extensions provide:

  • A large number of registers: 64 integer, 256 floating-point registers
  • SIMD instructions with 128 vector registers (2-SIMD for ACE1, and 4-SIMD for ACE2)

Fujitsu HPC Platform Documents (English pages are missing):

Specification Documents of Fujitsu SPARC64 HPC-ACE Extensions:

Acknowledgment

The coding was done by ***, under a contract with RIKEN AICS. However, the code was reviewed by zzmatu, and all remaining bugs are his responsibility.

Part of the results was obtained by using the K Computer at RIKEN AICS.