LLVM based HLS library for HWToolkit (hardware devel. toolkit)

hwtHls

A library for an automatic translation of algorithmic code to a hardware realization based on hwt (hwt is a library for circuit construction) and LLVM (a compiler infrastructure).

This library is built as a tool which lets you write code transformations and provides a variety of existing ones (from LLVM/hwt) in order to build efficient code generators.

  • Powerful optimization passes from LLVM/HWT
  • Target specification for common FPGAs
  • Integration with HWT: SystemVerilog/VHDL export, various interfaces and components, verification API

Current state

  • This library is in alpha phase.

  • You can try it online at Binder (From jupyterlab you can also run examples in tests.)

  • Features

    • Python bytecode -> LLVM -> hwt -> vhdl/verilog/IP-XACT

      • no exceptions; function calls must be explicitly marked for hw, otherwise they are evaluated at compile time
      • only static typing, limited use of iterators
      • (meant to be used for simple things, for the rest there are "statement-like objects")
    • Python statement-like objects -> LLVM -> hwt -> vhdl/verilog/IP-XACT

      • Support for multithreaded programs (multiple hls programs with shared resources cooperating using shared memory or streams and with automatic constraint propagation on shared resources)
      • Support for programs which use resources shared with HDL code (e.g. bus mapped registers where bus mapping is done in HDL (hwt))
    • Support for precise latency/resources tuning

      • FSM/dataflow fine-grained architecture (strategy specified as a sequence of transformations)
    • Precise operation scheduling using target device timing characteristics (any Xilinx, Intel and others after benchmark)

    • All optimizations aware of independent slice drivers

      • SsaPassExtractPartDrivers - splits the slices to individual variables to exploit real dependencies, splits also bitwise operations and casts
      • ConstantBitPropagationPass - recursively minimizes the number of bits used by variables
    • Any loop type with special care for:

      • Infinite top loops - with/without internal/external sync being involved
      • Loops where sync can be achieved only by data (no speculation, all inputs depend on every output)
      • Polyhedral, affine, unroll and other transformations
      • On demand speculations/out-of-order execution:
        • next iteration speculation before break
        • speculative execution of multiple loop bodies
        • after loop code speculative execution before break
        • cascading of speculation
        • speculative IO access using LSU (for memory mapped IO) or buffers with confirmation (for IO streams)
    • Support for Handshake/ReadySynced/ValidSynced/Signal streams (= handshake and all its degenerated variants = any single channel interface)

      • arbitrary number of IO operations for any scheduling type
      • support for side channels, virtual channels, multiple packets per clock (e.g. xgmii)
      • explicit blocking, explicit dropping, explicit skipping (e.g. conditional read/write of data, read without consumer)
      • Support for read/write of packet(HStream) types
        • Per channel specific settings
        • Processing of arbitrary size types using cursor or index of limited size
        • Support for headers/footers in HStream
        • incremental packet parsing/deparsing, read/write chunk:
          • may not be aligned to word
          • may cause under/overflow
          • may be required to be end of stream or not
        • Optional check of input packet format (or synchronization by the input packet format, which significantly reduces circuit complexity)
  • Not done yet:

    • Complex operation reducing (DSP)
    • All platforms
    • Memory access pattern, partition API between Python and LLVM
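
The idea behind the ConstantBitPropagationPass entry in the feature list can be illustrated with a small known-bits analysis. The following is a minimal sketch in plain Python, not hwtHls code: the `KnownBits` class and all of its methods are hypothetical names introduced only for this illustration. It tracks which bits of a fixed-width value are provably 0 or 1 through AND, OR and left shift, so that only the remaining unknown bits need to be kept as a variable.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KnownBits:
    """Per-bit knowledge about an unsigned value (hypothetical helper, not hwtHls API)."""
    width: int
    zeros: int  # mask of bits known to be 0
    ones: int   # mask of bits known to be 1

    @classmethod
    def unknown(cls, width):
        # nothing is known about any bit
        return cls(width, 0, 0)

    @classmethod
    def const(cls, width, value):
        mask = (1 << width) - 1
        value &= mask
        return cls(width, ~value & mask, value)

    def __and__(self, other):
        # result bit is 0 if either input bit is 0, and 1 only if both are 1
        return KnownBits(self.width, self.zeros | other.zeros, self.ones & other.ones)

    def __or__(self, other):
        # result bit is 1 if either input bit is 1, and 0 only if both are 0
        return KnownBits(self.width, self.zeros & other.zeros, self.ones | other.ones)

    def shl(self, n):
        # shifted-in low bits are known zeros, bits shifted past the width are dropped
        mask = (1 << self.width) - 1
        low_zeros = (1 << n) - 1
        return KnownBits(self.width,
                         ((self.zeros << n) | low_zeros) & mask,
                         (self.ones << n) & mask)

    def unknown_mask(self):
        mask = (1 << self.width) - 1
        return mask & ~(self.zeros | self.ones)

    def bits_needed(self):
        # number of bits that still have to be carried by a real signal
        return bin(self.unknown_mask()).count("1")


x = KnownBits.unknown(8)
y = (x & KnownBits.const(8, 0x0F)).shl(2)  # models (x & 0x0F) << 2
print(f"unknown bits: {y.unknown_mask():#04x}, bits needed: {y.bits_needed()}")
```

In this example only 4 of the 8 bits of `y` depend on the input; a real pass would additionally have to rewrite the users of the narrowed value.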

How it works?

  • The input code is parsed into SSA objects defined in hwtHls.ssa. (The code is loaded using the HlsStreamProc object in a hwt component (Unit class); the constraints and interface types are specified as hwt objects.)
  • There are several optimization SSA passes (common subexpression elimination, dead code elimination, instruction combining, control optimization, ...). The full list of optimizations is specified in HlsPlatform.
  • Optimized SSA is then converted to a hwtHls.netlist and scheduled to clock cycles. The netlist uses HDL objects from hwt.
  • The scheduled netlist is then translated to a hwt netlist which handles all SystemVerilog/VHDL/simulator/verification related things.
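
The "scheduled to clock cycles" step above can be pictured with a simple as-soon-as-possible (ASAP) scheduler. This is a minimal sketch under stated assumptions, not the scheduler used by hwtHls: the function name `asap_schedule`, the operation names and the delay values are all hypothetical, each operation is assumed to be purely combinational, and register setup/clock-to-output overhead is ignored. Operations are chained inside one clock cycle while the accumulated delay fits into the clock period; otherwise the operation is moved to the start of the next cycle.

```python
def asap_schedule(delays, deps, clk_period):
    """Assign each operation to the earliest possible clock cycle.

    delays: op name -> combinational delay in ns, listed in topological order
    deps:   op name -> list of predecessor op names
    Returns op name -> (cycle, finish time within that cycle in ns).
    """
    schedule = {}
    for op, delay in delays.items():
        if delay > clk_period:
            raise ValueError(f"{op} does not fit into one clock period")
        # the operation can start once its latest predecessor has finished;
        # (cycle, offset) tuples compare in time order
        start = (0, 0.0)
        for pred in deps.get(op, ()):
            start = max(start, schedule[pred])
        cycle, offset = start
        if offset + delay > clk_period:
            # would violate timing, so start at the beginning of the next cycle
            cycle, offset = cycle + 1, 0.0
        schedule[op] = (cycle, offset + delay)
    return schedule


# hypothetical delays in ns; real values would come from a target timing database
delays = {"a": 3.0, "b": 4.0, "mul": 6.0, "add": 2.0}
deps = {"mul": ["a", "b"], "add": ["mul"]}
for op, (cycle, finish) in asap_schedule(delays, deps, clk_period=8.0).items():
    print(f"{op}: cycle {cycle}, finishes at {finish} ns")
```

Shortening the clock period pushes `mul` into a later cycle, which is the trade-off between latency and clock frequency that precise timing characteristics make visible.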

Why hwtHls is not a compiler?

  • Nearly all HLS synthesizers perform a conversion from a source language and constraints to a target language. But there are many cases where complex preprocessor code is required to generate efficient hardware, because it is not possible to infer everything and the computation of constraints may also be complex. For this reason, this library uses Python as a preprocessor and the input code is built from statement-like objects. The benefit of Python objects is that the user can generate, analyze, and modify them on demand.
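
What "Python as a preprocessor" means in practice can be shown with a toy example. The classes below (`Assign`, `While`) and the helper functions are hypothetical stand-ins written for this illustration; they are not the statement-like objects shipped with hwtHls. The sketch shows the three operations named above: ordinary Python code generates the program, a small function analyzes it, and a list operation modifies it before it would be handed to a synthesis flow.

```python
from dataclasses import dataclass, field


@dataclass
class Assign:
    """Hypothetical statement object: dst = expr."""
    dst: str
    expr: str


@dataclass
class While:
    """Hypothetical loop statement holding a list of body statements."""
    cond: str
    body: list = field(default_factory=list)


def build_accumulator(n_inputs):
    # generate: a plain Python loop decides how many statements the program has
    body = [Assign("acc", "0")]
    for i in range(n_inputs):
        body.append(Assign("acc", f"acc + in{i}"))
    body.append(Assign("out", "acc"))
    return While("True", body)


def written_names(stmt):
    # analyze: collect every variable that is assigned anywhere in the statement tree
    if isinstance(stmt, Assign):
        return {stmt.dst}
    names = set()
    for s in stmt.body:
        names |= written_names(s)
    return names


program = build_accumulator(3)
# modify: the program is ordinary data, so it can be edited before synthesis
program.body.insert(0, Assign("dbg", "0"))
print(sorted(written_names(program)))
```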

Installation

Linux:

apt install build-essential python3-dev llvm-13-dev
pip3 install -r https://raw.githubusercontent.com/Nic30/hwtHls/master/doc/requirements.txt
pip3 install git+https://github.com/Nic30/hwtHls.git
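
After the commands above finish, a quick way to check that the package is visible to the interpreter is to look it up without importing it. This is a small sketch using only the Python standard library; the helper name `is_installed` is made up for this example, and the module name `hwtHls` is taken from the package paths mentioned in this README.

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module can be found by the import system."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    print("hwtHls importable:", is_installed("hwtHls"))
```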

Related open-source

  • 💀 ahaHLS - 2018-2019, A Basic High Level Synthesis System Using LLVM
  • 💀 augh - c->verilog, DSP support
  • 💀 c-ll-verilog 2017-2017, C++, An LLVM based mini-C to Verilog High-level Synthesis tool
  • 💀 Chips-2.0 - 2011-2019, Python, C->Verilog HLS
  • 💀 COMBA - 2017-2020, C++/LLVM, focused on resource constrained scheduling
  • 💀 ctoverilog ?-2015 - A C to verilog compiler, LLVM
  • 💀 DelayGraph - 2016, C#, register assignment algorithms
  • 💀 DHLS - 2019-?, C++, A Basic High Level Synthesis System Using LLVM
  • 💀 ElasticC ?-2018 - C++, lightweight open HLS for FPGA rapid prototyping
  • 💀 exprc - 2018-2018, C++, a toy HLS compiler
  • 💀 hg_lvl_syn - 2010, ILP, Force Directed scheduler
  • 💀 hls_recurse - 2015-2016 - conversion of recursive fn. for stackless architectures
  • 💀 kiwi 2003-2017
  • 💀 LegUp (reborn as Microchip SmartHLS in 2020) - 2011-2015, LLVM based c->verilog
  • 💀 microcoder - ?-2019, Python, ASM like lang. -> verilog
  • 💀 polyphony - 2015-2017, simple python to hdl
  • 💀 Potholes - 2012-2014 - polyhedral model preprocessor, Uses Vivado HLS, PET
  • 💀 Shang - 2012-2014, LLVM based, c->verilog
  • 💀 streamit-hls - 2017, custom lang, based on micro kernels
  • 💀 TAPAS - 2018-2019, c++, Generating Parallel Accelerators from Parallel Programs
  • 💀 xronos git2 - 2012-2016, java, simple HLS
  • ahir - LLVM, c->vhdl
  • abc <2008-?, A System for Sequential Synthesis and Verification
  • blarney
  • calyx - , Rust - compiler infrastructure with custom lang focused on ML accelerators
  • clash-compiler
  • coreir - 2016-?, LLVM HW compiler
  • dynamatic - , C++/LLVM - set of LLVM passes for dynamically scheduled HLS
  • futil - 2020-?, custom lang.
  • gemmini - scala, systolic array generator
  • Hastlayer - 2012-2019, C# -> HW
  • heterocl
  • PandA-bambu - 2003-?, GCC based c->verilog
  • PipelineC - 2018, Python, c -> hdl for a limited subset of c
  • pluto - An automatic polyhedral parallelizer and locality optimizer
  • Slice
  • spatial - , scala
  • tiramisu - 2016-?, C++, A polyhedral compiler
  • utwente-fmt - abstract hls, verification libraries
  • xls - 2020-?, C++ HLS compiler with JIT
  • binaryen - , C++, WebAssembly compiler (implements some similar optimization passes)
  • Light-HLS -, C++/LLVM, experimental HLS framework
  • DASS - combination of dynamic and static scheduling
  • phism - Python/C++/LLVM, Polyhedral High-Level Synthesis in MLIR
  • ICSC - C++/LLVM, systemC compiler
  • Xilinx/Vitis HLS - C++/LLVM, partially opensource
  • circt-hls - C++/LLVM/Python, set of hls libraries for circt
  • ScaleHLS - C++/LLVM, MLIR based HLS compiler, ML focused
  • DuroHLS CorelabVerilog - C++/LLVM, set of hls passes
  • domino-compiler 2016 -> C++, c like packet processing language and compiler
  • orcc - C++/LLVM, Open RVC-CAL Compiler hw/sw dataflow and img processing focused

Useful publications

Timing database generator scripts
