SPLAG is an FPGA accelerator for the single-source shortest path (SSSP) problem, featuring:
- A coarse-grained priority queue (CGPQ) that enables high-throughput priority-order graph traversal with a large frontier
- A customized vertex cache (CVC) that reduces off-chip memory access and improves the random-access throughput to read and update vertex data
- Outstanding performance & energy efficiency on a single U280 FPGA
- Up to a 4.9× speedup over state-of-the-art SSSP accelerators
- Up to a 2.6× speedup over 32-thread CPU running at 4.4 GHz
- Up to a 0.9× speedup over an A100 GPU (that has 4.1× power budget and 3.4× HBM bandwidth)
- Could be placed in the 14th position of the Graph 500 benchmark for data intensive applications
- The highest using a single FPGA with only a 45 W power budget
- Open-source and fully parameterized TAPA HLS C++ implementation
- Easily portable to a different FPGA with a different configuration
- TAPA
- AutoBridge
- Xilinx Alveo U280 FPGA and its
xilinx_u280_xdma_201920_3
shell platform
git clone https://github.com/UCLA-VAST/splag.git
cd splag
mkdir build
cd build
cmake ..
make sssp
./sssp ../data/graph500-scale-5
make sssp-hw-xo
make sssp-cosim
make SSSP.xilinx_u280_xdma_201920_3.hw_xclbin
./sssp -v=3 --bitstream=SSSP.xilinx_u280_xdma_201920_3.hw.xclbin ../data/graph500-scale-5
Currently, SPLAG takes as input the same binary edge list format as Graph 500.
Each dataset is stored as two separate files in the same directory, e.g., dataset
and dataset.weights
.
The dataset
file is an array of PackedEdge
.
The dataset.weights
file is an array of 32-bit float
.
The dataset is assumed to be undirected.
See the release page for more datasets and instructions on reproducing the experimental results in the FPGA'22 paper.