FpgaNIC is an FPGA-based, GPU-centric, versatile SmartNIC that:
1. enables direct PCIe P2P communication with local GPUs using GPU virtual addresses,
2. allows GPUs to directly manipulate FpgaNIC without CPU intervention,
3. provides reliable 100Gb network access to remote GPUs, and
4. allows various complex compute tasks to be offloaded to a customized data-path accelerator for line-rate in-network computing on the FPGA, thereby complementing the processing at the GPU.
In addition, FpgaNIC enables efficient FPGA-GPU co-processing.
- At least two nodes, each with a GPU that supports NVIDIA GPUDirect and a Xilinx U280 or U50 card.
- Each FPGA card is connected to a 100Gbps Ethernet switch.
- The FPGA card and the GPU are connected to the same PCIe switch.
- Host OS: Linux 4.15.0-20-generic
- NVIDIA Driver Version: 450.51.05
- CUDA Version: 11.0
- Make sure that Hugepages are enabled on each server.
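The Hugepages requirement above can be checked before going further. The following is a minimal sketch, assuming a Linux host that exposes `/proc/meminfo`; the helper name `hugepages_total` is illustrative and not part of the FpgaNIC repository.

```shell
#!/bin/sh
# hugepages_total is a hypothetical helper, not part of FpgaNIC.
# It prints the HugePages_Total value from a meminfo-style file,
# or 0 if the field is absent.
hugepages_total() {
    # $1: path to a meminfo-style file (defaults to /proc/meminfo)
    awk '/^HugePages_Total:/ { print $2; found = 1 }
         END { if (!found) print 0 }' "${1:-/proc/meminfo}" 2>/dev/null
}

total=$(hugepages_total)
total=${total:-0}   # fall back to 0 if the file could not be read

if [ "$total" -gt 0 ]; then
    echo "Hugepages enabled: $total pages configured"
else
    echo "Hugepages not enabled (HugePages_Total is 0)"
fi
```

If the count is 0, pages can typically be reserved through the standard Linux `vm.nr_hugepages` sysctl. The number of pages FpgaNIC needs is not stated here, so that value is left to the reader.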
There are three steps to run each experiment. Before running FpgaNIC, please clone the source code:
$ git clone https://github.com/RC4ML/FpgaNIC
- $ cd bitstream
- Use Vivado to flash the bitstream to every FPGA card.
- Every time you download the bitstream to the FPGA, you must reboot the machine. After rebooting, do not forget to reinstall the XDMA driver and the GDR driver:
- $ cd FpgaNIC/driver
- $ make && sudo insmod xdma_driver.ko
- $ cd FpgaNIC/gdrcopy
- $ sudo ./insmod.sh
- Note that you need to reinstall the XDMA driver and the GDR driver every time you reboot your machine.
- $ cd FpgaNIC/sw && mkdir build && cd build
- $ cmake ../src
- $ make
- $ sudo ./dma-example -b 0
- The above command reports the latency of GPU reads from CPU memory. For more details, please refer to sw/README.md.
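Because the XDMA and GDR drivers must be reinstalled after every reboot, the driver commands listed above can be wrapped in a small helper. This is only a sketch: `run`, `reload_drivers`, and the `DRY_RUN` variable are illustrative names that do not exist in the repository, and the paths assume the checkout layout used in the commands above.

```shell
#!/bin/sh
# Hypothetical post-reboot helper; not shipped with FpgaNIC.
# With DRY_RUN=1, each command is printed instead of executed.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

reload_drivers() {
    # $1: path to the FpgaNIC checkout (defaults to ./FpgaNIC)
    repo=${1:-FpgaNIC}

    # XDMA driver: build in the driver directory, then load the module.
    run make -C "$repo/driver" || return 1
    run sudo insmod "$repo/driver/xdma_driver.ko" || return 1

    # GDR driver: insmod.sh is invoked from inside the gdrcopy directory.
    run sh -c "cd '$repo/gdrcopy' && sudo ./insmod.sh" || return 1
}

# Example (prints the commands without running them):
#   DRY_RUN=1; reload_drivers "$HOME/FpgaNIC"
```

Running it once with `DRY_RUN=1` shows exactly which commands would be issued before anything touches the kernel.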
If you use FpgaNIC in your paper, please cite our work:
@inproceedings{wang_atc22,
title={FpgaNIC: An FPGA-based Versatile 100Gb SmartNIC for GPUs},
author={Zeke Wang and Hongjing Huang and Jie Zhang and Fei Wu and Gustavo Alonso},
year={2022},
booktitle={2022 USENIX Annual Technical Conference (ATC)},
}