## Build example project
## Prerequisites
```
* GPU and FPGA are under the same PCIe switch.
* Linux 4.15.0-20-generic; Nvidia Driver Version: 450.51.05; CUDA Version: 11.0
```
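The prerequisites above can be checked from a shell before building. The following is a minimal sketch and not part of the repository: `compare_version` and `report_environment` are hypothetical helper names, and the sketch treats the listed versions as the tested configuration rather than as strict requirements. It assumes `nvidia-smi` and `lspci` are installed.

```shell
#!/bin/sh
# Hypothetical helper (not part of FpgaNIC): compare a detected version
# string with the one listed in the prerequisites and print the result.
compare_version() {
    # $1 = label, $2 = detected value, $3 = value listed in this README
    if [ "$2" = "$3" ]; then
        echo "OK: $1 is $2"
    else
        echo "NOTE: $1 is $2 (README lists $3)"
    fi
}

# Print kernel and driver versions, then the PCIe topology so that the
# GPU/FPGA placement can be inspected by hand.
report_environment() {
    compare_version "kernel" "$(uname -r)" "4.15.0-20-generic"
    if command -v nvidia-smi >/dev/null 2>&1; then
        driver="$(nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n 1)"
        compare_version "NVIDIA driver" "$driver" "450.51.05"
        nvidia-smi topo -m      # GPU-centric connectivity matrix
    else
        echo "NOTE: nvidia-smi not found"
    fi
    if command -v lspci >/dev/null 2>&1; then
        lspci -tv               # PCIe tree; GPU and FPGA should share a switch
    fi
}
```

After sourcing the file, run `report_environment` and confirm in the `lspci -tv` tree that the GPU and the FPGA hang off the same PCIe switch.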
## Getting Started
```
$ git clone https://github.com/RC4ML/FpgaNIC.git
$ cd FpgaNIC
$ git submodule update --init --recursive
```
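To confirm that the submodules were fetched, the status listing can be inspected: `git submodule status` prefixes entries that are not yet initialized with `-`. A minimal sketch, where `count_uninitialized` is a hypothetical helper that reads that listing from standard input:

```shell
#!/bin/sh
# Count submodules that are not yet initialized. `git submodule status`
# marks such entries with a leading "-". Reads the listing from stdin.
count_uninitialized() {
    # grep -c exits non-zero when the count is 0, so mask the exit status.
    grep -c '^-' || true
}
```

Run from the repository root as `git submodule status --recursive | count_uninitialized`; a result of `0` means every submodule is initialized.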
## Build FPGA Project
Generating a bitstream takes about 1-2 hours. You can skip this step and use the pre-generated bitstream files in the bitstream folder instead.
### Prerequisites
- Xilinx Vivado 2020.1
- Ubuntu (other Linux distributions have not been tested)
### Supported boards
- Xilinx Alveo U280
### Steps for Building an FPGA Bitstream for the direct mode
#### 1. Create build directory
```
$ mkdir build
$ cd build
```
#### 2. Configure XDMA project build
```
$ cmake ..
```
#### 3. Make HLS IP Core
```
$ make installip
```
#### 4. Create Vivado project (choose one project to create)
##### a. Create direct project (Figure 6)
```
$ make direct
```
##### b. Create pcie_benchmark project (Figures 3 and 4)
```
$ make pcie_benchmark
```
##### c. Create tcp_latency project (Figure 5a)
```
$ make tcp_latency
```
##### d. Create tcp_benchmark project (Figure 5b)
```
$ make tcp_benchmark
```
##### e. Create allreduce project (Figures 7 and 8)
```
$ make allreduce
```
##### f. Create hyperloglog project (Figure 9)
```
$ make hyperloglog
```
#### 5. Generate bitstream
Open the project in Vivado 2020.1 and generate the bitstream.
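Steps 1-4 above can be wrapped in one function. This is a sketch only: `is_valid_target` and `build_project` are hypothetical helpers, not part of the repository. The target names are the make targets listed in step 4, and the function is assumed to be run from the repository root.

```shell
#!/bin/sh
# Accept only the make targets listed in step 4.
is_valid_target() {
    case "$1" in
        direct|pcie_benchmark|tcp_latency|tcp_benchmark|allreduce|hyperloglog)
            return 0 ;;
        *)
            return 1 ;;
    esac
}

# Run steps 1-4 for one project. The body runs in a subshell so the
# caller's working directory is left unchanged.
build_project() (
    target="$1"
    if ! is_valid_target "$target"; then
        echo "unknown project: $target" >&2
        exit 1
    fi
    mkdir -p build || exit 1        # step 1: create build directory
    cd build || exit 1
    cmake .. || exit 1              # step 2: configure
    make installip || exit 1        # step 3: build the HLS IP cores
    make "$target"                  # step 4: create the Vivado project
)
```

For example, `build_project tcp_latency` creates the project for Figure 5a. Bitstream generation (step 5) is still done in Vivado.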
## Download the bitstream to FPGAs
### 1. Connect to the download server
```
$ ssh -p 6000 atc_bitstream@101.37.28.229
```
### 2. Open Vivado
Downloading the bitstream requires the Vivado GUI, so use a terminal that supports X11 forwarding, such as MobaXterm.
```
$ vivado
```
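With a plain OpenSSH client instead of MobaXterm, X11 forwarding can be requested explicitly with the `-X` option. The sketch below assumes an X server is running on the local machine; `display_available` and `start_vivado_gui` are hypothetical helpers meant to be run on the download server.

```shell
#!/bin/sh
# Step 1 with X11 forwarding requested explicitly (OpenSSH -X option):
#   ssh -X -p 6000 atc_bitstream@101.37.28.229

# Succeeds when a display has been forwarded into this session.
display_available() {
    [ -n "${DISPLAY:-}" ]
}

# Start the Vivado GUI only when a display is available.
start_vivado_gui() {
    if display_available; then
        vivado
    else
        echo "DISPLAY is not set; reconnect with X11 forwarding enabled" >&2
        return 1
    fi
}
```

If `start_vivado_gui` reports that `DISPLAY` is not set, the GUI cannot open and the connection from step 1 needs to be re-established with forwarding enabled.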
### 3. Open Hardware Manager
Click "Open Hardware Manager", as shown below.
![image](https://github.com/RC4ML/FpgaNIC/blob/gpu_hll/img/openhw.jpg)
### 4. Open target
Click "Open target", then "Open New Target...".
![image](https://github.com/RC4ML/FpgaNIC/blob/gpu_hll/img/opentar.jpg)
Click "Next"
![image](https://github.com/RC4ML/FpgaNIC/blob/gpu_hll/img/opentar1.jpg)
Choose "Remote server", set "Host name" to 192.168.189.23 and "Port" to 3121, then click "Next".
![image](https://github.com/RC4ML/FpgaNIC/blob/gpu_hll/img/opentar2.jpg)
Click "Next", then "Finish".
![image](https://github.com/RC4ML/FpgaNIC/blob/gpu_hll/img/opentar3.jpg)
### 5. Download the bitstream
Right-click "xilinx_tcf/Xilinx/221770205K038A" (server atc_m4) and click "Open Target".
![image](https://github.com/RC4ML/FpgaNIC/blob/gpu_hll/img/downbit1.jpg)
Right-click "xcu280_u55_0" and click "Program Device...".
![image](https://github.com/RC4ML/FpgaNIC/blob/gpu_hll/img/downbit2.jpg)
Click "..." to choose the bitstream
![image](https://github.com/RC4ML/FpgaNIC/blob/gpu_hll/img/downbit3.jpg)
Choose the bitstream for the experiment, such as pcie_benchmark.bit for Figure 3. The bitstream directory is /home/atc_bitstream/bitstream.
![image](https://github.com/RC4ML/FpgaNIC/blob/gpu_hll/img/opentar5.jpg)
Click "Program" to download the bitstream to the FPGA.
![image](https://github.com/RC4ML/FpgaNIC/blob/gpu_hll/img/downbit4.jpg)
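The figure-to-project mapping from "Build FPGA Project" can be written down as a small lookup to help pick the right file. This is a sketch with hypothetical helper names. It assumes every bitstream is named `<project>.bit`, by analogy with pcie_benchmark.bit; only that one file name is confirmed above.

```shell
#!/bin/sh
# Map a figure label to the project that reproduces it, following the
# project list in "Build FPGA Project".
project_for_figure() {
    case "$1" in
        3|4)  echo pcie_benchmark ;;
        5a)   echo tcp_latency ;;
        5b)   echo tcp_benchmark ;;
        6)    echo direct ;;
        7|8)  echo allreduce ;;
        9)    echo hyperloglog ;;
        *)    echo "no project listed for figure $1" >&2; return 1 ;;
    esac
}

# Assumption: each bitstream is named <project>.bit inside the bitstream
# directory on the download server.
bitstream_path() {
    project="$(project_for_figure "$1")" || return 1
    echo "/home/atc_bitstream/bitstream/${project}.bit"
}
```

For example, `bitstream_path 5a` prints the expected location of the tcp_latency bitstream; check that the file exists before selecting it in the "Program Device" dialog.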
### 6. Download the bitstream to the other machine
Open another terminal and repeat steps 1-5. In step 5, right-click "xilinx_tcf/Xilinx/221770202700VA" (server atc_m7) instead. Then download the same bitstream as for server atc_m4.
![image](https://github.com/RC4ML/FpgaNIC/blob/gpu_hll/img/opentar6.jpg)
### 7. Reboot servers atc_m4 and atc_m7
After the bitstreams have been downloaded to both FPGAs, open a terminal on atc_m4 and on atc_m7, and reboot each server:
```
$ sudo reboot
```
The driver is loaded automatically after the reboot, and the applications can then be run from the ./sw or ./sw_dev directory.
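After the reboot, a quick check that the driver module is present can save debugging time later. The sketch below assumes the module is named `xdma`, which is not confirmed here; see driver/README.md for the actual name. `module_loaded` and `check_after_reboot` are hypothetical helpers.

```shell
#!/bin/sh
# Succeed when a kernel module appears in /proc/modules.
# $1 = module name, $2 = optional alternative listing file (for testing).
module_loaded() {
    grep -q "^$1 " "${2:-/proc/modules}"
}

# Assumption: the driver module is called "xdma".
check_after_reboot() {
    if module_loaded xdma; then
        echo "xdma module is loaded"
    else
        echo "xdma module is not loaded" >&2
        return 1
    fi
    lspci -d 10ee:    # list PCIe devices with the Xilinx vendor ID (10ee)
}
```

Run `check_after_reboot` on both atc_m4 and atc_m7 before starting an experiment.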
## Build XDMA Driver
```
$ cd driver/
```
Follow driver/README.md to build the driver and load it with insmod.
## Build Software Project
```
$ cd sw/
```
Follow sw/README.md to build the software project and run the applications.
Note: Some of the experimental programs are in the ./sw_dev folder, but all of the experiment instructions are in sw/README.md.
```
$ cd sw_dev/
```