In this exercise you will integrate a new hardware "accelerator" into pulpissimo using reggen. For this purpose we prepared a dummy hw accelerator that performs arithmetic operations on very long data words (256-bits) simultaneously. The module has a non standard interface and must first be wrapped and connected to a register file so we can control it through normal loads and stores from the core. For that purpose we already prepared register file description file in hjson format usable by reggen. This will allow us to simultaneously generate the register file for our IP wrapper as well a generating documentation and C header files for the hardware abstraction layer (HAL). After wrapping the IP with a pulp compatible interface (we will use an APB plug and some protocol converter from some PULP IP library) we attach it to APB crossbar in pulp_soc and add the appropriate address rules for our peripheral.
Finally, after integating the IP into PULPissimo/pulp_soc we will proceed with writing into the sdk a small driver to interact with the IP using the RISC-V core in PULPissimo.
This exercise is quite complex to solve but it will guide you through all the necessary steps to integrate arbitrary IPs into pulp_soc. Although this particular IP will only contain a register slave port for communication, using the knowledge from the previous exercise (memory layout modification) it should not be hard to adapt the flow for IPs that also require master ports to the shared memory.
-
Familiarize yourself with the module we are going to integrate. Go through the toplevel signals and try to get a rough idea what the wide alu module does. As mentioned before, its taks is to accept two 256-bit operands and apply a user configurable operation (multiply, add, or etc.) on the operand and provide a 512-bit result. In addition to that, the wide-alu also allows to program a deacceleartion factor which is just a fancy name for a user configurable delay, how many clock cycles later the result should be availble (I know... a pretty useless "feature" 😄).
-
Go through the register file descripion
wide_alu.hjson
. Consult the documention on the opentitan website on the specifics of the reggen hjson format. It is not important to uderstand every detail but debugging will be easier if you have a rough idea what registers are used to map to the different signals of thewide_alu
module.Of course, if you were wrapping your own IP you would have to write this hjson file yourself.
-
Create a new github repository with the files of this directory and add a
Bender.yml
description file.We do not have any dependencies to other IPs yet but this will change as soon as we add the wrapper to our IP. Add the wide_alu dependency to the pulp_soc you are working on. From the pulpissimo folder do
./bender clone pulp_soc
. This will create the folderworking_dir/pulp_soc
and generate the Bender.lock which will tell bender to fetch pulp_soc from the new directory instead than github. Modify the pulp_soc's Bender.yml. From the pulpissimo folder:make checkout make scripts make build
These three steps will alwasy be needed when modifying the Bender.yml of one of the repositories in the
working_dir
. -
We are now going to write a wrapper for our wide_alu ip: In the directory of your new wide_alu repository, create a new SV module called
wide_alu_top
. The module should have the following portlist:- clk_i, rst_ni, test_mode_i
- A single register slave port (use the
reg_req_t
andreg_rsp_t
SystemVerilog struct). This interface is part of the register_interface IP so you can already add it as a dependency to yourBender.yml
(make sure to use the same version as pulp_soc in order to avoid version conflicts).register_interface: { git: "https://github.com/pulp-platform/register_interface.git", version: 0.3.1 }
and the following parameters:
#( parameter type reg_req_t = logic, parameter type reg_rsp_t = logic )
-
Generate the register file using the reggen tool. This exercise directory contains a link to the python tool that was patched by ETH for usage of the "Generic Register Interface Protocol" a much simpler alternative to the TileLink Protocol the lowRISC normally uses for their IPs.
git submodule --init --recursive ./register_interface/vendor/lowrisc_opentitan/util/regtool.py -r -t wide_alu.hjson
Have a look at the generated SystemVerilog package and the portlist of the generated register file. Include them in the Bender.yml of your wide_alu IP.
-
Declare wiring signals that will connect the register file with the actual wide_alu ip with the following snippet:
wide_alu_reg2hw_t reg_file_to_ip; wide_alu_hw2reg_t ip_to_reg_file;
These are once again structs. But this time, they are autogenerated by reggen. Regardless, you have to import the struct from the autogenerated SystemVerilog Package at the beginning of your module (Please not at the beginning of the file. Remember what I said about imports into the compilation unit scope!)
-
Instantiate the wide_alu IP. The two wiring signal structs we just created contain subfields for each register in our hjson and with each of those register fields, they contain fiels for the individual bitfields of our registers. Our last task for the wrapper is to connect those bitfields from the register file to the corresponding ports of our
wide_alu
ip. We provide you with the first part of the instantiation:wide_alu i_wide_alu( .clk_i, .rst_ni, .trigger_i(reg_file_to_ip.ctrl1.trigger.q & reg_file_to_ip.ctrl1.trigger.qe), .clear_err_i(reg_file_to_ip.ctrl1.clear_err.q & reg_file_to_ip.ctrl1.clear_err.qe), .op_a_i(reg_file_to_ip.op_a), .op_b_i(reg_file_to_ip.op_b), .result_o(ip_to_reg_file.result), .deaccel_factor_we_i(reg_file_to_ip.ctrl2.delay.qe), .deaccel_factor_i(reg_file_to_ip.ctrl2.delay.q), .deaccel_factor_o(ip_to_reg_file.ctrl2.delay.d), .op_sel_we_i(reg_file_to_ip.ctrl2.opsel.qe), .op_sel_i(wide_alu_pkg::optype_e'(reg_file_to_ip.ctrl2.opsel.q)), ...
The last two port connections are missing. Try to figure out yourself how you can connect those to the register file.
You might notice that for some signals we use only the
q
andd
signal, while we additionally use theqe
signal for others. Have a look at the OpenTitan documentaion on the register tool about the difference betweenhwext: True
registers and regular registers to figure out why.
-
Commit your changes and add your wide_alu repo as a dependency of
pulp_soc
. In the case you'll find a bug while integrating your wide_alu_top, you'll be able to work on it with the./bender clone wide_alu
command from the pulpissimo repository. As usual, to update yourwork
library for questasim you'll need to do:make checkout make scripts make build
-
Since we want to plug the IP to the APB crossbar with the same register interface we will need a protocol converter from APB to the "generic register interface". Luckily such a protocol converter is already availble in the register_interface repository on github. Have a look at the corresponding module. We will use this module when inserting the
wide_alu_top
in thepulp_soc/rtl/soc_peripherals.sv
-
Let's add an APB_SLAVE. Go to
rtl/includes/periph_bus_defines.sv
in pulpissimo and add a slave increasing NB_MASTER from 11 to 12. Then, add it as in the following address space:`define WIDE_ALU_START_ADDR 32'h1A12_0000 `define WIDE_ALU_END_ADDR 32'h1A12_0FFF
in
rtl/pulp_soc/periph_bus_wrap.sv
we have the APB crossbar. Let's plug the slave and expose the additional APB_BUS.Master wide_alu_master interface to the top.`APB_ASSIGN_MASTER(s_masters[11], wide_alu_master); assign s_start_addr[11] = `WIDE_ALU_START_ADDR; assign s_end_addr[11] = `WIDE_ALU_END_ADDR;
-
In the
rtl/pulp_soc/soc_peripherals.sv
we will instantiate thewide_alu_top
and theapb_to_register
converter. Unfortunately, the autogenerated register_file does not contain a neat wrapper the converts form SystemVerilog Interface to structs. We thus have to do this step ourself; Use the following snippet to do the conversion:
//Convert the REG_BUS interface to the struct signals used by the autogenerated interface
localparam RegAw = 6;
localparam RegDw = 32;
typedef logic [RegAw-1:0] reg_addr_t;
typedef logic [RegDw-1:0] reg_data_t;
typedef logic [RegDw/8-1:0] reg_strb_t;
`REG_BUS_TYPEDEF_REQ(reg_req_t, reg_addr_t, reg_data_t, reg_strb_t)
`REG_BUS_TYPEDEF_RSP(reg_rsp_t, reg_data_t)
reg_req_t reg_req;
reg_rsp_t reg_rsp;
`REG_BUS_ASSIGN_TO_REQ(reg_req,wide_alu_reg_bus)
`REG_BUS_ASSIGN_FROM_RSP(wide_alu_reg_bus,reg_rsp)
apb_to_reg i_apb_to_wide_alu
(
.clk_i ( clk_i ),
.rst_ni ( rst_ni ),
.penable_i ( s_wide_alu.penable ),
.pwrite_i ( s_wide_alu.pwrite ),
.paddr_i ( s_wide_alu.paddr ),
.psel_i ( s_wide_alu.psel ),
.pwdata_i ( s_wide_alu.pwdata ),
.prdata_o ( s_wide_alu.prdata ),
.pready_o ( s_wide_alu.pready ),
.pslverr_o ( s_wide_alu.pslverr ),
.reg_o ( wide_alu_reg_bus )
);
This is some macro magic that declares the structs for the "generic register interface protocol" with the correct bitwidth, declares signals of that struct type and assigns the signals of the interface to the struct signals and vice-versa. You will have to add the following includes to the top of your wrapper file for this to work:
`include "register_interface/typedef.svh"
`include "register_interface/assign.svh"
Don't forget to register the corresponding include directory in your src_files.yml
.
-
Instantiate your wide alu wrapper
wide_alu_top
in the same file. -
Commit your changes to pulp_soc to a new fork (or branch if you already have a fork) of
pulp_soc
and modify the version inBender.yml
of pulpissimo. -
That's all. The ip is now integrated. Update all ips, regenerate the TCL scripts and try to build pulpissimo with the integrated IP. Check the log to make sure it actually fetches the ips_list.yml of your
wide_alu
repository. Get rid of all remaining errors during compilation and makes sure pulpissimo builds without any errors (some warnings are ok).
Now that we are done integrating the IP into the HW, lets write a small driver to actually use it without hardcoding register addresses ourselves.
You are given an unfinished driver. You need to integrate it into pulp-sdk
and complete the implementation. If you do everything correctly, running the
given example program (test.c
) should return SUCCESS
You are given the following files
wide_alu_v1_hal.c
: contains partial implementation of the wide_alu driverwide_alu_v1_archi.h
: contains the header file for the partial implementaiton of the wide_alu driver
Follow these steps
-
Integrate the given put the driver into pulp-sdk by
- Moving the header file in
pulp-sdk/rtos/pulpos/pulp_archi/include/archi/wide_alu/
- Moving the implementation file into
pulp-sdk/rtos/pulpos/pulp_hal/include/hal/wide_alu/
- Patching up
properties.h
,memory_map.h
andinclude/pulp.h
in both HAL locations
- Moving the header file in
-
Finish the driver implementation and get the
test.c
program to print the correct result. In order to do that you are required to implement the following functions.int wide_multiply(uint32_t *a, uint32_t *b, uint32_t *result)
Before you start, studytest.c
, the already given driver files and header file.
When you run
test.c
it should printSUCCESS
to stdout.