Merge pull request #204 from stnolting/cpu_logic_optimization
CPU logic optimization
stnolting authored Nov 14, 2021
2 parents ce2b4b1 + 11140f2 commit 5bb4782
Showing 6 changed files with 499 additions and 558 deletions.
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -26,6 +26,7 @@ defined by the `hw_version_c` constant in the main VHDL package file [`rtl/core/

| Date (*dd.mm.yyyy*) | Version | Comment |
|:----------:|:-------:|:--------|
| 14.11.2021 | 1.6.3.7 | major control unit and ALU logic optimizations, reduced hardware footprint; closed further illegal instruction encoding holes (system environment instructions, ALU and ALU-immediate instructions, FENCE instructions); [PR #204](https://github.com/stnolting/neorv32/pull/204) |
| 10.11.2021 | 1.6.3.6 | optimized BUSKEEPER: removed redundant logic - bus keeper now also shows an external interface access timeout (if implemented) as "timeout error"; removed _BUSKEEPER_ERR_SRC_ status flag; :warning: added `err_o` (fault access operation) to the custom functions subsystem (CFS) |
| 09.11.2021 | 1.6.3.5 | :warning: reworked IRQ trigger logic of SPI, TWI, UART0, UART1, NEOLED and SLINK; FIRQs now only trigger **once** when the programmed interrupt condition is met instead of triggering **all the time** (see [PR #202](https://github.com/stnolting/neorv32/pull/202)) |
| 06.11.2021 | 1.6.3.4 | :bug: fixed bug in **WISHBONE** interface: _pipelined_ Wishbone mode did not clear STB after first transfer cycle |
21 changes: 12 additions & 9 deletions docs/datasheet/cpu.adoc
@@ -611,25 +611,28 @@ code (see `sw/example/floating_point_test`).

The CSR access instructions as well as the exception and interrupt system (= the privileged architecture)
are implemented when the `CPU_EXTENSION_RISCV_Zicsr` configuration generic is _true_.

[IMPORTANT]
If the `Zicsr` extension is disabled the CPU does not provide any _privileged architecture_ features at all!
In order to provide the full set of privileged functions that are required to run more complex tasks like
operating systems and to allow a secure execution environment, the `Zicsr` extension should always be enabled.

In this case the following instructions are available:

* CSR access: `csrrw`, `csrrs`, `csrrc`, `csrrwi`, `csrrsi`, `csrrci`
* environment: `mret`, `wfi`

[WARNING]
If the `Zicsr` extension is disabled the CPU does not provide any _privileged architecture_ features at all!
In order to provide the full set of functions and to allow a secure execution
environment the `Zicsr` extension should always be enabled.
[NOTE]
If `rd=x0` for the `csrrw[i]` instructions there will be no actual read access to the according CSR.
However, access privileges are still enforced so these instruction variants _do_ cause side-effects
(the RISC-V spec. states that these combinations "_shall_ not cause any side-effects").

[NOTE]
The "wait for interrupt instruction" `wfi` works like a sleep command. When executed, the CPU is
The "wait for interrupt instruction" `wfi` acts like a sleep command. When executed, the CPU is
halted until a valid interrupt request occurs. To wake up again, the according interrupt source has to
be enabled via the `mie` CSR and the global interrupt enable flag in `mstatus` has to be set.

[NOTE]
The `wfi` instruction may also be executed in user-mode without causing an exception as <<_mstatus>> bit
`TW` (timeout wait) is hardwired to zero.

`TW` (timeout wait) is _hardwired_ to zero.



5 changes: 4 additions & 1 deletion rtl/core/neorv32_cpu.vhd
@@ -156,6 +156,7 @@ architecture neorv32_cpu_rtl of neorv32_cpu is
signal be_store : std_ulogic; -- bus error on store data access
signal fetch_pc : std_ulogic_vector(data_width_c-1 downto 0); -- pc for instruction fetch
signal curr_pc : std_ulogic_vector(data_width_c-1 downto 0); -- current pc (for current executed instruction)
signal next_pc : std_ulogic_vector(data_width_c-1 downto 0); -- next pc (for next executed instruction)
signal fpu_flags : std_ulogic_vector(4 downto 0); -- FPU exception flags

-- pmp interface --
@@ -285,6 +286,7 @@ begin
imm_o => imm, -- immediate
fetch_pc_o => fetch_pc, -- PC for instruction fetch
curr_pc_o => curr_pc, -- current PC (corresponding to current instruction)
next_pc_o => next_pc, -- next PC (corresponding to next instruction)
csr_rdata_o => csr_rdata, -- CSR read data
-- FPU interface --
fpu_flags_i => fpu_flags, -- exception flags
@@ -355,7 +357,8 @@ begin
-- data input --
rs1_i => rs1, -- rf source 1
rs2_i => rs2, -- rf source 2
pc2_i => curr_pc, -- delayed PC
pc_i => curr_pc, -- current PC
pc2_i => next_pc, -- next PC
imm_i => imm, -- immediate
csr_i => csr_rdata, -- CSR read data
-- data output --
91 changes: 41 additions & 50 deletions rtl/core/neorv32_cpu_alu.vhd
@@ -60,7 +60,8 @@ entity neorv32_cpu_alu is
-- data input --
rs1_i : in std_ulogic_vector(data_width_c-1 downto 0); -- rf source 1
rs2_i : in std_ulogic_vector(data_width_c-1 downto 0); -- rf source 2
pc2_i : in std_ulogic_vector(data_width_c-1 downto 0); -- delayed PC
pc_i : in std_ulogic_vector(data_width_c-1 downto 0); -- current PC
pc2_i : in std_ulogic_vector(data_width_c-1 downto 0); -- next PC
imm_i : in std_ulogic_vector(data_width_c-1 downto 0); -- immediate
csr_i : in std_ulogic_vector(data_width_c-1 downto 0); -- CSR read data
-- data output --
Expand All @@ -85,10 +86,8 @@ architecture neorv32_cpu_cpu_rtl of neorv32_cpu_alu is

-- results --
signal addsub_res : std_ulogic_vector(data_width_c downto 0);
--
signal alu_res : std_ulogic_vector(data_width_c-1 downto 0);
signal cp_res : std_ulogic_vector(data_width_c-1 downto 0);
signal arith_res : std_ulogic_vector(data_width_c-1 downto 0);
signal logic_res : std_ulogic_vector(data_width_c-1 downto 0);

-- co-processor arbiter and interface --
type cp_ctrl_t is record
@@ -119,7 +118,7 @@ begin

-- ALU Input Operand Mux ------------------------------------------------------------------
-- -------------------------------------------------------------------------------------------
opa <= pc2_i when (ctrl_i(ctrl_alu_opa_mux_c) = '1') else rs1_i; -- operand a (first ALU input operand), only required for arithmetic ops
opa <= pc_i when (ctrl_i(ctrl_alu_opa_mux_c) = '1') else rs1_i; -- operand a (first ALU input operand), only required for arithmetic ops
opb <= imm_i when (ctrl_i(ctrl_alu_opb_mux_c) = '1') else rs2_i; -- operand b (second ALU input operand)


Expand All @@ -136,31 +135,55 @@ begin
op_a_v := (opa(opa'left) and (not ctrl_i(ctrl_alu_unsigned_c))) & opa;
op_b_v := (opb(opb'left) and (not ctrl_i(ctrl_alu_unsigned_c))) & opb;
-- add/sub(slt) select --
if (ctrl_i(ctrl_alu_addsub_c) = '1') then -- subtraction
if (ctrl_i(ctrl_alu_op0_c) = '1') then -- subtraction
op_y_v := not op_b_v;
cin_v(0) := '1';
else -- addition
op_y_v := op_b_v;
cin_v(0) := '0';
end if;
-- adder core (result + carry/borrow) --
-- adder core --
addsub_res <= std_ulogic_vector(unsigned(op_a_v) + unsigned(op_y_v) + unsigned(cin_v(0 downto 0)));
end process binary_arithmetic_core;

-- direct output of address result --
-- direct output of adder result --
add_o <= addsub_res(data_width_c-1 downto 0);

-- ALU arithmetic logic core --
arithmetic_core: process(ctrl_i, addsub_res)

-- ALU Operation Select -------------------------------------------------------------------
-- -------------------------------------------------------------------------------------------
alu_core: process(ctrl_i, addsub_res, rs1_i, opb)
begin
if (ctrl_i(ctrl_alu_arith_c) = alu_arith_cmd_addsub_c) then -- ADD/SUB
arith_res <= addsub_res(data_width_c-1 downto 0);
else -- SLT
arith_res <= (others => '0');
arith_res(0) <= addsub_res(addsub_res'left); -- => carry/borrow
end if;
end process arithmetic_core;
case ctrl_i(ctrl_alu_op2_c downto ctrl_alu_op0_c) is
when alu_op_add_c => alu_res <= addsub_res(data_width_c-1 downto 0); -- (default)
when alu_op_sub_c => alu_res <= addsub_res(data_width_c-1 downto 0);
-- when alu_op_mova_c => alu_res <= rs1_i; -- FIXME
when alu_op_slt_c => alu_res <= (others => '0'); alu_res(0) <= addsub_res(addsub_res'left); -- => carry/borrow
when alu_op_movb_c => alu_res <= opb;
when alu_op_xor_c => alu_res <= rs1_i xor opb; -- only rs1 required for logic ops (opa would also contain pc)
when alu_op_or_c => alu_res <= rs1_i or opb;
when alu_op_and_c => alu_res <= rs1_i and opb;
when others => alu_res <= addsub_res(data_width_c-1 downto 0);
end case;
end process alu_core;

-- ALU Function Select --------------------------------------------------------------------
-- -------------------------------------------------------------------------------------------
alu_function_mux: process(ctrl_i, alu_res, pc2_i, csr_i, cp_res)
begin
case ctrl_i(ctrl_alu_func1_c downto ctrl_alu_func0_c) is
when alu_func_core_c => res_o <= alu_res; -- (default)
when alu_func_nxpc_c => res_o <= pc2_i;
when alu_func_csrr_c => res_o <= csr_i;
when alu_func_copro_c => res_o <= cp_res;
when others => res_o <= alu_res; -- undefined
end case;
end process alu_function_mux;


-- **************************************************************************************************************************
-- Co-Processors
-- **************************************************************************************************************************

-- Co-Processor Arbiter -------------------------------------------------------------------
-- -------------------------------------------------------------------------------------------
@@ -193,7 +216,7 @@ begin
end process cp_arbiter;

-- is co-processor operation? --
cp_ctrl.cmd <= '1' when (ctrl_i(ctrl_alu_func1_c downto ctrl_alu_func0_c) = alu_func_cmd_copro_c) else '0';
cp_ctrl.cmd <= '1' when (ctrl_i(ctrl_alu_func1_c downto ctrl_alu_func0_c) = alu_func_copro_c) else '0';
cp_ctrl.start <= '1' when (cp_ctrl.cmd = '1') and (cp_ctrl.cmd_ff = '0') else '0';

-- co-processor select / start trigger --
Expand All @@ -209,38 +232,6 @@ begin
cp_res <= cp_result(0) or cp_result(1) or cp_result(2) or cp_result(3);


-- ALU Logic Core -------------------------------------------------------------------------
-- -------------------------------------------------------------------------------------------
alu_logic_core: process(ctrl_i, rs1_i, opb)
begin
case ctrl_i(ctrl_alu_logic1_c downto ctrl_alu_logic0_c) is
when alu_logic_cmd_movb_c => logic_res <= opb; -- (default)
when alu_logic_cmd_xor_c => logic_res <= rs1_i xor opb; -- only rs1 required for logic ops (opa would also contain pc)
when alu_logic_cmd_or_c => logic_res <= rs1_i or opb;
when alu_logic_cmd_and_c => logic_res <= rs1_i and opb;
when others => logic_res <= opb; -- undefined
end case;
end process alu_logic_core;


-- ALU Function Select --------------------------------------------------------------------
-- -------------------------------------------------------------------------------------------
alu_function_mux: process(ctrl_i, arith_res, logic_res, csr_i, cp_res)
begin
case ctrl_i(ctrl_alu_func1_c downto ctrl_alu_func0_c) is
when alu_func_cmd_arith_c => res_o <= arith_res; -- (default)
when alu_func_cmd_logic_c => res_o <= logic_res;
when alu_func_cmd_csrr_c => res_o <= csr_i;
when alu_func_cmd_copro_c => res_o <= cp_res;
when others => res_o <= arith_res; -- undefined
end case;
end process alu_function_mux;


-- **************************************************************************************************************************
-- Co-Processors
-- **************************************************************************************************************************

-- Co-Processor 0: Shifter (CPU Core ISA) --------------------------------------------------
-- -------------------------------------------------------------------------------------------
neorv32_cpu_cp_shifter_inst: neorv32_cpu_cp_shifter
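
A note on the adder core in the `neorv32_cpu_alu.vhd` hunk above: both operands are extended by one bit (the sign bit for signed compares, zero for unsigned ones) and subtraction is computed as `a + not(b) + 1`, so the top bit of the result (`addsub_res(addsub_res'left)`) is set exactly when `a < b`. That is why the new `alu_op_slt_c` branch can reuse the adder output directly. The self-checking testbench below is a minimal sketch of that idea and is not code from this commit; the entity name `slt_borrow_sketch_tb`, the function `less_than`, and `xlen_c = 32` (standing in for `data_width_c`) are assumptions made for illustration.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Illustrative sketch only (hypothetical names, not part of the NEORV32 sources).
entity slt_borrow_sketch_tb is
end entity slt_borrow_sketch_tb;

architecture sim of slt_borrow_sketch_tb is

  constant xlen_c : natural := 32; -- assumed stand-in for data_width_c

  -- Returns '1' if a < b; unsigned_i = '1' selects an unsigned compare.
  function less_than(a, b       : std_ulogic_vector(xlen_c-1 downto 0);
                     unsigned_i : std_ulogic) return std_ulogic is
    variable op_a_v, op_b_v, op_y_v : std_ulogic_vector(xlen_c downto 0);
    variable res_v                  : unsigned(xlen_c downto 0);
  begin
    -- extend by one bit: sign bit for signed compares, '0' for unsigned compares
    op_a_v := (a(a'left) and (not unsigned_i)) & a;
    op_b_v := (b(b'left) and (not unsigned_i)) & b;
    -- subtraction as a + not(b) + 1
    op_y_v := not op_b_v;
    res_v  := unsigned(op_a_v) + unsigned(op_y_v) + 1;
    -- top bit is the sign of the extended difference, i.e. the borrow
    return res_v(res_v'left);
  end function less_than;

begin

  check: process
    -- all ones: -1 when read as signed, 2^32-1 when read as unsigned
    constant all_ones_c : std_ulogic_vector(xlen_c-1 downto 0) := (others => '1');
    constant one_c      : std_ulogic_vector(xlen_c-1 downto 0) := std_ulogic_vector(to_unsigned(1, xlen_c));
  begin
    assert less_than(all_ones_c, one_c, '0') = '1' report "signed: -1 < 1 should hold" severity failure;
    assert less_than(all_ones_c, one_c, '1') = '0' report "unsigned: 2^32-1 < 1 should not hold" severity failure;
    assert less_than(one_c, all_ones_c, '1') = '1' report "unsigned: 1 < 2^32-1 should hold" severity failure;
    assert less_than(one_c, one_c, '0')      = '0' report "signed: 1 < 1 should not hold" severity failure;
    report "slt_borrow_sketch_tb: all checks passed" severity note;
    wait; -- stop the process
  end process check;

end architecture sim;
```

The sketch uses only `ieee.std_logic_1164` and `ieee.numeric_std`, so it should analyze and run in a standard VHDL simulator such as GHDL; any failing `assert` aborts the run with its message.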