Implement delayed branches in Functional simulation #626
Comments
It is not clear where exactly I should add this option for delay slots. |
I don't know either. I suggest starting by implementing it in FuncSim. |
So I should add a bool enable_delayed_branches argument here? Or in the constructor? |
Usually the constructor is preferable, as the configuration is then accessible in all class methods. |
So I should add a function that replaces instructions in the instruction list when the delay slots option is turned on? |
First of all, I advise making sure you have a test trace that fails without delayed branches.
What is the "instruction list"? |
The list of instructions that the run method needs to process. I mean, swapping instructions shouldn't be the simulator's job, should it? As I see it, the order of the instructions should be changed in the constructor, as the compiler would have done beforehand. Is that right? P.S. Or do I have to work in a model where the simulator knows only the instructions it has already decoded? But then it is impossible to use the delayed branch method. In the lecture we were told that finding independent instructions is done by the compiler, not by the pipeline. So is the point of this issue to figure out an algorithm for finding independent instructions? |
And is there any branch prediction functionality in the functional simulator now? |
Yes, it is clearly stated in the README file
No, we do not have such a list. We fetch instructions one by one from memory, as the program counter of the next instruction depends on the result of the previous one.
Right. But our simulator does not assume the compiler may do this at the moment, so instead of this:
we get this:
which is not what the compiler wanted us to do.
The point is to execute 1 or 2 instructions after the branch regardless of the branch condition, i.e. the program counter change should take effect with a delay. |
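A minimal sketch of that delayed program counter update, assuming a toy instruction set rather than the real MIPT-MIPS classes. The names ToyFuncSim, Instr, Op and the delay_slots constructor argument are hypothetical and only illustrate the idea: a taken jump records its target, the next 0, 1 or 2 instructions still execute sequentially, and only then does the PC change.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical toy ISA for illustration; not the MIPT-MIPS instruction class.
enum class Op { Nop, Jump, Halt };

struct Instr {
    Op op;
    std::size_t target; // instruction index, meaningful for Jump only
};

class ToyFuncSim {
public:
    // delay_slots is the assumed configuration option (0, 1 or 2),
    // stored in the constructor so every method can see it.
    ToyFuncSim(std::vector<Instr> program, unsigned delay_slots)
        : program_(std::move(program)), delay_slots_(delay_slots) {
        assert(delay_slots <= 2);
    }

    // Executes the program and returns the indices of executed instructions.
    std::vector<std::size_t> run(std::size_t max_steps = 100) const {
        std::vector<std::size_t> trace;
        std::size_t pc = 0;
        bool jump_pending = false;      // a taken jump is waiting for its slots
        std::size_t pending_target = 0; // where that jump will land
        unsigned slots_left = 0;        // delay-slot instructions still to run

        for (std::size_t step = 0; step < max_steps && pc < program_.size(); ++step) {
            const Instr& instr = program_[pc];
            trace.push_back(pc);
            if (instr.op == Op::Halt)
                break;

            std::size_t next_pc = pc + 1; // sequential by default

            if (jump_pending) {
                // This instruction sat in a delay slot. A Jump found here is
                // ignored in this sketch; real MIPS leaves that case undefined.
                if (--slots_left == 0) {
                    next_pc = pending_target;
                    jump_pending = false;
                }
            } else if (instr.op == Op::Jump) {
                if (delay_slots_ == 0) {
                    next_pc = instr.target; // no slots: PC changes immediately
                } else {
                    jump_pending = true;    // PC change takes effect later
                    pending_target = instr.target;
                    slots_left = delay_slots_;
                }
            }
            pc = next_pc;
        }
        return trace;
    }

private:
    std::vector<Instr> program_;
    unsigned delay_slots_;
};
```

With a program of a jump at index 0 targeting index 3, two nops, and a halt at index 3, the sketch executes indices 0, 3 with no slots, 0, 1, 3 with one slot, and 0, 1, 2, 3 with two slots.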
Sorry for the stupid questions, but I can't find anything that explains what I should do here. There are no tests for delay slots in the traces directory and I don't understand what the .set noreorder directive is. |
GitHub has a nice feature called "search": By default, the MIPS compiler/assembler assumes the CPU (or simulator) supports branch delays and therefore finds independent instructions to fill the delay slot. Since MIPT-MIPS does not support them, we had to disable them by setting the |
A more detailed explanation of how the compiler/assembler works: |
I was searching only in this repository, which is why I couldn't find anything, sorry. So after I remove all .set noreorder directives, I will be able to cover all the reorders by implementing delayed branches? |
Actually, I never investigated deeply how the assembler works and what breaks in the torture tests. I just disabled reordering and TT started to work. Let's start with a simpler test provided on SO:
.set noreorder
.section .text
.globl __start
__start:
addi $a0, $0, 100
addi $a1, $0, 200
jal test
jr $zero # Required to halt simulation
test:
add $v0, $a0, $a1
jr $ra
If we build it with the current flow, we should get something like the following code (you can check it with
We need to get something like what the SO author had:
|
So in order to test it I need to use MIPS binutils or something else? |
Yes, that's the only way to build ELF files from MIPS assembly. |
I guess I should make a copy of all the tests with noreorder, because both the functional simulator and the performance simulator use the same tests, and put the new path here:
|
If you want to run these tests with |
This is what the disassembly looks like:
There are 2 instructions after the jump, and these 2 are always filled with nops or independent instructions, so simulation should work without any pipeline flushing or whatever, am I right? |
We do not touch the pipeline at the moment; we are talking only about functional simulation (single-cycle implementation, func_sim.cpp).
Please run the functional simulation on your own and see what it does. |
I've just understood that the functional simulator is single-cycle, but then it can't face any branch hazards. So what does 'implementing delayed branches' stand for if we are talking about a single-cycle simulator? |
Please, do this and share what instructions are executed by default ( |
I see the problem. So as soon as the simulator meets a branch, it should execute 2 more instructions before jumping? |
That's the problem with the delayed branches method Oleg explained in the lecture: while it works well in a pipelined microarchitecture, it still has to be supported in other implementations, such as single-cycle or more complicated ones. |
Why do we need to optionally select 0, 1 or 2 if the compiler always puts 2 instructions after the branch? |
So far I don't see the 2-instruction case, I always see 1. Please explain if I'm wrong.
|
And, finally, we won't have a branch delay slot in PerfSim for some time, so we have to keep the no-slot option to ensure both simulators behave the same. |
Should this delay slots option ideally be one of the command-line parameters, or should I just add it as a constructor parameter as you said? |
Do the best you can, we'll refactor that later. |
Is alu.h related only to the functional simulator? |
No. |
Committing 1 point as something like a solution was proposed |
Closing as duplicate of #737 |
The delayed branch is one of the (obsolete) ways to address control hazards. However, it exists in MIPS and cannot be removed, because a lot of existing binary files rely on it.
The idea is to place an instruction that does not depend on the branch right after it, so that it is executed whether or not the branch is taken. Then, in the case of a branch misprediction, that instruction is not flushed from the pipeline:
Let's assume the following control flow:
Pipeline without delayed branches:
We save one cycle with a single branch delay slot:
We save two cycles with two delay slots:
How can we support a branch delay slot in the micro-architecture? The steps are straightforward:
.set noreorder
directive in MIPS assembly files.