run <assembly-filename>
[-s, --step-by-step](bool) (run interactively step by step. default: false)
[-v, --verbose](bool) (verbose on debug mode. default: false)
[-o, --output-folder](string) (output folder where to store debug and memory files)
[-c, --config-filename](string) (processor config filename, a valid config filename is required)
[--max-cycles](int) (maximum number of cycles to execute. default: 3000)
Sample: run samples/programs/fibonacci.asm -c samples/configs/default.config -o results/my-test --max-cycles 1000 --step-by-step -v
bin/
- Autogenerated binaries when the program is compiled
benchmark/
- Autogenerated output files/stats when benchmarks are executed
samples/
benchmark/
- sh scripts for executing the different benchmarks in the simulator
configs/
- list of different configurations with different architectures to be benchmarked
programs/
- list of different programs available to run benchmarks
src/ (source code)
github.com/codegangsta/cli/
- Open source library for command line application "styling".
app/ (Processor-simulator source code)
logger/
- Files for managing logging
simulator/
processor/
- Definition of the processor models, components, config and all business logic
standards/
- Definition of the standards used and their implementation (IEEE 754)
translator/
- Translator in charge of compiling the assembly file (.asm) and producing the machine code file (.hex)
main.go
- Entry point :)
utils/
- Go utilities
If the program is executed with the flag -s or --step-by-step, you will be able to see the state of the registers and/or data memory at the end of every executed step. If the flag is not provided, you will only see the final state at the end of the execution.
The following menu will be presented:
Press the desired key and then hit [ENTER]...
- (R) to see registers memory
- (D) to see data memory
- (E) to exit and quit
- (*) Any other key to continue
If R or D is selected, the data will be displayed in the following format:
0x00 0x04 0x08 0x0C
0x00 0x0000000A 0x0010000A 0x000C0000 0x00000000
0x10 0x000100E8 0x00000008 0x00000012 0x00000087
0x20 0x0000FF00 0x00000000 0x00D00068 0x002000A8
0x30 0x000000E8 0x0000C008 0x00000012 0x00000087
0x40 0x00000012 0x00000000 0x00100000 0x00000000
....
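The dump layout above (four 32-bit words per row, each row prefixed with the byte address of its first word) can be reproduced with a few lines of Go. This is only an illustrative sketch: `FormatWords` is a hypothetical helper, not a function taken from the simulator, and the single-space column separation is an assumption.

```go
package main

import (
	"fmt"
	"strings"
)

// FormatWords renders 32-bit words four per row. Each row starts with the
// byte address of its first word (hypothetical helper, not simulator code).
func FormatWords(words []uint32) string {
	var sb strings.Builder
	// Header with the byte offset of each column within a row.
	sb.WriteString("0x00 0x04 0x08 0x0C\n")
	for i := 0; i < len(words); i += 4 {
		// i is a word index, so the byte address of the row is i*4.
		fmt.Fprintf(&sb, "0x%02X", i*4)
		for j := i; j < i+4 && j < len(words); j++ {
			fmt.Fprintf(&sb, " 0x%08X", words[j])
		}
		sb.WriteByte('\n')
	}
	return sb.String()
}

func main() {
	words := []uint32{0xA, 0x10000A, 0xC0000, 0x0, 0x100E8, 0x8, 0x12, 0x87}
	fmt.Print(FormatWords(words))
}
```

To read a value, add the column offset to the row address: the word at row 0x10, column 0x08 lives at byte address 0x18.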
At the end, six files will be generated with details of the execution, debugging and final memory states.
The location of those output files can be selected with the flag -o or --output-folder:
- assembly.hex: Machine code interpreted by the processor
- memory.dat: Final state of the data memory.
- registers.dat: Final state of the registers.
- output.log: Execution resources according to the configuration and output statistics.
- debug.log: Complete log for debugging purposes.
- pipeline.dat: Pipeline diagram of the different executed instruction stages vs execution cycles
This application has a built-in translator that converts human-readable assembly instructions into machine code. The allowed instructions are the ones defined in the instructions section.
- Only one instruction allowed per line
- The comment prefix is ;
- Comments may be on a line of their own or after an instruction on the same line
- The number of spaces or tabs is not significant
- Branch labels must be declared on a line of their own
- No instruction is allowed on the line where a branch label is declared
- Blank lines are allowed
; Here are some comments on a new line
PROCESS_LOOP: ; Here is a label followed by an inline comment
ADDI R1, R1, 1 ; Here is an instruction along with its operands and an inline comment
; Here it is an empty line which is allowed followed by an inline comment
ADD R15, R15, R16 ; R15 += C[I]
BLT R1, R20, PROCESS_LOOP ; Here is an instruction using a branch label followed by an inline comment
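The syntax rules above can be sketched as a small line parser in Go. This is not the simulator's translator: the `Line` type and `ParseLine` function are hypothetical names, and the assumption that a label declaration ends with a colon is taken from the sample above.

```go
package main

import (
	"fmt"
	"strings"
)

// Line is a hypothetical parsed form of one assembly source line.
type Line struct {
	Label    string   // set when the line declares a branch label
	Mnemonic string   // e.g. "ADDI"; empty for blank, comment and label lines
	Operands []string // e.g. ["R1", "R1", "1"]
}

// ParseLine applies the rules listed above to a single line of source.
func ParseLine(raw string) (Line, error) {
	// Everything after the comment prefix is ignored.
	if i := strings.Index(raw, ";"); i >= 0 {
		raw = raw[:i]
	}
	text := strings.TrimSpace(raw)
	if text == "" {
		return Line{}, nil // blank or comment-only line
	}
	if strings.HasSuffix(text, ":") {
		label := strings.TrimSpace(strings.TrimSuffix(text, ":"))
		if label == "" || strings.ContainsAny(label, " \t,:") {
			return Line{}, fmt.Errorf("invalid label %q", text)
		}
		return Line{Label: label}, nil
	}
	if strings.Contains(text, ":") {
		return Line{}, fmt.Errorf("label must be on its own line: %q", text)
	}
	// Commas are treated like whitespace, so spacing does not matter.
	fields := strings.Fields(strings.ReplaceAll(text, ",", " "))
	if len(fields) == 0 {
		return Line{}, fmt.Errorf("no instruction found in %q", text)
	}
	return Line{Mnemonic: strings.ToUpper(fields[0]), Operands: fields[1:]}, nil
}

func main() {
	for _, src := range []string{
		"PROCESS_LOOP: ; a label",
		"  ADDI R1,   R1, 1 ; an instruction",
		"; only a comment",
	} {
		line, err := ParseLine(src)
		fmt.Printf("%+v %v\n", line, err)
	}
}
```

Resolving a label such as PROCESS_LOOP to a branch offset would need a second pass over the parsed lines, which this sketch leaves out.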
The following diagram shows the components of the architecture along with the pipeline and the interaction with its components:
- 32 bits architecture
- Scalar, Pipelined or N-way superscalar
- Out-of-order execution and non-blocking issue
- 32 general purpose registers (32-bit) (used for integer & FP)
- 1 MB Instructions Memory
- 1 MB Data Memory
- Fetch, Decode, Issue/Dispatch, Execute & Writeback
- 2 ALU units
- 2 Load/Store units
- 1 Branch unit
- 1 FPU unit
- None (Stall)
- Static: Always, Never, Forward, Backward
- Dynamic: One bit predictor, Two-bit predictor (BHT)
- Instruction Fetch Unit (IFU):
- 16 bytes fetch on each cycle (4 instructions)
- Instruction Queue (IQ):
- 18 instructions buffer
- Instruction Decoding Unit (IDU)
- 4 decoding units
- Instructions Decoded Queue (IDQ):
- 28 instructions buffer
- Common Data Bus (CDB)
- Register Renaming
- Register Alias Table (RAT) with 32 entries
- Reorder buffer (ROB)
- 32 entries
- Up to 4 instructions written back on each cycle
- Unified reservation station (URS):
- 128 entries
- Up to 6 instructions dispatched on each cycle
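Of the branch prediction options listed above, the two-bit dynamic predictor is the least obvious. The sketch below shows the textbook two-bit saturating counter in Go; it is an assumption that the simulator follows this exact scheme, and `TwoBitCounter` is a hypothetical type name.

```go
package main

import "fmt"

// TwoBitCounter is a textbook saturating counter:
// states 0 and 1 predict "not taken", states 2 and 3 predict "taken".
type TwoBitCounter uint8

// PredictTaken reports the prediction for the current state.
func (c TwoBitCounter) PredictTaken() bool { return c >= 2 }

// Update moves one step towards the observed outcome, saturating at 0 and 3.
func (c TwoBitCounter) Update(taken bool) TwoBitCounter {
	if taken {
		if c < 3 {
			return c + 1
		}
		return c
	}
	if c > 0 {
		return c - 1
	}
	return c
}

func main() {
	c := TwoBitCounter(0)
	for _, taken := range []bool{true, true, false, true} {
		fmt.Printf("state=%d predict=%v actual=%v\n", c, c.PredictTaken(), taken)
		c = c.Update(taken)
	}
}
```

A branch history table (BHT) would keep one such counter per entry, indexed by some bits of the branch address. Because two consecutive mispredictions are needed to flip the prediction, a loop branch is mispredicted only once per loop exit instead of twice as with a one-bit predictor.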
The architecture of the processor can be configured through a JSON file that enables, disables or sets different features of the processor.
Sample configurations are available at: samples/configs
{
"cycle_period_ms": 70,
"registers_memory_size": 128,
"instructions_memory_size": 1024,
"data_memory_size": 1024,
"branch_predictor_type": "one_bit",
"pipelined": true,
"instructions_fetched_per_cycle": 4,
"instructions_queue": 18,
"instructions_decoded_queue": 28,
"instructions_dispatched_per_cycle": 6,
"instructions_written_per_cycle": 6,
"reservation_station_entries": 128,
"reorder_buffer_entries": 32,
"register_alias_table_entries": 32,
"decoder_units": 4,
"branch_units": 1,
"load_store_units": 2,
"alu_units": 2,
"fpu_units": 1
}
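A configuration like the one above maps naturally onto a Go struct with JSON tags. The following is a minimal sketch using the standard `encoding/json` package; the `Config` type and `ParseConfig` function are hypothetical names and cover only a subset of the keys, so they should not be read as the simulator's actual loader.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Config mirrors a subset of the JSON keys shown above (hypothetical type).
type Config struct {
	CyclePeriodMs               int    `json:"cycle_period_ms"`
	BranchPredictorType         string `json:"branch_predictor_type"`
	Pipelined                   bool   `json:"pipelined"`
	InstructionsFetchedPerCycle int    `json:"instructions_fetched_per_cycle"`
	ReorderBufferEntries        int    `json:"reorder_buffer_entries"`
	AluUnits                    int    `json:"alu_units"`
}

// ParseConfig decodes a JSON document into a Config.
// Keys that are not declared in the struct are silently ignored.
func ParseConfig(data []byte) (Config, error) {
	var cfg Config
	if err := json.Unmarshal(data, &cfg); err != nil {
		return Config{}, fmt.Errorf("invalid config: %w", err)
	}
	return cfg, nil
}

func main() {
	raw := []byte(`{
		"cycle_period_ms": 70,
		"branch_predictor_type": "one_bit",
		"pipelined": true,
		"instructions_fetched_per_cycle": 4,
		"reorder_buffer_entries": 32,
		"alu_units": 2
	}`)
	cfg, err := ParseConfig(raw)
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Printf("%+v\n", cfg)
}
```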
- 32-bit wide instructions
- Instruction formats: R, I & J
- Instruction types: Arithmetic (ALU & FPU), Load/Store, Control/Branch
- 32-bit registers used for integer operations or floating point operations
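Since the same 32-bit registers hold both integers and floating point values, a floating point value is stored as its IEEE 754 single-precision bit pattern. The sketch below illustrates the idea with Go's `math.Float32bits` and `math.Float32frombits`; the helper names `FloatToReg`, `RegToFloat` and `FAdd` are hypothetical and not taken from the simulator.

```go
package main

import (
	"fmt"
	"math"
)

// FloatToReg returns the IEEE 754 single-precision bit pattern of f,
// which is what a 32-bit register would hold (hypothetical helper).
func FloatToReg(f float32) uint32 { return math.Float32bits(f) }

// RegToFloat reinterprets a 32-bit register value as a float32.
func RegToFloat(r uint32) float32 { return math.Float32frombits(r) }

// FAdd sketches a floating point add on raw register contents:
// reinterpret both operands, add, and store the resulting bit pattern.
func FAdd(rs, rt uint32) uint32 {
	return FloatToReg(RegToFloat(rs) + RegToFloat(rt))
}

func main() {
	a := FloatToReg(1.5)
	b := FloatToReg(2.25)
	fmt.Printf("a=0x%08X b=0x%08X\n", a, b)
	fmt.Println(RegToFloat(FAdd(a, b)))
}
```

The register itself carries no type information, so the instruction (for example add versus fadd) decides how the 32 bits are interpreted.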
The following tables show the instruction format for each of the types: R, I and J
Type | Format (32 bits) | |||||
---|---|---|---|---|---|---|
R | Opcode (6) | Rd (5) | Rs (5) | Rt (5) | Shmt (5) | Func (6) |

Type | Format (32 bits) | |||
---|---|---|---|---|
I | Opcode (6) | Rd (5) | Rs (5) | Immediate (16 bits) |

Type | Format (32 bits) | |
---|---|---|
J | Opcode (6) | Address (26 bits) |
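The R format above can be packed into and unpacked from a 32-bit word with shifts and masks. The sketch below assumes that the fields are laid out from the most significant bit to the least significant bit in the order shown in the table; the `RType` struct and the opcode value used in `main` are hypothetical.

```go
package main

import "fmt"

// RType holds the fields of an R-format instruction (hypothetical type).
type RType struct {
	Opcode, Rd, Rs, Rt, Shmt, Func uint32
}

// Encode packs the fields as Opcode(6) Rd(5) Rs(5) Rt(5) Shmt(5) Func(6),
// assuming Opcode occupies the most significant bits.
func (r RType) Encode() uint32 {
	return (r.Opcode&0x3F)<<26 |
		(r.Rd&0x1F)<<21 |
		(r.Rs&0x1F)<<16 |
		(r.Rt&0x1F)<<11 |
		(r.Shmt&0x1F)<<6 |
		(r.Func & 0x3F)
}

// DecodeRType extracts the fields from a 32-bit instruction word.
func DecodeRType(w uint32) RType {
	return RType{
		Opcode: (w >> 26) & 0x3F,
		Rd:     (w >> 21) & 0x1F,
		Rs:     (w >> 16) & 0x1F,
		Rt:     (w >> 11) & 0x1F,
		Shmt:   (w >> 6) & 0x1F,
		Func:   w & 0x3F,
	}
}

func main() {
	// Opcode 1 is a placeholder value, not the simulator's real encoding.
	ins := RType{Opcode: 1, Rd: 15, Rs: 15, Rt: 16}
	word := ins.Encode()
	fmt.Printf("0x%08X %+v\n", word, DecodeRType(word))
}
```

The I and J formats follow the same pattern, with a 16-bit immediate or a 26-bit address taking the place of the lower fields.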
- All instructions are 32-bit long (1 word)
- Rs, Rt, and Rd are general purpose registers
- PC stands for the program counter address
- C denotes a constant (immediate)
- A dash (-) denotes that those values do not care
- From Opcode 000000 to 001111
- ALU

Syntax | Description | Type |
---|---|---|
add/addi Rd,Rs,Rt | Rd = Rs + Rt/C | R |
sub/subi Rd,Rs,Rt | Rd = Rs - Rt/C | R |
cmp Rd,Rs,Rt | Rd = Rs <=> Rt | R |
mul Rd,Rs,Rt | Rd = Rs * Rt | R |
shl/shli Rd,Rs,Rt | Rd = Rs << Rt/C | R |
shr/shrl Rd,Rs,Rt | Rd = Rs >> Rt/C | R |
and/andi Rd,Rs,Rt | Rd = Rs & Rt/C | R |
or/ori Rd,Rs,Rt | Rd = Rs \| Rt/C | R |

- FPU

Syntax | Description | Type |
---|---|---|
fadd Rd,Rs,Rt | Rd = Rs + Rt | R |
fsub Rd,Rs,Rt | Rd = Rs - Rt | R |
fmul Rd,Rs,Rt | Rd = Rs * Rt | R |
fdiv Rd,Rs,Rt | Rd = Rs / Rt | R |
- From Opcode 010000 to 011111

Syntax | Description | Type | Notes |
---|---|---|---|
lw Rd,Rs,C | Rd = M[Rs + C] | I | load M[Rs + C] into Rd |
sw Rd,Rs,C | M[Rd + C] = Rs | I | store Rd into M[Rs + C] |
lli Rd,C | Rd = C | I | load lower immediate |
sli Rd,C | M[Rd] = C | I | store lower immediate |
lui Rd,C | Rd = C << 16 | I | load upper immediate |
sui Rd,C | M[Rd] = C << 16 | I | store upper immediate |
Control
- From Opcode 100000 to 101111

Syntax | Description | Type | Notes |
---|---|---|---|
beq Rd,Rs,C | br on equal | I | PC = PC + 4 + 4C |
bne Rd,Rs,C | br on not equal | I | PC = PC + 4 + 4C |
blt Rd,Rs,C | br on less | I | PC = PC + 4 + 4C |
bgt Rd,Rs,C | br on greater | I | PC = PC + 4 + 4C |
j C | jump to C | J | PC = 4*C |
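The target formulas in the Notes column can be worked through in Go. The sketch below assumes that the 16-bit branch immediate C is a signed offset counted in instructions (so that backward branches are possible) and that the jump address uses the 26 bits of the J format; `BranchTarget` and `JumpTarget` are hypothetical helper names.

```go
package main

import "fmt"

// BranchTarget computes PC + 4 + 4*C for a taken branch.
// Assumption: C is a signed 16-bit offset counted in instructions.
func BranchTarget(pc uint32, c int16) uint32 {
	// The conversion to uint32 wraps negative offsets, so the unsigned
	// addition still yields the right address for backward branches.
	return pc + 4 + uint32(int32(c)*4)
}

// JumpTarget computes PC = 4*C for the J format, keeping the 26 address bits.
func JumpTarget(c uint32) uint32 {
	return 4 * (c & 0x03FFFFFF)
}

func main() {
	fmt.Printf("forward:  0x%08X\n", BranchTarget(0x20, 3))
	fmt.Printf("backward: 0x%08X\n", BranchTarget(0x20, -2))
	fmt.Printf("jump:     0x%08X\n", JumpTarget(5))
}
```

With a branch at address 0x20, an offset of 3 lands on 0x30 (0x20 + 4 + 12) and an offset of -2 lands on 0x1C, the instruction just before the branch.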
The following PDF file contains a brief description of the processor simulator along with the different experiments and benchmarks performed on this project