-
Notifications
You must be signed in to change notification settings - Fork 3
/
IDEAS
40 lines (35 loc) · 1.42 KB
/
IDEAS
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
URGENT:
write a proper n-way icache
TODO:
write suduko solver for LM32 2 evenings
add prefetch instruction (load r0) => re=we=0, fifo in pbus
add ITTAGE predictor 5 evenings
add sign to multiply 0.5 evenings
add SRT division step ... then use microcode? 5 evenings
other stuff:
CSRs => put in 2nd slow cycle mux with sext[bh]
interrupts => load a fault instruction
i_err => load a fault instruction
bad instruction handler => fault instruction
d_err => raise fault
finalize optimization of fast adder equality
implement TLB [3]
use the PC history to select victim way?
add L2 instruction and data prefetch?
try making non-faulting ops final once ready => IPC gain?
distinguish two types of dbus_busy, dbus loading into L1 and dbus cannot accept
split L1d into tag+dirty+word_valid and word+byte_valid => deeper M20k
implement FPU [5]
[3] include a nocache bit for IO mappings
[5] FPU ops take 4 cycles, but live in slow EUs
... to avoid additional write ports, add an extra bypass on slow memories.
=> 3-cycle writes go into readable bypass register
=> write-back happens on 4th cycle
(just like how in-order CPUs do it)
BUGS:
return stack gets pushed and popped speculatively
zeroing memory gets into a bad feedbak loop with gaps
... old ordered patch does not help
=> need to kill ALL ordered ops
bad executed instructions corrupt prediction SM with 'X's
pbus double-read! (retry faster than device ack)