Implement Instruction Cache Unit and introduce it into pipeline #18

Closed
pavelkryukov opened this issue Apr 14, 2016 · 26 comments
Labels
3 Features of medium complexity or infrastructure enhancements enhancement Adds a new feature to simulation. S2 — Caches To solve the issue, you NEED knowledge about caches. OOO hierarchy etc.

Comments

@pavelkryukov
Member

pavelkryukov commented Apr 14, 2016

We have a cache model implemented, but it is not connected to PerfSim.

AMB: The Instruction Cache Unit is implemented, and performance studies are done on an instruction stress trace.

@pavelkryukov pavelkryukov added enhancement Adds a new feature to simulation. 4 Features of medium complexity which usually require infrastructure enhancements. labels Apr 14, 2016
@pavelkryukov pavelkryukov changed the title Implement Instruction Cache Unit and introduce it into pipeline (4 points) Implement Instruction Cache Unit and introduce it into pipeline Apr 18, 2016
@pavelkryukov pavelkryukov added 3 Features of medium complexity or infrastructure enhancements and removed 4 Features of medium complexity which usually require infrastructure enhancements. labels Apr 17, 2017
@pavelkryukov pavelkryukov added the S2 — Caches To solve the issue, you NEED knowledge about caches. OOO hierarchy etc. label Apr 26, 2017
@pavelkryukov pavelkryukov added this to the Performance Model milestone Oct 23, 2017
@pavelkryukov
Member Author

pavelkryukov commented Feb 27, 2018

On cache hit, proceed as is.
On a cache miss, push a bubble into the pipeline and send a request to a long-latency loop port to model the memory access. A dedicated thread (clock_instr_cache) listens on that port and fills the cache with the actual data.
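
The hit/miss behaviour described above can be sketched with a toy model. Everything in it (the `ToyFetch` class, its fields, the line set that never evicts, the fixed 4-byte instruction step) is hypothetical and is not the MIPT-MIPS port or cache API; it only illustrates bubbling on a miss while a separate step fills the cache after a fixed latency.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_set>

// Hypothetical toy model of the fetch stage; not the MIPT-MIPS classes.
class ToyFetch {
public:
    ToyFetch(std::uint32_t line_size, std::uint32_t miss_latency)
        : line_size_(line_size), miss_latency_(miss_latency) {
        assert(line_size_ >= 4 && miss_latency_ > 0);
    }

    // One simulated cycle: the fill step runs first, then the fetch step.
    void clock() {
        clock_instr_cache();
        clock_fetch();
        ++cycle_;
    }

    std::uint64_t fetched() const { return fetched_; }
    std::uint64_t bubbles() const { return bubbles_; }

private:
    // Stands in for the listener on the long-latency port:
    // once the modelled memory latency has elapsed, install the line.
    void clock_instr_cache() {
        if (pending_ && cycle_ >= ready_cycle_) {
            lines_.insert(pending_line_);
            pending_ = false;
        }
    }

    void clock_fetch() {
        const std::uint32_t line = pc_ / line_size_;
        if (lines_.count(line) != 0) {  // hit: proceed as is
            pc_ += 4;                    // assumes 4-byte instructions
            ++fetched_;
            return;
        }
        if (!pending_) {                 // miss: send a single request
            pending_ = true;
            pending_line_ = line;
            ready_cycle_ = cycle_ + miss_latency_;
        }
        ++bubbles_;                      // miss: push a bubble
    }

    std::uint32_t line_size_;
    std::uint32_t miss_latency_;
    std::unordered_set<std::uint32_t> lines_;  // cached line numbers, never evicted
    bool pending_ = false;
    std::uint32_t pending_line_ = 0;
    std::uint64_t ready_cycle_ = 0;
    std::uint32_t pc_ = 0;
    std::uint64_t cycle_ = 0, fetched_ = 0, bubbles_ = 0;
};
```

With `ToyFetch f(4, 3)` every instruction costs three bubbles plus one fetch cycle; with `ToyFetch f(8, 3)` the same three bubbles are shared by two instructions.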

Instruction cache should be controlled by 3 configuration values:

  • Cache size in bytes
  • Cache associativity
  • Size of cache line

@pavelkryukov
Member Author

pavelkryukov commented Feb 27, 2018

@alex19999, I think you have almost completed the pipeline part of the task. Would you like to proceed with introducing the ICache class?

@alex19999
Contributor

alex19999 commented Feb 27, 2018 via email

@pavelkryukov
Member Author

Good!

You can find a detailed description of CacheTagArray here: https://github.com/MIPT-ILab/mipt-mips/wiki/Cache-model

Default values for parameters should be:

  • size = 2048
  • ways = 4
  • line size = 64
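
As a rough illustration of what these three values imply, the sketch below derives the number of sets and splits an address into set index and tag. It uses the textbook set-associative decomposition; `CacheGeometry` and its members are hypothetical names, and CacheTagArray may organise this differently.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: textbook set-associative geometry, not the CacheTagArray API.
struct CacheGeometry {
    std::uint32_t size_in_bytes;
    std::uint32_t ways;
    std::uint32_t line_size;

    // Total capacity divided by the bytes held in one set.
    std::uint32_t num_sets() const {
        assert(ways * line_size != 0);
        return size_in_bytes / (ways * line_size);
    }

    // The line number modulo the set count selects the set.
    std::uint32_t set_index(std::uint32_t addr) const {
        return (addr / line_size) % num_sets();
    }

    // The remaining upper part of the line number is the tag.
    std::uint32_t tag(std::uint32_t addr) const {
        return (addr / line_size) / num_sets();
    }
};

// The default values listed above: 2048 bytes, 4 ways, 64-byte lines.
inline CacheGeometry default_icache_geometry() {
    return CacheGeometry{2048, 4, 64};
}
```

With the defaults, 2048 / (4 × 64) gives 8 sets of 4 lines each.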

@alex19999
Contributor

alex19999 commented Feb 27, 2018 via email

@pavelkryukov
Member Author

There is no dedicated trace to test the IC, but you may use dc_ic_stress.s. By changing the cache parameters, you should see performance results close to the chart from L11.

See this page for the list of all cache configurations and fill in the IPC numbers: https://github.com/MIPT-ILab/mipt-mips/wiki/cache-associativiy-studies

@pavelkryukov
Member Author

I am not closing this yet, as there are next steps:

  • Cache size, associativity, and line size should be configurable with Config objects
  • Sensitivity to cache line size should be demonstrated with dc_ic_stress.s. If the cache line is 4 bytes, every instruction is a cache miss → AMAT=30. If the cache line is 8 bytes, every second instruction is a cache miss → AMAT=15. Since dc_ic_stress.s does not contain loops, there is not much sensitivity to cache size and/or # of ways.
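
The two AMAT figures above follow from a simple proportionality. The sketch rests on two assumptions that are not spelled out in the thread: instructions are a fixed 4 bytes wide (as in MIPS), and the hit time is neglected, so AMAT is just the miss rate times the miss penalty. The 30-cycle penalty is read off the 4-byte case, and the 16-byte value is an extrapolation.

```latex
\[
\text{miss rate}(L) = \min\!\left(1, \frac{4}{L}\right), \qquad
\mathrm{AMAT}(L) \approx \text{miss rate}(L) \cdot T_{\text{miss}}, \qquad
T_{\text{miss}} = 30
\]
\[
\mathrm{AMAT}(4) = 1 \cdot 30 = 30, \qquad
\mathrm{AMAT}(8) = \tfrac{1}{2} \cdot 30 = 15, \qquad
\mathrm{AMAT}(16) = \tfrac{1}{4} \cdot 30 = 7.5
\]
```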

@alex19999
Contributor

Pavel Igorevich, may I change the create_simulator( const std::string& isa, bool functional_only, bool log) function? I would like to add the cache parameters (uint32 size_in_bytes, uint32 ways, uint32 line_size).

@pavelkryukov
Member Author

pavelkryukov commented Mar 13, 2018

It's better to put these configuration objects into fetch.cpp.
Use the BPU code as an example.

@alex19999
Contributor

All right, thank you.

@alex19999
Contributor

And about task 2: as I understand it, the task is to do cache associativity studies. What files should be in the commit?

@pavelkryukov
Member Author

pavelkryukov commented Mar 13, 2018

> What files should be in the commit?

My original intention was to have a Wiki page with a chart, similar to what we got from the standalone cache. But:

> Since dc_ic_stress.s does not contain loops, there is not much sensitivity to cache size and/or # of ways.

I realized that dc_ic_stress.s, the only IC-hungry trace, is not suitable for such a study. It is just a simple stream of instructions, so all cache misses are 'cold' misses. Moreover, if you add a backward loop from the end of the trace to the beginning, you won't see any effect either, since instructions from the end of the trace replace those from the beginning.

So the only effective parameter is the cache line size, which corresponds to the throughput of our fictitious memory->cache interface. You may just post the results for different line sizes here.
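
The point about a backward loop can be illustrated with a toy tag store. The sketch assumes LRU replacement and a code region larger than the cache, neither of which is stated in the thread, and `ToyLruCache` and `line_hits` are hypothetical names rather than the project's cache model. Under those assumptions a cyclic pass over the region never hits at line granularity, whatever the associativity.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <list>
#include <vector>

// Hypothetical toy set-associative tag store with LRU replacement.
class ToyLruCache {
public:
    ToyLruCache(std::uint32_t size_in_bytes, std::uint32_t ways, std::uint32_t line_size)
        : ways_(ways), line_size_(line_size), sets_(size_in_bytes / (ways * line_size)) {
        assert(!sets_.empty());
    }

    // Returns true on a hit. On a miss the line is installed,
    // evicting the least recently used line if the set is full.
    bool access(std::uint32_t addr) {
        const std::uint32_t line = addr / line_size_;
        std::list<std::uint32_t>& set = sets_[line % sets_.size()];
        const auto it = std::find(set.begin(), set.end(), line);
        const bool hit = it != set.end();
        if (hit)
            set.erase(it);
        else if (set.size() == ways_)
            set.pop_back();
        set.push_front(line);  // most recently used line sits at the front
        return hit;
    }

private:
    std::uint32_t ways_;
    std::uint32_t line_size_;
    std::vector<std::list<std::uint32_t>> sets_;
};

// Walks a straight-line code region `passes` times, touching each line once
// per pass, and returns the number of line-level hits.
std::uint64_t line_hits(ToyLruCache cache, std::uint32_t region_bytes,
                        std::uint32_t line_size, int passes) {
    std::uint64_t hits = 0;
    for (int p = 0; p < passes; ++p)
        for (std::uint32_t addr = 0; addr < region_bytes; addr += line_size)
            hits += cache.access(addr) ? 1 : 0;
    return hits;
}
```

For a 2048-byte cache with 64-byte lines, looping over a 4096-byte region gives zero line hits with 1, 4, or 8 ways, while a 2048-byte region hits on every pass after the first.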

@alex19999
Contributor

So, as I understand it, I only need to measure the AMAT as a function of the cache line size, using dc_ic_stress.s, and post the statistics here. This is not very important, but how many instructions should I run? Please correct me if I am wrong anywhere.

@pavelkryukov
Member Author

pavelkryukov commented Mar 14, 2018

You may skip AMAT and use IPC instead. I would use 400,000 instructions.

@alex19999
Contributor

Constant parameters:

  • Instruction cache size in bytes = 2048;
  • Number of ways in instruction cache = 4;
  • dc_ic_stress.s was used for the study, with 400,000 instructions.

Cache study results

| Line size of instruction cache (in bytes) | IPC |
|---|---|
| 4 | 0.041842 |
| 8 | 0.0803065 |
| 16 | 0.148593 |
| 32 | 0.25847 |
| 64 | 0.410101 |
| 128 | 0.580328 |
| 256 | 0.732299 |
| 512 | 0.842629 |

@pavelkryukov
Member Author

Here is the data on a plot. Looks reasonable!

[Image: plot of IPC versus instruction cache line size]

Thank you for your contribution!

@pavelkryukov
Member Author

@alex19999

Could you please put your results into a new Wiki page?

@alex19999
Contributor

alex19999 commented Apr 2, 2018 via email

@pavelkryukov
Member Author

Great. The page is here: https://github.com/MIPT-ILab/mipt-mips/wiki/IPC-sensitivity-to-cache-line-size

Please use the structure I defined on the page. In general, it should be similar to the reports you write in the General Physics laboratory at MIPT.

@alex19999
Contributor

Pavel Igorevich, I have finished this page. Could you review my work, please?

@pavelkryukov
Member Author

Thanks, looks good. I'll make grammar fixes later if needed.

@pavelkryukov
Member Author

pavelkryukov commented Apr 7, 2018

Please add some quantitative analysis which explains the logarithmic law.
There should be a simple formula to get IPC from cache line size, pipeline depth, and miss penalty.
You may involve AMAT as well.

@alex19999
Contributor

Pavel Igorevich, sorry, but isn't it important for this task to know the percentage of memory-access instructions?

CPI = CPI(execution) + memory_stalls_per_instruction;
memory_stalls_per_instruction = memory_accesses_per_instruction * miss_rate * miss_penalty;
IPC = 1 / CPI;

and miss_rate is correlated with cache_line_size;

@pavelkryukov
Member Author

Why? We are studying the instruction cache, and every instruction is fetched from it...
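
With one instruction-cache access per instruction, the CPI formula above reduces to CPI = CPI(execution) + miss_rate * miss_penalty. The sketch below evaluates that against the IPC numbers posted earlier in this thread. The function names are hypothetical, the miss rate assumes 4-byte instructions and only cold misses, and the parameters used in the usage note (base CPI 1.0, penalty 22.9 cycles) are back-solved from the 4-byte and 8-byte rows rather than taken from the simulator configuration, so they need not match the AMAT figures quoted earlier.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>

// Miss rate of straight-line code: one cold miss per cache line.
// The default instr_size = 4 assumes fixed-width MIPS instructions.
double miss_rate(std::uint32_t line_size, std::uint32_t instr_size = 4) {
    assert(line_size > 0);
    return std::min(1.0, static_cast<double>(instr_size) / line_size);
}

// IPC = 1 / (base CPI + miss rate * miss penalty), one I-cache access per instruction.
double predicted_ipc(std::uint32_t line_size, double base_cpi, double miss_penalty) {
    const double cpi = base_cpi + miss_rate(line_size) * miss_penalty;
    return 1.0 / cpi;
}

struct Sample {
    std::uint32_t line_size;
    double ipc;
};

// Measured values copied from the results posted in this thread.
constexpr Sample measured[] = {
    {4, 0.041842},  {8, 0.0803065},  {16, 0.148593},  {32, 0.25847},
    {64, 0.410101}, {128, 0.580328}, {256, 0.732299}, {512, 0.842629},
};

// Largest absolute gap between the model and the measured IPC values.
double max_abs_error(double base_cpi, double miss_penalty) {
    double worst = 0.0;
    for (const Sample& s : measured) {
        const double diff = predicted_ipc(s.line_size, base_cpi, miss_penalty) - s.ipc;
        worst = std::max(worst, std::fabs(diff));
    }
    return worst;
}
```

Calling `max_abs_error(1.0, 22.9)` gives a gap below 0.01 IPC across all eight line sizes. Since IPC = 1 / (1 + c / L) is a logistic function of ln L, this may be why the plot looks roughly logarithmic in its middle range.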

@alex19999
Contributor

Oh, yes, I forgot about that. I was just thinking about caches in general.

@pavelkryukov
Member Author

> Please add some quantitative analysis which explains the logarithmic law.
> There should be a simple formula to get IPC from cache line size, pipeline depth, and miss penalty.
> You may involve AMAT as well.

@alex19999 Could you please complete this point?
