-
Notifications
You must be signed in to change notification settings - Fork 141
BPU model
This page is maintained by George Korepanov
The Branch Prediction Unit is a digital circuit that tries to guess which way a branch (e.g. an if-then-else structure) will go before this is known for sure, thus enabling the fetch unit to continue fetching instructions without stalling the pipeline.
For more detailed description please refer to the lectures.
The operation of this unit is simulated inside BP
class.
The BaseBP
class provides very common interface of BPU as follows:
class BaseBP
{
virtual bool is_taken( Addr PC) = 0;
virtual Addr get_target( Addr PC) = 0;
virtual void update( bool is_taken,
Addr branch_ip,
Addr target) = 0;
}
- The first two methods are called by fetch stage, whenever it checks whether the next instruction is a branch to be taken. It uses
get_target()
to determine the IP to be fetched next cycle, andis_taken
to store BPU prediction so that later memory stage could check if misprediction took place. - As mentioned, on memory stage the BPU prediction is compared with condition calculated by ALU and in case of misprediction, updates BPU information by calling
update
method.
Lookups the BP and if an entry holds info about requested IP, retrieves that info from it. Otherwise we have nothing else but return false
(if we return true
, we won't be able to return the target as well).
Lookups the BP entry and if it is TAKEN
, returns predicted target. If there is no entry or it is predicted as NOT TAKEN
, it returns PC + 4
.
Touches the existing BP entry in cache or creates a new one if it is not present (in the latter case we also have to reset the new entry with target to make it valid).
The implementation of described interface is very straight-forward. We just create and serve the cache holding BP entries, passing all the requests to corresponding entries (thus the real logic of operation is comprised within BP entries, not the BP class itself). In hardware BPU has it's own independent cache (BTB) for storing data, but in simulator we are free to reuse existing generic CacheTagArray
implementation:
template<typename T>
class BP final: public BaseBP
{
std::vector<std::vector<T>> data;
CacheTagArray tags;
BP class is templated with a type of different entries. To create different modes of BP, you may use BPFactory class:
class BPFactory {
BaseBP* create( std::string mode,
uint32 size,
uint32 ways,
uint32 branch_ip_size_in_bits = 32);
Please, describe configuration variables here
All of implemented predictors store IP's (instruction pointers) of branches (both conditional & unconditional) to be able to retrieve address of next instruction without decoding the current one. Thus the basic class BP
implements common BTB (branch target buffer, practically the cache keeping IP's), encapsulating all the differences of implementation of specific predictor inside BPentry
.
Please, describe the behavior here
Please, describe the behavior here
Please, describe the behavior here
Please, describe the behavior here. You may add a picture from the slides.
MIPT-V / MIPT-MIPS — Cycle-accurate pre-silicon simulation.