
Date: 2020-04-30 11:21

FINAL PROJECT: Verilog and Python Co-Simulation of Cache and Branch Prediction enabled MIPS Pipelined Processor

Due: May 6 by 11:59pm. Points: 100. Submitting: a file upload. File types: tgz.

Available: Apr 14 at 11:59am - May 9 at 12:01am (25 days).


Overview:

In this project, you will implement identical functionality in both your Python and Verilog versions of the pipelined processor design and simulate an execution that begins in Python, continues in your Verilog model, and then resumes in your Python emulation model. The additional functionality that you will be adding consists of caching and branch prediction hardware and two new instructions.

A) The specific configurations for the caches and branch predictor will be described below.

0) Base pipeline designs:

FDEMW 5-stage pipeline with branch resolution in D. Non-memory hazard detection in D. Always not-taken predictor. For your Verilog baseline, use the one provided in V1. The pipelined model of the PPSSM will be provided shortly.

Multi-cycle IMEM and DMEM models will be provided for the optional sensitivity study.

1) Additional instruction support:

MULTU, MFLO (needed by matrix multiplication benchmark)

2) Branch prediction extension:

2K-entry bimodal table of 2-bit predictors, initial state t=10

32-entry BTB storing the contents, not the address, of the target instruction.

BTB caches the following supported types of branches and jumps: {BEQ, BNE, J, JAL}.

JR/JALR are explicitly not cached in the BTB due to target indirection.
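
The bimodal table described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the class name, the indexing by PC bits [12:2], and the usual saturating-counter encoding (states 10 and 11 predict taken, reading "t=10" as binary 10) are not given by the assignment.

```python
class BimodalPredictor:
    """Sketch of a 2K-entry table of 2-bit saturating counters."""

    ENTRIES = 2048  # 2K entries, from the spec

    def __init__(self):
        # Every counter starts in state 0b10 (assumed meaning of "t=10").
        self.table = [0b10] * self.ENTRIES

    def _index(self, pc):
        # Assumption: drop the two byte-offset bits of a word-aligned PC
        # and use the next 11 bits as the table index.
        return (pc >> 2) % self.ENTRIES

    def predict(self, pc):
        # Assumption: the upper counter bit is the prediction (1 = taken).
        return self.table[self._index(pc)] >= 0b10

    def update(self, pc, taken):
        # Saturating increment on taken, saturating decrement on not taken.
        i = self._index(pc)
        if taken:
            self.table[i] = min(self.table[i] + 1, 0b11)
        else:
            self.table[i] = max(self.table[i] - 1, 0b00)
```

With this encoding a fresh entry predicts taken, and a single not-taken outcome is enough to flip it to not taken.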

3) Instruction cache:

4/21/2020 FINAL PROJECT: Verilog and Python Co-Simulation of Cache and Branch Prediction enabled MIPS Pipelined Processor

https://psu.instructure.com/courses/2051388/assignments/11668830

The instruction cache will be accessed in lieu of the IMEM. Misses in the instruction cache will be served from the IMEM. Since the IMEM is read-only, the instruction cache does not support writes and does not have a write policy.

2KB direct-mapped, block size 64 bytes.
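
For a 2KB direct-mapped cache with 64-byte blocks there are 32 lines, so an address splits into a 6-bit block offset, a 5-bit index, and the remaining upper bits as the tag. The sketch below illustrates that split and a read-only lookup. The names `split_address` and `ICache`, and the use of a dict as a stand-in for the IMEM, are hypothetical and not part of the provided models.

```python
ICACHE_BYTES = 2 * 1024
BLOCK_BYTES = 64
WORDS_PER_BLOCK = BLOCK_BYTES // 4
NUM_LINES = ICACHE_BYTES // BLOCK_BYTES       # 32 lines
OFFSET_BITS = BLOCK_BYTES.bit_length() - 1    # 6 bits
INDEX_BITS = NUM_LINES.bit_length() - 1       # 5 bits


def split_address(addr):
    """Return (tag, index, offset) for a byte address."""
    offset = addr & (BLOCK_BYTES - 1)
    index = (addr >> OFFSET_BITS) & (NUM_LINES - 1)
    tag = addr >> (OFFSET_BITS + INDEX_BITS)
    return tag, index, offset


class ICache:
    def __init__(self, imem):
        # Hypothetical IMEM stand-in: dict of word-aligned address -> word.
        self.imem = imem
        self.valid = [False] * NUM_LINES
        self.tags = [0] * NUM_LINES
        self.blocks = [[0] * WORDS_PER_BLOCK for _ in range(NUM_LINES)]

    def fetch(self, pc):
        """Return (hit, instruction_word); a miss refills the whole block."""
        tag, index, offset = split_address(pc)
        hit = self.valid[index] and self.tags[index] == tag
        if not hit:
            base = pc - offset
            # Unmapped words read as 0 in this sketch.
            self.blocks[index] = [self.imem.get(base + 4 * i, 0)
                                  for i in range(WORDS_PER_BLOCK)]
            self.tags[index] = tag
            self.valid[index] = True
        return hit, self.blocks[index][offset >> 2]
```

Because the IMEM is read-only, the sketch has no dirty bit and no write path.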

4) Data cache:

The data cache will be accessed in lieu of the DMEM. Misses in the data cache will be served by the DMEM. The data cache write policy is write-back and will stall-allocate on store misses. Evicted dirty lines must be written back to the DMEM.

2KB 2-way associative, block size 32 bytes, inverse-MRU eviction policy: evict the block that is not the most recently used block. This requires 1 bit of metadata per block: on each access, set the accessed block's MRU bit to 1 and set the MRU bit of the other block in the set to 0, then evict whichever block has MRU=0.
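
The eviction and write-back rules above can be sketched at the metadata level (tags and status bits only, no data array). With 2KB, 2 ways, and 32-byte blocks there are 32 sets, giving a 5-bit offset and a 5-bit index. The class names, the `writebacks` log, and the choice to fill an invalid way before evicting are assumptions made for illustration, as is reading "stall-allocate" as allocating the block on a store miss.

```python
BLOCK_BYTES = 32
WAYS = 2
NUM_SETS = 2 * 1024 // (BLOCK_BYTES * WAYS)   # 32 sets
OFFSET_BITS = 5
INDEX_BITS = 5


class Way:
    def __init__(self):
        self.valid = False
        self.dirty = False
        self.mru = 0
        self.tag = 0


class DCacheMeta:
    def __init__(self):
        self.sets = [[Way() for _ in range(WAYS)] for _ in range(NUM_SETS)]
        self.writebacks = []  # block addresses written back to the DMEM

    def access(self, addr, is_store):
        """Return True on a hit; on a miss, allocate and possibly evict."""
        index = (addr >> OFFSET_BITS) % NUM_SETS
        tag = addr >> (OFFSET_BITS + INDEX_BITS)
        ways = self.sets[index]
        way = next((w for w in ways if w.valid and w.tag == tag), None)
        hit = way is not None
        if not hit:
            # Assumption: prefer an invalid way, else the way with MRU=0.
            way = next((w for w in ways if not w.valid), None)
            if way is None:
                way = next(w for w in ways if w.mru == 0)
            if way.valid and way.dirty:
                old = ((way.tag << INDEX_BITS) | index) << OFFSET_BITS
                self.writebacks.append(old)
            way.valid, way.dirty, way.tag = True, False, tag
        if is_store:
            way.dirty = True  # write-back: only mark dirty, no DMEM write
        for w in ways:
            w.mru = 1 if w is way else 0
        return hit
```

Once both ways of a set are valid, exactly one of them has MRU=1 after every access, so the MRU=0 victim is always well defined.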

B) Co-simulation

Once both your Verilog and Python simulation models can support the above modeled execution, the final task will be to support co-simulation, as follows:

0) Add state dumping/loading capability to your new branch and cache structures.

1) Add an instruction-fetched counter to your Verilog design. Count only correct-path, non-flushed instructions that are successfully retrieved from the cache and actually get written into the IF/ID register. Add logic to A) stall fetch such that the 100th successfully fetched instruction is not treated as ever being successfully fetched and B) once all stages DEMW contain injected NOPs from the induced flush of the 100th instruction, dump the architectural state, BTB, BHT, and caches and exit simulation.

2) Resume your Verilog execution from the 100th instruction and run for an additional 100 instructions.

3) Resume your Python execution from the 100th instruction and run your program to completion.
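
The counting and stop rule from step 1 can be prototyped in the Python model before writing the Verilog. The sketch below is one possible reading, not the required design: the class name `FetchLimiter`, its method names, and the per-cycle boolean inputs are all hypothetical.

```python
class FetchLimiter:
    """Counts instructions written into IF/ID and blocks the limit-th one."""

    def __init__(self, limit=100):
        self.limit = limit
        self.count = 0        # instructions actually written into IF/ID
        self.blocked = False  # set once the limit-th fetch has been refused

    def may_write_if_id(self, icache_hit, stalled, flushed):
        """Return True if this cycle's fetch may be written into IF/ID."""
        if self.blocked:
            return False
        # Only correct-path, non-stalled, cache-hit fetches are candidates.
        if not icache_hit or stalled or flushed:
            return False
        if self.count + 1 >= self.limit:
            # The limit-th instruction is never treated as fetched;
            # the pipeline is fed NOPs from here on.
            self.blocked = True
            return False
        self.count += 1
        return True

    def ready_to_dump(self, stage_is_nop):
        """stage_is_nop: NOP flags for D, E, M, W (in that order)."""
        return self.blocked and all(stage_is_nop)
```

Under this reading, 99 instructions pass into IF/ID, and the dump happens once the D, E, M, and W stages hold only NOPs, so the resumed run starts by fetching the 100th instruction.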

Supporting scripts for input/output conversion between Verilog and Python will be provided (the required data is identical, but the formatting requires an automated conversion). (Pending)
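
Since the conversion scripts and the dump format are still pending, the Python side can be developed against a placeholder format. The sketch below assumes JSON and a hypothetical state layout (the field names, the BTB entry fields, and the omission of cache data arrays are all made up for illustration); the real format will be whatever the provided scripts expect.

```python
import json


def make_empty_state():
    """Hypothetical container for everything that must survive a hand-off."""
    return {
        "pc": 0,
        "regs": [0] * 32,
        "hi": 0,
        "lo": 0,
        "bht": [0b10] * 2048,  # 2-bit counters, initial state 10
        "btb": [{"valid": False, "tag": 0, "instr": 0} for _ in range(32)],
        "icache": [{"valid": False, "tag": 0} for _ in range(32)],
        "dcache": [[{"valid": False, "dirty": False, "mru": 0, "tag": 0}
                    for _ in range(2)] for _ in range(32)],
    }


def dump_state(path, state):
    # Sorted keys keep dumps stable, which helps when diffing two runs.
    with open(path, "w") as f:
        json.dump(state, f, indent=2, sort_keys=True)


def load_state(path):
    with open(path) as f:
        return json.load(f)
```

A useful self-check is that loading a dump and dumping it again produces an identical file, which makes it easier to spot state that is lost across the Python/Verilog boundary.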

C) (OPTIONAL, 10% EXTRA CREDIT)

Using the multi-cycle IMEM and DMEM models,


1) compute the AMAT and CPI impact of having vs. not having the caches for both NMM and BFS

2) rewrite your naive matrix multiplication benchmark to improve the cache hit rate

3) discuss why BFS is more difficult to optimize for both caching and branch prediction
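
For item 1, the standard textbook relations are AMAT = hit time + miss rate x miss penalty, and effective CPI = base CPI + memory stall cycles per instruction. The helper below is a minimal sketch of that arithmetic; the function names and parameters are hypothetical, and the hit times, miss rates, and penalties must come from your own simulation counters and the provided multi-cycle memory models.

```python
def amat(hit_time, miss_rate, miss_penalty):
    """Average memory access time, in cycles."""
    return hit_time + miss_rate * miss_penalty


def cpi_with_memory_stalls(base_cpi, i_miss_rate, d_refs_per_instr,
                           d_miss_rate, miss_penalty):
    """Base CPI plus stall cycles from instruction and data cache misses.

    Assumes every instruction makes one instruction-cache access and
    d_refs_per_instr data-cache accesses, with the same miss penalty
    for both caches.
    """
    stalls = (i_miss_rate + d_refs_per_instr * d_miss_rate) * miss_penalty
    return base_cpi + stalls
```

Running both functions once with measured cache miss rates and once with a miss rate of 1.0 (every access pays the full memory latency) gives one way to express the with-cache vs. without-cache comparison.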

