
An Overview of Pipelining


Performance Issues

Longest delay determines clock period

  • Critical path: load instruction
  • Instruction memory → register file → ALU → data memory → register file
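As a rough illustration (the stage latencies below are assumed purely for this example): suppose instruction memory and data memory each take 200 ps, the ALU takes 200 ps, and a register-file read or write takes 100 ps. The load path then dictates the single-cycle clock period:

$T_{\text{clock}} \ge 200 + 100 + 200 + 200 + 100 = 800\ \text{ps}$

whereas an R-type instruction would be done after $200 + 100 + 200 + 100 = 600\ \text{ps}$, yet still has to occupy the full 800 ps cycle.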

Not feasible to vary period for different instructions

Violates design principle

  • Making the common case fast

We will improve performance by pipelining

MIPS Pipeline

Pipeline: an implementation technique in which multiple instructions are overlapped in execution

Five stages, one step per stage

  • IF: Instruction fetch from memory
  • ID: Instruction decode & register read
  • EX: Execute operation or calculate address
  • MEM: Access memory operand
  • WB: Write result back to register

Performance: the slowest stage determines the clock period

Speedup:

  • If all stages are balanced (all take the same time):
    $\text{Time between instructions}_{\text{pipelined}} = \dfrac{\text{Time between instructions}_{\text{nonpipelined}}}{\text{Number of stages}}$
  • Otherwise, divide by the latency of the slowest stage instead; the speedup is then less than the number of stages

Speedup comes from increased throughput; the latency of an individual instruction does not decrease.
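For instance, assuming a non-pipelined instruction time of 800 ps and a slowest stage of 200 ps (the illustrative latencies above), the stages are not balanced, so

$\text{Speedup} = \frac{800\ \text{ps}}{200\ \text{ps}} = 4 < 5 = \text{Number of stages}$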

Pipelining and ISA Design

Compared with CISC instruction sets, of which x86 is the typical example:

  • MIPS ISA designed for pipelining
  • All instructions are 32 bits
      • Easier to fetch and decode in one cycle
  • Few and regular instruction formats
      • Can decode and read registers in one step
  • Load/store addressing
      • Can calculate address in 3rd stage, access memory in 4th stage
  • Alignment of memory operands
      • Memory access takes only one cycle

Hazards

Situations that prevent starting the next instruction in the next cycle

Structure hazards

  • A required resource is busy
  • e.g., reads/writes of data (RAM) conflicting with fetches of instructions (ROM)
  • Mitigated by separating instruction/data caches

Data hazard

  • Need to wait for previous instruction to complete its data read/write
  • e.g., a register is read immediately after it is written
  • Mitigated by forwarding and code scheduling

Control hazard

  • The control action to take depends on the outcome of the previous instruction
  • e.g., the result of a branch has not been computed yet
  • Mitigated by branch prediction

Structure Hazards

  • Conflict for use of a resource
  • If the MIPS pipeline has only one memory (data and instructions together), then
      • Load/store requires data access
      • Instruction fetch would have to stall for that cycle (see the sketch after this list)
  • Hence, pipelined datapaths require separate instruction/data memories
      • Or separate instruction/data caches
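A rough timing sketch of that conflict (cycle labels and register names are illustrative only), assuming one shared memory for instructions and data:

#                    CC1  CC2  CC3  CC4  CC5
# lw  $t0, 0($s0)    IF   ID   EX   MEM  WB
# instruction 2           IF   ID   EX   MEM
# instruction 3                IF   ID   EX
# instruction 4                     IF         <- needs the shared memory in CC4,
#                                                 the same cycle as lw's MEM access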

Data Hazards

An instruction depends on completion of data access by a previous instruction

add $s0, $t0, $t1
sub $t2, $s0, $t3

Reading $s0 immediately after it is written produces a data hazard, which introduces two bubbles (stall cycles).
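A rough pipeline diagram of the pair above, with no forwarding, assuming the register file is written in the first half of WB and read in the second half of ID (so sub can read $s0 in the same cycle add writes it back):

#                     CC1  CC2  CC3  CC4  CC5  CC6  CC7  CC8
# add $s0, $t0, $t1   IF   ID   EX   MEM  WB
# sub $t2, $s0, $t3        IF   **   **   ID   EX   MEM  WB
#                               ^^^^^^^^^ two bubbles while sub waits for $s0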

Control Hazards

Branch determines flow of control

Fetching next instruction depends on branch outcome

Pipeline can’t always fetch correct instruction

  • Still working on ID stage of branch

In MIPS pipeline

  • Need to compare registers and compute target early in the pipeline
  • Add hardware to do it in ID stage

Mitigating Hazards

Forwarding (aka Bypassing) [mitigates data hazards]

Forwarding can help resolve data hazards.

Core idea: Use result immediately when it is computed

  • Don’t wait for it to be stored in a register
  • Requires extra connections in the data path
  • Add a bypassing path that connects the EX-stage output to the ALU input of a following instruction

The data is carried over this extra wiring rather than through the register file.
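With an EX-to-EX forwarding path, the add/sub pair from above needs no bubbles at all (a sketch under the same timing assumptions as before):

#                     CC1  CC2  CC3  CC4  CC5  CC6
# add $s0, $t0, $t1   IF   ID   EX   MEM  WB
# sub $t2, $s0, $t3        IF   ID   EX   MEM  WB
#                                    ^ $s0 comes from add's EX/MEM pipeline register,
#                                      not from the register file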

Can’t always avoid stalls by forwarding

  • If value not computed when needed
  • Can’t forward backward in time

Stalls caused by loads from memory cannot be avoided entirely: the result is not available at the end of the EX stage but only after MEM, so forwarding removes only one of the bubbles and one stall cycle remains.
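A sketch of the load-use case (register names chosen arbitrarily): the loaded value only exists after MEM, so even with forwarding one bubble remains.

lw  $s0, 8($t1)      # $s0 is available only at the end of MEM
sub $t2, $s0, $t3    # must stall one cycle, then receive $s0 via forwarding

#                     CC1  CC2  CC3  CC4  CC5  CC6  CC7
# lw  $s0, 8($t1)     IF   ID   EX   MEM  WB
# sub $t2, $s0, $t3        IF   ID   **   EX   MEM  WB
#                                    ^ one bubble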

Code Scheduling to Avoid Stalls [mitigates data hazards]

  • Reorder code to avoid using a load result in the very next instruction (avoid the “load, then immediately use” pattern; see the example below)

Whenever possible, avoid using a value immediately after it has been loaded from memory.
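An illustrative reordering (registers and offsets are made up for the example): in the first version each loaded value is used by the very next instruction, causing two load-use stalls; hoisting the third lw removes both.

# Unscheduled: two load-use stalls
lw  $t1, 0($t0)
lw  $t2, 4($t0)
add $t3, $t1, $t2    # stalls, $t2 was just loaded
sw  $t3, 12($t0)
lw  $t4, 8($t0)
add $t5, $t1, $t4    # stalls, $t4 was just loaded
sw  $t5, 16($t0)

# Scheduled: the third lw is hoisted, no stalls remain
lw  $t1, 0($t0)
lw  $t2, 4($t0)
lw  $t4, 8($t0)
add $t3, $t1, $t2
sw  $t3, 12($t0)
add $t5, $t1, $t4
sw  $t5, 16($t0)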

Branch Prediction

Stalls are avoided probabilistically, by guessing the branch outcome; the problem is not solved completely.

  • Longer pipelines can’t readily determine branch outcome early
  • Stall penalty becomes unacceptable
  • Predict outcome of branch
  • Stall only if prediction is wrong
  • In MIPS pipeline
  • Can predict branches not taken
  • Fetch instruction after branch, with no delay
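A sketch of predict-not-taken (the label and registers are invented for the example): the fall-through add is fetched right after the beq, and is squashed in favour of the instruction at Target only if the branch turns out to be taken.

beq $s0, $s1, Target   # outcome resolved in ID with the extra compare hardware
add $t0, $t1, $t2      # fetched next cycle on the "not taken" guess
...
Target:
sub $t0, $t1, $t2      # fetched instead only when the guess was wrong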

More-Realistic Branch Prediction

Static branch prediction

The compiler predicts based on the structure of the program.

  • Based on typical branch behavior
  • Example: loop and if-statement branches (see the snippet after this list)
      • Predict backward branches taken
      • Predict forward branches not taken
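An illustrative loop (register assignments are made up): the backward bne that closes the loop is statically predicted taken, while the forward beq to an error handler is predicted not taken.

Loop:
  lw   $t0, 0($s0)
  add  $t1, $t1, $t0
  addi $s0, $s0, 4
  bne  $s0, $s2, Loop      # backward branch: predict taken
  beq  $t1, $zero, Error   # forward branch: predict not taken
  ...
Error:
  ...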

Dynamic branch prediction

Hardware-based prediction: keep statistics on past branch behavior and assume the future behaves the same as the history.

  • Hardware measures actual branch behavior
      • Record recent history of each branch
  • Assume future behavior will continue the trend
      • When wrong, stall while re-fetching, and update history

Summary

  • Pipelining improves performance by increasing instruction throughput
      • Executes multiple instructions in parallel
      • Each instruction has the same latency
  • Subject to hazards
      • Structure, data, control
  • Instruction set design affects complexity of pipeline implementation