
Revisiting Hazards: Data Hazards Control Hazards Hardware

The document discusses hardware solutions for handling data and control hazards in a pipelined MIPS datapath. It describes detecting a data hazard when an instruction in the MEM or WB stage will write a register that the instruction in the EX stage needs to read, and then forwarding the result from the EX/MEM or MEM/WB pipeline register to the ALU rather than waiting for the register write to complete. It also covers stalling on load-use hazards and flushing on branch misprediction.

Uploaded by

Vishal Mittal

Revisiting Hazards

 So far our datapath and control have ignored hazards


 We shall revisit data hazards and control hazards and
enhance our datapath and control to handle them in
hardware…
Data Hazards and Forwarding
 Problem with starting an instruction before the previous ones have finished:
 data dependencies that go backward in time, called data hazards
Time (in clock cycles):   CC 1  CC 2  CC 3  CC 4  CC 5    CC 6  CC 7  CC 8  CC 9
Value of register $2:     10    10    10    10    10/-20  -20   -20   -20   -20
($2 = 10 before sub; $2 = -20 after sub)

Program execution order (in instructions):
sub $2, $1, $3
and $12, $2, $5
or $13, $6, $2
add $14, $2, $2
sw $15, 100($2)

[Figure: pipeline diagram in which each instruction passes through IM, Reg, ALU, DM, Reg, starting one clock cycle after the previous instruction; sub begins in CC 1 and sw in CC 5.]



Software Solution
 Have compiler guarantee never any data hazards!
 by rearranging instructions to insert independent instructions
between instructions that would otherwise have a data hazard
between them,
 or, if such rearrangement is not possible, insert nops
With independent instructions moved in between:
sub $2, $1, $3
lw $10, 40($3)
slt $5, $6, $7
and $12, $2, $5
or $13, $6, $2
add $14, $2, $2
sw $15, 100($2)

or, with nops inserted:
sub $2, $1, $3
nop
nop
and $12, $2, $5
or $13, $6, $2
add $14, $2, $2
sw $15, 100($2)

 Such compiler solutions may not always be possible, and nops slow the machine down
MIPS: nop = “no operation” = 00…0 (32 bits) = sll $0, $0, 0
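The nop-insertion fallback can be sketched in Python. This is a minimal illustration, not a real compiler pass: the tuple encoding of instructions, the `insert_nops` helper and the `gap` of two instructions are assumptions made for this example. The gap of two assumes the register file is written in the first half of a clock cycle and read in the second half, which matches the two nops in the sequence above.

```python
# Each instruction: (text, destination register or None, source registers).
# This encoding is hypothetical and exists only for the sketch.
PROGRAM = [
    ("sub $2, $1, $3", 2, (1, 3)),
    ("and $12, $2, $5", 12, (2, 5)),
    ("or $13, $6, $2", 13, (6, 2)),
    ("add $14, $2, $2", 14, (2, 2)),
    ("sw $15, 100($2)", None, (15, 2)),
]

NOP = ("nop", None, ())


def insert_nops(program, gap=2):
    """Insert nops so that no instruction reads a register written by
    any of the `gap` instructions immediately before it."""
    out = []
    for instr in program:
        sources = instr[2]
        needed = 0
        # distance 1 = immediately preceding instruction, 2 = the one before
        for distance, (_, dest, _) in enumerate(reversed(out[-gap:]), start=1):
            if dest not in (None, 0) and dest in sources:
                needed = max(needed, gap - distance + 1)
        out.extend([NOP] * needed)
        out.append(instr)
    return out


if __name__ == "__main__":
    for text, _, _ in insert_nops(PROGRAM):
        print(text)
```

Only the sub/and pair needs padding here: once the two nops are in place, or, add and sw are already far enough from sub.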
Hardware Solution: Forwarding
 Idea: use intermediate data, do not wait for result to be
finally written to the destination register. Two steps:
1. Detect data hazard
2. Forward intermediate data to resolve hazard
Pipelined Datapath with Control II (as before)
[Figure: pipelined datapath with the IF/ID, ID/EX, EX/MEM and MEM/WB pipeline registers. The ID/EX register carries the EX, M and WB control fields, EX/MEM carries M and WB, and MEM/WB carries WB. Control signals shown: PCSrc, RegWrite, Branch, MemWrite, MemRead, ALUSrc, MemtoReg, ALUOp and RegDst.]
Control signals emanate from the control portions of the pipeline registers
Hazard Detection
 Hazard conditions:
1a. EX/MEM.RegisterRd = ID/EX.RegisterRs
1b. EX/MEM.RegisterRd = ID/EX.RegisterRt
2a. MEM/WB.RegisterRd = ID/EX.RegisterRs
2b. MEM/WB.RegisterRd = ID/EX.RegisterRt
 E.g., in the earlier example, the first hazard, between sub $2, $1, $3 and
and $12, $2, $5, is detected when the and is in the EX stage and the
sub is in the MEM stage, because
 EX/MEM.RegisterRd = ID/EX.RegisterRs = $2 (1a)
 Whether to forward also depends on:
 whether the earlier instruction (the one further along in the pipeline) is going
to write a register: if not, there is no need to forward, even if there is a
register number match as in the conditions above
 whether the destination register of that earlier instruction is $0, in which case
there is no need to forward a value ($0 is always 0 and never overwritten)
Data Forwarding
 Plan:
 allow inputs to the ALU not just from ID/EX, but also later
pipeline registers, and
 use multiplexors and control signals to choose appropriate
inputs to ALU
Time (in clock cycles):   CC 1  CC 2  CC 3  CC 4  CC 5    CC 6  CC 7  CC 8  CC 9
Value of register $2:     10    10    10    10    10/-20  -20   -20   -20   -20
Value of EX/MEM:          X     X     X     -20   X       X     X     X     X
Value of MEM/WB:          X     X     X     X     -20     X     X     X     X

Program execution order (in instructions):
sub $2, $1, $3
and $12, $2, $5
or $13, $6, $2
add $14, $2, $2
sw $15, 100($2)

[Figure: pipeline diagram of the five instructions, each passing through IM, Reg, ALU, DM, Reg one clock cycle apart, with the dependencies drawn from the EX/MEM and MEM/WB pipeline registers.]
Dependencies between pipeline registers move forward in time


[Figure a, no forwarding: datapath before adding forwarding hardware, showing the ID/EX, EX/MEM and MEM/WB pipeline registers around the register file, ALU and data memory.]

[Figure b, with forwarding: datapath after adding forwarding hardware. Multiplexors controlled by ForwardA and ForwardB sit at the two ALU inputs. The forwarding unit takes Rs and Rt from ID/EX together with EX/MEM.RegisterRd and MEM/WB.RegisterRd.]



Forwarding Hardware: Multiplexor Control
Mux control     Source   Explanation
ForwardA = 00   ID/EX    The first ALU operand comes from the register file
ForwardA = 10   EX/MEM   The first ALU operand is forwarded from the prior ALU result
ForwardA = 01   MEM/WB   The first ALU operand is forwarded from data memory or an earlier ALU result (*)
ForwardB = 00   ID/EX    The second ALU operand comes from the register file
ForwardB = 10   EX/MEM   The second ALU operand is forwarded from the prior ALU result
ForwardB = 01   MEM/WB   The second ALU operand is forwarded from data memory or an earlier ALU result (*)

(*) Depending on the selection in the rightmost multiplexor (see the datapath with control diagram)
Data Hazard: Detection and Forwarding
 Forwarding unit determines multiplexor control according to the
following rules:

1. EX hazard
if ( EX/MEM.RegWrite                              // if there is a write…
and ( EX/MEM.RegisterRd ≠ 0 )                     // to a non-$0 register…
and ( EX/MEM.RegisterRd = ID/EX.RegisterRs ) )    // which matches, then…
    ForwardA = 10

if ( EX/MEM.RegWrite                              // if there is a write…
and ( EX/MEM.RegisterRd ≠ 0 )                     // to a non-$0 register…
and ( EX/MEM.RegisterRd = ID/EX.RegisterRt ) )    // which matches, then…
    ForwardB = 10
Data Hazard: Detection and Forwarding
2. MEM hazard
if ( MEM/WB.RegWrite                              // if there is a write…
and ( MEM/WB.RegisterRd ≠ 0 )                     // to a non-$0 register…
and ( EX/MEM.RegisterRd ≠ ID/EX.RegisterRs )      // and not already a register match
                                                  // with earlier pipeline register…
and ( MEM/WB.RegisterRd = ID/EX.RegisterRs ) )    // but match with later pipeline register, then…
    ForwardA = 01

if ( MEM/WB.RegWrite                              // if there is a write…
and ( MEM/WB.RegisterRd ≠ 0 )                     // to a non-$0 register…
and ( EX/MEM.RegisterRd ≠ ID/EX.RegisterRt )      // and not already a register match
                                                  // with earlier pipeline register…
and ( MEM/WB.RegisterRd = ID/EX.RegisterRt ) )    // but match with later pipeline register, then…
    ForwardB = 01

The EX/MEM.RegisterRd ≠ ID/EX.RegisterRs (or Rt) check is necessary, e.g., for sequences such as add $1, $1, $2; add $1, $1, $3; add $1, $1, $4;
(array summing?), where an earlier pipeline (EX/MEM) register has more recent data
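The EX-hazard and MEM-hazard rules can be written as one small function. The sketch below follows the rules as stated on these slides; the function name `forwarding_unit` and its parameter names are assumptions made for illustration, and register numbers are plain integers.

```python
def forwarding_unit(id_ex_rs, id_ex_rt,
                    ex_mem_reg_write, ex_mem_rd,
                    mem_wb_reg_write, mem_wb_rd):
    """Return (ForwardA, ForwardB) as 2-bit mux selects.

    0b00 = operand from ID/EX (register file)
    0b10 = operand forwarded from EX/MEM
    0b01 = operand forwarded from MEM/WB
    """
    forward_a = 0b00
    forward_b = 0b00

    # 2. MEM hazard: write to a non-$0 register that matches, and no
    #    register match with the earlier (EX/MEM) pipeline register.
    if mem_wb_reg_write and mem_wb_rd != 0:
        if ex_mem_rd != id_ex_rs and mem_wb_rd == id_ex_rs:
            forward_a = 0b01
        if ex_mem_rd != id_ex_rt and mem_wb_rd == id_ex_rt:
            forward_b = 0b01

    # 1. EX hazard: write to a non-$0 register that matches.
    if ex_mem_reg_write and ex_mem_rd != 0:
        if ex_mem_rd == id_ex_rs:
            forward_a = 0b10
        if ex_mem_rd == id_ex_rt:
            forward_b = 0b10

    return forward_a, forward_b


if __name__ == "__main__":
    # and $12, $2, $5 in EX while sub $2, $1, $3 is in MEM
    print(forwarding_unit(2, 5, True, 2, False, 0))
    # third add of: add $1,$1,$2; add $1,$1,$3; add $1,$1,$4
    print(forwarding_unit(1, 4, True, 1, True, 1))
```

In the triple-add case both EX/MEM and MEM/WB hold a value for $1, and the function selects EX/MEM, which holds the more recent one.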
Forwarding Hardware with Control
Called forwarding unit, not hazard detection unit,
because once data is forwarded there is no hazard!
[Figure: datapath with the forwarding unit. The unit receives Rs and Rt from ID/EX (taken from IF/ID.RegisterRs and IF/ID.RegisterRt), EX/MEM.RegisterRd and MEM/WB.RegisterRd, and drives the multiplexors at the two ALU inputs.]
Datapath with forwarding hardware and control wires; certain details,
e.g., branching hardware, are omitted to simplify the drawing
Note: so far we have only handled forwarding to R-type instructions…!
Forwarding
 Execution example:
sub $2, $1, $3
and $4, $2, $5
or $4, $4, $2
add $9, $4, $2

[Figures: datapath snapshots for clock cycles 3 to 6. Instruction in each stage:]
Clock   IF               ID               EX               MEM              WB
3       or $4, $4, $2    and $4, $2, $5   sub $2, $1, $3   before<1>        before<2>
4       add $9, $4, $2   or $4, $4, $2    and $4, $2, $5   sub $2, $1, $3   before<1>
5       after<1>         add $9, $4, $2   or $4, $4, $2    and $4, $2, $5   sub $2, $1, $3
6       after<2>         after<1>         add $9, $4, $2   or $4, $4, $2    and $4, $2, $5
Data Hazards and Stalls
 Load word can still cause a hazard:
 an instruction tries to read a register following a load instruction that
writes to the same register
Program execution order (in instructions):
lw $2, 20($1)
and $4, $2, $5
or $8, $2, $6
add $9, $4, $2
slt $1, $6, $7

[Figure: pipeline diagram over CC 1 to CC 9; each instruction passes through IM, Reg, ALU, DM, Reg one clock cycle apart.]
Because the dependency from lw to and goes backward in time even between pipeline registers, forwarding alone will not solve the hazard
 therefore, we need a hazard detection unit to stall the pipeline after the load instruction
Pipelined Datapath with Control II (as before)
[Figure repeated from earlier: pipelined datapath with control, showing the control signals PCSrc, RegWrite, Branch, MemWrite, MemRead, ALUSrc, MemtoReg, ALUOp and RegDst.]
Control signals emanate from the control portions of the pipeline registers
Hazard Detection Logic to Stall
 Hazard detection unit implements the following check to decide whether to stall:

if ( ID/EX.MemRead                                 // if the instruction in the EX stage is a load…
and ( ( ID/EX.RegisterRt = IF/ID.RegisterRs )      // and the destination register
   or ( ID/EX.RegisterRt = IF/ID.RegisterRt ) ) )  // matches either source register
                                                   // of the instruction in the ID stage, then…
    stall the pipeline
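The load-use check translates directly into a boolean function. This is a minimal sketch; the name `must_stall` and the parameter names are assumptions for illustration, with register numbers given as integers.

```python
def must_stall(id_ex_mem_read, id_ex_rt, if_id_rs, if_id_rt):
    """Load-use hazard check performed by the hazard detection unit.

    id_ex_mem_read: True when the instruction in EX is a load.
    id_ex_rt:       destination register of that load.
    if_id_rs/rt:    source registers of the instruction in ID.
    """
    return bool(id_ex_mem_read and
                (id_ex_rt == if_id_rs or id_ex_rt == if_id_rt))


if __name__ == "__main__":
    # lw $2, 20($1) in EX, and $4, $2, $5 in ID
    print(must_stall(True, 2, 2, 5))
    # sub $2, $1, $3 in EX (not a load): forwarding handles it, no stall
    print(must_stall(False, 2, 2, 5))
```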
Mechanics of Stalling
 If the stall check succeeds, the pipeline needs to stall
only 1 clock cycle after the load, because after that the forwarding
unit can resolve the dependency
 What the hardware does to stall the pipeline 1 cycle:
 does not let the IF/ID register change (disable write!); this will
cause the instruction in the ID stage to repeat, i.e., stall
 therefore, the instruction just behind, in the IF stage, must be
stalled as well, so the hardware does not let the PC change
(disable write!); this will cause the instruction in the IF stage to
repeat, i.e., stall
 changes all the EX, MEM and WB control fields in the ID/EX
pipeline register to 0, so effectively the instruction just behind
the load becomes a nop; a bubble is said to have been inserted
into the pipeline
 note that we cannot turn that instruction into a nop by zeroing all the
bits in the instruction itself (recall nop = 00…0, 32 bits), because
it has already been decoded and control signals generated
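The three hardware actions above can be sketched as one clock step of the front end of the pipeline. Everything here is a simplified model made up for illustration: the `front_end_step` function, the `ZERO_CONTROL` dictionary and the way control fields are represented are assumptions, not the actual datapath signals.

```python
# Control fields written into ID/EX; all zeros acts as a bubble (nop).
ZERO_CONTROL = {"EX": 0, "M": 0, "WB": 0}


def front_end_step(pc, if_id_instr, fetched_instr, decoded_control, stall):
    """Advance PC, IF/ID and the ID/EX control fields by one clock.

    Returns (next_pc, next_if_id_instr, next_id_ex_control).
    """
    if stall:
        # PCWrite = 0: the PC keeps its value, so IF repeats.
        # IF/IDWrite = 0: IF/ID keeps its value, so ID repeats.
        # Control mux selects 0: a bubble enters ID/EX.
        return pc, if_id_instr, dict(ZERO_CONTROL)
    # Normal operation: fetch the next instruction and pass control along.
    return pc + 4, fetched_instr, dict(decoded_control)


if __name__ == "__main__":
    and_control = {"EX": 0b1100, "M": 0b000, "WB": 0b10}  # illustrative values
    print(front_end_step(8, "and $4, $2, $5", "or $4, $4, $2",
                         and_control, stall=True))
```

With `stall=True` the and stays in IF/ID and the or is fetched again on the next cycle, which is the repetition visible in the pipeline diagrams.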
Hazard Detection Unit
[Figure: datapath with the forwarding unit and the hazard detection unit. The hazard detection unit takes ID/EX.MemRead, ID/EX.RegisterRt, IF/ID.RegisterRs and IF/ID.RegisterRt as inputs and controls PCWrite, IF/IDWrite and the multiplexor that selects between the control values and 0 for the ID/EX register.]
Datapath with forwarding hardware, the hazard detection unit and
control wires; certain details, e.g., branching hardware, are omitted
to simplify the drawing
Stalling Resolves a Hazard
 Same instruction sequence as before for which forwarding by
itself could not resolve the hazard:
Program execution order (in instructions):
lw $2, 20($1)
and $4, $2, $5
or $8, $2, $6
add $9, $4, $2
slt $1, $6, $7

[Figure: pipeline diagram over CC 1 to CC 10. lw starts in CC 1. The and repeats its Reg (ID) stage in CC 4 and the or repeats its IM (IF) stage in CC 4, which marks the bubble; add starts in CC 5 and slt in CC 6.]
Hazard detection unit inserts a 1-cycle bubble in the pipeline, after
which all pipeline register dependencies go forward, so the
forwarding unit can handle them and there are no more hazards
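Stall/Bubble in the Pipeline
[Figure: pipeline diagram marking where the stall is inserted.]
Stall/Bubble in the Pipeline
[Figure: "Or, more accurately…", a second pipeline diagram of the same stall.]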
Stalling
 Execution example:
lw $2, 20($1)
and $4, $2, $5
or $4, $4, $2
add $9, $4, $2

[Figures: datapath snapshots for clock cycles 2 to 7, with the hazard detection unit and the forwarding unit. Instruction in each stage:]
Clock   IF               ID               EX               MEM              WB
2       and $4, $2, $5   lw $2, 20($1)    before<1>        before<2>        before<3>
3       or $4, $4, $2    and $4, $2, $5   lw $2, 20($1)    before<1>        before<2>
4       or $4, $4, $2    and $4, $2, $5   bubble           lw $2, 20($1)    before<1>
5       add $9, $4, $2   or $4, $4, $2    and $4, $2, $5   bubble           lw $2, 20($1)
6       after<1>         add $9, $4, $2   or $4, $4, $2    and $4, $2, $5   bubble
7       after<2>         after<1>         add $9, $4, $2   or $4, $4, $2    and $4, $2, $5
Control (or Branch) Hazards
 The problem with branches in the pipeline we have so far is that the
branch decision is not made until the MEM stage. So which
instructions, if any, should we fetch into the pipeline following
the branch instruction?
 Possible solution: stall the pipeline until the branch decision is known
 not efficient: slows the pipeline significantly!
 Another solution: predict the branch outcome
 e.g., always predict branch-not-taken and continue with the next
sequential instructions
 if the prediction is wrong, we have to flush the pipeline behind the
branch (discard instructions already fetched or decoded) and
continue execution at the branch target
Predicting Branch-not-taken: Misprediction delay
Program execution order (in instructions), with instruction addresses:
40 beq $1, $3, 7
44 and $12, $2, $5
48 or $13, $6, $2
52 add $14, $2, $2
72 lw $4, 50($7)

[Figure: pipeline diagram over CC 1 to CC 9; beq starts in CC 1, the three sequential instructions start in CC 2 to CC 4, and lw starts in CC 5.]
The outcome of branch taken (prediction wrong) is decided only when
beq is in the MEM stage, so the following three sequential instructions
already in the pipeline have to be flushed and execution resumes at lw
Optimizing the Pipeline to Reduce Branch Delay

 Move the branch decision from the MEM stage (as in our
current pipeline) earlier to the ID stage
 calculating the branch target address involves moving the
branch adder from the MEM stage to the ID stage; the inputs to
this adder, the PC value and the immediate field, are already
available in the IF/ID pipeline register
 calculating the branch decision is efficiently done, e.g., for
equality test, by XORing respective bits and then ORing all the
results and inverting, rather than using the ALU to subtract and
then test for zero (which incurs a carry-propagation delay)
 with the more efficient equality test we can put it in the ID stage
without significantly lengthening this stage; remember, an objective
of pipeline design is to keep pipeline stages balanced
 we must correspondingly make additions to the forwarding and
hazard detection units to forward to or stall the branch at the ID
stage in case the branch decision depends on an earlier result
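The XOR, OR and invert equality test can be modelled bit by bit. This is a software sketch of the logic only, assuming 32-bit registers; the function name `registers_equal` is made up for the example, and real hardware evaluates all bits in parallel rather than in a loop.

```python
def registers_equal(a, b, width=32):
    """Return 1 if the low `width` bits of a and b are equal, else 0.

    Mirrors the gate-level scheme: XOR corresponding bits, OR all the
    XOR outputs together, then invert the result.
    """
    mask = (1 << width) - 1
    diff = (a ^ b) & mask           # bit i is 1 where a and b differ
    any_diff = 0
    for i in range(width):          # OR-reduce the XOR outputs
        any_diff |= (diff >> i) & 1
    return any_diff ^ 1             # invert: 1 means "equal"


if __name__ == "__main__":
    print(registers_equal(10, 10))
    print(registers_equal(10, -20))
```

No subtraction is involved, so there is no carry chain on the path, which is the reason the test fits in the ID stage.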
Flushing on Misprediction
 Same strategy as for stalling on load-use data hazard…
 Zero out all the control values (or the instruction itself) in
pipeline registers for the instructions following the branch that
are already in the pipeline – effectively turning them into nops
– so they are flushed
 in the optimized pipeline, with branch decision made in the ID
stage, we have to flush only one instruction in the IF stage – the
branch delay penalty is then only one clock cycle
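The flush counts quoted for the two designs follow from how many younger instructions have entered the pipeline by the time the branch is resolved. A minimal sketch, assuming the five-stage order IF, ID, EX, MEM, WB with one instruction fetched per cycle; the helper name `misprediction_penalty` is hypothetical.

```python
STAGES = ["IF", "ID", "EX", "MEM", "WB"]


def misprediction_penalty(resolve_stage):
    """Number of instructions fetched behind the branch (and therefore
    flushed on a misprediction) when the branch decision is made in
    `resolve_stage`."""
    # One younger instruction enters the pipeline for every stage the
    # branch has already moved past.
    return STAGES.index(resolve_stage)


if __name__ == "__main__":
    print(misprediction_penalty("MEM"))  # original pipeline
    print(misprediction_penalty("ID"))   # optimized pipeline
```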
Optimized Datapath for Branch
[Figure: optimized datapath for branches, with the IF.Flush control line, the branch target adder (with Shift left 2 and Sign extend) and a register equality comparator (=) placed in the ID stage, alongside the hazard detection unit and the forwarding unit.]
IF.Flush control zeros out the instruction in the IF/ID pipeline register (which follows the branch)
Branch decision is moved from the MEM stage to the ID stage; simplified drawing
not showing enhancements to the forwarding and hazard detection units
Reducing Branch Delay
 Move hardware to determine outcome to ID stage
 Target address adder
 Register comparator
 Example: branch taken
36: sub $10, $4, $8
40: beq $1, $3, 7
44: and $12, $2, $5
48: or $13, $2, $6
52: add $14, $4, $2
56: slt $15, $6, $7
...
72: lw $4, 50($7)
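The addresses in this example are consistent with the MIPS branch target calculation: the 16-bit immediate is sign-extended, shifted left by 2, and added to PC + 4. A short sketch; the function names are made up for illustration.

```python
def sign_extend16(imm):
    """Interpret the low 16 bits of imm as a signed two's-complement value."""
    imm &= 0xFFFF
    return imm - 0x10000 if imm & 0x8000 else imm


def branch_target(pc, imm16):
    """Target of a MIPS branch located at address pc with immediate imm16."""
    return (pc + 4 + (sign_extend16(imm16) << 2)) & 0xFFFFFFFF


if __name__ == "__main__":
    # beq $1, $3, 7 at address 40
    print(branch_target(40, 7))
```

For the beq at address 40 with immediate 7 this gives 40 + 4 + 28 = 72, the address of the lw.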
Example: Branch Taken
[Two figure slides with this title; the diagrams are not recoverable from the extracted text.]
Data Hazards for Branches
 If a comparison register is a destination
of 2nd or 3rd preceding ALU instruction

add $1, $2, $3       IF  ID  EX  MEM WB
add $4, $5, $6           IF  ID  EX  MEM WB
…                            IF  ID  EX  MEM WB
beq $1, $4, target               IF  ID  EX  MEM WB

 Can resolve using forwarding

Data Hazards for Branches
 If a comparison register is a destination
of preceding ALU instruction or 2nd
preceding load instruction
 Need 1 stall cycle

lw $1, addr          IF  ID  EX  MEM WB
add $4, $5, $6           IF  ID  EX  MEM WB
beq stalled                  IF  ID
beq $1, $4, target               ID  EX  MEM WB

Data Hazards for Branches
 If a comparison register is a destination
of immediately preceding load
instruction
 Need 2 stall cycles

lw $1, addr          IF  ID  EX  MEM WB
beq stalled              IF  ID
beq stalled                  ID
beq $1, $0, target               ID  EX  MEM WB
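The three cases above fit one rule when the branch is resolved in the ID stage: an ALU result can be forwarded two cycles after its producer was in ID, and a loaded value three cycles after. The sketch below encodes that rule; the function `branch_stall_cycles` and its parameters are assumptions made for illustration.

```python
def branch_stall_cycles(producer_is_load, distance):
    """Stall cycles for a branch resolved in ID.

    producer_is_load: True if the instruction writing the compared
                      register is a load, False for an ALU instruction.
    distance:         1 if the producer immediately precedes the branch,
                      2 if it is the 2nd preceding instruction, and so on.
    """
    if distance < 1:
        raise ValueError("distance must be at least 1")
    # Cycles after the producer's ID stage at which its result can be
    # forwarded: after EX for ALU results, after MEM for loads.
    ready_after = 3 if producer_is_load else 2
    return max(0, ready_after - distance)


if __name__ == "__main__":
    print(branch_stall_cycles(producer_is_load=True, distance=1))
    print(branch_stall_cycles(producer_is_load=False, distance=1))
```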


Simple Example: Comparing Performance
 Compare performance for single-cycle, multicycle, and pipelined
datapaths using the gcc instruction mix
 assume 2 ns for memory access, 2 ns for ALU operation, 1 ns
for register read or write
 assume gcc instruction mix 23% loads, 13% stores, 19%
branches, 2% jumps, 43% ALU
 for pipelined execution assume
 50% of the loads are followed immediately by an instruction that
uses the result of the load
 25% of branches are mispredicted
 branch delay on misprediction is 1 clock cycle
 jumps always incur 1 clock cycle delay so their average time is 2
clock cycles
Simple Example: Comparing Performance
 Single-cycle: average instruction time 8 ns
 Multicycle: average instruction time 8.04 ns
 Pipelined:
 loads use 1 cc (clock cycle) when there is no load-use dependency and 2 cc
when there is one; given that 50% of loads are followed by a
dependent instruction, the average cc per load is 1.5
 stores use 1 cc each
 branches use 1 cc when predicted correctly and 2 cc when not;
given 25% misprediction, the average cc per branch is 1.25
 jumps use 2 cc each
 ALU instructions use 1 cc each
 therefore, average CPI is
1.5 × 23% + 1 × 13% + 1.25 × 19% + 2 × 2% + 1 × 43% = 1.18
 therefore, average instruction time is 1.18 × 2 = 2.36 ns
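The averages can be checked with a few lines of Python. The instruction mix, the pipelined cycle counts and the 2 ns clock come from the slides; the per-class multicycle cycle counts (5 for loads, 4 for stores and ALU instructions, 3 for branches and jumps) are not given here and are an assumption, chosen because they reproduce the 8.04 ns figure.

```python
MIX = {"load": 0.23, "store": 0.13, "branch": 0.19, "jump": 0.02, "alu": 0.43}
CLOCK_NS = 2.0  # clock period used for both multicycle and pipelined designs

# Average cycles per instruction class in the pipelined design (from the slides).
PIPELINED_CYCLES = {"load": 1.5, "store": 1.0, "branch": 1.25,
                    "jump": 2.0, "alu": 1.0}

# Assumed cycles per instruction class in the multicycle design.
MULTICYCLE_CYCLES = {"load": 5, "store": 4, "branch": 3, "jump": 3, "alu": 4}


def average_cpi(cycles, mix=MIX):
    """Weighted average of cycles per instruction over the instruction mix."""
    return sum(cycles[kind] * share for kind, share in mix.items())


if __name__ == "__main__":
    for name, cycles in [("pipelined", PIPELINED_CYCLES),
                         ("multicycle", MULTICYCLE_CYCLES)]:
        cpi = average_cpi(cycles)
        print(f"{name}: CPI = {cpi:.4f}, time = {cpi * CLOCK_NS:.3f} ns")
```

The unrounded pipelined CPI is 1.1825; the slides round it to 1.18 before multiplying by the 2 ns clock, which gives the quoted 2.36 ns.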
