The Single Cycle Datapath
[Figure: datapath overview: PC, instruction memory (Address, Instruction), register file (Register #, Data), ALU, data memory (Address, Data)]
Note: Some of the material in this lecture is COPYRIGHT 1998 MORGAN KAUFMANN PUBLISHERS, INC. ALL RIGHTS RESERVED. Figures may be reproduced only for classroom or personal education use in conjunction with our text and only when the above line is included.
2/6/02 CSE 141 - Single Cycle Datapath
The Performance Big Picture
Execution Time = Insts * CPI * Cycle Time
Processor design (datapath and control) will determine:
  Clock cycle time
  Clock cycles per instruction
Starting today: the single cycle processor
  Executes an entire instruction in one clock cycle
  Advantage: CPI = 1
  Disadvantage: long cycle time
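A minimal sketch of the execution time formula above. The instruction count and both cycle times are hypothetical numbers chosen only to illustrate the trade-off; they are not measurements of any real design, and the "alternative" design is an assumption made for comparison.

```python
def execution_time_ns(instructions: int, cpi: float, cycle_time_ns: float) -> float:
    """Execution time in nanoseconds: Insts * CPI * Cycle Time."""
    return instructions * cpi * cycle_time_ns


# Hypothetical workload size, chosen only for illustration.
INSTRUCTIONS = 1_000_000

# Single cycle design: CPI is exactly 1, but the cycle must be long.
single_cycle_ns = execution_time_ns(INSTRUCTIONS, cpi=1.0, cycle_time_ns=8.0)

# A hypothetical alternative with a shorter cycle but a higher CPI.
alternative_ns = execution_time_ns(INSTRUCTIONS, cpi=3.5, cycle_time_ns=2.0)

print(single_cycle_ns, alternative_ns)
```

With these made-up numbers, CPI = 1 alone does not guarantee the shortest execution time; the product of all three terms is what matters.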
Processor Design
We're ready to implement the MIPS core
  load-store instructions: lw, sw
  reg-reg instructions: add, sub, and, or, slt
  control flow instructions: beq
First, we need to fetch an instruction into the processor
  program counter (PC) supplies instruction address
  get the instruction from memory
[Figure: the PC drives the Address input of a memory with Write Enable, Clk, and 32-bit Data In and DataOut ports; the instruction appears at DataOut]
That was too easy
A problem: how will we do a load or store?
  remember that memory has only 1 port, and we want to do everything in 1 cycle
[Figure: the same memory, with the PC on Address and the instruction appearing at DataOut]
Instruction & Data in same cycle?
Solution: separate data and instruction memory
  There will be only one DRAM memory
  We want a stored program architecture
    How else can you compile and then run a program??
  But we can have separate SRAM caches (We'll study caches later)
[Figure: the PC supplies the address to an instruction cache, where the instruction appears; a separate data cache has Address, Write Enable, Clk, and 32-bit Data In and DataOut ports]
Instruction Fetch Unit
Updating the PC for next instruction
  Sequential Code: PC <- PC + 4
  Branch and Jump: PC <- something else
    we'll worry about these later
[Figure: the PC feeds the Read address of the instruction memory and an adder that adds 4 to produce the next PC]
The MIPS core subset
R-type
  add rd, rs, rt
  sub, and, or, slt

  bits    31-26    25-21    20-16    15-11    10-6     5-0
  field   op       rs       rt       rd       shamt    funct
  width   6 bits   5 bits   5 bits   5 bits   5 bits   6 bits

  1. Read registers rs and rt
  2. Feed them to ALU
  3. Update register file

LOAD and STORE
  lw rt, rs, imm
  sw rt, rs, imm

  bits    31-26    25-21    20-16    15-0
  field   op       rs       rt       immediate
  width   6 bits   5 bits   5 bits   16 bits

  1. Read register rs (and rt for store)
  2. Feed rs and immed to ALU
  3. Move data between mem and reg

BRANCH
  beq rs, rt, imm

  bits    31-26    25-21    20-16    15-0
  field   op       rs       rt       displacement
  width   6 bits   5 bits   5 bits   16 bits

  1. Read registers rs and rt
  2. Feed to ALU to compare
  3. Add PC to disp; update PC
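The field layouts above can be sketched as a small decoder. The function name `decode` and the dictionary keys are illustrative choices, not part of the lecture; the example word uses the standard MIPS encoding of add (op = 0, funct = 0x20).

```python
def decode(instr: int) -> dict:
    """Split a 32-bit MIPS instruction word into its fields."""
    return {
        "op":    (instr >> 26) & 0x3F,   # bits 31-26
        "rs":    (instr >> 21) & 0x1F,   # bits 25-21
        "rt":    (instr >> 16) & 0x1F,   # bits 20-16
        "rd":    (instr >> 11) & 0x1F,   # bits 15-11 (R-type only)
        "shamt": (instr >> 6) & 0x1F,    # bits 10-6  (R-type only)
        "funct": instr & 0x3F,           # bits 5-0   (R-type only)
        "imm16": instr & 0xFFFF,         # bits 15-0  (lw, sw, beq)
    }


# add $3, $1, $2 assembled by hand from its fields.
add_word = (0 << 26) | (1 << 21) | (2 << 16) | (3 << 11) | (0 << 6) | 0x20
fields = decode(add_word)
print(fields["rs"], fields["rt"], fields["rd"], fields["funct"])
```

Every field is extracted unconditionally; which of them are meaningful depends on the format selected by op, just as the hardware wires all instruction bits out and lets control decide what to use.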
Processor Design
Generic Implementation:
  all instructions read some registers
  all instructions use the ALU after reading registers
  memory accessed & registers updated after ALU
Suggests basic design:
[Figure: basic datapath: PC, instruction memory (Address, Instruction), register file (Register #, Data), ALU, data memory (Address, Data)]
Datapath for Reg-Reg Operations
R[rd] <- R[rs] op R[rt]
Example: add rd, rs, rt
  Ra, Rb, and Rw come from rs, rt, and rd fields
  ALUoperation signal depends on op and funct

  bits    31-26    25-21    20-16    15-11    10-6     5-0
  field   op       rs       rt       rd       shamt    funct
  width   6 bits   5 bits   5 bits   5 bits   5 bits   6 bits

[Figure: the instruction supplies Read register 1, Read register 2, and Write register to the register file; Read data 1 and Read data 2 feed the ALU (3-bit ALU operation input, Zero output); the ALU result returns to Write data; RegWrite enables the register write]
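The reg-reg transfer above can be sketched in software as follows. This is an illustrative model, not the hardware: the operation names are strings standing in for the encoded ALU operation signal, and `alu`, `execute_rtype`, and `to_signed` are hypothetical helper names.

```python
MASK32 = 0xFFFFFFFF


def to_signed(x: int) -> int:
    """Interpret a 32-bit pattern as a two's complement integer."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x


def alu(operation: str, a: int, b: int) -> tuple:
    """Return (result, zero) for one of the core ALU operations."""
    if operation == "add":
        result = a + b
    elif operation == "sub":
        result = a - b
    elif operation == "and":
        result = a & b
    elif operation == "or":
        result = a | b
    elif operation == "slt":
        result = 1 if to_signed(a) < to_signed(b) else 0
    else:
        raise ValueError(f"unsupported ALU operation: {operation}")
    result &= MASK32
    return result, result == 0


def execute_rtype(regs: list, operation: str, rd: int, rs: int, rt: int,
                  reg_write: bool = True) -> None:
    """R[rd] <- R[rs] op R[rt], written back only when RegWrite is set."""
    result, _zero = alu(operation, regs[rs], regs[rt])
    if reg_write:
        regs[rd] = result


regs = [0] * 32
regs[1], regs[2] = 5, 7
execute_rtype(regs, "add", rd=3, rs=1, rt=2)   # regs[3] = 12
execute_rtype(regs, "sub", rd=4, rs=1, rt=2)   # regs[4] = 0xFFFFFFFE (-2)
execute_rtype(regs, "slt", rd=5, rs=1, rt=2)   # regs[5] = 1
```

The Zero output is computed for every operation even though R-type instructions ignore it, mirroring the figure, where Zero is always produced by the ALU.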
Datapath for Load Operations
R[rt] <- Mem[R[rs] + SignExt[imm16]]
Example: lw rt, rs, imm16

  bits    31-26    25-21    20-16    15-0
  field   op       rs       rt       immediate
  width   6 bits   5 bits   5 bits   16 bits

[Figure: register file (Read register 1, Read register 2, Write register, Write data, RegWrite), a 16-to-32-bit sign extend unit, the ALU (3-bit ALU operation input, Zero and ALU result outputs), and the data memory (Address, Write data, Read data, MemRead, MemWrite)]
Datapath for Store Operations
Mem[R[rs] + SignExt[imm16]] <- R[rt]
Example: sw rt, rs, imm16

  bits    31-26    25-21    20-16    15-0
  field   op       rs       rt       immediate
  width   6 bits   5 bits   5 bits   16 bits

[Figure: register file (Read register 1, Read register 2, Write register, Write data, RegWrite), a 16-to-32-bit sign extend unit, the ALU (3-bit ALU operation input, Zero and ALU result outputs), and the data memory (Address, Write data, Read data, MemRead, MemWrite)]
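The load and store transfers can be sketched together, since both compute the same effective address R[rs] + SignExt[imm16]. This is a simplified model under stated assumptions: data memory is a Python dict keyed by byte address holding whole 32-bit words, unwritten locations read as 0, and alignment is not checked. All function names are hypothetical.

```python
MASK32 = 0xFFFFFFFF


def sign_extend16(imm16: int) -> int:
    """Extend a 16-bit two's complement value to a Python integer."""
    imm16 &= 0xFFFF
    return imm16 - 0x10000 if imm16 & 0x8000 else imm16


def effective_address(regs: list, rs: int, imm16: int) -> int:
    """The ALU adds R[rs] and the sign-extended immediate."""
    return (regs[rs] + sign_extend16(imm16)) & MASK32


def load_word(regs: list, mem: dict, rt: int, rs: int, imm16: int) -> None:
    """lw: R[rt] <- Mem[R[rs] + SignExt[imm16]]."""
    regs[rt] = mem.get(effective_address(regs, rs, imm16), 0)


def store_word(regs: list, mem: dict, rt: int, rs: int, imm16: int) -> None:
    """sw: Mem[R[rs] + SignExt[imm16]] <- R[rt]."""
    mem[effective_address(regs, rs, imm16)] = regs[rt] & MASK32


regs = [0] * 32
mem = {}
regs[1] = 0x1000          # base address
regs[2] = 0xDEADBEEF      # value to store
store_word(regs, mem, rt=2, rs=1, imm16=0xFFFC)   # offset -4 -> address 0x0FFC
load_word(regs, mem, rt=3, rs=1, imm16=0xFFFC)    # reads the same word back
```

Note that rt is a source register for sw but the destination register for lw, which is why the combined datapath needs a way to choose the write register.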
Combining datapaths
How do we allow different datapaths for different instructions??
[Figure: two separate datapaths side by side. R-type: register file and ALU, with RegWrite and ALU operation. Store: register file, 16-to-32-bit sign extend, ALU, and data memory, with RegWrite, ALU operation, MemWrite, and MemRead]
Use a multiplexor!
[Figure: combined R-type and load/store datapath: register file, 16-to-32-bit sign extend, a multiplexor controlled by ALUSrc on the second ALU input, ALU, and data memory; control signals RegWrite, ALU operation, MemWrite, MemRead]
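A minimal sketch of the ALUSrc multiplexor. The polarity here is an assumption inferred from the R-format settings given later in this lecture (ALUSrc=1 for add, which must use the register operand); other presentations of this datapath may label the mux inputs the other way around. The names `mux2` and `second_alu_input` are hypothetical.

```python
def mux2(select: int, in0, in1):
    """Two-input multiplexor: pass in1 when select is 1, else in0."""
    return in1 if select else in0


def second_alu_input(alu_src: int, read_data_2: int, sign_extended_imm: int) -> int:
    # Assumed polarity: ALUSrc=1 picks the register operand (R-type),
    # ALUSrc=0 picks the sign-extended immediate (load/store).
    return mux2(alu_src, sign_extended_imm, read_data_2)


print(second_alu_input(1, read_data_2=7, sign_extended_imm=100))
print(second_alu_input(0, read_data_2=7, sign_extended_imm=100))
```

Both inputs are always computed; the control signal only decides which one reaches the ALU, so one piece of hardware serves both instruction classes.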
Datapath for Branch Operations
beq rs, rt, imm16

  bits    31-26    25-21    20-16    15-0
  field   op       rs       rt       immediate
  width   6 bits   5 bits   5 bits   16 bits

We need to compare Rs and Rt
[Figure: the register file reads rs and rt into the ALU, whose Zero output goes to the branch control logic; the 16-bit immediate is sign-extended to 32 bits, shifted left 2, and added to PC + 4 (from the instruction datapath) to form the branch target]
Computing the Next Address
PC is a 32-bit byte address into the instruction memory:
  Sequential operation: PC<31:0> = PC<31:0> + 4
  Branch: PC<31:0> = PC<31:0> + 4 + SignExt[Imm16] * 4
We don't need the 2 least-significant bits because:
  The 32-bit PC is a byte address
  And all our instructions are 4 bytes (32 bits) long
  The 2 LSBs of the 32-bit PC are always zeros
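The two next-address cases can be sketched as below. The `take_branch` flag is an assumption standing in for the output of the branch control logic (for beq, the instruction is a branch and the ALU Zero output is set); the function names and the example addresses are illustrative.

```python
MASK32 = 0xFFFFFFFF


def sign_extend16(imm16: int) -> int:
    """Extend a 16-bit two's complement value to a Python integer."""
    imm16 &= 0xFFFF
    return imm16 - 0x10000 if imm16 & 0x8000 else imm16


def next_pc(pc: int, imm16: int, take_branch: bool) -> int:
    """Sequential: PC + 4. Branch: PC + 4 + SignExt[imm16] * 4."""
    pc_plus_4 = (pc + 4) & MASK32
    if take_branch:
        # Shift left 2 multiplies the word displacement by 4 bytes.
        return (pc_plus_4 + (sign_extend16(imm16) << 2)) & MASK32
    return pc_plus_4


print(hex(next_pc(0x00400000, 3, take_branch=False)))
print(hex(next_pc(0x00400000, 3, take_branch=True)))
print(hex(next_pc(0x00400010, 0xFFFF, take_branch=True)))
```

Because every value added to the PC is a multiple of 4, the two least-significant bits stay zero as long as the starting PC is word aligned.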
All together: the single cycle datapath
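[Figure: the complete single cycle datapath. Instruction fetch: PC, instruction memory (Read address, Instruction [31-0]), and an adder for PC + 4. Register file addressed by Instruction [25-21] and Instruction [20-16], with a RegDst multiplexor choosing the write register from Instruction [20-16] or Instruction [15-11]. Instruction [15-0] is sign-extended from 16 to 32 bits; an ALUSrc multiplexor selects the second ALU input. ALU control takes Instruction [5-0] and ALUOp. Data memory with MemRead and MemWrite; a MemtoReg multiplexor selects the register write data. A second adder sums PC + 4 and the sign-extended immediate shifted left 2; a PCSrc multiplexor selects the next PC. Each multiplexor's inputs are labeled 1 and 0.]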
The R-Format (e.g. add) Datapath
[Figure: the complete single cycle datapath, with control signals PCSrc, RegWrite, RegDst, ALUSrc, ALUOp, MemWrite, MemRead, and MemtoReg]
Need ALUSrc=1, ALUOp=add, MemWrite=0, MemtoReg=0, RegDst=0, RegWrite=1, and PCSrc=1.
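A sketch that checks what the R-format settings above select at each multiplexor. The mux polarities are assumptions inferred from those settings (an add must write rd, use the register operand, write back the ALU result, and continue at PC + 4); they follow this lecture's figure labeling and may differ from other presentations. The string arguments are labels used only to make the routing visible, and `route` is a hypothetical helper.

```python
# Control settings for R-format, as listed on the slide.
R_FORMAT = {"ALUSrc": 1, "ALUOp": "add", "MemWrite": 0, "MemtoReg": 0,
            "RegDst": 0, "RegWrite": 1, "PCSrc": 1}


def mux2(select: int, in0, in1):
    """Two-input multiplexor: pass in1 when select is 1, else in0."""
    return in1 if select else in0


def route(ctrl: dict, rt, rd, read_data_2, imm32, alu_result, mem_data,
          pc_plus_4, branch_target) -> dict:
    """Apply the four datapath multiplexors (assumed polarities)."""
    return {
        "write_register": mux2(ctrl["RegDst"], rd, rt),
        "alu_operand_b": mux2(ctrl["ALUSrc"], imm32, read_data_2),
        "write_data": mux2(ctrl["MemtoReg"], alu_result, mem_data),
        "next_pc": mux2(ctrl["PCSrc"], branch_target, pc_plus_4),
    }


selected = route(R_FORMAT, rt="rt", rd="rd", read_data_2="Read data 2",
                 imm32="sign-extended imm", alu_result="ALU result",
                 mem_data="memory read data", pc_plus_4="PC + 4",
                 branch_target="branch target")
print(selected)
```

The same `route` function can be used to try out candidate control settings for load, store, and beq by passing a different dictionary.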

The Load Datapath
[Figure: the complete single cycle datapath, with control signals PCSrc, RegWrite, RegDst, ALUSrc, ALUOp, MemWrite, MemRead, and MemtoReg]
What control signals do we need for load??
The Store Datapath
[Figure: the complete single cycle datapath, with control signals PCSrc, RegWrite, RegDst, ALUSrc, ALUOp, MemWrite, MemRead, and MemtoReg]
The beq Datapath
[Figure: the complete single cycle datapath, with control signals PCSrc, RegWrite, RegDst, ALUSrc, ALUOp, MemWrite, MemRead, and MemtoReg]
Key Points
CPU is just a collection of state and combinational logic
We just designed a very rich processor, at least in terms of functionality
Execution time = Insts * CPI * Cycle Time
  where does the single-cycle machine fit in?
Computer of the Day
The IBM 1620 (1959)
A 2nd generation computer: transistors & core storage
(First generation ones used tubes and delay-based memory)
Example of creative architecture
~2000 built. Relatively inexpensive (< $1620/month rental)
A decimal computer
  6 bits per digit or character
    4 bits, flag (for +/- and end-of-word), ECC
  Variable-length data fields terminated by flag
Arithmetic by table lookup!
  Codenamed CADET: Can't Add, Doesn't Even Try