Computer Architecture, The Arithmetic/Logic Unit Slide 1
For example:
27 = (11011)two = (1 × 2^4) + (1 × 2^3) + (0 × 2^2) + (1 × 2^1) + (1 × 2^0)
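[Figure: circular (modulo-16) arrangement of 4-bit unsigned codes; the visible portion shows 0110 = 6, 0111 = 7, 1000 = 8, 1001 = 9, 1010 = 10. Turn y notches clockwise to subtract y.]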
Possible encodings of the digits 0, 1, 2
(a) Binary                  (b) Unary
  Digit  Code                 Digit  Code
  0      00                   0      00
  1      01                   1      01 (First alternate)
  2      10                   1      10 (Second alternate)
  -      11 (Unused)          2      11
Encoding the digit vector 1 0 2 1:
(a) Binary: MSB 0 0 1 0 = 2, LSB 1 0 0 1 = 9
(b) Unary: First bit 0 0 1 1 = 3, Second bit 1 0 1 0 = 10
a. Carry-save addition. b. Adding two carry-save numbers.
[Figure: conversion of x to binary. The least-significant bit is x mod 2; the remaining bits are the binary representation of ⌊x/2⌋.]
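The conversion rule above (LSB is x mod 2, then repeat on ⌊x/2⌋) can be sketched as follows. The function name `to_binary` is illustrative and not part of the slides.

```python
def to_binary(x: int) -> str:
    """Return the binary digits of a nonnegative integer, MSB first."""
    if x < 0:
        raise ValueError("x must be nonnegative")
    bits = []
    while True:
        bits.append(str(x % 2))  # x mod 2 is the next bit (LSB first)
        x //= 2                  # continue with floor(x / 2)
        if x == 0:
            break
    return "".join(reversed(bits))


print(to_binary(27))  # 11011
```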
Signed-magnitude representation
+27 in 8-bit signed-magnitude binary code 0 0011011
–27 in 8-bit signed-magnitude binary code 1 0011011
–27 in 2-digit decimal code with BCD digits 1 0010 0111
Biased representation
Represent the interval of numbers [–N, P] by the unsigned
interval [0, P + N]; i.e., by adding the bias N to every number
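A minimal sketch of biased encoding, assuming the represented range is [–n, p] and the bias is n. The helper names are hypothetical.

```python
def to_biased(value: int, n: int, p: int) -> int:
    """Map a value in [-n, p] to its unsigned code in [0, p + n]."""
    if not -n <= value <= p:
        raise ValueError("value outside [-n, p]")
    return value + n  # add the bias n


def from_biased(code: int, n: int) -> int:
    """Recover the signed value by subtracting the bias n."""
    return code - n


print(to_biased(-8, 8, 7), to_biased(7, 8, 7))  # 0 15
```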
[Figure: circular arrangement of 4-bit 2's-complement codes; the visible portion shows 0100 = +4, 0101 = +5, 0110 = +6, 0111 = +7, 1000 = –8, 1001 = –7, 1010 = –6, 1011 = –5, 1100 = –4. Turn 16 – y notches counterclockwise to add –y (subtract y).]
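[Figure: k-bit adder used as a 2's-complement adder/subtractor. Inputs: x and either y or its complement y' (selected by the AddSub control), each k bits wide, plus c in. Outputs: x ± y (k bits) and c out.]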
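The adder/subtractor idea can be sketched in software. This assumes the AddSub control both complements y and supplies the carry-in of 1, so that x – y is computed as x + y' + 1. The function name and the default width k = 4 are illustrative.

```python
def add_sub(x: int, y: int, sub: bool, k: int = 4) -> tuple[int, int]:
    """Return (k-bit result, carry-out) of x + y or x - y in 2's complement."""
    mask = (1 << k) - 1
    y_in = (y & mask) ^ mask if sub else (y & mask)  # y or its complement
    total = (x & mask) + y_in + int(sub)             # AddSub also acts as c_in
    return total & mask, total >> k


print(add_sub(5, 3, sub=True))   # (2, 1)
print(add_sub(3, 5, sub=True))   # (14, 0), and 1110 is -2 in 4-bit 2's complement
```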
For example:
2.375 = (10.011)two = (1 × 2^1) + (0 × 2^0) + (0 × 2^–1) + (1 × 2^–2) + (1 × 2^–3)
[Figure: codes arranged around a circle: 0.100 = +.5, 0.101 = +.625, 0.110 = +.75, 0.111 = +.875, 1.000 = –1, 1.001 = –.875, 1.010 = –.75, 1.011 = –.625, 1.100 = –.5.]
Figure 9.7 Schematic representation of 4-bit 2’s-complement
encoding for (1 + 3)-bit fixed-point numbers in the range [–1, +7/8].
Inputs        Outputs
x  y  cin     cout  s
0  0  0       0     0
0  0  1       0     1
0  1  0       0     1
0  1  1       1     0
1  0  0       0     1
1  0  1       1     0
1  1  0       1     0
1  1  1       1     1
Digit-set interpretation: {0, 1} + {0, 1} + {0, 1} = {0, 2} + {0, 1}
[Figure: full-adder (FA) block symbol with inputs x, y, cin and outputs cout, s.]
[Figure: three full-adder implementations: (a) FA built of two HAs, (b) CMOS mux-based FA, (c) Two-level AND-OR FA.]
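Option (a), a full adder built of two half adders, can be sketched as below. The OR gate that merges the two half-adder carries is the standard construction; the function names are illustrative.

```python
def half_adder(a: int, b: int) -> tuple[int, int]:
    """Return (sum, carry) for two input bits."""
    return a ^ b, a & b


def full_adder(x: int, y: int, c_in: int) -> tuple[int, int]:
    """Return (s, c_out) using two half adders and an OR gate."""
    s1, c1 = half_adder(x, y)      # first HA adds x and y
    s, c2 = half_adder(s1, c_in)   # second HA adds in the carry
    return s, c1 | c2              # at most one of c1, c2 can be 1


print(full_adder(1, 1, 1))  # (1, 1)
```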
[Figure: chain of 32 full adders. Stage i takes x_i, y_i, and c_i and produces s_i and c_(i+1); c_0 is cin and c_32 is cout. The critical path follows the carry chain.]
Figure 10.4 Ripple-carry binary adder with 32-bit inputs and output.
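A bit-level sketch of the ripple-carry structure, with each loop iteration standing in for one FA cell. The function names are illustrative; a real adder evaluates all cells in hardware, with delay set by the carry chain.

```python
def full_adder(x: int, y: int, c: int) -> tuple[int, int]:
    """One FA cell: return (sum bit, carry-out)."""
    return x ^ y ^ c, (x & y) | (x & c) | (y & c)


def ripple_carry_add(x: int, y: int, c_in: int = 0, k: int = 32) -> tuple[int, int]:
    """Add two k-bit numbers one bit position at a time; return (sum, c_out)."""
    s, c = 0, c_in
    for i in range(k):  # the carry ripples from position 0 to k-1
        si, c = full_adder((x >> i) & 1, (y >> i) & 1, c)
        s |= si << i
    return s, c


print(ripple_carry_add(0xFFFFFFFF, 1))  # (0, 1)
```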
cout 0 1 0 1 1 0 0 1 1 1 0 0 0 0 1 1 cin
\__________/\__________________/ \________/\____/
4 6 3 2
Carry chains and their lengths
g = xy (generate), p = x ⊕ y (propagate)
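[Figure: the carry network takes the g and p signals and c_0 and produces the carries c_1, ..., c_i, c_(i+1), ..., c_(k–2), c_(k–1), c_k; the sum bit s_i is formed from c_i.]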
Figure 10.5 The main part of an adder is the carry network. The rest
is just a set of gates to produce the g and p signals and the sum bits.
Ripple-Carry Adder Revisited
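The carry recurrence: c_(i+1) = g_i ∨ p_i c_i
[Figure: ripple-carry implementation of the carry network of Figure 10.5; the carries c_1, c_2, ..., c_(k–2), c_(k–1), c_k are produced one after another, starting from c_0.]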
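The carry recurrence can be evaluated directly, taking g_i = x_i y_i and p_i = x_i ⊕ y_i and forming each sum bit as p_i ⊕ c_i. This is a sequential sketch with illustrative names, not a model of any particular carry network.

```python
def carries(x: int, y: int, c0: int = 0, k: int = 16) -> list[int]:
    """Return [c_0, c_1, ..., c_k] from c_(i+1) = g_i OR (p_i AND c_i)."""
    c = [c0]
    for i in range(k):
        xi, yi = (x >> i) & 1, (y >> i) & 1
        g = xi & yi  # generate
        p = xi ^ yi  # propagate
        c.append(g | (p & c[i]))
    return c


def add_with_carries(x: int, y: int, c0: int = 0, k: int = 16) -> tuple[int, int]:
    """Form s_i = p_i XOR c_i from the carries; return (sum, c_k)."""
    c = carries(x, y, c0, k)
    s = 0
    for i in range(k):
        s |= ((((x >> i) ^ (y >> i)) & 1) ^ c[i]) << i
    return s, c[k]


print(add_with_carries(1234, 4321))  # (5555, 0)
```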
[Figure: driving analogy contrasting a one-way street with a freeway.]
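[Figure: counter built of a k-bit count register (loaded on Update), a two-input multiplexer controlled by IncrInit that chooses between Data in and the adder output, and a k-bit adder (with c in and c out) that adds the increment amount a to the count.]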
[Figure: carry network with carries c_k, c_(k–1), c_(k–2), ..., c_2, c_1 and sum bits s_(k–1), s_(k–2), ..., s_2, s_1, s_0.]
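[Figure: Brent-Kung carry network built of carry operators (¢). The first level combines adjacent (g, p) pairs into the ranges [0, 1], [2, 3], [4, 5], [6, 7]; the next level forms [0, 3] and [4, 7]; the remaining levels produce all the prefix ranges [0, 0], [0, 1], [0, 2], ..., [0, 7]. A 4-input Brent-Kung carry network appears as a sub-block. Labels such as g[0, 1] and p[0, 1] denote the generate and propagate signals for the indicated range.]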
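The range labels can be made concrete with the carry operator ¢. The sketch below uses the standard definition (g', p') ¢ (g'', p'') = (g' ∨ p' g'', p' p''), where the first operand covers the higher positions. It combines ranges in a balanced tree, which illustrates associativity but is not the Brent-Kung wiring itself. All names are illustrative.

```python
def carry_op(hi: tuple[int, int], lo: tuple[int, int]) -> tuple[int, int]:
    """Combine (g, p) of a higher range with (g, p) of the adjacent lower range."""
    g_hi, p_hi = hi
    g_lo, p_lo = lo
    return g_hi | (p_hi & g_lo), p_hi & p_lo


def gp_signals(x: int, y: int, k: int) -> list[tuple[int, int]]:
    """Per-bit (g_i, p_i) = (x_i AND y_i, x_i XOR y_i)."""
    return [(((x >> i) & (y >> i)) & 1, ((x >> i) ^ (y >> i)) & 1) for i in range(k)]


def block(gp: list[tuple[int, int]], a: int, b: int) -> tuple[int, int]:
    """(g, p) for the range [a, b], formed by splitting the range in half."""
    if a == b:
        return gp[a]
    mid = (a + b) // 2
    return carry_op(block(gp, mid + 1, b), block(gp, a, mid))


gp = gp_signals(0b10110110, 0b01101011, 8)
# With c_0 = 0, the carry c_(i+1) equals g[0, i].
print([block(gp, 0, i)[0] for i in range(8)])
```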
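[Figure: carry-select addition. The intermediate carries c_i choose, through a multiplexer (inputs 0 and 1), between two precomputed results: version 0 of the sum bits (adder with c in = 0) and version 1 of the sum bits (adder with c in = 1). Remaining labels: c, a, s, and the bit range [a, b].]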
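A sketch of the carry-select idea for a k-bit adder split once at bit position `low`. Both versions of the upper sum are formed, and the carry out of the lower part picks one. The split point and function name are assumptions for illustration.

```python
def carry_select_add(x: int, y: int, k: int = 8, low: int = 4) -> tuple[int, int]:
    """Return (k-bit sum, c_out) using one carry-select stage."""
    low_mask = (1 << low) - 1
    high_bits = k - low
    high_mask = (1 << high_bits) - 1

    lo_total = (x & low_mask) + (y & low_mask)
    c_mid = lo_total >> low  # intermediate carry into the upper part

    xh = (x >> low) & high_mask
    yh = (y >> low) & high_mask
    version0 = xh + yh       # upper sum assuming c_in = 0
    version1 = xh + yh + 1   # upper sum assuming c_in = 1

    hi_total = version1 if c_mid else version0  # the multiplexer
    s = ((hi_total & high_mask) << low) | (lo_total & low_mask)
    return s, hi_total >> high_bits


print(carry_select_add(0x0F, 0x01))  # (16, 0)
```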
sra, R format:
  op (bits 31-26) = 000000   ALU instruction
  rs (bits 25-21) = 00000    Unused
  rt (bits 20-16) = 10001    Source register
  rd (bits 15-11) = 01000    Destination register
  sh (bits 10-6)  = 00010    Shift amount
  fn (bits 5-0)   = 000011   sra = 3
srav, R format:
  op (bits 31-26) = 000000   ALU instruction
  rs (bits 25-21) = 10000    Amount register
  rt (bits 20-16) = 10001    Source register
  rd (bits 15-11) = 01000    Destination register
  sh (bits 10-6)  = 00000    Unused
  fn (bits 5-0)   = 000111   srav = 7
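The two bit patterns above can be reproduced by packing the six R-format fields into a 32-bit word. The helper `encode_r` is hypothetical; the field values are read off the patterns shown.

```python
def encode_r(op: int, rs: int, rt: int, rd: int, sh: int, fn: int) -> int:
    """Pack R-format fields (6, 5, 5, 5, 5, 6 bits) into a 32-bit word."""
    return (op << 26) | (rs << 21) | (rt << 16) | (rd << 11) | (sh << 6) | fn


sra = encode_r(op=0, rs=0, rt=17, rd=8, sh=2, fn=3)    # shift amount in sh
srav = encode_r(op=0, rs=16, rt=17, rd=8, sh=0, fn=7)  # shift amount in register rs
print(f"{sra:032b}")
print(f"{srav:032b}")
```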
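[Figure: multistage shifting. Three cascaded 32-bit, 4-input multiplexers, each with a 2-bit control, perform a (0 or 4)-bit shift, a (0 or 2)-bit shift, and a (0 or 1)-bit shift, taking x[31, 0] through y[31, 0] to z[31, 0].]
Control codes:
00 No shift
01 Logical left
10 Logical right
11 Arith right
Multiplexer inputs for a 1-bit shift of x: x[31, 0] (no shift); x[30, 0], 0 (logical left); 0, x[31, 1] (logical right); x[31], x[31, 1] (arithmetic right)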
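A software sketch of multistage shifting with 4-, 2-, and 1-bit stages. It assumes one shift mode (using the control codes above) applies to every stage and that the bits of the shift amount decide which stages are active. The names are illustrative.

```python
MASK = 0xFFFFFFFF  # 32-bit word


def shift_stage(x: int, n: int, mode: int) -> int:
    """Shift a 32-bit word by n bits; mode: 00 none, 01 left, 10 right, 11 arith right."""
    if mode == 0b00:
        return x
    if mode == 0b01:
        return (x << n) & MASK
    if mode == 0b10:
        return x >> n
    fill = (MASK << (32 - n)) & MASK if x >> 31 else 0  # replicate the sign bit
    return (x >> n) | fill


def multistage_shift(x: int, amount: int, mode: int) -> int:
    """Shift by 0..7 bits using the (0 or 4), (0 or 2), (0 or 1) stages in turn."""
    for n in (4, 2, 1):
        if amount & n:
            x = shift_stage(x, n, mode)
    return x


print(hex(multistage_shift(0x80000000, 7, 0b11)))  # 0xff000000
```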
32-pixel (4 × 8) block of
black-and-white image:
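[Figure: ALU composed of an arithmetic unit and a logic unit, both fed by Operand 1 and Operand 2; a multiplexer (inputs 0 and 1) controlled by Select fn type (logic or arith) delivers the Result.]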
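[Figure: multifunction ALU with 32-bit inputs x and y. An adder with AddSub control computes x ± y, with carries c 0, c 31, and c 32 shown; a shifter produces Shifted y, using 5 LSBs as the amount; a logic unit applies the selected logic function; a 32-input NOR of the result s gives Zero, and the MSB and Ovfl signals are also produced. The shorthand symbol for the ALU has inputs x, y, Func, and Control, and outputs s, Zero, and Ovfl.]
Logic function codes: AND 00, OR 01, XOR 10, NOR 11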
Figure 10.19 A multifunction ALU with 8 control signals (2 for
function class, 1 arithmetic, 3 shift, 2 logic) specifying the operation.
11 Multipliers and Dividers
Modern processors perform many multiplications & divisions:
• Encryption, image compression, graphic rendering
• Hardware vs programmed shift-add/sub algorithms
Binary example (steps shown from z(1) onward)
2z(1)        0 1 0 1 0
z(1)           0 1 0 1 0
+y1 x 2^4      1 0 1 0
––––––––––––––––––––––––––––––
2z(2)        0 1 1 1 1 0
z(2)           0 1 1 1 1 0
+y2 x 2^4      0 0 0 0
––––––––––––––––––––––––––––––
2z(3)        0 0 1 1 1 1 0
z(3)           0 0 1 1 1 1 0
+y3 x 2^4      0 0 0 0
––––––––––––––––––––––––––––––
2z(4)        0 0 0 1 1 1 1 0
z(4)           0 0 0 1 1 1 1 0
==============================
Decimal example (steps shown from z(1) onward)
10z(1)       2 4 6 9 6
z(1)         0 2 4 6 9 6
+y1 x 10^4   2 1 1 6 8
––––––––––––––––––––––––––––––
10z(2)       2 3 6 3 7 6
z(2)           2 3 6 3 7 6
+y2 x 10^4   0 0 0 0 0
––––––––––––––––––––––––––––––
10z(3)       0 2 3 6 3 7 6
z(3)           0 2 3 6 3 7 6
+y3 x 10^4   1 4 1 1 2
––––––––––––––––––––––––––––––
10z(4)       1 4 3 4 8 3 7 6
z(4)           1 4 3 4 8 3 7 6
==============================
Figure 11.2 Step-by-step multiplication examples for 4-digit unsigned numbers.
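The binary steps follow the right-shift recurrence z(j+1) = (z(j) + y_j x 2^k) / 2. A minimal sketch for k-bit unsigned operands is given below. The function name is illustrative, and x = 1010, y = 0011 are the operands implied by the partial products shown.

```python
def shift_add_multiply(x: int, y: int, k: int = 4) -> int:
    """Multiply k-bit unsigned x and y with k add-and-shift-right steps."""
    z = 0  # cumulative partial product
    for j in range(k):
        yj = (y >> j) & 1             # next multiplier bit, LSB first
        z = (z + yj * (x << k)) >> 1  # add y_j * x * 2^k, then halve
    return z


print(f"{shift_add_multiply(0b1010, 0b0011):08b}")  # 00011110
```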
Positive y                       Negative y
2z(4)   1 1 1 0 1 1 1 0          2z(4)   0 0 0 1 1 1 1 0
z(4)      1 1 1 0 1 1 1 0        z(4)      0 0 0 1 1 1 1 0
Figure 11.3 Step-by-step multiplication examples for 2’s-complement numbers.
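[Figure: multiplier datapath fragment with a multiplexer (inputs 0 and 1; Enable and Select controls), the multiplier bit yj, and an adder with c in, c out, and an Add'Sub control.]
[Figure: partial product register connections labeled From adder (cout and Sum, k bits), To adder (k–1 bits), and yj.]
Figure 11.5 Shifting incorporated in the connections to
the partial product register rather than as a separate phase.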
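[Figure: selection of the multiple 0, x, 2x, or 3x to be added in forming the product z.]
[Figure: (a) Full-tree multiplier: large tree of carry-save adders (log depth) followed by an adder that yields the product. (b) Partial-tree multiplier: small tree of carry-save adders (log depth) followed by an adder that yields the product.]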
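A sketch of the carry-save idea behind tree multipliers. Each carry-save adder reduces three operands to two without propagating carries, and one final carry-propagate addition yields the product. The grouping used here is a simple level-by-level reduction, not a specific tree design, and the names are illustrative.

```python
def carry_save_add(a: int, b: int, c: int) -> tuple[int, int]:
    """Reduce three operands to a (sum, carry) pair with the same total."""
    partial_sum = a ^ b ^ c                     # bitwise sum, no carry propagation
    carry = ((a & b) | (a & c) | (b & c)) << 1  # saved carries, shifted left
    return partial_sum, carry


def multiply_with_csa(x: int, y: int, k: int = 8) -> int:
    """Form the k partial products, reduce them 3-to-2 per level, then add."""
    operands = [(x << j) if (y >> j) & 1 else 0 for j in range(k)]
    while len(operands) > 2:
        groups = len(operands) // 3
        reduced = []
        for g in range(groups):
            reduced.extend(carry_save_add(*operands[3 * g:3 * g + 3]))
        reduced.extend(operands[3 * groups:])  # leftovers pass to the next level
        operands = reduced
    return sum(operands)  # final carry-propagate addition


print(multiply_with_csa(13, 11))  # 143
```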
Example 11.3
Finding the 32-bit product of 32-bit integers in MiniMIPS
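[Figure: the hardware multiplier datapath (multiplexer with Enable and Select controls, multiplier bit yj, adder with c in, c out, and Add'Sub control), annotated to show that the counter kept in $t2 is part of the control in hardware.]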
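[Figure: hardware divider fragment. A multiplexer (inputs 0 and 1; Enable and Select controls) and an adder that always subtracts (c in = 1) form the trial difference.]
Division notation: z Dividend, x Divisor, y Quotient, s Remainder
[Figure: array divider. Our original dot-notation for division is redrawn with straightened dots to depict an array divider: rows of MS cells take the dividend bits z2, z1, z0, produce the quotient bits y2, y1, y0, and leave the remainder bits s3 s2 s1 s0.]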
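A sketch of shift-subtract (restoring) division in the notation above: z dividend (up to 2k bits), x divisor (k bits), y quotient, s remainder, with z = x y + s. Each step forms a trial difference and keeps it only if it is nonnegative. The function name and the overflow check are illustrative.

```python
def restoring_divide(z: int, x: int, k: int = 4) -> tuple[int, int]:
    """Return (quotient y, remainder s) for unsigned z / x, quotient limited to k bits."""
    if x == 0 or (z >> k) >= x:
        raise OverflowError("quotient does not fit in k bits")
    s, y = z, 0
    for j in range(k - 1, -1, -1):
        trial = s - (x << j)  # trial difference with the divisor aligned at 2^j
        if trial >= 0:        # keep the difference; quotient bit is 1
            s = trial
            y |= 1 << j
        # otherwise "restore": leave s unchanged; quotient bit is 0
    return y, s


print(restoring_divide(117, 10))  # (11, 7)
```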
Example 11.7
Compute z mod x, where z (signed) and x > 0 are integers
if remainder is negative,
then add |x| to (Hi) to obtain z mod x
else Hi holds z mod x
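The correction step can be sketched as follows. This assumes the divide instruction truncates toward zero, so the remainder left in Hi takes the sign of z (as in MIPS `div`). Python's own `%` already returns a nonnegative result for x > 0, so the truncating division is emulated explicitly. The helper names are hypothetical.

```python
def truncating_divide(z: int, x: int) -> tuple[int, int]:
    """Return (quotient, remainder), with the remainder taking the sign of z."""
    q = abs(z) // abs(x)
    if (z < 0) != (x < 0):
        q = -q
    return q, z - q * x


def mod(z: int, x: int) -> int:
    """z mod x for x > 0, always in [0, x)."""
    _, hi = truncating_divide(z, x)  # hi plays the role of register Hi
    if hi < 0:                       # negative remainder: add |x|
        hi += abs(x)
    return hi


print(mod(-7, 3), mod(7, 3))  # 2 1
```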
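[Figure: the hardware divider datapath (multiplexers with Enable and Select controls, quotient bit yj, adder that always subtracts with c in = 1 to form the trial difference), annotated to show that the counter kept in $t2 is part of the control in hardware.]
[Figure residue: labels z7 z6 z5 z4, Figure 11.14, array multiplier, to the left, Turn upside-down, Figure 11.7.]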
IEEE 754 floating-point formats (fields: Sign, Exponent, Significand)
Short (32-bit) format: exponent of 8 bits, bias = 127, –126 to 127; 23 bits for fractional part (plus hidden 1 in integer part)
Long (64-bit) format: exponent of 11 bits, bias = 1023, –1022 to 1023; 52 bits for fractional part (plus hidden 1 in integer part)
Special codes: ±0, ±∞, NaN
Normalized numbers: ±1.f × 2^e
Denormals: ±0.f × 2^emin
Denormals allow graceful underflow
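The short-format fields can be inspected with Python's standard `struct` module. The sketch below unpacks sign, biased exponent, and fraction, then rebuilds the value using the hidden 1 for normalized numbers and 0.f × 2^–126 for denormals. The function names are illustrative.

```python
import struct


def decode_single(value: float) -> tuple[int, int, int, str]:
    """Return (sign, exponent field, fraction field, kind) of a 32-bit float."""
    (bits,) = struct.unpack(">I", struct.pack(">f", value))
    sign = bits >> 31
    exp_field = (bits >> 23) & 0xFF  # 8-bit biased exponent
    frac = bits & 0x7FFFFF           # 23-bit fractional part
    if exp_field == 0xFF:
        kind = "inf" if frac == 0 else "nan"
    elif exp_field == 0:
        kind = "zero" if frac == 0 else "denormal"
    else:
        kind = "normal"
    return sign, exp_field, frac, kind


def value_of(sign: int, exp_field: int, frac: int) -> float:
    """Rebuild a finite value from its fields (bias = 127)."""
    if exp_field == 0:
        magnitude = (frac / 2**23) * 2.0**-126  # denormal: 0.f x 2^emin
    else:
        magnitude = (1 + frac / 2**23) * 2.0**(exp_field - 127)  # hidden 1
    return -magnitude if sign else magnitude


print(decode_single(-0.375))  # (1, 125, 4194304, 'normal')
```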
Negative numbers                          Positive numbers
–∞   max–   FLP–   min–   0   min+   FLP+   max+   +∞
[Figure: rounding functions plotted for x from –4 to 4. (a) Round to nearest even integer (b) Round to nearest integer]
[Figure: rounding functions plotted for x from –4 to 4. (a) Round inward to nearest integer (b) Round upward to nearest integer]
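The four rounding schemes can be compared on a few values. Python's built-in `round` resolves ties to the even integer. For plain round-to-nearest, the sketch assumes ties are resolved upward, which is one common reading and is not stated in the figure residue.

```python
import math


def round_nearest_even(x: float) -> int:
    return round(x)             # ties go to the even integer


def round_nearest(x: float) -> int:
    return math.floor(x + 0.5)  # assumption: ties are rounded upward


def round_inward(x: float) -> int:
    return math.trunc(x)        # toward zero


def round_upward(x: float) -> int:
    return math.ceil(x)         # toward +infinity


for v in (-2.5, -1.5, 1.5, 2.5):
    print(v, round_nearest_even(v), round_nearest(v), round_inward(v), round_upward(v))
```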
[Figure: floating-point adder datapath under control & sign logic: possible swap & complement of the operands (Mux, Sub), align significands, add, normalize & round, then pack the Sign, Exponent, and Significand into the output.]
Figure 12.5 Simplified schematic of a floating-point adder.
12.4 Other Floating-Point Operations
Floating-point multiplication
(±2^e1 s1) × (±2^e2 s2) = ±2^(e1+e2) (s1 × s2)
Overflow (underflow) possible
Product of significands in [1, 4)
If product is in [2, 4), halve to normalize (increment exponent)
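A sketch of the multiplication steps on (sign, exponent, significand) triples with the significand in [1, 2). It uses `math.frexp` and `math.ldexp` to unpack and pack nonzero finite values. Rounding and overflow/underflow handling are omitted, and the helper names are illustrative.

```python
import math


def unpack(v: float) -> tuple[bool, int, float]:
    """Split a nonzero finite float into (negative?, e, s) with s in [1, 2)."""
    m, ex = math.frexp(abs(v))  # abs(v) = m * 2^ex, with m in [0.5, 1)
    return v < 0, ex - 1, 2 * m


def fp_multiply(a: float, b: float) -> tuple[bool, int, float]:
    sign_a, e1, s1 = unpack(a)
    sign_b, e2, s2 = unpack(b)
    e = e1 + e2        # add exponents
    s = s1 * s2        # product of significands, in [1, 4)
    if s >= 2:         # in [2, 4): halve and increment the exponent
        s /= 2
        e += 1
    return sign_a != sign_b, e, s


def pack(negative: bool, e: int, s: float) -> float:
    return math.ldexp(-s if negative else s, e)


print(fp_multiply(1.5, 1.5))           # (False, 1, 1.125)
print(pack(*fp_multiply(1.5, -2.5)))   # -3.75
```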
Floating-point square-rooting
(2^e s)^(1/2) = 2^(e/2) (s)^(1/2) when e is even
              = 2^((e–1)/2) (2s)^(1/2) when e is odd
Normalization not needed
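The two cases can be sketched on an (exponent, significand) pair with s in [1, 2). Since √s lies in [1, √2) and √(2s) in [√2, 2), the result is already normalized. The function name is illustrative and `math.sqrt` stands in for the significand square-rooter.

```python
import math


def fp_sqrt(e: int, s: float) -> tuple[int, float]:
    """Square root of 2^e * s, returned as (exponent, significand)."""
    if e % 2 == 0:
        return e // 2, math.sqrt(s)        # even exponent: halve it
    return (e - 1) // 2, math.sqrt(2 * s)  # odd exponent: fold a factor of 2 into s


print(fp_sqrt(3, 1.125))  # (1, 1.5): sqrt(9) = 2^1 * 1.5
```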
Hardware for Floating-Point Multiplication and Division
[Figure: Input 1 and Input 2 are unpacked into signs, exponents, and significands; under control & sign logic and the MulDiv signal, the unit multiplies or divides, normalizes & rounds, and packs the Sign, Exponent, and Significand into the output.]
Figure 12.6 Simplified schematic of a floating-point multiply/divide unit.
12.5 Floating-Point Instructions
Floating-point arithmetic instructions for MiniMIPS:
add.s $f0,$f8,$f10 # set $f0 to ($f8) +fp ($f10)
sub.d $f0,$f8,$f10 # set $f0 to ($f8) –fp ($f10)
mul.d $f0,$f8,$f10 # set $f0 to ($f8) ×fp ($f10)
div.s $f0,$f8,$f10 # set $f0 to ($f8) /fp ($f10)
neg.s $f0,$f8 # set $f0 to –($f8)
F format (floating-point arithmetic):
  op (bits 31-26) = 010001   Floating-point instruction
  ex (bits 25-21) = 0000x    s = 0, d = 1
  ft (bits 20-16) = 01010    Source register 2
  fs (bits 15-11) = 01000    Source register 1
  fd (bits 10-6)  = 00000    Destination register
  fn (bits 5-0)   = 000xxx   add.* = 0, sub.* = 1, mul.* = 2, div.* = 3, neg.* = 7
Figure 12.7 The common floating-point instruction format for
MiniMIPS and components for arithmetic instructions. The extension
(ex) field distinguishes single (* = s) from double (* = d) operands.
The Floating-Point Unit in MiniMIPS
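[Figure: MiniMIPS memory and processing subsystems. Memory: up to 2^30 words, 4 B / location (Loc 0, Loc 4, Loc 8, ..., Loc m – 8, Loc m – 4; m ≤ 2^32). Processing units shown include the Hi and Lo registers, Coprocessor 1, and the trap & memory unit, TMU (Coproc. 0), with the BadVaddr, Status, Cause, and EPC registers. Parts of the figure are labeled Chapter 10, Chapter 11, and Chapter 12.]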
F format (format conversion):
  op (bits 31-26) = 010001   Floating-point instruction
  ex (bits 25-21) = 0000x    *.w = 0, w.s = 0, w.d = 1, *.* = 1
  ft (bits 20-16) = 00000    Unused
  fs (bits 15-11) = 01000    Source register
  fd (bits 10-6)  = 00000    Destination register
  fn (bits 5-0)   = 100xxx   To format: s = 32, d = 33, w = 36
R format (data movement between register files):
  op (bits 31-26) = 010001   Floating-point instruction
  rs (bits 25-21) = 00x00    mfc1 = 0, mtc1 = 4
  rt (bits 20-16) = 01100    Source register
  rd (bits 15-11) = 01000    Destination register
  sh (bits 10-6)  = 00000    Unused
  fn (bits 5-0)   = 000000   Unused
Figure 12.9 Instructions for floating-point data movement in MiniMIPS.
Compute (a + b) + c: intermediate result 2^–2 × 0.00011011; Sum = 2^–6 × 1.10110000
Compute a + (b + c): intermediate result 2^5 × 0.00000000; Sum = 0 (Normalize to special code for 0)
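The same loss of associativity shows up with ordinary double-precision values. The operands below are chosen for illustration and are not the ones on the slide.

```python
a, b, c = -1e20, 1e20, 1.0

left = (a + b) + c   # a and b cancel exactly, so c survives
right = a + (b + c)  # c is absorbed when added to b, so the sum is 0

print(left, right)  # 1.0 0.0
```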
Example of arithmetic: [xl, xu] +interval [yl, yu] = [xl +fp yl, xu +fp yu]
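[Figure: interpolating table lookup. The part xH of the input x addresses a Table for a and a Table for b; together with a Multiply unit, they yield the best linear approximation in the subinterval.]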