Fixed-point and floating-point numbers
CS370 Fall 2003
Representations of numbers
- Unsigned integers
- Signed integers
- 1s and 2s complement representation
- To represent very large and very small numbers, and real numbers in general:
  - Fixed-point numbers
  - Floating-point numbers
Base-10 (decimal) arithmetic
- Uses the ten digits from 0 to 9
- Each column represents a power of 10: thousands (10^3), hundreds (10^2), tens (10^1), ones (10^0)

1999_10 = 1 x 10^3 + 9 x 10^2 + 9 x 10^1 + 9 x 10^0
Base-10 (decimal) arithmetic
- Uses the ten digits from 0 to 9
- Each column represents a power of 10: tens (10^1), ones (10^0), tenths (10^-1), hundredths (10^-2)

19.99_10 = 1 x 10^1 + 9 x 10^0 + 9 x 10^-1 + 9 x 10^-2
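The column-by-column expansion can be checked mechanically. The following is a minimal sketch, not part of the original slides: the function name `positional_value` is an assumption, and `Fraction` is used so that tenths and hundredths stay exact.

```python
from fractions import Fraction

def positional_value(digits: str, base: int) -> Fraction:
    """Sum digit * base**position for a digit string such as '19.99'."""
    whole, _, frac = digits.partition(".")
    total = Fraction(0)
    # Columns left of the point carry exponents len(whole)-1 down to 0.
    for i, ch in enumerate(whole):
        total += int(ch, base) * Fraction(base) ** (len(whole) - 1 - i)
    # Columns right of the point carry exponents -1, -2, ...
    for i, ch in enumerate(frac, start=1):
        total += int(ch, base) * Fraction(base) ** (-i)
    return total

print(positional_value("1999", 10))   # 1999
print(positional_value("19.99", 10))  # 1999/100
```

The same function accepts other bases, so it also applies to binary digit strings.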
Standard binary representation
- Uses the two digits 0 and 1
- Every column represents a power of 2: eights (2^3), fours (2^2), twos (2^1), ones (2^0)

1001_2 = 1 x 2^3 + 0 x 2^2 + 0 x 2^1 + 1 x 2^0
Fixed-point representation
- Uses the two digits 0 and 1
- Every column represents a power of 2: twos (2^1), ones (2^0), halves (2^-1), fourths (2^-2)

10.01_2 = 1 x 2^1 + 0 x 2^0 + 0 x 2^-1 + 1 x 2^-2
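One common way to store such a number is as a plain integer scaled by a power of two. The sketch below assumes that approach; the helper names `to_fixed` and `fixed_to_str` are illustrative and handle non-negative values only.

```python
def to_fixed(value: float, frac_bits: int) -> int:
    """Scale by 2**frac_bits and round to the nearest integer."""
    return round(value * (1 << frac_bits))

def fixed_to_str(raw: int, frac_bits: int) -> str:
    """Render a non-negative raw fixed-point integer as binary with a point."""
    if frac_bits == 0:
        return format(raw, "b")
    # Pad so there is at least one digit left of the binary point.
    bits = format(raw, "b").zfill(frac_bits + 1)
    return bits[:-frac_bits] + "." + bits[-frac_bits:]

raw = to_fixed(2.25, 2)            # 2.25 * 4 = 9
print(raw, fixed_to_str(raw, 2))   # 9 10.01
```

The position of the binary point is not stored anywhere; it is a convention shared by whoever writes and reads the integer.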
Addition
Base-10         Base-2

   1.25            1.01
 + 1.50          + 1.10
 ------          ------
   2.75           10.11
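Because the binary point sits in the same column for both operands, fixed-point addition is ordinary integer addition. A small sketch of the base-2 sum above, assuming two fractional bits; `parse_fixed` and `format_fixed` are hypothetical helpers.

```python
FRAC_BITS = 2  # assumed number of bits right of the binary point

def parse_fixed(text: str) -> int:
    """Turn a string like '1.01' into an integer scaled by 2**FRAC_BITS."""
    whole, _, frac = text.partition(".")
    frac = frac.ljust(FRAC_BITS, "0")[:FRAC_BITS]
    return int(whole + frac, 2)

def format_fixed(raw: int) -> str:
    """Insert the binary point back into a non-negative scaled integer."""
    bits = format(raw, "b").zfill(FRAC_BITS + 1)
    return bits[:-FRAC_BITS] + "." + bits[-FRAC_BITS:]

a = parse_fixed("1.01")   # 5, i.e. 1.25 * 4
b = parse_fixed("1.10")   # 6, i.e. 1.50 * 4
total = a + b             # plain integer addition
print(format_fixed(total), total / 2**FRAC_BITS)  # 10.11 2.75
```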
Range of values in a byte
Lowest exponent   Min   Step     Max       Value of 00110001
 0                0     1        255       49
-1                0     0.5      127.5     24.5
-2                0     0.25     63.75     12.25
-4                0     0.0625   15.9375   3.0625
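The rows of this table follow from one rule: an unsigned byte holds an integer from 0 to 255, and the lowest exponent fixes the weight of its last bit. A sketch under that assumption, with illustrative function names:

```python
def byte_range(lowest_exponent: int):
    """Return (min, step, max) for an unsigned byte whose last bit weighs 2**lowest_exponent."""
    step = 2.0 ** lowest_exponent
    return 0.0, step, 255 * step

def byte_value(raw: int, lowest_exponent: int) -> float:
    """Interpret the raw byte as a fixed-point number."""
    return raw * 2.0 ** lowest_exponent

for e in (0, -1, -2, -4):
    low, step, high = byte_range(e)
    print(e, low, step, high, byte_value(0b00110001, e))
```

Each extra fractional bit halves both the step and the maximum: precision is bought with range.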
Scientific notation (1)
One billion = 1,000,000,000 = 1 x 10^9

- significand or mantissa: 1
- base or radix: 10
- exponent: 9
Scientific notation (2)
1999 = 1.999 x 10^3
     = 19.99 x 10^2
     = 199.9 x 10^1

For the normalized form 1.999 x 10^3:
- significand or mantissa: 1.999
- base or radix: 10
- exponent: 3
Practice (base 10)
258 = 2.58 x 10^2
- Mantissa = 2.58
- Radix = 10
- Exponent = 2

24.25 = 2.425 x 10^1
- Mantissa = 2.425
- Radix = 10
- Exponent = 1
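These practice answers can be reproduced with Python's `decimal` module. This is a minimal sketch; the function name `to_scientific` is an assumption, and the significand is normalized to a single digit before the decimal point.

```python
from decimal import Decimal

def to_scientific(text: str):
    """Split a decimal string into (significand, exponent) with 1 <= significand < 10."""
    d = Decimal(text)
    exponent = d.adjusted()                        # position of the leading digit
    significand = d.scaleb(-exponent).normalize()  # shift the point, drop trailing zeros
    return significand, exponent

for text in ("1000000000", "1999", "258", "24.25"):
    significand, exponent = to_scientific(text)
    print(f"{text} = {significand} x 10^{exponent}")
```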
Base-2 scientific notation
2.25_ten = 10.01_two
         = 10.01_two x 2^0
         = 1.001_two x 2^1 (normalized)

Numbers are usually normalized, which means that the leading bit is always a 1.
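Normalization in base 2 can be sketched with `math.frexp` from the standard library, which returns a fraction in [0.5, 1); doubling it gives the form with a leading 1. The wrapper name `normalize_base2` is an assumption.

```python
import math

def normalize_base2(x: float):
    """Return (significand, exponent) with 1 <= |significand| < 2 and x == significand * 2**exponent."""
    if x == 0:
        raise ValueError("zero has no normalized form")
    m, e = math.frexp(abs(x))             # abs(x) == m * 2**e with 0.5 <= m < 1
    return math.copysign(m * 2, x), e - 1

print(normalize_base2(2.25))  # (1.125, 1), and 1.125 is 1.001 in binary
```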
8-bit floating-point format (1)
sign      exponent   significand   number         number
(1 bit)   (3 bits)   (4 bits)      (base 2)       (base 10)
0         001        1001          1.001 x 2^1    2.25
0         011        1100          1.1 x 2^3      12.0
0         111        1110          1.11 x 2^7     224.0
1         001        1110          1.11 x 2^-1    0.875
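A decoder for this layout might look like the sketch below. It makes two assumptions not spelled out on the slide: the four significand bits are read as b.bbb (leading one stored explicitly), and the sign bit is treated as the sign of the whole value. The usage lines only exercise rows whose sign bit is 0.

```python
def decode_format1(byte: int) -> float:
    """Decode s eee mmmm with an unbiased exponent and an explicit leading bit."""
    sign = (byte >> 7) & 1
    exponent = (byte >> 4) & 0b111      # used as-is, no bias
    significand = (byte & 0b1111) / 8   # mmmm read as m.mmm
    value = significand * 2.0 ** exponent
    return -value if sign else value    # assumption: the sign applies to the value

print(decode_format1(0b0_001_1001))  # 2.25
print(decode_format1(0b0_011_1100))  # 12.0
print(decode_format1(0b0_111_1110))  # 224.0
```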
Improvements
Bias the exponent
- Always subtract a fixed amount, e.g., 3
- Allows representation of negative exponents
Implicit one
- The leading one in a phone number such as 1-619-556-0231 is redundant
- Why use a bit for the leading one?
8-bit floating-point format (2)
- Exponent (3 bits) is biased by 3
- The leading one of the significand is implicit
- Zero is represented by all zeros

sign      exponent   significand   number         number
(1 bit)   (3 bits)   (4 bits)      (base 2)       (base 10)
0         100        0010          1.001 x 2^1    2.25
0         011        1000          1.1 x 2^3      12.0
0         111        1100          1.11 x 2^7     224.0
1         001        1100          1.11 x 2^-1    0.875
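The three rules can be turned into a small encoder and decoder. This is a sketch under stated assumptions: stored exponent = true exponent + 3, four fraction bits after an implicit leading one, the sign bit applies to the whole value, and the all-zeros byte is reserved for zero. The names `encode` and `decode` are illustrative, and the usage lines check only the 2.25 example.

```python
import math

BIAS = 3  # assumption: stored exponent = true exponent + BIAS

def encode(x: float) -> int:
    """Pack x into s eee ffff with a biased exponent and an implicit leading one."""
    if x == 0:
        return 0                          # zero is represented by all zeros
    sign = 1 if x < 0 else 0
    m, e = math.frexp(abs(x))             # abs(x) == m * 2**e with 0.5 <= m < 1
    frac = round((m * 2 - 1) * 16)        # four bits after the implicit 1
    stored = (e - 1) + BIAS
    if frac == 16:                        # rounding carried into the next power of two
        frac, stored = 0, stored + 1
    byte = (sign << 7) | (stored << 4) | frac
    if not 0 <= stored <= 7 or byte == 0:
        raise ValueError(f"{x} is not representable in this format")
    return byte

def decode(byte: int) -> float:
    """Inverse of encode."""
    if byte == 0:
        return 0.0
    sign = (byte >> 7) & 1
    stored = (byte >> 4) & 0b111
    frac = byte & 0b1111
    value = (1 + frac / 16) * 2.0 ** (stored - BIAS)
    return -value if sign else value

print(format(encode(2.25), "08b"))  # 01000010
print(decode(0b0_100_0010))         # 2.25
```

Dropping the leading one frees a bit, so the fraction field carries four bits of precision after the point instead of three.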
IEEE standard floating-point
Single precision
- 32 bits
- sign: 1 bit, exponent: 8 bits, significand: 23 bits
- Bias: 127
Double precision
- 64 bits
- sign: 1 bit, exponent: 11 bits, significand: 52 bits
- Bias: 1023
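The single-precision fields can be inspected from Python with the standard `struct` module. A minimal sketch; the helper name `float32_fields` is an assumption.

```python
import struct

def float32_fields(x: float):
    """Return (sign, stored exponent, fraction) of x as an IEEE single."""
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    sign = bits >> 31
    exponent = (bits >> 23) & 0xFF   # 8 bits, biased by 127
    fraction = bits & 0x7FFFFF       # 23 bits after the implicit leading one
    return sign, exponent, fraction

sign, exponent, fraction = float32_fields(2.25)
# 2.25 = 1.001 x 2^1, so the stored exponent is 1 + 127 = 128
print(sign, exponent - 127, format(fraction, "023b"))
```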
Practice (base 10 and base 2)
13 = 1.3 x 10^1
   = 1.101_two x 2^3
1.25 = 1.25 x 10^0
     = 1.010_two x 2^0
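The base-2 answers can be generated by halving or doubling until the value lies in [1, 2) and then reading off fraction bits by repeated doubling. A sketch with a hypothetical function `binary_scientific`; it handles positive inputs only and truncates (rather than rounds) after `frac_digits` bits.

```python
from fractions import Fraction

def binary_scientific(text: str, frac_digits: int = 3) -> str:
    """Format a positive decimal string as 1.bbb x 2^e."""
    x = Fraction(text)
    if x <= 0:
        raise ValueError("positive values only in this sketch")
    exponent = 0
    while x >= 2:          # shift the binary point left
        x /= 2
        exponent += 1
    while x < 1:           # shift the binary point right
        x *= 2
        exponent -= 1
    frac = x - 1           # part after the leading one
    bits = ""
    for _ in range(frac_digits):
        frac *= 2
        bit = int(frac)    # 1 if the doubled fraction reached 1, else 0
        bits += str(bit)
        frac -= bit
    return f"1.{bits} x 2^{exponent}"

print(binary_scientific("13"))    # 1.101 x 2^3
print(binary_scientific("1.25"))  # 1.010 x 2^0
```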
[Figure: bit layout of a 32-bit word, bits numbered 31 down to 0, with the exponent and mantissa fields labeled]