4. Computer Arithmetic and Memory
Algorithm
When the signs of A and B are the same, add the two magnitudes and attach the sign
of A to the result. When the signs of A and B differ, compare the magnitudes and
subtract the smaller number from the larger. Choose the sign of the result to be the
same as A if A > B, or the complement of the sign of A if A < B. If the two
magnitudes are equal, subtract B from A and make the sign of the result positive.
Hardware implementation
To implement the two arithmetic operations with hardware, it is first
necessary that the two numbers be stored in registers. Let A and B be
two registers that hold the magnitudes of the numbers, and As and Bs
be two flip-flops that hold the corresponding signs. The result of the
operation may be transferred into A and As. Thus A and As together
form an accumulator register.
Consider now the hardware implementation of the algorithm above.
First, we need a parallel-adder to perform the microoperation A + B.
Second, a comparator circuit is needed to establish whether A > B, A =
B, or A < B. Third, we need two parallel-subtractor circuits to perform
the microoperations A – B and B – A. The sign relationship can be
obtained from an exclusive OR gate with As and Bs as inputs.
Hence we require a magnitude comparator, an adder, and two
subtractors. But there is a different procedure that requires less
equipment. First, we know that subtraction can be accomplished by
means of complement and add. Second, the result of a comparison can
be determined from the end carry after subtraction. Careful
investigation of the alternatives suggests that the use of 2’s complement
for subtraction and comparison is an efficient procedure and we require
only an adder and a complementer.
The figure below shows a block diagram of the hardware for implementing
the addition and subtraction operations. It consists of registers A and B
and the sign flip-flops As and Bs. Subtraction is performed by adding A
to the 2's complement of B. The output carry is loaded into flip-flop E,
where it can be checked to determine the relative magnitudes of the two
numbers. The add-overflow flip-flop AVF holds the overflow bit for the
addition of A and B. The A register provides other microoperations that
may be needed when we specify the sequence of steps in the algorithm.
The operation A + B is done through the parallel adder. The S (sum)
output of the adder goes to the input of the A register. The
complementer provides an output of B or the complement of B depending
on the state of the mode-control input M. The complementer consists of
exclusive-OR gates and the parallel adder consists of full-adder
circuits. The M signal is also applied to the input carry of the adder.
When M = 0, the output of B is transferred to the adder, the input carry
is 0, and the output of the adder is equal to the sum A + B. When M = 1,
the 1's complement of B is applied to the adder, the input carry is 1,
and the output is S = A + B' + 1. This is equal to A plus the 2's
complement of B, which is equivalent to the subtraction A − B.
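This data path can be modeled with a short sketch (the register width and function name are our assumptions, not from the text):

```python
def add_sub(a, b, m, n=8):
    """Model the parallel adder with an XOR complementer.

    m is the mode control: 0 -> A + B, 1 -> A + B' + 1 (i.e. A - B).
    Returns (carry_out, n-bit sum), mirroring flip-flop E and register A."""
    mask = (1 << n) - 1
    # Each XOR gate outputs B when m = 0 and B' (1's complement) when m = 1.
    b_in = (b ^ (mask if m else 0)) & mask
    # M also feeds the adder's input carry.
    total = a + b_in + m
    return (total >> n) & 1, total & mask

# A - B with A = 9, B = 5: the end carry of 1 indicates A >= B, sum is 4.
e, s = add_sub(9, 5, 1)
```

Note how the same adder serves both operations; only M changes.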
Hardware algorithm
We compare the signs As and Bs with an exclusive-OR gate. If its output
is 0, the signs are identical; if it is 1, the signs are different. For
an add operation, identical signs dictate that the magnitudes be added.
For a subtract operation, different signs dictate that the magnitudes be
added. The magnitudes are added with the microoperation EA ← A + B,
where EA is a register composed of E and A. A carry of 1 in E after the
addition constitutes an overflow. The value of E is transferred into the
add-overflow flip-flop AVF.
We subtract the two magnitudes if the signs are different for an add
operation or identical for a subtract operation. The magnitudes are
subtracted by adding A to the 2's complement of B. No overflow can
occur when the magnitudes are subtracted, so AVF is cleared to 0. A 1
in E indicates that A ≥ B, and the number in A is the correct result.
If this result is zero, the sign As must be made positive to avoid a
negative zero.
A 0 in E indicates that A < B. In this case it is necessary to take the
2's complement of the value in A. This operation could be done with the
single microoperation A ← A' + 1. However, we assume that the A register
has circuits for the complement and increment microoperations, so the
2's complement is obtained from these two microoperations. In the other
paths of the flowchart, the sign of the result is the same as the sign
of A, so no change in As is required. When A < B, however, the sign of
the result is the complement of the original sign of A, so As must be
complemented to obtain the correct sign. The final result is found in
register A and its sign in As. The value in AVF provides an overflow
indication. The final value of E is immaterial.
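The complete hardware algorithm can be sketched as follows (an illustrative model; the 8-bit magnitude width and function name are our assumptions):

```python
def signmag_addsub(As, A, Bs, B, subtract, n=8):
    """Signed-magnitude add/subtract using only an adder and a complementer.

    As, Bs are sign bits (0 = plus, 1 = minus); A, B are n-bit magnitudes.
    Returns (result sign, result magnitude, AVF overflow flag)."""
    mask = (1 << n) - 1
    if subtract:
        Bs ^= 1                          # subtract = add with B's sign flipped
    if (As ^ Bs) == 0:                   # identical signs: add magnitudes
        total = A + B
        return As, total & mask, (total >> n) & 1  # end carry E becomes AVF
    # Different signs: EA <- A + 2's complement of B
    total = A + ((~B) & mask) + 1
    E, R = (total >> n) & 1, total & mask
    if E == 1:                           # A >= B: A holds the correct result
        if R == 0:
            As = 0                       # avoid negative zero
        return As, R, 0
    # A < B: take the 2's complement of R and complement the sign
    return As ^ 1, (((~R) & mask) + 1) & mask, 0
```

For example, (+7) + (−5) yields (+, 2) and (+3) − (+8) yields (−, 5).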
Addition and Subtraction with signed-2’s complement
The signed-2's-complement representation of numbers, together with the
arithmetic algorithms for addition and subtraction, is summarized here.
The leftmost bit of a binary number represents the sign: 0 denotes
positive and 1 denotes negative. If the sign bit is 1, the number is
represented in 2's-complement form. Thus +33 is represented as 00100001
and −33 as 11011111. Note that 11011111 is the 2's complement of
00100001, and vice versa.
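This representation can be checked with a small helper (the function name and default width are ours):

```python
def to_twos(x, n=8):
    """Return the n-bit signed-2's-complement bit pattern of integer x.

    Masking with 2**n - 1 maps a negative x to its 2's-complement pattern."""
    return format(x & ((1 << n) - 1), f'0{n}b')
```

For instance, `to_twos(33)` gives `'00100001'` and `to_twos(-33)` gives `'11011111'`.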
Two numbers in signed-2's-complement form are added by adding them with
the sign bits treated the same as the other bits of the number; the
carry out of the sign-bit position is discarded. Subtraction consists of
first taking the 2's complement of the subtrahend and then adding it to
the minuend.
When two n-digit numbers are added and the sum occupies n + 1 digits, an
overflow has occurred. The effect of an overflow on the sum of two
signed-2's-complement numbers has been discussed already. We can detect
an overflow by inspecting the last two carries of the addition: when
these two carries are applied to an exclusive-OR gate, an overflow is
detected when the output of the gate is equal to 1.
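This detection rule can be sketched as follows (the function name and 8-bit width are assumptions):

```python
def add_with_overflow(a, b, n=8):
    """Add two n-bit signed-2's-complement patterns.

    Overflow is the XOR of the carry into and the carry out of the
    sign-bit position. Returns (n-bit sum, overflow flag V)."""
    mask = (1 << n) - 1
    a, b = a & mask, b & mask
    # Carry into the sign bit: add everything except the sign bits.
    c_in = ((a & (mask >> 1)) + (b & (mask >> 1))) >> (n - 1)
    total = a + b
    c_out = (total >> n) & 1           # carry out of the sign bit (discarded)
    return total & mask, c_in ^ c_out
```

Adding +100 and +50 in 8 bits overflows (the true sum 150 does not fit in the range −128..+127), and the function flags it.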
The register configuration for the hardware implementation is given in
the figure below. It is the same configuration as before, except that
the sign bits are not separated from the rest of the registers. We call
the A register AC (accumulator) and the B register BR. The two sign
bits are added or subtracted together with the other bits in the
complementer and parallel adder. The overflow flip-flop V is set to 1 if
there is an overflow. The output carry in this case is discarded.
The algorithm for adding and subtracting two binary numbers in signed
2’s complement representation is shown in the flowchart of Figure
below. We obtain the sum by adding the contents of AC and BR
(including their sign bits). The overflow bit V is set to 1 if the exclusive
OR of the last two carries is 1; otherwise it is cleared. The subtraction
operation is performed by adding the content of AC to the 2's
complement of BR. Taking the 2's complement of BR has the effect of
changing a positive number to negative, and vice versa. We must check
for overflow during this operation because the two numbers being added
may have the same sign. Note that if an overflow occurs, the result in
the AC register is erroneous.
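The subtraction step can be sketched as follows (the function and register names mirror the text; the 8-bit width is an assumption):

```python
def ac_sub(ac, br, n=8):
    """AC <- AC - BR in n-bit signed-2's complement. Returns (result, V).

    Subtraction adds the 2's complement of BR; the carry out of the sign
    position is discarded, and V flags overflow via the carry XOR rule."""
    mask = (1 << n) - 1
    neg_br = ((~br) + 1) & mask        # 2's complement of BR
    c_in = ((ac & (mask >> 1)) + (neg_br & (mask >> 1))) >> (n - 1)
    total = (ac & mask) + neg_br
    v = c_in ^ ((total >> n) & 1)      # XOR of last two carries
    return total & mask, v

# Overflow example: (-128) - 1 cannot be represented in 8 bits.
```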
4.2 Multiplication Algorithms
A multiplication algorithm is an algorithm (or method) to multiply two numbers.
Depending on the size of the numbers, different algorithms are used.
Booth Multiplication Algorithm:
Booth's algorithm multiplies two signed binary integers in
2's-complement representation and also speeds up the multiplication
process. Its efficiency comes from how it treats strings of bits in the
multiplier: a string of 0's requires no addition, only shifting, and a
string of 1's extending from bit weight 2^k down to bit weight 2^m can
be treated as 2^(k+1) − 2^m, so only one subtraction and one addition
are needed, at the two ends of the string.
In the flowchart, AC and Qn+1 are initially set to 0, and SC is a
sequence counter set to n, the number of bits in the multiplier. BR
holds the multiplicand bits and QR holds the multiplier bits. On each
cycle the algorithm examines two bits: Qn, the least significant bit of
QR, and Qn+1, a flip-flop that holds the bit previously shifted out of
Qn. If this pair of bits equals 10, the multiplicand is subtracted from
the partial product in the accumulator AC, and the arithmetic shift
right operation (ashr) is then performed. If the pair equals 01, the
multiplicand is added to the partial product in AC, and the arithmetic
shift right is then performed. The arithmetic shift right in Booth's
algorithm shifts AC, QR, and Qn+1 to the right by one position while
leaving the sign bit of AC unchanged. The sequence counter is
decremented after every cycle, and the computational loop is repeated n
times, once for each bit of the multiplier.
Algorithm:
1. Set the multiplicand and multiplier binary bits as M and Q, respectively.
2. Initially, set the AC register and the Qn+1 flip-flop to 0.
3. SC is a sequence counter initialized to n, the number of multiplier
   bits (Q); it is decremented on every cycle until it reaches 0.
4. Qn represents the least significant bit of Q, and Qn+1 holds the bit
   previously shifted out of Qn.
5. On each cycle of Booth's algorithm, the bits Qn and Qn+1 are checked
   as follows:
   i. When the two bits Qn and Qn+1 are 00 or 11, simply perform the
      arithmetic shift right operation (ashr) on the partial product AC
      together with QR and Qn+1.
   ii. If the bits Qn and Qn+1 are 01, the multiplicand bits (M) are
      added to the AC (accumulator register). After that, the arithmetic
      shift right is applied to AC, QR, and Qn+1.
   iii. If the bits Qn and Qn+1 are 10, the multiplicand bits (M) are
      subtracted from the AC (accumulator register). After that, the
      arithmetic shift right is applied to AC, QR, and Qn+1.
6. The operation continues until the sequence counter reaches 0, i.e.,
   for n cycles.
7. The result of the multiplication is stored in the AC and QR
   registers.
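The steps above can be modeled in Python (an illustrative sketch; the register names follow the text, and the 8-bit default width is an assumption):

```python
def booth_multiply(multiplicand, multiplier, n=8):
    """Booth's algorithm for n-bit signed-2's-complement operands.

    AC:QR plus the extra flip-flop Qn+1 are shifted arithmetically right
    each cycle; the pair (Qn, Qn+1) selects add, subtract, or shift only."""
    mask = (1 << n) - 1
    BR = multiplicand & mask
    QR = multiplier & mask
    AC, Qn1, SC = 0, 0, n
    while SC > 0:
        pair = (QR & 1, Qn1)
        if pair == (0, 1):               # end of a string of 1's: add BR
            AC = (AC + BR) & mask
        elif pair == (1, 0):             # start of a string of 1's: subtract BR
            AC = (AC + ((~BR) + 1)) & mask
        # ashr of AC, QR, Qn+1: the sign bit of AC is replicated
        Qn1 = QR & 1
        QR = ((QR >> 1) | ((AC & 1) << (n - 1))) & mask
        AC = (AC >> 1) | (AC & (1 << (n - 1)))
        SC -= 1
    product = (AC << n) | QR             # 2n-bit 2's-complement product
    if product & (1 << (2 * n - 1)):
        product -= 1 << (2 * n)
    return product
```

For example, `booth_multiply(7, 3)` returns 21 and `booth_multiply(-5, 4)` returns −20.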
Hardware algorithm
In case E should remain a 1, the carry from the addition is not
transferred to E.
In case the shift-left operation inserts a 0 into E, the divisor is
subtracted by adding its 2's-complement value, and the carry is
transferred to E. If E = 1, it signifies that X ≥ Y, and therefore Qs is
set to 1. If E = 0, it signifies that X < Y, and the original number is
restored by adding Y to X. In this case, the 0 that was inserted during
the shift is left in Qs.
Floating Point arithmetic operations
2. ROM
Read-Only Memory (ROM) is non-volatile and serves as more or less
permanent storage for information. It also stores the bootstrap loader
program, which loads and starts the operating system when the computer
is turned on. There are different types of ROM:
PROM (Programmable read-only memory) – It can be
programmed by the user. Once programmed, the data and
instructions in it cannot be changed.
EPROM (Erasable Programmable read only memory) – It can
be reprogrammed. To erase data from it, expose it to ultraviolet
light. To reprogram it, erase all the previous data.
EEPROM (Electrically erasable programmable read only
memory) – The data can be erased by applying an electric field,
with no need for ultraviolet light. We can erase only portions of the
chip.
MROM (Masked ROM) – The very first ROMs were hard-wired devices
that contained a pre-programmed set of data or instructions. These
kinds of ROMs are known as Mask ROMs, and they are inexpensive. The
data stored in an MROM cannot be changed by the user; where change
is possible at all, the process is difficult and slow.
Auxiliary Memory
Auxiliary memory is the lowest-cost, highest-capacity, and
slowest-access storage in a computer system. It is where programs and
data are kept for long-term storage or when not in direct use.
There are different categories of Auxiliary storage:
1) Magnetic Storage
Magnetic disks are coated with a magnetic material such as iron
oxide.
Magnetic tape, similar to the tape used in tape recorders, has also
been used for auxiliary storage, primarily for archiving data. Tape is
cheap, but access time is far slower than that of a magnetic disk
because it is sequential-access memory—i.e., data must be
sequentially read and written as a tape is unwound, rather than
retrieved directly from the desired point on the tape. Servers may
also use large collections of tapes or optical discs, with robotic
devices to select and load them, rather like old-fashioned jukeboxes.
Floppy disks and hard disks are examples of magnetic storage.
2) Optical storage
Optical storage devices store and read data using light, often recording
information on what's called an optical disk. The most common types
of optical storage devices are drives that read and write CDs, DVDs and
Blu-ray discs. Scientists continue to research ways to pack more data
onto discs that can fit into a compact space.
3) Magneto-optical discs
Magneto-optical discs are a hybrid storage medium. In reading, spots
with different directions of magnetization give different polarization in
the reflected light of a low-power laser beam. In writing, every spot on
the disk is first heated by a strong laser beam and then cooled under
a magnetic field, magnetizing every spot in one direction, to store all
0s. The writing process then reverses the direction of the magnetic field
to store 1s where desired.
4.5 Associative Memory Hardware Organization
Associative memory is also known as content addressable memory (CAM) or
associative storage or associative array. It is a special type of memory that is
optimized for performing searches through data, as opposed to providing a simple
direct access to the data based on the address.
Associative memory consists of conventional semiconductor memory (usually RAM)
with added comparison circuitry that enables a search operation to complete in
a single clock cycle. It is a hardware search engine, a special type of
computer memory used in certain very-high-speed searching applications.
Match Logic
The match logic for each word can be derived from the comparison
algorithm for two binary numbers.
First we neglect the key bits and compare the argument in A with the
bits stored in the cells of the words. Word i is equal to the argument
in A if Aj = Fij for j = 1, 2, ..., n. Two bits are equal if they are
both 1 or both 0.
The equality of two bits can be expressed logically by the Boolean
function xj = Aj Fij + A'j F'ij, where xj = 1 if the pair of bits in
position j are equal; otherwise, xj = 0.
For a word i to be equal to the argument in A we must have all
xj variables equal to 1.
This is the condition for setting the corresponding match bit Mi to 1.
The Boolean function for this condition is Mi=x1x2x3...xn and
constitutes the AND operation of all pairs of matched bits in a word.
We now include the key bit Kj in the comparison logic. The requirement
is that if Kj = 0, the corresponding bits Aj and Fij need no
comparison; only when Kj = 1 must they be compared. This requirement is
achieved by ORing each term with K'j, giving
Mi = (x1 + K'1)(x2 + K'2) ... (xn + K'n).
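The match logic can be modeled bitwise (a sketch; the function name and word width are our assumptions):

```python
def match_bits(argument, key, words):
    """Compute the match bit Mi for each stored word.

    Mi = product over j of (xj + Kj'), where xj tests Aj = Fij.
    Bit positions where the key bit Kj is 0 are ignored in the compare."""
    # (argument XOR word) has a 1 wherever the bits differ; masking with
    # the key keeps only the positions that must be compared.
    return [int((argument ^ word) & key == 0) for word in words]

# Search for stored words whose low nibble equals 0101:
m = match_bits(0b00000101, 0b00001111,
               [0b10110101, 0b01010101, 0b00000110])
```

Here the first two words match (their low four bits equal the argument's) and the third does not.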
Cache Memory
Cache memory is a special, very high-speed memory used to speed up and
synchronize with a high-speed CPU. Cache memory is costlier than main
memory or disk memory but more economical than CPU registers. It is an
extremely fast memory type that acts as a buffer between RAM and the
CPU, holding frequently requested data and instructions so that they are
immediately available to the CPU when needed.
Cache memory is used to reduce the average time to access data from the Main
memory. The cache is a smaller and faster memory which stores copies of the
data from frequently used main-memory locations. A CPU contains several
independent caches, which store instructions and data.
Associative mapping
In this type of mapping, the associative memory is used to store content and
addresses of the memory word. Any block can go into any line of the cache.
This means that the word id bits are used to identify which word in the block
is needed, but the tag becomes all of the remaining bits. This enables the
placement of any word at any place in the cache memory. It is considered to
be the fastest and the most flexible mapping form.
Direct Mapping
The simplest technique, known as direct mapping, maps each block of main
memory into only one possible cache line. or
In direct mapping, each memory block is assigned to a specific line in the
cache. If a line is already occupied by a memory block when a new block needs
to be loaded, the old block is evicted. The address is split into two parts:
an index field and a tag field. The cache stores the tag field, while the
index selects the line. Direct mapping's performance is directly proportional
to the hit ratio.
For purposes of cache access, each main memory address can be viewed as
consisting of three fields. The least significant w bits identify a unique
word or byte within a block of main memory; in most contemporary machines,
the address is at the byte level. The remaining s bits specify one of the
2^s blocks of main memory. The cache logic interprets these s bits as a tag
of s − r bits (the most significant portion) and a line field of r bits.
This latter field identifies one of the m = 2^r lines of the cache.
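The field decomposition can be sketched as follows (the function name is ours):

```python
def split_address(addr, r, w):
    """Split a main-memory address into (tag, line, word) for direct mapping.

    The low w bits pick the word within a block, the next r bits pick the
    cache line, and the remaining high bits form the tag stored with it."""
    word = addr & ((1 << w) - 1)
    line = (addr >> w) & ((1 << r) - 1)
    tag = addr >> (w + r)
    return tag, line, word
```

For example, with r = 4 and w = 2, the address 0x3A7 splits into tag 14, line 9, word 3.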
Set-Associative Mapping
This form of mapping is an enhanced form of direct mapping where the
drawbacks of direct mapping are removed. Set associative addresses the
problem of possible thrashing in the direct mapping method. It does this by
saying that instead of having exactly one line that a block can map to in the
cache, we will group a few lines together creating a set. Then a block in
memory can map to any one of the lines of a specific set. Set-associative
mapping thus allows two or more blocks in main memory to share the same
index address in the cache. Set-associative cache mapping combines the
best of the direct and associative cache-mapping techniques.
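A set-associative lookup can be sketched as follows (the cache layout, a list of sets each holding (tag, block) pairs, is our assumption):

```python
def set_assoc_lookup(cache, addr, w):
    """Look up addr in a set-associative cache (illustrative model).

    cache is a list of sets; each set is a list of (tag, block) pairs.
    The set index plays the role of direct mapping's unique line index,
    but any line within the chosen set may hold the block."""
    num_sets = len(cache)
    block = addr >> w                  # drop the word-in-block bits
    index = block % num_sets           # which set the block maps to
    tag = block // num_sets            # remaining bits form the tag
    for stored_tag, data in cache[index]:
        if stored_tag == tag:
            return data                # hit: one of the set's lines matched
    return None                        # miss

# A 2-set, 2-way cache with three resident blocks (w = 2):
cache = [[(3, 'a'), (5, 'b')], [(3, 'c')]]
```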
Writing into Cache
Cache is a technique of storing a copy of data temporarily in rapidly accessible
storage memory. Cache stores most recently used words in small memory to
increase the speed at which data is accessed. It acts as a buffer between RAM
and CPU and thus increases the speed at which data is available to the
processor.
Whenever a processor wants to write a word, it checks whether the address
it wants to write to is present in the cache. If the address is present
in the cache, this is a write hit.
We can then update the value in the cache and avoid an expensive main
memory access. But this creates an inconsistent-data problem: because
cache and main memory now hold different data, problems arise when two
or more devices share the main memory (as in a multiprocessor system).
Write Through:
In write-through, data is simultaneously updated to cache and memory.
This process is simpler and more reliable. This is used when there are no
frequent writes to the cache (The number of write operations is less).
It helps with data recovery (in case of a power outage or system failure).
A data write experiences latency (delay), as we have to write to two
locations (both memory and cache). Write-through solves the inconsistency
problem, but it undercuts the advantage of having a cache for write
operations (the whole point of using a cache is to avoid repeated
accesses to the main memory).
Write Back:
The data is updated only in the cache and written to memory at a later
time. Data is updated in memory only when the cache line is about to be
replaced (cache-line replacement is done using Belady's optimal
algorithm, Least Recently Used, FIFO, LIFO, and others, depending on the
application).
Write Back is also known as Write Deferred.
Dirty bit: each block in the cache needs a bit to indicate whether the
data present in the cache has been modified (dirty) or not modified
(clean). If it is clean, there is no need to write it back to memory;
the bit is thus designed to reduce write operations to memory. If the
cache fails, or if the system fails or loses power, the modified data
will be lost, because it is nearly impossible to restore data from the
cache once lost.
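The dirty-bit mechanism can be sketched as follows (class and method names are our assumptions):

```python
class WriteBackLine:
    """One cache line under the write-back policy (illustrative sketch).

    Writes set the dirty bit; main memory is updated only when the line
    is evicted, and only if the dirty bit says the copy was modified."""

    def __init__(self, tag, data):
        self.tag, self.data, self.dirty = tag, data, False

    def write(self, data):
        self.data = data
        self.dirty = True              # cache and memory now disagree

    def evict(self, memory):
        if self.dirty:                 # write deferred until replacement
            memory[self.tag] = self.data
        self.dirty = False
```

Evicting a clean line costs nothing; only a dirty line triggers the deferred memory write.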
Cache initialization
One more aspect of cache organization that must be taken into consideration
is the problem of initialization. The cache is initialized when power is applied
to the computer or when the main memory is loaded with a complete set of
programs from auxiliary memory. After initialization the cache is considered
to be empty, but in effect it contains some non-valid data. It is
customary to include with each word in the cache a valid bit to indicate
whether or not the word contains valid data.
The cache is initialized by clearing all the valid bits to 0. The valid bit of a
particular cache word is set to 1 the first time this word is loaded from main
memory and stays set unless the cache has to be initialized again. The
introduction of the valid bit means that a word in cache is not replaced by
another word unless the valid bit is set to 1 and a mismatch of tags occurs. If
the valid bit happens to be 0, the new word automatically replaces the invalid
data. Thus the initialization condition has the effect of forcing misses from the
cache until it fills with valid data.
4.6 Virtual Memory
Virtual memory is a storage scheme that gives the user the illusion of a very
large main memory. This is done by treating a part of secondary memory as if
it were main memory. Under this scheme, the user can load processes larger
than the available main memory, under the illusion that enough memory is
available to hold them. Instead of loading one big process into main memory,
the operating system loads different parts of more than one process into main
memory. This increases the degree of multiprogramming and therefore also the
CPU utilization.
Address space and Memory Space
Addresses used by programmers are known as virtual addresses, and the
set of such addresses is known as the address space. An address of a
location in main memory is called a physical address, and the set of
such locations is known as the memory space.
The associated programs and data need not occupy adjacent areas of
memory, because information is continually being transferred in and out,
leaving empty spaces distributed through the memory.
Fig: Address Space and memory Space
Suppose the address field of an instruction code has 20 bits, but
physical memory addresses can be specified with only 15 bits. The CPU
will then reference instructions and data with 20-bit addresses, but the
information at each such address must be taken from physical memory
rather than auxiliary memory, because access to auxiliary storage for
individual words would be prohibitively slow.
Page replacement
Page replacement happens when a requested page is not in memory (page
fault) and a free page cannot be used to satisfy the allocation, either because
there are none, or because the number of free pages is lower than some
threshold.
There are various page replacement algorithms. Each algorithm has a different
method by which the pages can be replaced.
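One common algorithm, Least Recently Used (LRU), can be sketched as follows (the function name is ours):

```python
from collections import OrderedDict

def count_faults_lru(pages, frames):
    """Count page faults for a reference string under LRU replacement."""
    memory = OrderedDict()             # insertion order tracks recency
    faults = 0
    for p in pages:
        if p in memory:
            memory.move_to_end(p)      # mark p as most recently used
        else:
            faults += 1
            if len(memory) == frames:
                memory.popitem(last=False)  # evict least recently used
            memory[p] = True
    return faults
```

Running it on the reference string 7, 0, 1, 2, 0, 3, 0, 4, 2, 3, 0, 3, 2 with three frames gives 9 page faults.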