
Unit-I: Register Transfer

The document discusses register transfer and register transfer language (RTL). It defines register transfer as performing micro-operations and transferring the result to registers. RTL uses symbolic notation to describe micro-operations and register transfers. Common operations include simple transfers between registers and conditional transfers based on control functions. Arithmetic and logical instructions also manipulate data in registers.

Uploaded by

Harshit

UNIT-I

Register Transfer

The term Register Transfer refers to the availability of hardware logic circuits that can perform a given
micro-operation and transfer the result of the operation to the same or another register.

Most of the standard notations used for specifying operations on registers are listed below.

o The memory address register is designated by MAR.
o The Program Counter, PC, holds the address of the next instruction.
o The Instruction Register, IR, holds the instruction being executed.
o R1 designates a processor register.
o Individual bits or bit ranges are indicated by placing them in parentheses, for instance PC(8-15) or R2(5).
o Data transfer from one register to another is represented in symbolic form by means of the
replacement operator. For instance, the following statement denotes a transfer of the contents of
register R1 into register R2:

R2 <- R1

Register Transfer Language (RTL)

Register transfer language is a symbolic notation used to describe the micro-operation transfers among registers.

It is a kind of intermediate representation (IR) that is very close to assembly language, such as
that used in a compiler. The term "Register Transfer" means that the hardware can perform
micro-operations and transfer the result of the operation to the same or another register.
Micro-operations :
The operations executed on the data stored in registers are called micro-operations. They are
detailed low-level instructions used in some designs to implement complex machine
instructions.
Register Transfer :
The transfer of information from one register to another, represented in symbolic
form by the replacement operator, is called register transfer.
Replacement Operator :
In the statement R2 <- R1, <- acts as the replacement operator. This statement defines the
transfer of the contents of register R1 into register R2.

There are various ways of representing a register in RTL –

1. The general way of representing a register is by the name of the register enclosed in a
rectangular box, as shown in (a).

2. The bits of an n-bit register are numbered in sequence from 0 to (n-1), as shown in (b).

3. The numbering of bits in a register can be marked on the top of the box, as shown in (c).

4. A 16-bit register PC is divided into 2 parts: bits 0 to 7 hold the lower byte of the 16-bit
address and bits 8 to 15 hold the higher byte, as shown in (d).

Basic symbols of RTL :

Symbol                 Description                                                     Example
Letters and numbers    Denotes a register                                              MAR, R1, R2
( )                    Denotes a part of a register                                    R1(8-bit), R1(0-7)
<-                     Denotes a transfer of information                               R2 <- R1
,                      Specifies two micro-operations of register transfer             R1 <- R2, R2 <- R1
:                      Denotes conditional operations                                  P : R2 <- R1 (if P=1)
Naming operator (:=)   Denotes another name (alias) for an already existing register   Ra := R1

Register Transfer Operations:


The operations performed on the data stored in the registers are referred to as register
transfer operations.
There are different types of register transfer operations:
1. Simple Transfer – R2 <- R1
The contents of R1 are copied into R2 without affecting the contents of R1. It is an unconditional
transfer operation.
2. Conditional Transfer –
P: R2 <- R1
It indicates that if P=1, then the contents of R1 are transferred to R2. It is a unidirectional
operation.
3. Simultaneous Operations –
If 2 or more operations are to occur simultaneously, they are separated with a comma (,).
P: R1 <- R2, R2 <- R1
If the control function P=1, then the contents of R1 are loaded into R2 and, on the same clock,
the contents of R2 are loaded into R1.
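
The conditional and simultaneous transfers above can be sketched in a few lines of Python. This is only an illustrative model, not hardware: the function name `clock_tick` and the dictionary of registers are assumptions made for the example. The key point it shows is that all source registers are read before any destination is written, which is what makes `P: R1 <- R2, R2 <- R1` an exchange.

```python
def clock_tick(regs, control, transfers):
    """Apply every (dest, src) transfer on one clock edge if control == 1.

    regs      -- dict mapping register names to values (hypothetical model)
    control   -- the control function P (0 or 1)
    transfers -- list of (dest, src) pairs, e.g. [("R2", "R1")]
    """
    if not control:
        return dict(regs)  # P = 0: no register changes
    # Sample all sources first so the transfers behave as simultaneous.
    sampled = {dest: regs[src] for dest, src in transfers}
    updated = dict(regs)
    updated.update(sampled)
    return updated


regs = {"R1": 0x12, "R2": 0x34}

# P: R1 <- R2, R2 <- R1 with P = 1 exchanges the two registers.
swapped = clock_tick(regs, control=1, transfers=[("R1", "R2"), ("R2", "R1")])

# With P = 0 nothing is transferred.
unchanged = clock_tick(regs, control=0, transfers=[("R1", "R2"), ("R2", "R1")])

print(swapped, unchanged)
```

If the two assignments were executed one after the other instead, both registers would end up holding the same value, which is not what the comma notation means.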

Arithmetic MicroInstructions :

The four basic arithmetic operations are addition, subtraction, multiplication, and
division. Most computers provide instructions for all four operations.
Typical Arithmetic Instructions –
Name                      Mnemonic  Example  Explanation
Increment                 INC       INC B    Increments register B by 1. B <- B+1
Decrement                 DEC       DEC B    Decrements register B by 1. B <- B-1
Add                       ADD       ADD B    Adds the contents of register B to the contents of the accumulator and stores the result in the accumulator. AC <- AC+B
Subtract                  SUB       SUB B    Subtracts the contents of register B from the contents of the accumulator and stores the result in the accumulator. AC <- AC-B
Multiply                  MUL       MUL B    Multiplies the contents of the accumulator by the contents of register B and stores the result in the accumulator. AC <- AC*B
Divide                    DIV       DIV B    Divides the contents of the accumulator by the contents of register B and stores the quotient in the accumulator. AC <- AC/B
Add with carry            ADDC      ADDC B   Adds the contents of register B and the carry flag to the contents of the accumulator and stores the result in the accumulator. AC <- AC+B+Carry flag
Subtract with borrow      SUBB      SUBB B   Subtracts the contents of register B and the carry flag from the contents of the accumulator and stores the result in the accumulator. AC <- AC-B-Carry flag
Negate (2's complement)   NEG       NEG B    Negates a value by finding the 2's complement of its single operand. This is equivalent to multiplying the operand by -1. B <- B'+1
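
As a rough illustration of two rows of the table, the sketch below models add-with-carry and negate on a fixed-width register in Python. The 16-bit width and the helper names (`add_with_carry`, `negate`, `to_signed`) are assumptions chosen for the example; the table itself does not fix a register size.

```python
WIDTH = 16                 # assumed register width for this sketch
MASK = (1 << WIDTH) - 1    # keeps results inside WIDTH bits


def add_with_carry(ac, b, carry_in):
    """AC <- AC + B + Carry flag; returns (new AC, carry out)."""
    total = ac + b + carry_in
    return total & MASK, total >> WIDTH


def negate(b):
    """B <- B' + 1 (2's complement of B)."""
    return ((~b & MASK) + 1) & MASK


def to_signed(x):
    """Interpret a WIDTH-bit pattern as a signed 2's complement integer."""
    return x - (1 << WIDTH) if x & (1 << (WIDTH - 1)) else x


result, carry = add_with_carry(0xFFFF, 0x0001, 0)
minus_five = negate(5)
print(hex(result), carry, hex(minus_five), to_signed(minus_five))
```

Complementing every bit and adding 1 gives the same bit pattern as multiplying by -1 modulo 2^16, which is why NEG is described as a 2's complement operation.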

Logical and Bit Manipulation Instructions :


Logical instructions perform binary operations on strings of bits stored in registers.
They are useful for manipulating individual bits or a group of bits.

Typical Logical and Bit Manipulation Instructions –


Name                Mnemonic  Example  Explanation
Clear               CLR       CLR      Sets the accumulator to 0. AC <- 0
Complement          COM       COM A    Complements the accumulator. AC <- (AC)'
AND                 AND       AND B    ANDs the contents of register B with the contents of the accumulator and stores the result in the accumulator. AC <- AC AND B
OR                  OR        OR B     ORs the contents of register B with the contents of the accumulator and stores the result in the accumulator. AC <- AC OR B
Exclusive-OR        XOR       XOR B    XORs the contents of register B with the contents of the accumulator and stores the result in the accumulator. AC <- AC XOR B
Clear carry         CLRC      CLRC     Sets the carry flag to 0. Carry flag <- 0
Set carry           SETC      SETC     Sets the carry flag to 1. Carry flag <- 1
Complement carry    COMC      COMC     Complements the carry flag. Carry flag <- (Carry flag)'
Enable interrupt    EI        EI       Enables the interrupt
Disable interrupt   DI        DI       Disables the interrupt

Shift Micro-Instructions :
Shifts are operations in which the bits of a word are moved to the left or right. Shift
instructions may specify either logical shifts, arithmetic shifts, or rotate-type
operations.
Typical Shift Instructions –
Name                         Mnemonic
Logical shift right          SHR
Logical shift left           SHL
Arithmetic shift right       SHRA
Arithmetic shift left        SHLA
Rotate right                 ROR
Rotate left                  ROL
Rotate right through carry   RORC
Rotate left through carry    ROLC

Logical Shift Left –
Each bit moves one position to the left. The empty least significant bit (LSB) is filled
with zero (i.e., the serial input), and the most significant bit (MSB) is discarded.

Right Logical Shift –
Each bit moves one position to the right. The least significant bit (LSB) is discarded
and the empty MSB is filled with zero.

Left Arithmetic Shift –
Each bit moves one position to the left. The empty least significant bit (LSB) is filled
with zero and the most significant bit (MSB) is discarded. This is the same as the
logical shift left.

Right Arithmetic Shift –
Each bit moves one position to the right. The least significant bit is discarded and the
empty MSB is filled with the value of the previous MSB, so the sign bit is preserved.

Left Circular Shift –
Each bit moves one position to the left, and the bit shifted out of the MSB re-enters at
the LSB, so no bit is lost.

Right Circular Shift –
Each bit moves one position to the right, and the bit shifted out of the LSB re-enters at
the MSB, so no bit is lost.
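
The shift types in this section can be sketched on an 8-bit word as follows. The width of 8 bits and the function names are assumptions for illustration only; the names loosely follow the mnemonics in the table of shift instructions. Rotates through carry are left out to keep the sketch short.

```python
WIDTH = 8                   # assumed word size for this sketch
MASK = (1 << WIDTH) - 1
MSB = 1 << (WIDTH - 1)


def shl(x):
    """Logical (and arithmetic) shift left: LSB filled with 0, MSB dropped."""
    return (x << 1) & MASK


def shr(x):
    """Logical shift right: MSB filled with 0, LSB dropped."""
    return x >> 1


def shra(x):
    """Arithmetic shift right: MSB keeps its previous value (sign bit)."""
    return (x >> 1) | (x & MSB)


def rol(x):
    """Circular shift left: old MSB re-enters at the LSB."""
    return ((x << 1) & MASK) | (x >> (WIDTH - 1))


def ror(x):
    """Circular shift right: old LSB re-enters at the MSB."""
    return (x >> 1) | ((x & 1) << (WIDTH - 1))


x = 0b10010011
for name, fn in [("SHL", shl), ("SHR", shr), ("SHRA", shra), ("ROL", rol), ("ROR", ror)]:
    print(f"{name:4s} {x:08b} -> {fn(x):08b}")
```

For this particular input the arithmetic shift right and the rotate right give the same pattern, because the bit that rotates in happens to equal the sign bit; with an input whose LSB is 0 they differ.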

Computer Registers
Registers are a type of computer memory used to quickly accept, store, and transfer data and instructions
that are being used immediately by the CPU. The registers used by the CPU are often termed as Processor
registers.

A processor register may hold an instruction, a storage address, or any data (such as bit sequence or
individual characters).

The computer needs processor registers for manipulating data and a register for holding a memory
address. A register holding a memory location is also needed to determine the address of the next
instruction once execution of the current instruction is completed; this role is played by the
program counter.

Register               Symbol   Number of bits   Function
Data register          DR       16               Holds memory operand
Address register       AR       12               Holds address for the memory
Accumulator            AC       16               Processor register
Instruction register   IR       16               Holds instruction code
Program counter        PC       12               Holds address of the instruction
Temporary register     TR       16               Holds temporary data
Input register         INPR     8                Carries input character
Output register        OUTR     8                Carries output character

The following image shows the register and memory configuration for a basic computer.

o The Memory unit has a capacity of 4096 words, and each word contains 16 bits.
o The Data Register (DR) contains 16 bits, which hold the operand read from the memory location.
o The Address Register (AR) contains 12 bits, which hold the address for the memory
location (4096 = 2^12 words need a 12-bit address).
o The Program Counter (PC) also contains 12 bits, which hold the address of the next instruction to be
read from memory after the current instruction is executed.
o The Accumulator (AC) register is a general purpose processing register.
o The instruction read from memory is placed in the Instruction Register (IR).
o The Temporary Register (TR) is used for holding temporary data during processing.
o The Input Register (INPR) holds the input character given by the user.
o The Output Register (OUTR) holds the output character produced after processing the input data.

Common Bus System
We shall study the common bus system of a very basic computer in this article. A basic
computer has 8 registers, memory unit and a control unit. The diagram of the common bus
system is as shown below.

Connections:

The outputs of all the registers except the OUTR (output register) are connected to the
common bus. The output selected depends upon the binary value of variables S2, S1 and
S0. The lines from common bus are connected to the inputs of the registers and memory. A
register receives the information from the bus when its LD (load) input is activated while in
case of memory the Write input must be enabled to receive the information. The contents of
memory are placed onto the bus when its Read input is activated.
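
The selection mechanism described above can be modelled roughly as a multiplexer followed by a load. The sketch below is an assumption-laden illustration: the specific S2 S1 S0 codes (001 for AR up to 111 for memory) follow the convention commonly used for this basic computer and are not stated in the text, memory is assumed to be addressed by AR, and INPR is assumed to feed the adder and logic circuit rather than the bus. All function and variable names are hypothetical.

```python
# Assumed selection codes for S2 S1 S0; not given in the text above.
BUS_SOURCES = {
    0b001: "AR", 0b010: "PC", 0b011: "DR", 0b100: "AC",
    0b101: "IR", 0b110: "TR", 0b111: "MEMORY",
}
# Register widths taken from the register table.
WIDTHS = {"AR": 12, "PC": 12, "DR": 16, "AC": 16, "IR": 16, "TR": 16, "OUTR": 8}


def bus_value(select, registers, memory):
    """Return the word placed on the common bus for a given S2 S1 S0 value."""
    source = BUS_SOURCES.get(select)
    if source is None:
        return 0  # assumption: code 000 drives nothing onto the bus
    if source == "MEMORY":
        return memory[registers["AR"]]  # Read input active, address from AR
    return registers[source]


def load(dest, select, registers, memory):
    """Model activating LD on `dest`: it captures the bus, truncated to its width."""
    value = bus_value(select, registers, memory)
    registers[dest] = value & ((1 << WIDTHS[dest]) - 1)


registers = {"AR": 0x005, "PC": 0x010, "DR": 0, "AC": 0, "IR": 0, "TR": 0, "OUTR": 0}
memory = [0] * 4096
memory[0x005] = 0x1234

load("DR", 0b111, registers, memory)  # DR <- M[AR]
load("AR", 0b010, registers, memory)  # AR <- PC
print(hex(registers["DR"]), hex(registers["AR"]))
```

Only one source drives the bus at a time, so a single 16-bit path is enough for every register-to-register and memory-to-register transfer.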

Various Registers:
4 registers (DR, AC, IR and TR) have 16 bits and 2 registers (AR and PC) have 12 bits. The
INPR and OUTR have 8 bits each. The INPR receives a character from the input device and
delivers it to the AC, while the OUTR receives a character from the AC and transfers it to the
output device. 5 registers have 3 control inputs: LD (load), INR (increment) and CLR
(clear). These registers are similar to a binary counter.

Adder and logic circuit:

The adder and logic circuit provides the 16 inputs of AC. This circuit has 3 sets of inputs.
One set comes from the outputs of AC and is used to implement register micro-operations. The
second set comes from the DR (data register) and is used to perform arithmetic and
logic micro-operations. The result of these operations is sent to AC, while the end
carry is stored in E as shown in the diagram. The third set of inputs comes from INPR.

Instruction set
An instruction is a code that the computer processor can understand. The code is usually in
1s and 0s, or machine language. An instruction set is the collection of such instructions; they
control the movement of bits and bytes within the processor.
Examples of instructions −
o ADD − Add two numbers together.
o JUMP − Jump to a designated RAM address.
o LOAD − Load information from RAM to the CPU.

Types of Instruction Set


1. Reduced Instruction Set Computer (RISC)
2. Complex Instruction Set Computer (CISC)

Instruction Cycle
Each computer's CPU can have different cycles based on different instruction sets, but they will be similar to the
following cycle:
Fetch Stage: The next instruction is fetched from the memory address that is currently stored in the program counter
and stored in the instruction register. At the end of the fetch operation, the PC points to the next instruction that will be
read in the next cycle.

o Fetch Cycle: Fetch the instruction from memory.
o Decode Cycle: Decode the instruction, then read the effective address from memory.
o Execute Cycle: Execute the instruction.

Registers Involved In Each Instruction Cycle
Following are the different types of registers involved in each instruction cycle:

1. Memory address registers(MAR): It is connected to the address lines of the system bus. It specifies the address in
memory for a read or write operation.
2. Memory Buffer Register(MBR): It is connected to the data lines of the system bus. It contains the value to be
stored in memory or the last value read from the memory.
3. Program Counter(PC): Holds the address of the next instruction to be fetched.
4. Instruction Register(IR): Holds the last instruction fetched.
Role of Registers Involved in Instruction Cycle
The program counter (PC) is a special register that holds the memory address of the next instruction to be executed.
During the fetch stage, the address stored in the PC is copied into the memory address register (MAR) and then the PC
is incremented in order to “point” to the memory address of the next instruction to be executed.
The CPU then takes the instruction at the memory address described by the MAR and copies it into the memory data
register (MDR). The MDR also acts as a two-way register that holds data fetched from memory or data waiting to be
stored in memory (it is also known as the memory buffer register (MBR) because of this). Eventually, the instruction
in the MDR is copied into the current instruction register (CIR) which acts as a temporary holding ground for the
instruction that has just been fetched from memory.
During the decode stage, the control unit (CU) will decode the instruction in the CIR. The CU then sends signals to
other components within the CPU, such as the arithmetic logic unit (ALU) and the floating-point unit (FPU). The
ALU performs arithmetic operations such as addition and subtraction and also multiplication by repeated addition and
division via repeated subtraction. It also performs logic operations such as AND, OR, NOT, and binary shifts as well.
The FPU is reserved for performing floating-point operations.
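
The fetch stage described above amounts to four register transfers. The sketch below walks through them in Python; the dictionary-based CPU model, the function name `fetch`, and the 12-bit address width (borrowed from the register table earlier in this unit) are assumptions made for illustration.

```python
ADDRESS_MASK = 0xFFF  # assumed 12-bit addresses, as in the register table


def fetch(cpu, memory):
    """Perform one fetch stage and return the fetched instruction."""
    cpu["MAR"] = cpu["PC"]                      # MAR <- PC
    cpu["PC"] = (cpu["PC"] + 1) & ADDRESS_MASK  # PC <- PC + 1
    cpu["MDR"] = memory[cpu["MAR"]]             # MDR <- M[MAR]
    cpu["CIR"] = cpu["MDR"]                     # CIR <- MDR
    return cpu["CIR"]


memory = [0] * 4096
memory[0x100] = 0x1234  # hypothetical instruction words
memory[0x101] = 0x7200

cpu = {"PC": 0x100, "MAR": 0, "MDR": 0, "CIR": 0}
first = fetch(cpu, memory)
second = fetch(cpu, memory)
print(hex(first), hex(second), hex(cpu["PC"]))
```

Because the PC is incremented during the fetch, it already points at the following instruction while the current one is being decoded and executed.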
Components of Instruction Cycle
Following are the main components of every instruction cycle:
1. Fetch Cycle
The fetching of the instruction is the first phase. The fetch phase is common to every instruction executed in a central
processing unit. In this phase, the central processing unit copies the PC into the MAR and then sends the READ command
on the control bus.
After the read command is sent, the memory returns the instruction stored at that particular address on the data
bus. Then, the CPU copies the data from the data bus into the MBR and then copies it from the MBR to the
instruction register.
After all this, the PC is incremented to the next memory location so that the next instruction can be fetched from
memory.
2. Decode Cycle
The decoding of the instruction is the second phase. In this phase, the CPU determines which instruction has been
fetched and what action needs to be performed. The opcode of the instruction is decoded to identify the
operation that needs to be performed.
The reading of the effective address is the third phase. This phase depends on the type of operation, which
can be a memory-type or a non-memory-type operation. Memory instructions can be divided into
two categories: direct memory instructions and indirect memory instructions.
3. Execute Cycle
The execution of instruction is the last phase. In this stage, the instruction is finally executed. The instruction is
executed, and the result of the instruction is stored in the register. After the execution of an instruction, the CPU
prepares itself for the execution of the next instruction. For every instruction, the execution time is calculated, which is
used to tell the processing speed of the processor.

Basic Computer Instructions

The basic computer has a 16-bit instruction register (IR), which can hold a memory
reference, register reference, or input-output instruction.
1. Memory Reference – These instructions refer to a memory address as an
operand. The other operand is always the accumulator. The instruction specifies a 12-bit address,
a 3-bit opcode (other than 111) and a 1-bit addressing mode for direct or indirect addressing.
Example –
IR contains 0001XXXXXXXXXXXX, i.e. ADD. After fetching and decoding the
instruction, we find that it is a memory reference instruction for the ADD operation.
Hence, DR ← M[AR]
AC ← AC + DR, SC ← 0

2. Register Reference – These instructions perform operations on registers rather than
memory addresses. IR(14 – 12) is 111 (which differentiates it from a memory reference) and
IR(15) is 0 (which differentiates it from input/output instructions). The remaining 12 bits specify
the register operation.
Example –
IR contains 0111001000000000, i.e. CMA. After the fetch and decode cycle, we find
that it is a register reference instruction for complementing the accumulator.
Hence, AC ← ~AC

3. Input/Output – These instructions are for communication between the computer and the outside
environment. IR(14 – 12) is 111 (which differentiates it from a memory reference) and IR(15) is 1
(which differentiates it from register reference instructions). The remaining 12 bits specify the
I/O operation.
Example –
IR contains 1111100000000000, i.e. INP. After the fetch and decode cycle, we find
that it is an input/output instruction for inputting a character. Hence, a character is input from
the peripheral device.
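
The three instruction types can be told apart from IR(15) and IR(14 – 12) alone, as in the following sketch. The function names `classify` and `decode_memory_reference` are hypothetical; the bit positions and the three example words come from the descriptions above (0x7200 is 0111001000000000 and 0xF800 is 1111100000000000).

```python
def classify(ir):
    """Classify a 16-bit instruction word by IR(15) and IR(14-12)."""
    opcode = (ir >> 12) & 0b111  # IR(14-12)
    bit15 = (ir >> 15) & 1       # IR(15)
    if opcode != 0b111:
        return "memory-reference"
    return "input-output" if bit15 else "register-reference"


def decode_memory_reference(ir):
    """Split a memory-reference word into mode bit, opcode and 12-bit address."""
    return {
        "indirect": bool((ir >> 15) & 1),  # 1-bit addressing mode
        "opcode": (ir >> 12) & 0b111,      # 3-bit opcode
        "address": ir & 0xFFF,             # 12-bit address
    }


for word in (0x1234, 0x7200, 0xF800):
    print(f"{word:016b} -> {classify(word)}")
print(decode_memory_reference(0x1234))
```

Here 0x1234 stands for the ADD pattern 0001XXXXXXXXXXXX with an arbitrary address filled in for the X bits.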

INPUT-OUTPUT AND INTERRUPT


An interrupt I/O is a process of data transfer in which an external device or a peripheral informs the
CPU that it is ready for communication and requests the attention of the CPU.

I/O Configuration
The terminals send and receive serial information. Each unit of serial data has eight bits of
alphanumeric code, where the leftmost bit is always 0. The serial data from the keyboard is
shifted into the input register INPR. The output register OUTR holds the serial data for the printer.
These two registers communicate with the Accumulator (AC) in parallel and with a communication interface
serially.
The Input/Output configuration is displayed in the figure. The transmitter interface gets serial data from the
keyboard and sends it to INPR. The receiver interface gets data from OUTR and transfers it to the printer
serially.
The input and output registers have eight bits each. The FGI is a 1-bit input flag, which is a control flip-flop. The
flag bit is set to 1 when new data is available in the input device and is cleared to 0 when the data is
accepted by the computer.
When a key is pressed on the keyboard, the alphanumeric code for the key is shifted into INPR and
the input flag FGI is set to 1. The data in INPR cannot be changed as long as the flag is set. The computer
tests the flag bit; if it is 1, the data from INPR is transferred in parallel into AC, and FGI is cleared to 0.
The output register OUTR works similarly to the input register INPR, but the flow of data is in the
opposite direction. The output flag FGO is set to 1 initially.
The computer tests the flag bit; if it is 1, the data from AC is transferred in parallel to OUTR, and FGO is cleared to
0. New data cannot be loaded into OUTR while FGO is 0, because this condition indicates that the
output device is in the process of printing a character.

Input Register:
The input register INPR is an 8-bit register that holds alphanumeric input data. The 1-bit
input flag FGI is a control flip-flop. When new data is available in the input device, the flag bit is set to
1. It is cleared to 0 when the data is accepted by the computer. The flag is needed to synchronize the timing
rate difference between the input device and the computer.
The process of data transfer is as follows −

o The input flag FGI is initially cleared to 0. When a user presses any key on the keyboard, an 8-bit alphanumeric
code is shifted into INPR and the input flag FGI is set to 1.
o The computer tests the flag bit. If the bit is 1, the data from INPR is transferred to AC and, at the same time,
FGI is cleared to 0.
o Once the flag is cleared, new data can be shifted into INPR by pressing another key.

Output Register:
The output register OUTR works similarly to the input register INPR, except that the
direction of data flow is reversed.
The procedure of data transfer is as follows −

o The output flag FGO is initially set to 1.
o The computer tests the flag bit. If the bit is 1, the data from AC is transferred to OUTR and, at the same time,
FGO is cleared to 0.
o After that, the output device accepts the coded 8-bit data and prints the matching character.
o After this operation is done, the output device sets FGO to 1.
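
The flag handshake for both registers can be sketched as a small Python class. This is a simplified software model under stated assumptions: the class name `IOInterface` and its method names are hypothetical, the keyboard and printer are reduced to method calls, and input is assumed to replace only the low 8 bits of AC.

```python
class IOInterface:
    """Toy model of INPR/OUTR with the FGI and FGO flags (names hypothetical)."""

    def __init__(self):
        self.INPR = 0
        self.OUTR = 0
        self.AC = 0
        self.FGI = 0       # input flag starts cleared
        self.FGO = 1       # output flag starts set
        self.printed = []  # characters the "printer" has produced

    def key_pressed(self, ch):
        """Device side: shift a character into INPR unless one is still pending."""
        if self.FGI == 1:
            return False   # INPR may not change while the flag is set
        self.INPR = ord(ch) & 0xFF
        self.FGI = 1
        return True

    def cpu_input(self):
        """Computer side: if FGI = 1, AC(0-7) <- INPR and FGI <- 0."""
        if self.FGI == 0:
            return False
        self.AC = (self.AC & 0xFF00) | self.INPR
        self.FGI = 0
        return True

    def cpu_output(self):
        """Computer side: if FGO = 1, OUTR <- AC(0-7) and FGO <- 0."""
        if self.FGO == 0:
            return False   # printer is still busy
        self.OUTR = self.AC & 0xFF
        self.FGO = 0
        return True

    def printer_done(self):
        """Device side: print the character in OUTR, then set FGO back to 1."""
        self.printed.append(chr(self.OUTR))
        self.FGO = 1


io = IOInterface()
accepted_a = io.key_pressed("A")
accepted_b = io.key_pressed("B")  # rejected: "A" has not been read yet
io.cpu_input()
first_out = io.cpu_output()
second_out = io.cpu_output()      # rejected: printer has not finished
io.printer_done()
print(io.printed, io.FGI, io.FGO)
```

Each flag is set by one side and cleared by the other, which is what lets a slow device and a fast computer stay in step.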

Design of Basic Computer


The basic computer comprises the following hardware components:
1. Memory unit with 4096 words of 16 bits each
2. Eight registers
   1. AR (Address Register): In indirect or direct addressing, the processor needs to keep track of the
      location in memory that it is addressing. For this purpose, the processor uses the Address
      Register (AR).
   2. PC (Program Counter): It holds the memory address of the next instruction to fetch.
   3. DR (Data Register): It holds the operand, found using direct or indirect addressing. The
      processor uses the value stored in DR as data for its operations.
   4. AC (Accumulator): It is a general-purpose register. Instructions can load the AC from a
      specific memory location or store the AC contents into a particular memory location.
   5. IR (Instruction Register): It stores the instruction code currently being processed. The control
      circuitry then converts this code into the micro-operations necessary to implement it.
   6. TR (Temporary Register): It stores intermediate results or other temporary data. It acts as a
      scratch register for the processor.
   7. OUTR (Output Register): It holds an 8-bit character for the output device.
   8. INPR (Input Register): It holds an 8-bit character received from the input device.
3. Seven flip-flops
   1. I flip-flop: It gives information about the addressing mode, whether it is direct or indirect.
   2. S flip-flop: It is the start-stop flip-flop.
   3. E flip-flop: It is used for the carry.
   4. R flip-flop: It is used for the interrupt.
   5. IEN flip-flop: It is used for interrupt enable.
   6. FGI flip-flop: It is used for the input flag.
   7. FGO flip-flop: It is used for the output flag.
4. Two decoders
   1. 3x8 operation decoder: used to decode the opcode bits of the IR (Instruction Register)
   2. 4x16 timing decoder: used to decode the time signal generated by the SC (Sequence Counter)
5. A 16-bit common bus: The basic computer has eight registers, a memory unit, and a control unit, and
   a path is required to transfer information from one register to another and between
   memory and registers. A common bus provides this path. The 16-bit common bus is shown in the
   diagram.
6. Control logic gates
7. Adder and logic circuit connected to the input of AC

The inputs to the control logic gates come from:

1. The I flip-flop
2. The two decoders
3. Bits 0-11 of the IR (Instruction Register)
4. Other inputs to the control logic gates:
   1. AC (Accumulator) bits 0-15, to check if AC=0 and to detect the sign bit in AC(15)
   2. DR (Data Register) bits 0-15, to check if DR=0
   3. The values of the seven flip-flops
The outputs of the control logic circuit are as follows:
1. Signals to control the read and write inputs of memory
2. Signals to control the inputs of the eight registers
3. Signals to control the AC adder and logic circuit
4. Signals to control S2, S1, and S0 to select a register for the bus
5. Signals to set, clear or complement the flip-flops

Booth's Multiplication Algorithm
Booth's algorithm is a multiplication algorithm that multiplies two signed binary
integers in 2's complement representation. It also speeds up the multiplication
process, because it relies on two facts: a string of 0's in the multiplier requires no addition,
only shifting, and a string of 1's in the multiplier from bit weight $2^k$ down to weight $2^m$ can be
treated as $2^{k+1} - 2^m$.
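
As a small worked example of the string-of-1's identity (the multiplier value 001110 is chosen here purely for illustration), take a multiplier whose 1's run from weight $2^3$ down to $2^1$, so $k = 3$ and $m = 1$:

```latex
\[
001110_2 \;=\; 2^3 + 2^2 + 2^1 \;=\; 2^{3+1} - 2^1 \;=\; 16 - 2 \;=\; 14
\]
\[
M \times 14 \;=\; M \times 2^4 \;-\; M \times 2^1
\]
```

So instead of three additions of the multiplicand $M$, one subtraction (at weight $2^1$) and one addition (at weight $2^4$) are enough, with shifts in between.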

Following is the pictorial representation of the Booth's Algorithm:

In the above flowchart, AC and the Qn+1 bit are initially set to 0, and SC is a sequence counter that
is set to n, the number of bits in the multiplier. BR holds the multiplicand bits, and QR holds
the multiplier bits. Two bits of the multiplier, Qn and Qn+1, are then inspected, where Qn is the
least significant bit of QR and Qn+1 is an extra bit to the right of Qn that holds the bit most
recently shifted out of QR. If the two bits are equal to 10, we
subtract the multiplicand from the partial product in the accumulator AC and then perform the arithmetic
shift right operation (ashr). If the two bits are equal to 01, we add
the multiplicand to the partial product in AC and then perform the arithmetic shift right operation
(ashr), which includes Qn+1. The arithmetic shift operation in Booth's algorithm shifts the AC and QR bits
to the right by one position and leaves the sign bit in AC unchanged. The sequence counter is
decremented on every pass, so the computational loop is repeated n times, once per multiplier bit.
Working on the Booth Algorithm
1. Set the multiplicand and multiplier binary bits as M and Q, respectively.
2. Initially, set the AC and Qn+1 values to 0.
3. SC is a sequence counter that is initialized to the number of multiplier bits (n) and is
decremented on each cycle until it reaches 0.
4. Qn represents the last (least significant) bit of Q, and Qn+1 is the extra bit to the right of Qn.
5. On each cycle of Booth's algorithm, the Qn and Qn+1 bits are checked as follows:
   i. When the two bits Qn and Qn+1 are 00 or 11, we simply perform the arithmetic shift right
      operation (ashr) on the partial product, shifting AC, QR and Qn+1 to the right by 1 bit.
   ii. If the bits Qn and Qn+1 are 01, the multiplicand bits (M) are added to the AC
      (Accumulator register). After that, we perform the arithmetic shift right operation on the
      AC and QR bits by 1.
   iii. If the bits Qn and Qn+1 are 10, the multiplicand bits (M) are subtracted from
      the AC (Accumulator register). After that, we perform the arithmetic shift right operation
      on the AC and QR bits by 1.
6. The operation continues until all n bits of the multiplier have been processed.
7. The result of the multiplication is stored in the AC and QR registers.
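
The steps above can be sketched in Python as follows. This is a minimal model under stated assumptions, not a definitive implementation: the function name `booth_multiply` and its variable names are made up for the example, registers are modelled as plain integers masked to n bits, and the multiplicand is assumed not to be the most negative n-bit value (its negation does not fit in an n-bit AC, a known limitation of this register arrangement).

```python
def booth_multiply(multiplicand, multiplier, n):
    """Multiply two signed n-bit integers with Booth's algorithm."""
    if multiplicand == -(1 << (n - 1)):
        # Assumption of this sketch: -2^(n-1) cannot be negated in n bits.
        raise ValueError("multiplicand out of range for this sketch")
    mask = (1 << n) - 1
    sign = 1 << (n - 1)
    br = multiplicand & mask  # BR: multiplicand (M) in 2's complement
    qr = multiplier & mask    # QR: multiplier (Q) in 2's complement
    ac = 0                    # AC: partial product, starts at 0
    q_extra = 0               # Qn+1, starts at 0

    for _ in range(n):        # SC counts down n cycles
        qn = qr & 1
        if (qn, q_extra) == (1, 0):
            ac = (ac - br) & mask  # 10: AC <- AC - M
        elif (qn, q_extra) == (0, 1):
            ac = (ac + br) & mask  # 01: AC <- AC + M
        # Arithmetic shift right of AC, QR and Qn+1 taken together.
        q_extra = qr & 1
        qr = (qr >> 1) | ((ac & 1) << (n - 1))
        ac = (ac >> 1) | (ac & sign)  # sign bit of AC is kept

    product = (ac << n) | qr          # result sits in AC and QR
    if product & (1 << (2 * n - 1)):  # reinterpret as signed 2n-bit value
        product -= 1 << (2 * n)
    return product


print(booth_multiply(-9, -13, 5))
print(booth_multiply(3, -2, 4))
```

Note that the subtraction in the 10 case uses the multiplicand, and that cases 00 and 11 fall straight through to the shift.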

UNIT-II

CONTROL UNIT

In a computer system, most tasks are controlled by the processor or
CPU (Central Processing Unit), which is the main component of a computer. The CPU usually has
two main parts: the control unit (CU) and the arithmetic and logic unit (ALU). The control unit (CU)
synchronizes tasks by sending timing and control signals. Mathematical and logical operations,
on the other hand, are handled by the ALU. There are two types of control units:
micro-programmed control units and hardwired control units. An instruction can be executed
with either type.

In the hardwired control unit, the execution of operations is much faster, but implementation,
modification, and decoding are difficult. In contrast, implementing, modifying, and decoding
with a micro-programmed control unit is much easier. The micro-programmed control unit is also
able to handle complex instructions. With the help of the control signals generated by
micro-programmed and hardwired control units, we are able to fetch and execute instructions.

Control Signals
Both types of control unit are designed to generate control signals. The
hardware of a processor is operated by means of these control signals. The
control signals convey the following information:

o Control signals indicate what operation is going to be performed.
o They indicate the sequence of operations performed by the processor.
o They indicate the timing at which an operation must be executed, among
other things.

Hardwired Control Unit


By generating control signals, the hardwired control unit is able to execute
instructions at the correct time and in the proper sequence. Compared to the micro-programmed CU, the
hardwired CU is generally faster. In this CU, the control signals are generated with the help of a PLA
circuit and a state counter, and the central processing unit requires all of these control signals.
The hardwired control signals are generated by hardware, that is, by a circuitry-based
approach.

The image of a hardwired control unit contains various components in the
form of circuitry. We will discuss them one by one so that we can properly understand the "generation of
control signals".

The instruction register is a type of processor register that holds the instruction
currently in execution. The instruction register supplies the OP-code bits
of the operation as well as the addressing mode of the operands.

These Op-code bits are received by the instruction decoder. The
instruction decoder interprets the operation and the instruction's addressing mode. On the basis
of the addressing mode and operation of the instruction held in the instruction register, the
instruction decoder sets the corresponding instruction signal INSi to 1. Each instruction is
executed in a number of steps, i.e., instruction fetch, decode, operand fetch, arithmetic and
logical unit, and memory store. Different books may list different steps, but in general we are
able to execute an instruction with the help of these five steps.

o The control unit must know the current step of the instruction.
For this purpose a Step Counter is implemented, which provides the signals T1, ..., T5.
On the basis of the step the instruction is in, the step counter sets one of the signals
T1 to T5 to 1.
o How does the step counter know the current step of the
instruction? To keep track of the current step, a Clock is implemented. One clock cycle
is completed for each step. For example, if the step counter sets T3
to 1, then after one clock cycle is completed, the step counter sets T4 to 1.
o What happens if the execution of an instruction is
interrupted for some reason? Will the step counter still be triggered by the clock? The
answer to this question is no. Until the execution of the current step is completed,
the Counter Enable "disables" the Step Counter so that it stops incrementing to the
next step signal.
o What if the execution of an instruction depends on some
conditions? In this case, the Condition Signals are used. The condition signals
reflect conditions such as less than,
greater than, less than or equal, greater than or equal, and many more.
o The external input is the last one. It is used to tell the Control Signal Generator about
interrupts, which affect the execution of an instruction.

So, on the basis of the inputs received from the condition signals, the step counter, the external inputs, and the instruction register, the Control Signal Generator produces the control signals.
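The interaction of the step counter, Counter Enable, condition signals and the control signal generator can be sketched in a few lines of Python. This is only an illustrative model: the signal names (PC_OUT, ALU_ADD and so on), the three sample instructions and the assignment of signals to steps are assumptions made for the example, not part of any real processor.

```python
STEP_NAMES = ["T1", "T2", "T3", "T4", "T5"]  # fetch, decode, operand fetch, execute, store


class StepCounter:
    """One-hot step counter: exactly one of T1..T5 is 1 at a time."""

    def __init__(self):
        self.index = 0  # start at T1

    def clock(self, counter_enable=True):
        # A clock pulse advances the counter only while Counter Enable is asserted.
        if counter_enable:
            self.index = (self.index + 1) % len(STEP_NAMES)

    def signals(self):
        return {name: int(i == self.index) for i, name in enumerate(STEP_NAMES)}


def control_signal_generator(ins, steps, zero_flag=False, interrupt=False):
    """Combinational logic: the outputs depend only on the current inputs."""
    if interrupt:  # the external input overrides normal sequencing
        return {"SERVICE_INTERRUPT"}
    signals = set()
    if steps["T1"]:
        signals |= {"PC_OUT", "MEM_READ", "IR_IN"}
    if steps["T2"]:
        signals |= {"DECODE"}
    if steps["T3"] and ins in ("ADD", "LOAD"):
        signals |= {"ADDR_OUT", "MEM_READ"}
    if steps["T4"] and ins == "ADD":
        signals |= {"ALU_ADD"}
    if steps["T4"] and ins == "BRZ" and zero_flag:  # a condition signal gates the branch
        signals |= {"PC_LOAD"}
    if steps["T5"] and ins == "STORE":
        signals |= {"MEM_WRITE"}
    return signals


counter = StepCounter()
for _ in STEP_NAMES:
    step = STEP_NAMES[counter.index]
    print(step, sorted(control_signal_generator("ADD", counter.signals())))
    counter.clock()
```

In real hardware each output would be a fixed AND/OR expression over the INSi, Ti and condition inputs; the `if` statements above stand in for those gates.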

Micro-programmed Control Unit


A micro-programmed control unit can be described as a relatively simple logic circuit that does two things: it executes each instruction by generating control signals, and it sequences through microinstructions. The control signals are generated by a program rather than by fixed circuitry. This approach was very popular during the evolution of CISC architectures. The program that produces the control signals is known as the "Micro-program". The micro-program is stored in a fast memory on the processor chip, known as the control store or control memory.

A micro-program consists of a set of microinstructions. Each microinstruction, or control word, is an n-bit word. Control words differ in their bit patterns, and the bit pattern of a control word determines which control signals are generated.

As in the hardwired case, instruction execution in a micro-programmed control unit is performed in steps. For each step, the micro-program contains a control word/microinstruction. Executing a particular instruction therefore requires a sequence of microinstructions, known as a micro-routine. The figure of a micro-programmed control unit shows the organization of the micro-program, micro-routine, and control word/microinstruction, and it works as follows.

o Instruction fetch is the first step. In this step, the instruction is fetched into the IR (Instruction Register) with the help of the microinstruction address register.
o Decode is the second step. In this step, the instruction held in the instruction register is decoded by the microinstruction address generator, which yields the starting address of the micro-routine that performs the operation specified in the instruction. This starting address is loaded into the micro-program counter.
o Increment is the third step. In this step, the control word at the starting address of the micro-routine is read. As execution proceeds, the micro-program counter is incremented so that the successive control words of the micro-routine are read.
o End bit is the fourth step. The last microinstruction of a micro-routine contains a bit known as the end bit. When the end bit is set to 1, the execution of the micro-routine is complete.
o In the last step, the microinstruction address generator returns to Step 1 to fetch a new instruction, and the cycle repeats.
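The steps above can be sketched as a small loop over a control store. The addresses, control signal names and the two sample micro-routines below are hypothetical and chosen only to show the roles of the starting address, the micro-program counter and the end bit.

```python
# Control store: address -> (control signals, end bit). Contents are illustrative.
CONTROL_STORE = {
    10: (("ADDR_OUT", "MEM_READ"), 0),  # ADD micro-routine starts here
    11: (("ALU_ADD",), 0),
    12: (("AC_IN",), 1),                # end bit = 1: last word of the routine
    20: (("ADDR_OUT", "MEM_READ"), 0),  # LOAD micro-routine starts here
    21: (("AC_IN",), 1),
}

# Decode step: the address generator maps an opcode to a starting address.
START_ADDRESS = {"ADD": 10, "LOAD": 20}


def run_micro_routine(opcode):
    """Return the list of (address, control signals) read for one instruction."""
    trace = []
    upc = START_ADDRESS[opcode]  # load the micro-program counter
    while True:
        signals, end_bit = CONTROL_STORE[upc]
        trace.append((upc, signals))
        if end_bit:              # routine finished, go back to instruction fetch
            return trace
        upc += 1                 # read the next control word in sequence


for address, signals in run_micro_routine("ADD"):
    print(address, signals)
```

Changing the behaviour of an instruction here only means editing `CONTROL_STORE`, which is the flexibility the text attributes to micro-programmed control.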

So in the micro-programmed control unit, the micro-programs are stored in the control memory or control store. This CU is easy to implement and flexible, but it is slower than the hardwired control unit.

Hardwired Control Unit | Microprogrammed Control Unit
--- | ---
Generates the control signals needed for the processor using logic circuits | Generates the control signals with the help of microinstructions stored in control memory
Faster than the microprogrammed control unit, as the required control signals are generated directly in hardware | Slower, as microinstructions are used for generating the signals
Difficult to modify, as the control signals that need to be generated are hard wired | Easy to modify, as the modification needs to be done only at the microinstruction level
More costly, as everything has to be realized in terms of logic gates | Less costly, as only microinstructions are used for generating control signals
Cannot handle complex instructions, as the circuit design for them becomes complex | Can handle complex instructions
Only a limited number of instructions are used, due to the hardware implementation | Control signals for many instructions can be generated
Used in Reduced Instruction Set Computers (RISC) | Used in Complex Instruction Set Computers (CISC)

CONTROL MEMORY:
A control memory is a part of the control unit. A computer that uses microprogrammed control has two separate memories: the main memory and the control memory. Programs are usually stored in the main memory by the users. Its contents, which consist of machine instructions and data, change whenever the programs or data are modified.
The control memory holds microprograms that are fixed and are not modified frequently. They consist of microinstructions that specify the internal control signals required to execute register micro-operations.
Each machine instruction initiates a series of microinstructions in the control memory. These microinstructions generate the micro-operations that fetch the instruction from main memory, compute the effective address, execute the operation, and return control to the fetch phase to continue the cycle.

Here, the control memory is assumed to be a Read-Only Memory (ROM), in which all the control information is stored permanently. The control address register (CAR) specifies the address of the microinstruction, and the control data register holds the microinstruction that is read from the memory. The microinstruction contains a control word that specifies one or more micro-operations for the data processor.
Once these operations are completed, the next address must be determined. It is computed in the next address generator and then transferred to the control address register so that the next microinstruction can be read. The next address generator is also known as the microprogram sequencer. It determines the address of the next microinstruction based on its inputs. The next microinstruction address can be specified in several ways.
The main functions of a microprogram sequencer are as follows −

• It can increment the control address register by one.
• It can load an address from the control memory into the control address register.
• It can transfer an external address or load an initial address to start the control operations.

ADDRESS SEQUENCING
The control memory stores the microinstructions in groups, where each group specifies a routine. Each computer instruction has its own micro-program routine in the control memory, which generates the micro-operations that execute the instruction. The hardware that controls the address sequencing of the control memory must be capable of sequencing the microinstructions within a routine and of branching from one routine to another.
To execute a single computer instruction, the control must go through the following steps:

o When the computer is powered on, an initial address is first loaded into the CAR (control address register). This is the address of the first microinstruction, which activates the instruction fetch routine.
o Next, the control memory goes through the routine that determines the effective address of the operand.
o In the next step, the micro-operations that execute the instruction fetched from memory are generated.

The transformation of the instruction code bits into the address in control memory where the routine is located is called the mapping process. The address sequencing capabilities required in a control memory are as follows:

o It selects a conditional or unconditional branch, depending on the status bit conditions.
o It can increment the CAR (control address register).
o It provides a facility for subroutine calls and returns.
o It provides a mapping process from the bits of the instruction to an address in control memory.

The block diagram of a control memory and its associated hardware shows what is required for selecting the address of the next microinstruction. A microinstruction in control memory contains a set of bits. Some of these bits initiate micro-operations in the computer registers. The remaining bits specify the method by which the next address is obtained.

The diagram also shows that the control address register can receive its address from four different paths. The incrementer increments the CAR by one to select the next microinstruction in sequence. Branching is achieved by specifying the branch address in one of the fields of the microinstruction.

Conditional branching is obtained by using part of the microinstruction to select a specific status bit and test its condition. An external address is transferred into control memory through a mapping logic circuit. The return address for a subroutine is saved in a special register, whose value is used when the micro-program returns from the subroutine.
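The four sources of the next address (incrementer, branch field, mapping logic and the subroutine return register) can be sketched as a single selection function. The function and parameter names are hypothetical, and the particular mapping rule (a 4-bit opcode placed into a 7-bit address of the form 0 xxxx 00) is an assumption used only for illustration.

```python
def map_opcode(opcode):
    # Assumed mapping rule: 4-bit opcode -> 7-bit control memory address 0 xxxx 00.
    return (opcode & 0xF) << 2


def next_address(car, select, branch_address=0, status_bit=False, opcode=0, sbr=None):
    """Return (next CAR value, subroutine register value)."""
    if select == "INCREMENT":        # next microinstruction in sequence
        return car + 1, sbr
    if select == "BRANCH":           # unconditional branch to the address field
        return branch_address, sbr
    if select == "COND_BRANCH":      # branch only if the selected status bit is 1
        return (branch_address if status_bit else car + 1), sbr
    if select == "MAP":              # external address from the instruction bits
        return map_opcode(opcode), sbr
    if select == "CALL":             # save the return address, then branch
        return branch_address, car + 1
    if select == "RETURN":           # resume at the saved return address
        if sbr is None:
            raise ValueError("RETURN without a saved return address")
        return sbr, sbr
    raise ValueError(f"unknown select value: {select}")


car, sbr = next_address(5, "CALL", branch_address=40)
print(car, sbr)                      # CAR now points at the subroutine
car, sbr = next_address(car, "RETURN", sbr=sbr)
print(car)                           # back at the microinstruction after the call
```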

INSTRUCTION BASED ON CPU ORGANIZATION:

An instruction is divided into fields, the most common of which are:

• Operation field, which specifies the operation to be performed, such as addition.
• Address field, which contains the location of the operand, i.e., a register or memory location.
• Mode field, which specifies how the operand is to be found.

The length of an instruction varies with the number of addresses it contains.


Generally, CPU organization is of three types based on the number of address fields:
Single Accumulator organization
General register organization
Stack organization

(i) Single Accumulator Organization: In this type of organization, all operations are performed on an implied accumulator. The instruction format uses only one address field. For example, the instruction that loads the accumulator with the contents of a memory location is
Load X
where X is the address of the source operand. This results in the operation AC ← M[X], where AC is the accumulator and M[X] symbolizes the memory word located at address X.

(ii) General Register Organization: In this organization, the instruction format needs two or three register address fields, depending on the operation.
For example, an instruction for addition may be written as
ADD R1, R2, R3
It denotes the operation R1 ← R2 + R3.
The same ADD instruction needs only two register address fields if the destination register is one of the source registers, i.e., if the operation is
R1 ← R1 + R2
Then the instruction is ADD R1, R2.
The instruction may also contain one memory address field and one register address field. For example, the instruction
ADD R1, X
specifies the operation R1 ← R1 + M[X].

(iii) Stack Organization: In this organization, the computer has PUSH and POP instructions, which require an address field. For example, the instruction PUSH X pushes the word at address X onto the top of the stack. The operation-type instructions do not need any address field. For example, the instruction
ADD
consists of only an opcode and no address field. It has the effect of popping the top two numbers from the stack, adding them, and pushing the sum onto the stack. Thus all the operands are implied to be in the stack.

INSTRUCTION FORMAT
The instruction format defines the layout of the bits of an instruction, i.e., how they are divided into fields such as the operation code and the addresses. The length of an instruction is generally kept a multiple of the character length, which is 8 bits. The instruction format determines the behaviour and complexity of the instruction. The format of an instruction is of variable length, depending upon the number of addresses.

Types Of Instruction Format
Types of instruction formats are :

1. Zero(0) Address Instruction format

• The instruction format in which there is no address field is called the zero address instruction format. In the zero address instruction format, a stack is used.
• In the zero address instruction format, arithmetic instructions name no operand; the operands are implied to be on top of the stack.
• Expression: X = (A+B)*(C+D)
• Postfix: X = AB+CD+*
• TOP means top of stack
• M[X] is any memory location

PUSH A      TOP = A
PUSH B      TOP = B
ADD         TOP = A+B
PUSH C      TOP = C
PUSH D      TOP = D
ADD         TOP = C+D
MUL         TOP = (C+D)*(A+B)
POP X       M[X] = TOP
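The zero address program above can be traced with a tiny stack-machine interpreter. This is a minimal sketch: the function name and the use of a Python dictionary as memory are assumptions for illustration, and only the four instructions used in the example are supported.

```python
def run_stack_program(program, memory):
    """Execute zero address style code and return the updated memory."""
    stack = []
    for line in program:
        op, *args = line.split()
        if op == "PUSH":             # PUSH X: TOP = M[X]
            stack.append(memory[args[0]])
        elif op == "POP":            # POP X: M[X] = TOP
            memory[args[0]] = stack.pop()
        elif op in ("ADD", "MUL"):   # operands are implied: the top two stack items
            right, left = stack.pop(), stack.pop()
            stack.append(left + right if op == "ADD" else left * right)
        else:
            raise ValueError(f"unknown instruction: {line}")
    return memory


program = ["PUSH A", "PUSH B", "ADD", "PUSH C", "PUSH D", "ADD", "MUL", "POP X"]
memory = run_stack_program(program, {"A": 2, "B": 3, "C": 4, "D": 5})
print(memory["X"])  # (2+3)*(4+5) = 45
```

Note that only PUSH and POP carry an address; ADD and MUL consist of an opcode alone.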

2. One(1) Address Instruction format

• The instruction format in which the instruction uses only one address field is called the one address instruction format.
• In this type of instruction format, one operand is in the accumulator and the other is in a memory location.
• Each instruction names only one operand explicitly.
• It has two special instructions, LOAD and STORE, for moving data between memory and the accumulator.
• Expression: X = (A+B)*(C+D)
• AC is the accumulator
• M[] is any memory location
• M[T] is a temporary location

LOAD A      AC = M[A]
ADD B       AC = AC + M[B]
STORE T     M[T] = AC
LOAD C      AC = M[C]
ADD D       AC = AC + M[D]
MUL T       AC = AC * M[T]
STORE X     M[X] = AC
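The one address program can be checked the same way with a single-accumulator interpreter. Again this is only a sketch with a hypothetical function name; it supports just LOAD, STORE, ADD and MUL.

```python
def run_accumulator_program(program, memory):
    """Execute one address code on an implied accumulator AC."""
    ac = 0
    for line in program:
        op, address = line.split()
        if op == "LOAD":      # AC = M[address]
            ac = memory[address]
        elif op == "STORE":   # M[address] = AC
            memory[address] = ac
        elif op == "ADD":     # AC = AC + M[address]
            ac += memory[address]
        elif op == "MUL":     # AC = AC * M[address]
            ac *= memory[address]
        else:
            raise ValueError(f"unknown instruction: {line}")
    return memory


program = ["LOAD A", "ADD B", "STORE T", "LOAD C", "ADD D", "MUL T", "STORE X"]
memory = run_accumulator_program(program, {"A": 2, "B": 3, "C": 4, "D": 5})
print(memory["T"], memory["X"])  # T holds A+B = 5, X holds 45
```

The temporary location T is needed because the accumulator is the only place a result can be held.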

3. Two(2) Address Instruction format

• The instruction format in which the instruction uses two address fields is called the two address instruction format.
• This type of instruction format is the most commonly used instruction format.
• In the one address instruction format the result is stored in the accumulator only, but in the two address instruction format the result can be stored in different locations.
• This type of instruction format has two operands.
• It results in shorter assembly language programs.
• Expression: X = (A+B)*(C+D)
• R1, R2 are registers
• M[] is any memory location

MOV R1, A      R1 = M[A]
ADD R1, B      R1 = R1 + M[B]
MOV R2, C      R2 = M[C]
ADD R2, D      R2 = R2 + M[D]
MUL R1, R2     R1 = R1 * R2
MOV X, R1      M[X] = R1

4. Three(3) Address Instruction format

• The instruction format in which the instruction uses three address fields is called the three address instruction format.
• It has three operands.
• It results in shorter assembly language programs.
• Each instruction requires more bits, since three addresses must be specified.
• Expression: X = (A+B)*(C+D)
• R1, R2 are registers
• M[] is any memory location

ADD R1, A, B       R1 = M[A] + M[B]
ADD R2, C, D       R2 = M[C] + M[D]
MUL X, R1, R2      M[X] = R1 * R2
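To compare the two address and three address programs, the sketch below runs both on one small register-machine interpreter. The function name is hypothetical, and the rule that any operand written as R followed by digits is a register (everything else being a memory location) is an assumption made for this example.

```python
import re


def run_register_program(program, memory):
    """Execute two or three address code; returns the updated memory."""
    regs = {}

    def is_register(name):
        return re.fullmatch(r"R\d+", name) is not None

    def read(name):
        return regs[name] if is_register(name) else memory[name]

    def write(name, value):
        if is_register(name):
            regs[name] = value
        else:
            memory[name] = value

    operations = {"ADD": lambda a, b: a + b, "MUL": lambda a, b: a * b}
    for line in program:
        op, rest = line.split(maxsplit=1)
        operands = [part.strip() for part in rest.split(",")]
        if op == "MOV":
            write(operands[0], read(operands[1]))
        elif op in operations:
            # With two operands the destination is also the first source.
            result = operations[op](read(operands[-2]), read(operands[-1]))
            write(operands[0], result)
        else:
            raise ValueError(f"unknown instruction: {line}")
    return memory


two_address = ["MOV R1, A", "ADD R1, B", "MOV R2, C", "ADD R2, D", "MUL R1, R2", "MOV X, R1"]
three_address = ["ADD R1, A, B", "ADD R2, C, D", "MUL X, R1, R2"]
values = {"A": 2, "B": 3, "C": 4, "D": 5}
print(run_register_program(two_address, dict(values))["X"])    # 45, six instructions
print(run_register_program(three_address, dict(values))["X"])  # 45, three instructions
```

Both programs compute the same value; the three address version is half as long but each of its instructions must encode one more address.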

ADDRESSING MODES:
The term addressing mode refers to the way in which the operand of an instruction is specified. The addressing mode specifies a rule for interpreting or modifying the address field of the instruction before the operand is actually referenced.
The operands of an instruction can be located either in main memory or in the CPU registers. If the operand is in main memory, the instruction provides its address in the operand field. Many methods are used to specify the operand address, and these different methods are known as addressing modes.

Types of Addressing Modes


1. Implied / Implicit Addressing Mode
2. Stack Addressing Mode
3. Immediate Addressing Mode
4. Direct Addressing Mode
5. Indirect Addressing Mode
6. Register Direct Addressing Mode
7. Register Indirect Addressing Mode
8. Relative Addressing Mode
9. Indexed Addressing Mode
10. Base Register Addressing Mode
11. Auto-Increment Addressing Mode
12. Auto-Decrement Addressing Mode

1. Implied Addressing Mode-

In this addressing mode,

• The definition of the instruction itself specifies the operands implicitly.
• It is also called implicit addressing mode.
Examples-
• The instruction “Complement Accumulator” is an implied mode instruction.
• In a stack-organized computer, zero address instructions are implied mode instructions
(since the operands are always implied to be on top of the stack).

2. Stack Addressing Mode-

In this addressing mode,

• The operand is contained at the top of the stack.
Example-
ADD
• This instruction pops the two operands at the top of the stack.
• The two operands are added.
• The result of the addition is pushed back onto the top of the stack.

3. Immediate Addressing Mode-


In this addressing mode,
• The operand is specified in the instruction explicitly.
• Instead of an address field, an operand field is present.

Examples-
• ADD 10 will increment the value stored in the accumulator by 10.
• MOV R #20 initializes register R to a constant value 20.

4. Direct Addressing Mode-
In this addressing mode,
• The address field of the instruction contains the effective address of the operand.
• Only one reference to memory is required to fetch the operand.
• It is also called absolute addressing mode.

Example-

• ADD X will increment the value stored in the accumulator by the value stored at memory location X.
AC ← AC + [X]

5. Indirect Addressing Mode-


In this addressing mode,
• The address field of the instruction specifies the address of the memory location that contains the effective address of the operand.
• Two references to memory are required to fetch the operand.

Example-
• ADD X will increment the value stored in the accumulator by the value stored at the memory location whose address is held at memory location X.
AC ← AC + [[X]]

6. Register Direct Addressing Mode-

In this addressing mode,

• The operand is contained in one of the CPU registers.
• The address field of the instruction refers to a CPU register that contains the operand.
• No reference to memory is required to fetch the operand.

Example-
• ADD R will increment the value stored in the accumulator by the content of register R.
AC ← AC + [R]

• This addressing mode is similar to direct addressing mode.
• The only difference is that the address field of the instruction refers to a CPU register instead of main memory.

7. Register Indirect Addressing Mode-


In this addressing mode,
• The address field of the instruction refers to a CPU register that contains the effective address of the operand.
• Only one reference to memory is required to fetch the operand.

Example-
• ADD R will increment the value stored in the accumulator by the content of the memory location whose address is held in register R.
AC ← AC + [[R]]

NOTE-

• This addressing mode is similar to indirect addressing mode.
• The only difference is that the address field of the instruction refers to a CPU register instead of a memory location.

8. Relative Addressing Mode-

In this addressing mode,

• The effective address of the operand is obtained by adding the content of the program counter to the address part of the instruction.

Effective Address = Content of Program Counter + Address part of the instruction

NOTE-

• The program counter (PC) always contains the address of the next instruction to be executed.
• After an instruction is fetched, the value of the program counter is immediately incremented.
• The value is incremented irrespective of whether the fetched instruction has completely executed or not.

9. Indexed Addressing Mode-

In this addressing mode,

• The effective address of the operand is obtained by adding the content of the index register to the address part of the instruction.

Effective Address = Content of Index Register + Address part of the instruction

10. Base Register Addressing Mode-
In this addressing mode,
• The effective address of the operand is obtained by adding the content of the base register to the address part of the instruction.

Effective Address = Content of Base Register + Address part of the instruction
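The effective address rules for the memory-referencing modes described so far can be collected into one function. This is a minimal sketch: the mode names, the register names PC, XR, BR and R1, and the sample values are all assumptions chosen for illustration.

```python
def effective_address(mode, address=0, memory=None, regs=None):
    """Compute the effective address of the operand for a given addressing mode."""
    memory = memory or {}
    regs = regs or {}
    if mode == "direct":             # address field is the effective address
        return address
    if mode == "indirect":           # address field points to the effective address
        return memory[address]
    if mode == "register_indirect":  # a register holds the effective address
        return regs["R1"]
    if mode == "relative":           # PC + address part
        return regs["PC"] + address
    if mode == "indexed":            # index register + address part
        return regs["XR"] + address
    if mode == "base":               # base register + address part
        return regs["BR"] + address
    raise ValueError(f"unknown addressing mode: {mode}")


memory = {500: 800}
regs = {"PC": 201, "XR": 100, "BR": 1000, "R1": 400}
for mode in ("direct", "indirect", "register_indirect", "relative", "indexed", "base"):
    print(mode, effective_address(mode, address=500, memory=memory, regs=regs))
```

Implied, immediate and register direct modes are left out because they do not produce a memory address: the operand is implicit, inside the instruction, or in a register.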

11. Auto-Increment Addressing Mode-

• This addressing mode is a special case of Register Indirect Addressing Mode where-

Effective Address of the Operand = Content of Register

In this addressing mode,

• After accessing the operand, the content of the register is automatically incremented by step size ‘d’.
• Step size ‘d’ depends on the size of the operand accessed.
• Only one reference to memory is required to fetch the operand.

Example-

Assume operand size = 2 bytes.

Here,
• After fetching the operand 6B, the register RAUTO will be automatically incremented by 2.
• Then, the updated value of RAUTO will be 3300 + 2 = 3302.
• At memory address 3302, the next operand will be found.

NOTE-

In auto-increment addressing mode,

• First, the operand value is fetched.
• Then, the value of the register RAUTO is incremented by step size ‘d’.

12. Auto-Decrement Addressing Mode-

• This addressing mode is again a special case of Register Indirect Addressing Mode where-

Effective Address of the Operand = Content of Register – Step Size

In this addressing mode,

• First, the content of the register is decremented by step size ‘d’.
• Step size ‘d’ depends on the size of the operand accessed.
• After decrementing, the operand is read.
• Only one reference to memory is required to fetch the operand.

Example-

Assume operand size = 2 bytes.

Here,
• First, the register RAUTO will be decremented by 2.
• Then, the updated value of RAUTO will be 3302 – 2 = 3300.
• At memory address 3300, the operand will be found.
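The difference in ordering between the two modes (fetch then increment, versus decrement then fetch) can be sketched as follows. The register name RAUTO, the addresses 3300 and 3302 and the operand 6B come from the examples above; the value stored at 3302 and the function names are made up for illustration.

```python
def auto_increment(regs, memory, reg, step):
    """Fetch the operand first, then add the step size to the register."""
    operand = memory[regs[reg]]
    regs[reg] += step
    return operand


def auto_decrement(regs, memory, reg, step):
    """Subtract the step size from the register first, then fetch the operand."""
    regs[reg] -= step
    return memory[regs[reg]]


memory = {3300: 0x6B, 3302: 0x2C}  # the value at 3302 is an arbitrary placeholder
regs = {"RAUTO": 3300}

print(hex(auto_increment(regs, memory, "RAUTO", 2)), regs["RAUTO"])  # operand 6B, RAUTO becomes 3302
print(hex(auto_decrement(regs, memory, "RAUTO", 2)), regs["RAUTO"])  # RAUTO back to 3300, operand 6B
```

Used as a pair, the two modes behave like pop and push on a stack addressed through the register.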

Data Transfer Instructions


Data transfer instructions transfer data between memory and processor registers, between processor registers and I/O devices, and from one processor register to another. There are eight commonly used data transfer instructions. Each instruction is represented by a mnemonic symbol. The table shows the eight data transfer instructions and their respective mnemonic symbols.
Data Transfer Instructions

Name        Mnemonic
Load        LD
Store       ST
Move        MOV
Exchange    XCH
Input       IN
Output      OUT
Push        PUSH
Pop         POP
The instructions can be described as follows −

• Load − The load instruction transfers data from memory to a processor register, which is usually an accumulator.
• Store − The store instruction transfers data from a processor register to memory.
• Move − The move instruction transfers data from a processor register to memory, from memory to a processor register, or between processor registers.
• Exchange − The exchange instruction swaps information either between two registers or between a register and a memory word.
• Input − The input instruction transfers data between the processor register and the input terminal.
• Output − The output instruction transfers data between the processor register and the output terminal.
• Push and Pop − The push and pop instructions transfer data between a processor register and the memory stack.

Data manipulation
Data manipulation instructions are those instructions that manipulate or change the content of data, registers or memory. They perform operations on data and provide the computational capabilities of the computer.
Data manipulation instructions can be categorized into three parts:
1) Arithmetic instruction
2) Logical and bit manipulation instructions
3) Shift instructions

Arithmetic Instruction
Arithmetic instructions include increment, decrement, add, subtract, multiply, divide, add with carry, subtract with borrow, and negate (two's complement). Negate replaces a number with its two's complement, i.e., its negative.
The table given below shows the Arithmetic Instructions:
Name                    Mnemonic
Increment               INC
Decrement               DEC
Add                     ADD
Subtract                SUB
Multiply                MUL
Divide                  DIV
Add with carry          ADDC
Subtract with borrow    SUBB
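A typical use of add with carry is adding numbers wider than the machine word. The sketch below assumes an 8-bit word and adds two 16-bit values as an ADD on the low bytes followed by an ADDC on the high bytes; it also shows negate as a two's complement. The function names and the 8-bit width are assumptions for illustration.

```python
def add8(a, b, carry_in=0):
    """8-bit add: returns (result byte, carry out)."""
    total = (a & 0xFF) + (b & 0xFF) + carry_in
    return total & 0xFF, total >> 8


def add16(x, y):
    """16-bit add built from two 8-bit adds, as ADD followed by ADDC."""
    low, carry = add8(x & 0xFF, y & 0xFF)              # ADD: low bytes, no carry in
    high, carry = add8(x >> 8, y >> 8, carry)          # ADDC: high bytes plus carry
    return (high << 8) | low, carry


def negate8(a):
    """Two's complement of an 8-bit value: invert the bits and add one."""
    return (~a + 1) & 0xFF


print(hex(add16(0x01FF, 0x0001)[0]))  # the carry out of the low byte reaches the high byte
print(hex(negate8(5)))                # bit pattern that represents -5 in 8 bits
```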

Logical Instruction

The logical and bit manipulation instructions are clear (clear the content of the accumulator), complement (complement the accumulator), AND, OR, Exclusive-OR, clear carry, set carry, complement carry, enable interrupts, and disable interrupts.
These logical instructions consider each operand bit individually and treat it as a Boolean variable. Basically, logical instructions perform binary operations on strings of bits stored in registers.

Name                 Mnemonic
Clear                CLR
Complement           COM
AND                  AND
OR                   OR
Exclusive-OR         XOR
Clear carry          CLRC
Set carry            SETC
Complement carry     COMC
Enable interrupt     EI
Disable interrupt    DI

Shift Instructions
There are basically two types of shift instructions — arithmetic and logical. Arithmetic shifts
consider the contents of the memory byte or register to be a signed number. So, when the shift is
made, the number is arithmetically divided by two (right shift) or multiplied by two (left shift).
Logical shifts consider the contents of the register or memory byte to be just a bit pattern when the
shift is made.

Name                          Mnemonic
Logical Shift Right           SHR
Logical Shift Left            SHL
Arithmetic Shift Right        SHRA
Arithmetic Shift Left         SHLA
Rotate Right                  ROR
Rotate Left                   ROL
Rotate Right through carry    RORC
Rotate Left through carry     ROLC
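The difference between logical shifts, arithmetic shifts and rotates is easiest to see on a concrete bit pattern. The sketch below assumes 8-bit registers; the function names mirror the mnemonics in the table but are otherwise hypothetical, and the rotate-through-carry variants are omitted for brevity.

```python
def shr(x):
    """Logical shift right: a 0 enters at the most significant bit."""
    return (x & 0xFF) >> 1


def shl(x):
    """Logical shift left: a 0 enters at the least significant bit."""
    return (x << 1) & 0xFF


def shra(x):
    """Arithmetic shift right: the sign bit is kept, so a signed value is halved."""
    return ((x & 0xFF) >> 1) | (x & 0x80)


def ror(x):
    """Rotate right: the bit leaving at the right re-enters at the left."""
    return ((x >> 1) | ((x & 1) << 7)) & 0xFF


def rol(x):
    """Rotate left: the bit leaving at the left re-enters at the right."""
    return ((x << 1) | (x >> 7)) & 0xFF


def to_signed(x):
    """Interpret an 8-bit pattern as a two's complement number."""
    return x - 256 if x & 0x80 else x


value = 0xF0  # -16 as a signed byte
print(to_signed(value), to_signed(shra(value)))  # arithmetic shift keeps the sign: -16 -> -8
print(hex(shr(value)))                           # logical shift treats it as a bit pattern
print(hex(ror(0x01)), hex(rol(0x80)))            # rotates lose no bits
```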

Program Control Instructions

Program control instructions are machine instructions, written directly in machine code or in assembly language, that direct the processor to change or control the flow of execution. These instructions are of various types. In a high-level language, the user's code is translated into machine code, and these instructions are generated to tell the processor what to do.
Types of Program Control Instructions:
There are different types of Program Control Instructions:
1. Compare Instruction:
Compare instruction is specifically provided, which is similar to a subtract instruction
except the result is not stored anywhere, but flags are set according to the result.
2. Unconditional Branch Instruction:
It causes an unconditional change of the execution sequence to a new location.
Example:
JUMP L2
Mov R3, R1 goto L2
3. Conditional Branch Instruction:
A conditional branch instruction is used to examine the values stored in the condition
code register to determine whether the specific condition exists and to branch if it does.

Example:
Assembly Code : BE R1, R2, L1
Compiler allocates R1 for x and R2 for y
High Level Code: if (x==y) goto L1;
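How a compare instruction and a conditional branch work together can be sketched as below. The flag names Z, N and C, the 8-bit width and the function names are assumptions for illustration; the compare performs a subtraction only to set flags, and the branch inspects those flags to choose the next program counter value.

```python
def compare(a, b, bits=8):
    """Subtract b from a, discard the result, and return the condition flags."""
    mask = (1 << bits) - 1
    difference = (a - b) & mask
    return {
        "Z": int(difference == 0),           # zero: the operands are equal
        "N": difference >> (bits - 1),       # negative: sign bit of the result
        "C": int((a & mask) < (b & mask)),   # borrow was needed (unsigned a < b)
    }


def branch_if_equal(pc, target, flags):
    """Return the next PC: the target if Z is set, otherwise the next instruction."""
    return target if flags["Z"] else pc + 1


x, y = 7, 7
print(branch_if_equal(pc=100, target=200, flags=compare(x, y)))  # x == y, so the branch is taken
print(branch_if_equal(pc=100, target=200, flags=compare(3, 5)))  # not equal, execution falls through
```

A single instruction such as BE R1, R2, L1 combines both steps; machines with a condition code register perform them as two instructions.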
4. Subroutines:
A subroutine is a program fragment that lives in user space and performs a well-defined task. It is invoked by another user program and returns control to the calling program when finished.

Example:
CALL and RET
5. Halting Instructions:

• NOP Instruction – NOP is no operation. It causes no change in the processor state other than an advancement of the program counter. It can be used to synchronize timing.

• HALT – It brings the processor to an orderly halt, remaining in an idle state until restarted by interrupt, trace, reset or external action.

6. Interrupt Instructions:
An interrupt is a mechanism by which an I/O device or an instruction can suspend the normal execution of the processor and get itself serviced.

• RESET – It resets the processor. This may include setting any or all registers to an initial value, or setting the program counter to a standard starting location.
• TRAP – It is a non-maskable, edge and level triggered interrupt. TRAP has the highest priority and is a vectored interrupt.
• INTR – It is a level triggered and maskable interrupt. It has the lowest priority. It can be disabled by resetting the processor.

Types of Instruction Set


Generally, there are two types of instruction set used in computers.
Reduced Instruction set Computer (RISC)

A number of computer designers recommended that computers use fewer instructions with
simple constructs so that they can be executed much faster within the CPU without having to use
memory as often. This type of computer is called a Reduced Instruction Set Computer.
The concept of RISC involves an attempt to reduce execution time by simplifying the instruction
set of computers.

Characteristics of RISC
The characteristics of RISC are as follows −
• Relatively few instructions.
• Relatively few addressing modes.
• Memory access limited to load and store instructions.
• All operations done within the registers of the CPU.
• Single-cycle instruction execution.
• Fixed length, easily decoded instruction format.
• Hardwired rather than micro programmed control.
A characteristic of RISC processors is their ability to execute one instruction per clock cycle. This is done by overlapping the fetch, decode and execute phases of two or three instructions, using a procedure referred to as pipelining.

Complex Instruction Set Computer (CISC)

CISC is a computer where a single instruction can perform numerous low-level operations, such as a load from memory and a store to memory. The CISC attempts to minimize the number of instructions per program, but at the cost of an increase in the number of cycles per instruction.
The design of an instruction set for a computer must take into consideration not only machine language constructs but also the requirements imposed by the use of high level programming languages.
The goal of CISC is to attempt to provide a single machine instruction for each statement that is written in a high level language.

Characteristics of CISC
The characteristics of CISC are as follows −
• A large number of instructions, typically from 100 to 250.
• Some instructions that perform specialized tasks and are used infrequently.
• A large variety of addressing modes, typically from 5 to 20 different modes.
• Variable length instruction formats.
• Instructions that manipulate operands in memory.
Example
For performing an ADD operation, CISC will execute a single ADD command which will execute
all the required load and store operations.
RISC will execute each operation for loading data from memory, adding values and storing data
back to memory using different low-level instructions.
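The contrast in this example can be sketched by performing the same addition in both styles and counting the instructions issued. The function names, the register names and the instruction sequences in the comments are assumptions for illustration, not the instruction set of any particular processor.

```python
def cisc_add(memory, destination, source):
    """One memory-to-memory instruction: ADD destination, source."""
    memory[destination] = memory[destination] + memory[source]
    return 1  # instructions issued


def risc_add(memory, destination, source):
    """The same work as separate load, add and store instructions."""
    regs = {}
    regs["R1"] = memory[destination]     # LOAD  R1, destination
    regs["R2"] = memory[source]          # LOAD  R2, source
    regs["R1"] = regs["R1"] + regs["R2"] # ADD   R1, R1, R2 (registers only)
    memory[destination] = regs["R1"]     # STORE destination, R1
    return 4  # instructions issued


cisc_memory = {"X": 10, "Y": 32}
risc_memory = {"X": 10, "Y": 32}
print(cisc_add(cisc_memory, "X", "Y"), cisc_memory["X"])
print(risc_add(risc_memory, "X", "Y"), risc_memory["X"])
```

Both versions leave the same value in memory. The CISC form uses fewer instructions but each takes more cycles, while the RISC form uses more instructions that are each simple enough to pipeline.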

UNIT-III

Peripheral Devices:
The input/output organization of a computer depends upon the size of the computer and the peripherals connected to it. The I/O subsystem of the computer provides an efficient mode of communication between the central system and the outside environment.

The most common input output devices are:

i) Monitor

ii) Keyboard

iii) Mouse

iv) Printer

v) Magnetic tapes

The devices that are under the direct control of the computer are said to be connected
online.

Input - Output Interface


Input Output Interface provides a method for transferring information between internal
storage and external I/O devices.

Peripherals connected to a computer need special communication links for interfacing them
with the central processing unit.

The purpose of communication link is to resolve the differences that exist between the
central computer and each peripheral.

The Major Differences are:-

1. Peripherals are electromechanical and electromagnetic devices, whereas the CPU and
memory are electronic devices. Therefore, a conversion of signal values may be
needed.

2. The data transfer rate of peripherals is usually slower than the transfer rate of CPU
and consequently, a synchronization mechanism may be needed.

3. Data codes and formats in the peripherals differ from the word format in the CPU and
memory.
4. The operating modes of peripherals are different from each other and must be
controlled so as not to disturb the operation of other peripherals connected to
the CPU.

To resolve these differences, computer systems include special hardware components
between the CPU and the peripherals to supervise and synchronize all input and output transfers.

 These components are called Interface Units because they interface between the
processor bus and the peripheral devices.

I/O BUS and Interface Module

The I/O bus defines the typical link between the processor and several peripherals.

The I/O bus consists of data lines, address lines and control lines. The I/O bus from the
processor is attached to all peripheral interfaces.

To communicate with a particular device, the processor places a device address on address
lines.

Each Interface decodes the address and control received from the I/O bus, interprets them for
peripherals and provides signals for the peripheral controller.

It also synchronizes the data flow and supervises the transfer between the peripheral and the
processor.

Each peripheral has its own controller.

For example, the printer controller controls the paper motion and the print timing.

The signals on the control lines are referred to as I/O commands. The commands are as follows:

Control command- A control command is issued to activate the peripheral and to inform it
what to do.

Status command- A status command is used to test various status conditions in the interface
and the peripheral.

Data Output command- A data output command causes the interface to respond by
transferring data from the bus into one of its registers.

Data Input command- The data input command is the opposite of the data output command.
In this case the interface receives an item of data from the peripheral and places it in its
buffer register.

I/O Versus Memory Bus

In addition to communicating with I/O, the processor must communicate with the memory unit. Like the
I/O bus, the memory bus contains data, address and read/write control lines. There are three ways
that computer buses can be used to communicate with memory and I/O:

i. Use two separate buses, one for memory and the other for I/O.

ii. Use one common bus for both memory and I/O but separate control lines for each.

iii. Use one common bus for memory and I/O with common control lines.

I/O Processor

In the first method, the computer has independent sets of data, address and control buses,
one for accessing memory and the other for I/O. This is done in computers that provide a
separate I/O processor (IOP). The purpose of the IOP is to provide an independent pathway for
the transfer of information between external devices and internal memory.

Asynchronous Data Transfer :


This scheme is used when the speed of the I/O devices does not match that of the microprocessor
and the timing characteristics of the I/O devices are not predictable. In this method, the processor
initiates the device and checks its status. As a result, the CPU has to wait until the I/O device is
ready to transfer data. When the device is ready, the CPU issues the instruction for the I/O transfer.
Two techniques are used, depending on the control signals exchanged before the data transfer:

i. Strobe Control

ii. Handshaking

Strobe Signal :
The strobe control method of Asynchronous data transfer employs a single control line to
time each transfer. The strobe may be activated by either the source or the destination unit.

Data Transfer Initiated by Source Unit:

In the block diagram fig. (a), the data bus carries the binary information from source to
destination unit. Typically, the bus has multiple lines to transfer an entire byte or word. The
strobe is a single line that informs the destination unit when a valid data word is available.

As the timing diagram fig. (b) shows, the source unit first places the data on the data
bus and then activates the strobe. The information on the data bus and the strobe signal remain
in the active state long enough to allow the destination unit to receive the data.

Data Transfer Initiated by Destination Unit:

In this method, the destination unit activates the strobe pulse, informing the source to
provide the data. The source responds by placing the requested binary information on the
data bus.

The data must be valid and remain on the bus long enough for the destination
unit to accept it. Once the data have been accepted, the destination unit disables the strobe
and the source unit removes the data from the bus.

Disadvantage of Strobe Signal :

The disadvantage of the strobe method is that a source unit that initiates the transfer has no way
of knowing whether the destination unit has actually received the data item that was placed on
the bus. Similarly, a destination unit that initiates the transfer has no way of knowing whether
the source unit has actually placed the data on the bus. The handshaking method solves this
problem.

Handshaking:

The handshaking method solves the problem of strobe method by introducing a second
control signal that provides a reply to the unit that initiates the transfer.

Principle of Handshaking:

The basic principle of the two-wire handshaking method of data transfer is as follows:

One control line is in the same direction as the data flow in the bus, from the source to the
destination. It is used by the source unit to inform the destination unit whether there are valid
data on the bus. The other control line is in the other direction, from the destination to the
source. It is used by the destination unit to inform the source whether it can accept data. The
sequence of control during the transfer depends on the unit that initiates the transfer.

Source Initiated Transfer using Handshaking:

The sequence of events shows four possible states that the system can be in at any given time.
The source unit initiates the transfer by placing the data on the bus and enabling its data valid
signal. The data accepted signal is activated by the destination unit after it accepts the data
from the bus. The source unit then disables its data valid signal, which invalidates the data on
the bus. The destination unit then disables its data accepted signal and the system goes into
its initial state.
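
The source-initiated sequence can be traced with a minimal Python sketch. The function name and the representation of the two control lines as a (data valid, data accepted) pair are assumptions for illustration; real hardware performs these steps with signal levels, not function calls.

```python
def source_initiated_handshake(word):
    """Return the received word and the trace of (data_valid, data_accepted) states."""
    bus = None
    data_valid = 0
    data_accepted = 0
    trace = [(data_valid, data_accepted)]      # initial state: 0, 0

    # Source places the data on the bus and enables data valid.
    bus = word
    data_valid = 1
    trace.append((data_valid, data_accepted))  # state: 1, 0

    # Destination accepts the data from the bus and enables data accepted.
    received = bus
    data_accepted = 1
    trace.append((data_valid, data_accepted))  # state: 1, 1

    # Source disables data valid; the bus no longer carries valid data.
    data_valid = 0
    bus = None
    trace.append((data_valid, data_accepted))  # state: 0, 1

    # Destination disables data accepted; the system is back in its initial state.
    data_accepted = 0
    trace.append((data_valid, data_accepted))  # state: 0, 0
    return received, trace

received, trace = source_initiated_handshake(0x5A)
```

The trace passes through four distinct states before it returns to the initial one, which matches the four states described above.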

Destination Initiated Transfer Using Handshaking:

The name of the signal generated by the destination unit has been changed to ready for data
to reflect its new meaning. The source unit in this case does not place data on the bus until
after it receives the ready for data signal from the destination unit. From there on, the
handshaking procedure follows the same pattern as in the source-initiated case.

The only difference between the source-initiated and the destination-initiated transfer is in
their choice of initial state.

Advantage of the Handshaking method:

 The handshaking scheme provides a high degree of flexibility and reliability because
the successful completion of a data transfer relies on active participation by both
units.

 If either unit is faulty, the data transfer will not be completed. Such an error
can be detected by means of a timeout mechanism, which produces an alarm if the
data transfer is not completed within a predetermined time.

Asynchronous Serial Transmission:


The transfer of data between two units may be parallel or serial. In parallel data transmission, the n
bits of the message are transmitted simultaneously through n separate conductor paths. In serial
transmission, each bit in the message is sent in sequence, one at a time.

Parallel transmission is faster but requires many wires. It is used for short distances and
where speed is important. Serial transmission is slower but less expensive.

In asynchronous serial transfer, the bits of a message are sent one at a time, and binary
information is transferred only when it is available. When there is no information to be
transferred, the line remains idle.

In this technique each character consists of three parts:

i. Start bit

ii. Character bits

iii. Stop bit

i. Start Bit- The first bit, called the start bit, is always 0 and is used to indicate the
beginning of a character.

ii. Stop Bit- The last bit, called the stop bit, is always 1 and is used to indicate the end of
a character. The stop bit frames the end of the character and returns the line to the
1-state, which signifies the idle or wait state.

iii. Character Bits- The bits between the start bit and the stop bit are known as
character bits. The character bits always follow the start bit.
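
A minimal Python sketch of this framing follows. The function names are hypothetical, and two details are assumptions that the text above does not fix: eight character bits per frame and transmission of the least significant bit first.

```python
def frame_character(byte, data_bits=8):
    """Frame a character as: start bit (0), character bits (LSB first), stop bit (1)."""
    bits = [(byte >> i) & 1 for i in range(data_bits)]
    return [0] + bits + [1]

def unframe_character(frame):
    """Recover the character; raise ValueError if the start or stop bit is wrong."""
    if frame[0] != 0 or frame[-1] != 1:
        raise ValueError("framing error")
    value = 0
    for i, bit in enumerate(frame[1:-1]):
        value |= bit << i
    return value

frame = frame_character(ord("A"))   # ord("A") is 65, binary 01000001
```

The receiver detects the start of a character by the transition from the idle 1-state to the start bit 0, which is why the stop bit must return the line to 1.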

Asynchronous serial transmission is done in two ways:

a) Asynchronous Communication Interface

b) First In First out Buffer

Asynchronous Communication Interface:

It works as both a receiver and a transmitter. Its operation is initialized by the CPU by sending a
byte to the control register.

The transmitter register accepts a data byte from the CPU through the data bus; the byte is then
transferred to a shift register for serial transmission.

The receiver portion receives information into another shift register, and when a
complete data byte has been received it is transferred to the receiver register.

The CPU can select the receiver register to read the byte through the data bus. The bits in the
status register are used as input and output flags.

First In First Out Buffer (FIFO):

A First In First Out (FIFO) buffer is a memory unit that stores information in such a manner
that the item first in is the item first out. A FIFO buffer comes with separate input and output
terminals. The important feature of this buffer is that it can input data and output data at two
different rates.

When placed between two units, the FIFO can accept data from the source unit at one rate
of transfer and deliver the data to the destination unit at another rate.

If the source is faster than the destination, the FIFO is useful when the source data arrive in
bursts that fill the buffer, provided the time between bursts is long enough for the destination
to empty it. A FIFO is useful in some applications when data are transferred
asynchronously.
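
The behaviour can be sketched in Python with `collections.deque`. The class name, the method names `insert` and `delete`, and the capacity of four items are assumptions chosen for illustration.

```python
from collections import deque

class FifoBuffer:
    """Fixed-capacity FIFO with separate input and output operations."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = deque()

    def insert(self, item):
        """Source side: returns False when the buffer is full."""
        if len(self.items) >= self.capacity:
            return False
        self.items.append(item)
        return True

    def delete(self):
        """Destination side: returns the oldest item, or None when the buffer is empty."""
        return self.items.popleft() if self.items else None

fifo = FifoBuffer(capacity=4)
accepted = [fifo.insert(x) for x in "ABCDE"]   # the source sends a burst of five items
delivered = [fifo.delete(), fifo.delete()]     # the slower destination removes two
```

The fifth item of the burst is refused because the buffer is full, which is the situation a real source would detect through an "input ready" style flag before sending.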

Modes of Data Transfer :

Transfer of data is required between the CPU and peripherals or memory, or sometimes between
any two devices or units of the computer system. To transfer data from one unit to
another, both units must be properly connected, and at the time of the data
transfer the receiving unit must not be busy. Data transfer within the computer is an internal
operation.

All the internal operations in a digital system are synchronized by means of clock pulses
supplied by a common clock pulse Generator. The data transfer can be

i. Synchronous or

ii. Asynchronous

When both the transmitting and receiving units use the same clock pulses, the data transfer
is called synchronous. On the other hand, if there is no common clock and the sender operates
at a different moment than the receiver, the data transfer is called asynchronous.

The data transfer can be handled in various modes. Some of the modes use the CPU as an
intermediate path; others transfer the data directly to and from the memory unit. The transfer
can be handled in the following three ways:

i. Programmed I/O

ii. Interrupt-Initiated I/O

iii. Direct Memory Access (DMA)

Programmed I/O Mode:

In this mode of data transfer the operations are the result of I/O instructions that are
part of the computer program. Each data transfer is initiated by an instruction in the program.
Normally the transfer is from a CPU register to a peripheral device or vice versa.

Once the data transfer is initiated, the CPU starts monitoring the interface to see when the next
transfer can be made. The instructions of the program keep close tabs on everything that takes
place in the interface unit and the I/O devices.

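⚫ The transfer of each item of data requires three instructions: read the status register, check
the flag bit and branch back to the first step if it is not set, and read the data register.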

In this technique the CPU is responsible for fetching data from memory for output
and storing data in memory for input, as shown in the flowchart.

Drawback of the Programmed I/O :

The main drawback of programmed I/O is that the CPU has to monitor the units all
the time while the program is executing. Thus the CPU stays in a program loop until the I/O
unit indicates that it is ready for data transfer. This is a time-consuming process, and a lot of
CPU time is wasted in checking the flag.

To remove this problem, an interrupt facility and special commands are used.
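
The busy-wait loop that programmed I/O relies on can be sketched in Python. `SimulatedInterface` is a hypothetical stand-in for a device interface whose flag becomes 1 only after a fixed number of status reads; the names and the delay value are assumptions for illustration.

```python
class SimulatedInterface:
    """Hypothetical interface: the flag reads 1 only after `delay` status reads."""

    def __init__(self, data, delay):
        self.data = list(data)
        self.delay = delay
        self.countdown = delay

    def read_status(self):
        if not self.data:
            return 0
        if self.countdown > 0:
            self.countdown -= 1
            return 0
        return 1

    def read_data(self):
        self.countdown = self.delay   # the flag is cleared until the next item is ready
        return self.data.pop(0)

def programmed_input(interface, count):
    """Transfer `count` items while counting the polls that found the flag clear."""
    memory, wasted_polls = [], 0
    for _ in range(count):
        while interface.read_status() == 0:   # the CPU stays in this loop
            wasted_polls += 1
        memory.append(interface.read_data())  # read the data register, store in memory
    return memory, wasted_polls

memory, wasted_polls = programmed_input(SimulatedInterface(b"OK", delay=3), 2)
```

Every iteration counted in `wasted_polls` is CPU time spent only on checking the flag, which is the cost that interrupt-initiated I/O is meant to avoid.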

Interrupt-Initiated I/O :

In this method the interrupt facility and an interrupt command are used to inform the device about
the start and end of the transfer. In the meantime the CPU executes other programs. When the
interface determines that the device is ready for data transfer, it generates an interrupt request
and sends it to the computer.

When the CPU receives such a signal, it temporarily stops the execution of the program,
branches to a service program to process the I/O transfer, and after completing it returns
to the task it was originally performing.

⚫ In this type of I/O, the computer does not check the flag. It continues to perform its task.

⚫ Whenever any device wants attention, it sends an interrupt signal to the CPU.

⚫ The CPU then deviates from what it was doing, stores the return address from the PC and
branches to the address of the service subroutine.

⚫ There are two ways of choosing the branch address:

⚫ Vectored Interrupt

⚫ Non-vectored Interrupt

⚫ In a vectored interrupt, the source that interrupts the CPU provides the
branch information. This information is called the interrupt vector.

⚫ In a non-vectored interrupt, the branch address is assigned to a fixed location in
memory.

Priority Interrupt:
⚫ There are number of IO devices attached to the computer.

⚫ They are all capable of generating the interrupt.

⚫ When interrupts are generated by more than one device, a priority interrupt
system is used to determine which device is to be serviced first.

⚫ Devices with high-speed transfer are given higher priority and slow devices are
given lower priority.

⚫ Establishing the priority can be done in two ways:

⚫ Using Software

⚫ Using Hardware

⚫ A polling procedure is used to identify the highest-priority source by software means.

Polling Procedure :
⚫ There is one common branch address for all interrupts.

⚫ The program at the branch address polls the interrupt sources in sequence.
The highest-priority source is tested first.

⚫ The service routine of the highest-priority device that has requested an interrupt is executed.

⚫ The disadvantage is that, when there is a large number of I/O devices, the time required to
poll them can exceed the time available to service them.
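
A minimal Python sketch of the polling procedure is shown here. The function name, the list of request flags, and the three example devices are assumptions for illustration; both lists are ordered from highest to lowest priority.

```python
def poll_interrupt_sources(request_flags, service_routines):
    """Test the sources in priority order and run the first routine whose flag is set.

    Returns (index of the serviced source, result of its routine),
    or (None, None) when no source is requesting.
    """
    for index, requesting in enumerate(request_flags):
        if requesting:
            return index, service_routines[index]()
    return None, None

# Hypothetical service routines, highest priority first.
routines = [lambda: "disk", lambda: "printer", lambda: "keyboard"]
served = poll_interrupt_sources([False, True, True], routines)
```

The printer is serviced before the keyboard even though both are requesting, because it is tested first. The loop also shows the drawback: with many sources, the time spent testing flags grows with the position of the requesting device in the list.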

Using Hardware:

⚫ The hardware priority system functions as an overall manager.

⚫ It accepts interrupt requests and determines the priorities.

⚫ To speed up the operation, each interrupting device has its own interrupt vector.

⚫ No polling is required; all decisions are established by the hardware priority interrupt unit.

⚫ It can be established by a serial or parallel connection of interrupt lines.

Serial or Daisy Chaining Priority:

⚫ The device with the highest priority is placed first.

⚫ A device that wants attention sends an interrupt request to the CPU.

⚫ The CPU then sends the INTACK signal, which is applied to the PI (priority in) input of the
first device.

⚫ If the device had requested attention, it places its VAD (vector address) on the bus and
blocks the signal by placing 0 on PO (priority out).

⚫ If not, it passes the signal to the next device by placing 1 on PO (priority out).

⚫ This process continues until the requesting device is found.

⚫ The device whose PI is 1 and PO is 0 is the device that sent the interrupt request.
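
The propagation of the acknowledge signal through the chain can be sketched in Python. The function name and the example vector addresses are assumptions for illustration; the first list entry is the device closest to the CPU, that is, the one with the highest priority.

```python
def daisy_chain(requests, vector_addresses):
    """Propagate INTACK through the chain.

    Returns (VAD placed on the bus, or None) and the list of (PI, PO) pairs per device.
    """
    pi = 1                       # INTACK from the CPU enters the first device's PI
    vad = None
    lines = []
    for requesting, address in zip(requests, vector_addresses):
        if pi == 1 and requesting:
            po = 0               # block the acknowledge for all later devices
            vad = address        # this device places its vector address on the bus
        else:
            po = pi              # pass PI on unchanged (a blocked 0 stays 0)
        lines.append((pi, po))
        pi = po
    return vad, lines

vad, lines = daisy_chain([False, True, True], [0x40, 0x44, 0x48])
```

In this run the second and third devices both request service, but only the second one has PI = 1 and PO = 0, so its vector address reaches the CPU and the third device must wait.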

Parallel Priority Interrupt :

⚫ It consists of an interrupt register whose bits are set separately by the interrupting devices.

⚫ Priority is established according to the position of the bits in the register.

⚫ A mask register is used to allow higher-priority devices to interrupt while a
lower-priority device is being serviced, or to disable all lower-priority
devices while a higher-priority device is being serviced.

⚫ Each interrupt bit and its corresponding mask bit are ANDed and applied to the priority encoder.

⚫ The priority encoder generates two bits of the vector address.

⚫ Another output from it sets the IST (interrupt status flip-flop).
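
The masking and encoding steps can be sketched in Python for four interrupt sources, which is the case that yields two vector address bits. The function name is hypothetical, and treating bit 0 as the highest priority is an assumption for illustration.

```python
def priority_encode(interrupt_register, mask_register, width=4):
    """AND each interrupt bit with its mask bit, then encode the highest-priority request.

    Bit 0 is assumed to have the highest priority.
    Returns (vector bits as an integer, IST), where IST is 1 if any
    unmasked request is pending and 0 otherwise.
    """
    pending = interrupt_register & mask_register
    for position in range(width):
        if (pending >> position) & 1:
            return position, 1
    return 0, 0

# Sources 1 and 3 are requesting, but the mask enables only sources 2 and 3.
vector, ist = priority_encode(0b1010, 0b1100)
```

Source 1 has the higher priority of the two requests, but its mask bit is 0, so the encoder reports source 3.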

The Execution process of Interrupt–Initiated I/O is represented in the flowchart

Direct Memory Access (DMA):
In Direct Memory Access (DMA) the interface transfers data into and out of the
memory unit through the memory bus. The transfer of data between a fast storage device such
as a magnetic disk and memory is often limited by the speed of the CPU. Removing the CPU
from the path and letting the peripheral device manage the memory buses directly
improves the speed of transfer. This transfer technique is called Direct Memory Access
(DMA).

During the DMA transfer, the CPU is idle and has no control of the memory buses. A DMA
Controller takes over the buses to manage the transfer directly between the I/O device and
memory.

The CPU may be placed in an idle state in a variety of ways. One common method,
extensively used in microprocessors, is to disable the buses through special control signals
such as:

 Bus Request (BR)

 Bus Grant (BG)

These two control signals in the CPU facilitate the DMA transfer. The Bus Request
(BR) input is used by the DMA controller to request that the CPU relinquish control of the buses.
When this input is active, the CPU terminates the execution of the current instruction and places
the address bus, data bus and read/write lines into a high-impedance state. A high-impedance
state means that the output is disconnected.

The CPU activates the Bus Grant (BG) output to inform the external DMA controller that it
can now take control of the buses to conduct memory transfers without the
processor.

When the DMA controller terminates the transfer, it disables the Bus Request (BR) line. The CPU
disables the Bus Grant (BG), takes control of the buses and returns to its normal operation.

The transfer can be made in several ways that are:

i. DMA Burst

ii. Cycle Stealing

i) DMA Burst :- In DMA burst transfer, a block sequence consisting of a number of
memory words is transferred in a continuous burst while the DMA controller is
master of the memory buses.

ii) Cycle Stealing :- Cycle stealing allows the DMA controller to transfer one data word
at a time, after which it must return control of the buses to the CPU.

DMA Controller:

The DMA controller needs the usual circuits of an interface to communicate with the
CPU and I/O device. The DMA controller has three registers:

i. Address Register

ii. Word Count Register

iii. Control Register


i. Address Register :- The address register contains an address to specify the
desired location in memory.

ii. Word Count Register :- The word count (WC) register holds the number of words to be
transferred. The register is decremented by one after each word transfer and internally
tested for zero.

iii. Control Register :- The control register specifies the mode of transfer.

The unit communicates with the CPU via the data bus and control lines. The
registers in the DMA are selected by the CPU through the address bus by enabling the
DS (DMA select) and RS (register select) inputs. The RD (read) and WR (write)
lines are bidirectional.
When the BG (Bus Grant) input is 0, the CPU can communicate
with the DMA registers through the data bus to read from or write to the DMA
registers. When BG = 1, the DMA can communicate directly with the memory by
specifying an address on the address bus and activating the RD or WR control.

DMA Transfer:

The CPU communicates with the DMA through the address and data buses as with
any interface unit. The DMA has its own address, which activates the DS and RS
lines. The CPU initializes the DMA through the data bus. Once the DMA receives the
start control command, it can transfer data between the peripheral and the memory.

When BG = 0, the RD and WR lines are inputs that allow the CPU to
communicate with the internal DMA registers. When BG = 1, the RD and WR lines are
outputs from the DMA controller to the random access memory that specify the
read or write operation for the data.
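
The role of the three registers can be sketched in Python. The class, the function `run_transfer`, the mode string, and the count of CPU bus cycles are assumptions for illustration; the sketch models only transfers from a peripheral into memory and treats the address register as incrementing after each word.

```python
class DmaController:
    """Toy DMA controller with address, word count, and control registers."""

    def __init__(self):
        self.address = 0
        self.word_count = 0
        self.control = None

    def initialize(self, start_address, word_count, mode):
        """CPU side: load the registers through the data bus (done while BG = 0)."""
        self.address = start_address
        self.word_count = word_count
        self.control = mode

    def transfer_word(self, memory, word):
        """One DMA cycle (BG = 1): write a word, advance the address, decrement the count."""
        memory[self.address] = word
        self.address += 1
        self.word_count -= 1
        return self.word_count == 0   # True when the whole block has been transferred

def run_transfer(dma, memory, device_words, burst):
    """Return how many times the CPU regained the buses before the block finished."""
    cpu_cycles = 0
    for word in device_words:
        if dma.transfer_word(memory, word):
            break
        if not burst:
            cpu_cycles += 1           # cycle stealing: the buses return to the CPU
    return cpu_cycles

memory = [0] * 8
dma = DmaController()
dma.initialize(start_address=2, word_count=3, mode="write_to_memory")
cpu_cycles = run_transfer(dma, memory, [10, 20, 30], burst=False)
```

With `burst=True` the same block is moved without the CPU regaining the buses in between, which corresponds to the DMA burst mode described earlier.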
Summary :
 An interface is the point where a connection is made between two different parts
of a system.
 The strobe control method of asynchronous data transfer employs a single
control line to time each transfer.
 The handshaking method solves the problem of the strobe method by introducing
a second control signal that provides a reply to the unit that initiates the
transfer.
 In the programmed I/O mode of data transfer, the operations are the result of
I/O instructions that are part of the computer program.
 In the interrupt-initiated I/O method, an interrupt facility and an interrupt command are
used to inform the device about the start and end of the transfer.
 In Direct Memory Access (DMA) the interface transfers data into and out of
the memory unit through the memory bus.

Input-Output Processor:

⚫ It is a processor with direct memory access capability that communicates with
I/O devices.

⚫ The IOP is similar to a CPU except that it is designed to handle the details of I/O operations.

⚫ Unlike the DMA controller, which is initialized by the CPU, the IOP can fetch and execute its own
instructions.

⚫ IOP instructions are specially designed to handle I/O operations.

⚫ Memory occupies the central position and can communicate with each processor
by DMA.

⚫ The CPU is responsible for processing data.

⚫ The IOP provides the path for the transfer of data between the various peripheral devices
and memory.

⚫ Data formats of peripherals differ from those of the CPU and memory. The IOP handles such
differences.

⚫ Data are transferred from the IOP to memory by stealing one memory cycle.

⚫ Instructions that are read from memory by the IOP are called commands to
distinguish them from instructions that are read by the CPU.

Commands (instructions that are read from memory by an IOP)

» Are so named to distinguish them from instructions that are read by the CPU

» Are prepared by experienced programmers and are stored in memory

» Command words constitute the program for the IOP

MEMORY HIERARCHY
The computer memory hierarchy looks like a pyramid structure, which is used to describe the differences among memory
types. It separates computer storage into levels.
Level 0: CPU registers
Level 1: Cache memory
Level 2: Main memory or primary memory
Level 3: Magnetic disks or secondary memory
Level 4: Optical disks or magnetic tapes or tertiary memory

In the memory hierarchy, the cost per bit rises with speed, while capacity is inversely proportional to speed. The devices
are arranged from fast to slow, that is, from registers to tertiary memory.
Let us discuss each level in detail:
Level-0 − Registers
The registers are present inside the CPU, so they have the least access time. Registers are the
most expensive and smallest memory, generally measured in kilobytes. They are implemented by using flip-flops.
Level-1 − Cache
Cache memory is used to store the segments of a program that are frequently accessed by the processor. It is expensive
and smaller in size generally in Megabytes and is implemented by using static RAM.
Level-2 − Primary or Main Memory
It directly communicates with the CPU and with auxiliary memory devices through an I/O processor. Main memory is less
expensive than cache memory and larger in size generally in Gigabytes. This memory is implemented by using dynamic
RAM.
Level-3 − Secondary storage

Secondary storage devices like Magnetic Disk are present at level 3. They are used as backup storage. They are
cheaper than main memory and larger in size generally in a few TB.
Level-4 − Tertiary storage
Tertiary storage devices like magnetic tape are present at level 4. They are used to store removable files and are the
cheapest and largest in size (1-20 TB).
Let us see the memory levels in terms of size, access time, bandwidth.
Level         Register           Cache              Primary memory     Secondary memory
Bandwidth     4k to 32k MB/sec   800 to 5k MB/sec   400 to 2k MB/sec   4 to 32 MB/sec
Size          Less than 1 KB     Less than 4 MB     Less than 2 GB     Greater than 2 GB
Access time   2 to 5 nsec        3 to 10 nsec       80 to 400 nsec     5 ms
Managed by    Compiler           Hardware           Operating system   OS or user
Why memory Hierarchy is used in systems?
Memory hierarchy is the arrangement of the different kinds of storage present on a computing device based on speed of
access. At the very top, the highest-performing storage is the CPU registers, which are the fastest to read and write. Next
is cache memory, followed by conventional DRAM memory, followed by disk storage with different levels of performance,
including SSD, optical and magnetic disk drives.
To bridge the processor memory performance gap, hardware designers are increasingly relying on memory at the top of
the memory hierarchy to close / reduce the performance gap. This is done through increasingly larger cache hierarchies
(which can be accessed by processors much faster), reducing the dependency on main memory which is slower.

Main Memory

The main memory is the fundamental storage unit in a computer system. It is a relatively large and fast memory that
stores programs and data during computer operation. The technology that makes the main memory work is based
on semiconductor integrated circuits.

RAM is the main memory. Integrated circuit Random Access Memory (RAM) chips are available in
two possible operating modes, as follows −

 Static − It consists of internal flip-flops, which store the binary information. The stored data remain
valid as long as power is applied to the unit. The static RAM is simple to use and has shorter read
and write cycles.
 Dynamic − It stores the binary data in the form of electric charges that are applied to capacitors. The
capacitors are provided inside the chip by Metal Oxide Semiconductor (MOS) transistors. The
stored charge on the capacitors tends to discharge with time, and thus the capacitors must be
periodically recharged by refreshing the dynamic memory.

Random Access Memory


The term Random Access Memory or RAM is typically used to refer to memory that is easily read
from and written to by the microprocessor. For a memory to be called random access, it should be
possible to access any address at any time. This differentiates RAM from storage devices such as
tapes or hard drives where the data is accessed sequentially.
RAM is the main memory of a computer. Its purpose is to store data and applications that are
currently in use. The operating system controls the usage of this memory. It decides
when items are to be loaded into RAM, where they are to be located in RAM, and when they
need to be removed from RAM.

Read-Only Memory
In each computer system, there should be a segment of memory that is fixed and unaffected by
power failure. This type of memory is known as Read-Only Memory or ROM.

SRAM
RAMs made up of circuits that can preserve the information as long as power is supplied are
referred to as Static Random Access Memories (SRAM). Flip-flops form the basic memory elements
in an SRAM device. An SRAM consists of an array of flip-flops, one for each bit, so a large number
of flip-flops is needed to provide higher-capacity memory.
Because of this, simpler flip-flop circuits built from BJT or MOS transistors are used for SRAM.

DRAM
SRAMs are faster but their cost is high because their cells require many transistors. RAMs can be
obtained at a lower cost if simpler cells are used. A MOS storage cell based on capacitors can be
used to replace the SRAM cells. Such a storage cell cannot preserve the charge (that is, data)
indefinitely and must be recharged periodically. Therefore, these cells are called dynamic storage
cells. RAMs using these cells are referred to as Dynamic RAMs or simply DRAMs.

Auxiliary Memory

Auxiliary memory is the lowest-cost, highest-capacity, and slowest-access storage in a computer
system. It is where programs and data are kept for long-term storage or when not in immediate use. The most
typical auxiliary memory devices used in computer systems are magnetic disks and tapes.

Magnetic Disks
A magnetic disk is a circular plate made of metal or plastic coated with magnetizable material. Often both sides of
the disk are used, and multiple disks can be stacked on one spindle with read/write heads available on each surface.
All disks rotate together at high speed and are not stopped or started for access purposes. Bits are stored on the
magnetized surface in spots along concentric circles known as tracks. The tracks are commonly divided into sections
known as sectors.
In this system, the minimum quantity of data that can be transferred is a sector. The subdivision of one disk surface into
tracks and sectors is shown in the figure.

Magnetic Tape
A magnetic tape transport consists of the electrical, mechanical, and electronic components that provide the parts and
control mechanism for a magnetic tape unit. The tape itself is a strip of plastic coated with a magnetic recording medium.
Bits are recorded as magnetic spots on the tape along several tracks. Usually seven or nine bits are recorded together to
form a character together with a parity bit. Read/write heads are mounted one in each track so that data can
be recorded and read as a sequence of characters.
Magnetic tape units can be stopped, started to move forward or in reverse, or rewound. However, they
cannot be started or stopped fast enough between individual characters. For this reason, data are recorded in blocks
referred to as records. Gaps of unrecorded tape are inserted between records where the tape can be stopped.
The tape starts moving while in a gap and attains its constant speed by the time it reaches the next record. Each
record on tape has an identification bit pattern at the beginning and end. By reading the bit pattern at the beginning, the
tape control identifies the record number.

ASSOCIATIVE MEMORY / CAM (Content Addressable Memory)


 An associative memory can be considered as a memory unit whose stored data
can be identified for access by the content of the data itself rather than by an
address or memory location.
 A memory unit accessed by content is called an associative memory or
CAM (Content Addressable Memory).


 When a write operation is performed on associative memory, no address or memory location is given to
the word. The memory itself is capable of finding an empty unused location to store the word.

 On the other hand, when the word is to be read from an associative memory, the
content of the word, or part of the word, is specified. The words which match the
specified content are located by the memory and are marked for reading .
Advantages of Associative memory :-
1. It is used where search time needs to be less or short.
2. It is suitable for parallel searches.
3. It is often used to speed up databases.

Disadvantages of Associative memory :-
1. It is more expensive than RAM.
2. Each cell must have storage capability and logical circuits for matching its content with an
external argument.

Argument Register- It contains the word to be searched for.

Key Register- It specifies which parts of the argument word need to be compared with
words in memory.

Associative memory array- It contains the words that are to be compared with the argument
word in parallel.

Match Register- It has m bits, one bit corresponding to each word in the memory
array.
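
The interplay of the four components can be sketched in Python. The function name and the 9-bit example words are assumptions for illustration, and the comparison that hardware performs on all words in parallel is modelled here by a sequential list comprehension.

```python
def associative_search(words, argument, key):
    """Return the match register: one bit per stored word.

    A word matches when it agrees with the argument in every bit
    position where the key register holds a 1.
    """
    return [1 if (word & key) == (argument & key) else 0 for word in words]

# Three stored 9-bit words; the key selects only the three leftmost bits.
words = [0b100111100, 0b101000001, 0b101111100]
match = associative_search(words, argument=0b101111100, key=0b111000000)
```

The second word matches even though its remaining six bits differ from the argument, because the key masks those positions out of the comparison.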

Cache Memory

Cache memory is a special very high-speed memory. It is used to speed up memory access and
keep pace with a high-speed CPU. Cache memory is costlier than main memory or disk
memory but more economical than CPU registers. Cache memory is an extremely fast
memory type that acts as a buffer between RAM and the CPU. It holds frequently requested
data and instructions so that they are immediately available to the CPU when needed. Cache
memory is used to reduce the average time to access data from the main memory. The cache
is a smaller and faster memory that stores copies of the data from frequently used main
memory locations. There are several independent caches in a CPU, which store
instructions and data.

Cache Performance: When the processor needs to read or write a location in main
memory, it first checks for a corresponding entry in the cache.
• If the processor finds that the memory location is in the cache, a cache hit has occurred
and data is read from the cache.
• If the processor does not find the memory location in the cache, a cache miss has
occurred. For a cache miss, the cache allocates a new entry and copies in data from
main memory, then the request is fulfilled from the contents of the cache.
The performance of cache memory is frequently measured in terms of a quantity called the hit
ratio.
Hit ratio = hits / (hits + misses) = number of hits / total accesses
Cache performance can be improved by using a larger cache block size and higher
associativity, and by reducing the miss rate, the miss penalty, and the time to hit in the
cache.
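
The hit and miss behaviour described above can be tried out with a small simulation. The sketch below assumes a direct-mapped cache (each memory block can go into exactly one cache line), which the notes do not prescribe; the function names and the address trace are made up for illustration.

```python
def simulate_direct_mapped(addresses, num_lines, block_size):
    """Count hits and misses for a sequence of byte addresses."""
    lines = [None] * num_lines  # tag currently stored in each cache line
    hits = misses = 0
    for address in addresses:
        block = address // block_size   # memory block containing the address
        index = block % num_lines       # cache line the block maps to
        tag = block // num_lines        # identifies which block is in the line
        if lines[index] == tag:
            hits += 1                   # cache hit: data is read from the cache
        else:
            misses += 1                 # cache miss: copy the block in from main memory
            lines[index] = tag
    return hits, misses


def hit_ratio(hits, misses):
    """Hit ratio = hits / (hits + misses)."""
    return hits / (hits + misses)


trace = [0, 4, 8, 0, 4, 64, 0]  # hypothetical address trace
hits, misses = simulate_direct_mapped(trace, num_lines=4, block_size=4)
print(hits, misses, hit_ratio(hits, misses))
```

In this trace, addresses 0 and 64 map to the same cache line, so they keep evicting each other; a higher associativity would let both stay in the cache.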

Virtual Memory

Virtual memory is the separation of logical memory from physical memory. This separation provides a large virtual memory for
programmers when only limited physical memory is available.
Virtual memory can give programmers the illusion that they have a very large memory even though the computer has a
small main memory. It makes programming easier because the programmer no longer needs to worry
about the amount of physical memory available.
Virtual memory works similarly to a cache, but one level up in the memory hierarchy. A memory management unit (MMU)
transfers data between physical memory and some slower storage device, generally a disk. This storage area can be
called a swap disk or swap file, depending on its implementation. Retrieving data from physical memory is much faster than
accessing data from the swap disk.

The two primary methods for implementing virtual memory are as follows −

• Paging
Paging is a memory management technique in which small fixed-length pages are allocated instead of a single large
variable-length contiguous block, as in the dynamic allocation technique. In a paged system, each process is
divided into several fixed-size ‘chunks’ called pages, typically 4 KB in length. The memory space is also divided into
blocks of the same size, known as frames.
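
As a rough sketch of how a paged address is resolved, the code below splits a virtual address into a page number and an offset and looks the page up in a page table. The dictionary-based page table, the function name translate, and the sample mappings are assumptions made for the example; a real MMU does this in hardware.

```python
PAGE_SIZE = 4096  # 4 KB pages, as in the text


def translate(virtual_address, page_table, page_size=PAGE_SIZE):
    """Map a virtual address to a physical address.

    page_table maps page numbers to frame numbers; a missing entry
    stands for a page that is not currently in physical memory.
    """
    page_number, offset = divmod(virtual_address, page_size)
    if page_number not in page_table:
        raise KeyError(f"page fault: page {page_number} is not in memory")
    frame_number = page_table[page_number]
    return frame_number * page_size + offset


page_table = {0: 5, 1: 2}  # hypothetical mapping: page -> frame
print(translate(4100, page_table))  # page 1, offset 4 -> 2 * 4096 + 4 = 8196
```

Because pages and frames have the same size, the offset is carried over unchanged and only the page number is replaced by a frame number.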

Advantages of Paging
The advantages of paging are as follows −
• Paging eliminates external fragmentation.
• Swapping between equal-size pages and page frames is straightforward.
• Paging is a simple approach to memory management.

Disadvantages of Paging
The disadvantages of paging are as follows −
• Paging can cause internal fragmentation.
• The page table consumes additional memory.
• With multi-level paging, there can be memory reference overhead.

Segmentation

UNIT-V
Parallel Processing
Parallel processing can be described as a class of techniques that enable a system to
carry out simultaneous data-processing tasks in order to increase the computational speed of a
computer system.

A parallel processing system can carry out simultaneous data-processing to achieve faster execution
time. For instance, while an instruction is being processed in the ALU component of the CPU, the next
instruction can be read from memory.

The primary purpose of parallel processing is to enhance the computer processing capability and
increase its throughput, i.e. the amount of processing that can be accomplished during a given
interval of time.

A parallel processing system can be achieved by having a multiplicity of functional units that perform
identical or different operations simultaneously. The data can be distributed among the multiple
functional units.

The following diagram shows one possible way of separating the execution unit into eight functional
units operating in parallel.

The operation performed in each functional unit is indicated in each block of the diagram:

o The adder and integer multiplier perform arithmetic operations on integer numbers.
o The floating-point operations are separated into three circuits operating in parallel.

o The logic, shift, and increment operations can be performed concurrently on different data. All
units are independent of each other, so one number can be shifted while another number is
being incremented.

Amdahl’s law
It is named after computer scientist Gene Amdahl (a computer architect from IBM and
Amdahl Corporation) and was presented at the AFIPS Spring Joint Computer Conference in
1967. It is also known as Amdahl’s argument.
It is a formula that gives the theoretical speedup in latency of the execution of a task at a
fixed workload that can be expected of a system whose resources are improved. In other
words, it is a formula used to find the maximum improvement possible by just improving a
particular part of a system. It is often used in parallel computing to predict the theoretical
speedup when using multiple processors.
Speedup - Speedup is defined as the ratio of the performance for the entire task using the
enhancement to the performance for the entire task without the enhancement. Equivalently,
speedup is the ratio of the execution time for the entire task without the enhancement to the
execution time for the entire task using the enhancement. If
Pe is the performance for the entire task using the enhancement when possible,
Pw is the performance for the entire task without using the enhancement,
Ew is the execution time for the entire task without using the enhancement, and
Ee is the execution time for the entire task using the enhancement when possible, then
Speedup = Pe/Pw or Speedup = Ew/Ee

Amdahl’s law uses two factors to find speedup from some enhancement:

• Fraction enhanced – The fraction of the computation time in the original computer that
can be converted to take advantage of the enhancement. For example, if 10 seconds of
the execution time of a program that takes 40 seconds in total can use an enhancement,
the fraction is 10/40. This value is Fraction enhanced. Fraction enhanced is
always less than or equal to 1.
• Speedup enhanced – The improvement gained by the enhanced execution mode; that
is, how much faster the task would run if the enhanced mode were used for the entire
program. For example, if the enhanced mode takes, say, 3 seconds for a portion of the
program, while it is 6 seconds in the original mode, the improvement is 6/3. This value is
Speedup enhanced. Speedup enhanced is always greater than 1.

The overall speedup is the ratio of the execution times:

Overall speedup = Ew / Ee = 1 / ((1 – Fraction enhanced) + (Fraction enhanced / Speedup enhanced))

The formula for Amdahl’s law is:
S = 1 / (1 – P + (P / N))
Where:
S is the speedup of the system
P is the proportion of the execution time that can be improved (parallelized)
N is the number of processors in the system
For example, if 20% of the total execution time of a program can be parallelized (P = 0.2),
and we add 4 more processors to a single-processor system (N = 5), the speedup would be:
S = 1 / (1 – 0.2 + (0.2 / 5))
S = 1 / (0.8 + 0.04)
S = 1 / 0.84
S ≈ 1.19
This means that the overall performance of the system would improve by about 19% with
the addition of the 4 processors.
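
The formula is easy to check numerically. The short sketch below reproduces the worked example; the function name amdahl_speedup is an assumption for illustration. It also shows that the speedup cannot exceed 1 / (1 – P), however many processors are added.

```python
def amdahl_speedup(p, n):
    """Speedup S = 1 / ((1 - p) + p / n).

    p: fraction of the execution time that can be improved (0 <= p <= 1)
    n: speedup of the improved part, e.g. the number of processors (n > 0)
    """
    if not 0 <= p <= 1:
        raise ValueError("p must be between 0 and 1")
    if n <= 0:
        raise ValueError("n must be positive")
    return 1 / ((1 - p) + p / n)


print(round(amdahl_speedup(0.2, 5), 2))      # 1.19, the worked example above
print(round(amdahl_speedup(0.2, 10**9), 2))  # 1.25, close to the limit 1 / (1 - 0.2)
```

With P = 0.2, the remaining 80% of the execution time is untouched, so the speedup stays below 1.25.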

Pipelining
The term Pipelining refers to a technique of decomposing a sequential process into sub-operations,
with each sub-operation being executed in a dedicated segment that operates concurrently with all
other segments.

The most important characteristic of a pipeline technique is that several computations can be in
progress in distinct segments at the same time. The overlapping of computation is made possible by
associating a register with each segment in the pipeline. The registers provide isolation between each
segment so that each can operate on distinct data simultaneously.

The structure of a pipeline organization can be represented simply by including an input register for
each segment followed by a combinational circuit.

Let us consider an example of combined multiplication and addition operation to get a better
understanding of the pipeline organization.

The combined multiplication and addition operation is done with a stream of numbers such as:

Ai * Bi + Ci for i = 1, 2, 3, ..., 7

The operation to be performed on the numbers is decomposed into sub-operations with each sub-
operation to be implemented in a segment within a pipeline.

The sub-operations performed in each segment of the pipeline are defined as:

R1 ← Ai, R2 ← Bi              Input Ai and Bi
R3 ← R1 * R2, R4 ← Ci         Multiply and input Ci
R5 ← R3 + R4                  Add Ci to product

The following block diagram represents the combined as well as the sub-operations performed in
each segment of the pipeline.

Registers R1, R2, R3, and R4 hold the data and the combinational circuits operate in a particular
segment.

The output generated by the combinational circuit in a given segment is applied to the input register
of the next segment. For instance, from the block diagram, we can see that the register R3 is used as
one of the input registers for the combinational adder circuit.
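
The register transfers of the three segments can be simulated clock pulse by clock pulse. The following is a minimal sketch, not a hardware model: the function name run_pipeline is an assumption, and None stands for a register that holds no valid data yet. The registers are updated from the last segment to the first so that every segment reads the values left by the previous clock pulse.

```python
def run_pipeline(a, b, c):
    """Compute a[i] * b[i] + c[i] with a three-segment pipeline."""
    n = len(a)
    r1 = r2 = r3 = r4 = r5 = None
    results = []
    # With 3 segments and n inputs, n + 2 clock pulses are needed.
    for clock in range(n + 2):
        # Segment 3: R5 <- R3 + R4
        r5 = r3 + r4 if r3 is not None else None
        # Segment 2: R3 <- R1 * R2, R4 <- Ci (the pair loaded one pulse earlier)
        if r1 is not None:
            r3, r4 = r1 * r2, c[clock - 1]
        else:
            r3 = r4 = None
        # Segment 1: R1 <- Ai, R2 <- Bi
        if clock < n:
            r1, r2 = a[clock], b[clock]
        else:
            r1 = r2 = None
        if r5 is not None:
            results.append(r5)
    return results


print(run_pipeline([1, 2, 3], [4, 5, 6], [7, 8, 9]))  # [11, 18, 27]
```

The first result appears after three clock pulses; after that, one result is produced on every pulse, because all three segments work on different data at the same time.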

In general, the pipeline organization is applicable to two areas of computer design:

1. Arithmetic Pipeline
2. Instruction Pipeline

Arithmetic Pipeline

Arithmetic Pipelines are mostly used in high-speed computers. They are used to implement floating-
point operations, multiplication of fixed-point numbers, and similar computations encountered in
scientific problems.

To understand the concepts of arithmetic pipeline in a more convenient way, let us consider an
example of a pipeline unit for floating-point addition and subtraction.

The inputs to the floating-point adder pipeline are two normalized floating-point binary numbers
defined as:

X = A * 2^a
Y = B * 2^b

Where A and B are two fractions that represent the mantissas, and a and b are the exponents.
For simplicity, the worked example in this section uses decimal numbers:

X = 0.9504 * 10^3
Y = 0.8200 * 10^2

The combined operation of floating-point addition and subtraction is divided into four segments.
Each segment contains the corresponding suboperation to be performed in the given pipeline. The
suboperations that are shown in the four segments are:

1. Compare the exponents by subtraction.
2. Align the mantissas.
3. Add or subtract the mantissas.
4. Normalize the result.

We will discuss each suboperation in a more detailed manner later in this section.

The following block diagram represents the suboperations performed in each segment of the
pipeline.

1. Compare exponents by subtraction:
The exponents are compared by subtracting them to determine their difference. The larger exponent
is chosen as the exponent of the result.

The difference of the exponents, i.e., 3 - 2 = 1 determines how many times the mantissa associated
with the smaller exponent must be shifted to the right.

2. Align the mantissas:
The mantissa associated with the smaller exponent is shifted according to the difference of exponents
determined in segment one.

X = 0.9504 * 10^3
Y = 0.08200 * 10^3

3. Add mantissas:
The two mantissas are added in segment three.

Z = X + Y = 1.0324 * 10^3

4. Normalize the result:
The sum is shifted one position to the right and the exponent is incremented by one. After
normalization, the result is written as:

Z = 0.10324 * 10^4
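
The four suboperations can be traced on the same numbers with a short program. This is a sketch under stated assumptions: it works in base 10 like the worked example, uses Python's Decimal type to keep the digits exact, and the function name fp_add is made up. It runs the four steps one after another rather than in overlapped pipeline segments.

```python
from decimal import Decimal


def fp_add(x_mantissa, x_exponent, y_mantissa, y_exponent):
    """Add two decimal floating-point numbers mantissa * 10**exponent."""
    # Segment 1: compare the exponents by subtraction.
    difference = x_exponent - y_exponent
    exponent = max(x_exponent, y_exponent)
    # Segment 2: align the mantissa that has the smaller exponent.
    if difference > 0:
        y_mantissa = y_mantissa.scaleb(-difference)
    elif difference < 0:
        x_mantissa = x_mantissa.scaleb(difference)
    # Segment 3: add the mantissas.
    z_mantissa = x_mantissa + y_mantissa
    # Segment 4: normalize so that 0.1 <= |mantissa| < 1.
    while abs(z_mantissa) >= 1:
        z_mantissa = z_mantissa.scaleb(-1)
        exponent += 1
    while z_mantissa != 0 and abs(z_mantissa) < Decimal("0.1"):
        z_mantissa = z_mantissa.scaleb(1)
        exponent -= 1
    return z_mantissa, exponent


mantissa, exponent = fp_add(Decimal("0.9504"), 3, Decimal("0.8200"), 2)
print(mantissa, exponent)  # mantissa equals 0.10324, exponent is 4
```

The second normalization loop is not needed for addition of two positive numbers; it covers the case where a subtraction leaves leading zeros in the mantissa.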

Instruction Pipeline
Pipeline processing can occur not only in the data stream but in the instruction stream as well.

Most digital computers with complex instructions require an instruction pipeline to carry out
operations such as fetching, decoding, and executing instructions.

In general, the computer needs to process each instruction with the following sequence of steps.

1. Fetch instruction from memory.
2. Decode the instruction.
3. Calculate the effective address.
4. Fetch the operands from memory.
5. Execute the instruction.
6. Store the result in the proper place.

Each step is executed in a particular segment, and there are times when different segments may take
different times to operate on the incoming information. Moreover, there are times when two or more
segments may require memory access at the same time, causing one segment to wait until another is
finished with the memory.

The organization of an instruction pipeline will be more efficient if the instruction cycle is divided into
segments of equal duration. One of the most common examples of this type of organization is
a Four-segment instruction pipeline.

A four-segment instruction pipeline combines two or more of these steps into a single segment.
For instance, the decoding of the instruction can be combined with the calculation of the
effective address into one segment.

The following block diagram shows a typical example of a four-segment instruction pipeline. The
instruction cycle is completed in four segments.

Segment 1:

The instruction fetch segment can be implemented using first in, first out (FIFO) buffer.

Segment 2:

The instruction fetched from memory is decoded in the second segment, and then the effective
address is calculated in a separate arithmetic circuit.

Segment 3:

An operand from memory is fetched in the third segment.

Segment 4:

The instructions are finally executed in the last segment of the pipeline organization.
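
How instructions overlap in the four segments can be seen in a small space-time table. The sketch below is an idealized model: it assumes every segment takes one clock cycle and that there are no branches or memory conflicts. The segment labels FI, DA, FO and EX and the function name space_time_table are assumptions chosen for the example.

```python
# Assumed labels: FI = fetch instruction, DA = decode and calculate address,
# FO = fetch operand, EX = execute.
SEGMENTS = ["FI", "DA", "FO", "EX"]


def space_time_table(num_instructions, segments=SEGMENTS):
    """Return one row per clock cycle: segment name -> instruction number."""
    k = len(segments)
    total_cycles = num_instructions + k - 1
    table = []
    for cycle in range(total_cycles):
        row = {}
        for position, name in enumerate(segments):
            instruction = cycle - position  # instruction entered `position` cycles ago
            in_range = 0 <= instruction < num_instructions
            row[name] = instruction + 1 if in_range else None
        table.append(row)
    return table


for cycle, row in enumerate(space_time_table(3), start=1):
    cells = "  ".join(f"{name}:{row[name] or '-'}" for name in SEGMENTS)
    print(f"cycle {cycle}:  {cells}")
```

For 3 instructions the table has 3 + 4 - 1 = 6 rows, compared with 3 * 4 = 12 cycles if each instruction had to finish before the next one started.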

Flynn's Classification of Computers


M.J. Flynn proposed a classification for the organization of a computer system by the number of
instructions and data items that are manipulated simultaneously.

The sequence of instructions read from memory constitutes an instruction stream.

The operations performed on the data in the processor constitute a data stream.

Flynn's classification divides computers into four major groups that are:
1. Single instruction stream, single data stream (SISD)
2. Single instruction stream, multiple data stream (SIMD)
3. Multiple instruction stream, single data stream (MISD)
4. Multiple instruction stream, multiple data stream (MIMD)

SISD
SISD stands for 'Single Instruction and Single Data Stream'. It represents the organization of a
single computer containing a control unit, a processor unit, and a memory unit.

Instructions are executed sequentially, and the system may or may not have internal parallel
processing capabilities.

Most conventional computers have SISD architecture like the traditional Von-Neumann computers.

Parallel processing, in this case, may be achieved by means of multiple functional units or by pipeline
processing.

Where CU = Control Unit, PE = Processing Element, M = Memory

Instructions are decoded by the Control Unit and then the Control Unit sends the instructions to the
processing units for execution.

Data Stream flows between the processors and memory bi-directionally.

Examples:

Older generation computers, minicomputers, and workstations

SIMD

SIMD stands for 'Single Instruction and Multiple Data Stream'. It represents an organization that
includes many processing units under the supervision of a common control unit.

All processors receive the same instruction from the control unit but operate on different items of
data.

The shared memory unit must contain multiple modules so that it can communicate with all the
processors simultaneously.

SIMD is mainly dedicated to array processing machines. However, vector processors can also be seen as a part
of this group.

MISD
MISD stands for 'Multiple Instruction and Single Data stream'.

MISD structure is only of theoretical interest since no practical system has been constructed using this
organization.

In MISD, multiple processing units operate on one single-data stream. Each processing unit operates
on the data independently via separate instruction stream.

78
Where M = Memory Modules, CU = Control Unit, P = Processor Units

Example: The experimental Carnegie-Mellon [Link] computer (1971)

MIMD
MIMD stands for 'Multiple Instruction and Multiple Data Stream'.

In this organization, all processors in a parallel computer can execute different instructions and
operate on various data at the same time.

In MIMD, each processor has a separate program and an instruction stream is generated from each
program.

Where M = Memory Module, PE = Processing Element, and CU = Control Unit

Examples: Cray T90, Cray T3E, IBM-SP2
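
The difference between the SIMD and MIMD organizations can be illustrated with a toy model. It is only a sketch: the function names simd_execute and mimd_execute are assumptions, each list position stands for one processing element, and the Python code runs sequentially where real hardware would work in parallel.

```python
import operator


def simd_execute(instruction, operands_a, operands_b):
    """SIMD: one instruction is sent to every processing element,
    and each element applies it to its own pair of data items."""
    return [instruction(a, b) for a, b in zip(operands_a, operands_b)]


def mimd_execute(instructions, operands_a, operands_b):
    """MIMD: each processing element has its own instruction stream
    and its own data."""
    return [
        instruction(a, b)
        for instruction, a, b in zip(instructions, operands_a, operands_b)
    ]


data_a = [1, 2, 3]
data_b = [10, 20, 30]
print(simd_execute(operator.add, data_a, data_b))  # [11, 22, 33]
print(mimd_execute([operator.add, operator.mul, operator.sub], data_a, data_b))  # [11, 40, -27]
```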
