Lecture Notes Digital Design and Computer Organization

Module 4
INPUT/OUTPUT ORGANIZATION

➢ Accessing I/O devices:


⚫ A simple arrangement to connect I/O devices to a computer is to use a single bus
arrangement.
⚫ The bus enables all the devices connected to it to exchange information.
⚫ Bus consists of three sets of lines used to carry
Address
Data
Control Signals.
⚫ Each I/O device is assigned a unique set of addresses.
⚫ When the processor places a particular address on the address lines, the device that
recognizes this address responds to the commands issued on the control lines.
⚫ The processor requests either a read or a write operation, and requested data are
transferred over the data lines.
⚫ When I/O devices and the memory share the same address space, the arrangement is
called memory-mapped I/O.
⚫ With memory-mapped I/O, any instruction that can access memory can be used to
transfer data to or from an I/O device.
⚫ For example if DATAIN is the address of the input buffer associated with the keyboard,
the instruction
MOVE DATAIN,R0
⚫ reads the data from DATAIN and stores it in processor register R0.
⚫ Similarly the instruction
MOVE R0,DATAOUT
⚫ sends the contents of register R0 to location DATAOUT.
⚫ Most computer systems use memory-mapped I/O.
⚫ Some processors have special IN and OUT instructions to perform I/O transfers.
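The following is a minimal C sketch of how memory-mapped I/O looks to a programmer: reading the keyboard buffer and writing the display buffer are ordinary loads and stores to device addresses. The addresses 0x4000 and 0x4004 chosen for DATAIN and DATAOUT are purely illustrative assumptions, not those of any particular machine.

/* Hedged sketch of memory-mapped I/O; device addresses are assumed. */
#include <stdint.h>

#define DATAIN   ((volatile uint8_t *)0x4000)   /* keyboard input buffer (assumed address) */
#define DATAOUT  ((volatile uint8_t *)0x4004)   /* display output buffer (assumed address) */

uint8_t read_keyboard(void)
{
    return *DATAIN;      /* equivalent of MOVE DATAIN,R0: an ordinary load   */
}

void write_display(uint8_t ch)
{
    *DATAOUT = ch;       /* equivalent of MOVE R0,DATAOUT: an ordinary store */
}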


I/O Interface For Input Device


⚫ The figure shows the hardware required to connect an I/O device to the bus.
Address decoder:
⚫ The address decoder enables the device to recognize its address on the address lines.
Data and Status registers:
⚫ The data register holds the data being transferred to or from the processor
⚫ The status register contains information relevant to the operation of the I/O device.
Control Circuits:
⚫ Control circuitry required to coordinate I/O transfers.

Program controlled I/O


⚫ Consider a simple example of interfacing a keyboard and display to the CPU.
⚫ The four registers shown in the figure are used in the data transfer operations.
⚫ The status register contains two control flags, SIN and SOUT, which provide status
information for the keyboard and the display unit, respectively.


⚫ The program shown in the figure reads a line of characters from the keyboard and stores it
in a memory buffer starting at location LINE.
⚫ As each character is read, it is echoed back to the display.
⚫ Register R0 is used as a pointer to the memory buffer area.
⚫ The contents of R0 are updated using the auto increment mode so that successive
characters are stored in successive memory locations.
⚫ Each character is checked to see if it is the carriage return (CR) character, which has
ASCII code 0D (hex).
⚫ If it is, a line feed character (ASCII code 0A) is sent to move the cursor one line down on
the display; otherwise the program loops back to wait for another character from the
keyboard.
⚫ This example illustrates program controlled I/O in which the processor repeatedly checks
a status flag to achieve the required synchronization between the processor and an input
or output device.
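A hedged C sketch of this program-controlled I/O loop is given below. The register addresses and the positions of the SIN and SOUT flags are illustrative assumptions; only the overall wait-read-echo-store structure follows the description above.

/* Sketch of the read-line-and-echo loop using program-controlled I/O.
   STATUS, DATAIN, DATAOUT addresses and flag positions are assumptions. */
#include <stdint.h>

#define STATUS   ((volatile uint8_t *)0x4008)
#define DATAIN   ((volatile uint8_t *)0x4000)
#define DATAOUT  ((volatile uint8_t *)0x4004)
#define SIN      0x01    /* keyboard has a character ready          */
#define SOUT     0x02    /* display is ready to accept a character  */

void read_line(char *line)               /* line points to the LINE buffer */
{
    char ch;
    do {
        while ((*STATUS & SIN) == 0)  ;  /* wait until a character is available    */
        ch = *DATAIN;                    /* read the character                      */
        while ((*STATUS & SOUT) == 0) ;  /* wait until the display is ready         */
        *DATAOUT = ch;                   /* echo it back to the display             */
        *line++ = ch;                    /* store it; pointer is auto-incremented   */
    } while (ch != 0x0D);                /* repeat until carriage return (CR)       */
    *DATAOUT = 0x0A;                     /* send line feed to move the cursor down  */
}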


➢ INTERRUPTS

⚫ In the program-controlled I/O technique, the processor initiates the action and then
repeatedly checks the status of the device, entering a wait loop until the device is ready.
⚫ During this period, the processor is not performing any useful computation.
⚫ There are many situations where other tasks can be performed while waiting for an I/O
device to become ready.
⚫ To allow this to happen, we can arrange for the I/O device to alert the processor when it
becomes ready.
⚫ It can do so by sending a hardware signal called an INTERRUPT to the processor.
⚫ At least one of the bus control lines, called an INTERRUPT REQUEST LINE is usually
dedicated for this purpose.
⚫ Using interrupts, waiting periods can ideally be eliminated.

EXAMPLE:
⚫ Consider the task that requires some computations to be performed and the results to be
printed on a line printer.
⚫ Let the program consist of two routines, COMPUTE and PRINT.
⚫ Assume that COMPUTE produces a set of “N” lines of output, to be printed by the
PRINT routine.
⚫ But the printer accepts only one line of text at a time.
⚫ First, the COMPUTE routine is executed to produce the first “N” lines of output.
⚫ Then the PRINT routine is executed to send the first line of text to the printer. At this
time, instead of waiting for the line to be printed, the PRINT routine may be temporarily
suspended and execution of the COMPUTE routine continued.
⚫ Whenever the printer becomes ready, it alerts the processor by sending an interrupt-request
signal.
⚫ In response, the processor interrupts the execution of the COMPUTE routine and
transfers control to the PRINT routine.
⚫ The PRINT routine sends the second line to the printer and is again suspended.


⚫ Then the interrupted COMPUTE routine resumes at the point of interruption.


⚫ This process continues until all “N” lines have been printed and the PRINT routine ends.

⚫ This figure depicts the concept of interrupts


⚫ The routine executed in response to an interrupt request is called the INTERRUPT
SERVICE ROUTINE (ISR).
⚫ Assume that an interrupt request arrives during the execution of instruction “i” in figure.
⚫ The processor first completes the execution of instruction “i”. Then it loads the program
counter with the address of the first instruction of the interrupt service routine.
⚫ After the execution of the interrupt service routine, the processor has to come to
instruction “i+1”.
⚫ Therefore, when an interrupt occurs, the current contents of the PC, which point to
instruction “i+1”, must be put in temporary storage.
⚫ A return from interrupt instruction at the end of the interrupt-service routine reloads the
PC from that temporary storage location.
⚫ The processor must inform the device that its request has been recognized so that it may
remove its interrupt- request signal.
⚫ This may be accomplished by means of a special control signal on the bus.
⚫ An interrupt-acknowledge signal is used in some of the interrupt schemes for this purpose.
⚫ So far, treatment of an interrupt-service routine is very similar to that of a subroutine.
⚫ A subroutine performs a function required by the program from which it is called.
⚫ Subroutine and calling program belong to the same task.


⚫ But an ISR may not have anything in common with the program being executed at the
time interrupt request is received.
⚫ In fact, the two programs often belong to different tasks.
⚫ Therefore, before starting execution of the interrupt-service routine, any information
that may be altered during its execution must be saved, so that it can be restored before the
interrupted program is resumed.
⚫ The task of saving and restoring information can be done automatically by the processor
or by program instructions.
⚫ Saving registers increases the delay between the time an interrupt request is received and
the start of execution of the interrupt-service routine.
⚫ This delay is called INTERRUPT LATENCY.
⚫ In some earlier processors, particularly those with a small number of registers, all registers
are saved automatically by the processor at the time an interrupt request is accepted.
⚫ The data saved are restored to their respective registers as part of the execution of the
return from interrupt instruction.
⚫ Some computers provide two types of interrupts
1) one saves all register contents
2) the other does not.

✓ Interrupt Hardware

⚫ We discussed that an I/O device requests an interrupt by activating a bus line called
interrupt request line.

⚫ Most computers are likely to have several I/O devices that can request an interrupt.

⚫ A single interrupt request line may be used to serve ‘n’ devices as shown in the figure.

⚫ All devices are connected to the line via switches to ground.

⚫ To request an interrupt, a device closes its associated switch.

⚫ Thus, if all interrupt-request signals are inactive, that is, if all switches are open, the
voltage on the line will be equal to Vdd; this is the inactive state of the line.

⚫ When a device requests an interrupt by closing its switch, the voltage on the line drops to 0,
causing the interrupt-request signal INTR received by the processor to go to 1.


✓ Enabling And Disabling Interrupts

⚫ When an interrupt arrives the processor suspends the execution of one program and
begins the execution of another program requested by an I/O device.

⚫ Because interrupts can arrive at any time, they may alter the sequence of events.

⚫ Hence, the interruption of program execution must be carefully controlled.

⚫ A fundamental facility found in all computers is the ability to enable and disable such
interrupts.

⚫ There are many situations in which the processor should ignore interrupt requests.

⚫ For these reasons, some means for enabling and disabling interrupts must be available to the
programmer.

⚫ A simple way is to provide machine instructions, such as interrupt-enable and interrupt-
disable, that perform these functions.

⚫ Let us consider in detail the specific case of a single interrupt request from one device.

⚫ When a device activates the interrupt-request signal, it keeps the signal activated until it
learns that the processor has accepted its request.

⚫ It is essential to ensure that this active request signal does not lead to successive
interruptions, causing the system to enter an infinite loop from which it cannot recover.

⚫ Several mechanisms are available to solve this problem.

⚫ There are three possibilities.


First Possibility:

⚫ The processor hardware ignores the interrupt-request line until the execution of the first
instruction of the interrupt-service routine has been completed.

⚫ Then, an interrupt-disable instruction can be used as the first instruction in the interrupt-
service routine, so that no further requests are accepted until interrupts are explicitly re-enabled.

⚫ Typically the interrupt-enable instruction will be the last instruction in the interrupt-
service routine.

Second Possibility:

⚫ The processor automatically disables the interrupts before starting the execution of the
ISR.

⚫ Prior to disabling, the processor should save the contents of PC and PROCESSOR
STATUS REGISTER(PS) on the stack.

⚫ The processor status register has one bit called interrupt-enable which will enable
interrupts when set to 1.

⚫ After saving the contents of the PS on the stack, the processor clears the interrupt-enable
bit in its PS register, thus disabling further interrupts.

⚫ When return from interrupt instruction is executed, the contents of the PS are restored from
the stack, setting the interrupt enable bit back to 1, hence interrupts are again enabled.

Third Possibility:

⚫ The processor has a special interrupt-request line for which the interrupt-handling circuit
responds only to the leading edge of the signal.

⚫ Such a line is said to be edge-triggered.

⚫ In this case the processor will receive only one request, regardless of how long the line is
activated.

⚫ Hence there is no danger of multiple interruptions and no need to explicitly disable
interrupt requests from this line.


✓ Handling Multiple Devices

⚫ Let us consider the situation where a number of devices capable of initiating interrupts
are connected to the processor.

⚫ Because these devices are operationally independent, there is no definite order in which
they will generate interrupts.

⚫ For example, device X may request an interrupt while an interrupt caused by Y is being
serviced or several devices may request interrupts at exactly the same time.

⚫ This gives rise to a number of questions.

⚫ How can the processor recognize the device requesting an interrupt?

⚫ Given that different devices are likely to require different interrupt-service routines, how
can the processor obtain the starting address of the appropriate routine in each case?

⚫ Should a device be allowed to interrupt the processor while another interrupt is being
serviced?

⚫ How should two or more simultaneous interrupt requests be handled?

Polling Technique

⚫ When a device raises an interrupt request, it sets to 1 one of the bits in its status register,
which we call the IRQ bit.

⚫ For example bits KIRQ and DIRQ are the interrupt request bits for the keyboard and the
display.

⚫ The interrupt-service routine polls the devices in some order; the first device encountered with its IRQ bit set is the device that should be serviced.

⚫ The polling technique is easy to implement.
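A minimal C sketch of the polling idea is shown below; the device table, register layout and service routines are hypothetical, and the loop order is what fixes the priority.

/* Illustrative polling loop: interrogate each device's IRQ bit in a fixed
   order and service the first device found requesting an interrupt. */
#define NDEV 4

struct device {
    volatile unsigned char *status;   /* device status register (e.g. holds KIRQ or DIRQ) */
    unsigned char irq_mask;           /* bit position of its IRQ flag                      */
    void (*service)(void);            /* its interrupt-service routine                     */
};

extern struct device dev[NDEV];       /* hypothetical device table */

void poll_devices(void)
{
    for (int i = 0; i < NDEV; i++) {              /* polling order defines priority */
        if (*dev[i].status & dev[i].irq_mask) {
            dev[i].service();                     /* service the first requester    */
            return;
        }
    }
}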


DISADVANTAGE:

⚫ Its main disadvantage is the time spent interrogating the IRQ bits of all devices that may
not be requesting any service.

⚫ An alternative approach is to use vectored interrupts.


Vectored Interrupts

⚫ To reduce the time involved in the polling process, a device requesting an interrupt may
identify itself directly to the processor.

⚫ Then, the processor can immediately start executing the corresponding interrupt-service
routine.

⚫ A device requesting an interrupt can identify itself by sending a special code to the
processor over the bus.

⚫ This enables the processor to identify individual devices even if they share a single
interrupt-request line.

⚫ The code supplied by the device may represent the starting address of the interrupt- service
routine for that device.

⚫ The code length is typically in the range of 4 to 8 bits.

⚫ The location pointed to by the interrupting device is used to store the starting address of
the interrupt-service routine.

⚫ The processor reads this address, called the Interrupt vector.

⚫ When a device sends an interrupt request, the processor may not be ready to receive the
interrupt-vector code immediately.

⚫ The interrupting device must wait to put data on the bus only when the processor is ready
to receive it.

⚫ When processor is ready to receive the interrupt-vector code, it activates the interrupt-
acknowledge line, INTA.
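The dispatch step can be pictured with the small C sketch below: the code supplied by the device indexes a table holding the starting addresses of the interrupt-service routines. The helper read_interrupt_vector_code is a hypothetical stand-in for the bus hardware that asserts INTA and reads the code.

/* Sketch of vectored-interrupt dispatch; names are illustrative assumptions. */
typedef void (*isr_t)(void);

#define NVEC 16                       /* a 4-bit device code allows up to 16 vectors          */
isr_t interrupt_vector_table[NVEC];   /* starting addresses of the interrupt-service routines */

extern unsigned read_interrupt_vector_code(void);   /* hypothetical: asserts INTA and
                                                        returns the code sent by the device */

void dispatch_interrupt(void)
{
    unsigned code = read_interrupt_vector_code();   /* device identifies itself over the bus */
    interrupt_vector_table[code]();                 /* start that device's service routine   */
}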

Interrupt nesting

⚫ We discussed that interrupts should be disabled during the execution of an interrupt-service
routine, to ensure that a request from one device will not cause more than one interruption.

⚫ The same arrangement is often used when several devices are involved, in which case
execution of a given interrupt-service routine, once started, always continues to completion
before the processor accepts an interrupt request from a second device.

⚫ Interrupt service routines are typically short, and the delay they may cause is acceptable
for most simple devices.


⚫ For some devices, however a long delay in responding to an interrupt request may cause
errors.

⚫ Consider, for example, a computer that keeps track of the time of day using a real-time clock.

⚫ This is a device that sends interrupt requests to the processor at regular intervals.

⚫ For each of these requests, the processor executes a short interrupt-service routine to
increment a set of counters in the memory that keep track of time in seconds, minutes and
so on.

⚫ It may be necessary to accept an interrupt request from the clock during the execution of
an interrupt-service routine for another device.

⚫ This example suggests that I/O devices should be organized in a priority structure.

⚫ An interrupt request from a high- priority device should be accepted while the processor
is servicing another request from a lower –priority device.

⚫ A multi-level priority organization means that during execution of an interrupt-service
routine, interrupt requests will be accepted from some devices but not from others,
depending upon the devices' priorities.

⚫ To implement this scheme, we can assign a priority level to the processor that can be
changed under program control.

⚫ The priority level of the processor is the priority of the program that is currently being
executed.

⚫ The processor accepts interrupts only from devices that have priorities higher than its own.

⚫ At the time the execution of an interrupt–service routine for some device is started , the
priority of the processor is raised to that of the device.

⚫ This action disables interrupts from the devices at the same level of priority or lower.

⚫ The processor’s priority is usually encoded in a few bits of the processor status word.

⚫ It can be changed by program instructions, called privileged instructions.

⚫ A multiple-priority scheme can be implemented easily by using separate interrupt-request
and interrupt-acknowledge lines for each device, as shown in the figure.

⚫ Each of the interrupt request lines is assigned a different priority level.


⚫ Interrupt requests received over these lines are sent to a priority arbitration circuit in the
processor.

⚫ A request is accepted only if it has a higher priority level than that currently assigned to
the processor.
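The acceptance rule can be summarized by the short sketch below, assuming the current processor priority is kept in a variable that only privileged code may change; all names are illustrative.

/* Sketch of the multi-level priority rule. */
static unsigned processor_priority;       /* current priority, from the processor status word */

int accept_request(unsigned device_priority)
{
    /* a request is accepted only if the device's priority is strictly higher */
    return device_priority > processor_priority;
}

void enter_isr(unsigned device_priority)
{
    /* on acceptance, the processor's priority is raised to the device's level,
       disabling requests from devices at the same or lower priority */
    processor_priority = device_priority;
}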

Simultaneous Requests

⚫ When multiple requests are received over a single request line at the same time, the
processor must have some means of deciding which request to service first.

Daisy-Chain:

⚫ The daisy chain is a commonly used hardware arrangement for handling
many requests over a single interrupt-request line.

⚫ The structure is shown in the figure.

⚫ In this method, priority is determined by the order in which the devices are connected in the chain.

⚫ The interrupt request line is common to all devices.


⚫ The interrupt-acknowledge line (INTA) is connected in a daisy-chain fashion, such that
the INTA signal propagates serially through the devices.

⚫ When several devices raise an interrupt request and the INTR line is activated, the
processor responds by setting the INTA line to 1.

⚫ This signal is received by device 1.

⚫ Device 1 passes the signal on to device 2 only if it does not require any service.

⚫ If device 1 has a pending interrupt request, it blocks the INTA signal and proceeds to
put its identifying code on the data lines.

⚫ Therefore, in the daisy-chain arrangement, the device that is electrically closest to the
processor has the highest priority, the next device has the second highest priority, and so on.
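The propagation of INTA along the chain can be modelled by the short C sketch below; the device list is ordered by electrical closeness to the processor, and all structure and field names are assumptions made for illustration.

/* Sketch of INTA propagation in a daisy chain. */
struct dc_device {
    int pending;              /* 1 if this device has raised an interrupt request */
    unsigned char id_code;    /* code the device places on the data lines         */
};

/* Returns the identifying code of the device that captures INTA,
   or -1 if no device in the chain is requesting service. */
int propagate_inta(struct dc_device chain[], int n)
{
    for (int i = 0; i < n; i++) {        /* device 0 is electrically closest to the processor */
        if (chain[i].pending)
            return chain[i].id_code;     /* blocks INTA and supplies its code                 */
        /* otherwise the INTA signal is simply passed on to the next device */
    }
    return -1;
}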

Priority Groups

⚫ Devices are organized in groups, and each group is connected at a different priority level.

⚫ Within a group, devices are connected in a daisy chain. This organization is used in many
systems.


➢ Direct Memory Access(DMA)


⚫ A special control circuit is used to transfer a block of data directly between an external
device and main memory, without continuous intervention by the processor. This
approach is called Direct Memory Access, or DMA.

⚫ DMA transfers are performed by a control circuit that is part of the I/O device interface.
We refer to this circuit as DMA Controller.

⚫ To initiate the transfer of a block of words, the processor sends the starting address, the
number of words in the block, and the direction of the transfer to the DMA controller.

⚫ On receiving this information, the DMA controller proceeds to perform the requested
operation.

⚫ When the entire block has been transferred, the controller informs the processor by raising
an interrupt signal.

⚫ While a DMA transfer is taking place, the program that requested the transfer cannot
continue, and the processor can be used to execute another program.


⚫ After the DMA transfer is completed, the processor can return to the program that
requested the transfer.

⚫ I/O operations are always performed by the Operating System of the computer.

⚫ The OS is also responsible for suspending the execution of one program and starting
another.

⚫ Thus, for an I/O operation involving DMA, the OS puts the program that requested the transfer
in the blocked state, initiates the DMA operation, and starts the execution of another
program.

⚫ When the transfer is completed, the DMA controller informs the processor by sending an
interrupt request.

⚫ In response, the OS puts the suspended program in the RUNNABLE state.

⚫ FIGURE shows an example of the DMA controller registers that are accessed by the
processor to initiate transfer operations.

⚫ Two registers are used for storing the starting address and word count.

⚫ The third register contains status and control flags.


⚫ The R/W bit determines the direction of the transfer.

⚫ When this bit is 1 the controller performs the read operation. Otherwise it performs the
write operation.

⚫ When the controller has completed transferring a block of data it sets the DONE flag to
1.

⚫ Bit 30 is the Interrupt-enable flag, IE.

⚫ When this flag is set to 1, it causes the controller to raise an interrupt after it has
completed transferring a block of data.

⚫ The controller sets the IRQ bit to 1 when it has requested an interrupt.
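A hedged C sketch of programming these registers is given below. The base address, register offsets and the position of the DONE bit are assumptions; the text above only fixes R/W as the direction bit, IE as bit 30 and IRQ as bit 31.

/* Sketch of initiating a DMA transfer through the controller registers. */
#include <stdint.h>

#define DMA_BASE    0xFFFFE000u                                   /* assumed base address     */
#define DMA_ADDR    (*(volatile uint32_t *)(DMA_BASE + 0x0))      /* starting address         */
#define DMA_COUNT   (*(volatile uint32_t *)(DMA_BASE + 0x4))      /* word count               */
#define DMA_STATUS  (*(volatile uint32_t *)(DMA_BASE + 0x8))      /* status and control flags */

#define DMA_RW      (1u << 0)     /* 1 = read operation (per the text)              */
#define DMA_DONE    (1u << 1)     /* assumed position of the DONE flag              */
#define DMA_IE      (1u << 30)    /* interrupt-enable flag                          */
#define DMA_IRQ     (1u << 31)    /* set by the controller when it requests an IRQ  */

void start_dma_read(uint32_t start_addr, uint32_t nwords)
{
    DMA_ADDR   = start_addr;            /* starting memory address of the block     */
    DMA_COUNT  = nwords;                /* number of words to transfer              */
    DMA_STATUS = DMA_RW | DMA_IE;       /* read direction, interrupt on completion  */
}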


⚫ An example of a computer system is given in the figure, showing how DMA controllers
may be used.

⚫ The disk controller, which controls two disks, also has DMA capability and provides
two DMA channels.

⚫ It can perform two independent DMA operations, as if each disk has its own DMA
controller.

⚫ The registers needed to store the memory address, the word count, and so on, are
duplicated so that one set can be used with each device.


⚫ To start a DMA transfer of a block of data from the main memory to one of the disks, the
processor sends the address and word-count information to the registers of the
corresponding channel of the disk controller.

⚫ When the DMA transfer is completed, this fact is recorded in the status and control register
of the DMA channel by setting the DONE bit.

⚫ Requests by DMA devices for using bus are always given higher priority than processor
requests.

⚫ Since the processor originates most memory access cycles, the DMA controller can be said
to “STEAL” memory cycles from the processor. Hence this technique is called CYCLE
STEALING.

BLOCK/BURST Mode:

⚫ The DMA controller may be given exclusive access to the main memory to transfer a
block of data without interruption.

⚫ Most DMA controllers contain a data storage buffer. In the case of the network interface
in the figure, for example, the DMA controller reads a block of data from main memory
and stores it in its input buffer; the data in the buffer are then transmitted over the network.

Bus Arbitration

⚫ A conflict may arise if both processor and a DMA controller or two DMA controllers try
to use the bus at the same time to access the main memory.

⚫ To resolve these conflicts, an arbitration procedure is implemented on the bus to
coordinate the activities of all devices requesting memory transfers.

⚫ The device that is allowed to initiate data transfers on the bus at any given time is called
the BUS MASTER.

⚫ When the current bus master relinquishes control of the bus, another device can acquire
this status.

⚫ Bus arbitration is the process by which the next device to become the bus master is selected
and bus mastership is transferred to it.

⚫ There are two approaches to bus arbitration:


1) Centralized Arbitration

2) Distributed Arbitration

⚫ In centralized arbitration, a single bus arbiter performs the required arbitration.

⚫ In distributed arbitration, all devices participate in the selection of the next bus master.

Centralized Arbitration

⚫ In centralized arbitration, the bus arbiter may be the processor or a separate unit
connected to the bus.

⚫ Figure shows a basic arrangement in which processor contains the bus arbitration circuit.

⚫ In this case, the processor is normally the bus master unless it grants bus mastership to
one of the DMA controllers.

⚫ A DMA controller indicates that it needs to become the bus master by activating the
BUS request line, BR.

⚫ When the bus-request line is activated, the processor activates the bus-grant signal, BG1,
indicating to the DMA controllers that they may use the bus when it becomes free.

⚫ This signal is connected to all DMA controllers using a DAISY-CHAIN arrangement.


⚫ Thus, if DMA controller 1 is requesting the bus, it blocks the propagation of the grant
signal to the other devices, otherwise, it passes the grant signal to next device.

⚫ The current bus master indicates to all devices that it is using the bus by activating another
line called BUS-BUSY (BBSY).

⚫ Hence, after receiving the bus-grant signal, a DMA controller waits for BUS-BUSY to
become inactive and then assumes bus mastership; at this time it activates BUS-BUSY.

⚫ The timing diagram in the figure shows the sequence of events for the devices

Distributed Arbitration

⚫ In distributed arbitration all devices participate in the selection of next bus master.

⚫ A simple method for distributed arbitration is shown in the figure

⚫ Each device on the bus is assigned a 4-bit identification number.

⚫ When one or more devices request the bus, they assert the start-arbitration signal and
place their 4-bit identification numbers on four lines, ARB0 through ARB3.


⚫ A winner is selected as a result of the interaction among the signals transmitted over
these lines by all contenders.

⚫ Because the line drivers are of the open-collector type, if one device puts a 1 on a bus line
and another device puts a 0 on the same line, the line carries a 1; the arbitration lines
therefore perform a logical OR of all the ID patterns placed on them.

⚫ Consider that two devices A and B having ID numbers 5 and 6 respectively are
requesting the use of bus.

⚫ Device A transmits the pattern 0101, and device B transmits the pattern 0110.

⚫ The code seen by both devices is 0111.

⚫ Each device compares the pattern on the arbitration lines to its own ID, starting from the
most significant bit.

⚫ If it detects a difference at any bit position , it disables its drivers at that bit position and
for all lower-order bits.

⚫ It does so by placing 0 at the input of these drivers.


⚫ In our example, device A detects the difference on line ARB1; hence it disables its
drivers on lines ARB1 and ARB0. This causes the pattern on the arbitration lines to change
to 0110, which means that device B has won the contention.
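The example can be checked with the small simulation below, which models the arbitration lines as the logical OR of the patterns driven onto them and lets each contender disable its drivers from the first bit position where it loses; this is a simplified single-pass model, not a description of any specific bus.

/* Simulation of the distributed-arbitration example (devices A = 0101, B = 0110). */
#include <stdio.h>

unsigned drive_pattern(unsigned id, unsigned line_state)
{
    unsigned out = 0;
    for (int bit = 3; bit >= 0; bit--) {            /* compare from the most significant bit   */
        unsigned mine = (id >> bit) & 1;
        unsigned line = (line_state >> bit) & 1;
        if (mine < line)                            /* difference detected: this device loses  */
            return out;                             /* drivers at this and lower bits disabled */
        out |= mine << bit;
    }
    return out;
}

int main(void)
{
    unsigned a = 0x5, b = 0x6;                      /* ID patterns 0101 and 0110            */
    unsigned lines = a | b;                         /* first pass: both drive, lines = 0111 */
    lines = drive_pattern(a, lines) | drive_pattern(b, lines);
    printf("winning pattern: %X\n", lines);         /* prints 6, i.e. device B wins         */
    return 0;
}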

➢ Speed, Size and Cost

The memory hierarchy is as shown in the fig:


• Registers: The fastest access is to data held in processor registers; hence registers form the
top of the memory hierarchy. They offer the highest speed, the smallest size, and the highest
cost per bit.
• At the next level of the hierarchy, a relatively small amount of memory can be implemented
directly on the processor chip. This memory is called the processor cache. It holds copies of data
and instructions.
• There are two levels of cache: level-1 and level-2. The level-1 cache is part of the processor,
and the level-2 cache is placed between the level-1 cache and main memory.
• The level-2 cache is implemented using SRAM chips.
• The next level in the memory hierarchy is the main memory, implemented using
dynamic memory (DRAM) components. The main memory is larger but slower than cache memory;
its access time is about ten times longer than that of the cache.


• The next level in the memory hierarchy is secondary memory, which holds a huge
amount of data.

➢ Cache Memories
• Cache memory is a small, fast memory located between the processor and main memory, as
shown in the figure. It is designed to reduce the effective memory access time.

• The cache memory holds copies of data and instructions.


• The processor can read data and instructions from the cache memory in less time than from
main memory.
• Hence, by incorporating a cache memory between the processor and main memory, it is
possible to enhance the performance of the system.

• Many instructions in localized areas of the program are executed repeatedly during some
time period, and the remainder of the program is accessed relatively infrequently. This is
referred to as Locality of Reference.
• The memory control circuitry is designed to take advantage of the property of locality of
Reference.
• The Temporal aspect of the locality of Reference suggests that whenever an information
item is first needed this item should be brought into the cache where it will hopefully
remain until it is needed again.
• The Spatial aspect suggests that instead of fetching just one item from the main memory to
the cache, it is useful to fetch several items that reside at adjacent addresses as well. We
will use the term block to refer to a set of contiguous address locations of some size.


• The processor does not need to know explicitly about the existence of the cache.
• The cache control circuitry determines whether the requested word currently exists in the
cache.
• If it does, the Read or Write operation is performed on the appropriate cache location. This
is referred to as a Read or Write hit.
• In a Read operation the main memory is not involved. For a write operation the system can
proceed in 2 ways.
• In the first technique, called the write-through protocol, the cache location and the main
memory location are updated simultaneously.
• The second technique is to update only the cache location and to mark it as updated with
an associated flag bit, often called the dirty or modified bit. The main memory location of
the word is updated later, when the block containing this marked word is to be removed
from the cache to make room for a new block. This technique is known as the write-back
or copy-back protocol.
• The write back protocol may also result in unnecessary write operations because when a
cache block is written back to the memory all words of the block are written back even if
only a single word has been changed while the block was in the cache.
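The difference between the two policies on a write hit can be summarized in the hedged sketch below; the cache-line layout and the memory_write helper are illustrative assumptions.

/* Sketch contrasting write-through and write-back behaviour on a write hit. */
#include <stdint.h>
#include <stdbool.h>

struct cache_line {
    uint32_t tag;
    uint32_t data[16];      /* 16-word block, as in the mapping examples below */
    bool     valid;
    bool     dirty;         /* the dirty/modified bit used by write-back       */
};

extern void memory_write(uint32_t addr, uint32_t word);   /* hypothetical helper */

void write_hit_write_through(struct cache_line *line, uint32_t addr, uint32_t word)
{
    line->data[addr & 0xF] = word;    /* update the cache location ...                       */
    memory_write(addr, word);         /* ... and the main memory location simultaneously     */
}

void write_hit_write_back(struct cache_line *line, uint32_t addr, uint32_t word)
{
    line->data[addr & 0xF] = word;    /* update only the cache location                      */
    line->dirty = true;               /* memory is updated later, when the block is evicted  */
}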
• When the addressed word in a Read operation is not in the cache, a Read miss occurs. The
block of words that contains the requested word is copied from the main memory into the
cache. After the entire block is loaded into the cache, the particular word requested is
forwarded to the processor.


Mapping functions
• There are three techniques to map main memory blocks into cache memory:
1. Direct-mapped cache
2. Associative-mapped cache
3. Set-associative-mapped cache

Direct mapped cache

• The simplest way to determine cache locations in which to store memory blocks is the
direct mapping technique as shown in the figure.
• Blocks 0, 128, and 256 of main memory are mapped into cache block 0.
Similarly, blocks 1, 129, and 257 of main memory are loaded into cache
block 1.
• Note that contention may arise when more than one memory block maps to the same
cache block, even when the cache is not full.


• A main memory block is loaded into a cache block according to its memory address. The
main memory address consists of three fields, as shown in the figure.
• Each block consists of 16 words. Hence the least significant 4 bits are used to select one of the
16 words.
• The next 7 bits of the memory address specify the cache block position. The most significant
5 bits of the memory address are stored as the tag bits. The tag identifies which of the
2^5 = 32 main memory blocks that map to this cache position is currently resident.
• The higher order 5 bits of memory address are compared with the tag bits. If they match,
then the desired word is in that block of the cache.
• If there is no match, then the block containing the required word must first be read from
the main memory and loaded into the cache. It is very easy to implement, but not flexible.
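The address split described above can be expressed directly in C; the sketch below assumes the 16-bit address and the 5/7/4 field widths used in this example.

/* Field extraction for the direct-mapped cache example (tag 5, block 7, word 4 bits). */
#include <stdint.h>

struct dm_fields { unsigned tag, block, word; };

struct dm_fields direct_map(uint16_t addr)
{
    struct dm_fields f;
    f.word  =  addr        & 0x0F;    /* low 4 bits: one of the 16 words in a block   */
    f.block = (addr >> 4)  & 0x7F;    /* next 7 bits: cache block position 0..127     */
    f.tag   = (addr >> 11) & 0x1F;    /* top 5 bits: which of the 32 candidate blocks */
    return f;
}
/* Memory blocks 0, 128 and 256 all yield block field 0 and differ only in the tag,
   so they contend for the same cache block, as noted above. */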

2. Associative Mapping
• It is also called the associative-mapped cache. It is much more flexible.
• In this technique, a main memory block can be placed into any cache block position.
• In this case, 12 tag bits are required to identify a memory block when it is resident in the
cache memory.
• The Associative Mapping technique is illustrated as shown in the fig.


• In this technique, the 12 tag bits of the address generated by the processor are compared with
the tag bits of each block of the cache to see if the desired block is present. This is called the
associative-mapping technique.
• It gives more flexibility to choose the cache location in which to place the memory block.

3. Set Associative Mapping


• It is the combination of direct and associative mapping techniques.
• The blocks of the cache are divided into several groups; such groups are called sets.
• Each set consists of two cache blocks. A memory block is loaded into one of the cache
sets.
• The main memory address consists of three fields, as shown in the figure.
• The lower 4 bits of the memory address are used to select one of the 16 words in a block.

• The cache consists of 64 sets, as shown in the figure. Hence a 6-bit set field is used to select
one of the 64 cache sets.
• The tag field (6 bits) of the memory address is compared with the tags of the blocks in the
selected set to determine whether the desired memory block is present.
• In this case, memory blocks 0, 64, 128, ..., 4032 map into cache set 0, and each can occupy
either of the two block positions within that set.
• The following figure clearly describes the working principle of Set Associative Mapping
technique.
• Hence the contention problem of the direct method is eased by having a few choices for
block placement. At the same time, the hardware cost is reduced by decreasing the size of
the associative search.
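For comparison, the address split for this two-way set-associative example can be sketched the same way, assuming the 6/6/4 field widths described above.

/* Field extraction for the set-associative cache example (tag 6, set 6, word 4 bits). */
#include <stdint.h>

struct sa_fields { unsigned tag, set, word; };

struct sa_fields set_assoc_map(uint16_t addr)
{
    struct sa_fields f;
    f.word =  addr        & 0x0F;     /* low 4 bits: one of the 16 words in a block       */
    f.set  = (addr >> 4)  & 0x3F;     /* next 6 bits: one of the 64 sets                  */
    f.tag  = (addr >> 10) & 0x3F;     /* top 6 bits: compared with both blocks of the set */
    return f;
}
/* Memory blocks 0, 64, 128, ..., 4032 all yield set field 0, so each may occupy
   either of the two block positions in set 0, as stated above. */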
