
Chapter 4

Cache Memory
Cache memory
 Cache memory is a small amount of fast memory
 Placed between two levels of memory hierarchy
 Between processor and main memory (our focus)
 Between main memory and disk (disk cache)
 Expected to behave like a large amount of fast memory

Characteristics of cache memory


• Location
• Capacity
• Unit of transfer
• Access method
• Performance
• Physical type
• Physical characteristics
• Organisation
Location
• CPU (processor memory): memory such as registers that is included
within the processor
• Internal: often termed main memory; it resides within the
computer and is directly accessible to the processor
• External: peripheral storage devices such as disks and magnetic
tapes that are accessible to the processor via I/O controllers
Capacity
• Word size
—The natural unit of organisation
• Number of words
—or Bytes
Unit of Transfer
• Internal
—Usually governed by data bus width
—For internal memory, the unit of transfer is
equal to the number of data lines into and out
of the memory module
• External
—Usually a block which is much larger than a
word
• Addressable unit
—Smallest location which can be uniquely
addressed
—Word internally
Access Methods
• Sequential access
—Start at the beginning and read through in
order
—Access time depends on location of data and
previous location
—e.g. tape
• Direct access
—Individual blocks have unique address
—Access is by jumping to vicinity plus
sequential search
—Access time depends on location and previous
location
—e.g. disk
Cont…
• Random access
—Individual addresses identify locations exactly
—Access time is independent of location
or previous access
—e.g. RAM
• Associative access
—Data is located by a comparison with contents of
a portion of the store
—Access time is independent of location or previous
access
—e.g. cache
Memory Hierarchy
• Registers
— In CPU
• Internal or Main memory
— May include one or more levels of cache
— “RAM”
• External memory
— Backing store
• The memory hierarchy system consists of all storage devices
employed in a computer system, from the slow but high-capacity
auxiliary memory, to a relatively faster main memory, to an even
smaller and faster cache memory.
• Capacity, cost and speed of different types of memory play a vital
role while designing a memory system for computers.
• If the memory has a larger capacity, more applications will have
space to run smoothly. It is better to use the fastest memory
possible to achieve greater performance.
Memory Hierarchy - Diagram
Cont….
• Faster access time, greater cost per bit
• Greater capacity, smaller cost per bit
• Greater capacity, slower access time
 As one goes down the hierarchy, the
following occur:
A. Decreasing cost per bit
B. Increasing capacity
C. Increasing access time
D. Decreasing frequency of access of the
memory by the processor
Performance
• Access time
—Time between presenting the address and
getting the valid data
• Memory Cycle time
—Time may be required for the memory to
“recover” before next access
—Cycle time = access time + recovery time
• Transfer Rate
—Rate at which data can be moved
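
As a simple illustration with assumed figures: if the access time is 60 ns and the recovery time is 20 ns, the cycle time is 80 ns; moving one 4-byte word every 80 ns then corresponds to a transfer rate of 4 B / 80 ns = 50 MB/s.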
Physical Types
• Semiconductor
— RAM
• Magnetic
— Disk & Tape
• Optical
— CD & DVD
• Flash or solid state chip based drives
— USB drives, SD cards
• Others
— Bubble memory is a type of non-volatile computer memory that uses a thin film of
magnetic material to hold small magnetized areas.
— Holographic storage is computer storage that uses lasers to store
computer-generated data in three dimensions.
Physical Characteristics
• Decay: stored information decays over time,
resulting in data loss.
• Volatility: information is lost when
electrical power is switched off.
• Erasable: whether the memory contents can be
erased and rewritten.
• Power consumption: how much power the memory
consumes.
Organisation
• Physical arrangement of bits into words
• Not always obvious
• e.g. interleaved
• Interleaved memory is a design that
compensates for the relatively slow speed
of dynamic random-access memory
(DRAM) or core memory by spreading
memory addresses evenly across memory
banks (see the sketch below).
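
A minimal sketch of the idea in Python, assuming a hypothetical system with 4 banks and low-order interleaving (the bank is chosen by the address modulo the number of banks); the bank count and addresses are made up for illustration:

```python
# Low-order interleaving: consecutive addresses map to different banks,
# so sequential accesses can proceed in parallel across banks.
NUM_BANKS = 4  # assumed figure for illustration

def bank_and_offset(address: int) -> tuple[int, int]:
    """Return (bank index, offset within that bank) for a word address."""
    return address % NUM_BANKS, address // NUM_BANKS

# Consecutive addresses 0..7 spread evenly over banks 0..3
for addr in range(8):
    bank, offset = bank_and_offset(addr)
    print(f"address {addr} -> bank {bank}, offset {offset}")
```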
Hierarchy List
• Registers: small storage locations used by the
CPU to store instructions and data.
• L1 cache, or primary cache, is extremely fast but relatively small,
and is usually embedded in the processor chip as CPU cache.
• L2 cache, or secondary cache, is often more capacious than L1.
L2 cache may be embedded on the CPU, or it can be on a
separate chip or coprocessor and have a high-speed alternative
system bus connecting the cache and CPU. That way it doesn't
get slowed by traffic on the main system bus.

Level 3 cache is specialized memory developed to improve the


performance of L1 and L2. L1 or L2 can be significantly faster than
L3, though L3 is usually double the speed of DRAM. With multicore
processors, each core can have dedicated L1 and L2 cache, but
they can share an L3 cache. If an L3 cache references an
instruction, it is usually elevated to a higher level of cache.
Hierarchy List
• Main memory
• The main memory is the central storage unit of the computer system. It is a relatively
large and fast memory used to store programs and data during computer operation.
These memories employ semiconductor integrated circuits. The basic element of
semiconductor memory is the memory cell.

• Disk cache
• Cache memory can also be found on the hard drive; this is called the “disk
cache”. It is the slowest of all cache levels because the data has to be loaded from
the disk into memory.

• Disk
• A magnetic disk is a circular plate constructed of metal or plastic and coated with
magnetic material. Often both sides of the disk are used, and several disks may be
stacked on one spindle, with a read/write head available for each surface.
• Optical
• The huge commercial success of CD enabled the development of low cost optical
disk storage technology that has revolutionized computer data storage.
• Tape
• A magnetic tape is a strip of plastic coated with a magnetic recording medium.
Data can be recorded and read as a sequence of characters through a read/write
head.
Cache and Main Memory
 The cache is a smaller, faster memory which stores
copies of the data from the most frequently used main
memory locations. Sits between normal main memory
and CPU.
 May be located on CPU chip or module.
Cache/Main Memory Structure
Cache operation – overview
• CPU requests contents of memory location
• Check cache for this data
• If present, get from cache (fast)
• If not present, read required block from
main memory to cache
• Then deliver from cache to CPU
• Cache includes tags to identify which
block of main memory is in each cache
slot
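
A minimal sketch of this read flow, assuming a toy cache modelled as a Python dictionary keyed by block number (so the tag check reduces to a key lookup) and main memory modelled as another dictionary of blocks:

```python
# Simplified cache read: check the cache first, fall back to main memory
# on a miss and keep a copy of the fetched block in the cache.
main_memory = {block: f"data-{block}" for block in range(1024)}  # toy backing store
cache = {}  # block number -> block contents (tags are implicit in the keys)

def read(block: int) -> str:
    if block in cache:              # hit: deliver directly from the cache
        return cache[block]
    data = main_memory[block]       # miss: read the block from main memory
    cache[block] = data             # ...place it in the cache...
    return data                     # ...then deliver it to the CPU

print(read(5))   # first access: miss, fetched from main memory
print(read(5))   # second access: hit, served from the cache
```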
Cache Read Operation - Flowchart
Typical Cache Organization
Cont…
• In this organization, the cache connects to
the processor via data, control, and
address lines.
• The data and address lines also attach
to data and address buffers, which attach
to a system bus from which main memory
is reached.
• When a cache hit occurs, the data and
address buffers are disabled and
communication is only between processor
and cache, with no system bus traffic.
• When a cache miss occurs, the desired address is loaded onto
the system bus and the data are returned through the data
buffer to both the cache and the processor.
• In other organizations, the cache is physically interposed
between the processor and the main memory for all data,
address, and control lines.
• In this latter case, for a cache miss, the desired word is first
read into the cache and then transferred from cache to
processor.
How does the CPU cache work?
• The CPU needs to access ultra-fast memory to get the most out
of its performance. The CPU cache works with this memory and
the CPU first looks into the cache memory when it wants to
access data.
• If the data is found in the cache memory, it is called
a hit; if it is not there, it is called a miss, and the
CPU then searches the main memory.
Cache Design
• Addressing
• Size
• Mapping Function
• Replacement Algorithm
• Write Policy
• Block Size
• Number of Caches
Cache Addressing
• Where does cache sit? 2 types:
— Between processor and virtual memory management
unit
— Between MMU and main memory
• Logical cache (virtual cache) stores data using
virtual addresses
— Processor accesses cache directly, without going through
the MMU
— Cache access faster, before MMU address translation
— Virtual addresses use same address space for different
applications
– Must flush cache on each context switch
• Physical cache stores data using main memory
physical addresses
Comparison of Cache Sizes
Processor        Type                            Year of Introduction   L1 cache        L2 cache         L3 cache
IBM 360/85       Mainframe                       1968                   16 to 32 KB     —                —
PDP-11/70        Minicomputer                    1975                   1 KB            —                —
VAX 11/780       Minicomputer                    1978                   16 KB           —                —
IBM 3033         Mainframe                       1978                   64 KB           —                —
IBM 3090         Mainframe                       1985                   128 to 256 KB   —                —
Intel 80486      PC                              1989                   8 KB            —                —
Pentium          PC                              1993                   8 KB/8 KB       256 to 512 KB    —
PowerPC 601      PC                              1993                   32 KB           —                —
PowerPC 620      PC                              1996                   32 KB/32 KB     —                —
PowerPC G4       PC/server                       1999                   32 KB/32 KB     256 KB to 1 MB   2 MB
IBM S/390 G4     Mainframe                       1997                   32 KB           256 KB           2 MB
IBM S/390 G6     Mainframe                       1999                   256 KB          8 MB             —
Pentium 4        PC/server                       2000                   8 KB/8 KB       256 KB           —
IBM SP           High-end server/supercomputer   2000                   64 KB/32 KB     8 MB             —
CRAY MTAb        Supercomputer                   2000                   8 KB            2 MB             —
Itanium          PC/server                       2001                   16 KB/16 KB     96 KB            4 MB
SGI Origin 2001  High-end server                 2001                   32 KB/32 KB     4 MB             —
Itanium 2        PC/server                       2002                   32 KB           256 KB           6 MB
IBM POWER5       High-end server                 2003                   64 KB           1.9 MB           36 MB
CRAY XD-1        Supercomputer                   2004                   64 KB/64 KB     1 MB             —
Mapping Function
• Determines how memory blocks are mapped to cache lines
• The effectiveness of the cache mechanism is based on a
property of computer programs called locality of reference.
• The process/technique of bringing data of main memory
blocks into the cache is termed cache mapping
 Three types
• Direct mapping
  » Specifies a single cache line for each memory block
• Set-associative mapping
  » Specifies a set of cache lines for each memory block
• Associative mapping
  » No restrictions – any cache line can be used for any memory block
Direct Mapping
• Each block of main memory maps to only
one cache line
—i.e. if a block is in cache, it must be in one
specific place
• Address is in two parts
• Least Significant w bits identify unique
word
• Most Significant s bits specify one
memory block
• The MSBs are split into a cache line field r
and a tag of s-r (most significant)
Direct Mapping
Address Structure

Tag (s–r)    Line or Slot (r)    Word (w)
  8 bits         14 bits           2 bits

• 24 bit address
• 2 bit word identifier (4 byte block)
• 22 bit block identifier
— 8 bit tag (=22-14)
— 14 bit slot or line
• No two blocks in the same line have the same Tag field
• Check contents of cache by finding line and checking Tag
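
A short sketch of how the 24-bit address in this example splits into the tag, line and word fields; the 8/14/2 field widths come from the slide above, while the sample address is arbitrary:

```python
# Decompose a 24-bit address for a direct-mapped cache: 8-bit tag,
# 14-bit line number, 2-bit word offset within the 4-byte block.
TAG_BITS, LINE_BITS, WORD_BITS = 8, 14, 2

def direct_map_fields(address: int) -> tuple[int, int, int]:
    word = address & ((1 << WORD_BITS) - 1)
    line = (address >> WORD_BITS) & ((1 << LINE_BITS) - 1)
    tag = (address >> (WORD_BITS + LINE_BITS)) & ((1 << TAG_BITS) - 1)
    return tag, line, word

tag, line, word = direct_map_fields(0x16339C)   # arbitrary 24-bit address
print(f"tag={tag:#04x} line={line} word={word}")
```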
Direct Mapping from Cache to Main Memory
Direct Mapping Cache Organization
Direct Mapping Summary

Direct Mapping advantages &
disadvantages
 advantages
• Simple method
• Inexpensive
 disadvantages
• Fixed location for given block
—If a program accesses 2 blocks that map to
the same line repeatedly, cache misses are
very high
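
A small sketch of that worst case, using the same assumed 8/14/2 address split: two blocks whose line fields match but whose tags differ evict each other on every access:

```python
# Two blocks whose line fields match but whose tags differ compete for the
# same line in a direct-mapped cache (assumed 8/14/2 bit address split).
LINE_BITS, WORD_BITS = 14, 2

def line_of(addr: int) -> int:
    return (addr >> WORD_BITS) & ((1 << LINE_BITS) - 1)

def tag_of(addr: int) -> int:
    return addr >> (WORD_BITS + LINE_BITS)

a, b = 0x001234, 0x551234          # same low 16 bits -> same line, different tags
assert line_of(a) == line_of(b) and tag_of(a) != tag_of(b)

cache_line = {}                     # line number -> tag currently stored there
misses = 0
for addr in [a, b] * 4:             # alternate repeatedly between the two blocks
    line, tag = line_of(addr), tag_of(addr)
    if cache_line.get(line) != tag: # conflict miss: the other block evicted us
        misses += 1
        cache_line[line] = tag
print(misses)                       # 8 accesses, 8 misses
```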
Associative Mapping
• A main memory block can load into any line
of cache.
• Memory address is interpreted as tag and
word
• Tag uniquely identifies block of memory.
• Every line’s tag is examined for a match
• Cache searching gets expensive
Associative Mapping from
Cache to Main Memory
Fully Associative Cache Organization
Associative Mapping
Address Structure

Tag (22 bits)    Word (2 bits)
• 22 bit tag stored with each 32 bit block of data
• Compare tag field with tag entry in cache to
check for hit
• Least significant 2 bits of address identify which
byte is required from the 32 bit data block (see the sketch below)
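
A similar sketch for the fully associative case, again assuming the 24-bit address of this example: only the 2-bit word field is stripped off, and the remaining 22-bit tag must be compared against the tag of every cache line (here a toy dictionary stands in for the parallel tag comparison):

```python
# Fully associative: 22-bit tag, 2-bit word offset. The tag must be compared
# with the tag of every cache line; a dictionary lookup stands in for that.
WORD_BITS, TAG_BITS = 2, 22

def assoc_fields(address: int) -> tuple[int, int]:
    tag = (address >> WORD_BITS) & ((1 << TAG_BITS) - 1)
    word = address & ((1 << WORD_BITS) - 1)
    return tag, word

cache = {assoc_fields(0x2ABCDE)[0]: "some block"}   # toy cache: tag -> block data

for address in (0x2ABCDE, 0x123456):
    tag, word = assoc_fields(address)
    hit = tag in cache                               # any line may hold the block
    print(f"address={address:#08x} tag={tag:#x} word={word} hit={hit}")
```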
Associative Mapping Summary

Set Associative Mapping
• Cache is divided into a number of sets
• Each set contains a number of lines
• A given block maps to any line in a given
set
—e.g. Block B can be in any line of set i
• e.g. 2 lines per set
—2 way associative mapping
—A given block can be in one of 2 lines in only
one set
Mapping From Main Memory to Cache:
v Associative
K-Way Set Associative Cache
Organization
Set Associative Mapping
Address Structure

Tag (9 bits)    Set (13 bits)    Word (2 bits)

• Use set field to determine cache set to look in
• Compare tag field to see if we have a hit
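
A sketch of the set-associative decomposition in this example (9-bit tag, 13-bit set, 2-bit word of a 24-bit address); the toy cache below only shows the field extraction and the set lookup:

```python
# 2-way set associative: the 13-bit set field selects one set; within that
# set the 9-bit tag is compared against at most 2 resident lines.
TAG_BITS, SET_BITS, WORD_BITS = 9, 13, 2

def set_assoc_fields(address: int) -> tuple[int, int, int]:
    word = address & ((1 << WORD_BITS) - 1)
    set_index = (address >> WORD_BITS) & ((1 << SET_BITS) - 1)
    tag = (address >> (WORD_BITS + SET_BITS)) & ((1 << TAG_BITS) - 1)
    return tag, set_index, word

# toy cache: set index -> {tag: block}, holding at most 2 entries per set
cache = {i: {} for i in range(1 << SET_BITS)}

tag, set_index, word = set_assoc_fields(0x16339C)   # arbitrary 24-bit address
cache[set_index][tag] = "block data"                # install the block
print(f"tag={tag} set={set_index} word={word} hit={tag in cache[set_index]}")
```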
Set Associative Mapping Summary

Cont..
A cache's write policy is the behaviour of the
cache when performing a write operation.
The write policy plays a central role in
determining the characteristics exposed by
the cache.
 Replacement Algorithms (1)
Direct mapping
 Each block only maps to one line
 Replace that line
Replacement Algorithms (2)
Associative & Set Associative
• Hardware implemented algorithm (speed)
• Least Recently Used (LRU)
• e.g. in 2 way set associative
—Which of the 2 blocks is LRU? (a sketch follows this list)
• First in first out (FIFO)
—replace block that has been in cache longest
• Least frequently used
—replace block which has had fewest hits
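
A minimal sketch of LRU replacement for a single set of a 2-way set-associative cache, using Python's OrderedDict purely to track recency; this illustrates the policy, not the way hardware implements it:

```python
from collections import OrderedDict

WAYS = 2  # 2-way set associative: each set holds at most 2 blocks

class LRUSet:
    """One cache set with least-recently-used replacement."""
    def __init__(self) -> None:
        self.lines = OrderedDict()   # tag -> block, oldest first

    def access(self, tag: int, block: str) -> bool:
        if tag in self.lines:              # hit: mark as most recently used
            self.lines.move_to_end(tag)
            return True
        if len(self.lines) >= WAYS:        # miss with full set: evict LRU line
            self.lines.popitem(last=False)
        self.lines[tag] = block            # bring the new block in
        return False

s = LRUSet()
print(s.access(1, "A"))  # False (miss)
print(s.access(2, "B"))  # False (miss)
print(s.access(1, "A"))  # True  (hit; tag 1 becomes most recent)
print(s.access(3, "C"))  # False (miss; evicts tag 2, the LRU line)
```

FIFO differs only in that a hit would not move the block to the most-recently-used position, so the block that has been resident longest is always the one replaced.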