Assignment
Computer Architecture
Submitted to:
Mam Arousa
Submitted by:
Abdullah Amir
Section:
Roll Number:
2023-BSCS-151
Department of Computer Science
MNS University of Engineering and Technology, Multan
What is Cache Design?
1. Introduction
A cache is a small but very fast type of memory placed between the CPU and main memory
(RAM).
Why we need it: CPUs are extremely fast, but RAM is relatively slow. Without
cache, the CPU would spend a lot of time waiting for data to arrive.
Analogy: Imagine you are working at your desk. Instead of walking to the library
(RAM) for every book, you keep the books you use most often right on your desk
(cache).
When the CPU needs data, one of two cases occurs:
Cache Hit: The data is found in the cache → fast access.
Cache Miss: The data is not in the cache → the CPU fetches it from RAM (slow).
Goal of cache design: maximize hits, minimize misses, and keep cost reasonable.
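The hit/miss idea above can be sketched with a toy cache (a minimal sketch; the 4-entry capacity, the arbitrary eviction, and the access sequence are made up for illustration):

```python
# Toy cache: a set of addresses currently held; capacity of 4 is illustrative.
cache = set()
CAPACITY = 4

def access(addr):
    """Return 'hit' if addr is cached, else load it and return 'miss'."""
    if addr in cache:
        return "hit"
    if len(cache) >= CAPACITY:
        cache.pop()          # evict an arbitrary block (real caches use a policy)
    cache.add(addr)
    return "miss"

results = [access(a) for a in [1, 2, 1, 3, 1]]
print(results)  # first access to each address misses; repeats hit
```

Note how every first touch of an address is a miss (a "cold miss"), while repeated touches hit as long as the block has not been evicted.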
2. Principles of Cache Design
a) Locality of Reference
Caches rely on the fact that programs don’t use memory randomly, but in predictable ways:
Temporal locality: Recently used data or instructions will likely be reused soon.
Spatial locality: Data stored close together in memory is often used together.
Example:
If a loop is running, the same instructions are fetched repeatedly (temporal). Also, when you
access an array, you often access nearby elements (spatial).
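The loop-and-array example can be made concrete (a minimal sketch; the array size is arbitrary, and Python lists only approximate the contiguous layout a C array would have):

```python
# Summing an array: the loop body exhibits both kinds of locality.
data = list(range(8))

total = 0
for i in range(len(data)):   # temporal locality: the same loop instructions run repeatedly
    total += data[i]         # spatial locality: data[0], data[1], ... sit next to each other
print(total)                 # 0 + 1 + ... + 7 = 28
```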
b) Mapping Techniques
Mapping decides where a block of main memory will be placed in the cache.
1. Direct Mapping
o Each memory block maps to exactly one cache line (block number mod number of lines).
o Example: In a cache with 4 lines, block 5 always goes to line 1 (5 mod 4 = 1).
o Simple and fast, but causes conflict misses if multiple blocks compete for the
same line.
2. Fully Associative Mapping
o Any memory block can go to any cache line.
o Very flexible, avoids conflict misses, but requires more complex hardware to
search the entire cache.
3. Set Associative Mapping
o A compromise. Cache is divided into sets (like small groups). Each block
maps to a particular set, but can go into any line within that set.
o Example: 4-way set associative means each set has 4 possible spots.
o Balances speed and flexibility.
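The three mappings differ in which lines a block is allowed to occupy. This sketch shows how a block number is translated under each scheme (the 8-line, 2-way geometry is assumed for illustration):

```python
NUM_LINES = 8                 # assumed cache geometry (illustrative)
WAYS = 2                      # 2-way set associative
NUM_SETS = NUM_LINES // WAYS  # 4 sets

def direct_mapped_line(block):
    # Exactly one legal line: block number modulo the number of lines.
    return block % NUM_LINES

def set_associative_set(block):
    # One legal set; any of the WAYS lines inside that set may hold the block.
    return block % NUM_SETS

def fully_associative_lines(block):
    # Any line is legal; the hardware must search all of them.
    return list(range(NUM_LINES))

print(direct_mapped_line(13))    # 13 mod 8 = 5
print(set_associative_set(13))   # 13 mod 4 = 1
```

Direct mapping is the special case of set-associative mapping with 1 way, and fully associative is the special case with a single set.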
c) Replacement Policies
If the cache is full and new data must be stored, one block has to be replaced.
LRU (Least Recently Used): Replace the block not used for the longest time.
FIFO (First In First Out): Replace the block that was loaded first.
Random: Replace any block randomly.
LFU (Least Frequently Used): Replace the block that is used least often.
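LRU, the most widely used of these policies, can be sketched with an ordered dictionary (a minimal sketch; the capacity of 2 and the access sequence are chosen for illustration):

```python
from collections import OrderedDict

class LRUCache:
    """Tracks recency with an OrderedDict; the least recently used entry is evicted first."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()

    def access(self, block):
        if block in self.entries:
            self.entries.move_to_end(block)   # mark as most recently used
            return "hit"
        if len(self.entries) >= self.capacity:
            self.entries.popitem(last=False)  # evict the least recently used block
        self.entries[block] = True
        return "miss"

c = LRUCache(2)
print([c.access(b) for b in [1, 2, 1, 3, 2]])
# accessing 1 again keeps it alive, so 3 evicts 2, and 2 later evicts 1
```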
d) Write Policies
When the CPU writes data, should it update RAM immediately? Two main choices:
1. Write-through
o Every write to the cache is also written to RAM immediately.
o Keeps cache and RAM consistent, but every write pays the cost of a RAM access.
2. Write-back
o Write only to cache. RAM is updated later when the block is replaced.
o Faster, but needs extra logic to keep track of changes (a "dirty bit").
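The dirty-bit idea behind write-back can be sketched as follows (a minimal model; real hardware keeps one dirty bit per cache line, and the addresses and values here are made up):

```python
# Write-back sketch: RAM is only updated when a dirty block is evicted.
ram = {0: 10, 1: 20}     # illustrative main memory contents
cache = {}               # block -> (value, dirty_bit)

def write(block, value):
    cache[block] = (value, True)   # write to cache only; mark the block dirty

def evict(block):
    value, dirty = cache.pop(block)
    if dirty:
        ram[block] = value         # write back only if the block was modified

write(0, 99)
print(ram[0])   # still 10: RAM has not been updated yet
evict(0)
print(ram[0])   # 99: the dirty block was written back on eviction
```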
3. Issues in Cache Design
When designing a cache, engineers must answer:
1. Cache Size
o Larger caches hold more data → fewer misses.
o But larger caches are slower, costlier, and use more power.
o Example: L1 cache is very small (e.g., 32 KB) but extremely fast, while L3 can
be several MB but slower.
2. Block Size
o Small blocks → less wasted space, but more misses (they exploit less spatial
locality).
o Large blocks → better spatial locality, but risk bringing in unused data (called
cache pollution).
3. Associativity
o Direct mapped is fast but prone to conflicts.
o Fully associative avoids conflicts but is expensive.
o Set associative is the most practical.
4. Miss Penalty
o A miss costs time because the CPU waits for data from RAM.
o To reduce this: use multi-level caches (L1, L2, L3).
L1: Smallest, fastest, closest to CPU.
L2: Larger, slower than L1.
L3: Even larger, shared across cores, slower but still faster than RAM.
5. Power Consumption
o Large caches drain more power, which is critical in smartphones and laptops.
Designers must balance speed vs. battery life.
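The miss-penalty tradeoff in point 4 is often summarized as average memory access time: AMAT = hit time + miss rate × miss penalty. A sketch, with all numbers assumed for illustration:

```python
def amat(hit_time, miss_rate, miss_penalty):
    """Average memory access time in cycles: hit_time + miss_rate * miss_penalty."""
    return hit_time + miss_rate * miss_penalty

# Illustrative numbers: 1-cycle L1 hit, 5% miss rate, 100-cycle trip to RAM.
print(amat(1, 0.05, 100))  # 1 + 0.05 * 100 = 6.0 cycles on average
```

The same formula can be applied recursively: an L1 miss penalty is itself the AMAT of the L2 cache, which is why multi-level caches reduce the effective penalty.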
4. Cache Algorithms (Replacement)
1. FIFO (First In First Out)
Removes the oldest block in the cache.
Simple, but may evict a block that’s still useful.
2. LRU (Least Recently Used)
Removes the block that hasn’t been used for the longest time.
Matches temporal locality well.
Needs extra hardware to track usage order.
3. LFU (Least Frequently Used)
Removes the block used least often.
Works well for stable workloads, but not for short-term bursts.
4. Random
Removes a block at random.
Very simple, sometimes works almost as well as LRU.
5. Optimal (Theoretical)
Removes the block that won’t be used for the longest time in the future.
Perfect, but impossible to implement in real hardware (since we can’t predict the
future).
Used only for research comparison.
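A sketch comparing FIFO, LRU, and the optimal policy on one reference string (the capacity of 3 and the string itself are made up; the optimal policy "peeks" at future references, which real hardware cannot do):

```python
def simulate(policy, refs, capacity):
    """Count misses for a replacement policy over a reference string."""
    cache, misses = [], 0
    for i, block in enumerate(refs):
        if block in cache:
            if policy == "lru":
                cache.remove(block)
                cache.append(block)   # move to the back: most recently used
            continue
        misses += 1
        if len(cache) >= capacity:
            if policy == "optimal":
                # Evict the block whose next use lies farthest in the future.
                future = refs[i + 1:]
                victim = max(cache, key=lambda b: future.index(b)
                             if b in future else len(future) + 1)
                cache.remove(victim)
            else:                     # FIFO and LRU both evict the front of the list
                cache.pop(0)
        cache.append(block)
    return misses

refs = [1, 2, 3, 1, 2, 4, 1, 2]
for policy in ("fifo", "lru", "optimal"):
    print(policy, simulate(policy, refs, 3))  # fifo 6, lru 4, optimal 4
```

On this string, FIFO evicts block 1 just before it is needed again, while LRU matches the optimal policy; on other strings the ranking can differ.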
Summary:
Cache design is about creating a "shortcut memory" that keeps the CPU busy instead of
waiting on RAM. The designer has to decide:
How big the cache should be
How memory maps into the cache
What to replace when it’s full
How to handle writes
The tradeoff is always between speed, cost, and complexity.