Caching
a. What is Caching?
Caching is a technique used in computing to temporarily store copies of data or
computations in a fast-access storage location, called a cache, so that future requests for
that data can be served faster.
✅ In Simple Terms:
Imagine you often check your favorite website. Instead of downloading everything every
time, your browser saves some parts (like images, styles) locally. Next time you visit, it loads
faster — that’s caching!
🔧 Key Points:
Purpose: Improve performance and reduce load time or server load.
Common Usage Areas:
Web Browsers: Cache static files (HTML, CSS, JS, images).
Databases: Cache query results.
CDNs: Cache resources closer to users.
Applications: Cache frequently used data (like API responses).
🧠 Example:
If you search something on Google, and then hit the same query again, Google might serve
the results from its cache instead of reprocessing everything from scratch.
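Here's a minimal sketch of the idea in Python (illustrative only; slow_lookup is a made-up stand-in for a database or network call):

```python
import time

cache = {}  # fast-access storage: key -> stored copy of the result

def slow_lookup(key):
    """Stands in for a slow source (database, disk, network)."""
    time.sleep(0.1)  # simulate latency
    return f"data-for-{key}"

def get(key):
    if key in cache:          # cache hit: serve the stored copy
        return cache[key]
    value = slow_lookup(key)  # cache miss: go to the slow source
    cache[key] = value        # keep a copy for future requests
    return value

get("profile:42")  # first call: ~100 ms, fetched and cached
get("profile:42")  # second call: near-instant, served from cache
```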
Let me know if you want to explore how caching works in web apps, APIs, or with tools like
Redis, localStorage, etc.
b. Locality of Reference
Locality of Reference is a concept in computer science that refers to the tendency of
programs to access the same set of memory locations repeatedly over a short period of time.
📚 Two Main Types:
1. Temporal Locality
👉 If a memory location is accessed, it is likely to be accessed again soon.
🧠 “If I used it recently, I’ll probably use it again.”
Example: Accessing the same variable in a loop.
2. Spatial Locality
👉 If a memory location is accessed, nearby memory locations are likely to be accessed
soon.
🧠 “If I used this, I might use its neighbors too.”
Example: Iterating through an array, accessing index i, then i+1, i+2, etc.
🌀 Why It Matters (especially in caching):
Caching systems are designed to take advantage of locality of reference:
Temporal locality helps with keeping recently used data in cache.
Spatial locality encourages preloading neighboring data into cache blocks.
🧠 Analogy:
Imagine you’re cooking. You used salt just now (temporal), and you’ll likely use pepper or
other spices nearby (spatial). So you keep them all on the counter — that’s caching with
locality!
Here's a code-level example as well.
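This minimal Python sketch (the array is an arbitrary stand-in) exercises both kinds of locality in one loop:

```python
data = list(range(1_000_000))  # a large array laid out contiguously in memory

total = 0
for i in range(len(data)):
    # Temporal locality: total and i are touched on every iteration,
    # so they stay in the fastest levels of the memory hierarchy.
    # Spatial locality: data[i] is followed by data[i+1], data[i+2], ...
    # hardware that loads whole cache blocks gets the neighbors "for free".
    total += data[i]
```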
c. Cache Hit, Cache Miss, Hit Rate, Miss Rate
These terms help describe how well a cache is performing.
⚡ Cache Hit
A cache hit happens when the data requested by the program is found in the cache.
✅ Fast response.
Example: Your browser loads an image from cache instead of downloading it again.
❌ Cache Miss
A cache miss occurs when the requested data is not in the cache, so it has to be fetched
from a slower source (like RAM, disk, or a server).
🚫 Slower performance.
Example: You open a website for the first time — nothing is cached yet, so everything loads
from scratch.
📈 Hit Rate
The hit rate is the percentage of requests that result in a cache hit.
Formula:
Hit Rate = (Number of Cache Hits / Total Cache Accesses) × 100%
📉 Miss Rate
The miss rate is the percentage of requests that result in a cache miss.
Formula:
Miss Rate = 100% − Hit Rate
🧠 Example:
Let’s say a system makes 1000 memory requests:
800 of them are cache hits
200 are cache misses
Then:
Hit Rate = (800 / 1000) × 100% = 80%
Miss Rate = 100% - 80% = 20%
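To see this in code, here's a tiny Python sketch (the keys and values are made up for illustration) that tracks hit and miss rates:

```python
cache, hits, misses = {}, 0, 0

def get(key):
    global hits, misses
    if key in cache:
        hits += 1                    # found in cache: hit
        return cache[key]
    misses += 1                      # not in cache: miss
    cache[key] = f"value-for-{key}"  # stand-in for a slow fetch
    return cache[key]

for key in [1, 2, 1, 3, 1, 2]:       # repeated keys produce hits
    get(key)

hit_rate = hits / (hits + misses) * 100  # 3 hits / 6 accesses = 50%
miss_rate = 100 - hit_rate
print(f"Hit rate: {hit_rate:.0f}%, Miss rate: {miss_rate:.0f}%")
```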
Let me know if you'd like to see how this applies in real-world systems like CPU caches or Redis!
d. Direct-Mapped Caching (How It Works, Tags and Indices) with Example
A direct-mapped cache is the simplest type of cache organization. In this approach, each block of main memory maps to exactly one cache line.
🔧 How It Works:
The cache is divided into lines (slots).
Each block in main memory maps to exactly one cache line using a formula.
If two blocks map to the same line, the new one replaces the old one — even if both are
used frequently.
🧮 Mapping Formula:
Cache Line Index = (Block Address) mod (Number of Cache Lines)
🏷 Tags and Indices:
Index: Tells which cache line to look at.
Tag: Stored along with the data to identify which memory block is currently in that line.
Data: The actual data fetched from memory.
When accessing:
1. Use the index to find the line.
2. Compare the tag stored in that line with the tag bits of the memory address.
If tags match ➝ Cache Hit
Else ➝ Cache Miss (data is loaded from memory and replaces current data in that
line)
🧠 Example:
Let’s say:
Cache has 4 lines
Memory has blocks: 0 to 15
We access memory block 6.
Step 1: Find Cache Line Index
6 mod 4 = 2 ⇒ Use cache line 2
Step 2: Find the Tag
Let's say the memory address is 6 (binary: 0110)
Cache index is last 2 bits: 10 → index 2
Remaining bits are the tag: 01
Now:
Go to cache line 2
If tag 01 is there → hit
Else → miss, and block 6 is loaded with tag 01
📊 Cache Line Table (simplified)
| Cache Line | Tag | Data (Block) |
|---|---|---|
| 0 | 00 | Block 0 |
| 1 | 01 | Block 5 |
| 2 | 01 | Block 6 |
| 3 | 00 | Block 3 |
🔁 Collision Example:
Block 6 and Block 10 both map to line 2:
6 mod 4 = 2, 10 mod 4 = 2
Accessing block 10 would replace block 6, causing a cache miss if block 6 is needed
again.
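Here's a minimal direct-mapped cache simulator in Python (4 lines, integer block addresses; a rough sketch of the mapping above, not real hardware):

```python
NUM_LINES = 4
lines = [None] * NUM_LINES  # each line stores the tag of the block it holds

def access(block_address):
    index = block_address % NUM_LINES  # which line to look at
    tag = block_address // NUM_LINES   # identifies the block within that line
    if lines[index] == tag:
        return "hit"
    lines[index] = tag                 # miss: load the block, evicting the old one
    return "miss"

print(access(6))   # miss: line 2 now holds tag 1 (binary 01)
print(access(6))   # hit
print(access(10))  # miss: 10 mod 4 = 2, so block 10 evicts block 6
print(access(6))   # miss again: the collision above replaced it
```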
Let me know if you'd like an illustration or want to compare this with associative or set-
associative caches!
f. Handling Read and Write Misses, Write-Back and Write-Through, Write-Allocate and No-Write-Allocate
When working with cache memory, especially in write operations, special techniques are
used to handle misses (when data isn't found in cache). These techniques help ensure data
consistency between cache and main memory.
🔄 Read Miss
Read miss: When the processor tries to read data not in the cache.
Solution: Fetch the block from main memory and load it into the cache.
✍️ Write Miss
Write miss: When the processor tries to write to a location not in the cache.
There are two main strategies to handle it:
🧮 1. Write Policies
🟢 a. Write-Through
✅ Every write to the cache also immediately updates main memory.
Keeps memory updated at all times.
Slower, but simple and consistent.
📌 Example:
Write X = 10
→ Update cache and main memory right away.
🟡 b. Write-Back
✅ Write only updates cache initially.
Memory is updated only when the block is replaced (evicted).
Needs a dirty bit to track modified blocks.
Faster (less frequent memory writes), but more complex.
📌 Example:
Write X = 10
→ Only cache is updated.
Later, if the block is evicted, write X = 10 to main memory.
🔧 2. Allocation Policies (on Write Miss)
🟢 a. Write-Allocate
On a write miss, the block is loaded into cache, and then written.
Works well with write-back.
Assumes the block might be used again soon.
📌 Example:
Miss on writing X = 10
→ Load X into cache, then write.
🔴 b. No-Write-Allocate (a.k.a. Write-No-Allocate)
On a write miss, the block is not brought into the cache.
Write is sent directly to main memory.
Works well with write-through.
📌 Example:
Miss on writing X = 10
→ Write directly to memory; do not cache X.
🧠 Summary Table
| Policy Type | Description | Good With |
|---|---|---|
| Write-Through | Write to both cache and memory | No-Write-Allocate |
| Write-Back | Write only to cache; memory updated on eviction | Write-Allocate |
| Write-Allocate | Load block into cache on write miss | Write-Back |
| No-Write-Allocate | Write directly to memory on write miss | Write-Through |
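Here's a rough Python sketch of the two write policies (all names are illustrative), using a dirty bit for write-back:

```python
memory = {}  # stands in for slow main memory
cache = {}   # address -> (value, dirty_bit)

def write_through(addr, value):
    cache[addr] = (value, False)  # update the cache...
    memory[addr] = value          # ...and main memory immediately

def write_back(addr, value):
    cache[addr] = (value, True)   # update the cache only; mark it dirty

def evict(addr):
    value, dirty = cache.pop(addr)
    if dirty:                     # write-back: flush to memory only on eviction
        memory[addr] = value

write_back(0x10, 42)
print(memory.get(0x10))  # None: memory not updated yet
evict(0x10)
print(memory.get(0x10))  # 42: the dirty block was flushed on eviction
```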
Let me know if you'd like a diagram to visualize these concepts better!
Differentiate between Write-Through and Write-Back
Here's a clear comparison between Write-Through and Write-Back caching policies:
🔁 Write-Through vs Write-Back
| Feature | Write-Through | Write-Back |
|---|---|---|
| Data Update | Writes data to both cache and main memory simultaneously. | Writes data only to cache; memory is updated later (on eviction). |
| Speed | Slower (more memory writes). | Faster (fewer memory writes). |
| Consistency | High: memory always has the latest data. | Lower: cache may be more up to date than memory. |
| Complexity | Simpler to implement. | More complex (needs tracking of changes via dirty bits). |
| Write Miss Handling | Often used with No-Write-Allocate policy. | Often used with Write-Allocate policy. |
| Memory Bandwidth | Higher bandwidth usage (frequent memory writes). | Lower bandwidth usage (writes occur only on eviction). |
| Reliability | More reliable in case of system crashes (data always in memory). | Riskier: if the system crashes, dirty cache data may be lost. |
🧠 Simple Analogy:
Write-Through: Like updating both your notebook and your cloud doc at the same time.
Write-Back: Like writing only in your notebook, and uploading to the cloud later when
you’re done.
Let me know if you want examples, diagrams, or how this applies in real-world systems like
CPUs or databases!
e. Set Associative and Fully Associative Caching + Block Replacement Schemes
🧠 Caching Overview
When a block of data from main memory is loaded into the cache, we need to decide where
it should go. This decision depends on the cache mapping technique:
1️⃣ Fully Associative Cache
A block from memory can be stored in any line of the cache.
Most flexible, but also most expensive in terms of hardware.
✅ Pros:
No conflict misses due to fixed mapping.
Best use of cache space.
❌ Cons:
Requires searching all cache lines to find a block (costly lookup).
Needs special hardware for parallel comparisons (called comparators).
2️⃣ Set Associative Cache
Compromise between direct-mapped and fully associative.
Cache is divided into sets, and each set has multiple lines (ways).
A memory block maps to one set only, but can go into any line (way) within that set.
Example:
8 cache lines, 2-way set associative → 4 sets, 2 lines per set.
✅ Pros:
Fewer conflict misses than direct-mapped.
More efficient lookup than fully associative.
❌ Cons:
Slightly more complex than direct-mapped.
Still needs multiple tag comparisons within a set.
📊 Comparison Table
| Feature | Direct-Mapped | Set-Associative | Fully Associative |
|---|---|---|---|
| Mapping Flexibility | Fixed to one line | Fixed to a set, flexible within set | Can go anywhere in cache |
| Lookup Complexity | Fast (simple index) | Medium (check within set) | Slow (check all lines) |
| Conflict Misses | High | Medium | Low |
| Hardware Cost | Low | Medium | High |
🔁 Block Replacement Schemes
When a new block needs to be loaded but all possible places (lines) are full, a replacement
policy decides which block to remove.
Common Replacement Policies:
1. LRU (Least Recently Used)
Replaces the block that was used least recently.
🧠 Good temporal locality assumption.
2. FIFO (First-In, First-Out)
Replaces the block that has been in the cache the longest.
3. Random Replacement
Randomly selects a block to evict.
🤷‍♂️ Simple and fast, but can lead to poor performance.
4. LFU (Least Frequently Used) (less common in hardware)
Replaces the block that has been used the fewest times.
🧠 Example (2-Way Set Associative Cache):
Let’s say:
Set 0 contains: Block A (used recently), Block B (not used recently)
A new block C maps to Set 0
If using LRU → Block B will be replaced.
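Here's a small Python sketch of LRU within a single 2-way set (illustrative only; real hardware tracks recency with a few bits rather than an OrderedDict):

```python
from collections import OrderedDict

WAYS = 2
set0 = OrderedDict()  # one set; insertion order tracks recency (oldest first)

def access(block):
    if block in set0:
        set0.move_to_end(block)               # hit: mark most recently used
        return "hit"
    if len(set0) >= WAYS:
        victim, _ = set0.popitem(last=False)  # evict the least recently used
        print(f"evicting block {victim}")
    set0[block] = True
    return "miss"

access("B")  # miss
access("A")  # miss: set holds B (older), A (newer)
access("A")  # hit: A becomes most recently used
access("C")  # miss: evicts block B, as in the example above
```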
Let me know if you'd like to see this visualized with a small memory/cache example!
✅ Difference Between Set Associative and Fully Associative Caching
| Feature | Set Associative Cache | Fully Associative Cache |
|---|---|---|
| Placement Flexibility | Block can go to any line within a specific set | Block can go to any line in the entire cache |
| Mapping | Memory block maps to one set, then placed in any line (way) in that set | No specific mapping; complete flexibility |
| Number of Comparisons | Compare tags with all lines in a set only (fewer checks) | Compare tags with all lines in cache (more checks) |
| Hardware Complexity | Moderate (requires comparators for each line in a set) | High (needs comparators for every cache line) |
| Speed | Faster than fully associative (less tag checking) | Slower due to checking entire cache |
| Conflict Misses | Fewer than direct-mapped, but can still happen | Very few, almost none; best for minimizing conflict misses |
| Implementation Cost | Balanced between speed and cost | More expensive to build and power-hungry |
🧠 Real-World Analogy:
Set Associative: Like parking in a designated section of a parking lot, but choosing any
space within that section.
Fully Associative: You can park anywhere in the entire lot — complete freedom.
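Here's a quick numeric sketch in Python for an 8-line cache (assuming each set occupies contiguous lines; block 13 is arbitrary), showing where one block may be placed under each scheme:

```python
LINES = 8
WAYS = 2
SETS = LINES // WAYS  # 4 sets in the 2-way set associative layout

block = 13

direct_mapped = [block % LINES]                                # exactly one line: [5]
set_index = block % SETS                                       # maps to set 1
set_associative = [set_index * WAYS + w for w in range(WAYS)]  # any way in set 1: [2, 3]
fully_associative = list(range(LINES))                         # any line: [0..7]

print(direct_mapped, set_associative, fully_associative)
```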
Let me know if you'd like a visual diagram of these layouts!