Lecture 5
Virtual Memory,
the Stack and the Heap
memory of a process
The OS maintains a separate page table for each process in the system. The virtual pages (VPs)
of a process can be in physical memory (“cached”) or on disk (“uncached”). Shared pages avoid
duplicating code in memory, such as the functions of the C library that are used by every
program on Linux. Unused pages can be moved to disk by the OS.
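The per-process page table described above can be pictured with a toy model. This is only a sketch under stated assumptions: the 4 KiB page size, the 8-page address space, and the names `struct pte` and `translate` are all hypothetical and are not taken from the slides or from the Linux kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SHIFT 12u                  /* assumption: 4 KiB pages */
#define PAGE_SIZE  (1u << PAGE_SHIFT)
#define NUM_VPAGES 8u                   /* assumption: tiny toy address space */

/* Hypothetical page table entry: one per virtual page of a process. */
struct pte {
    bool     in_memory;  /* true: "cached" in physical memory; false: on disk */
    uint32_t frame;      /* physical page number, meaningful only if in_memory */
};

/* Translate a virtual address using one process's page table.
 * Returns true and stores the physical address on success.
 * Returns false when the page is on disk (a real OS would handle a page fault). */
bool translate(const struct pte *table, uint32_t vaddr, uint32_t *paddr)
{
    uint32_t vpn    = vaddr >> PAGE_SHIFT;       /* virtual page number */
    uint32_t offset = vaddr & (PAGE_SIZE - 1u);  /* byte offset inside the page */

    assert(vpn < NUM_VPAGES);
    if (!table[vpn].in_memory)
        return false;
    *paddr = (table[vpn].frame << PAGE_SHIFT) | offset;
    return true;
}
```

In this model, two processes sharing a page would simply have entries with the same `frame` value in their separate tables.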
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition 7
Focus on the stack
MEMORY ALLOCATION
ON LINUX X86-64
Memory regions, from the stack “top” (highest addresses) down to address 000000:
Stack, Shared Libraries, Heap, Data, Text

int global = 0;

Example addresses:
local        0x00007ffe4d3be87c
p1           0x00007f7262a1e010
p3           0x00007f7162a1d010
p4           0x000000008359d120
p2           0x000000008359d010
big_array    0x0000000080601060
huge_array   0x0000000000601060
main()       0x000000000040060c
useless()    0x0000000000400590

Note: the memory blocks pointed to by p1, p2, p3 and p4 are in the heap (dynamic
memory) but the 4 pointers themselves are in the stack (local variables).
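Addresses like the ones listed above can be obtained with a few `printf` calls. The following is a minimal sketch, not the program used for the slide: the helper name `print_layout` and the 256-byte allocation are assumptions, and the printed values will differ from run to run and from machine to machine.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

int global = 0;                          /* initialized global: data region */

static int useless(void) { return 0; }   /* function code: text region */

/* Print where a few objects live. Returns 0 on success, -1 if malloc fails. */
int print_layout(void)
{
    int local = 0;                       /* local variable: stack */
    char *p = malloc(256);               /* the block is in the heap, p itself is on the stack */
    if (p == NULL)
        return -1;

    /* The pointer variable and the block it points to are different objects. */
    assert((void *)&p != (void *)p);

    printf("local      %p\n", (void *)&local);
    printf("p (block)  %p\n", (void *)p);
    printf("&p         %p\n", (void *)&p);
    printf("global     %p\n", (void *)&global);
    /* Converting a function pointer to void * is not strictly ISO C,
     * but it is the usual way to print code addresses on Linux. */
    printf("useless()  %p\n", (void *)useless);

    free(p);
    return 0;
}
```

Calling `print_layout()` from `main` prints five addresses. On Linux x86-64 one would typically expect `&local` and `&p` to be close to each other and far from the heap block.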
Carnegie Mellon
Caching
Cache: A smaller, faster storage device that acts as a staging
area for a subset of the data in a larger, slower device.
Information in use copied from slower to faster storage
temporarily
Fastest storage (cache) checked first to determine if
information is there
▪ If it is, information used directly from the cache (fast)
▪ If not, data copied to cache and used there
Cache smaller than storage being cached
▪ Thus cache size and replacement policy are important problems
A cache miss is a failed attempt to read / write a piece of data in
the cache (data is not there), which results in a lower-level
memory access with much longer latency. Two kinds of cache
read misses: instruction read miss and data read miss.
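The check-the-cache-first behaviour described above can be sketched in software. This is an illustrative model only: the sizes, the `i % CACHE_SIZE` placement rule, and the names `struct tiny_cache` and `cached_read` are assumptions introduced for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SLOW_SIZE  16   /* assumption: size of the larger, slower storage */
#define CACHE_SIZE 4    /* assumption: the cache holds a 4-item subset */

struct tiny_cache {
    bool   valid[CACHE_SIZE];
    size_t index[CACHE_SIZE];   /* which slow-storage item each slot holds */
    int    value[CACHE_SIZE];
    int    hits, misses;
};

/* Read item i: check the cache first; on a miss, copy the item in. */
int cached_read(struct tiny_cache *c, const int *slow, size_t i)
{
    assert(i < SLOW_SIZE);
    size_t slot = i % CACHE_SIZE;       /* simple placement rule (assumption) */

    if (c->valid[slot] && c->index[slot] == i) {
        c->hits++;                      /* hit: use the cached copy directly */
    } else {
        c->misses++;                    /* miss: access the slower storage */
        c->valid[slot] = true;          /* replaces whatever the slot held */
        c->index[slot] = i;
        c->value[slot] = slow[i];
    }
    return c->value[slot];
}
```

Because the cache is smaller than the storage it fronts, two items that share a slot keep replacing each other, which is why cache size and replacement policy matter.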
the circumstances
In particular, multiprocessor/multicore execution platforms
maintain cache coherence, so that every CPU sees the most
recent value of a memory location held in its cache
Temporal locality:
▪ Recently referenced items are likely
to be referenced again in the near future
(e.g., using the same variables several times)
Spatial locality:
▪ Items with nearby addresses tend
to be referenced close together in time
(think of iterating an array)
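Both kinds of locality show up when summing a two-dimensional array. The sketch below is only an illustration: the 64 x 64 size and the function names are assumptions. C stores arrays row by row, so the first function walks through neighbouring addresses while the second jumps `COLS * sizeof(int)` bytes between accesses.

```c
#include <assert.h>
#include <stddef.h>

#define ROWS 64   /* assumption: small sizes chosen for the example */
#define COLS 64

/* Row by row: consecutive accesses touch neighbouring addresses
 * (good spatial locality). */
long sum_row_major(int a[ROWS][COLS])
{
    long sum = 0;   /* sum is reused in every iteration: temporal locality */
    for (size_t i = 0; i < ROWS; i++)
        for (size_t j = 0; j < COLS; j++)
            sum += a[i][j];
    return sum;
}

/* Column by column: consecutive accesses are COLS * sizeof(int) bytes apart
 * (poor spatial locality). */
long sum_col_major(int a[ROWS][COLS])
{
    long sum = 0;
    for (size_t j = 0; j < COLS; j++)
        for (size_t i = 0; i < ROWS; i++)
            sum += a[i][j];
    return sum;
}

/* Both orders compute the same value; only the access pattern differs. */
void check_same_result(int a[ROWS][COLS])
{
    assert(sum_row_major(a) == sum_col_major(a));
}
```

With arrays much larger than the caches, the row-by-row version is usually the faster of the two, although the exact difference depends on the machine.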
[Figure label: 1st row]
[Figure: cache hierarchy with L1 d-cache and L1 i-cache pairs. L2 unified cache: 256 KB, 8-way, access: 10 CPU cycles]
[Figure annotations: “The higher, the better”; “Here stride = 16 bytes”; “Probably caused by other data or code blocks in the cache”]
Terminology
Memory 0 1 2 3
4 5 6 7
8 9 10 11
12 13 14 15
With the above policy (i.e., i mod 4), where would blocks 1, 5, 9 and 12
at level k+1 be stored at level k? Use the figure on slide #34.
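The i mod 4 placement policy is a one-line computation. The sketch below assumes 16 blocks at level k+1 (numbered 0 to 15 as in the memory grid) and 4 slots at level k; the function names are hypothetical.

```c
#include <assert.h>

#define LEVEL_K1_BLOCKS 16   /* blocks 0..15 at level k+1, as in the grid */
#define LEVEL_K_SLOTS    4   /* level k holds 4 blocks ("i mod 4" policy) */

/* Slot at level k that holds block i of level k+1. */
unsigned placement_slot(unsigned block)
{
    assert(block < LEVEL_K1_BLOCKS);
    return block % LEVEL_K_SLOTS;
}

/* Two blocks conflict when the policy maps them to the same slot. */
int blocks_conflict(unsigned a, unsigned b)
{
    return placement_slot(a) == placement_slot(b);
}
```

For the blocks in the question, `placement_slot` returns 1 for blocks 1, 5 and 9, and 0 for block 12, so blocks 1, 5 and 9 compete for the same slot.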
[Figure: cache organization]
Terms: set, line
S = 2^s sets (the notation 2^x indicates that a quantity is a power of two)
Each line: v (valid bit), tag, and a block of bytes 0 1 2 3 4 5 6 7
Address fields: tag (t bits) | set index | block offset
Address of int: t bits | 0…01 | 100
Lookup:
▪ find set (using the set index 0…01)
▪ valid? + Tag matches = hit
▪ block offset (100) locates the data within the block
With two lines per set, the lines of the selected set are checked in the same way.
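Splitting an address into tag, set index and block offset takes only shifts and masks. The following is a sketch: `struct cache_addr` and `split_address` are hypothetical names, and the caller supplies the number of set-index bits `s` and block-offset bits `b`.

```c
#include <assert.h>
#include <stdint.h>

struct cache_addr {
    uint64_t tag;
    uint64_t set_index;
    uint64_t block_offset;
};

/* Split addr into tag | set index | block offset,
 * for a cache with 2^s sets and 2^b bytes per block. */
struct cache_addr split_address(uint64_t addr, unsigned s, unsigned b)
{
    assert(s + b < 64);   /* keeps every shift below well defined */

    struct cache_addr f;
    f.block_offset = addr & ((UINT64_C(1) << b) - 1);         /* low b bits */
    f.set_index    = (addr >> b) & ((UINT64_C(1) << s) - 1);  /* next s bits */
    f.tag          = addr >> (s + b);                         /* remaining bits */
    return f;
}
```

With 8-byte blocks (`b = 3`) and, as an assumed value, `s = 2`, the address 172 (binary 101 01 100) splits into tag 5, set index 1 and block offset 4, matching the 0…01 | 100 pattern of the example address.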
✓ Fully associative cache consists of a single set that contains all the cache lines.
✓ Limited to small caches as it would be too slow with a large number of tags to check
✓ Eviction policy can be global to the cache and thus very efficient

Address trace (reads, one byte per read):
0 [0000₂], miss
1 [0001₂], hit
7 [0111₂], miss
8 [1000₂], miss
0 [0000₂], hit

Cache state after the trace:
        v   Tag   Block
Set 0   1   00    M[0-1]
        1   10    M[8-9]
Set 1   1   01    M[6-7]
        0
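The address trace can be replayed with a small simulator. The parameters below are read off the table and are assumptions rather than values stated on the slide: 4-bit addresses, 2 sets, 2 lines per set, 2-byte blocks. The LRU replacement policy is also an assumption; this particular trace never evicts a line, so the policy does not affect the result.

```c
#include <assert.h>
#include <stdbool.h>

#define S_BITS 1              /* assumption: S = 2 sets */
#define B_BITS 1              /* assumption: 2-byte blocks, as in M[0-1] */
#define WAYS   2              /* assumption: 2 lines per set */
#define NSETS  (1u << S_BITS)

struct line { bool valid; unsigned tag; unsigned last_used; };

struct cache {
    struct line sets[NSETS][WAYS];
    unsigned clock;           /* counts accesses, used for LRU ordering */
};

/* Simulate a one-byte read; returns true on a hit, false on a miss. */
bool cache_read(struct cache *c, unsigned addr)
{
    assert(addr < 16);                                  /* 4-bit addresses */
    unsigned set = (addr >> B_BITS) & (NSETS - 1);
    unsigned tag = addr >> (B_BITS + S_BITS);
    struct line *lines = c->sets[set];
    struct line *victim = &lines[0];

    c->clock++;
    for (int w = 0; w < WAYS; w++) {
        if (lines[w].valid && lines[w].tag == tag) {    /* valid + tag match = hit */
            lines[w].last_used = c->clock;
            return true;
        }
        /* Prefer an invalid line; otherwise pick the least recently used one. */
        if (victim->valid &&
            (!lines[w].valid || lines[w].last_used < victim->last_used))
            victim = &lines[w];
    }
    victim->valid = true;                               /* miss: load the block */
    victim->tag = tag;
    victim->last_used = c->clock;
    return false;
}
```

Starting from a zero-initialized `struct cache` and reading addresses 0, 1, 7, 8, 0 in order gives miss, hit, miss, miss, hit, and leaves tags 00 and 10 in set 0 and tag 01 in set 1.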