15-740/18-740 Computer Architecture Lecture 3: Performance: Carnegie Mellon University
Review: ISA-level Tradeoffs: Number of Registers
Affects:
Number of bits used for encoding register address
Number of values kept in fast storage (register file)
(uarch) Size, access time, power consumption of register file
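A minimal sketch of the first point, assuming each register operand is encoded as a fixed-width binary field. The function name `register_field_bits` and the three-operand format are illustrative assumptions, not part of the lecture.

```python
def register_field_bits(num_registers: int) -> int:
    """Bits needed to name any one of num_registers registers."""
    if num_registers < 1:
        raise ValueError("need at least one register")
    # Smallest b such that 2**b >= num_registers, computed without floats.
    return (num_registers - 1).bit_length()


# Hypothetical three-operand format (dst, src1, src2):
bits_with_16 = 3 * register_field_bits(16)  # 3 fields of 4 bits
bits_with_32 = 3 * register_field_bits(32)  # 3 fields of 5 bits
```

Doubling the register count costs one extra bit per register field, which is taken from the bits available for opcodes or immediates.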
ISA-level Tradeoffs: Addressing Modes
Addressing mode specifies how to obtain an operand of an
instruction
Register
Immediate
Memory (displacement, register indirect, indexed, absolute,
memory indirect, autoincrement, autodecrement, …)
More modes:
+ help better support programming constructs (arrays, pointer-based accesses)
-- make it harder for the architect to design
-- too many choices for the compiler?
Many ways to do the same thing complicates compiler design
Read Wulf, “Compilers and Computer Architecture”
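The memory addressing modes listed above can be sketched as one effective-address function. This is an illustration under assumptions: the mode names, the dictionary-based register file `regs`, and the memory `mem` are hypothetical, and the autoincrement/autodecrement modes (which also update a register) are left out.

```python
def effective_address(mode, regs, mem, *, reg=None, index=None, disp=0, addr=None):
    """Return the memory address an operand refers to under the given mode."""
    if mode == "register_indirect":   # address is held in a register
        return regs[reg]
    if mode == "displacement":        # register + constant offset
        return regs[reg] + disp
    if mode == "indexed":             # base register + index register
        return regs[reg] + regs[index]
    if mode == "absolute":            # address is encoded in the instruction
        return addr
    if mode == "memory_indirect":     # register points at a memory cell holding the address
        return mem[regs[reg]]
    raise ValueError(f"unknown addressing mode: {mode}")


regs = {"r1": 100, "r2": 8}   # hypothetical register contents
mem = {100: 500}              # hypothetical memory contents
```

Register and immediate operands are not handled here because they do not produce a memory address.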
x86 vs. Alpha Instruction Formats
[Figure: instruction format diagrams for x86 and Alpha]
x86
[Figure: x86 addressing mode examples: register indirect, absolute, register + displacement, register]
x86
[Figure: x86 addressing mode examples, continued: indexed (base + index), scaled (base + index*4)]
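As a small sketch of why the scaled mode helps with arrays: the address of element `index` in an array of 4-byte elements is base + index*4, which the scaled mode computes in a single operand. The function name `scaled_address` is an assumption for illustration; the allowed scale factors (1, 2, 4, 8) are those of x86.

```python
def scaled_address(base: int, index: int, scale: int = 4, disp: int = 0) -> int:
    """Address computed by a base + index*scale (+ displacement) operand."""
    if scale not in (1, 2, 4, 8):
        raise ValueError("x86 scale factor must be 1, 2, 4, or 8")
    return base + index * scale + disp


# Element 3 of an array of 4-byte integers starting at 0x1000.
element_addr = scaled_address(0x1000, 3)
```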
Other ISA-level Tradeoffs
Load/store vs. Memory/Memory
Condition codes vs. condition registers vs. compare&test
Hardware interlocks vs. software-guaranteed interlocking
VLIW vs. single instruction
0, 1, 2, 3 address machines
Precise vs. imprecise exceptions
Virtual memory vs. not
Aligned vs. unaligned access
Supported data types
Software vs. hardware managed page fault handling
Granularity of atomicity
Cache coherence (hardware vs. software)
…
Programmer vs. (Micro)architect
Many ISA features designed to aid programmers
But, complicate the hardware designer’s job
Virtual memory
vs. overlay programming
Should the programmer be concerned about the size of code
blocks?
Unaligned memory access
Compiler/programmer needs to align data
Transactional memory?
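When the ISA does not support unaligned access, the alignment work falls to the compiler or programmer. A minimal sketch of that work, assuming power-of-two access sizes; the helper names `is_aligned` and `align_up` are illustrative.

```python
def is_aligned(addr: int, size: int) -> bool:
    """True if addr is a multiple of the access size."""
    return addr % size == 0


def align_up(addr: int, size: int) -> int:
    """Round addr up to the next multiple of size (size must be a power of two)."""
    if size <= 0 or size & (size - 1):
        raise ValueError("size must be a positive power of two")
    return (addr + size - 1) & ~(size - 1)


# Placing a 4-byte value after a 1-byte field at offset 0:
next_free = 1
padded_offset = align_up(next_free, 4)   # compiler inserts 3 bytes of padding
```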
Transactional Memory
THREAD 1                          THREAD 2
begin-transaction                 begin-transaction
…                                 …
enqueue (Q, v); //no locks        enqueue (Q, v); //no locks
…                                 …
end-transaction                   end-transaction
Transactional Memory
A transaction is executed atomically: ALL or NONE
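The ALL or NONE property can be illustrated with a toy software sketch. This is not how hardware transactional memory is implemented, and it is single-threaded with no conflict detection between threads. The class name `ToyTransaction` and the dictionary-based shared state are assumptions for illustration.

```python
import copy


class ToyTransaction:
    """Apply updates to a private copy; publish all of them or none."""

    def __init__(self, store: dict):
        self.store = store

    def __enter__(self):
        # Work on a private copy so partial updates are never visible.
        self.working = copy.deepcopy(self.store)
        return self.working

    def __exit__(self, exc_type, exc, tb):
        if exc_type is None:          # commit: publish every update
            self.store.clear()
            self.store.update(self.working)
        return False                  # abort: discard the copy, re-raise


shared = {"Q": []}

with ToyTransaction(shared) as view:      # commits
    view["Q"].append(1)

try:
    with ToyTransaction(shared) as view:  # aborts partway through
        view["Q"].append(2)
        raise RuntimeError("abort")
except RuntimeError:
    pass
```

After both blocks run, `shared["Q"]` holds only the value from the committed transaction; the aborted enqueue left no trace.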
ISA-level Tradeoff: Supporting TM
Still under research
Pros:
Could make programming with threads easier
Could improve parallel program performance vs. locks. Why?
Cons:
What if it does not pan out?
All future microarchitectures might have to support the new
instructions (for backward compatibility reasons)
Complexity?
The Von-Neumann Model
[Figure: block diagram with MEMORY (Mem Addr Reg), PROCESSING UNIT (ALU, TEMP), CONTROL UNIT (IP, Inst Register), INPUT, and OUTPUT]
The Von-Neumann Model
Stored program computer (instructions in memory)
One instruction at a time
Sequential execution
Unified memory
The interpretation of a stored value depends on the control
signals
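The properties above can be sketched as a tiny fetch-decode-execute loop. The machine here is hypothetical: a single accumulator, and instructions encoded as `opcode*100 + address`, an encoding invented for this example. Instructions and data share one memory, and a cell is treated as an instruction only because the instruction pointer happens to fetch it.

```python
HALT, LOAD, ADD, STORE = 0, 1, 2, 3   # hypothetical opcodes


def run(memory: list) -> list:
    """Execute the program stored in memory, one instruction at a time."""
    ip = 0    # instruction pointer
    acc = 0   # accumulator
    while True:
        inst = memory[ip]                   # fetch from unified memory
        ip += 1                             # sequential execution
        opcode, addr = divmod(inst, 100)    # decode
        if opcode == HALT:
            return memory
        elif opcode == LOAD:
            acc = memory[addr]
        elif opcode == ADD:
            acc += memory[addr]
        elif opcode == STORE:
            memory[addr] = acc
        else:
            raise ValueError(f"unknown opcode {opcode}")


# Cells 0-3 hold instructions, cells 5-7 hold data, all in the same memory.
program = [105, 206, 307, 0, 0, 20, 22, 0]
result = run(program)   # computes memory[7] = memory[5] + memory[6]
```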
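Execution time = time/program
               = (instructions/program) x (cycles/instruction) x (time/cycle)

Each factor is determined by:
instructions/program: Algorithm, Program, ISA, Compiler
cycles/instruction: ISA, Microarchitecture
time/cycle: Microarchitecture, Logic design, Circuit implementation, Technology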
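A worked sketch of the execution time equation, assuming the standard decomposition into instruction count, cycles per instruction (CPI), and cycle time (the inverse of clock frequency). The numbers below are made up for illustration.

```python
def execution_time(instructions: float, cpi: float, clock_hz: float) -> float:
    """Seconds per program = instructions x cycles/instruction x seconds/cycle."""
    return instructions * cpi / clock_hz


# Hypothetical program: 1 billion instructions, CPI of 2, 1 GHz clock.
baseline = execution_time(1e9, 2.0, 1e9)

# Halving the instruction count helps only if CPI and cycle time do not get worse.
fewer_insts = execution_time(0.5e9, 2.0, 1e9)
fewer_insts_worse_cpi = execution_time(0.5e9, 4.0, 1e9)
```

The last case shows why the three factors cannot be optimized in isolation: a change that halves the instruction count but doubles CPI leaves execution time unchanged.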
Improving Performance
Reducing instructions/program
Improving Performance (Reducing Exec Time)
Reducing instructions/program
More efficient algorithms and programs
Better ISA?
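As a rough illustration of how a more efficient algorithm reduces work per program, the sketch below counts loop iterations (a stand-in for executed instructions, not an exact instruction count) for linear versus binary search over a sorted list. The function names are illustrative.

```python
def linear_search_steps(items, target) -> int:
    """Number of elements examined by a front-to-back scan."""
    steps = 0
    for value in items:
        steps += 1
        if value == target:
            break
    return steps


def binary_search_steps(items, target) -> int:
    """Number of elements examined by binary search on a sorted list."""
    lo, hi, steps = 0, len(items) - 1, 0
    while lo <= hi:
        steps += 1
        mid = (lo + hi) // 2
        if items[mid] == target:
            break
        if items[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return steps


data = list(range(1024))
linear_steps = linear_search_steps(data, 1023)   # scans every element
binary_steps = binary_search_steps(data, 1023)   # about log2(1024) probes
```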
Improving Performance: Semantic Gap
Reducing instructions/program
Complex instructions: small code size (+)
Simple instructions: large code size (--)
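A small sketch of the code size contrast, using instruction count as a proxy (actual size in bytes depends on the encoding). The mnemonics, register names, and the memory-to-memory `ADD` are hypothetical and do not belong to any specific ISA.

```python
def expand_memory_add(dst: str, src1: str, src2: str) -> list:
    """Rewrite one memory-to-memory ADD as load/store-style simple instructions."""
    return [
        f"LOAD  r1, {src1}",   # bring first operand into a register
        f"LOAD  r2, {src2}",   # bring second operand into a register
        "ADD   r3, r1, r2",    # register-to-register add
        f"STORE r3, {dst}",    # write the result back to memory
    ]


complex_code = ["ADD a, b, c"]                   # one complex instruction
simple_code = expand_memory_add("a", "b", "c")   # same effect, four simple ones
```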