+
William Stallings
Computer Organization
and Architecture
10th Edition
© 2016 Pearson Education, Inc., Hoboken, NJ. All rights reserved.
+ Chapter 2
Performance Issues
+
Designing for Performance
The cost of computer systems continues to drop dramatically, while the performance and capacity
of those systems continue to rise equally dramatically
Today’s laptops have the computing power of an IBM mainframe from 10 or 15 years ago
Processors are so inexpensive that we now have microprocessors we throw away
Desktop applications that require the great power of today’s microprocessor-based systems include:
Image processing
Three-dimensional rendering
Speech recognition
Videoconferencing
Multimedia authoring
Voice and video annotation of files
Simulation modeling
Businesses are relying on increasingly powerful servers to handle transaction and database
processing and to support massive client/server networks that have replaced the huge
mainframe computer centers of yesteryear
Cloud service providers use massive high-performance banks of servers to satisfy high-volume, high-transaction-rate applications for a broad spectrum of clients
+
Microprocessor Speed
Techniques built into contemporary processors include:
Pipelining
• Processor moves data or instructions into a conceptual pipe, with all stages of the pipe processing simultaneously (a rough throughput sketch follows this list)
Branch prediction
• Processor looks ahead in the instruction code fetched from memory and predicts which branches, or groups of instructions, are likely to be processed next
Superscalar execution
• The ability to issue more than one instruction in every processor clock cycle (in effect, multiple parallel pipelines are used)
Data flow analysis
• Processor analyzes which instructions are dependent on each other's results, or data, to create an optimized schedule of instructions
Speculative execution
• Using branch prediction and data flow analysis, some processors speculatively execute instructions ahead of their actual appearance in the program execution, holding the results in temporary locations, keeping execution engines as busy as possible
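As a rough illustration of why pipelining raises throughput, here is a minimal sketch (not from the text): an idealized model in which every instruction passes through k stages of equal length, ignoring hazards, stalls, and mispredicted branches. All numbers are made up.

```python
# Idealized pipelining model (illustration only; ignores hazards and stalls).

def unpipelined_time_ns(n_instructions, stage_time_ns, n_stages):
    # Without pipelining, each instruction occupies all stages before the next starts.
    return n_instructions * n_stages * stage_time_ns

def pipelined_time_ns(n_instructions, stage_time_ns, n_stages):
    # Once the pipe is full (n_stages cycles), one instruction completes per cycle.
    return (n_stages + (n_instructions - 1)) * stage_time_ns

n, t, k = 1_000_000, 1.0, 5   # hypothetical: 1M instructions, 1 ns per stage, 5 stages
print("unpipelined:", unpipelined_time_ns(n, t, k), "ns")
print("pipelined:  ", pipelined_time_ns(n, t, k), "ns")
print("speedup:    ", round(unpipelined_time_ns(n, t, k) / pipelined_time_ns(n, t, k), 2))
```

With many instructions the speedup approaches the number of stages, which is why deeper pipelines, plus the branch prediction and speculation listed above that keep them full, matter.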
+
Improvements in Chip Organization
and Architecture
Increase hardware speed of processor
Fundamentally due to shrinking logic gate size
More gates, packed more tightly, increasing clock rate
Propagation time for signals reduced
Increase size and speed of caches
Dedicating part of processor chip
Cache access times drop significantly
Change processor organization and architecture
Increase effective speed of instruction execution
Parallelism
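To make "increase effective speed of instruction execution" concrete, here is a minimal sketch using the standard processor-time relation (execution time = instruction count × CPI ÷ clock rate); the figures below are hypothetical.

```python
# Execution time = instruction count * cycles-per-instruction / clock rate (Hz).
# Numbers are illustrative only.

def execution_time_s(instruction_count, cpi, clock_hz):
    return instruction_count * cpi / clock_hz

baseline = execution_time_s(2_000_000_000, 1.5, 3.0e9)  # original organization
improved = execution_time_s(2_000_000_000, 1.0, 3.0e9)  # organizational change lowers CPI
print(f"baseline: {baseline:.3f} s, improved: {improved:.3f} s, "
      f"speedup: {baseline / improved:.2f}x")
```

Either a higher clock rate (smaller gates, shorter propagation time) or a lower CPI (caches, parallelism) shortens the same workload.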
Multicore
• The use of multiple processors on the same chip provides the potential to increase performance without increasing the clock rate
• The strategy is to use two simpler processors on the chip rather than one more complex processor
• With two processors, larger caches are justified
• As caches became larger, it made performance sense to create two and then three levels of cache on a chip
+ Amdahl's Law (Gene Amdahl)
• Deals with the potential speedup of a program using multiple processors compared to a single processor
• Illustrates the problems facing industry in the development of multi-core machines: software must be adapted to a highly parallel execution environment to exploit the power of parallel processing
• Can be generalized to evaluate and design technical improvement in a computer system
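The slides state the law qualitatively; the commonly cited form is Speedup = 1 / ((1 − f) + f/N), where f is the fraction of the program that can run in parallel and N is the number of processors. A minimal sketch with made-up numbers:

```python
# Amdahl's Law: the serial fraction (1 - f) limits the speedup from N processors.

def amdahl_speedup(f, n):
    return 1.0 / ((1.0 - f) + f / n)

# Hypothetical example: 90% of the work parallelizes, run on 8 cores.
print(f"8 cores, f = 0.90: {amdahl_speedup(0.90, 8):.2f}x")
# Even with unlimited cores, the serial 10% caps the speedup at 1 / (1 - f) = 10x.
print(f"upper bound, f = 0.90: {1 / (1 - 0.90):.1f}x")
```

This is why the slide stresses adapting software to a highly parallel environment: raising f matters more than adding cores once N is large.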
+
Little’s Law
Fundamental and simple relation with broad applications
Can be applied to almost any system that is statistically in steady
state, and in which there is no leakage
Queuing system
If the server is idle, an item is served immediately; otherwise an arriving item joins a queue
There can be a single queue for a single server or for multiple servers, or multiple queues, one for each of multiple servers
Average number of items in a queuing system equals the average
rate at which items arrive multiplied by the time that an item
spends in the system
Relationship requires very few assumptions
Because of its simplicity and generality it is extremely useful
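The relationship on this slide is usually written L = λW (average items in the system = arrival rate × average time in the system). A small sketch with synthetic numbers:

```python
# Little's Law: L = lambda * W
#   L      = average number of items in the system
#   lambda = average arrival rate (items per second)
#   W      = average time an item spends in the system (seconds)

arrival_rate = 50.0     # hypothetical: 50 requests arrive per second
time_in_system = 0.2    # hypothetical: each request spends 0.2 s queued + in service

avg_in_system = arrival_rate * time_in_system
print(f"average items in system: {avg_in_system:.1f}")          # 10.0

# Rearranged, the same relation estimates latency from concurrency and throughput.
print(f"implied time in system: {avg_in_system / arrival_rate:.2f} s")
```

Because it assumes only steady state and no leakage, the same arithmetic applies to servers, queues of database transactions, or any similar system.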
+ Summary
Chapter 2
Designing for performance
Microprocessor speed
Performance balance
Improvements in chip
organization and architecture
Multicore
Amdahl’s Law
Little’s Law