Week 10 Part 02 - Processor Performance (Answers)

Processor

Uploaded by

dewierbarbell0n

COM1031 Computer logic

Week 10
Processor Performance
Defining Performance
Airplane            Passenger capacity   Cruising range (miles)   Cruising speed (m.p.h.)   Passenger throughput (passengers x m.p.h.)
Boeing 777          375                  4630                     610                       228,750
Boeing 747          470                  4150                     610                       286,700
BAC/Sud Concorde    132                  4000                     1350                      178,200
Douglas DC-8-50     146                  8720                     544                       79,424
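The throughput column can be reproduced with a few lines of Python. This is only a sketch: the names `airplanes` and `passenger_throughput` are made up for illustration, and the figures are copied from the table. It shows that "best" depends on the metric chosen: the Concorde has the highest speed, while the Boeing 747 has the highest passenger throughput.

```python
# Passenger throughput = passenger capacity x cruising speed (figures from the table).
# The dictionary layout and function name are illustrative assumptions.
airplanes = {
    "Boeing 777": (375, 610),        # (passenger capacity, cruising speed in m.p.h.)
    "Boeing 747": (470, 610),
    "BAC/Sud Concorde": (132, 1350),
    "Douglas DC-8-50": (146, 544),
}


def passenger_throughput(capacity, speed_mph):
    """Return passengers x m.p.h. for one airplane."""
    return capacity * speed_mph


throughput = {
    name: passenger_throughput(capacity, speed)
    for name, (capacity, speed) in airplanes.items()
}

# Two different notions of "best performance".
fastest = max(airplanes, key=lambda name: airplanes[name][1])
highest_throughput = max(throughput, key=throughput.get)

for name, value in throughput.items():
    print(f"{name}: {value}")
print("Fastest:", fastest)
print("Highest throughput:", highest_throughput)
```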

• Time to do the task: execution time, response time
• Tasks per day, hour, week, sec, ns, …: throughput / bandwidth
Throughput and Response Time
• If we replace the processor in a computer with a faster version, will this change increase throughput, decrease response time, or both?
• Both response time and throughput are improved.
• Suppose we add processors to a system that uses multiple processors for separate tasks, for example, searching the web. Would this change increase throughput, decrease response time, or both?
• No single task gets done faster (assuming no tasks are queuing up), so only throughput increases.
• However, if the demand for processing is large enough that requests queue up, increasing throughput could also improve response time, because requests spend less time waiting.
Performance & Execution Time
$$\text{Performance}_X = \frac{1}{\text{Execution time}_X}$$

$$\text{Performance}_X > \text{Performance}_Y$$

$$\frac{1}{\text{Execution time}_X} > \frac{1}{\text{Execution time}_Y}$$
Relative Performance
• Performance for a program on a particular machine

$$\text{Relative performance}(X/Y) = \frac{\text{Performance}_X}{\text{Performance}_Y} = \frac{\text{Execution time}_Y}{\text{Execution time}_X} = n$$

• X is n times faster than Y
• X is n times as fast as Y
• Example: Machines Orange and Grape run a program. Orange takes 5 seconds; Grape takes 10 seconds.
• Orange is _____ times faster than Grape.
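The Orange and Grape example follows directly from the relative-performance ratio. A minimal sketch, assuming a hypothetical helper named `relative_performance` that takes the two execution times in seconds:

```python
def relative_performance(exec_time_x, exec_time_y):
    """Return n such that X is n times as fast as Y (n = time_Y / time_X)."""
    return exec_time_y / exec_time_x


# Orange takes 5 seconds, Grape takes 10 seconds.
n = relative_performance(exec_time_x=5.0, exec_time_y=10.0)
print(f"Orange is {n:g} times as fast as Grape")
```

Because performance is the reciprocal of execution time, the slower machine's time goes in the numerator.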
Measuring Time
• Execution time is the amount of time it takes the program to
execute in seconds.
• Time (computers do several tasks!)
• Elapsed time
• counts everything (disk and memory accesses, I/O , etc.)
• a useful number, but often not good for comparison purposes
• CPU time is time spent executing this program
• Excluding waiting for I/O, or other programs
Measuring Amounts
• 1 bit
• 8 bits = 1 byte
• 1024 bytes = 1 kilobyte = 1 KByte = 1K
• 1024 KBytes = 1 megabyte = 1 MB
• 1024 MB = 1 gigabyte = 1 GB
• 1024 GB = 1 terabyte = 1 TB
• and more …
Measuring Times
• Duration
• 1 second
• 1/1000 second = 1 millisec = 1 ms = $10^{-3}$ s
• 1/1,000,000 s = 1 microsec = $10^{-6}$ s
• 1/1,000,000,000 s = 1 nanosec = $10^{-9}$ s
• Frequency
• 1 Hz = 1 cycle per second
• 1 MHz = 1,000,000 cycles per sec
• 100 MHz = 100,000,000 cycles per sec
Computer Clock Times
• Computers run according to a clock that ticks at a steady rate.
• The time interval between ticks is called a clock cycle (e.g., 10 ns).
• The clock rate is the reciprocal of the clock cycle time: a frequency, i.e. how many cycles per second (e.g., 100 MHz).
• Clock cycle: 10 ns = 1/100,000,000 s, which is the same as:
• Clock rate: 1/(10 ns) = 100,000,000 cycles per second = 100 MHz.
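The reciprocal relationship between clock cycle time and clock rate can be checked numerically. The sketch below uses two illustrative helper names, `rate_from_period` and `period_from_rate`, which are assumptions and not part of the slides.

```python
def rate_from_period(cycle_time_s):
    """Clock rate in Hz from clock cycle time in seconds."""
    return 1.0 / cycle_time_s


def period_from_rate(clock_rate_hz):
    """Clock cycle time in seconds from clock rate in Hz."""
    return 1.0 / clock_rate_hz


rate = rate_from_period(10e-9)      # 10 ns clock cycle
period = period_from_rate(100e6)    # 100 MHz clock rate

print(f"10 ns cycle  -> {rate / 1e6:.0f} MHz")
print(f"100 MHz rate -> {period * 1e9:.0f} ns")
```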
Purchasing Decision
• Computer A has a 100MHz processor
• Computer B has a 300MHz processor
• So, B is faster, right?
CPU Execution Time
$$\text{CPU execution time for a program} = \text{CPU clock cycles for a program} \times \text{Clock cycle time}$$

$$\text{CPU execution time for a program} = \frac{\text{CPU clock cycles for a program}}{\text{Clock rate}}$$
Example
• Computer A, which has a 2 GHz clock, runs our favorite program in 10 seconds.
• We are trying to help a computer designer build a computer, B, which will run this program in 6 seconds.
• The designer has determined that a substantial increase in the clock rate is possible, but this increase will affect the rest of the CPU design, causing computer B to require 1.2 times as many clock cycles as computer A for this program. What clock rate should we tell the designer to target?
Answer
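• Find the number of clock cycles required for the program on A:

$$\text{CPU time}_A = \frac{\text{CPU clock cycles}_A}{\text{Clock rate}_A}$$

$$10\ \text{seconds} = \frac{\text{CPU clock cycles}_A}{2 \times 10^9\ \text{cycles/second}}$$

$$\text{CPU clock cycles}_A = 10\ \text{seconds} \times 2 \times 10^9\ \text{cycles/second} = 20 \times 10^9\ \text{cycles}$$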
Answer
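• CPU time for B can be found using this equation:

$$\text{CPU time}_B = \frac{1.2 \times \text{CPU clock cycles}_A}{\text{Clock rate}_B}$$

$$6\ \text{seconds} = \frac{1.2 \times 20 \times 10^9\ \text{cycles}}{\text{Clock rate}_B}$$

$$\text{Clock rate}_B = \frac{1.2 \times 20 \times 10^9\ \text{cycles}}{6\ \text{seconds}} = \frac{4 \times 10^9\ \text{cycles}}{\text{second}} = 4\ \text{GHz}$$

• To run the program in 6 seconds, B must have twice the clock rate of A.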
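The same two-step calculation can be written as a short script. This is a sketch using the numbers from the example; the variable names are chosen here for illustration.

```python
# Known quantities from the example.
clock_rate_a = 2e9     # Hz (2 GHz)
time_a = 10.0          # seconds on computer A
time_b = 6.0           # target seconds on computer B
cycle_factor = 1.2     # B needs 1.2 times as many clock cycles as A

# Step 1: clock cycles the program needs on A (time x rate).
cycles_a = time_a * clock_rate_a

# Step 2: B executes more cycles and must finish them in time_b.
cycles_b = cycle_factor * cycles_a
clock_rate_b = cycles_b / time_b

print(f"Cycles on A: {cycles_a:.3g}")
print(f"Target clock rate for B: {clock_rate_b / 1e9:.2f} GHz")
```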
Instruction Performance
• The execution time must depend on the number of instructions in a program.

$$\text{CPU clock cycles} = \text{Instructions for a program} \times \text{Average clock cycles per instruction}$$

• Clock cycles per instruction (CPI) is the average number of clock cycles each instruction takes to execute.
Example
• Suppose we have two implementations of the same instruction set
architecture.
• Computer A has a clock cycle time of 250 ps and a CPI of 2.0 for some
program, and computer B has a clock cycle time of 500 ps and a CPI of
1.2 for the same program.
• Which computer is faster for this program and by how much?

Answer
$$\text{CPU execution time for a program} = \text{CPU clock cycles for a program} \times \text{Clock cycle time}$$

• We know that each computer executes the same number of instructions for the program; let's call this number I. First, find the number of processor clock cycles for each computer:

$$\text{CPU clock cycles}_A = I \times 2.0$$

$$\text{CPU clock cycles}_B = I \times 1.2$$

• Then compute the CPU time for each computer:

$$\text{CPU time}_A = \text{CPU clock cycles}_A \times \text{Clock cycle time}_A = I \times 2.0 \times 250\ \text{ps} = 500 \times I\ \text{ps}$$

$$\text{CPU time}_B = I \times 1.2 \times 500\ \text{ps} = 600 \times I\ \text{ps}$$
Answer
• Computer A is faster.
• The amount faster is given by the ratio of the execution times:

$$\frac{\text{CPU performance}_A}{\text{CPU performance}_B} = \frac{\text{Execution time}_B}{\text{Execution time}_A} = \frac{600 \times I\ \text{ps}}{500 \times I\ \text{ps}} = 1.2$$

• Computer A is 1.2 times as fast as computer B for this program.
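Since both computers execute the same instruction count I, it cancels in the ratio, so comparing the time per instruction (CPI times clock cycle time) is enough. A minimal sketch, with `time_per_instruction_ps` as an illustrative helper name:

```python
def time_per_instruction_ps(cpi, cycle_time_ps):
    """Average time per instruction in picoseconds: CPI x clock cycle time."""
    return cpi * cycle_time_ps


time_a = time_per_instruction_ps(cpi=2.0, cycle_time_ps=250)   # 500 ps
time_b = time_per_instruction_ps(cpi=1.2, cycle_time_ps=500)   # about 600 ps

# The instruction count I is identical on both machines and cancels out.
speedup_a_over_b = time_b / time_a

faster = "A" if time_a < time_b else "B"
print(f"Computer {faster} is faster, by a factor of {speedup_a_over_b:.2f}")
```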


CPU Performance Equation
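$$\text{CPU execution time for a program} = \text{CPU clock cycles for a program} \times \text{Clock cycle time}$$

$$\text{CPU time} = \text{Instruction count} \times \text{CPI} \times \text{Clock cycle time}$$

$$\text{CPU time} = \frac{\text{Instruction count} \times \text{CPI}}{\text{Clock rate}}$$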
Computing CPI
• Different types of instructions can take very different numbers of cycles: memory accesses, integer math, floating point, control flow
• $\text{CPI} = \sum_{\text{types}} (\text{Cycles}_{\text{type}} \times \text{Frequency}_{\text{type}})$

Instruction type   Type cycles   Type frequency   Cycles * Freq
ALU                1             50%
Load               5             20%
Store              3             10%
Branch             2             20%
CPI
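The empty column and the final CPI row can be filled in by applying the weighted sum to the instruction mix in the table. The sketch below is one way to do it; the dictionary `instruction_mix` and the function `average_cpi` are names assumed for illustration.

```python
# (cycles per instruction, fraction of executed instructions) for each type.
instruction_mix = {
    "ALU": (1, 0.50),
    "Load": (5, 0.20),
    "Store": (3, 0.10),
    "Branch": (2, 0.20),
}


def average_cpi(mix):
    """Weighted average CPI: sum over types of cycles x frequency."""
    return sum(cycles * freq for cycles, freq in mix.values())


# Per-type contribution (the Cycles * Freq column) and the overall CPI.
contributions = {name: c * f for name, (c, f) in instruction_mix.items()}
cpi = average_cpi(instruction_mix)

for name, value in contributions.items():
    print(f"{name}: {value:.1f}")
print(f"CPI = {cpi:.1f}")
```

Loads are only 20% of the instructions here but contribute the largest share of the cycles, which is why the mix matters as much as the per-type cycle counts.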
Comparing Code Segments Example
• A compiler designer is trying to decide between two code sequences for a particular computer. The hardware designers have supplied the following facts:

CPI for each instruction class
        A    B    C
CPI     1    2    3

• For a particular high-level language statement, the compiler writer is considering two code sequences that require the following instruction counts:

Instruction counts for each instruction class
Code sequence    A    B    C
1                2    1    2
2                4    1    1

• Which code sequence executes the most instructions? Which will be faster? What is the CPI for each sequence?
Answer
• Sequence 1 executes 2+1+2 = 5 instructions.
• Sequence 2 executes 4+1+1 = 6 instructions.
• Sequence 1 executes fewer instructions.
• We can use the equation for CPU clock cycles based on instruction count and CPI to find the total number of clock cycles for each sequence:
• $\text{CPU clock cycles} = \sum_{i=1}^{n} (\text{CPI}_i \times C_i)$
• $\text{CPU clock cycles}_1 = (2 \times 1) + (1 \times 2) + (2 \times 3) = 2 + 2 + 6 = 10$ cycles
• $\text{CPU clock cycles}_2 = (4 \times 1) + (1 \times 2) + (1 \times 3) = 4 + 2 + 3 = 9$ cycles
• Code sequence 2 is faster.
Answer
• Since code sequence 2 takes fewer overall clock cycles but has more instructions, it must have a lower CPI. The CPI values can be computed by

$$\text{CPU clock cycles} = \text{Instructions for a program} \times \text{Average clock cycles per instruction}$$

$$\text{CPI} = \frac{\text{CPU clock cycles}}{\text{Instruction count}}$$

$$\text{CPI}_1 = \frac{\text{CPU clock cycles}_1}{\text{Instruction count}_1} = \frac{10}{5} = 2.0$$

$$\text{CPI}_2 = \frac{\text{CPU clock cycles}_2}{\text{Instruction count}_2} = \frac{9}{6} = 1.5$$
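The whole comparison (instruction count, clock cycles, and CPI for each sequence) can be sketched in a few lines. The names `cpi_per_class`, `sequences`, and `summarize` are assumptions introduced for this example.

```python
# CPI for each instruction class, as supplied by the hardware designers.
cpi_per_class = {"A": 1, "B": 2, "C": 3}

# Instruction counts per class for the two candidate code sequences.
sequences = {
    1: {"A": 2, "B": 1, "C": 2},
    2: {"A": 4, "B": 1, "C": 1},
}


def summarize(counts, cpi_table):
    """Return (instruction count, clock cycles, average CPI) for one sequence."""
    instructions = sum(counts.values())
    cycles = sum(cpi_table[cls] * n for cls, n in counts.items())
    return instructions, cycles, cycles / instructions


results = {seq: summarize(counts, cpi_per_class) for seq, counts in sequences.items()}

for seq, (instructions, cycles, cpi) in results.items():
    print(f"Sequence {seq}: {instructions} instructions, {cycles} cycles, CPI = {cpi}")
```

Sequence 2 executes more instructions yet needs fewer cycles, so instruction count alone is not a reliable measure of speed.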
Basic components of performance
Components of performance             Units of measure
CPU execution time for a program      Seconds for the program
Instruction count                     Instructions executed for the program
Clock cycles per instruction (CPI)    Average number of clock cycles per instruction
Clock cycle time                      Seconds per clock cycle

$$\text{CPU execution time} = \frac{\text{Seconds}}{\text{Program}} = \frac{\text{Instructions}}{\text{Program}} \times \frac{\text{Clock cycles}}{\text{Instruction}} \times \frac{\text{Seconds}}{\text{Clock cycle}}$$
Other Performance Metrics
• MIPS
• Million Instructions Per Second = instruction count / (exec time × $10^6$)
• For example, a program that executes 3 million instructions in 2 seconds has a MIPS rating of 1.5
• Advantage: easy to understand and measure
• Disadvantage: may not reflect actual performance, since code made of many simple instructions scores a higher MIPS rating without necessarily finishing sooner.
• MFLOPS
• Million Floating Point Operations Per Second
• For example, a program that executes 4 million floating-point instructions in 5 seconds has an MFLOPS rating of 0.8
• Advantage: easy to understand and measure
• Disadvantages: same as MIPS, and it only measures floating point
Other Performance Metrics
• Benchmarks: System Performance Evaluation Cooperative (SPEC)
• Provides a common set of real applications along with strict guidelines for how to run them
• Provides a relatively unbiased means to compare machines
• Average performance over a set of example programs
MIPS Example
• Two different compilers are being tested for a 500 MHz machine with three different classes of instructions: Class A, Class B, and Class C, which require one, two, and three cycles (respectively). Both compilers are used to produce code for a large piece of software. The first compiler's code uses 5 billion Class A instructions, 1 billion Class B instructions, and 1 billion Class C instructions. The second compiler's code uses 10 billion Class A instructions, 1 billion Class B instructions, and 1 billion Class C instructions.
• Which sequence will be faster according to MIPS?
• Which sequence will be faster according to execution time?
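MIPS Example - Answer
• The machine runs at 500 MHz; Class A, Class B, and Class C instructions require one, two, and three cycles (respectively).

$$\text{CPU time} = \frac{\text{CPU clock cycles}}{\text{Clock rate}}$$

$$\text{MIPS} = \frac{\text{Instruction count}}{\text{Execution time} \times 10^6}$$

Instruction counts (in billions) for each instruction class
Code from     A    B    C
Compiler 1    5    1    1
Compiler 2    10   1    1

$$\text{CPU clock cycles}_1 = (5 \times 1 + 1 \times 2 + 1 \times 3) \times 10^9 = 10 \times 10^9$$

$$\text{CPU clock cycles}_2 = (10 \times 1 + 1 \times 2 + 1 \times 3) \times 10^9 = 15 \times 10^9$$

$$\text{CPU time}_1 = \frac{10 \times 10^9}{500 \times 10^6} = 20\ \text{seconds}$$

$$\text{CPU time}_2 = \frac{15 \times 10^9}{500 \times 10^6} = 30\ \text{seconds}$$

$$\text{MIPS}_1 = \frac{(5 + 1 + 1) \times 10^9}{20 \times 10^6} = 350$$

$$\text{MIPS}_2 = \frac{(10 + 1 + 1) \times 10^9}{30 \times 10^6} = 400$$

• According to MIPS, compiler 2's code looks faster (400 > 350), but according to execution time, compiler 1's code is faster (20 seconds < 30 seconds).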
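The disagreement between the two metrics is easy to reproduce. The following is a sketch using the figures from the example; the constant and function names (`CLOCK_RATE_HZ`, `cpu_time_and_mips`, and so on) are illustrative assumptions.

```python
CLOCK_RATE_HZ = 500e6                        # 500 MHz machine
CYCLES_PER_CLASS = {"A": 1, "B": 2, "C": 3}  # cycles per instruction class

# Instruction counts per class produced by each compiler.
compilers = {
    1: {"A": 5e9, "B": 1e9, "C": 1e9},
    2: {"A": 10e9, "B": 1e9, "C": 1e9},
}


def cpu_time_and_mips(counts):
    """Return (CPU time in seconds, MIPS rating) for one compiler's code."""
    instructions = sum(counts.values())
    cycles = sum(CYCLES_PER_CLASS[cls] * n for cls, n in counts.items())
    cpu_time = cycles / CLOCK_RATE_HZ
    mips = instructions / (cpu_time * 1e6)
    return cpu_time, mips


results = {c: cpu_time_and_mips(counts) for c, counts in compilers.items()}

best_by_mips = max(results, key=lambda c: results[c][1])   # higher MIPS wins
best_by_time = min(results, key=lambda c: results[c][0])   # lower time wins

for c, (cpu_time, mips) in results.items():
    print(f"Compiler {c}: {cpu_time:.0f} s, {mips:.0f} MIPS")
print("Best by MIPS:", best_by_mips, "| Best by execution time:", best_by_time)
```

Compiler 2 earns the higher MIPS rating only because it executes many more one-cycle instructions; the program still takes longer to run.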
Question:
Consider three different processors P1, P2, and P3 executing the same instruction
set.
P1 has a 3 GHz clock rate and a CPI of 1.5.
P2 has a 2.5 GHz clock rate and a CPI of 1.0.
P3 has a 4.0 GHz clock rate and has a CPI of 2.2.
A. Which processor has the highest performance expressed in instructions per
second?
B. If the processors each execute a program in 10 seconds, find the number of
cycles and the number of instructions.
C. We are trying to reduce the execution time by 30% but this leads to an increase
of 20% in the CPI. What clock rate should we have to get this time reduction?
Answer
A. Which processor has the highest performance expressed in instructions per second?

Answer for (A)
• Instructions per second = clock rate / CPI

$$\text{Performance of P1 (instructions/sec)} = \frac{3 \times 10^9}{1.5} = 2 \times 10^9$$

$$\text{Performance of P2 (instructions/sec)} = \frac{2.5 \times 10^9}{1.0} = 2.5 \times 10^9$$

$$\text{Performance of P3 (instructions/sec)} = \frac{4 \times 10^9}{2.2} \approx 1.8 \times 10^9$$

• P2 has the highest performance.
Answer
(B) If the processors each execute a program in 10 seconds, find the number of cycles and the number of instructions.

$$\text{CPU execution time for a program} = \frac{\text{CPU clock cycles for a program}}{\text{Clock rate}}$$

$$\text{CPU clock cycles} = \text{Instructions for a program} \times \text{Average clock cycles per instruction}$$

Answer for (B)

$$\text{cycles(P1)} = 10 \times 3 \times 10^9 = 30 \times 10^9$$

$$\text{cycles(P2)} = 10 \times 2.5 \times 10^9 = 25 \times 10^9$$

$$\text{cycles(P3)} = 10 \times 4 \times 10^9 = 40 \times 10^9$$

$$\text{No. instructions(P1)} = \frac{30 \times 10^9}{1.5} = 20 \times 10^9$$

$$\text{No. instructions(P2)} = \frac{25 \times 10^9}{1} = 25 \times 10^9$$

$$\text{No. instructions(P3)} = \frac{40 \times 10^9}{2.2} \approx 18.18 \times 10^9$$
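Parts (A) and (B) use the same two relations, so they can be checked together. A minimal sketch, assuming the processors are stored as (clock rate in Hz, CPI) pairs in a dictionary named `processors`; all names are illustrative.

```python
# (clock rate in Hz, CPI) for each processor, from the question.
processors = {
    "P1": (3.0e9, 1.5),
    "P2": (2.5e9, 1.0),
    "P3": (4.0e9, 2.2),
}
EXEC_TIME_S = 10.0  # each processor runs its program for 10 seconds

# Part (A): instructions per second = clock rate / CPI.
ips = {name: rate / cpi for name, (rate, cpi) in processors.items()}
best = max(ips, key=ips.get)

# Part (B): cycles = time x clock rate, instructions = cycles / CPI.
cycles = {name: EXEC_TIME_S * rate for name, (rate, _) in processors.items()}
instructions = {name: cycles[name] / cpi for name, (_, cpi) in processors.items()}

for name in processors:
    print(
        f"{name}: {ips[name]:.3g} instr/s, "
        f"{cycles[name]:.3g} cycles, {instructions[name]:.4g} instructions"
    )
print("Highest instructions per second:", best)
```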
Answer
(C) We are trying to reduce the execution time by 30% but this leads to an increase of 20% in the CPI. What clock rate should we have to get this time reduction?

Answer for (C)
• The new execution time is 70% of 10 seconds, i.e. 7 seconds, and the instruction counts are unchanged.

$$\text{CPI}_{\text{new}} = \text{CPI}_{\text{old}} \times 1.2$$

$$\text{CPI(P1)} = 1.8, \quad \text{CPI(P2)} = 1.2, \quad \text{CPI(P3)} = 2.64$$

$$\text{Clock rate} = \frac{\text{No. instructions} \times \text{CPI}}{\text{Execution time}}$$

$$\text{Clock rate(P1)} = \frac{20 \times 10^9 \times 1.8}{7} \approx 5.14\ \text{GHz}$$

$$\text{Clock rate(P2)} = \frac{25 \times 10^9 \times 1.2}{7} \approx 4.29\ \text{GHz}$$

$$\text{Clock rate(P3)} = \frac{18.18 \times 10^9 \times 2.64}{7} \approx 6.86\ \text{GHz}$$
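Because the instruction count does not change, part (C) reduces to scaling each original clock rate by 1.2 / 0.7. The sketch below uses that shortcut; the names `required_clock_rate`, `CPI_FACTOR`, and `TIME_FACTOR` are assumptions for illustration. Note that keeping CPI(P3) unrounded at 2.64 gives about 6.86 GHz, whereas rounding it to 2.6 first gives 6.75 GHz.

```python
# Original clock rates in Hz, from the question.
old_clock_rate = {"P1": 3.0e9, "P2": 2.5e9, "P3": 4.0e9}

CPI_FACTOR = 1.2    # CPI increases by 20%
TIME_FACTOR = 0.7   # execution time reduced by 30%


def required_clock_rate(old_rate_hz):
    """New clock rate for the same instruction count.

    rate_new = IC x CPI_new / T_new
             = rate_old x (CPI_new / CPI_old) x (T_old / T_new)
    """
    return old_rate_hz * CPI_FACTOR / TIME_FACTOR


new_clock_rate = {name: required_clock_rate(rate) for name, rate in old_clock_rate.items()}

for name, rate in new_clock_rate.items():
    print(f"{name}: {rate / 1e9:.2f} GHz")
```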
