Program Design and Analysis Program-Level Performance Analysis

Embedded

Uploaded by

Adal Arasu

3/11/2015

Program design and analysis

Program-level performance analysis.
Optimizing for:
  Execution time.
  Energy/power.
  Program size.
Program validation and testing.

Program-level performance analysis

Need to understand performance in detail:
  Real-time behavior, not just typical.
  On complex platforms.
Program performance ≠ CPU performance:
  Pipeline, cache are windows into program.
  We must analyze the entire program.

Complexities of program performance

Varies with input data:
  Different-length paths.
Cache effects.
Instruction-level performance variations:
  Pipeline interlocks.
  Fetch times.

How to measure program performance

Simulate execution of the CPU.
  Makes CPU state visible.
Measure on real CPU using timer.
  Requires modifying the program to control the timer.
Measure on real CPU using logic analyzer.
  Requires events visible on the pins.
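
The timer-based approach can be sketched as follows. This is a minimal illustration that assumes a hosted C environment, where the standard `clock()` function stands in for a hardware timer; on a real embedded target the program would start and read a timer peripheral instead. The names `work_under_test` and `measure_ticks` are hypothetical.

```c
#include <time.h>

/* Volatile sink keeps the computed result live so the call is not removed. */
static volatile unsigned long sink;

/* Hypothetical workload: sum the integers 0 .. n-1. */
static void work_under_test(unsigned long n)
{
    unsigned long acc = 0;
    for (unsigned long i = 0; i < n; i++)
        acc += i;
    sink = acc;
}

/* Returns elapsed processor time in clock ticks (CLOCKS_PER_SEC per second).
   The two clock() calls are the program modification the slide refers to. */
static clock_t measure_ticks(unsigned long n)
{
    clock_t start = clock();   /* "start the timer" */
    work_under_test(n);
    clock_t stop = clock();    /* "stop the timer"  */
    return stop - start;
}
```

The measured value includes the overhead of the timer calls themselves, and it reflects only the path taken for the chosen input.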


Program performance metrics

Average-case execution time.
  Typically used in application programming.
Worst-case execution time.
  A component in deadline satisfaction.
Best-case execution time.
  Task-level interactions can cause best-case program behavior to result in worst-case system behavior.

Elements of program performance

Basic program execution time formula:
  execution time = program path + instruction timing
Solving these problems independently helps simplify analysis.
  Easier to separate on simpler CPUs.
Accurate performance analysis requires:
  Assembly/binary code.
  Execution platform.

Data-dependent paths in an if statement

    if (a || b) {           /* T1 */
        if ( c )            /* T2 */
            x = r*s+t;      /* A1 */
        else y=r+s;         /* A2 */
        z = r+s+u;          /* A3 */
    }
    else {
        if ( c )            /* T3 */
            y = r-t;        /* A4 */
    }

    a  b  c   path
    0  0  0   T1=F, T3=F: no assignments
    0  0  1   T1=F, T3=T: A4
    0  1  0   T1=T, T2=F: A2, A3
    0  1  1   T1=T, T2=T: A1, A3
    1  0  0   T1=T, T2=F: A2, A3
    1  0  1   T1=T, T2=T: A1, A3
    1  1  0   T1=T, T2=F: A2, A3
    1  1  1   T1=T, T2=T: A1, A3

Paths in a loop

    for (i=0, f=0; i<N; i++)
        f = f + c[i] * x[i];

[Flowchart: i=0; f=0 -> loop test against N -> Y: f = f + c[i] * x[i]; i=i+1 -> back to test; N: exit]
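
The path table for the if statement can be checked mechanically. The sketch below keeps the same control structure and records which assignments execute for a given (a, b, c); the operand values and the names `trace_path` and `A1`..`A4` (as bit flags) are assumptions made for illustration.

```c
/* Bit flags recording which assignments ran. */
enum { A1 = 1, A2 = 2, A3 = 4, A4 = 8 };

/* Same control structure as the slide's if statement. r, s, t, u are
   arbitrary operands; the return value is the set of assignments executed. */
static unsigned trace_path(int a, int b, int c)
{
    int r = 1, s = 2, t = 3, u = 4;
    int x = 0, y = 0, z = 0;
    unsigned hit = 0;

    if (a || b) {                                  /* T1 */
        if (c) { x = r * s + t; hit |= A1; }       /* T2 true  */
        else   { y = r + s;     hit |= A2; }       /* T2 false */
        z = r + s + u;          hit |= A3;
    } else {
        if (c) { y = r - t;     hit |= A4; }       /* T3 */
    }
    (void)x; (void)y; (void)z;   /* values are not used in this sketch */
    return hit;
}
```

Calling `trace_path` for all eight input combinations yields only four distinct assignment sets, which is why several rows of the table share a path.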


Instruction Timing

Not all instructions take the same amount of time.
  Multi-cycle instructions.
  Fetches.
Execution times of instructions are not independent.
  Pipeline interlocks.
  Cache effects.
Execution times may vary with operand value.
  Floating-point operations.
  Some multi-cycle integer operations.

Measurement-driven Performance Analysis

Not so easy as it sounds:
  Must actually have access to the CPU.
  Must know data inputs that give worst/best case performance.
  Must make state visible.
Still an important method for performance analysis.

Trace-driven Measurement

Trace-driven:
  Instrument the program.
  Save information about the path.
Requires modifying the program.
Trace files are large.
Widely used for cache analysis.

Physical Measurement

In-circuit emulator allows tracing.
  Affects execution timing.
Logic analyzer can measure behavior at pins.
  Address bus can be analyzed to look for events.
  Code can be modified to make events visible.
Particularly important for real-world input streams.


Performance Optimization Motivation

Embedded systems must often meet deadlines.
  Faster may not be fast enough.
Need to be able to analyze execution time.
  Worst-case, not typical.
Need techniques for reliably improving execution time.

Programs and Performance Analysis

Best results come from analyzing optimized instructions, not high-level language code:
  Non-obvious translations of HLL statements into instructions;
  Code may move;
  Cache effects are hard to predict.

Loop Optimizations

Loops are good targets for optimization.
Basic loop optimizations:
  Code motion;
  Induction-variable elimination;
  Strength reduction (x*2 -> x<<1).

Code Motion

    for (i=0; i<N*M; i++)
        z[i] = a[i] + b[i];

[Flowchart: i=0; X = N*M; -> loop test i<N*M, replaced by i<X -> Y: z[i] = a[i] + b[i]; i = i+1; -> back to test; N: exit]
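
The code-motion transformation can be written out as a before/after pair. This is a sketch with hypothetical function names; many optimizing compilers perform this hoisting themselves, so the hand-written version mainly shows what the transformation does.

```c
#include <stddef.h>

/* Before: the loop-invariant product N*M appears in the loop test and may be
   re-evaluated on every iteration. */
static void add_arrays_naive(int *z, const int *a, const int *b,
                             size_t N, size_t M)
{
    for (size_t i = 0; i < N * M; i++)
        z[i] = a[i] + b[i];
}

/* After code motion: the product is computed once, outside the loop. */
static void add_arrays_hoisted(int *z, const int *a, const int *b,
                               size_t N, size_t M)
{
    size_t X = N * M;
    for (size_t i = 0; i < X; i++)
        z[i] = a[i] + b[i];
}
```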


Induction Variable Elimination

Induction variable: loop index.
Consider loop:

    for (i=0; i<N; i++)
        for (j=0; j<M; j++)
            z[i][j] = b[i][j];

Rather than recompute i*M+j for each array in each iteration, share induction variable between arrays, increment at end of loop body.

Cache Analysis

Loop nest: set of loops, one inside other.
Perfect loop nest: no conditionals in nest.
Because loops use large quantities of data, cache conflicts are common.
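
Induction-variable elimination on the array-copy loop nest can be sketched as below, assuming both arrays are stored row-major in flat buffers of N*M elements. The function names and the variable `offset` are hypothetical.

```c
#include <stddef.h>

/* Before: the row-major offset i*M + j is recomputed for each array access. */
static void copy_indexed(int *z, const int *b, size_t N, size_t M)
{
    for (size_t i = 0; i < N; i++)
        for (size_t j = 0; j < M; j++)
            z[i * M + j] = b[i * M + j];
}

/* After: one offset is shared by both arrays and incremented at the end of
   the loop body, so no multiplication remains in the inner loop. */
static void copy_shared_offset(int *z, const int *b, size_t N, size_t M)
{
    size_t offset = 0;
    for (size_t i = 0; i < N; i++)
        for (size_t j = 0; j < M; j++) {
            z[offset] = b[offset];
            offset++;
        }
}
```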

Array Conflicts in Cache

[Figure: a[0,0] at address 1024 and b[0,0] at address 4099 in main memory, mapped into the cache]

Array conflicts, contd.

Array elements conflict because they are in the same line, even if not mapped to same location.
Solutions:
  move one array;
  pad array.
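
The conflict between the two addresses in the figure can be worked through numerically. The cache geometry below is an assumption, since the slide text does not state it: a direct-mapped cache with 256 lines of 4 words each, addressed by word. The helper names are hypothetical.

```c
/* Assumed cache geometry (not stated on the slide): direct-mapped,
   256 lines of 4 words each, word-addressed. */
enum { WORDS_PER_LINE = 4, NUM_LINES = 256 };

/* Index of the cache line an address maps to. */
static unsigned cache_line(unsigned addr)
{
    return (addr / WORDS_PER_LINE) % NUM_LINES;
}

/* Position of the address within its cache line. */
static unsigned line_offset(unsigned addr)
{
    return addr % WORDS_PER_LINE;
}
```

Under these assumptions, addresses 1024 and 4099 both map to line 0 (at offsets 0 and 3), so they evict each other although they occupy different locations in the line. Moving or padding b so that b[0,0] sits at 4103 places it in line 1 and removes the conflict.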


Performance Optimization Hints

Use registers efficiently.
Use page mode memory accesses.
Analyze cache behavior:
  Instruction conflicts can be handled by rewriting code, rescheduling;
  Conflicting scalar data can easily be moved;
  Conflicting array data can be moved, padded.

Energy/power Optimization

Energy: ability to do work.
  Most important in battery-powered systems.
Power: energy per unit time.
  Important even in wall-plug systems: power becomes heat.

Measuring Energy Consumption

Execute a small loop, measure current:

    while (TRUE)
        a();

[Figure: measuring the current I]

Sources of Energy Consumption

Relative energy per operation (Catthoor et al):

    Memory transfer:  33
    External I/O:     10
    SRAM write:       9
    SRAM read:        4.4
    Multiply:         3.6
    Add:              1
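
The relative costs in the table can be combined into a rough energy estimate for a given operation mix. This is a sketch only: the struct `op_counts` and function `relative_energy` are hypothetical, and the result is in the same relative units as the table, not joules.

```c
/* Relative energy per operation, taken from the table above. */
static const double E_MEM_TRANSFER = 33.0;
static const double E_EXTERNAL_IO  = 10.0;
static const double E_SRAM_WRITE   = 9.0;
static const double E_SRAM_READ    = 4.4;
static const double E_MULTIPLY     = 3.6;
static const double E_ADD          = 1.0;

/* Hypothetical tally of how often each operation occurs. */
struct op_counts {
    unsigned long mem_transfers, external_io, sram_writes,
                  sram_reads, multiplies, adds;
};

/* Weighted sum of operation counts, in the table's relative units. */
static double relative_energy(const struct op_counts *n)
{
    return n->mem_transfers * E_MEM_TRANSFER
         + n->external_io   * E_EXTERNAL_IO
         + n->sram_writes   * E_SRAM_WRITE
         + n->sram_reads    * E_SRAM_READ
         + n->multiplies    * E_MULTIPLY
         + n->adds          * E_ADD;
}
```

For example, a step that performs two SRAM reads, one multiply, and one add costs 2(4.4) + 3.6 + 1 = 13.4 units, which is less than half of a single memory transfer.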


Cache Behavior is Important

Energy consumption has a sweet spot as cache size changes:
  Cache too small
    Program thrashes, burning energy on external memory accesses;
  Cache too large
    Cache itself burns too much power.

Cache Sweet Spot

[Figure: [Li98] 1998 IEEE]

Optimizing for Energy

First-order optimization:
  High performance = low energy.
Not many instructions trade speed for energy.

Optimizing for Energy

Use registers efficiently.
Identify and eliminate cache conflicts.
Moderate loop unrolling eliminates some loop overhead instructions.
Eliminate pipeline stalls.
Inlining procedures may help: reduces linkage, but may increase cache thrashing.
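
Moderate loop unrolling can be illustrated on a multiply-accumulate loop. The sketch below unrolls by a factor of 4, which is an arbitrary choice for illustration; the function names are hypothetical.

```c
#include <stddef.h>

/* Rolled loop: one test-and-branch per multiply-accumulate. */
static int dot_rolled(const int *c, const int *x, size_t n)
{
    int f = 0;
    for (size_t i = 0; i < n; i++)
        f += c[i] * x[i];
    return f;
}

/* Unrolled by 4: one test-and-branch per four multiply-accumulates.
   A cleanup loop handles the n % 4 leftover elements. */
static int dot_unrolled4(const int *c, const int *x, size_t n)
{
    int f = 0;
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        f += c[i]     * x[i];
        f += c[i + 1] * x[i + 1];
        f += c[i + 2] * x[i + 2];
        f += c[i + 3] * x[i + 3];
    }
    for (; i < n; i++)
        f += c[i] * x[i];
    return f;
}
```

The unrolled body is larger, so heavy unrolling can work against the instruction cache; that trade-off is why the slide says "moderate".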


Efficient Loops

General rules:
  Don't use function calls.
  Keep loop body small to enable local repeat (only forward branches).
  Use unsigned integer for loop counter.
  Use <= to test loop counter.
  Make use of compiler: global optimization, software pipelining.

Single-instruction Repeat Loop Example

    STM #4000h,AR2      ; load pointer to source
    STM #100h,AR3       ; load pointer to destination
    RPT #(1024-1)
    MVDD *AR2+,*AR3+    ; move

Optimizing for Program Size

Goal:
  Reduce hardware cost of memory;
  Reduce power consumption of memory units.
Two opportunities:
  Data;
  Instructions.

Data Size Minimization

Reuse constants, variables, data buffers in different parts of code.
  Requires careful verification of correctness.
Generate data using instructions.


Reducing Code Size

Avoid function inlining.
Choose CPU with compact instructions.
Use specialized instructions where possible.

Program Validation and Testing

But does it work?
Concentrate here on functional verification.
Major testing strategies:
  Black box doesn't look at the source code.
  Clear box (white box) does look at the source code.

Clear-box Testing

Examine the source code to determine whether it works:
  Can you actually exercise a path?
  Do you get the value you expect along a path?
Testing procedure:
  Controllability: provide program with inputs.
  Execute.
  Observability: examine outputs.

Controlling and Observing Programs

    firout = 0.0;
    for (j=curr, k=0; j<N; j++, k++)
        firout += buff[j] * c[k];
    for (j=0; j<curr; j++, k++)
        firout += buff[j] * c[k];
    if (firout > 100.0) firout = 100.0;
    if (firout < -100.0) firout = -100.0;

Controllability:
  Must fill circular buffer with desired N values.
  Other code governs how we access the buffer.
Observability:
  Want to examine firout before limit testing.
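
One way to address both concerns for the FIR code is to pass the buffer contents in as arguments (controllability) and to expose the sum before clipping through an optional output parameter (observability). This is a sketch, not the only possible design; `FIR_N`, `fir_limited`, and `raw_out` are hypothetical names, and the filter length of 4 is arbitrary.

```c
#include <stddef.h>

#define FIR_N 4   /* hypothetical filter length */

/* Computes the filter sum over the circular buffer, starting at curr.
   If raw_out is non-NULL, the unclipped sum is stored there so that a test
   can observe firout before the limit tests are applied. */
static double fir_limited(const double buff[FIR_N], const double c[FIR_N],
                          int curr, double *raw_out)
{
    double firout = 0.0;
    int j, k;

    for (j = curr, k = 0; j < FIR_N; j++, k++)
        firout += buff[j] * c[k];
    for (j = 0; j < curr; j++, k++)
        firout += buff[j] * c[k];

    if (raw_out != NULL)
        *raw_out = firout;

    if (firout > 100.0) firout = 100.0;
    if (firout < -100.0) firout = -100.0;
    return firout;
}
```

A test can then fill `buff` with values chosen to push the sum past the limit and check both the raw and the clipped result.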


Execution Paths and Testing

Paths are important in functional testing as well as performance analysis.
In general, an exponential number of paths through the program.
  Show that some paths dominate others.
  Heuristically limit paths.

Choosing the Paths to Test

Possible criteria:
  Execute every statement at least once.
  Execute every branch direction at least once.
Equivalent for structured programs.
Not true for gotos.

[Figure label: not covered]

Basis Paths

Approximate CDFG with undirected graph.
Undirected graphs have basis paths:
  All paths are linear combinations of basis paths.

Cyclomatic Complexity

Cyclomatic complexity is a bound on the size of basis sets:
  e = # edges
  n = # nodes
  p = number of graph components
  M = e - n + 2p.
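
The cyclomatic complexity formula M = e - n + 2p is simple enough to evaluate directly. The helper name below is hypothetical.

```c
/* Cyclomatic complexity M = e - n + 2p for a control-flow graph with
   e edges, n nodes, and p connected components. */
static int cyclomatic_complexity(int edges, int nodes, int components)
{
    return edges - nodes + 2 * components;
}
```

For a single if-then-else drawn with four nodes (decision, then, else, join) and four edges in one component, M = 4 - 4 + 2 = 2, matching the two paths through the statement.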


Branch Testing

Heuristic for testing branches.
  Exercise true and false branches of conditional.
  Exercise every simple condition at least once.

Branch Testing Example

Correct:

    if (a || (b >= c)) {
        printf("OK\n"); }

Incorrect:

    if (a && (b >= c)) {
        printf("OK\n"); }

Test:
  a = F
  (b >= c) = T
Example:
  Correct: [0 || (3 >= 2)] = T
  Incorrect: [0 && (3 >= 2)] = F
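
The branch-testing example can be turned into an executable check. In this sketch the two conditions are wrapped in hypothetical functions so that the same test vector can be applied to both.

```c
/* Intended condition. */
static int cond_correct(int a, int b, int c)
{
    return a || (b >= c);
}

/* Faulty version: || mistyped as &&. */
static int cond_incorrect(int a, int b, int c)
{
    return a && (b >= c);
}
```

With a = 0, b = 3, c = 2 the correct version is true and the faulty one is false, so this vector exposes the bug. A vector with a = 1 and b >= c would not, because both versions are true there.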

Another Branch Testing Example

Correct:

    if ((x == good_pointer) && (x->field1 == 3)) {
        printf("got the value\n"); }

Incorrect:

    if ((x = good_pointer) && (x->field1 == 3)) {
        printf("got the value\n"); }

Incorrect code changes pointer.
  Assignment returns new LHS in C.
Test that catches error:
  (x != good_pointer) && (x->field1 == 3)

Domain Testing

Heuristic test for linear inequalities.
Test on each side + boundary of inequality.
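
Domain testing can be sketched with a hypothetical linear inequality; the specific condition j <= i + 1 and the faulty variant j < i + 1 are assumptions chosen for illustration, not taken from the slides.

```c
/* Hypothetical intended domain: j <= i + 1. */
static int in_domain_correct(int i, int j)
{
    return j <= i + 1;
}

/* Faulty version: <= mistyped as <, which moves the boundary. */
static int in_domain_incorrect(int i, int j)
{
    return j < i + 1;
}
```

For i = 5, the three test points are j = 5 (inside), j = 6 (on the boundary), and j = 7 (outside). The two versions agree inside and outside and differ only on the boundary, which is why the heuristic includes a boundary point.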


Def-use Pairs

Variable def-use:
  Def when value is assigned (defined).
  Use when used on right-hand side.
Exercise each def-use pair.
  Requires testing correct path.

Loop Testing

Loops need specialized tests to be tested efficiently.
Heuristic testing strategy:
  Skip loop entirely.
  One loop iteration.
  Two loop iterations.
  # iterations much below max.
  n-1, n, n+1 iterations where n is max.
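
The loop-testing heuristic translates into a fixed list of iteration counts. The sketch below applies it to a hypothetical bounded summation routine; `MAX_ITEMS`, `sum_bounded`, and `loop_test_counts` are illustrative names, and the maximum of 8 is arbitrary.

```c
#include <stddef.h>

#define MAX_ITEMS 8   /* hypothetical maximum iteration count n */

/* Sums the first count values; rejects counts above the maximum with -1. */
static long sum_bounded(const int *v, size_t count)
{
    if (count > MAX_ITEMS)
        return -1;
    long s = 0;
    for (size_t i = 0; i < count; i++)
        s += v[i];
    return s;
}

/* Iteration counts suggested by the heuristic: skip the loop, one and two
   iterations, a count well below the maximum, then n-1, n, and n+1. */
static const size_t loop_test_counts[] = {
    0, 1, 2, 3, MAX_ITEMS - 1, MAX_ITEMS, MAX_ITEMS + 1
};
```

The n+1 case checks that the code handles a request beyond the maximum, which in this sketch means returning the error value rather than reading past the bound.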

Black-box Testing

Complements clear-box testing.
  May require a large number of tests.
Tests software in different ways.

Black-box Test Vectors

Random tests.
  May weight distribution based on software specification.
Regression tests.
  Tests of previous versions, bugs, etc.
  May be clear-box tests of previous versions.


How much testing is enough?

Exhaustive testing is impractical.
One important measure of test quality: bugs escaping into field.
Good organizations can test software to give very low field bug report rates.
Error injection measures test quality:
  Add known bugs.
  Run your tests.
  Determine % injected bugs that are caught.
