4 Data-Testing PDF
Topics in
Data-Flow Testing
© SERG
Data-Flow Testing
• Data-flow testing uses the control
flowgraph to explore the unreasonable
things that can happen to data (i.e.,
anomalies).
• Consideration of data-flow anomalies leads
to test path selection strategies that fill the
gaps between complete path testing and
branch or statement testing.
Data-Flow Testing (Cont’d)
• Data-flow testing is the name given to a
family of test strategies based on selecting
paths through the program’s control flow in
order to explore sequences of events related
to the status of data objects.
• E.g., Pick enough paths to assure that:
– Every data object has been initialized prior to its
use.
– All defined objects have been used at least once.
Data Object Categories
• (d) Defined, Created, Initialized
• (k) Killed, Undefined, Released
• (u) Used:
– (c) Used in a calculation
– (p) Used in a predicate
(d) Defined Objects
• An object (e.g., variable) is defined when it:
– appears in a data declaration
– is assigned a new value
– is a file that has been opened
– is dynamically allocated
– ...
(k) Killed Objects
• An object is killed when it is:
– released (e.g., free) or otherwise made
unavailable (e.g., out of scope)
– a loop control variable when the loop exits
– a file that has been closed
– ...
(u) Used Objects
• An object is used when it is part of a
computation or a predicate.
• A variable is used for a computation (c)
when it appears on the RHS (sometimes
even the LHS in case of array indices) of an
assignment statement.
• A variable is used in a predicate (p) when it
appears directly in that predicate.
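The d, k, c, and p categories can be read off ordinary code. The following is a minimal sketch, not taken from the slides: the function name sum_positive and its behavior are assumptions chosen only so that each category appears at least once, and the comments tag every data action with the slide notation.

```c
#include <stdlib.h>

/* Hypothetical example: sums the positive entries of a[0..n-1] in a heap
   cell and returns the total, or -1 if the allocation fails.
   Tags: d = defined, k = killed, c = computational use, p = predicate use. */
int sum_positive(const int *a, int n)
{
    int *total = calloc(1, sizeof *total); /* d: total and *total (allocated and zeroed) */
    if (total == NULL)                     /* p: total */
        return -1;
    for (int i = 0; i < n; i++) {          /* d: i;  p: i, n;  i++ is a c-use then a d of i */
        if (a[i] > 0)                      /* p: a[i];  c: i (array index) */
            *total = *total + a[i];        /* c: *total, a[i], i;  d: *total */
    }
    int result = *total;                   /* c: *total;  d: result */
    free(total);                           /* k: *total (released) */
    return result;                         /* c: result */
}                                          /* k: total, result go out of scope */
```

For instance, calling sum_positive on the array {3, -1, 4, 0, -5} with n = 5 returns 7.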
Example: Definition and Uses
What are the definitions and uses for the program below?

                                Def     C-use         P-use
1.  read (x, y);                x, y
2.  z = x + 2;                  z       x
3.  if (z < y)                                        z, y
4.      w = x + 1;              w       x
    else
5.      y = y + 1;              y       y
6.  print (x, y, w, z);                 x, y, w, z
Data-Flow Anomalies
• A data-flow anomaly is denoted by a
two-character sequence of actions. E.g.,
– ku: Means that an object is killed and then
used.
– dd: Means that an object is defined twice
without an intervening usage.
Example
• A valid (not anomalous) scenario in which
the action sequence for variable A is dpd:
    A = C + D;
    if (A > 0)
        X = 1;
    else
        X = -1;
    A = B + C;
Two-Letter Combinations for
d, k, u
• dd: Probably harmless, but suspicious.
• dk: Probably a bug.
• du: Normal situation.
• kd: Normal situation.
• kk: Harmless, but probably a bug.
• ku: Definitely a bug.
• ud: Normal situation (reassignment).
• uk: Normal situation.
• uu: Normal situation.
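The table above can be applied mechanically to a recorded action sequence. The following is a minimal sketch under stated assumptions: the enum verdict and the functions classify_pair and first_anomaly are hypothetical names, and the finer-grained symbols c and p are treated as u.

```c
#include <stddef.h>
#include <string.h>

/* Verdicts that follow the wording of the two-letter table. */
typedef enum { NORMAL, SUSPICIOUS, PROBABLE_BUG, DEFINITE_BUG } verdict;

/* Maps the use symbols c and p onto u; other characters pass through. */
static char as_dku(char action)
{
    return (action == 'c' || action == 'p') ? 'u' : action;
}

/* Classifies one ordered pair of actions. */
verdict classify_pair(char first, char second)
{
    char a = as_dku(first), b = as_dku(second);
    if (a == 'd' && b == 'd') return SUSPICIOUS;    /* dd */
    if (a == 'd' && b == 'k') return PROBABLE_BUG;  /* dk */
    if (a == 'k' && b == 'k') return PROBABLE_BUG;  /* kk */
    if (a == 'k' && b == 'u') return DEFINITE_BUG;  /* ku */
    return NORMAL;                                  /* du, kd, ud, uk, uu */
}

/* Returns the index of the first pair in seq that is not NORMAL,
   or -1 if every adjacent pair is normal. */
int first_anomaly(const char *seq)
{
    size_t n = strlen(seq);
    for (size_t i = 0; i + 1 < n; i++)
        if (classify_pair(seq[i], seq[i + 1]) != NORMAL)
            return (int)i;
    return -1;
}
```

With these definitions, first_anomaly("dpd") returns -1, while first_anomaly("duku") returns 2 because the pair starting at index 2 is ku.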
Single Letter Situations
• A leading dash means that nothing of
interest (d, k, u) occurs prior to the action
noted along the entry-exit path of interest.
• A trailing dash means that nothing of
interest happens after the point of action
until the exit.
Single Letter Situations
• -k: Possibly anomalous:
– Killing a variable that does not exist.
– Killing a variable that is global.
• -d: Normal situation.
• -u: Possibly anomalous, unless variable is
global.
• k-: Normal situation.
• d-: Possibly anomalous, unless variable is
global.
• u-: Normal situation.
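Data-Flow Anomaly State Graph
[Figure: state graph showing the state of a variable, with states K, D, U, and A (the anomalous state) and transitions labeled by the actions d, k, and u. The labels "u,k" and "d,k" mark transitions into the anomalous state, which loops to itself on "k,u,d".]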
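One way to read the state graph is as a small state machine that is stepped once per action. The sketch below is an assumption-laden reading, not the slide's exact figure: it assumes that a variable starts in K, that the anomalous state A is never left, and that the transitions agree with the two-letter table (dd, dk, kk, and ku are anomalous). The names state, step, and run are hypothetical.

```c
/* States: K = killed or undefined, D = defined, U = used, A = anomalous. */
typedef enum { ST_K, ST_D, ST_U, ST_A } state;

/* Applies one action ('d', 'k' or 'u') to the current state.
   Any other character is treated as anomalous input. */
state step(state s, char action)
{
    if (action != 'd' && action != 'k' && action != 'u')
        return ST_A;
    switch (s) {
    case ST_K:                       /* kd is normal; ku and kk are anomalous */
        return action == 'd' ? ST_D : ST_A;
    case ST_D:                       /* du is normal; dd and dk are anomalous */
        return action == 'u' ? ST_U : ST_A;
    case ST_U:                       /* ud, uk and uu are all normal */
        if (action == 'd') return ST_D;
        if (action == 'k') return ST_K;
        return ST_U;
    default:                         /* the anomalous state is absorbing */
        return ST_A;
    }
}

/* Runs a whole action sequence, starting from K. */
state run(const char *actions)
{
    state s = ST_K;
    for (; *actions != '\0'; actions++)
        s = step(s, *actions);
    return s;
}
```

Under these assumptions, run("dud") ends in D and run("ddu") ends in A. Because A is absorbing, this variant flags even the dd case that the table calls probably harmless.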
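Data-Flow Anomaly State Graph
with Variable Redemption
[Figure: state graph with the normal states K, D, and U and the anomalous states KU, DK, and DD; transitions are labeled by the actions d, k, and u, and further actions can lead from an anomalous state back to a normal one.]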
Static vs Dynamic
Anomaly Detection
• Static Analysis is analysis done on source
code without actually executing it.
• E.g., Syntax errors are caught by static
analysis.
Static vs Dynamic
Anomaly Detection (Cont’d)
• Dynamic Analysis is analysis done as a
program is executing and is based on
intermediate values that result from the
program’s execution.
• E.g., A division by 0 error is caught by
dynamic analysis.
• If a data-flow anomaly can be detected by
static analysis then the anomaly does not
concern testing. (Should be handled by the
compiler.)
Anomaly Detection Using
Compilers
• Compilers are able to detect several data-
flow anomalies using static analysis.
• E.g., By forcing declaration before use, a
compiler can detect anomalies such as:
– -u
– ku
• Optimizing compilers are able to detect
some dead variables.
Is Static Analysis Sufficient?
• Questions:
• Why isn’t static analysis enough?
• Why is testing required?
• Could a good compiler detect all data-flow
anomalies?
• Answer: No. Detecting all data-flow
anomalies is provably unsolvable.
Static Analysis Deficiencies
• Current static analysis methods are
inadequate for:
– Dead Variables: Detecting unreachable
variables is unsolvable in the general case.
– Arrays: Dynamically allocated arrays contain
garbage unless they are initialized explicitly.
(-u anomalies are possible)
Static Analysis Deficiencies
(Cont’d)
– Pointers: Impossible to verify pointer values at
compile time.
– False Anomalies: Even an obvious bug (e.g.,
ku) may not be a bug if the path along which
the anomaly exists is unachievable.
(Determining whether a path is or is not
achievable is unsolvable.)
Data-Flow Modeling
• Data-flow modeling is based on the control
flowgraph.
• Each link is annotated with symbols (e.g., d,
k, u, c, p) or sequences of symbols (e.g., dd,
du, ddd) that denote the sequence of data
operations on that link with respect to the
variable of interest.
Control Flowgraph Annotated for
X and Y Data Flows
1       INPUT X,Y
        Z:= X+Y
        Y:= X-Y
3       IF Z>=0 GOTO SAM
4  JOE: Z:=Z-1
5  SAM: Z:=Z+V
        U:=0
6       LOOP
        B(U),Q(V):=(Z+V)*U
7       IF B(U)=0 GOTO JOE
        Z:=Z-1
8       IF Z=0 GOTO ELL
        U:=U+1
9       UNTIL U=z
        B(U-1):=B(U+1)+Q(V-1)
10 ELL: B(U+Q(V)):=U+V
11      IF U=V GOTO JOE
12      IF U>V THEN U:=Z
13 YY:  Z:=U
2       END
[Figure: control flowgraph with nodes 1, 3, 4, ..., 13, 2, the labels JOE, SAM, LOOP, ELL, YY, and END, and the decision annotations B(U)?, Z?, U,Z?, and U,V?; the links carry the data-flow annotations for X and Y (e.g., dcc on the link leaving node 1).]
Control Flowgraph Annotated for
Z Data Flows
[Figure: the same program and control flowgraph, with each link annotated with the data-flow actions for Z (d, p, c, and cd).]
Definition-Clear Path Segments
• A Definition-clear Path Segment (w.r.t.
variable X) is a connected sequence of links
such that X is defined on the first link and
not redefined or killed on any subsequent
link of that path segment.
Definition-Clear Path Segments
for Variable Z (Cont’d)
[Figure: the Z-annotated control flowgraph, highlighting the definition-clear path segments for Z.]
Non Definition-Clear Path
Segments for Variable Z (Cont’d)
[Figure: the Z-annotated control flowgraph, highlighting path segments that are not definition-clear for Z.]
Simple Path Segments
• A Simple Path Segment is a path segment
in which at most one node is visited twice.
– E.g., (7,4,5,6,7) is simple.
• Therefore, a simple path may or may not be
loop-free.
Loop-free Path Segments
• A Loop-free Path Segment is a path
segment for which every node is visited at
most once.
– E.g., (4,5,6,7,8,10) is loop-free.
– path (10,11,4,5,6,7,8,10,11,12) is not loop-free
because nodes 10 and 11 are visited twice.
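The two path-segment definitions can be checked directly on a list of node numbers. The following is a minimal sketch; the function names is_loop_free and is_simple are hypothetical, and "at most one node is visited twice" is read as: at most one node repeats, and it occurs exactly twice.

```c
#include <stdbool.h>
#include <stddef.h>

/* Computes how many distinct nodes occur more than once in
   path[0..len-1] and the largest number of visits to any single node. */
static void visit_stats(const int *path, size_t len,
                        size_t *repeated_nodes, size_t *max_visits)
{
    *repeated_nodes = 0;
    *max_visits = 0;
    for (size_t i = 0; i < len; i++) {
        size_t earlier = 0, total = 0;
        for (size_t j = 0; j < len; j++) {
            if (path[j] == path[i]) {
                total++;
                if (j < i)
                    earlier++;
            }
        }
        if (total > *max_visits)
            *max_visits = total;
        if (earlier == 0 && total > 1)   /* count a repeated node once */
            (*repeated_nodes)++;
    }
}

/* Loop-free: every node is visited at most once. */
bool is_loop_free(const int *path, size_t len)
{
    size_t repeated, max_visits;
    visit_stats(path, len, &repeated, &max_visits);
    return repeated == 0;
}

/* Simple: at most one node is visited twice, and none more often. */
bool is_simple(const int *path, size_t len)
{
    size_t repeated, max_visits;
    visit_stats(path, len, &repeated, &max_visits);
    return repeated <= 1 && max_visits <= 2;
}
```

Applied to the slide examples, (7,4,5,6,7) is simple but not loop-free, (4,5,6,7,8,10) is both, and (10,11,4,5,6,7,8,10,11,12) is neither.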
du Path Segments
• A du Path is a path segment such that if the
last link has a use of X, then the path is
simple and definition clear.
def-use Associations
• A def-use association is a triple (x, d, u), where:
x is a variable,
d is a node containing a definition of x,
u is either a statement or predicate node
containing a use of x,
and there is a definition-clear path with respect to x from d to u.
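Def-use associations can be enumerated by walking forward from each definition until the variable is redefined. The following is a minimal sketch for the six-statement program of the earlier Definition and Uses example. The flowgraph encoding (edges 1-2, 2-3, 3-4, 3-5, 4-6, 5-6), the names succ, defs, uses, assoc, and def_use_associations, and the assumption that a node's uses happen before its own definition (so y = y + 1 uses the earlier y) are all choices made for this illustration.

```c
#include <stdbool.h>
#include <string.h>

#define NODES 7   /* nodes are numbered 1..6; index 0 is unused */

/* succ[n] lists the successors of node n, terminated by 0. */
static const int succ[NODES][3] = {
    {0}, {2, 0}, {3, 0}, {4, 5, 0}, {6, 0}, {6, 0}, {0}
};
/* One-letter variable names defined and used at each node. */
static const char *defs[NODES] = { "", "xy", "z", "", "w", "y", "" };
static const char *uses[NODES] = { "", "", "x", "zy", "x", "y", "xywz" };

typedef struct { char var; int def, use; } assoc;

/* Walks forward from node n and records every use of var that is
   reached before var is redefined. */
static void walk(char var, int def, int n, bool *seen,
                 assoc *out, int *count)
{
    if (seen[n])
        return;
    seen[n] = true;
    if (strchr(uses[n], var) != NULL)
        out[(*count)++] = (assoc){ var, def, n };
    if (strchr(defs[n], var) != NULL)
        return;                    /* a redefinition ends the search */
    for (int i = 0; succ[n][i] != 0; i++)
        walk(var, def, succ[n][i], seen, out, count);
}

/* Fills out[] with all def-use associations and returns their number.
   The caller must provide room for every association (16 is enough here). */
int def_use_associations(assoc *out)
{
    int count = 0;
    for (int d = 1; d < NODES; d++) {
        for (const char *v = defs[d]; *v != '\0'; v++) {
            bool seen[NODES] = { false };
            for (int i = 0; succ[d][i] != 0; i++)
                walk(*v, d, succ[d][i], seen, out, &count);
        }
    }
    return count;
}
```

Under these assumptions the walk finds ten associations, for example (x, 1, 4), (y, 1, 5), and (y, 5, 6). It finds no definition of w reaching node 6 along the else branch, which corresponds to the -u situation described earlier.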
Example: Def-Use Associations
[Figure: flowgraph of the definition-and-uses example program; the recoverable node labels are 1 read (x, y), 4 w=x+1, 5 y=y+1, and 6 print (x,y,w,z).]
Example: Def-Use Associations
What are all the def-use associations for the program below?
read (z)
x = 0
y = 0
if (z ≥ 0)
{
    x = sqrt (z)
    if (0 ≤ x && x ≤ 5)
        y = f (x)
    else
        y = h (z)
}
y = g (x, y)
print (y)
Example: Def-Use Associations
[Figures: the same program repeated on three slides, highlighting the def-use associations for variable z, variable x, and variable y in turn.]
Definition-Clear Paths
• A path (i, n1, ..., nm, j) is called a definition-clear
path with respect to x from node i to node j if it
contains no definitions of variable x in nodes
(n1, ..., nm , j) .
Data-Flow Testing Strategies
• All du Paths (ADUP)
• All Uses (AU)
• All p-uses/some c-uses (APU+C)
• All c-uses/some p-uses (ACU+P)
• All Definitions (AD)
• All p-uses (APU)
• All c-uses (ACU)
All du Paths Strategy (ADUP)
• ADUP is one of the strongest data-flow
testing strategies.
• ADUP requires that every du path from
every definition of every variable to every
use of that definition be exercised under
some test.
Example: pow(x,y)
/* pow(x,y)
This program computes x to the power of y, where x and y are integers.
INPUT: The x and y values.
OUTPUT: x raised to the power of y is printed to stdout.
*/
1  void pow (int x, int y)
2  {
3      float z;
4      int p;
5      if (y < 0)
6          p = 0 - y;
7      else p = y;
8      z = 1.0;
9      while (p != 0)
10     {
11         z = z * x;
12         p = p - 1;
13     }
14     if (y < 0)
15         z = 1.0 / z;
16     printf("%f", z);
17 }
[Figure: flowgraph with nodes 1, 5, 8, 9, 14, 16, and 17 and links labeled a through i.]
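To exercise the definitions and uses in this listing, a test set needs inputs that take each branch and that both enter and skip the loop. The following is a minimal sketch, not the slides' own test set: the function is renamed power and returns z as a double instead of printing it so that results can be checked, and the three inputs and the names test_case, cases, and run_cases are assumptions. The comments refer to the line numbers of the listing above.

```c
#include <stddef.h>

/* Same logic as the listing, with the result returned rather than printed. */
double power(int x, int y)
{
    double z;
    int p;
    if (y < 0)
        p = 0 - y;
    else
        p = y;
    z = 1.0;
    while (p != 0) {
        z = z * x;
        p = p - 1;
    }
    if (y < 0)
        z = 1.0 / z;
    return z;
}

typedef struct { int x, y; double expected; } test_case;

static const test_case cases[] = {
    { 2,  3, 8.0  },  /* p defined at 7; loop entered; z from 11 reaches 16 */
    { 2,  0, 1.0  },  /* loop skipped; z from 8 reaches 16; x is never used */
    { 2, -2, 0.25 },  /* p defined at 6; z from 11 reaches 15, then 15 to 16 */
};

/* Returns the number of cases whose result differs from the expectation. */
int run_cases(void)
{
    int failures = 0;
    for (size_t i = 0; i < sizeof cases / sizeof cases[0]; i++)
        if (power(cases[i].x, cases[i].y) != cases[i].expected)
            failures++;
    return failures;
}
```

The expected values were chosen to be exactly representable, so the comparison with != is safe here. One association is left uncovered on purpose: z defined at line 8 cannot reach its use at line 15 without an intervening definition at line 11, because y < 0 forces at least one loop iteration.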
Example: pow(x,y)
du-Paths for Variables x and y
[Figures: the pow(x,y) listing and flowgraph repeated on five slides, highlighting du-paths for variable x (two slides) and for variable y (three slides).]
Example: Using du-Path Testing
to Test Program COUNT
• Consider the following program:
/* COUNT
This program counts the number of characters and lines in a text file.
INPUT: Text File
OUTPUT: Number of characters and number of lines.
*/
1  int main(int argc, char *argv[])
2  {
3      int numChars = 0;
4      int numLines = 0;
5      char chr;
6      FILE *fp = NULL;
7
8      if (argc < 2)
9      {
10         printf("\nUsage: %s <filename>", argv[0]);
11         return (-1);
12     }
13     fp = fopen(argv[1], "r");
14     if (fp == NULL)
15     {
16         perror(argv[1]); /* display error message */
17         return (-2);
18     }
19     while (!feof(fp))
20     {
21         chr = getc(fp); /* read character */
22         if (chr == '\n') /* if newline */
23             ++numLines;
24         else
25             ++numChars;
26     }
27     printf("\nNumber of characters = %d", numChars);
28     printf("\nNumber of lines = %d", numLines);
29 }
The Flowgraph for COUNT
[Figure: flowgraph with nodes 1, 8, 11, 14, 17, 19, 22, 23, 24, 26, and 29 and the decision annotations argc?, fp?, and chr?.]
du-Paths for argc, argv[], numChars, numLines, chr, and fp
[Figures: the COUNT flowgraph repeated on successive slides, with links annotated d, c, p, cd, or pc to highlight the du-paths for argc, argv[], numChars, numLines, chr, and fp in turn.]
All Uses Strategy (AU)
• AU requires that at least one path from
every definition of every variable to every
use of that definition be exercised under
some test.
• Hence, at least one definition-clear path
from every definition of every variable to
every use of that definition must be exercised
under some test.
• Clearly, AU < ADUP.
All p-uses/Some c-uses
Strategy (APU+C)
• APU+C requires that, for every variable and
every definition of that variable, the tests
include at least one definition-free path from
the definition to every predicate use.
• If there are definitions of the variable that
are not covered by the above prescription,
then add computational-use test cases to
cover every definition.
All c-uses/Some p-uses Strategy
(ACU+P)
• ACU+P requires that, for every variable and
every definition of that variable, the tests
include at least one definition-free path from
the definition to every computational use.
• If there are definitions of the variable that
are not covered by the above prescription,
then add predicate-use test cases to cover
every definition.
All Definitions Strategy (AD)
• AD requires that, for every variable and
every definition of that variable, the tests
include at least one definition-free path from
the definition to a computational or predicate
use.
• AD < ACU+P and AD < APU+C.
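The difference between AD and AU shows up once each def-use association is marked as covered or not by the test set. The following is a minimal sketch; the struct assoc with its covered flag and the functions satisfies_all_uses and satisfies_all_defs are hypothetical names, and how the covered flags are obtained (e.g., by tracing test runs) is left out.

```c
#include <stdbool.h>
#include <stddef.h>

/* One def-use association plus a flag saying whether some test
   exercised a definition-free path from def to use. */
typedef struct { char var; int def, use; bool covered; } assoc;

/* AU: every association must be covered. */
bool satisfies_all_uses(const assoc *a, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (!a[i].covered)
            return false;
    return true;
}

/* AD: every definition must have at least one covered association. */
bool satisfies_all_defs(const assoc *a, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        bool def_covered = false;
        for (size_t j = 0; j < n; j++)
            if (a[j].var == a[i].var && a[j].def == a[i].def && a[j].covered)
                def_covered = true;
        if (!def_covered)
            return false;
    }
    return true;
}
```

For example, if a definition of x at node 1 has uses at nodes 2 and 4 and only the use at node 2 is covered, satisfies_all_defs returns true while satisfies_all_uses returns false.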
All p-uses (APU)
All c-uses (ACU)
• APU is the same as APU+C without the C
requirement.
• APU < APU+C.
• ACU is the same as ACU+P without the P
requirement.
• ACU < ACU+P.
Relationship among DF criteria
[Figure: hierarchy of the criteria, drawn from ALL-PATHS at the top to ALL-NODES at the bottom.]
ALL-PATHS
ALL-DU-PATHS
ALL-USES
ALL-P-USES/SOME-C-USES
ALL-C-USES/SOME-P-USES
ALL-P-USES
ALL-C-USES   ALL-DEFS
ALL-EDGES
ALL-NODES
Feasible Data Flow Criteria
• What happens if we eliminate all un-executable
associations from consideration?
• If we eliminate all un-executable associations from
consideration, then there are significant differences
between the Data Flow criteria and the Feasible Data Flow
criteria.
• For a large class of “well behaved programs”, the Feasible
DF criteria All-p-uses, All-p-uses/some-c-uses, and All-uses
bridge the gap between All-edges and All-paths.
• However, for certain programs with anomalies there are
tests which satisfy All-p-uses without satisfying All-edges.
An Example
[Figure: flowgraph whose recoverable node labels are 1 read (x), 2 x = sqrt(x), and 3 x<0, plus nodes 6 through 9; the branches are labeled T and F.]
Effectiveness of Strategies
• Ntafos compared Random, Branch, and
All uses testing strategies on 14 Kernighan
and Plauger programs.
• Kernighan and Plauger programs are a set
of mathematical programs with known bugs
that are often used to evaluate test
strategies.
• Ntafos conducted two experiments:
Results of 2 of the 14
Ntafos Experiments
Strategy    Mean Number of Test Cases    Percentage of Bugs Found
Random      35                           93.7
Branch      34                           85.5
Data-Flow Testing Tips (Cont’d)
• Use explicit (rather than implicit)
declarations of data when possible.
• Put data declarations at the top of the
routine and return data objects at the bottom
of the routine.
Summary
• Data are as important as code.
• Define what you consider to be a data-flow
anomaly.
• Data-flow testing strategies span the gap
between all paths and branch testing.
Summary (Cont’d)
• AU has the best payoff for the money. It
seems to be no worse than twice the number
of required test cases for branch testing, but
the results are much better.
• Path testing with Branch Coverage and
Data-flow testing with AU is a very good
combination.