III.3 Dynamic Testing Techniques
Common definition [Myers1979]:
Testing is a process of executing a program with the intent of finding an error.
A good test case is one that has a high probability of finding an undiscovered error.
A successful test is one that uncovers an as-yet undiscovered error.
Also:
testing to assess reliability
(not considered here)
Test:
(1) An activity in which a system or component is executed under specified conditions, the results are observed or recorded, and an evaluation is made of some aspect of the system or component.
(2) To conduct an activity as in (1).
(3) (IEEE Std 829-1983 [5]) A set of one or more test cases.
(4) (IEEE Std 829-1983 [5]) A set of one or more test procedures.
(5) (IEEE Std 829-1983 [5]) A set of one or more test cases and procedures.
Objectives
Design tests that
systematically uncover defects
Maximize bug count: uncover as many defects (or bugs) as possible
Use the minimum cost and effort, within budgetary and scheduling limits
Guide correction such that an acceptable level of quality is reached
Help managers
make ship / no-ship decisions
Assess quality (depends on the nature of the product)
Block premature product releases; Minimize technical support costs
It is impossible to verify the correctness of the product by testing, but you can prove that the product is not correct, or you can demonstrate that you didn’t find any errors in a given time period.
Minimize risks
(especially safety-related lawsuit risk):
Assess conformance to specification
Conform to regulations
Find safe
scenarios for use
(find ways to get it to work, in spite of the bugs)
Indirect objectives
To compile a
record of software defects
for use in error prevention
(by corrective and preventive actions)
Testing Fundamentals
Software testing is a critical element of Software Quality Assurance
It represents the ultimate review of the requirements specification, the design, and the code.
It is the most widely used method to ensure software quality.
Many organizations spend 40-50% of development time in testing.
Testing is the one step in the software engineering process that could be viewed as destructive rather than constructive.
“A successful test is one that breaks the software.” [McConnell 1993]
A successful test is one that uncovers an as yet undiscovered defect.
Testing cannot show the absence of defects; it can only show that software defects are present.
For most software, exhaustive testing is not possible.
Test Cases & Set of Test Cases
Describe how to test a system/module/function
Description must identify
system state before executing the test
system/module/function to be tested
parameter values for the test
expected outcome of the test
Should enable automation!
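The four items a test case description must identify can be captured in a small record that a tool can execute. The following is a minimal sketch in Python, not a prescribed format: the names `TestCaseSpec` and `deposit` are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class TestCaseSpec:
    """One test case: initial state, unit under test, inputs, expected outcome."""
    name: str
    setup_state: dict            # system state before executing the test
    unit: Callable[..., Any]     # system/module/function to be tested
    args: tuple                  # parameter values for the test
    expected: Any                # expected outcome of the test

    def run(self) -> bool:
        # Pass a copy of the state so that one test cannot disturb another.
        actual = self.unit(dict(self.setup_state), *self.args)
        return actual == self.expected


def deposit(state: dict, amount: int) -> int:
    """Hypothetical unit under test: returns the new balance."""
    return state["balance"] + amount


case = TestCaseSpec("deposit adds to balance", {"balance": 100}, deposit, (50,), 150)
print(case.name, "passed" if case.run() else "failed")
```

Because every field is explicit data, a set of such records can be run and evaluated without manual steps.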
“Adequate” Set of Test Cases?
Objective: to uncover as many defects as possible
Test set criteria: does the set cover the system in a complete/sufficient manner
Constraints: with a minimum/appropriate amount of effort and time
[Figure: input test data (Ie) are fed to the system, which produces output test results (Oe); some inputs cause anomalous behaviour, and some outputs reveal the presence of defects]
Model of Testing (1/2)
[Figure: the environment, the program, and nature and psychology are abstracted into an environment model, a program model, and a bug model; these feed the tests, which lead to an expected or an unexpected outcome]
Testing Approaches
Black box testing
1. Testing that ignores the internal mechanism of the system or component and focuses solely on the outputs in response to selected inputs and execution conditions
2. Testing conducted to evaluate the compliance of a system or component with specified functional requirements
(sometimes named functional testing)
White box testing
Testing that takes into account the internal mechanism of
a system or component
(sometimes named structural testing)
III.3.2 Black Box Testing Techniques
An approach to testing where the program is considered as a “black-box”
[Figure: a black box derived from the requirements, with input and events going in and output coming out]
Black box testing:
(1) Function testing
(2) Domain testing
(3) Specification-based testing
(4) Risk-based testing
(5) Stress testing
(6) Regression testing
(7) User testing
(8) Scenario testing
(9) State-model based testing
(10) High volume automated testing
(11) Exploratory testing
(1) Function Testing
Test each function / feature / variable in isolation.
Usually start with fairly simple function testing
focuses on a single function and tests it with middle-of-the-road values
don’t expect the program to fail
but if it fails the algorithm is fundamentally wrong, the build is broken, or …
Later switch to a different style which often involves the interaction of several functions
These tests are highly credible and easy to evaluate but not particularly powerful
(2) Domain Testing
Essence: sampling
Equivalence class testing:
Look at each variable's domain (including invalid values)
reduce a massive set of possible tests (domain) into a small group by partitioning
pick one or two representatives from each subset
look for the “best representative”: a value that is at least as likely to expose an error as any other
Boundary value analysis:
A good test set for a numeric variable hits every boundary value
because boundary / extreme-value errors are common
use the minimum, the maximum, a value barely below the minimum,
and a value barely above the maximum
bugs are sometimes dismissed, especially when you test extreme
values of several variables at the same time (corner cases)
Equivalence Partitioning
Input data and output results often fall into different classes
Each of these classes is an equivalence partition where the program behaves in an “equivalent” way
⇒ Test cases should be chosen from each partition
[Figure: input partitions feeding the system, which produces output partitions]
Equivalence Partitioning
an approach that divides the input domain of a program into classes of data from which test cases can be derived.
Example:
Suppose a program computes the value of the function $\sqrt{(X - 1) \ast (X + 2)}$.
This function defines the following valid and invalid equivalence classes:
X <= -2        valid
-2 < X < 1     invalid
X >= 1         valid
Test cases would be selected from each equivalence class.
[SWENE]
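The example can be tried out with one representative per class. The sketch below assumes the function in the example is the square root of (X - 1) * (X + 2), which matches the three classes listed; the representative values -5, 0 and 4 are arbitrary choices.

```python
import math


def f(x: float) -> float:
    """sqrt((x - 1) * (x + 2)); math.sqrt raises ValueError for a negative argument."""
    return math.sqrt((x - 1) * (x + 2))


def equivalence_class(x: float) -> str:
    """Name the equivalence class of the example that x belongs to."""
    if x <= -2:
        return "valid: X <= -2"
    if x < 1:
        return "invalid: -2 < X < 1"
    return "valid: X >= 1"


# One (arbitrarily chosen) representative per equivalence class.
for x in (-5.0, 0.0, 4.0):
    try:
        outcome = f(x)
    except ValueError:
        outcome = "rejected"
    print(x, "|", equivalence_class(x), "|", outcome)
```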
Boundary Value Analysis (1/2)
a testing technique where test cases are designed to test the boundary of an input domain.
Boundary value analysis complements and can be used in conjunction with equivalence partitioning.
Example (cont.): boundary values of the three ranges have to be included in the test data. That is, we might choose the following test cases:
X <= -2        MIN, -100, -2.1, -2
-2 < X < 1     -1.9, -1, 0, 0.9
X >= 1         1, 1.1, 100, MAX
[SWENE]
“Bugs lurk in corners and congregate at boundaries…”
Boundary Value Analysis (2/2)
If an input condition specifies a range [a, b]:
Example: [-3, 10]
⇒ test values: 0 (interior point); 0 is always a good candidate!
⇒ test values: -3, 10 (extreme points)
⇒ test values: -2, 9 (boundary points)
⇒ test values: -4, 11 (off points)
If an input condition specifies a number of values:
⇒ minimum and maximum
⇒ values just above and below minimum and maximum
Also for more complex data (data structures), e.g. an array:
⇒ input condition: empty, single element, full, out-of-boundary
⇒ search for element: element is inside array or not
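For an integer range, the four kinds of test values can be generated mechanically. This is a small sketch under the assumption of integer inputs with a step of 1; the function name `boundary_values` is made up for the illustration.

```python
def boundary_values(a: int, b: int) -> dict:
    """Candidate test values for an integer range [a, b] with a + 1 < b."""
    # Prefer 0 as the interior point when it lies strictly inside the range.
    interior = 0 if a < 0 < b else (a + b) // 2
    return {
        "interior": [interior],
        "extreme": [a, b],           # the limits themselves
        "boundary": [a + 1, b - 1],  # just inside the limits
        "off": [a - 1, b + 1],       # just outside the limits
    }


print(boundary_values(-3, 10))
```

For the range [-3, 10] this reproduces the values listed above.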
(3) Specification-Based Testing
Check the program against a reference document:
design specification, a requirements list, a user interface description, a model, or a user manual.
Rationale:
the specification might be part of a contract
products must conform to their advertisements
safety-critical products must conform to any safety-related specification.
Observation
Specification-driven tests are often weak, not particularly powerful
representatives of the class of possible tests
Different approaches:
focus on the specification: the set of test cases should include an
unambiguous and relevant test for each claim made in the spec.
look further, for problems in the specification: explore ambiguities in the
spec. or examine aspects of the product that were not well-specified
(4) Risk-Based Testing
Process
(1) Imagine a way the program could fail (cf. error guessing)
(2) Design one or more tests to check for it
a good risk-based test = a powerful representative that addresses a given risk.
“complete”: a list of every way the program could fail.
If tests tie back to
significant failures in the field or
well known failures in a competitor’s product
⇒ failure will be highly credible and highly motivating
However, many risk-based tests are dismissed as academic (unlikely to occur in real use).
Tests for possible problems
⇒ carry high information value
(We learn a lot whether the program passes the test or not)
(5) Stress Testing
Different definitions of stress testing:
Hit the program with a peak burst of activity and see it fail.
Testing conducted to evaluate a system or component at or beyond the
limits of its specified requirements with the goal of causing the system to
fail. [IEEE Standard 610.12-1990]
Driving the program to failure in order to watch how the program fails.
Keep increasing the size or rate of input until either the program finally fails or
you become convinced that further increases won’t yield a failure.
Look at the failure and ask what vulnerabilities have been exposed and which of
them might be triggered under less extreme circumstances.
Discussion:
stress test results are often dismissed as not representative of customer use
One problem with stress testing is that a failure may not be easily analyzed (the test must include enough diagnostic support)
stress-like tests can expose failures that are hard to see if the system is not running with a high load (e.g., many concurrent tasks)
(6) Regression Testing
Process
(1) design tests with the intent of regularly reusing them
(2) repeat the tests after making changes to the program.
A good regression test is designed for reuse:
adequately documented and maintainable
designed to be likely to fail if changes induce errors in the addressed part of the program
Discussion:
The first run of the tests may have been powerful, but after a test has passed many times it is not very likely to detect defects (unless there have been major changes or changes in the part of the code directly involved with this test).
⇒ Usually carries little information value
Regression Testing Process
Requirements changes
Affect design, coding, and
testing document update
Design changes
Affect coding, tests, associated
components, system architecture,
related component interactions
Implementation changes
Affect test cases, test data, test scripts,
test specification (see also code change impact)
Test changes
Affect other tests and test documentation
Document changes
Affect other documents
[Figure: process flow: Software Change Analysis → Software Change Impact Analysis → Define Regression Testing Strategy → Build Regression Test Suite → Run Regression Tests → Report Retest Results]
Regression Testing Strategies
Random: The test cases are randomly selected from the existing test suite.
Retest-all: Run all the tests in the existing suite.
Safe: The test selection algorithm excludes no test from the original test suite that if executed would reveal faults in the modified program.
Based on modifications: Place an emphasis on selecting existing test cases to cover modified program components and those that may be affected by the modifications.
Dataflow/coverage based: Select tests that exercise data interactions that have been affected by modifications.
Remark ([Juriso+2004]):
For small programs with a small set of test cases, use retest-all.
Use safe techniques for large programs and programs with a large number of test cases.
(7) User Testing
The testing done by real users (not testers pretending to be users)
User tests might be designed by
the users,
the testers, or
other people (sometimes even by lawyers; acceptance tests in a contract)
Continuum:
At one end, tests can be designed in such detail that the user merely executes them and reports whether the program passed or failed: a carefully scripted demonstration, without much opportunity to detect defects
At the other end, leave the user some freedom for cognitive activity while providing enough structure to ensure effective reports (in a way that helps readers understand and troubleshoot the problem)
Detecting problems a user will encounter in real use is much more difficult
Failures found in user testing are typically credible
Beta tests:
often described as cheap, effective user tests
but in practice they can be quite expensive to administer
they may not yield much information.
automation required!
(8) Scenario Testing
Check how the program copes with a scenario
A good scenario test should be
credible,
motivating,
easy to evaluate, and
complex.
How often should a given scenario test be run?
Some create a pool of scenario tests as regression tests
Others run a scenario once or a small number of times and then design another scenario rather than sticking with the ones they’ve used before
Scenarios help to develop insight into the product
Early in testing (understand the product at all) or
Late in testing (understand advanced uses of a stabilized product)
(9) State-Model-Based Testing
Model the visible behavior of the program as a state machine
Check for conformance with the state machine model
Approach is credible, motivating and easy to troubleshoot
Tradeoffs:
state-based testing often involves simplification; if the model is oversimplified, failures exposed by the model can be false negatives and difficult to troubleshoot
more detailed models find more bugs, but can be much harder to read and maintain
Experience shows that much of the bug-finding happens while building the state models rather than while testing
coverage criteria / stopping rule (cf. [Al-Ghafees&Whittaker2002])
State-Model-Based Testing
Uncovers the following issues:
Wrong number of states.
Wrong transition for a given state-input combination.
Wrong output for a given transition.
States or sets of states that have become dead.
States or sets of states that have become unreachable.
Test sequence:
Define a set of covering input sequences that
start from the initial state and
get back to the initial state
For each step in the input sequence, define the expected next state, the expected transition, and the expected output code
Remarks:
Completeness: transitions and outputs for every combination of input and states
Instrument the transitions to also capture the sequence of states (not just the outputs)
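The test sequence idea can be sketched as a conformance check of an implementation against a transition table. Everything in this example is hypothetical: the two-state player model, the `Player` class, and the chosen input sequence are illustrations, not part of the course material. The implementation exposes its state so that the sequence of states can be compared, not just the outputs.

```python
# Hypothetical model: (state, input) -> (expected next state, expected output)
MODEL = {
    ("stopped", "play"): ("playing", "started"),
    ("stopped", "stop"): ("stopped", "ignored"),
    ("playing", "play"): ("playing", "ignored"),
    ("playing", "stop"): ("stopped", "halted"),
}


class Player:
    """Hypothetical implementation under test; its state is visible for instrumentation."""

    def __init__(self) -> None:
        self.state = "stopped"

    def handle(self, event: str) -> str:
        if event == "play" and self.state == "stopped":
            self.state = "playing"
            return "started"
        if event == "stop" and self.state == "playing":
            self.state = "stopped"
            return "halted"
        return "ignored"


def check_conformance(inputs, initial="stopped"):
    """Replay inputs on model and implementation; report every step where they differ."""
    impl, state, failures = Player(), initial, []
    for step, event in enumerate(inputs):
        state, expected_output = MODEL[(state, event)]
        output = impl.handle(event)
        if (impl.state, output) != (state, expected_output):
            failures.append((step, event, impl.state, output))
    return failures, state


# This sequence covers all four transitions and returns to the initial state.
failures, final_state = check_conformance(["stop", "play", "play", "stop"])
print(failures, final_state)
```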
Variant: Syntax Testing
Test model: BNF-based syntax instead of state machine
E.g., for testing command-driven software
Dirty syntax testing:
a) to exercise a good sample of single syntactic errors in all commands in an attempt to break the software.
b) to force every diagnostic message appropriate to that command to be executed.
Example test cases:
correct ones: 1.54683e5, 1.6548
incorrect without loops: .05, 1., 1.1e
incorrect with loops: 123456.78901.1, 1.12345e6789D-123
[Figure: syntax graph for <real_number>, built from <digital>, ".", the exponent markers E, e, D, d, and the signs + and -]
(10) High-Volume Automated Testing
Massive numbers of tests, comparing the results against a partial oracle.
The simplest partial oracle: crashing
State-model-based testing (if the stopping rule is not based on a coverage criterion), or the more general notion of stochastic state-based testing [Whittaker1997].
Randomly corrupt input data
Repeatedly feed random data to the application under test and a reference oracle in parallel [Kaner2000] (back-to-back testing)
Run an arbitrarily long random sequence of regression tests to let memory leaks, stack corruption, wild pointers or other garbage that accumulates over time finally cause failures. Probes (tests built into the program) can be used to expose problems. [Kaner2000]
Almost complete automation of the testing is required
The low power of individual tests is made up for by massive numbers
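Back-to-back testing with a reference oracle can be sketched in a few lines. In this illustration the unit under test is a hypothetical hand-written `gcd_under_test`, and Python's standard `math.gcd` plays the role of the reference oracle; the input range and number of runs are arbitrary.

```python
import math
import random


def gcd_under_test(a: int, b: int) -> int:
    """Hypothetical implementation under test (Euclid's algorithm)."""
    a, b = abs(a), abs(b)
    while b:
        a, b = b, a % b
    return a


def back_to_back(runs: int, seed: int = 0) -> list:
    """Feed random inputs to the unit and to the reference oracle; collect mismatches."""
    rng = random.Random(seed)  # fixed seed so that a failing run can be reproduced
    mismatches = []
    for _ in range(runs):
        a = rng.randint(-10**6, 10**6)
        b = rng.randint(-10**6, 10**6)
        if gcd_under_test(a, b) != math.gcd(a, b):
            mismatches.append((a, b))
    return mismatches


print(len(back_to_back(100_000)), "mismatches")
```

Each individual random test is weak; the approach relies on volume and on the oracle to evaluate results automatically.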
(11) Exploratory Testing
any testing which uses information gained while testing to better test [Bach2003a].
continuum from
purely scripted (the tester does precisely what the script specifies and nothing else) to
purely exploratory (no pre-specified activities, no documentation beyond bug reports is required)
prototypic case: “freestyle exploratory testing”
exploratory testers continually learn about the software, the context, the ways it could fail, its weaknesses, and the best ways to test the software.
At the same time they also test the software and report the problems
Test cases are good to the extent that they advance the tester’s knowledge (goal-driven)
as the tester gains new knowledge the goals may also change quickly
which type of testing: whatever is most likely to reveal the information the tester is looking for
Not purely spontaneous:
studying competitive products,
review failure histories of this and analogous products
interviewing programmers and users,
reading specifications, and
working with the product.
Further Reading
[Al-Ghafees&Whittaker2002] Al-Ghafees, Mohammed; & Whittaker, James (2002) “Markov Chain-based Test Data Adequacy Criteria: A Complete Family”, Informing Science & IT Education Conference, Cork, Ireland, http://ecommerce.lebow.drexel.edu/eli/2002Proceedings/papers/AlGha180Marko.pdf
[Bach2003a] Bach, James (2003a) “Exploratory Testing Explained”, www.satisfice.com/articles/et-article.pdf
[Berger2001] Berger, Bernie (2001) “The dangers of use cases employed as test cases”, STAR West conference, San Jose, CA, www.testassured.com/docs/Dangers.htm
[El-Far1995] El-Far, Ibrahim K. (1995) Automated Construction of Software Behavior Models, M.Sc. Thesis, Florida Tech, www.geocities.com/model_based_testing/elfar_thesis.pdf
[El-Far2001] El-Far, Ibrahim K. (2001) “Enjoying the Perks of Model-Based Testing”, STAR West conference, San Jose, CA, www.geocities.com/model_based_testing/perks_paper.pdf
[El-Far+2001] El-Far, Ibrahim K.; Thompson, Herbert; & Mottay, Florence (2001) “Experiences in Testing Pocket PC Applications”, International Software Quality Week Europe, www.geocities.com/model_based_testing/pocketpc_paper.pdf
[Kaner2000] Kaner, Cem (2000) “Architectures of Test Automation”, STAR West conference, San Jose, CA, www.kaner.com/testarch.html
[Kaner2002] Kaner, Cem (2002) A Course in Black Box Software Testing (Professional Version), www.testingeducation.org/coursenotes/kaner_cem/cm_200204_blackboxtesting/
[Kaner2003] Kaner, Cem (2003) “What is a good test case?”, Software Testing Analysis & Review Conference (STAR) East, Orlando, FL, May 12-16, 2003.
[Nyman 1998] Nyman, N. (1998) “Application Testing with Dumb Monkeys”, STAR West conference, San Jose, CA.
[Pettichord2002] Pettichord, Bret (2002) “Design for Testability”, Pacific Northwest Software Quality Conference, October 2002, www.io.com/~wazmo/papers/design_for_testability_PNSQC.pdf
[Whittaker1997] Whittaker, James (1997) “Stochastic Software Testing”, Annals of Software Engineering Vol. 4, 115-131.
III.3.3 White Box Testing Techniques
Black box testing:
Requirements fulfilled
Interfaces available and working
But can we also exploit the internal structure of a component, interactions between objects?
⇒ white box testing
White box testing:
(1) Control flow testing
(2) Data flow testing
(1) Control Flow Testing: Example
procedure XYZ is
  A,B,C: INTEGER;
begin
 1.  GET(A); GET(B); GET(C);
 2.  if (A > 15 and C > 20) then
 3.    if B < 10 then
 4.      B := A + 5;
 5.    else
 6.      B := A - 5;
 7.    end if
 8.  else
 9.    A := B + 5;
10.  end if;
end XYZ;
[SWENE]
[Figure: control flow graph of procedure XYZ with nodes 1, 2, 3, 4, 6, 7, 9, 10]
Control Flow Testing
a) Statement coverage: The test cases are generated so that all the program statements are executed at least once.
b) Decision coverage (branch coverage): The test cases are generated so that the program decisions take the value true or false.
c) Condition coverage: The test cases are generated so that all the program conditions (predicates) that form the logical expression of the decision take the value true or false.
d) Path coverage: Test cases are generated to execute all/some program paths.
a) Statement Coverage
Statement coverage: The test cases are generated so that all the program statements are executed at least once.
[Figure: control flow graph of procedure XYZ]
b) Decision / Branch Coverage
Decision coverage (branch coverage): The test cases are generated so that the program decisions take the value true or false.
[Figure: control flow graph of procedure XYZ]
c) Condition Coverage (1/2)
Condition coverage: The test cases are generated so that all the program conditions (predicates) that form the logical expression of the decision take the value true or false.
[Figure: control flow graph of procedure XYZ]
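The three criteria can be compared on procedure XYZ with a small experiment. The following sketch transcribes XYZ into Python and records the statement numbers that are executed; the test inputs are chosen for the illustration and are not from the original example. Note that Python's `and` short-circuits, whereas the discussion of condition coverage assumes complete condition evaluation.

```python
def xyz(a: int, b: int, c: int) -> list:
    """Python transcription of procedure XYZ; returns the statement numbers executed."""
    trace = [1, 2]
    if a > 15 and c > 20:
        trace.append(3)
        if b < 10:
            trace.append(4)  # B := A + 5
        else:
            trace.append(6)  # B := A - 5
        trace.append(7)
    else:
        trace.append(9)      # A := B + 5
    trace.append(10)
    return trace


# The first three inputs make every decision both true and false, which here
# also executes every statement. The fourth input makes C > 20 false while
# A > 15 is true, so each atomic condition takes both values as well.
tests = [(20, 5, 25), (20, 15, 25), (10, 5, 25), (20, 5, 10)]
covered = set()
for t in tests:
    covered.update(xyz(*t))
print(sorted(covered))
```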
Simple Condition Coverage
Composed condition: ((A and B) or C)
Test cases must cover all the values true or false for each condition (predicate) of the logical expression
combinatorial explosion
⇒ only coverage for simple condition coverage
Simple condition: (A)
degenerates to decision coverage, but simple condition coverage does not ensure decision coverage
Remarks:
complete condition evaluation assumed
Otherwise compiler dependent execution semantics
A       B       C       (A and B) or C
TRUE    TRUE    TRUE    TRUE
TRUE    TRUE    FALSE   TRUE
TRUE    FALSE   TRUE    TRUE
TRUE    FALSE   FALSE   FALSE
FALSE   TRUE    TRUE    TRUE
FALSE   TRUE    FALSE   FALSE
FALSE   FALSE   TRUE    TRUE
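That simple condition coverage does not ensure decision coverage can be checked directly for (A and B) or C. In this sketch the helper names are made up, and the two-test set is one example chosen for the illustration.

```python
from itertools import product


def decision(a: bool, b: bool, c: bool) -> bool:
    return (a and b) or c


# Multiple condition coverage needs all 2**3 combinations of the atomic conditions.
truth_table = [(a, b, c, decision(a, b, c))
               for a, b, c in product([True, False], repeat=3)]


def simple_condition_covered(tests) -> bool:
    """Each atomic condition takes both True and False somewhere in the test set."""
    return all({t[i] for t in tests} == {True, False} for i in range(3))


def decision_covered(tests) -> bool:
    """The decision as a whole takes both True and False."""
    return {decision(*t) for t in tests} == {True, False}


# A, B and C each take both values, yet the decision is True in both tests.
tests = [(True, True, False), (False, False, True)]
print(simple_condition_covered(tests), decision_covered(tests))
```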
Other Condition Coverage Types
Simple condition coverage: requires that the tests cover both true and false for each condition (predicate) of the logical expression.
Condition/decision coverage: requires in addition that the coverage w.r.t. the decision is also complete.
Minimal multiple condition coverage: also covers all sub-expressions of the Boolean formula
Modified condition/decision coverage: requires that every atomic condition can influence the overall output (required for level A equipment by RTCA DO-178 B).
Multiple condition coverage: tests all combinations of the atomic conditions.
d) Path Coverage
Path coverage: Test cases are generated to execute all/some program paths.
Why path coverage?
logic errors and incorrect assumptions are inversely proportional to a path’s execution probability
we often believe that a path is not likely to be executed; in fact, reality is often counter intuitive
it’s likely that untested paths will contain some undetected defects
[Figure: control flow graph of procedure XYZ]
Exhaustive Testing Not Possible
There are $10^{14}$ possible paths!
If we execute one test per millisecond, it would take about 3,170 years to test this program!!
⇒ Out of the question
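The estimate can be reproduced with a short calculation. This sketch assumes a 365-day year and exactly one test per millisecond.

```python
PATHS = 10**14                            # number of possible paths
MS_PER_YEAR = 365 * 24 * 60 * 60 * 1000   # milliseconds in a 365-day year

years = PATHS / MS_PER_YEAR
print(f"{years:.1f} years")
```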
Basic Path Testing
Terminology:
execution path: a path that connects a start node and a terminal node.
Two paths are independent if they have disjoint node or edge sets
Steps:
convert the unit into a “flow graph”
compute a measure of the unit's logical complexity
use the measure to derive a “basic” set of independent execution paths for which test cases are determined
Remark: Path coverage set is not unique!
Cyclomatic Complexity
Cyclomatic complexity (McCabe complexity) is a metric, V(G), that describes the logical complexity of a flow graph, G.
V(G) = E - N + 2 where E = #edges in G and N = #nodes in G
Studies have shown:
V(G) is directly related to the number of errors in source code
V(G) = 10 is a practical upper limit for testing
[SWENE]
[Figure: number of modules plotted against V(G); modules in the upper range of V(G) are more error prone]
Path Coverage Testing
[Figure: flow graph with nodes 1 to 8]
Next, we derive the independent paths:
Since V(G) = 4, we have 4 paths
Path 1: 1,2,3,6,7,8
Path 2: 1,2,3,5,7,8
Path 3: 1,2,4,7,8
Path 4: 1,2,4,7,2,4…7,8
Finally, we derive test cases to exercise these paths, i.e. choose inputs that lead to traversing the paths
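The value V(G) = 4 can be checked with the formula V(G) = E - N + 2. The flow graph itself is not reproduced here, so the edge set in this sketch is an assumption inferred from the four listed paths, and path 4 is instantiated with a single extra pass through the loop.

```python
# Edges inferred from the four listed paths (an assumption, not read off the figure).
EDGES = {(1, 2), (2, 3), (2, 4), (3, 5), (3, 6),
         (4, 7), (5, 7), (6, 7), (7, 2), (7, 8)}


def cyclomatic_complexity(edges) -> int:
    """V(G) = E - N + 2 for a flow graph given as a set of directed edges."""
    nodes = {n for edge in edges for n in edge}
    return len(edges) - len(nodes) + 2


def follows_edges(path, edges) -> bool:
    """True if every consecutive pair of nodes in the path is an edge of the graph."""
    return all((u, v) in edges for u, v in zip(path, path[1:]))


paths = [
    [1, 2, 3, 6, 7, 8],
    [1, 2, 3, 5, 7, 8],
    [1, 2, 4, 7, 8],
    [1, 2, 4, 7, 2, 4, 7, 8],  # one repetition of the loop 2-4-7
]
print(cyclomatic_complexity(EDGES), all(follows_edges(p, EDGES) for p in paths))
```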
Testing Loops
[Figure: four loop structures: simple loop, nested loops, concatenated loops, unstructured loops]
Loop Testing: Simple Loops
Minimum conditions - simple loops
1. skip the loop entirely
2. only one pass through the loop
(boundary)
3. two passes through the loop
(interior)
4. m passes through the loop, where m < n
5. (n-1), n, and (n+1) passes through the loop
(n is the maximum number of allowable passes through the loop)
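The minimum conditions translate into a list of pass counts to try. This is a small sketch that takes n as the maximum number of passes; the function name and the choice of the typical value m are hypothetical.

```python
def simple_loop_pass_counts(n: int, m: int) -> list:
    """Pass counts to try for a simple loop with at most n passes, with 2 < m < n - 1."""
    if not 2 < m < n - 1:
        raise ValueError("m must lie strictly between 2 and n - 1")
    # skip, one pass, two passes, a typical count, and the values around the maximum
    return [0, 1, 2, m, n - 1, n, n + 1]


print(simple_loop_pass_counts(10, 5))
```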
Testing Nested Loops
Just extending simple loop testing:
⇒ number of tests explodes
Reduce the number of tests:
start at the innermost loop; set all other
loops to minimum values
conduct simple loop test; add out of range
or excluded values
work outwards while keeping inner nested
loops to typical values
Comparison: Control Flow Testing
a) Statement coverage: The test cases are generated so that all the program statements are executed at least once.
b) Decision coverage (branch coverage): The test cases are generated so that the program decisions take the value true or false.
c) Condition coverage: The test cases are generated so that all the program conditions (predicates) that form the logical expression of the decision take the value true or false.
d) Path coverage: Test cases are generated to execute all/some program paths.
[Figure: hierarchy relating a) statement coverage; b) decision coverage; c) simple condition coverage, condition/decision coverage, minimal multiple condition coverage, modified condition/decision coverage, multiple condition coverage; d) basic path coverage, path coverage]
(2) Data Flow Testing
Basic Definitions:
V = the set of variables
N = the set of nodes
E = the set of edges
Definition/write on variables (nodes):
def(i) = { x in V | x has a global def in block i }
Computational-use of variables (nodes):
c-use(i) = { x in V | x has a global c-use in block i }
Predicate-use of variables (edges):
p-use(i,j) = { x in V | x has a p-use in edge (i,j) }
[Figure: flow graph with nodes 1 to 10 for the statements read(x,y); i:=1; writeln(“hello”); i:=i+1; writeln(x), with branch conditions i>2 / i<=2 and y>=0 / y<0, annotated as follows]
def(2) = {i,x,y}
c-use(2) = {}
p-use(3,5) = {i}
p-use(3,4) = {i}
def(4) = {i}
c-use(4) = {i}
p-use(6,8) = {y}
p-use(6,7) = {y}
c-use(8) = {x}
Idea of Data Flow Testing
Idea: coverage of a def-clear path between definition and usage
Examples:
One path per def (all-defs)
One path per def to all c-uses (all-c-uses)
One path per def to all p-uses (all-p-uses)
One path per def to all p-uses or c-uses (all-uses)
All paths (without loops) from defs to all p-uses or c-uses (all du-paths)
Data Flow Testing
All nodes with an effective path from a def in node i to a c-use for variable x:
dcu(x, i) = { j in N | x in c-use(j) and there is a def-clear path wrt x from i to j }
All edges with an effective path from a def in node i to a p-use for variable x:
dpu(x, i) = { (j,k) in E | x in p-use(j,k) and there is a def-clear path wrt x from i to (j,k) }
All effective paths from a def to a use without loops:
du-c(x,i,j) = { i,…,k,j in N* | x in def(i), x in c-use(j), the path is def-clear wrt x, and no node is visited twice }
du-p(x,i,j) = { i,…,k,j in N* | x in def(i), x in p-use(k,j), the path i,…,k is def-clear wrt x, and no node of i,…,k is visited twice }
du(x,i,j) = du-c(x,i,j) ∪ du-p(x,i,j)
[Figure: flow graph with nodes 1 to 10 for the statements read(x,y); i:=1; writeln(“hello”); i:=i+1; writeln(x), with branch conditions i>2 / i<=2 and y>=0 / y<0, annotated as follows]
def(2) = {i,x,y}
c-use(2) = {}
p-use(3,5) = {i}
p-use(3,4) = {i}
def(4) = {i}
c-use(4) = {i}
p-use(6,8) = {y}
p-use(6,7) = {y}
c-use(8) = {x}
Types of Data Flow Testing
All-defs: Test cases are generated to cover each definition of each variable for at least one use of the variable.
All c-uses: Test cases are generated so that there is at least one path of each variable definition to each c-use of the variable.
All p-uses: Test cases are generated so that there is at least one path of each variable definition to each p-use of the variable.
Comparison: Data Flow Testing
If time is an issue:
All p-uses should be used instead of all-uses
All-uses should be used instead of all-du-paths
to generate fewer tests: use the test cases covered by the (stronger) criteria.
All-du-paths is usable in practice.
[Juriso+2004]
[Figure: hierarchy of the criteria All-Paths, All-DU-Paths, All-Uses, All-C-Uses/Some-P-Uses, All-P-Uses/Some-C-Uses, All-C-Uses, All-DEFS, All-P-Uses, decision coverage, statement coverage]
(3) Mutation Testing
Strong (standard) mutation: Test cases are generated to cover all the mutants generated by applying all the mutation operators defined for the programming language in question.
Selective (or constrained) mutation: Test cases are generated to cover all of the mutants generated by applying some of the mutation operators defined for the programming language in question.
Weak mutation: Test cases are generated to cover a given percentage of mutants generated by applying all the mutation operators defined for the programming language in question.
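What it means for test cases to cover (kill) a mutant can be shown on a tiny unit. In this sketch the unit `max_of` and its three hand-written mutants are hypothetical; a real mutation tool would generate mutants automatically by applying mutation operators to the source code.

```python
def max_of(a: int, b: int) -> int:
    """Hypothetical unit under test."""
    return a if a > b else b


# Hand-written mutants, each differing from the original by one small change.
MUTANTS = {
    "relational: > becomes <": lambda a, b: a if a < b else b,
    "relational: > becomes >=": lambda a, b: a if a >= b else b,
    "operand: else branch returns a": lambda a, b: a if a > b else a,
}


def killed_mutants(tests) -> set:
    """A mutant is killed when some test makes its output differ from the original's."""
    return {
        name
        for name, mutant in MUTANTS.items()
        if any(mutant(a, b) != max_of(a, b) for a, b in tests)
    }


tests = [(1, 2), (2, 1), (3, 3)]
killed = killed_mutants(tests)
print(f"killed {len(killed)} of {len(MUTANTS)} mutants")
```

The `>=` mutant computes the same result as the original for every input, so no test can kill it; such equivalent mutants have to be excluded when judging a test set.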
III.3.4 Comparison
Black box tests: tester has no access to information about the system implementation
Characteristics:
Good for independence of tester
But not good for formative tests
Hard to test individual modules...
White box tests: tester can access information about the system implementation
Characteristics:
Simplifies diagnosis of results
Can compromise independence
How much do they need to know?
III.4 Testing Process & Activities
Beforehand:
Requirement analysis
design analysis
Testing Process & Activities
(1) Unit test
(2) Integration test
(3) Function test
(4) Performance test
(5) Acceptance test
(6) Installation test
System test
Testing Activities
[Figure: testing activities [Pfleeger 2001]: unit code → unit tests → integration test (using the design specifications) → integrated modules → function test (using the system functional requirements) → functioning system → performance test (using other software requirements) → verified, validated software → acceptance test (using the customer requirements specification) → accepted system → installation test (in the user environment) → SYSTEM IN USE!]
[SWENE]
III.4.1 Unit Testing
Individual components are tested independently to ensure their quality.
The focus is to uncover errors in design and implementation, including
data structure in a component
program logic and program structure in a component
component interface
functions and operations of a component
There is some debate about what constitutes a “unit”. Some common
definitions:
the smallest chunk that can be compiled by itself
a stand-alone procedure or function
something so small that it would be developed by a single person
Unit testers:
Example: Test GCD Algorithm
Testing a unit designed to compute the “greatest common divisor” (GCD) of a pair of integers (not both zero).
GCD(a,b) = c where
c is a positive integer
c is a common divisor of a and b (i.e., c divides a and c divides b)
c is greater than all other common divisors of a and b.
For example
GCD(45, 27) = 9
GCD (7,13) = 1
GCD(-12, 15) = 3
GCD(13, 0) = 13
GCD(0, 0) undefined
[SWENE]
Test Planning
How do we proceed to determine the test cases?
1. Determine appropriate equivalence classes for the input data.
2. Determine the boundaries of the equivalence classes.
3. Design an algorithm for the GCD function.
4. Analyze the algorithm using basic path analysis.
5. Then, choose test cases that include the basic path set, data from each equivalence class, and data at and near the boundaries.
GCD Algorithm
note: Based on Euclid’s algorithm
 1. function gcd (int a, int b) {
 2.   int temp, value;
 3.   a := abs(a);
 4.   b := abs(b);
 5.   if (a = 0) then
 6.     value := b; // b is the GCD
 7.   else if (b = 0) then
 8.     raise exception;
 9.   else
10.     loop
11.       temp := b;
12.       b := a mod b;
13.       a := temp;
14.     until (b = 0)
15.     value := a;
16.   end if;
17.   return value;
18. end gcd
[Figure: flow graph of gcd with nodes 1, 5, 6, 7, 9, 10, 17, 18]
[SWENE]
Example: GCD Test Planning
Equivalence Classes
Although the GCD algorithm should accept any integers as input, one could consider 0, positive integers and negative integers as “special” values. This yields the following classes:
a < 0 and b < 0, a < 0 and b > 0, a > 0 and b < 0
a > 0 and b > 0, a = 0 and b < 0, a = 0 and b > 0
a < 0 and b = 0, a > 0 and b = 0, a = 0 and b = 0
Boundary Values
a = $-2^{31}$, -1, 0, 1, $2^{31}-1$ and b = $-2^{31}$, -1, 0, 1, $2^{31}-1$
Basic Path Set
V(G) = 4
(1,5,6,17,18), (1,5,7,18), (1,5,7,9,10,17,18), (1,5,7,9,10,9,10,17,18)
Example GCD Test Plan
Test Description / Data                        Expected Results     Test Experience / Actual Results
Basic Path Set
path (1,5,6,17,18) → (0, 15)                   15
path (1,5,7,18) → (15, 0)                      15
path (1,5,7,9,10,17,18) → (30, 15)             15
path (1,5,7,9,10,9,10,17,18) → (15, 30)        15
Equivalence Classes
a < 0 and b < 0 → (-27, -45)                   9
a < 0 and b > 0 → (-72, 100)                   4
a > 0 and b < 0 → (121, -45)                   1
a > 0 and b > 0 → (420, 252)                   84
a = 0 and b < 0 → (0, -45)                     45
a = 0 and b > 0 → (0, 45)                      45
a < 0 and b = 0 → (-27, 0)                     27
a > 0 and b = 0 → (27, 0)                      27
a = 0 and b = 0 → (0, 0)                       exception raised
Boundary Points
(1, 0)                                         1
(-1, 0)                                        1
(0, 1)                                         1
(0, -1)                                        1
(0, 0) (redundant)                             exception raised
(1, 1)                                         1
(1, -1)                                        1
(-1, 1)                                        1
(-1, -1)                                       1
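A test plan of this shape maps naturally onto a table-driven check that fills in the "Actual Results" column. The following is a sketch under stated assumptions: gcd_under_test is a hypothetical stand-in for the unit (a thin wrapper around Python's math.gcd that raises ValueError for (0, 0)), and the path labels are carried over only as descriptions, since the stand-in does not share the pseudocode's control flow.

```python
import math

EXCEPTION = "exception raised"

def gcd_under_test(a, b):
    """Hypothetical unit under test: math.gcd plus an error for (0, 0)."""
    if a == 0 and b == 0:
        raise ValueError("gcd(0, 0) is undefined")
    return math.gcd(a, b)

# (description, input data, expected result)
TEST_PLAN = [
    ("path (1,5,6,17,18)", (0, 15), 15),
    ("path (1,5,7,18)", (15, 0), 15),
    ("path (1,5,7,9,10,17,18)", (30, 15), 15),
    ("path (1,5,7,9,10,9,10,17,18)", (15, 30), 15),
    ("a < 0 and b < 0", (-27, -45), 9),
    ("a < 0 and b > 0", (-72, 100), 4),
    ("a > 0 and b < 0", (121, -45), 1),
    ("a > 0 and b > 0", (420, 252), 84),   # 420 = 84 * 5, 252 = 84 * 3
    ("a = 0 and b < 0", (0, -45), 45),
    ("a = 0 and b > 0", (0, 45), 45),
    ("a < 0 and b = 0", (-27, 0), 27),
    ("a > 0 and b = 0", (27, 0), 27),
    ("a = 0 and b = 0", (0, 0), EXCEPTION),
    ("boundary point", (1, 0), 1),
    ("boundary point", (-1, -1), 1),
]

def run_plan(unit, plan):
    """Apply every test case and record expected versus actual results."""
    rows = []
    for description, (a, b), expected in plan:
        try:
            actual = unit(a, b)
        except ValueError:
            actual = EXCEPTION
        rows.append({
            "description": description,
            "data": (a, b),
            "expected": expected,
            "actual": actual,
            "passed": actual == expected,
        })
    return rows

for row in run_plan(gcd_under_test, TEST_PLAN):
    print(row["description"], row["data"], row["expected"], row["actual"], row["passed"])
```

Keeping the plan as data means new equivalence classes or boundary points can be added without touching the driver logic.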
Test Implementation
Once one has determined the testing strategy and the units to be tested, and has completed the unit test plans, the next concern is how to carry out the tests.
If you are testing a single, simple unit that does not interact with other units (like the GCD unit), then you can write a program that runs the test cases in the test plan.
However, if you are testing a unit that must interact with other units, then it can be difficult to test it in isolation.
The next slide defines some terms that are used in implementing and running a test plan.
Test Implementation Terms
Test Driver
A class or utility program that applies test cases to a component being tested.
Test Stub
A temporary, minimal implementation of a component to increase controllability and observability in testing.
When testing a unit that references another unit, the referenced unit must either be complete (and tested) or stubs must be created that can be used when executing a test case that references the other unit.
Test Harness
A system of test drivers, stubs, and other tools to support test execution.
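The three terms can be illustrated with a small sketch. Everything named here is hypothetical: shipping_cost plays the unit under test, RateServiceStub stands in for a referenced unit that is not yet complete, and run_driver is the test driver. Together they form a very small harness.

```python
class RateServiceStub:
    """Test stub: temporary, minimal stand-in for an unfinished rate service."""

    def __init__(self, canned_rate):
        self.canned_rate = canned_rate
        self.calls = []                  # observability: record each request

    def rate_per_kg(self, region):
        self.calls.append(region)
        return self.canned_rate          # controllability: the test fixes the answer


def shipping_cost(weight_kg, region, rate_service):
    """Unit under test: relies on a rate service it does not implement."""
    if weight_kg <= 0:
        raise ValueError("weight must be positive")
    return weight_kg * rate_service.rate_per_kg(region)


def run_driver():
    """Test driver: applies test cases to shipping_cost and collects results."""
    cases = [
        # (weight_kg, region, canned rate, expected cost)
        (2, "EU", 3, 6),
        (10, "US", 1, 10),
    ]
    results = []
    for weight, region, rate, expected in cases:
        stub = RateServiceStub(rate)
        actual = shipping_cost(weight, region, stub)
        # A case passes if the cost is right and the stub was asked once
        # for the right region.
        passed = actual == expected and stub.calls == [region]
        results.append((weight, region, expected, actual, passed))
    return results


for result in run_driver():
    print(result)
```

Because the stub records its calls, the driver can check not only the returned cost but also how the unit used the referenced component.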
Test Implementation Steps
Here is a suggested sequence of steps for testing a unit:
1. Once the design for the unit is complete:
   Complete a test plan for the unit.
   Create stubs for referenced units that are not yet completed.
   Create a driver (or set of drivers) for the unit.
   Construct the test case data (from the test plan).
2. Once the implementation is complete, use the driver (or set of drivers) to test the unit:
   Execute the unit, using the test case data.
   Provide for the results of the test case execution to be printed or logged, as appropriate.
Testing Tools
A number of tools have been developed to support the testing of a unit or system.
A web search for “Software Testing Tools” will yield thousands of results.
JUnit (http://www.junit.org/index.htm) is a popular tool/technique that can be integrated into the development process for a unit coded in Java.
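JUnit itself is a Java framework. To stay in one language with the other sketches in these notes, the example below uses Python's standard unittest module, which follows the same xUnit pattern (test case classes, assertion methods, a test runner). The gcd function is a hypothetical unit under test, and the cases are the boundary points from the GCD test plan.

```python
import math
import unittest


def gcd(a, b):
    """Hypothetical unit under test: math.gcd plus an error for (0, 0)."""
    if a == 0 and b == 0:
        raise ValueError("gcd(0, 0) is undefined")
    return math.gcd(a, b)


class GcdBoundaryTest(unittest.TestCase):
    """Boundary points around zero, taken from the GCD test plan."""

    def test_boundary_points_yield_one(self):
        points = [(1, 0), (-1, 0), (0, 1), (0, -1),
                  (1, 1), (1, -1), (-1, 1), (-1, -1)]
        for a, b in points:
            with self.subTest(a=a, b=b):   # report each point separately
                self.assertEqual(gcd(a, b), 1)

    def test_zero_zero_raises(self):
        with self.assertRaises(ValueError):
            gcd(0, 0)


def run_suite():
    """Load and run the test case class; returns a unittest.TestResult."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(GcdBoundaryTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Calling run_suite() executes both tests and returns a result object whose wasSuccessful() method tells whether all of them passed.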
III.4.2 Integration Testing
A group of dependent components is tested together to ensure the quality of their integration.
The focus is to uncover errors in:
Design and construction of the software architecture
Integrated functions or operations at sub-system level
Interfaces and interactions between them
Resource integration and/or environment integration
Integration testers: developers and/or test engineers.
Tests complete systems or subsystems composed of integrated components.
Integration testing should be black-box testing, with tests derived from the specification.
The main difficulty is localising errors.
Integration Testing Strategies
Incremental testing strategies:
Bottom-up testing: integrate individual components in levels until the complete system is created.
Top-down testing: start with the high-level system and integrate from the top down, replacing individual components by stubs where appropriate.
Outside-in integration: do bottom-up and top-down testing at the same time, such that the final integration step occurs in a middle layer.
Non-incremental testing strategies:
Big bang testing: put everything together and test it.
Remark: in practice, most integration involves a combination of these strategies.
Big Bang Integration Testing
Combine (or integrate) all parts at once.
Advantages:
simple
Disadvantages:
hard to debug, not easy to isolate errors
not easy to validate test results
Bottom Up Integration Testing
Idea:
Modules at the lowest levels are integrated first; integration then moves upward through the control structure.
Integration process (four steps):
1. Low-level modules are combined into clusters that perform a specific software sub-function.
2. A driver is written to coordinate test case input and output.
3. The cluster is tested.
4. Drivers are removed and clusters are combined, moving upward in the program structure.
Pros and cons of bottom-up integration:
no stub cost
test drivers are needed
[Figure: bottom-up integration of modules M1 to M11 in four stages (Stage 1 to Stage 4), showing Integrations A, B, and C]
Top Down Integration Testing
Idea:
Modules are integrated by moving downward through the control structure.
Modules subordinate to the main control module are incorporated into the system in either a depth-first or breadth-first manner.
Integration process (five steps):
1. The main control module is used as a test driver, and stubs are substituted for all modules directly subordinate to the main control module.
2. Subordinate stubs are replaced one at a time with actual modules.
3. Tests are conducted as each module is integrated.
4. On completion of each set of tests, another stub is replaced with the real module.
5. Regression testing may be conducted.
Pros and cons of top-down integration:
stub construction cost
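The depth-first and breadth-first incorporation orders, and the stubs each top-down step requires, can be sketched as follows. The module hierarchy is hypothetical (it is not the M1 to M11 structure of the slides), and the function names are illustrative.

```python
from collections import deque

# Hypothetical control hierarchy: each module maps to its subordinates.
HIERARCHY = {
    "Main": ["A", "B"],
    "A": ["C", "D"],
    "B": ["E"],
    "C": [], "D": [], "E": [],
}

def breadth_first_order(tree, root):
    """Integrate level by level, starting from the main control module."""
    order, queue = [], deque([root])
    while queue:
        module = queue.popleft()
        order.append(module)
        queue.extend(tree[module])
    return order

def depth_first_order(tree, root):
    """Integrate one complete control path before moving to the next."""
    order = [root]
    for child in tree[root]:
        order.extend(depth_first_order(tree, child))
    return order

def top_down_steps(tree, order):
    """For each step, list the stubs still needed after integrating a module."""
    integrated, steps = set(), []
    for module in order:
        integrated.add(module)
        stubs = sorted(
            child
            for parent in integrated
            for child in tree[parent]
            if child not in integrated
        )
        steps.append((module, stubs))
    return steps

for module, stubs in top_down_steps(HIERARCHY, breadth_first_order(HIERARCHY, "Main")):
    print(module, stubs)
```

Reversing the breadth-first order gives one possible bottom-up order, in which every module is integrated after its subordinates and drivers take the place of stubs.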
[Figure: top-down integration of modules M1 to M11 in six stages (Stage 1 to Stage 6), showing Integrations A, B, C, and D]
[Figure: Top-down testing of module M8. M8 is the module on test; M9 is a module tested in an earlier stage; M1 and M2 are replaced by stubs.]
[Figure: Bottom-up testing of module M8. M8 is the module on test; M9 is replaced by a driver; M1 and M2 are modules tested in an earlier stage.]
Regression Testing & Integration
A module firewall: the closure of all possibly affected modules and related integration links in a program, based on a control-flow graph.
⇒ Software regression testing is reduced to a smaller scope (the firewall).
This implies:
re-test of the changed module and its affected modules
re-integration of all related integration links
[Figure: module hierarchy with Main and modules M1 to M8, with integration links numbered 1 to 4]
Example (changed module: M5):
Module firewall: M5, M1, and Main
Re-testing at the unit level: M5
Re-integration steps: 2, 3, 4
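One way to read the example is that the firewall contains the changed module plus every module that calls it directly or transitively. The sketch below computes that closure. It rests on assumptions: the call graph is hypothetical (the figure's links are not reproduced, only the fact that Main calls M1 and M1 calls M5 is taken from the example), and the link numbering 1 to 4 is not modelled.

```python
# Hypothetical call graph: caller -> list of called modules.
CALLS = {
    "Main": ["M1", "M2", "M3"],
    "M1": ["M4", "M5"],
    "M2": ["M6"],
    "M3": ["M7"],
    "M5": ["M8"],
}

def module_firewall(calls, changed):
    """Changed module plus all of its direct and transitive callers."""
    callers = {}
    for caller, callees in calls.items():
        for callee in callees:
            callers.setdefault(callee, set()).add(caller)
    firewall, todo = {changed}, [changed]
    while todo:
        module = todo.pop()
        for caller in callers.get(module, ()):
            if caller not in firewall:
                firewall.add(caller)
                todo.append(caller)
    return firewall

def firewall_links(calls, firewall):
    """Integration links whose two ends both lie inside the firewall."""
    return sorted(
        (caller, callee)
        for caller, callees in calls.items()
        for callee in callees
        if caller in firewall and callee in firewall
    )

FIREWALL = module_firewall(CALLS, "M5")
print(sorted(FIREWALL), firewall_links(CALLS, FIREWALL))
```

Under these assumptions, a change to M5 limits regression testing to M5, M1, and Main, and modules such as M2 or M7 stay outside the firewall.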
Comparison of the Approaches
Problems of non-incremental integration:
hard to debug, hard to isolate errors
not easy to validate test results
impossible to form an integrated system
Pros and cons of top-down integration:
stub construction cost
major control function is tested early
Pros and cons of bottom-up integration:
no stub cost
test drivers are needed
no controllable system until the last step
Pros and cons of outside-in integration:
stub construction cost
major control function is tested early
test drivers are needed
Architectural validation: top-down/outside-in integration testing is better at discovering errors in the system architecture.
System demonstration: top-down/outside-in integration testing allows a limited demonstration at an early stage in the development.
Test implementation: often easier with bottom-up integration testing.
Test observation: problems with all approaches. Extra code may be required to observe tests.
III.4.3 Function Testing
The integrated software is tested against the requirements to ensure that we have the right product (validate functional requirements).
The focus is to uncover errors in:
System input/output
System functions and information data
System interfaces with external parts
User interfaces
System behavior
Function testers: test engineers or SQA people.
[Figure: the system (operations, functions, and behavior) with its user interface to the user and its external interfaces]
III.4.4 Performance Testing
Performance testing is designed to test the run-time performance of software within the context of an integrated system (validate non-functional requirements).
The focus areas are:
Confirm and validate the specified system performance requirements.
Check the current product capacity, to answer questions from customers and marketing people.
Identify performance issues and performance degradation in a given system.
Examine system behavior under special load conditions.
Performance testers: test engineers or SQA people.
Relevant attributes:
System process speed (max./min./average)
System throughput (max./min./average)
System latency (max./min./average)
System utilization (max./min./average)
System availability (component-level/system-level)
System reliability (component-level/system-level)
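As a small illustration of how the latency and throughput attributes might be collected, the sketch below times repeated calls of an operation and reports minimum, maximum, and average latency plus overall throughput. The function name measure and the sample workload are hypothetical; a real performance test would run against the integrated system under a defined load.

```python
import time

def measure(operation, repetitions):
    """Time repeated calls; return latency statistics and throughput."""
    latencies = []
    start = time.perf_counter()
    for _ in range(repetitions):
        t0 = time.perf_counter()
        operation()
        latencies.append(time.perf_counter() - t0)
    elapsed = time.perf_counter() - start
    return {
        "latency_min_s": min(latencies),
        "latency_max_s": max(latencies),
        "latency_avg_s": sum(latencies) / len(latencies),
        # Guard against a zero elapsed time on very coarse clocks.
        "throughput_ops_per_s": repetitions / elapsed if elapsed > 0 else float("inf"),
    }

def sample_workload():
    """Stand-in for one system operation (assumption for illustration)."""
    return sum(range(1000))

stats = measure(sample_workload, repetitions=200)
for name, value in stats.items():
    print(name, value)
```

The measured values vary from run to run, so performance requirements are usually stated against such statistics (for example a maximum or average latency) and not against a single observation.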