(1)

MTAT.03.159: Software Testing

Lecture 07: Tools, Metrics and Test Process Improvement / TMMi

(Textbook Ch. 14, 9, 16)

Dietmar Pfahl

email: [email protected]

(2)

MTAT.03.159 / Lecture 07 / © Dietmar Pfahl 2013

Structure of Lecture 07

• Test Tools

• Test Measurement

• Test Process Improvement

• SWT Exam

(3)

Tools – the Workbench

• Good at repeating tasks

• Good at organising data

• Requires training

• Introduced incrementally

• No “silver bullet”

Evaluation criteria

• Ease of use

• Power

• Robustness

• Functionality

• Ease of insertion

• Quality of support

• Cost

(4)

Test Tools – in the Process

[Figure: test tool types mapped onto the development and test phases]

Tool types: test management tools, test design tools, static analysis tools, coverage tools, debugging tools, dynamic analysis tools, test execution and comparison tools, performance simulator tools

Phases: requirement specification, architectural design, detailed design, code, unit test, integration test, system test, acceptance test

(5)

Test Tools – by Test Maturity

TMM level 1: Debuggers, Configuration builders, LOC counters

TMM level 2: Test/project planners, Run-time error checkers, Test preparation tools, Coverage analyzers, Cross-reference tools

TMM level 3: Configuration management, Requirements tools, Capture-replay tools, Comparator, Defect tracker, Complexity measurer, Load generators

TMM level 4: Code checkers, Code comprehension tools, Test harness generators, Perform./network analyzers, Simulators/emulators, Test management tools

TMM level 5: Test library tools, Advanced…

(6)

There is no shortage of Test Tools

• Defect Tracking (98)

• GUI Test Drivers (71)

• Load and Performance (52)

• Static Analysis (38)

• Test Coverage (22)

• Test Design Tools (24)

• Test Drivers (17)

• Test Implementation (35)

– Assist with testing at runtime: memory leak checkers, comparators, and a wide variety of others

• Test Case Management (24)

• Unit Test Tools (63)

• 3 different categories of others

Test data generator – Generatedata.com

online service for generating data

(8)

(9)

Evolution of System Testing approaches

1. Recorded Scripts

2. Engineered Scripts

3. Data-driven Testing

4. Keyword-driven Testing

5. Model-based Testing

First    Last      Data

Pekka    Pukaro    1244515

Teemu    Tekno     587245

(10)

Recorded Scripts

• Unstructured

• Scripts generated using capture and replay tools

• Relatively quick to set up

• Mostly used for regression testing

• Scripts non-maintainable, in practice

– If the system changes they need to be captured again

• Capture Replay Tools

– Record user’s actions to a script (keyboard, mouse)

• Tool specific scripting language

– Scripts access the (user) interface of the software

• Input fields, buttons and other widgets

– Simple checks can be created in the scripts

• Existence of texts and objects in the UI

(11)

Engineered Scripts

• Scripts are well-designed (following a systematic approach), modular, robust, documented, and maintainable

• Separation of common tasks

– E.g. setup, cleanup/teardown, and defect detection

• Test data is still embedded into the scripts

– One driver script per test case

• Code is mostly written manually

• Implementation and maintenance require programming skills which testers (test engineers) might not have
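To make the idea of an engineered script concrete, here is a minimal sketch using Python's standard `unittest` module. The class `CustomerRegistry` is a hypothetical system under test invented for this illustration; it is not part of the lecture. Note how setup and teardown are separated from the test logic, while the test data ("Pekka Pukaro") is still embedded in the script.

```python
import unittest


class CustomerRegistry:
    """Hypothetical system under test: a tiny in-memory customer store."""

    def __init__(self):
        self._customers = set()

    def add(self, name):
        self._customers.add(name)

    def count(self):
        return len(self._customers)


class AddCustomerTest(unittest.TestCase):
    def setUp(self):
        # Common setup, separated from the test logic.
        self.registry = CustomerRegistry()

    def tearDown(self):
        # Common cleanup/teardown.
        self.registry = None

    def test_add_customer(self):
        # The test data is still embedded in the script.
        self.registry.add("Pekka Pukaro")
        self.assertEqual(self.registry.count(), 1)


def run_suite():
    """Load and run the single driver script; returns a TestResult."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(AddCustomerTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Calling `run_suite()` executes the test case and returns the result object, whose `wasSuccessful()` method reports the verdict.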

(12)

Data-Driven Testing

• Test inputs and expected outcomes stored as data

– Normally in a tabular format

– Test data are read from an external data source

• One driver script can execute all of the designed test cases

• External test data can be edited without programming skills

– Test design and framework implementation are now separate tasks

– The former can be given to someone with domain knowledge (business people, customers) and the latter to someone with programming skills

• Avoids the problems of embedded test data

– Data are hard to understand in the middle of all scripting details

– Updating tests or creating similar tests with slightly different test data always requires programming

– This leads to copy-paste scripting
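The separation of test data from the driver script can be sketched as follows. This is an illustration only: the function `format_name`, the column names, and the expected values are assumptions made up for the example, and the data table is held in a string where a real setup would read an external file or spreadsheet.

```python
import csv
import io

# Test data kept apart from the driver. The columns are hypothetical.
TEST_DATA = """first,last,expected
Pekka,Pukaro,Pukaro Pekka
Teemu,Tekno,Tekno Teemu
"""


def format_name(first, last):
    """Hypothetical function under test."""
    return f"{last} {first}"


def run_data_driven(source):
    """One driver executes every row; returns (passed, failed) counts."""
    passed = failed = 0
    for row in csv.DictReader(source):
        actual = format_name(row["first"], row["last"])
        if actual == row["expected"]:
            passed += 1
        else:
            failed += 1
    return passed, failed
```

Adding a test case now means adding a row to the table, e.g. `run_data_driven(io.StringIO(TEST_DATA))`, with no change to the driver.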

(13)

(14)

Keyword-Driven Testing

• Keywords also known as action words

• Keyword-driven testing improves data-driven testing:

– Keywords abstract the navigation and actions from the script

– Keywords and test data are read from an external data source

• When test cases are executed keywords are interpreted by a test library which is called by a test automation framework

• The test library = the test scripts

• Example:

Login: admin, t5t56y;

AddCustomers: newCustomers.txt

RemoveCustomer: Pekka Pukaro

• More keywords (= action words) can be defined based on existing keywords

• Keyword-driven testing ~= domain-specific languages (DSL)

• Details: http://doc.froglogic.com/squish/4.1/all/how.to.do.keyword.driven.testing.html
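The interplay of keywords, test library, and framework can be sketched as below. This is a simplified illustration, not the Squish or Robot Framework API: `CustomerApp`, the `KEYWORDS` table, and `run_keywords` are hypothetical names, and the `AddCustomer` keyword takes a single name rather than a file as in the slide's example.

```python
class CustomerApp:
    """Hypothetical system under test."""

    def __init__(self):
        self.user = None
        self.customers = []

    def login(self, user, password):
        self.user = user

    def add_customer(self, name):
        self.customers.append(name)

    def remove_customer(self, name):
        self.customers.remove(name)


# Test library: maps each keyword to the script code that implements it.
KEYWORDS = {
    "Login": lambda app, user, password: app.login(user, password),
    "AddCustomer": lambda app, name: app.add_customer(name),
    "RemoveCustomer": lambda app, name: app.remove_customer(name),
}


def run_keywords(app, lines):
    """Framework: parse 'Keyword: arg1, arg2' lines and dispatch them."""
    for line in lines:
        keyword, _, arg_text = line.partition(":")
        args = [a.strip() for a in arg_text.split(",") if a.strip()]
        KEYWORDS[keyword.strip()](app, *args)
    return app


TEST_CASE = [
    "Login: admin, t5t56y",
    "AddCustomer: Pekka Pukaro",
    "AddCustomer: Teemu Tekno",
    "RemoveCustomer: Pekka Pukaro",
]
```

Running `run_keywords(CustomerApp(), TEST_CASE)` interprets the keyword lines one by one; the test case itself contains no scripting details.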

(15)

Architecture of a Keyword-Driven Framework

Pekka Laukkanen. Data-Driven and Keyword-Driven Test Automation Frameworks. Master’s Thesis. Helsinki University of Technology. 2006.

(16)

(17)

Model-based Testing

• System under test is modelled

– UML-state machines, domain specific languages (DSL)

• Test cases are automatically generated from the model

– The model can also provide the expected results for the generated test cases

– A more accurate model -> better test cases

• Generate a large number of tests that cover the model

– Many different criteria for covering the model

– Execution time of test cases might be a factor

• Challenges:

– Personnel competencies

– Data-intensive systems (cannot be modelled as a state-machine)
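As a small illustration of generating tests from a model, the sketch below derives one test per transition of a state machine (an "all transitions" coverage criterion, which is only one of many possible criteria). The model, its state names, and its actions are hypothetical and chosen for this example only.

```python
from collections import deque

# Hypothetical model: state -> {action: next_state}.
MODEL = {
    "LoggedOut": {"login": "LoggedIn"},
    "LoggedIn": {"add_customer": "LoggedIn", "logout": "LoggedOut"},
}


def shortest_path(model, start, target):
    """Breadth-first search: list of actions leading from start to target."""
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        state, path = queue.popleft()
        if state == target:
            return path
        for action, nxt in model[state].items():
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, path + [action]))
    return None


def generate_tests(model, start):
    """One test per transition: reach its source state, then fire it.

    Each test is (action sequence, expected final state), so the model
    also supplies the expected result.
    """
    tests = []
    for state, transitions in model.items():
        prefix = shortest_path(model, start, state)
        if prefix is None:
            continue  # state not reachable from the start state
        for action, nxt in transitions.items():
            tests.append((prefix + [action], nxt))
    return tests
```

For this model, `generate_tests(MODEL, "LoggedOut")` yields three action sequences, one for each transition.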

(18)

Evolution of System Testing approaches

1. Recorded Scripts

– Cheap to set up, quick & dirty

2. Engineered Scripts

– Structured

3. Data-driven Testing

– Data separation

4. Keyword-driven Testing

– Action separation, DSL

5. Model-based Testing

– Modeling & automatic test case generation

(19)

Automation and Oracles

• Automated testing depends on the ability to detect automatically (via a program) when the software fails

• An automated test is not equivalent to a similar manual test

– Automatic comparison is typically more precise

– Automatic comparison will be tripped by irrelevant discrepancies

– The skilled human comparison will sample a wider range of dimensions, noting oddities that one wouldn't program the computer to detect

• “Our ability to automate testing is fundamentally constrained by our ability to create and use oracles” (Cem Kaner)

(20)

Types of outcome to compare

• Screen-based

– Character-based applications

– GUI applications

• Correct message, display attributes, displayed correctly

• GUI components and their attributes

– Graphical images

• Avoid bitmap comparisons

• Disk-based

– Comparing text files

– Comparing non-textual forms of data

– Comparing databases and binary files

• Others

– Multimedia applications

• Sounds, video clips, animated pictures

– Communicating applications

• Simple vs. complex comparison

(21)

Test case sensitivity in comparisons

[Figure: robust tests vs. sensitive tests, compared on susceptibility to change, implementation effort, missed defects, failure analysis effort, and storage space. Redrawn from Fewster and Graham, Software Test Automation, 1999.]

A sensitive test case compares many elements and is likely to notice when something breaks. However, it is also more susceptible to change and causes rework in test automation.

A robust test checks less and is more change-resilient, but also misses potential defects. Striking a balance is the challenge.
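The trade-off between sensitive and robust comparison can be sketched with two comparison functions. The record fields and values below are made up for the illustration.

```python
# Hypothetical expected outcome of a test run.
EXPECTED = {"status": "OK", "total": 42, "timestamp": "10:15", "font": "Arial"}


def sensitive_compare(actual, expected):
    """Compare every field: notices more, but fails on any change."""
    return actual == expected


def robust_compare(actual, expected, fields=("status", "total")):
    """Compare only selected fields: tolerates change, may miss defects."""
    return all(actual.get(f) == expected.get(f) for f in fields)


# An irrelevant discrepancy: only the timestamp differs.
changed_time = dict(EXPECTED, timestamp="10:16")
# A real defect in a field the robust test does not check.
wrong_font = dict(EXPECTED, font="Courier")
```

With these inputs, the sensitive comparison fails for both records (one false alarm, one real defect found), while the robust comparison passes for both (no false alarm, but the font defect is missed).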

(22)

Effect of automation on “goodness” of a test case

[Figure: a manual test, the first run of an automated test, and an automated test after many runs, compared on four attributes: effective, exemplary, evolvable, economic. Redrawn from Fewster and Graham, Software Test Automation, 1999.]

The evolvability (maintainability) of an automated test does not change with the number of runs, but the test becomes more economic, which increases the overall “goodness” of the test case.

(23)

Scope: Automating different steps

Automated tests

Select/Identify test cases to run

Set up test environment:

- create test environment

- load test data

Repeat for each test case:

- set up test prerequisites

- execute

- compare results

- log results

- analyze test failures

- report defect(s)

- clear up after test case

Clear up test environment:

- delete unwanted data

- save important data

Summarize results

Automated testing

Select/Identify test cases to run

Set up test environment:

- create test environment

- load test data

Repeat for each test case:

- set up test prerequisites

- execute

- compare results

- log results

- clear up after test case

Clear up test environment:

- delete unwanted data

- save important data

Summarize results

Analyze test failures

Report defects

Legend: each step is marked as a manual process or an automated process
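The per-test-case loop in the "automated testing" column can be sketched as a small runner. This is a simplified illustration under the assumption that each test case is given as a tuple of callables; the tuple layout and the function name `run_tests` are invented for the example. Failure analysis and defect reporting are left out because they happen after the run.

```python
def run_tests(test_cases):
    """Run (name, setup, execute, expected, cleanup) tuples.

    Returns the result log and a summary of verdict counts.
    """
    log = []
    for name, setup, execute, expected, cleanup in test_cases:
        context = setup()                      # set up test prerequisites
        try:
            actual = execute(context)          # execute
            verdict = "pass" if actual == expected else "fail"  # compare
        except Exception as exc:
            actual, verdict = repr(exc), "error"
        finally:
            cleanup(context)                   # clear up after test case
        log.append((name, verdict, actual))    # log results
    verdicts = [entry[1] for entry in log]
    summary = {v: verdicts.count(v) for v in ("pass", "fail", "error")}
    return log, summary                        # summarize results


def no_cleanup(context):
    """Placeholder cleanup for the example cases."""


CASES = [
    ("sum", lambda: [1, 2, 3], sum, 6, no_cleanup),
    ("len", lambda: [1, 2, 3], len, 4, no_cleanup),
    ("boom", lambda: None, lambda ctx: 1 / 0, 0, no_cleanup),
]
```

The log produced by `run_tests(CASES)` is the input for the remaining steps, analyzing test failures and reporting defects.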

(24)

Relationship of testing activities

[Figure: the activities edit tests (maintenance), set up, execute, analyze failures, and clear up, shown along a time axis for manual testing, the same tests automated, and more mature automation. Redrawn from Fewster and Graham, Software Test Automation, 1999.]

(25)

Test automation promises

1. Efficient regression test

2. Run tests more often

3. Perform difficult tests (e.g. load, outcome check)

4. Better use of resources

5. Consistency and repeatability

6. Reuse of tests

7. Earlier time to market

8. Increased confidence

(26)

Common problems

1. Unrealistic expectations

2. Poor testing practice

“Automatic chaos just gives faster chaos”

3. Expected effectiveness

4. False sense of security

5. Maintenance of automatic tests

6. Technical problems (e.g. Interoperability)

7. Organizational issues

(27)

What can be automated?

[Figure: the test activities 1. Identify, 2. Design, 3. Build, 4. Execute, 5. Check, placed along two scales: intellectual vs. clerical and performed once vs. repeated]

(28)

Limits of automated testing

• Does not replace manual testing

• Manual tests find more defects than automated tests

– Does not improve effectiveness

• Greater reliance on quality of tests

– Oracle problem

• Test automation may limit the software development

– Costs of maintaining automated tests

(29)

What to automate first?

• Most important tests

• A set of breadth tests (sample each system area overall)

• Test for the most important functions

• Tests that are easiest to automate

• Tests that will give the quickest payback

• Tests that are run the most often

(30)

• Test Tools

• Test Measurement

• Test Process Improvement

• SWT Exam

(31)

Test Management

• Monitoring (or tracking)

– Check status

– Reports

– Metrics

• Controlling

(32)

Purpose of Measurement

• Test monitoring – check the status

• Test controlling – corrective actions

• Plan new testing

• Measure and analyze results

– The benefit/profit of testing

– The cost of testing

– The quality of testing

– The quality of the product

(33)

Cost of Testing

• How much does testing cost?

• As much as the resources we have!

Ericsson mobile phones did not do too well because the company did too much testing instead of getting the product to market

(34)

Test Monitoring

• Status

– Coverage metrics

– Test case metrics: development and execution

– Test harness development

• Efficiency / Cost metrics

– How much time have we spent?

– How much money/effort have we spent?

• Failure / Fault metrics

– How much have we accomplished?

– What is the quality status of the software?

• Effectiveness metrics

– How effective are the testing techniques in detecting defects?

Metrics Estimation

Cost Stop?

(35)

Selecting the right metrics

• What is the purpose of the collected data?

– What kinds of questions can they answer?

– Who will use the data?

– How is the data used?

• Who needs the data, and when?

– Which forms and tools are used to collect the data?

– Who will collect them?

– Who will analyse the data?

(36)

Goal-Question-Metric Paradigm (GQM)

• Goals

– What is the organization trying to achieve?

– The objective of process improvement is to satisfy these goals

• Questions

– Questions about areas of uncertainty related to the goals

– You need process knowledge to derive the questions

• Metrics

– Measurements to be collected to answer the questions

[van Solingen, Berghout, The Goal/Question/Metric Method, McGraw-Hill, 1999]

Goal example:

• Analyze <object(s) of study>

– the detection of design faults using inspection and testing

• for the purpose of <purpose>

– evaluation

• with respect to their <quality focus>

– effectiveness and efficiency

• from the point of view of the <perspective>

– managers

• in the context of <context>

– developers, and in a real application domain

(37)

Measurement Basics

Basic data:

• Time and Effort (calendar- and staff-hours)

• Failures / Faults

• Size / Functionality

Basic rule:

• Feedback to origin

• Use data or don’t

(38)

Test metrics: Coverage

What?

% statements covered

% branches covered

% data flow

% requirements

% equivalence classes

Why?

• Track completeness of test
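Each of these coverage metrics is a ratio of covered items to all items of that kind. A minimal sketch, assuming the items (statements, branches, requirements, ...) are identified by numbers or names collected in sets; the function name and the example numbers are made up:

```python
def coverage_percent(covered, all_items):
    """Percentage of all_items that appear in covered.

    Items reported as covered but not in all_items are ignored.
    """
    if not all_items:
        raise ValueError("nothing to cover")
    return 100.0 * len(set(covered) & set(all_items)) / len(all_items)


# Hypothetical example: a unit with 10 statements, 7 of them executed.
STATEMENTS = set(range(1, 11))
EXECUTED = {1, 2, 3, 4, 5, 6, 7}
```

Here `coverage_percent(EXECUTED, STATEMENTS)` gives the statement coverage; the same function works for branches, requirements, or equivalence classes once they are enumerated.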

(39)

Test metrics: Development status

• Test case development status

– Planned

– Available

– Unplanned (not planned for, but needed)

• Test harness development status

– Planned

– Available

– Unplanned

(40)

Test metrics: Test execution status

What?

• # faults/hour

• # executed tests

• Requirements coverage

Why?

• Track progress of test project

• Decide stopping criteria

(41)

Test metrics: Size/complexity/length

What?

• Size/Length – LOC

• Functionality – Function Points

• Complexity – McCabe

• Difficulty – Halstead

• Cohesion, Coupling, ...

Why?

(42)

Test metrics: Efficiency

What?

• # faults/hour

• # faults/test case

Why?

• Evaluate efficiency of V&V activities

(43)

Test metrics: Faults/Failures

(Trouble reports)

What?

• # faults/size

• repair time

• root cause

Why?

• Monitor quality

• Monitor efficiency

• Improve

(44)

Test metrics: Effectiveness

What?

% found faults per phase

% missed faults

Why?

• Evaluate effectiveness of V&V activities
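The difference between efficiency metrics (faults per unit of effort) and effectiveness metrics (share of faults found) can be sketched as follows. The function names and the example numbers are assumptions for illustration; "missed" faults are those that were present in a phase but only detected later.

```python
def faults_per_hour(faults_found, effort_hours):
    """Efficiency: faults found per staff-hour of V&V effort."""
    return faults_found / effort_hours


def faults_per_test_case(faults_found, test_cases_run):
    """Efficiency: faults found per executed test case."""
    return faults_found / test_cases_run


def phase_effectiveness(found_in_phase, missed_by_phase):
    """Effectiveness: % of the faults present in a phase that it found."""
    total = found_in_phase + missed_by_phase
    if total == 0:
        raise ValueError("no faults known for this phase")
    return 100.0 * found_in_phase / total
```

For example, a phase that found 30 faults in 12 staff-hours while 10 further faults slipped through has an efficiency of 2.5 faults/hour and an effectiveness of 75 %; the percentage of missed faults is the remaining 25 %.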

(45)

How good are we at testing?

[Figure: test quality and product quality related to the number of faults found (many faults, few faults), with the questions “Are we here? Or are we here?”]

(46)

When to stop testing?

• All planned tests are executed and passed

• All coverage goals met (requirements, code, ...)

• Detection of specific number of failures

• Rate of failure detection has fallen below a specified level

• Fault seeding ratios are favourable

• Reliability above a certain value
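The fault seeding criterion can be illustrated with the commonly used seeding estimate: if testing has found a certain fraction of the deliberately seeded faults, it is assumed to have found about the same fraction of the real faults. The sketch below rests on that assumption (seeded faults are as easy or hard to find as real ones); the function names and numbers are made up for the example.

```python
def seeding_ratio(seeded_total, seeded_found):
    """Fraction of the seeded faults that testing has detected."""
    return seeded_found / seeded_total


def estimate_remaining_faults(seeded_total, seeded_found, real_found):
    """Estimate the real faults still undetected.

    Assumes real faults are found at the same ratio as seeded ones:
    estimated_total = real_found * seeded_total / seeded_found.
    """
    if seeded_found == 0:
        raise ValueError("no seeded faults found yet; estimate undefined")
    estimated_total = real_found * seeded_total / seeded_found
    return estimated_total - real_found
```

For instance, with 20 seeded faults of which 16 were found, and 40 real faults found, the seeding ratio is 0.8 and the estimate is about 10 real faults remaining. A stopping rule could then require the ratio to exceed an agreed threshold.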

(47)

Example

[Figure: number of failures per day. Axes: number of executed test cases, number of detected failures]

Interpretation?

(48)

Example

[Figure: chart with the axes number of executed test cases and number of detected failures]

Interpretation?

(49)

Example

[Figure: chart with the axes number of executed test cases and number of detected failures]

Interpretation?

(50)

• Test Tools

• Test Measurement

• Test Process Improvement

• SWT Exam

(51)

Process quality and product quality

• Quality in process -> Quality in product

• Project: instantiated process

• Quality according to ISO 9126

– Process quality contributes to improving product quality, which in turn contributes to improving quality in use

(52)

Principles

Test organisation Maturity Model Assess Improve

(53)

(54)

Process improvement models

• (Integrated) Capability maturity model (CMM, CMMI)

• Software process improvement and capability determination (SPICE)

• ISO 9001, Bootstrap, …

• Test maturity model (TMM)

• Test process improvement model (TPI)

• Test improvement model (TIM)

• Minimal Test Practice Framework (MTPF)

• …

(55)

(56)

CMMI (Capability Maturity Model Integrated)

Level 5, Optimizing: Process change management, Technology change management, Defect prevention

Level 4, Managed: Software quality management, Quantitative process management

Level 3, Defined: Peer reviews, Intergroup coordination, Software product engineering, Integrated software management, Training programme, Organization process definition, Organization process focus

Level 2, Repeatable: Software configuration management, Software quality assurance, Software subcontract management, Software project tracking and oversight, Software project planning, Requirements management

Level 1, Initial

(57)

Test Maturity Model (TMM)

• Levels

• Maturity goals and sub-goals

– Scope, boundaries, accomplishments

– Activities, tasks, responsibilities

• Assessment model

– Maturity goals

– Assessment guidelines

– Assessment procedure

(58)

Level 2: Phase Definition

• Institutionalize basic testing techniques and methods

• Initiate a test planning process

(59)

Level 3: Integration

• Control and monitor the testing process

• Integrate testing into software life-cycle

• Establish a technical training program

• Establish a software test organization

(60)

Level 4: Management and Measurement

• Software quality evaluation

• Establish a test management program

• Establish an organization-wide review program

(61)

Level 5: Optimizing, Defect Prevention, and Quality Control

• Test process optimization

• Quality control

• Application of process data for defect prevention

(62)

(63)

Clausewitz: Armor and mobility alternate dominance (DeMarco)

[Figure: sequence of alternating dominance. Labels: Greeks, Romans, Vandals, Huns, Franks, Castles, Maginot Line, Mongols, Field Artillery, Tanks]

(64)

Birth of the castle (CMMI) and the tiger (Agile)

[Figure: origins of the castle (CMMI) and the tiger (Agile). Labels: U.S. Department of Defense, Scientific management, Statistical process control, Management, Control, Large team & low skill, Leading industry consultants, Team creates own process, Working software, Software craftsmanship, Productivity]

(65)

Plan-driven vs. Agile (Boehm & Turner, 2003, IEEE Computer, 36(6), pp 64-69)

(66)

Software quality assurance comparison: castle vs. tiger

Organisation: Independent QA team (castle) vs. Integrated into the project team (tiger)

Ensuring: Compliance to documented processes (castle) vs. Applicability and improvement of the current processes and practices (tiger)

Evaluation Criteria: Against predefined criteria (castle) vs. Identifying issues and problems (tiger)

Focus: Documents & processes & control (castle) vs. Productivity & quality & customer (tiger)

Communication

(67)

General advice

• Identify the real problems before starting an improvement program

• “What the customer wants is not always what it needs”

• Implement “easy changes” first

• Involve people

(68)

Recommended Textbook Exercises

• Chapter 14

– 2, 4, 5, 6, 9

• Chapter 9

– 2, 3, 4, 5, 8, 12

• Chapter 16

– No exercises

(69)

• Test Tools

• Test Measurement

• Test Process Improvement

• SWT Exam

(70)

Final Exam

• Written exam (40%)

– Based on textbook, lectures and lab sessions

– Open book

– 90 min

• Dates:

– Exam 1: 30-May-2013 10:15-11:45 (J. Liivi 2-405)

– Exam 2: 10-June-2013 14:15-15:45 (J. Liivi 2-403)

(71)

http://www.aptest.com/resources.html

http://www.softwareqatest.com/qatweb1.html

http://www.testingfaqs.org/

http://doc.froglogic.com/squish/4.1/all/how.to.do.keyword.driven.testing.html

http://code.google.com/p/robotframework/

http://graphwalker.org/