Data Collection and Analysis



PEUSS 2011/2012 Data Collection and Analysis Page 1

Data Collection and Analysis

Dr Jane Marshall

Product Excellence using 6 Sigma



• Understand the relationship between data and analysis objectives

• Understand the data collection planning process

• Appreciate human factors of data collection



What is data?

• The terms 'data' and 'information' are often used interchangeably

• However the terms have distinct meanings:

– Data are facts, events, transactions and so on which have been recorded. They are the input raw materials from which information is processed.

– Information is data that have been produced in such a way as to be useful to the recipient.

• In general terms basic data are processed in some way to form information but the mere act of processing data does not itself produce information.

Data Characteristics

• Data are facts obtained by reading, observation, counting, measuring, weighing, etc., which are then recorded.

• These are called raw or basic data and are often records of the day-to-day transactions of an organization.

• Data are derived from both external and internal sources.

• Data may be produced as an automatic by-product of some routine but essential operation



Data Characteristics

• The pool of data available is effectively limitless.

• This abundance means that organisations have to be selective in the data they collect.

• They must continually monitor their data gathering procedures to ensure that they continue to meet the organisation's specific needs

• The data gathered and the means employed naturally vary from business to business depending on the organization's requirements.

Why collect data?

• Measure reliability

• Document spares consumption

• Provide statistics

• …

• These are reactive



Why collect data?

• Maintenance planning

• Maintenance improvement

• Identify & justify need for modification

• Calculate future resource & spares requirements

• Assess likelihood of mission success

• Confirm contractual requirements

Why collect data?

• To assist achievement of worthwhile objectives

• Data collection is time-consuming & costly.

– We should only collect data where there is an identified and worthwhile benefit from doing so.



From data to worthwhile objectives


Operation → Data Collection → Analysis → Results → Decisions → Achievement of Objectives

Put planning into data


Operation Data Collection Analysis Results Decisions Achievement of Objectives



Put planning into data


• Worthwhile objectives require decisions:

– To change: how much, what, when, how

– To not change

• Decisions need clear supporting evidence:

– Analysed results (not all analysis is equal)

• Analysis needs data

– Good results need good analysis, but good analysis may need expensive data

– Options: consider alternatives and identify the most cost-effective one that enables the objectives

Put planning into data


• Data collection does not need to satisfy all objectives all the time. For example:

– Objective 1: Identify quickly that there is a reliability problem

• Routine data collection sufficient to allow SPC or CUSUM analysis of occurrences

– Objective 2: Identify accurately what the problem is

• Special data collection once a problem has been identified, possibly using sampling techniques and engineering analysis rather than data analysis
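To illustrate Objective 1, the sketch below runs a one-sided tabular CUSUM over routine failure counts. The monthly counts, the target level, the allowance k and the threshold h are all illustrative assumptions, not values from the course; in practice they would be chosen from the item's own history.

```python
def cusum_upper(counts, target, k, h):
    """One-sided (upper) tabular CUSUM.

    counts: failures observed in each reporting period
    target: expected failures per period when nothing is wrong (assumed)
    k:      allowance subtracted each period (assumed)
    h:      decision threshold; a signal is raised when the sum exceeds it
    Returns the CUSUM values and the index of the first signal (or None).
    """
    s = 0.0
    sums = []
    first_signal = None
    for i, x in enumerate(counts):
        # Accumulate only the excess over target + k; never go below zero.
        s = max(0.0, s + (x - target - k))
        sums.append(s)
        if first_signal is None and s > h:
            first_signal = i
    return sums, first_signal


# Hypothetical monthly failure counts: stable at first, then drifting upwards.
monthly_failures = [2, 3, 2, 1, 3, 2, 5, 6, 5, 7]
sums, signal = cusum_upper(monthly_failures, target=2.0, k=0.5, h=4.0)
print(sums)
print("first signal at period index:", signal)
```

A count of occurrences per period is all this needs, which is why routine data collection can be enough to detect that a problem exists, even though it cannot say what the problem is.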



Data collection must have a purpose


• Data should be collected for a purpose:

– To enable analysis,

– Focus on increasing understanding of item operation and failure,

– Application of this knowledge to a goal or objective.

• Without a definition of the objective for the future data analysis and the application of its findings, collection of data is likely to be aimless and will omit important data, allow corruption of data, or may waste time and resources by including data that offer little benefit.

Questions to consider

– What observed availability is achieved with the applied maintenance regime?

– What values have been achieved with a former, similar product?

– Does the product conform to the requirements?

– What effect have environment and usage on


– How stable is the dependability of manufactured items with time?



Level of reporting

• Structure of items:

– system;

– equipment;

– module or unit;

– part or component;

– software module.

• Generically these can all be termed items

• Different phases of the life cycle:

– production to delivery;

– installation;

– operation;

– time of warranty;

– long term behaviour, useful life, service effort;

– withdrawal from operation;


– Information proving that a particular item exists in the field

– How that item is configured

– What other items that item contains


– Information about when an item was placed into the field

– How that item is operated in the field

– When that item was removed from the field


– Information about the operating conditions of the item


– Information about anything that has happened to the item during its life



Data sources

• Servicing records

• Warranty records

• Repaired product records

• Spares used records

• Disposal records

• Customer complaints

• Customer reports and comments can also be used to help complete a data set.

• Insurance claims and coverage records


• The infrastructure:

– Diagnosis and service utilities as necessary for maintenance;

– Computerized tools for data storage, aggregation, analysis and

– Facilities for raw data recording, including computerized facilities

– Remote condition monitoring and data collection.

• Economic and financial aspects to be considered are:

– Cost of implementing and maintaining regular data collection;

– Benefits gained by improvement of processes caused by measures



Data Validation

• Why validate

– Avoid garbage-in, garbage-out

– Avoid wrong decisions with costly consequences

– Reliability analysis often requires large amounts of data collected over a long period of time. By the time analysis is attempted, it is too late to discover that the data are corrupt

• How to validate

– Input masks, cross-checks (e.g. serial # fitted previously is serial # removed, serial # fitted is serial # removed from stores, item fitted matches host equipment, etc.), usage matches expectation, gaps in data …

– Use electronic aids such as smart-chips, bar-coding

– Validate incrementally: validate at the point of data entry
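The serial-number cross-checks listed above could be applied at the point of data entry along the following lines. This is a minimal sketch: the record layout, field names and lookup structures are hypothetical and are not taken from any particular maintenance system.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FitmentRecord:
    """Hypothetical maintenance record; all field names are assumptions."""
    host_id: str
    position: str
    serial_removed: Optional[str]
    serial_fitted: str


def validate_entry(record, currently_fitted, in_stores):
    """Return a list of problems found; an empty list means the record passes.

    currently_fitted: dict mapping (host_id, position) to the serial number
                      recorded as fitted there
    in_stores:        set of serial numbers recorded as held in stores
    """
    problems = []
    expected = currently_fitted.get((record.host_id, record.position))
    # Cross-check: the serial removed must be the serial previously fitted.
    if record.serial_removed != expected:
        problems.append(
            f"removed {record.serial_removed!r} but {expected!r} was fitted"
        )
    # Cross-check: the serial fitted must have been removed from stores.
    if record.serial_fitted not in in_stores:
        problems.append(f"fitted {record.serial_fitted!r} not found in stores")
    return problems


fitted = {("VEH-001", "alternator"): "A100"}
stores = {"A200", "A300"}

good = FitmentRecord("VEH-001", "alternator", "A100", "A200")
bad = FitmentRecord("VEH-001", "alternator", "A999", "A400")

print(validate_entry(good, fitted, stores))
print(validate_entry(bad, fitted, stores))
```

Rejecting or flagging the record while the person who made the entry is still present is what makes incremental validation cheaper than finding the same error years later during analysis.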

Human factors in data collection


• Make it simple to get data collection correct

• Make it difficult to get data collection wrong

• Complexity? Layout? Masks? Computer


• Involve those who collect the data in the planning process to gain buy-in to objectives




• Analysis is often as much detective work as it is


– Analysis answers a statistical question—but the human must identify the question to ask

• There are no absolutes in reliability or maintenance data analysis

– Results give guidance to decisions

• Always start with the simple analysis before attempting more advanced methods

Examples of Analysis

• Count the number of failure events? What is a failure event?

• Calculate the rate of occurrence against usage?

• Identify the distribution of the events with time?

• Examine the causes of failure events?
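As an example of starting simple, the rate of occurrence against usage can be computed directly from counts and accumulated usage. The fleet below is hypothetical, and the choice of operating hours as the usage measure is an assumption.

```python
def failure_rate(failure_count, total_usage):
    """Failures per unit of usage (for example, per operating hour)."""
    if total_usage <= 0:
        raise ValueError("total usage must be positive")
    return failure_count / total_usage


# Hypothetical fleet: usage in operating hours and failure events per unit.
usage_hours = {"unit-1": 1200.0, "unit-2": 800.0, "unit-3": 2000.0}
failures = {"unit-1": 2, "unit-2": 0, "unit-3": 3}

total_hours = sum(usage_hours.values())
total_failures = sum(failures.values())
rate = failure_rate(total_failures, total_hours)

print(f"{rate * 1000:.2f} failures per 1000 operating hours")
```

Note that unit-2 contributes usage but no failures. Leaving such units out because they have no failure events would overstate the rate.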



What is usage?

• Which measures of life-consumption should be used? Hours, days, cycles,

• What factors potentially affect the rate of life-consumption? Time of year, production batch,

• What is the influence of the environment? Effects of different market segments?

Analysis – data censoring

• Complete data means that the value of the life time of each item is observed or known. For example, for life data analysis, the data (if complete, which is unusual in field data collection) would comprise the times-to-failure of all units in the field.

• Often when life data are analyzed, all the units may not have experienced events of interest or the time of the event is not known. This type of data is censored data.

• There are three types of possible censoring schemes:

– right censored data (also called suspended data),

– interval censored data,

– left censored data.



Analysis – right censoring

• The most common case

• These data are composed of units that did not experience any events.

• The term "right censored" implies that the event of interest is to the right of the analysis point.

[Figure: timelines for Unit 1 to Unit 5]
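Right-censored units still carry information, which a small calculation can show. The sketch below makes an additional assumption that the slides do not: that lifetimes are exponential (constant failure rate). Under that assumption the maximum-likelihood estimate of mean time to failure is the total time accumulated by all units, failed or not, divided by the number of observed failures. The unit times are hypothetical.

```python
def exponential_mttf(times, failed):
    """MTTF estimate for exponential lifetimes with right censoring.

    times:  observed time for each unit (to failure, or to the analysis point)
    failed: True if the unit failed at that time, False if right censored
    """
    if len(times) != len(failed):
        raise ValueError("times and failed must have the same length")
    n_failures = sum(failed)
    if n_failures == 0:
        raise ValueError("no failures observed; MTTF cannot be estimated")
    # Censored units add running time to the numerator but no failures.
    return sum(times) / n_failures


# Hypothetical data: two failures, three units still running at 1000 hours.
times = [400.0, 1000.0, 650.0, 1000.0, 1000.0]
failed = [True, False, True, False, False]

with_censored = exponential_mttf(times, failed)
failures_only = exponential_mttf([400.0, 650.0], [True, True])

print("using all units:", with_censored)
print("failed units only:", failures_only)
```

Discarding the suspended units gives a much lower estimate here, because the survivors' running time is thrown away.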

Analysis – interval censoring

• Interval censored data contain uncertainty as to the exact times the events happened within an interval.

[Figure: timelines for Unit 1 to Unit 5]



Analysis – left censoring

• An event occurrence time is only known to be before a certain time

[Figure: timelines for Unit 1 to Unit 5]
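One common way to record all three censoring schemes in a single data format is to store, for each unit, the interval within which the event time is known to lie. The sketch below assumes that representation; the function name and the use of infinity for "not yet observed" are illustrative choices.

```python
import math


def censoring_type(lower, upper):
    """Classify an observation stored as the interval containing the event time.

    lower: latest time at which the unit was known not to have failed
    upper: earliest time at which the unit was known to have failed
           (math.inf if no failure has been observed)
    """
    if lower > upper:
        raise ValueError("lower bound must not exceed upper bound")
    if lower == upper:
        return "complete"           # exact event time is known
    if math.isinf(upper):
        return "right censored"     # event lies after the last observation
    if lower == 0:
        return "left censored"      # event lies before the first observation
    return "interval censored"      # event lies between two observations


# Hypothetical observations, in operating hours.
print(censoring_type(500.0, 500.0))
print(censoring_type(800.0, math.inf))
print(censoring_type(0.0, 300.0))
print(censoring_type(300.0, 600.0))
```

Deciding on a representation like this at the data collection planning stage avoids losing the inspection times that interval and left censoring depend on.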


• Use the results

– Support decisions to enable achievement of objectives

– Improve data collection process

• Refine



Syndicate exercise

You are project managers in a car design and manufacturing company.

• Your company has links to a network of car dealers (sales, repair and servicing). It does not currently have direct contact with end-users.

• Identify 3 key objectives for a data collection and analysis system to be used by your company.

• For each objective give examples of:

– Type of data

– Method of collection

– Cost implications

• With appropriate consideration of technology, human factors, business factors and costs, design a cost-effective data collection and analysis system and identify:

– Benefits

– How well it will meet the objectives

• Present your work


• Reliability & Maintenance data collection should pro-actively support management objectives.

• R&M data may be expensive and should be tailored for maximum cost-benefit.

• The analysis process is feasible only with valid data. Human factors are an important issue.