PEUSS 2011/2012 Data Collection and Analysis Page 1
Data Collection and Analysis
Dr Jane Marshall
Product Excellence using 6 Sigma
Module
Objectives
• Understand the relationship between data and
analysis objectives
• Understand the data collection planning process
• Appreciate human factors of data collection
What is data?
• The terms 'data' and 'information' are used interchangeably
• However the terms have distinct meanings:
– Data are facts, events, transactions and so on which have been recorded. They are the input raw materials from which information is processed.
– Information is data that have been processed in such a way as to be useful to the recipient.
• In general terms basic data are processed in some way to form information but the mere act of processing data does not itself produce information.
Data Characteristics
• Data are facts obtained by reading, observation,
counting, measuring, and weighing etc. which are then recorded.
• These are called raw or basic data and are often records of the day-to-day transactions of an organization.
• Data are derived from both external and internal sources.
• Data may be produced as an automatic by-product of some routine but essential operation
Data Characteristics
• The pool of data available is effectively limitless.
• This abundance means that organisations have to be selective in the data they collect.
• They must continually monitor their data gathering procedures to ensure that they continue to meet the organisation's specific needs
• The data gathered and the means employed naturally vary from business to business depending on the organization's requirements.
Why collect data?
• Measure reliability
• Document spares consumption
• Provide statistics
• …
• These are reactive
Why collect data?
• Maintenance planning
• Maintenance improvement
• Identify & justify need for modification
• Calculate future resource & spares requirements
• Assess likelihood of mission success
• Confirm contractual requirements
Why collect data?
• To assist achievement of worthwhile objectives
• Data collection is time-consuming & costly.
– We should only collect data where there is an identified and worthwhile benefit from doing so.
From data to worthwhile
objectives
Operation → Data Collection → Analysis → Results → Decisions → Achievement of Objectives
Put planning into data
collection
Operation → Data Collection → Analysis → Results → Decisions → Achievement of Objectives
Put planning into data
collection
• Worthwhile objectives require decisions:
– To change: how much, what, when, how
– To not change
• Decisions need clear supporting evidence:
– Analysed results (not all analysis is equal)
• Analysis needs data
– Good results need good analysis, but good analysis may need expensive data
– Options: consider alternatives and identify the most cost-effective one that enables the objectives
Put planning into data
collection
• Data collection does not need to satisfy all
objectives all the time. For example:
– Objective 1: Identify quickly that there is a reliability problem
• Routine data collection sufficient to allow SPC or CUSUM analysis of occurrences
– Objective 2: Identify accurately what the problem is
• Special data collection once a problem has been identified, possibly using sampling techniques and engineering analysis rather than data analysis
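As an illustration of Objective 1, routine occurrence counts can be monitored with a one-sided (upper) CUSUM. The sketch below is a minimal example only: the function name, the monthly counts, and the values chosen for the target, the allowance k and the decision threshold h are hypothetical and are not taken from the course material.

```python
def cusum_upper(counts, target, k, h):
    """One-sided upper CUSUM on a sequence of occurrence counts.

    Returns the list of CUSUM statistics and the index of the first
    period in which the statistic exceeds h (None if it never does).
    """
    s = 0.0
    stats = []
    alarm = None
    for i, x in enumerate(counts):
        # Accumulate only the excess over (target + k); reset at zero.
        s = max(0.0, s + (x - target - k))
        stats.append(s)
        if alarm is None and s > h:
            alarm = i
    return stats, alarm


# Hypothetical monthly failure counts: stable at first, then drifting upward.
monthly_failures = [2, 3, 2, 1, 3, 2, 4, 5, 4, 6, 5]
stats, alarm = cusum_upper(monthly_failures, target=2.0, k=0.5, h=4.0)
print(alarm)  # index of the first month that signals a possible problem
```

An alarm here only says that something has changed; identifying what the problem is (Objective 2) would still need the special data collection described above.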
Data collection must have a
purpose!
• Data should be collected for a purpose:
– To enable analysis,
– To focus on increasing understanding of item operation and failure,
– To apply this knowledge to a goal or objective.
• Without a definition of the objective for the future data analysis and the application of its findings, collection of data is likely to be aimless and will omit important data, allow corruption of data, or may waste time and
resources by including data that offer little benefit.
Questions to consider
– What observed availability is achieved with the applied maintenance regime?
– What values have been achieved with a former, similar product?
– Does the product conform to the requirements?
– What effect do environment and usage have on dependability?
– How stable is the dependability of manufactured items with time?
Level of reporting
• Structure of items
– system;
– equipment;
– module or unit;
– part or component;
– software module.
• Generically these can all be termed items
• Different phases of the life cycle:
– production to delivery;
– installation;
– operation;
– time of warranty;
– long term behaviour, useful life, service effort;
– withdrawal from operation;
• Inventory
– Information proving that a particular item exists in the field
– How that item is configured
– What other items that item contains
• Usage
– Information about when an item was placed into the field
– How that item is operated in the field
– When that item was removed from the field
• Environment
– Information about the operating conditions of the item
• Events
– Information about anything that has happened to the item during its life
Data sources
• Servicing records
• Warranty records
• Repaired product records
• Spares used records
• Disposal records
• Customer complaints
• Customer reports and comments can also be used to help complete a data set.
• Insurance claims and coverage records
Resources
• The infrastructure :
– Diagnosis and service utilities as necessary for maintenance;
– Computerized tools for data storage, aggregation, analysis and reporting;
– Computerized facilities for raw data recording;
– Remote condition monitoring and data collection.
• Economical and financial aspects to be considered are:
– Cost of implementing and maintaining regular data collection;
– Benefits gained from the improvement of processes caused by measures.
Data Validation
• Why validate
– Avoid garbage-in, garbage-out
– Avoid wrong decisions with costly consequences
– Reliability analysis often requires large amounts of data, collected over a long period of time; finding that the data are corrupt only when analysis is attempted is too late
• How to validate
– Input masks, cross-checks (e.g. serial # fitted previously is serial # removed, serial # fitted is serial # removed from stores, item fitted matches host equipment, etc.), usage matches expectation, gaps in data …
– Use electronic aids such as smart-chips, bar-coding
– Validate incrementally: validate at point of data entry
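One of the cross-checks listed above (the serial number removed should be the serial number previously fitted) could be sketched as follows. This is an assumed, minimal record layout for illustration: the field names `host`, `position`, `action` and `serial`, and the sample records, are hypothetical.

```python
def validate_fit_removal(events):
    """Cross-check a chronological list of fit/remove records.

    Each record is a dict with the (hypothetical) keys
    'host', 'position', 'action' ('fit' or 'remove') and 'serial'.
    Returns a list of human-readable error messages.
    """
    fitted = {}  # (host, position) -> serial number currently fitted
    errors = []
    for n, e in enumerate(events):
        key = (e["host"], e["position"])
        if e["action"] == "fit":
            if key in fitted:
                errors.append(
                    f"record {n}: {e['serial']} fitted but {fitted[key]} was never removed"
                )
            fitted[key] = e["serial"]
        elif e["action"] == "remove":
            if fitted.get(key) != e["serial"]:
                errors.append(
                    f"record {n}: removed {e['serial']} but fitted serial was {fitted.get(key)}"
                )
            fitted.pop(key, None)
        else:
            errors.append(f"record {n}: unknown action {e['action']!r}")
    return errors


# Hypothetical maintenance records; the last removal does not match the fit.
events = [
    {"host": "H1", "position": "P1", "action": "fit", "serial": "A100"},
    {"host": "H1", "position": "P1", "action": "remove", "serial": "A100"},
    {"host": "H1", "position": "P1", "action": "fit", "serial": "A101"},
    {"host": "H1", "position": "P1", "action": "remove", "serial": "A999"},
]
errors = validate_fit_removal(events)
for message in errors:
    print(message)
```

Running a check like this at the point of data entry, rather than at analysis time, is what makes the validation incremental.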
Human factors in data
collection
• Make it simple to get data collection right
• Make it difficult to get data collection wrong
• Complexity? Layout? Masks? Computer
assistance?
• Involve those who collect the data in the
planning process—buy-in to objectives
Analysis
• Analysis is often as much detective work as it is
statistics
– Analysis answers a statistical question—but the human must identify the question to ask
• There are no absolutes in reliability or
maintenance data analysis
– Results give guidance to decisions
• Always start with the simple analysis before
attempting more advanced methods
Examples of Analysis
• Count number of failure events?—what is a
failure event?
• Calculate the rate of occurrence against usage?
• Identify the distribution of the events with time?
• Examine the causes of failure events?
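The simplest of these analyses (counting events, a rate of occurrence against usage, and a tally of causes) might look like the sketch below. All item identifiers, usage figures and cause labels are made up for illustration, and operating hours is assumed as the usage measure.

```python
from collections import Counter

# Hypothetical event log: (item id, operating hours at the event, recorded cause).
failure_events = [
    ("A1", 350.0, "seal leak"),
    ("A2", 900.0, "bearing wear"),
    ("A1", 1100.0, "seal leak"),
    ("A3", 1500.0, "connector"),
]

# Hypothetical accumulated usage per item, in operating hours.
usage_hours = {"A1": 1200.0, "A2": 1000.0, "A3": 1800.0}


def rate_of_occurrence(n_events, total_usage):
    """Events per unit of usage, pooled over all items."""
    if total_usage <= 0:
        raise ValueError("total usage must be positive")
    return n_events / total_usage


n_failures = len(failure_events)            # count of failure events
total_hours = sum(usage_hours.values())     # pooled usage across the fleet
rate = rate_of_occurrence(n_failures, total_hours)
causes = Counter(cause for _, _, cause in failure_events)

print(n_failures, total_hours, rate)
print(causes.most_common())
```

Even this simple calculation depends on the questions raised above: what counts as a failure event, and which usage measure goes in the denominator.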
What is usage?
• Which measures of life-consumption should be
used?—hours, days, cycles,
time-since-overhaul?
• What factors potentially affect the rate of
life-consumption?—time of year, production batch,
user?
• What is the influence of the environment?—
effects of different market segments?
Analysis – data censoring
• Complete data means that the value of the life time of each item is observed or known. For example, for life data analysis, the data (if complete, which is unusual in field data collection) would comprise the times-to-failure of all units in the field.
• Often when life data are analyzed, all the units may not have experienced events of interest or the time of the event is not known. This type of data is censored data.
• There are three types of possible censoring schemes:
– right censored data (also called suspended data),
– interval censored data,
– left censored data.
Analysis – right censoring
• The most common case
• These data are composed of units that did not experience any events.
• The term "right censored" implies that the event of interest is to the right of the analysis point.
[Figure: event timelines for Unit 1 to Unit 5]
Analysis – interval censoring
• Interval censored data contain uncertainty as to the exact times the events happened within an interval.
[Figure: event timelines for Unit 1 to Unit 5]
Analysis – left censoring
• An event occurrence time is only known to be before a certain time
[Figure: event timelines for Unit 1 to Unit 5]
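One common way to record complete and censored observations in a single format is as a pair of bounds on the event time. The sketch below assumes that convention (it is not prescribed by the slides): equal bounds mean a complete observation, an infinite upper bound means right censored, a zero lower bound means left censored, and anything else is interval censored. The unit names and times are hypothetical.

```python
import math


def censoring_type(lower, upper):
    """Classify an observation whose event time lies between lower and upper."""
    if lower < 0 or lower > upper or math.isinf(lower):
        raise ValueError("bounds must satisfy 0 <= lower <= upper, with lower finite")
    if lower == upper:
        return "complete"           # exact time of the event is known
    if math.isinf(upper):
        return "right censored"     # no event observed up to time `lower`
    if lower == 0:
        return "left censored"      # event happened some time before `upper`
    return "interval censored"      # event happened between two inspections


# Hypothetical field observations, in operating hours.
observations = {
    "Unit 1": (500.0, 500.0),       # failed at exactly 500 h
    "Unit 2": (800.0, math.inf),    # still running at 800 h
    "Unit 3": (200.0, 400.0),       # found failed at 400 h, was healthy at 200 h
    "Unit 4": (0.0, 300.0),         # found failed at first inspection at 300 h
}
types = {unit: censoring_type(lo, hi) for unit, (lo, hi) in observations.items()}
for unit, kind in types.items():
    print(unit, kind)
```

Storing the bounds rather than a single time keeps the censoring information available to whatever analysis is chosen later.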
Results
• Use the results
– Support decisions to enable achievement of objectives
– Improve data collection process
• Refine
Syndicate exercise
You are project managers in a car design and manufacturing company.
• Your company has links to a network of car dealers (sales, repair and servicing). It does not currently have contact directly with end-users.
• Identify 3 key objectives for a data collection and analysis system to be used by your company.
• For each objective give examples of:
– Type of data
– Method of collection
– Cost implications
• With appropriate consideration of technology, human factors, business factors and costs, design a cost-effective data collection and analysis system and identify:
– Benefits
– How well it will meet the objectives
• Present your work