Data Mining Applications in Manufacturing
Dr Jenny Harding Dr Jenny Harding
Senior Lecturer
Wolfson School of Mechanical &
Manufacturing Engineering, Loughborough University
New P
roduct New Designs
Additional Facilities?
Redesigned Processes?
Extra Resources?
Existing Products
Existing Designs
New Business Objectives
New Technology?
Identification of Knowledge - Context
Intelligent Support
Systems Problem Areas
Tools:
Information Modelling Process Modelling
Artificial Intelligence Simulation
Data Mining???
How do we find and use the
“Right” Knowledge?
Sources of Manufacturing Knowledge
People
Expertise and Experience
Facts or Folklore? Under Exploited?
Processes
(Operational Data)
Production Processes
Background History
Why Data Mining?
What is the “Right” Knowledge?
Or is the knowledge hidden somewhere else as well?
Are “experts” the only sources?
Is there any substance in folklore?
Reported Applications of DM in Manufacturing
Earliest - Fault diagnosis (1987) Substantial increase in
(manufacturing) data mining publications in last 2 years
1987 1989 1991 1993 1995 1997 1999 2001 2003 2005 Ma
teria l Prop
erties Sh
op Flo
or C ontro
l La
you t Des
ign M
ain ten
ance Dec
isio n S
up port En
gin eerin
g D esign Ma
nu factu
rin g Sy
stem s 2
1 4
1 1 4
2 8
23 3 7
6 1
1
3 2
3
1 2 2 7
1
12 1 1
3 1 2 2
4
1 3
2 1 1 3
1 3
1 1 5
2
1 1 11 2 1 1
1 1
1
1 1
1 1
1 112 2
1 11 2 0
1 2 3 4 5 6 7 8
No. of Papers
1987 1989 1991 1993 1995 1997 1999 2001 2003 2005 Ma
teria l Prop
erties Sh
op Flo
or C ontro
l La
you t Des
ign M
ain ten
ance Dec
isio n S
up port En
gin eerin
g D esign Ma
nu factu
rin g Sy
stem s 2
1 4
1 1 4
2 8
23 3 7
6 1
1
3 2
3
1 2 2 7
1
12 1 1
3 1 2 2
4
1 3
2 1 1 3
1 3
1 1 5
2
1 1 11 2 1 1
1 1
1
1 1
1 1
1 112 2
1 11 2 0
1 2 3 4 5 6 7 8
No. of Papers
Harding, J A, Shahbaz, M, Srinivas and Kusiak, A, 2006, "Data Mining in Manufacturing: A Review", American Society of Mechanical Engineers
(ASME): Journal of Manufacturing Science and Engineering.
Reported Applications of DM in Manufacturing (2)
Use of data mining applications in Manufacturing is relatively new
To date – most of the research done has focused on semi-conductor industry (fault detection and
productivity improvement)
Summary:
To date – most of the reported research uses decision trees, clustering and neural networks
Challenges for DM in Manufacturing
? Investment – very wide range of tools and techniques and not obvious in most applications which will produce best results and rewards (DM preparation costs are high).
Why slow adoption?
? Availability and quality of data – substantial data cleaning and pre-processing required.
? Combinations of expertise needed – Data Mining Expertise + Expertise in Processes / Technologies under consideration
? Inadequate Exploitation of Resulting Knowledge -
“one-off” applications addressing specific problems
Challenges for DM in Manufacturing (2)
? Uncertainty and risk - there are no guarantees that useful knowledge will be discovered. In highly
competitive situations high risks may be unacceptable
Why slow adoption?
? Complexity and diversity - extremely difficult to devise generic data mining processes even for particular types of manufacturing problems
Case Studies
Experience from literature and
industrial case studies shows……….
An Example
Width
Height
aa ab ac ad
Thickness aa ab ac ad ae af
aa ab ac
Width
Height
aa ab ac ad
Thickness aa ab ac ad ae af
aa ab ac
Width
Height
aa ab ac ad
Thickness aa ab ac ad ae af
aa ab ac
Width
Height
aa ab ac ad
Thickness aa ab ac ad ae af
aa ab ac
A simple sample product
Data Cleaning
• Very important! (But very time-consuming)
•Iterative Process
Process 1
Process 2
Process n
Data Cleaning
Module Data
Data
Data
consolidated Error-free
table Process 1
Process 2
Process n
Data Cleaning
Module Data
Data
Data
consolidated Error-free
table
General Classifications
Identical Records
Some Duplication in Records Confusing Records
Missing Values in Records etc
Possible Causes
Breakdowns / Restarts Rework cycles
Operator Error
Operator Training Etc., etc.
Example Thoughts during Cleaning
Should Reworked records be included or ignored?
Can Missing values be replaced or should the whole record be rejected?
Can domain experts of more detailed files be consulted about replacing the abnormal data?
All the replaced values need to be documented for future consultation.
Data Transformation
01 02 03 04 05 06 07 08 09 10 11 σ
σ σ σ σ
σσ σ σ
σ σ σ
01 02 03 04 05 06 07 08 09 10 11 σ
σ σ σ σ
σσ σ σ
σ σ σ
Lower Limit LoutLower H-Lower M-Lower S-Lower Nominal S-Upper M-Upper H-Upper Upper
Upper Limit Uout
Lower Limit LoutLower H-Lower M-Lower S-Lower Nominal S-Upper M-Upper H-Upper Upper
Upper Limit Uout
Lower Limit LoutLower H-Lower M-Lower S-Lower Nominal S-Upper M-Upper H-Upper Upper
Upper Limit Uout
Lower Limit LoutLower H-Lower M-Lower S-Lower Nominal S-Upper M-Upper H-Upper Upper
Upper Limit Uout
Products - split
tolerance range for dimension.
Equal divisions???
Process variables
Normal distributions?
Equal divisions???
Data Mining
Association Rules the Apriori Algorithm has been most successful
Regression analysis used first – are there any simple relationships that can be exploited?
Several data mining approaches tried
A complete set of dimensional / manufacturing data for each product needs to be collected and transformed to make each record
Data Mining
In Manufacturing - Data mining decisions should be made in the context of the physical realities of the data that is being examined.
In very highly controlled manufacturing
processes, unusual combinations of results (representing problems) may be rare.
Support level kept very low to reduce risk of
missing unusual but important combinations of manufacturing outputs
Association Rule
Used in mostly Market Basket Analysis Apriori Property:
“Any subset of a large itemset must be large”
Rules are in the form of
A ŁŁŁŁ B or A ŁŁŁŁ (B & C) etc.
e.g. Chips ŁŁŁ Coke Ł
•Each rule is based on certain Support and Confidence level.
Rule Quality
Need to check Quality of the generated rules – use Support, Confidence, Statistical methods AND check with domain experts that the rules make sense in the Manufacturing Context.
Support: Support is the frequency of the occurrence of items in a transaction or record.
Confidence: Confidence is the certainty of the discovered rule.
Example Rules
Types of Manufacturing Rules that may be identified include:
Product Design Errors or Constraints Process Errors or Constraints
From Product Data - Rules of the form
Good Dimension output ŁŁŁŁ Good Dimension output Bad Dimension output ŁŁŁŁ Bad Dimension output Good Dimension output ŁŁŁŁ Bad Dimension output
From Product and Process Data
Process variable values that produce Good Product Results Process variable values that produce Poor Product Results
Example Rules
•Rules that indicate one dimension being “nominal”
results in another dimension being “nominal” are reassuring – but generally not very interesting
•Rules that indicate one dimension being “nominal”
results in another dimension being out of tolerance or away from “nominal” are more interesting.
Possible Causes include:- Design Error
Process Error
Process Constraints
Results
• Data Mining can be used to extract Manufacturing process limitations.
• Data Mining can be used to discover design constraints and then help to improve the product design, (if manufacturing process is acceptable).
• Data Mining can be used to improve the output quality by controlling the manufacturing process variables.
Data Mining projects tend to be “one-off”
applications to solve a particular problem
Future Considerations……….
But how do we make sure the acquired
knowledge is made use of when it has been discovered?
Greater benefits would be gained if they were
part of an ongoing Knowledge Reuse
programme
Data Mining Network
Main Pool of Knowledge Integration &
DM Engine
Local Site / Process 1
Local Databases and Files
Main Data Warehouse
Local Data Mining Engines
Local Pool of Knowledge
Local Site / Process 2
Local Databases and Files
Local Data Mining Engines
Local Pool of Knowledge Local Site / Process n-1
Local Databases and Files
Local Data Mining Engines
Local Pool of Knowledge
Local Site / Process n
Local Data Mining Engines
Local Pool of Knowledge
Local Databases and Files