
(1)

Statistics for Materials Engineers

MATLS 3J03

Instructor: Tim Dietrich

Overall revision number: 19 (January 2013)

(2)

Copyright, sharing, and attribution notice

This work is licensed under the Creative Commons Attribution-ShareAlike 3.0

Unported License. To view a copy of this license, please visit

http://creativecommons.org/licenses/by-sa/3.0/

This license allows you:

I to share - to copy, distribute and transmit the work

I to adapt - but you must distribute the new result under the same or similar license to this one

I commercialize - you are allowed to use this work for commercial purposes

I attribution - but you must attribute the work as follows:

I “Portions of this work are the copyright of Kevin Dunn”, or

I “This work is the copyright of Kevin Dunn”

(3)

We appreciate:

I if you let us know about any errors in the slides

I any suggestions to improve the notes

All of the above can be done by writing to

[email protected]

or anonymous messages can be sent to Kevin Dunn at

http://learnche.mcmaster.ca/feedback-questions

If reporting errors/updates, please quote the current revision number: 77

Please note that all material is provided “as-is” and no liability will be accepted for your usage of the material.

(4)

Intro

(5)

In context

I The section we’ve just finished could be considered:

“Empirical modelling of systems using a least squares model”

I Experiments are important:

I We learn more about our systems

I We use the data to fit an empirical model

I Main aim: use the model to optimize a process for higher profit

I Happenstance (as-is) data

I cannot tell cause and effect

I most often is not in a DOE layout, but might still be valuable to learn from.

(6)

(7)

References

I Box, Hunter and Hunter, Statistics for Experimenters

I chapters 10, 11, 12, 13, 15 in first edition

I chapters 5 and 6 in second edition

(8)

Experiments with a single variable at two levels

I Simplest case:

I catalyst A vs catalyst B

I low RPM vs high RPM

I etc

I Measure nA values from setup A

I Measure nB values from setup B

(9)

Recap of group-to-group differences

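Recap:

$$s_P^2 = \frac{(n_A - 1)s_A^2 + (n_B - 1)s_B^2}{n_A - 1 + n_B - 1}$$

$$z = \frac{(\bar{x}_B - \bar{x}_A) - (\mu_B - \mu_A)}{\sqrt{s_P^2\left(\dfrac{1}{n_A} + \dfrac{1}{n_B}\right)}}$$

$$(\bar{x}_B - \bar{x}_A) - c_t \sqrt{s_P^2\left(\frac{1}{n_A} + \frac{1}{n_B}\right)} \;\le\; \mu_B - \mu_A \;\le\; (\bar{x}_B - \bar{x}_A) + c_t \sqrt{s_P^2\left(\frac{1}{n_A} + \frac{1}{n_B}\right)}$$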

I Significant difference: does confidence interval span zero?

I Practical difference?

I width of confidence interval

I where it lies relative to zero

(10)

Using linear least squares models

I Can achieve the same result using least squares: yi = b0 + g di

I di = 0 for A

I di = 1 for B

I and yi is the response variable.

Two totally different methods; same result! Confirm it for yourself.
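One way to confirm this is sketched below. It assumes NumPy is available and borrows the TK104 (A) and TK107 (B) values quoted on the randomization slides; the variable names are illustrative only. The fitted slope g should equal the difference of the group averages, and g divided by its standard error should reproduce the z-value from the pooled-variance formula.

```python
import numpy as np

# Tank data quoted in these slides: A = TK104, B = TK107
y_A = np.array([254, 440, 501, 368, 697, 476, 188, 525], dtype=float)
y_B = np.array([338, 470, 558, 426, 733, 539, 240, 628, 517], dtype=float)
n_A, n_B = y_A.size, y_B.size

# Method 1: group-to-group difference with the pooled variance
diff_means = y_B.mean() - y_A.mean()
s2_pooled = ((n_A - 1) * y_A.var(ddof=1) + (n_B - 1) * y_B.var(ddof=1)) / (n_A + n_B - 2)
z_pooled = diff_means / np.sqrt(s2_pooled * (1 / n_A + 1 / n_B))

# Method 2: least squares on y_i = b0 + g*d_i, with d_i = 0 for A and 1 for B
y = np.concatenate([y_A, y_B])
d = np.concatenate([np.zeros(n_A), np.ones(n_B)])
X = np.column_stack([np.ones(y.size), d])
(b0, g), *_ = np.linalg.lstsq(X, y, rcond=None)

# Standard error of the slope from the residual variance (2 parameters fitted)
residuals = y - X @ np.array([b0, g])
s2_resid = residuals @ residuals / (y.size - 2)
z_ls = g / np.sqrt(s2_resid * (1 / n_A + 1 / n_B))

print(diff_means, g)      # difference of averages vs fitted slope
print(z_pooled, z_ls)     # both routes give the same z-value
```

The intercept b0 is the average of group A, since that is the prediction when di = 0.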

(11)

Importance of randomization

Why randomize experiments?

I Prevent unmeasured and uncontrollable disturbances affecting y

I Guarantees independence in the data

I We can then use t-distributions (which require independence)

I The example of Fisher: lady and the tea. Modern day example: Coke vs Pepsi.

I Engineering example: A = TK104 and B = TK107

I nA= 8: [254, 440, 501, 368, 697, 476, 188, 525]

I nB = 9: [338, 470, 558, 426, 733, 539, 240, 628, 517]

I Null hypothesis: there is no difference

I Implies these numbers could have come from either A or B

Details of the analysis are given in the course textbook

(12)

Importance of randomization

I nA= 8: [254, 440, 501, 368, 697, 476, 188, 525]

I nB = 9: [338, 470, 558, 426, 733, 539, 240, 628, 517 ]

I Randomly assign “A” to any nA of the values and “B” to any nB of the values

I Number of possible combinations:

$$\frac{(n_A + n_B)!}{n_A!\,n_B!} = 24310$$

I Combinations: number of unique ways to split 17 experiments into 2 groups of nA = 8 and nB = 9

I 1: AAAAAAAABBBBBBBBB

I 2: AAAAAAABABBBBBBBB

I 3: AAAAAAABBABBBBBBB

I etc

I For each arrangement we calculate: $\bar{y}_A - \bar{y}_B$

I Plot a histogram of this difference of averages

(13)

Importance of randomization

I Probability that the actual experiment could have come from chance?

I 79.6% of combinations have a lower value than the actual difference

I Using standard group-to-group difference:

I z = 0.8435

I Pr(z < 0.8435) = 79.3% (DOF = nA + nB − 2)

I Result: if we don’t randomize, we cannot rely on z-values and confidence intervals; they may be misleading.
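The randomization argument can be sketched in a few lines of standard-library Python. This is an illustration, not the textbook's own code: it enumerates every way of labelling 8 of the 17 values as "A", and it takes the difference as B minus A, an assumption made here so that the sign matches the positive z-value of 0.8435.

```python
from itertools import combinations

y_A = [254, 440, 501, 368, 697, 476, 188, 525]
y_B = [338, 470, 558, 426, 733, 539, 240, 628, 517]
pooled = y_A + y_B
n_A, n = len(y_A), len(pooled)
total = sum(pooled)

def diff_of_means(sum_A):
    """Average of the 'B' group minus average of the 'A' group for one split."""
    return (total - sum_A) / (n - n_A) - sum_A / n_A

# Difference observed in the actual experiment
actual = diff_of_means(sum(y_A))

# Every way of assigning the label "A" to n_A of the 17 values
diffs = [diff_of_means(sum(pooled[i] for i in idx))
         for idx in combinations(range(n), n_A)]

n_combinations = len(diffs)
fraction_lower = sum(d < actual for d in diffs) / n_combinations

print(n_combinations)                    # number of arrangements examined
print(round(actual, 2))                  # observed difference of averages
print(round(100 * fraction_lower, 1))    # % of arrangements below the observed one
```

The percentage printed last is the quantity the slide compares against the t-distribution probability.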

(14)

Importance of randomization

The previous derivation used random combinations and made no statistical assumptions.

Why don’t we use this approach instead?

The original data set (still a small data set by today’s standards) was nA = 20 and nB = 23. There are

$$\frac{(n_A + n_B)!}{n_A!\,n_B!} \approx 960{,}567 \text{ million}$$

combinations, and it would take about 3 years on a regular computer to do the computation (never mind storing the results).
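The two counts quoted on these slides can be checked directly. This short sketch assumes Python 3.8 or later, where `math.comb` is available.

```python
import math

# Number of ways to choose which n_A of the (n_A + n_B) values are labelled "A"
small = math.comb(8 + 9, 8)      # n_A = 8,  n_B = 9
large = math.comb(20 + 23, 20)   # n_A = 20, n_B = 23

print(small)
print(large, round(large / 1e6), "million")
```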

(15)

Change one single variable at a time (COST)

I Base case: T = 346 K, S = 1.5 g/L; yield = 63%.

(16)

Change one single variable at a time (COST)

I Trapped in a sub-optimal solution

I In the previous example: we would have considered experiment 7 to be the optimum

I experiment 3 is the optimum wrt “Temperature”

I then experiment 7 is the optimum wrt “Substrate”

I but we are still away from the true optimum

I We have known for 80 years now: COST is the wrong way to optimize a system

(17)

Why not use existing data?

I Existing data = historical data = happenstance data

I This is data without any intentional perturbations

I Problem: we see correlations, but we cannot tell if they are causal

(18)

Terminology: Factors

Factor: the thing that is being changed.

I growing plants?

I water used = [50 mL vs 80 mL]

I maximizing sales in a store?

I height from floor = [3 ft vs 5 ft]

I first date or “date-night”?

I action movie vs chick flick

(19)

Terminology: Response

Response: the outcome that is being measured

I growing plants?

I e.g. height of plant after 10 days

I other outcomes are possible

I maximizing sales in a store?

I total profit

A response variable:

I is usually (in almost every case) a continuous variable

I should be measured in the same manner for all experiments

I should be reproducibly measurable

I measure as many outcomes as you can to avoid repeating experiments later

(20)

Factorial designs: 2 levels for 2 or more factors

I Change multiple factors simultaneously

I Factor: is a variable that we can manipulate/adjust/set

I Consider, for now, two levels in each factor. For example:

I continuous: low and high pH

I continuous: short reaction time and long reaction time

I discrete: catalyst A and B

(21)

Factorial designs: by example

I We will use this system for our example

(22)

Factorial designs: by example

Bioreactor example: aim is to maximize y = conversion [%]

I T: Temperature: Tlow = 338K and Thigh = 354K

I S: Substrate concn: Slow = 1.25 g/L and Shigh = 1.75 g/L

I How is the range chosen?

I About 25% of typical operating range if no other prior knowledge. We will consider other criteria later also.

I Factors are: T and S

I Number of experiments (runs): 2^k; k = number of factors

(23)

Factorial designs: by example

I Run your experiments in random order, collect results:

Notes:

I we don’t need to run an experiment at the baseline (it can be useful though)

I baseline at T = ½(338 + 354) and S = ½(1.25 + 1.75), i.e. baseline at (346K; 1.5g/L) = midpoint of the factorial

I if we had replicate experiments, then use the average of the response variable

(24)

Factorial designs: by example

(25)

Analysis: Main effects

I Main effect: difference from high to low level

I Where would you run your next experiment(s) to improve yield?

(26)

Analysis: Main effects

(27)

Analysis: Main effects

I No computer? Use an interaction plot (see notes for section 1 of the course)

I Lines are roughly parallel in this case

I The numbers “1”, “2”, “3”, “4” refer to the experiment number in standard order

(28)

Analysis: interaction effects

(29)

Analysis: interaction effects

Experiment   T [K]       S [g/L]         y [%]
1            − (390K)    − (0.5 g/L)     77
2            + (400K)    − (0.5 g/L)     79
3            − (390K)    + (1.25 g/L)    81
4            + (400K)    + (1.25 g/L)    89

I Main effect of T:

I Main effect of S:

(30)

Analysis: interaction effects

Experiment   T [K]       S [g/L]         y [%]
1            − (390K)    − (0.5 g/L)     77
2            + (400K)    − (0.5 g/L)     79
3            − (390K)    + (1.25 g/L)    81
4            + (400K)    + (1.25 g/L)    89

I Main effect of T: 5% per 10K; but reported as 2.5% per 5K

I ∆TS+ = 8% per 10K

I ∆TS− = 2% per 10K

I Main effect of S: 7% per 0.75g/L; report 3.5% per 0.375g/L

(31)

Analysis: interaction effects

I Lines not parallel

I Indicates magnitude of effect is not the same at both levels of the variable being held constant

I Implies there is an interaction

I In this case, interaction between T and S

I could also be called the S and T interaction: symmetrical

I called the T × S interaction (or S × T interaction)

I it is a 2-factor interaction (2fi)

(32)

Analysis: interaction effects

Recall system with no interaction (earlier example):

I Main effect of T:

I TS+ = −11% per 16K

I TS− = −9% per 16K

I Main effect of S:

I ST+ = −7% per 0.5g/L

I ST− = −5% per 0.5g/L

(33)

Analysis: interaction effects

System with interaction (second example):

I Main effect of T: 5% per 10K

I TS+ = 8% per 10K

I TS− = 2% per 10K

I Main effect of S: 7% per 0.75g/L

I ST+ = 10% per 0.75g/L

I ST− = 4% per 0.75g/L

I There was an important phenomenon that we did not capture with the main effects alone

I The two estimates of each main effect are quite different.

I We need “something else” to capture this interaction

(34)

Analysis: interaction effects

I T interaction with S:

I ∆y due to T at high S: +8

I ∆y due to T at low S: +2

I The half difference: [+8 − (+2)]/2 = 3

I S interaction with T:

I ∆y due to S at high T: +10

I ∆y due to S at low T: +4

I The half difference: [+10 − (+4)]/2 = 3
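The arithmetic behind the main effects and the interaction can be laid out as a short sketch, using the four responses from the high-interaction table in standard order. The variable names are illustrative only.

```python
# Responses y [%] in standard order for the high-interaction example:
# run 1 = (T-, S-), run 2 = (T+, S-), run 3 = (T-, S+), run 4 = (T+, S+)
y1, y2, y3, y4 = 77, 79, 81, 89

# Effect of T at each level of S, and effect of S at each level of T
T_at_S_low, T_at_S_high = y2 - y1, y4 - y3
S_at_T_low, S_at_T_high = y3 - y1, y4 - y2

# Main effect: average of the two estimates
main_T = (T_at_S_high + T_at_S_low) / 2
main_S = (S_at_T_high + S_at_T_low) / 2

# Interaction: half the difference of the two estimates
interaction_TS = (T_at_S_high - T_at_S_low) / 2
interaction_ST = (S_at_T_high - S_at_T_low) / 2

print(main_T, main_S)                    # per 10 K and per 0.75 g/L
print(interaction_TS, interaction_ST)    # identical: the interaction is symmetrical
```

Both routes to the interaction reduce to (y1 − y2 − y3 + y4)/2, which is why the T × S and S × T values agree.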

Interpretation:

I T andS increase y by a greater amount when both are high

I Similarly, both terms reduce y when they are of opposite sign.

Interaction terms dominate on a ridge, and are important as

(35)

Visualizing the interaction: we are on a ridge

I T interacts with S:

I ∆y due to T at S+: +8

I ∆y due to T at S−: +2

I S interacts with T:

I ∆y due to S at T+: +10

I ∆y due to S at T−: +4

T and S increase y when they both operate together (they are

(36)

Analysis by least squares modelling

I Return to the system with little interaction:

Experiment   T [K]       S [g/L]         y [%]
Baseline     346 K       1.50
1            − (338K)    − (1.25 g/L)    69
2            + (354K)    − (1.25 g/L)    60
3            − (338K)    + (1.75 g/L)    64
4            + (354K)    + (1.75 g/L)    53

I Standard form: (variable − center point) / (range/2)

$$T_- = \frac{338 - 346}{(354 - 338)/2} = \frac{-8}{8} = -1$$

$$S_- = \frac{1.25 - 1.50}{(1.75 - 1.25)/2} = \frac{-0.25}{0.25} = -1$$

I T+ = +1
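The standard-form calculation can be written as a small helper. This is a sketch; `to_coded` is a hypothetical name introduced here, not something from the course notes.

```python
def to_coded(value, low, high):
    """Map a real-world setting to coded units: low -> -1, high -> +1."""
    center = (low + high) / 2
    half_range = (high - low) / 2
    return (value - center) / half_range

# Temperature levels (338 K and 354 K) and the baseline (346 K)
print(to_coded(338, 338, 354), to_coded(354, 338, 354), to_coded(346, 338, 354))

# Substrate levels (1.25 g/L and 1.75 g/L)
print(to_coded(1.25, 1.25, 1.75), to_coded(1.75, 1.25, 1.75))
```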

(37)

Analysis by least squares modelling

Least squares model

$$y = \beta_0 + \beta_T x_T + \beta_S x_S + \beta_{TS} x_T x_S + \varepsilon$$

$$y = b_0 + b_T x_T + b_S x_S + b_{TS} x_T x_S + e$$

I 4 parameters to estimate: b0, bT, bS, bTS

I 4 data points

I Zero degrees of freedom (i.e. SE = 0, no confidence intervals possible)

(38)

Analysis by least squares modelling

Aim: Write out the LS model equation for the 4 data points; stack them as rows in a matrix.

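For example, for the first experiment from the standard order table, which was run at low T and low S:

$$y_1 = b_0 + b_T x_T + b_S x_S + b_{TS} x_T x_S + e_1$$

$$y_1 = b_0 + b_T T_- + b_S S_- + b_{TS} T_- S_- + e_1$$

$$\begin{bmatrix} y_1 \\ y_2 \\ y_3 \\ y_4 \end{bmatrix} =
\begin{bmatrix}
1 & T_- & S_- & T_- S_- \\
1 & T_+ & S_- & T_+ S_- \\
1 & T_- & S_+ & T_- S_+ \\
1 & T_+ & S_+ & T_+ S_+
\end{bmatrix}
\begin{bmatrix} b_0 \\ b_T \\ b_S \\ b_{TS} \end{bmatrix} +
\begin{bmatrix} e_1 \\ e_2 \\ e_3 \\ e_4 \end{bmatrix}$$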

(39)

Visualizing the least squares modelling

I Least squares model for DOE in 2 factors

I Interaction term is small: blue plane is flat

(40)

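Analysis by least squares modelling

$$\begin{bmatrix} y_1 \\ y_2 \\ y_3 \\ y_4 \end{bmatrix} =
\begin{bmatrix}
1 & T_- & S_- & T_- S_- \\
1 & T_+ & S_- & T_+ S_- \\
1 & T_- & S_+ & T_- S_+ \\
1 & T_+ & S_+ & T_+ S_+
\end{bmatrix}
\begin{bmatrix} b_0 \\ b_T \\ b_S \\ b_{TS} \end{bmatrix} +
\begin{bmatrix} e_1 \\ e_2 \\ e_3 \\ e_4 \end{bmatrix}$$

$$\begin{bmatrix} 69 \\ 60 \\ 64 \\ 53 \end{bmatrix} =
\begin{bmatrix}
1 & -1 & -1 & +1 \\
1 & +1 & -1 & -1 \\
1 & -1 & +1 & -1 \\
1 & +1 & +1 & +1
\end{bmatrix}
\begin{bmatrix} b_0 \\ b_T \\ b_S \\ b_{TS} \end{bmatrix} +
\begin{bmatrix} e_1 \\ e_2 \\ e_3 \\ e_4 \end{bmatrix}$$

$$y = Xb + e$$

X matrix is trivial to set up:

I Intercept column: is always a column of 1’s

I xT column: comes directly from standard table

I xS column: comes directly from standard table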

(41)

Analysis by least squares modelling

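I $X^TX = \begin{bmatrix} 4 & 0 & 0 & 0 \\ 0 & 4 & 0 & 0 \\ 0 & 0 & 4 & 0 \\ 0 & 0 & 0 & 4 \end{bmatrix}$

I $X^Ty = \begin{bmatrix} 246 \\ -20 \\ -12 \\ -2 \end{bmatrix}$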

(42)

Analysis by least squares modelling

$$b = (X^TX)^{-1}X^Ty =
\begin{bmatrix} 1/4 & 0 & 0 & 0 \\ 0 & 1/4 & 0 & 0 \\ 0 & 0 & 1/4 & 0 \\ 0 & 0 & 0 & 1/4 \end{bmatrix}
\begin{bmatrix} 246 \\ -20 \\ -12 \\ -2 \end{bmatrix} =
\begin{bmatrix} 61.5 \\ -5 \\ -3 \\ -0.5 \end{bmatrix}$$

I y = b0 + bT xT + bS xS + bTS xT xS + e

I y = 61.5 − 5xT − 3xS − 0.5xTxS + e

You can easily calculate these effects by hand:

I (+69 + 60 + 64 + 53)/4 = 61.5

I (−69 + 60 − 64 + 53)/4 = −5

I (−69 − 60 + 64 + 53)/4 = −3

I (+69 − 60 − 64 + 53)/4 = −0.5
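The same numbers can be reproduced by computer. The sketch below assumes NumPy and follows the matrix steps on these slides for the low-interaction system; it is one possible way to set up the calculation, not the course's own script.

```python
import numpy as np

# Coded factor settings in standard order, and the measured conversions [%]
x_T = np.array([-1, +1, -1, +1])
x_S = np.array([-1, -1, +1, +1])
y = np.array([69, 60, 64, 53], dtype=float)

# Columns of X: intercept, x_T, x_S and the interaction x_T * x_S
X = np.column_stack([np.ones(4), x_T, x_S, x_T * x_S])

XtX = X.T @ X                   # diagonal, because the columns are orthogonal
Xty = X.T @ y
b = np.linalg.solve(XtX, Xty)   # solves (X'X) b = X'y

print(XtX)
print(Xty)
print(b)                        # b0, bT, bS, bTS
print(y - X @ b)                # residuals: zero, since 4 parameters fit 4 points
```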

(43)

Analysis by least squares modelling

1. $X^TX$: zeros on off-diagonals

I the columns of X are orthogonal

I each column is varied independently of the others

I calculate the k-th slope coefficient separately: $b_k = \dfrac{x_k^T y}{x_k^T x_k}$

2. Interpret bT = −5?

I bT is the change in y when the normalized temperature xT changes by 1 unit

I Changing xT from 0 to 1 implies Tactual changes from 346K to 354K (baseline to high level)

I Changing xT from −1 to 0 implies Tactual changes from 338K to 346K (low level to baseline)

I 5% decrease in conversion for every 8K increase in temperature

3. Now interpret bS =−3?

4. How to use this model for a prediction?
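One possible answer to the prediction question, as a sketch: convert the real-world settings to coded units, then evaluate the fitted equation y = 61.5 − 5xT − 3xS − 0.5xTxS. The function name and the interior point (350 K, 1.60 g/L) are hypothetical choices for illustration.

```python
def predict_conversion(T, S):
    """Predicted conversion [%] from the low-interaction model, given T [K] and S [g/L]."""
    x_T = (T - 346.0) / 8.0     # center 346 K, half-range 8 K
    x_S = (S - 1.50) / 0.25     # center 1.50 g/L, half-range 0.25 g/L
    return 61.5 - 5.0 * x_T - 3.0 * x_S - 0.5 * x_T * x_S

print(predict_conversion(338, 1.25))   # corner of the factorial (experiment 1)
print(predict_conversion(346, 1.50))   # baseline: both coded variables are zero
print(predict_conversion(350, 1.60))   # hypothetical point inside the factorial
```

Predictions are most trustworthy inside the region spanned by the experiments; settings far outside it are extrapolations of a model fitted to four points.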

(44)

Analysis by least squares modelling

I The least squares model was y = 61.5 − 5xT − 3xS − 0.5xTxS + e

I The geometric construction was:

(45)

Analysis by least squares modelling

Return to system with high interaction

I Baseline: T = 395K and S = (1.25 + 0.5)/2 = 0.875 g/L

I Calculate deviation variables

I Build the matrices and calculate $b = (X^TX)^{-1}X^Ty$

I Verify at home: y = 81.5 + 2.5xT + 3.5xS + 1.5xTxS

Large interaction is confirmed in the least squares model by the large coefficient of 1.5 on the xTxS term
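The "verify at home" step can be sketched with the column-by-column formula, which applies because the columns of X are orthogonal. NumPy is assumed; the dictionary keys are illustrative labels.

```python
import numpy as np

# Coded settings in standard order and the responses of the high-interaction system
x_T = np.array([-1, +1, -1, +1])
x_S = np.array([-1, -1, +1, +1])
y = np.array([77, 79, 81, 89], dtype=float)

columns = {
    "b0": np.ones(4),
    "bT": x_T,
    "bS": x_S,
    "bTS": x_T * x_S,
}

# With orthogonal columns each coefficient is b_k = (x_k' y) / (x_k' x_k)
b = {name: float(x @ y) / float(x @ x) for name, x in columns.items()}
print(b)
```

Each coefficient is half of the corresponding effect: 2.5 is half of the 5% main effect of T, and 1.5 is half of the interaction value of 3.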

(46)

Analysis by least squares modelling: visualizing it

High interaction system:

Ignoring interaction term:

We are estimating a linear equation using linear least squares.

With interaction term:

(47)

DOE of a 3-factor experiment

Plastics molding factory; waste treatment.

I Factor 1: C: chemical compound added (A or B)

I Factor 2: T: treatment temperature (72F or 100F)

I Factor 3: S: stirring speed (200 rpm or 400 rpm)

I y = amount of pollutant discharged [lb]

I Categorical variables: A = −1 and B = +1 (or vice versa)

(48)

DOE of a 3-factor experiment

Example on the board:

1. Geometric illustration of the data

2. Calculate main effects

3. Calculate the 3 two-factor interactions, and the single 3 factor interaction

I C × T and C × S and T × S and C × T × S

4. Main effects and interactions using least squares (by-hand)

5. Computer verification:

I y = 11.25 + 6.25xC + 0.75xT − 7.25xS + 0.25xCxT
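Setting up the X matrix for three factors follows the same pattern as for two. The sketch below assumes NumPy and builds only the coded design and its interaction columns; it does not include the pollutant measurements, which are not listed on these slides. The function names are hypothetical.

```python
import itertools
import numpy as np

def full_factorial(k):
    """Return the 2^k coded runs in standard order (first factor varies fastest)."""
    # itertools.product varies the last position fastest, so reverse each run
    runs = [levels[::-1] for levels in itertools.product([-1, +1], repeat=k)]
    return np.array(runs)

def model_matrix(design, names):
    """Intercept, main-effect and all interaction columns for a coded design."""
    n_runs, k = design.shape
    labels, columns = ["intercept"], [np.ones(n_runs)]
    for order in range(1, k + 1):
        for idx in itertools.combinations(range(k), order):
            labels.append("*".join(names[i] for i in idx))
            columns.append(np.prod(design[:, list(idx)], axis=1))
    return labels, np.column_stack(columns)

design = full_factorial(3)
labels, X = model_matrix(design, ["C", "T", "S"])

print(design)       # 8 runs: columns are C, T, S
print(labels)       # 1 intercept, 3 main effects, 3 two-factor and 1 three-factor term
print(X.T @ X)      # 8 on the diagonal, zeros elsewhere
```

Because X'X is again diagonal, every coefficient can be found separately as the signed sum of the 8 responses divided by 8.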

(49)

Summary of factorial designs

I Good visual interpretation, even on paper

I Few experiments, but powerful information

I Building blocks for complex designs

I 2^k experiments for k factors

I Each factor is varied independently of the others

I Each factor in model can be interpreted independently

I Least squares model easily derived by hand

I Main effects cannot be interpreted separately from their interactions

I y = b0 + bP xP + bQ xQ + bPQ xP xQ + e

I Sometimes a small effect is desirable: implies y is not sensitive to that factor

(50)

Summary of factorial designs

Much more efficient than changing one single variable at a time (COST)

I COST: cannot estimate interactions

(51)

Review: Change one variable at a time

If COST in a cross shape (experiments 2, 3, 4, 5, 6):

I we cannot estimate interactions

I only a single estimate of each main effect

I can be rescued to a full factorial: e.g. use experiments 2, 3, 6 and add a new point below 2, to the right of 6