Engineering Experimentation II
Lecture 7
Summary of Lecture 6
Regression Model
Linear model coefficients
Model evaluation
Exploit contour and surface plots
Error bars for $2^2$ example
Single factor multiple level
Make sense of your data
Model Linearization
Curve fitting
$R^2$ definition
The hypothesis testing framework
The two-sample t-test
Checking assumptions, validity
Comparing more than two factor levels: the analysis of variance
ANOVA decomposition of total variability
Statistical testing & analysis
Checking assumptions, model validity
Post-ANOVA testing of means
The Hypothesis Testing Framework
Statistical hypothesis testing is a useful framework for many experimental situations
Origins of the methodology date from the early 1900s
We will use a procedure known as the two-sample t-test
The Hypothesis Testing Framework
Sampling from a normal distribution
Statistical hypotheses:
$H_0: \mu_1 = \mu_2$
$H_1: \mu_1 \neq \mu_2$
Estimation of Parameters
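$\bar{y} = \frac{1}{n} \sum_{i=1}^{n} y_i$ estimates the population mean $\mu$
$S^2 = \frac{1}{n-1} \sum_{i=1}^{n} (y_i - \bar{y})^2$ estimates the variance $\sigma^2$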
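The two estimators can be sketched in a few lines of Python. The measurements below are hypothetical values chosen only for illustration; they are not data from the lecture or the textbook.

```python
import statistics

# Hypothetical measurements (illustrative only, not data from the lecture)
y = [16.9, 16.4, 17.2, 16.4, 16.5]

n = len(y)
y_bar = sum(y) / n                                  # sample mean, estimates mu
s2 = sum((yi - y_bar) ** 2 for yi in y) / (n - 1)   # sample variance, divisor n - 1

# The standard library uses the same n - 1 definition of the sample variance
assert abs(y_bar - statistics.mean(y)) < 1e-12
assert abs(s2 - statistics.variance(y)) < 1e-12

print(y_bar, s2)
```

The divisor n - 1 (rather than n) is what makes $S^2$ an unbiased estimate of $\sigma^2$.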
Summary Statistics (pg. 36)
Modified Mortar: $\bar{y}_1 = 16.76$, $S_1^2 = 0.100$, $S_1 = 0.316$, $n_1 = 10$
Unmodified Mortar (“Original recipe”): $\bar{y}_2 = 17.04$, $S_2^2 = 0.061$, $S_2 = 0.248$, $n_2 = 10$
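How the Two-Sample t-Test Works:
Use the sample means to draw inferences about the population means
$\bar{y}_1 - \bar{y}_2 = 16.76 - 17.04 = -0.28$
The variance of a sample mean is $\sigma_{\bar{y}}^2 = \sigma^2 / n$, so the standard deviation of the difference in sample means is $\sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}$
This suggests a statistic:
$Z_0 = \frac{\bar{y}_1 - \bar{y}_2}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}}$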
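How the Two-Sample t-Test Works:
Use $S_1^2$ and $S_2^2$ to estimate $\sigma_1^2$ and $\sigma_2^2$
The previous ratio becomes $\frac{\bar{y}_1 - \bar{y}_2}{\sqrt{\frac{S_1^2}{n_1} + \frac{S_2^2}{n_2}}}$
However, we have the case where $\sigma_1^2 = \sigma_2^2 = \sigma^2$
Pool the individual sample variances:
$S_p^2 = \frac{(n_1 - 1) S_1^2 + (n_2 - 1) S_2^2}{n_1 + n_2 - 2}$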
How the Two-Sample t-Test Works:
The test statistic is
$t_0 = \frac{\bar{y}_1 - \bar{y}_2}{S_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}$
Values of $t_0$ that are near zero are consistent with the null hypothesis
Values of $t_0$ that are very different from zero are consistent with the alternative hypothesis
$t_0$ is a “distance” measure: how far apart the averages are, expressed in standard deviation units
The Two-Sample (Pooled) t-Test
$S_p^2 = \frac{(n_1 - 1) S_1^2 + (n_2 - 1) S_2^2}{n_1 + n_2 - 2} = \frac{9(0.100) + 9(0.061)}{10 + 10 - 2} = 0.081$
$S_p = 0.284$
$t_0 = \frac{\bar{y}_1 - \bar{y}_2}{S_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}} = \frac{16.76 - 17.04}{0.284 \sqrt{\frac{1}{10} + \frac{1}{10}}} = -2.20$
The two sample means are a little over two standard deviations apart
Is this a "large" difference?
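The pooled calculation can be reproduced from the summary statistics alone. The sketch below uses the mortar values quoted in the slides; the function name `pooled_t` is a hypothetical helper introduced here for illustration.

```python
import math

def pooled_t(y1_bar, s1_sq, n1, y2_bar, s2_sq, n2):
    """Return (Sp^2, t0) for the pooled two-sample t-test from summary statistics."""
    # Pooled variance: weighted average of the two sample variances
    sp_sq = ((n1 - 1) * s1_sq + (n2 - 1) * s2_sq) / (n1 + n2 - 2)
    # Difference in means divided by its estimated standard deviation
    t0 = (y1_bar - y2_bar) / (math.sqrt(sp_sq) * math.sqrt(1 / n1 + 1 / n2))
    return sp_sq, t0

# Summary statistics for the modified and unmodified mortar
sp_sq, t0 = pooled_t(16.76, 0.100, 10, 17.04, 0.061, 10)
print(sp_sq, math.sqrt(sp_sq), t0)
```

Carrying full precision gives a value of $t_0$ close to -2.21; the -2.20 shown on the slide comes from rounding $S_p$ to 0.284 before dividing.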
The Two-Sample (Pooled) t-Test
So far, we haven’t really done any “statistics”
We need an objective basis for deciding how large the test statistic $t_0$ really is
In 1908, W. S. Gosset derived the reference distribution for $t_0$, called the $t$ distribution
Tables of the t distribution
For this example, $t_0 = -2.20$
The Two-Sample (Pooled) t-Test
A value of $t_0$ between –2.101 and 2.101 is consistent with equality of means
A value of $t_0$ outside this range (above 2.101 or below –2.101) indicates a significant difference in means
Could also use the P-value approach
The Two-Sample (Pooled) t-Test
$t_0 = -2.20$
The P-value is the risk of wrongly rejecting the null hypothesis of equal means (it measures the rareness of the event)
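As a rough illustration of the P-value approach, the two-sided P-value for $t_0 = -2.20$ with $10 + 10 - 2 = 18$ degrees of freedom can be approximated by integrating the t density numerically. This is a sketch using only the Python standard library; the helper names (`t_pdf`, `t_cdf`, `two_sided_p`) and the choice of Simpson's rule are assumptions made here, not the method used in the lecture, which relies on tables or statistical software.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def t_cdf(x, df, steps=2000):
    """CDF by Simpson's rule on [0, |x|], using symmetry about zero (steps must be even)."""
    a = abs(x)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * t_pdf(k * h, df)
    area = total * h / 3                      # P(0 < T < |x|)
    return 0.5 + area if x >= 0 else 0.5 - area

def two_sided_p(t0, df):
    """Probability of a |t| at least as large as |t0| when the null hypothesis is true."""
    return 2 * (1 - t_cdf(abs(t0), df))

df = 10 + 10 - 2
p_value = two_sided_p(-2.20, df)
print(p_value)
```

A P-value below 0.05 agrees with the table-based conclusion, since $|t_0| = 2.20$ lies just outside the interval from –2.101 to 2.101.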
Importance of the t-Test
Provides an objective framework for simple comparative experiments
Could be used to test all relevant hypotheses in a two-level factorial design, because all of these hypotheses involve the mean response at one “side” of the cube versus the mean response at the opposite “side” of the cube
What If There Are More Than Two Factor Levels?
The t-test does not directly apply
There are lots of practical situations where there are either more
than two levels of interest, or there are several factors of simultaneous interest
The analysis of variance (ANOVA) is the appropriate analysis
“engine” for these types of experiments – Chapter 3, textbook
The ANOVA was developed by Fisher in the early 1920s, and initially applied to agricultural experiments
An Example (See pg. 60)
An engineer is interested in investigating the relationship between RF power and etch rate. The objective of an experiment like this is to model the relationship between etch rate and RF power, and to specify the power setting.
The response variable is etch rate.
She is interested in a particular gas (C2F6) and gap (0.80 cm), and wants to test four levels of RF power: 160W, 180W, 200W, and 220W. She decided to test five wafers at each level of RF power.
The experimenter chooses 4 levels of RF power: 160W, 180W, 200W, and 220W
The experiment is replicated 5 times, with runs made in random order
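A random run order for this experiment (4 power levels, 5 wafers each, 20 runs in total) could be generated as sketched below. The function name `crd_run_order` and the seed value are hypothetical choices for illustration; any sound randomization method serves the same purpose.

```python
import random

def crd_run_order(levels, replicates, seed=None):
    """Return the treatment level assigned to each run, in randomized order."""
    # One entry per experimental run: each level repeated `replicates` times
    runs = [level for level in levels for _ in range(replicates)]
    rng = random.Random(seed)   # a fixed seed makes the order reproducible
    rng.shuffle(runs)
    return runs

order = crd_run_order([160, 180, 200, 220], replicates=5, seed=1)
for run, power in enumerate(order, start=1):
    print(f"run {run:2d}: {power} W")
```

Randomizing the order helps keep unknown nuisance effects (tool drift, warm-up, operator changes) from lining up with one power level.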
An Example (See pg. 62)
Does changing the power change the mean etch rate?
Is there an optimum level for power?
The Analysis of Variance (Sec. 3-2, pg. 63)
In general, there will be $a$ levels of the factor, or $a$ treatments, and $n$ replicates of the experiment, run in random order…a completely randomized design (CRD)
N = an total runs
… will be discussed later
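The Analysis of Variance
The name “analysis of variance” stems from a partitioning of the total variability into components that are consistent with a model for the experiment
The basic single-factor ANOVA model is
$y_{ij} = \mu + \tau_i + \varepsilon_{ij}, \quad i = 1, 2, \ldots, a, \quad j = 1, 2, \ldots, n$
$\mu$ = an overall mean, $\tau_i$ = $i$th treatment effect, $\varepsilon_{ij}$ = experimental error, assumed NID$(0, \sigma^2)$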
Models for the Data
There are several ways to write a model for the data:
$y_{ij} = \mu + \tau_i + \varepsilon_{ij}$ is called the effects model
Let $\mu_i = \mu + \tau_i$; then $y_{ij} = \mu_i + \varepsilon_{ij}$ is called the means model
The Analysis of Variance
Total variability is measured by the total sum of squares:
$SS_T = \sum_{i=1}^{a} \sum_{j=1}^{n} (y_{ij} - \bar{y}_{..})^2$
The basic ANOVA partitioning is:
$\sum_{i=1}^{a} \sum_{j=1}^{n} (y_{ij} - \bar{y}_{..})^2 = \sum_{i=1}^{a} \sum_{j=1}^{n} [(\bar{y}_{i.} - \bar{y}_{..}) + (y_{ij} - \bar{y}_{i.})]^2$
$= n \sum_{i=1}^{a} (\bar{y}_{i.} - \bar{y}_{..})^2 + \sum_{i=1}^{a} \sum_{j=1}^{n} (y_{ij} - \bar{y}_{i.})^2$
$SS_T = SS_{Treatments} + SS_E$
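The partitioning can be checked numerically. The sketch below uses a small hypothetical data set (a = 3 treatments, n = 4 replicates); the values are invented for illustration and are not the etch rate data from the text.

```python
# Hypothetical data: a = 3 treatments (rows), n = 4 replicates each (illustrative only)
data = [
    [12.0, 14.0, 13.0, 15.0],
    [18.0, 17.0, 19.0, 18.0],
    [22.0, 21.0, 24.0, 23.0],
]

def ss_partition(data):
    """Return (SS_T, SS_Treatments, SS_E) for a balanced single-factor experiment."""
    a, n = len(data), len(data[0])
    grand_mean = sum(sum(row) for row in data) / (a * n)
    trt_means = [sum(row) / n for row in data]
    # Total: every observation against the grand mean
    ss_total = sum((y - grand_mean) ** 2 for row in data for y in row)
    # Between treatments: treatment means against the grand mean, weighted by n
    ss_trt = n * sum((m - grand_mean) ** 2 for m in trt_means)
    # Within treatments: observations against their own treatment mean
    ss_error = sum((y - m) ** 2 for row, m in zip(data, trt_means) for y in row)
    return ss_total, ss_trt, ss_error

ss_total, ss_trt, ss_error = ss_partition(data)
print(ss_total, ss_trt, ss_error)
```

The cross-product term vanishes because deviations from a treatment mean sum to zero within each treatment, which is why the two pieces add up exactly to the total.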
The Analysis of Variance
$SS_T = SS_{Treatments} + SS_E$
A large value of $SS_{Treatments}$ reflects large differences in treatment means
A small value of $SS_{Treatments}$ likely indicates no differences in treatment means
Formal statistical hypotheses are:
$H_0: \mu_1 = \mu_2 = \cdots = \mu_a$
$H_1$: At least one mean is different
The Analysis of Variance
While sums of squares cannot be directly compared to test the hypothesis of equal means, mean squares can be compared.
A mean square is a sum of squares divided by its degrees of freedom:
$df_{Total} = df_{Treatments} + df_{Error}$
$an - 1 = (a - 1) + a(n - 1)$
$MS_{Treatments} = \frac{SS_{Treatments}}{a - 1}, \quad MS_E = \frac{SS_E}{a(n - 1)}$
If the treatment means are equal, the treatment and error mean squares will be (theoretically) equal.
If treatment means differ, the treatment mean square will be larger than the error mean square.
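The step from sums of squares to mean squares, and on to the usual test statistic $F_0 = MS_{Treatments} / MS_E$, can be sketched as follows. The sums of squares passed in are hypothetical numbers for a design with a = 3 and n = 4, and `one_way_mean_squares` is an illustrative helper name.

```python
def one_way_mean_squares(ss_treatments, ss_error, a, n):
    """Return (MS_Treatments, MS_E, F0) for a balanced single-factor ANOVA."""
    df_treatments = a - 1
    df_error = a * (n - 1)
    # Degrees of freedom partition the same way the sums of squares do
    assert df_treatments + df_error == a * n - 1
    ms_treatments = ss_treatments / df_treatments
    ms_error = ss_error / df_error
    return ms_treatments, ms_error, ms_treatments / ms_error

# Hypothetical sums of squares (illustrative only)
ms_trt, ms_e, f0 = one_way_mean_squares(ss_treatments=162.0, ss_error=12.0, a=3, n=4)
print(ms_trt, ms_e, f0)
```

A ratio near 1 is what equal treatment means would produce; a ratio far above 1 points to real differences between treatments.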
Analysis of Variance: Summarized
Computing…see text, pp 66-70
The reference distribution for $F_0$ is the $F_{a-1,\,a(n-1)}$ distribution
Reject the null hypothesis (equal treatment means) if $F_0 > F_{\alpha,\,a-1,\,a(n-1)}$
ANOVA calculations are usually done via
computer
Calculations can be done on Minitab, NCSS, Excel,
Matlab, Scilab, …etc
Model Adequacy Checking in the ANOVA
Text reference, Section 3-4, pg. 75
Checking assumptions is important
Normality
Constant variance
Independence
Have we fit the right model?
Later we will talk about what to do if some of these
assumptions are violated
Model Adequacy Checking in the ANOVA
Examination of residuals (see text, Sec. 3-4, pg. 75)
$e_{ij} = y_{ij} - \hat{y}_{ij} = y_{ij} - \bar{y}_{i.}$
NCSS generates the residuals
Residual plots are very useful
Normal probability plot of residuals
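Residuals and the coordinates of a normal probability plot can be computed without a statistics package. The sketch below uses hypothetical data and the plotting position $(k - 0.5)/N$, which is one common convention and an assumption here; NCSS or Minitab may use a slightly different one.

```python
from statistics import NormalDist, mean

# Hypothetical observations grouped by treatment (illustrative only)
groups = {
    "A": [12.0, 14.0, 13.0, 15.0],
    "B": [18.0, 17.0, 19.0, 18.0],
    "C": [22.0, 21.0, 24.0, 23.0],
}

def residuals(groups):
    """e_ij = y_ij - ybar_i. : each observation minus its own treatment mean."""
    out = []
    for ys in groups.values():
        m = mean(ys)
        out.extend(y - m for y in ys)
    return out

def normal_plot_points(res):
    """Pair each ordered residual with a standard normal quantile."""
    ordered = sorted(res)
    total = len(ordered)
    std_normal = NormalDist()
    return [(std_normal.inv_cdf((k - 0.5) / total), e)
            for k, e in enumerate(ordered, start=1)]

res = residuals(groups)
points = normal_plot_points(res)
for q, e in points:
    print(f"{q:7.3f}  {e:6.2f}")
```

If the normality assumption is reasonable, the points fall close to a straight line when the residual is plotted against the normal quantile.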
Post-ANOVA Comparison of Means
The analysis of variance tests the hypothesis of equal treatment means
Assume that residual analysis is satisfactory
If that hypothesis is rejected, we don’t know which specific means are different
Determining which specific means differ following an ANOVA is called the multiple comparisons problem
There are lots of ways to do this…see text, Section 3-5, pg. 87
We will use pairwise t-tests on means…sometimes called Fisher’s Least Significant Difference (LSD) method
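A minimal sketch of the pairwise comparison idea for a balanced design: two treatment means are declared different when their absolute difference exceeds $LSD = t_{\alpha/2,\,N-a} \sqrt{2\,MS_E / n}$. The treatment means, $MS_E$ and replicate count below are hypothetical, and the critical value 2.262 is the tabulated $t_{0.025,\,9}$ for that hypothetical design; `fisher_lsd` is an illustrative helper name.

```python
import math
from itertools import combinations

def fisher_lsd(means, ms_error, n, t_crit):
    """Return (LSD, {pair: True if the two means differ significantly})."""
    lsd = t_crit * math.sqrt(2 * ms_error / n)
    verdicts = {}
    for (name_i, mean_i), (name_j, mean_j) in combinations(means.items(), 2):
        verdicts[(name_i, name_j)] = abs(mean_i - mean_j) > lsd
    return lsd, verdicts

# Hypothetical results from a single-factor ANOVA with a = 3, n = 4 (N - a = 9 error df)
means = {"A": 13.5, "B": 18.0, "C": 19.0}
lsd, verdicts = fisher_lsd(means, ms_error=12.0 / 9.0, n=4, t_crit=2.262)

print(f"LSD = {lsd:.3f}")
for pair, differs in verdicts.items():
    print(pair, "different" if differs else "not distinguishable")
```

Because each pair is tested at level alpha separately, the chance of at least one false difference grows with the number of pairs, which is the heart of the multiple comparisons problem.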
Two-Factor Multiple Levels Experiment
Extension of the ANOVA to Factorials
$\sum_{i=1}^{a} \sum_{j=1}^{b} \sum_{k=1}^{n} (y_{ijk} - \bar{y}_{...})^2 = bn \sum_{i=1}^{a} (\bar{y}_{i..} - \bar{y}_{...})^2 + an \sum_{j=1}^{b} (\bar{y}_{.j.} - \bar{y}_{...})^2$
$\quad + n \sum_{i=1}^{a} \sum_{j=1}^{b} (\bar{y}_{ij.} - \bar{y}_{i..} - \bar{y}_{.j.} + \bar{y}_{...})^2 + \sum_{i=1}^{a} \sum_{j=1}^{b} \sum_{k=1}^{n} (y_{ijk} - \bar{y}_{ij.})^2$
$SS_T = SS_A + SS_B + SS_{AB} + SS_E$
df breakdown:
$abn - 1 = (a - 1) + (b - 1) + (a - 1)(b - 1) + ab(n - 1)$
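The two-factor decomposition can be verified on a small example. The sketch below uses hypothetical data for a = 2, b = 2 and n = 2; the numbers and the function name `two_factor_ss` are illustrative assumptions, not the lecture's data or software output.

```python
# y[i][j] holds the n replicates at level i of factor A and level j of factor B
# (hypothetical values, illustrative only)
y = [
    [[10.0, 12.0], [20.0, 22.0]],
    [[15.0, 17.0], [35.0, 37.0]],
]

def two_factor_ss(y):
    """Return the sums of squares T, A, B, AB and E for a balanced two-factor design."""
    a, b, n = len(y), len(y[0]), len(y[0][0])
    grand = sum(v for row in y for c in row for v in c) / (a * b * n)
    cell = [[sum(c) / n for c in row] for row in y]                      # ybar_ij.
    a_mean = [sum(cell[i]) / b for i in range(a)]                        # ybar_i..
    b_mean = [sum(cell[i][j] for i in range(a)) / a for j in range(b)]   # ybar_.j.
    ss_a = b * n * sum((m - grand) ** 2 for m in a_mean)
    ss_b = a * n * sum((m - grand) ** 2 for m in b_mean)
    ss_ab = n * sum((cell[i][j] - a_mean[i] - b_mean[j] + grand) ** 2
                    for i in range(a) for j in range(b))
    ss_e = sum((v - cell[i][j]) ** 2
               for i in range(a) for j in range(b) for v in y[i][j])
    ss_t = sum((v - grand) ** 2 for row in y for c in row for v in c)
    return {"T": ss_t, "A": ss_a, "B": ss_b, "AB": ss_ab, "E": ss_e}

ss = two_factor_ss(y)
print(ss)

# Degrees of freedom follow the same breakdown
a, b, n = 2, 2, 2
assert a * b * n - 1 == (a - 1) + (b - 1) + (a - 1) * (b - 1) + a * b * (n - 1)
```

The interaction term measures how far each cell mean departs from what the two main effects alone would predict.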
NCSS and Minitab will perform the computations
Text gives details of manual computing – see pp.
Analysis of Variance Table
Source Term        DF   Sum of Squares   Mean Square   F-Ratio   Prob Level   Power (Alpha=0.05)
A: C2               2   900801.2         450400.6      2563.41   0.000000*    1.000000
B: C3               2   420599.2         210299.6      1196.90   0.000000*    1.000000
AB                  4   809992.1         202498        1152.50   0.000000*    1.000000
S                  18   3162.667         175.7037
Total (Adjusted)   26   2134555
Total              27
* Term significant at alpha = 0.05

Means and Effects Section
Term         Count   Mean       Standard Error   Effect
All          27      478.2592                    478.2592
A: C2
  1          9       468.7778   4.418442         -9.481482
  2          9       706.5555   4.418442         228.2963
  3          (values not legible in source)
B: C3
  1          9       305.4445   4.418442         -172.8148
  2          9       595.7778   4.418442         117.5185
  3          9       533.5555   4.418442         55.2963
AB: C2,C3
  1,1        3       16.33333   7.652967         -279.6296
  1,2        3       796.6667   7.652967         210.3704
  1,3        3       593.3333   7.652967         69.25926
  2,1        (values not legible in source)
  2,2        3       708        7.652967         -116.0741
  2,3        3       873        7.652967         111.1481
  3,1        3       361.3333   7.652967         274.7037
  3,2        3       282.6667   7.652967         -94.2963
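The arithmetic behind the NCSS output can be checked directly from the reported sums of squares and degrees of freedom. In this sketch the row labelled S is taken to be the error term, which is an assumption based on its position in the table and on the fact that the reported F-ratios are reproduced when it is used as the denominator.

```python
# Values transcribed from the Analysis of Variance Table
table = {
    "A: C2": {"df": 2, "ss": 900801.2},
    "B: C3": {"df": 2, "ss": 420599.2},
    "AB":    {"df": 4, "ss": 809992.1},
    "S":     {"df": 18, "ss": 3162.667},   # assumed to be the error term
}
total_adjusted = {"df": 26, "ss": 2134555}

# Mean square = sum of squares / degrees of freedom
ms_error = table["S"]["ss"] / table["S"]["df"]

f_ratios = {}
for term in ("A: C2", "B: C3", "AB"):
    ms = table[term]["ss"] / table[term]["df"]
    f_ratios[term] = ms / ms_error          # F-ratio = MS_term / MS_error
    print(f"{term:6s} MS = {ms:10.1f}  F = {f_ratios[term]:8.2f}")

# The component sums of squares and df add up to the adjusted total
ss_sum = sum(row["ss"] for row in table.values())
df_sum = sum(row["df"] for row in table.values())
print(ss_sum, df_sum)

# Level-1 mean of factor A equals the average of its three cell means
a1_mean = (16.33333 + 796.6667 + 593.3333) / 3
print(a1_mean)
```

With 3 levels of each factor and 3 replicates per cell, the degrees of freedom follow the breakdown 26 = 2 + 2 + 4 + 18.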
Factorials with More Than Two Factors
Basic procedure is similar to the two-factor case; all treatment combinations are run in random order
ANOVA identity is also similar:
$SS_T = SS_A + SS_B + \cdots + SS_{AB} + SS_{AC} + \cdots + SS_{ABC} + \cdots + SS_{AB \cdots K} + SS_E$
Complete three-factor example in text, Example 5-5