ANOVA Lecture

(1)

Engineering Experimentation II

Lecture 7

(2)

Summary of Lecture 6

• Regression Model
• Linear model coefficients
• Model evaluation
• Exploit contour and surface plots
• Error bars for $2^2$ example
• Single factor multiple level
• Make sense of your data
• Model Linearization
• Curve fitting
• $R^2$ definition



(4)

• The hypothesis testing framework
• The two-sample t-test
• Checking assumptions, validity
• Comparing more than two factor levels… the analysis of variance
• ANOVA decomposition of total variability
• Statistical testing & analysis
• Checking assumptions, model validity
• Post-ANOVA testing of means



(5)


(6)

(7)

(8)

The Hypothesis Testing Framework



• Statistical hypothesis testing is a useful framework for many experimental situations
• Origins of the methodology date from the early 1900s
• We will use a procedure known as the two-sample t-test

(9)

The Hypothesis Testing Framework

• Sampling from a normal distribution
• Statistical hypotheses:

$H_0: \mu_1 = \mu_2$
$H_1: \mu_1 \neq \mu_2$

(10)

Estimation of Parameters

$\bar{y} = \frac{1}{n}\sum_{i=1}^{n} y_i$ estimates the population mean $\mu$

$S^2 = \frac{1}{n-1}\sum_{i=1}^{n} (y_i - \bar{y})^2$ estimates the variance $\sigma^2$
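The two estimators can be sketched in a few lines of Python. This is a minimal illustration, assuming a plain list of observations; the function names and the four data values are invented for the example and do not come from the lecture.

```python
from math import sqrt


def sample_mean(y):
    """ybar = (1/n) * sum(y_i): estimates the population mean."""
    return sum(y) / len(y)


def sample_variance(y):
    """S^2 = sum((y_i - ybar)^2) / (n - 1): estimates the variance."""
    ybar = sample_mean(y)
    return sum((yi - ybar) ** 2 for yi in y) / (len(y) - 1)


# Hypothetical observations, used only to exercise the formulas.
y = [2.0, 4.0, 4.0, 6.0]

ybar = sample_mean(y)
s_sq = sample_variance(y)
s = sqrt(s_sq)  # sample standard deviation

print(f"ybar = {ybar}, S^2 = {s_sq:.4f}, S = {s:.4f}")
```

Dividing by n - 1 rather than n is what makes $S^2$ an unbiased estimate of $\sigma^2$.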

(11)

Summary Statistics (pg. 36)

Modified Mortar: $\bar{y}_1 = 16.76$, $S_1^2 = 0.100$, $S_1 = 0.316$, $n_1 = 10$

Unmodified Mortar ("Original recipe"): $\bar{y}_2 = 17.04$, $S_2^2 = 0.061$, $S_2 = 0.248$, $n_2 = 10$

(12)

How the Two-Sample t-Test Works:

Use the sample means to draw inferences about the population means:

$\bar{y}_1 - \bar{y}_2 = 16.76 - 17.04 = -0.28$

Standard deviation of the difference in sample means:

$\sigma_{\bar{y}}^2 = \frac{\sigma^2}{n}$

This suggests a statistic:

$Z_0 = \frac{\bar{y}_1 - \bar{y}_2}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}}$

(13)

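How the Two-Sample t-Test Works:

Use $S_1^2$ and $S_2^2$ to estimate $\sigma_1^2$ and $\sigma_2^2$. The previous ratio becomes

$\frac{\bar{y}_1 - \bar{y}_2}{\sqrt{\frac{S_1^2}{n_1} + \frac{S_2^2}{n_2}}}$

However, we have the case where $\sigma_1^2 = \sigma_2^2 = \sigma^2$.

Pool the individual sample variances:

$S_p^2 = \frac{(n_1 - 1)S_1^2 + (n_2 - 1)S_2^2}{n_1 + n_2 - 2}$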

(14)

How the Two-Sample t-Test Works:

The test statistic is

$t_0 = \frac{\bar{y}_1 - \bar{y}_2}{S_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}$

• Values of $t_0$ that are near zero are consistent with the null hypothesis
• Values of $t_0$ that are very different from zero are consistent with the alternative hypothesis
• $t_0$ is a "distance" measure: how far apart the averages are, expressed in standard deviation units

(15)

The Two-Sample (Pooled) t-Test

$S_p^2 = \frac{(n_1 - 1)S_1^2 + (n_2 - 1)S_2^2}{n_1 + n_2 - 2} = \frac{9(0.100) + 9(0.061)}{10 + 10 - 2} = 0.081$

$S_p = 0.284$

$t_0 = \frac{\bar{y}_1 - \bar{y}_2}{S_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}} = \frac{16.76 - 17.04}{0.284\sqrt{\frac{1}{10} + \frac{1}{10}}} = -2.20$

The two sample means are a little over two standard deviations apart. Is this a "large" difference?
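The arithmetic on this slide can be reproduced from the summary statistics alone. The sketch below assumes only the mortar values quoted in the lecture (means 16.76 and 17.04, variances 0.100 and 0.061, ten observations each); the function name `pooled_t` is made up for the illustration.

```python
from math import sqrt


def pooled_t(ybar1, s1_sq, n1, ybar2, s2_sq, n2):
    """Return (Sp^2, Sp, t0, df) for the pooled two-sample t-test."""
    df = n1 + n2 - 2
    # Pooled variance: weighted average of the two sample variances.
    sp_sq = ((n1 - 1) * s1_sq + (n2 - 1) * s2_sq) / df
    sp = sqrt(sp_sq)
    # Difference in means, in units of its estimated standard deviation.
    t0 = (ybar1 - ybar2) / (sp * sqrt(1 / n1 + 1 / n2))
    return sp_sq, sp, t0, df


sp_sq, sp, t0, df = pooled_t(16.76, 0.100, 10, 17.04, 0.061, 10)
print(f"Sp^2 = {sp_sq:.4f}, Sp = {sp:.4f}, t0 = {t0:.4f}, df = {df}")
```

Carrying full precision gives a $t_0$ of roughly -2.21; the slide's -2.20 comes from rounding $S_p$ to 0.284 before dividing. Either way the statistic sits a little more than two standard deviations from zero, with 18 degrees of freedom.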

(16)

The Two-Sample (Pooled) t-Test

• So far, we haven't really done any "statistics"
• We need an objective basis for deciding how large the test statistic $t_0$ really is
• In 1908, W. S. Gosset derived the reference distribution for $t_0$… the t distribution
• Tables of the t distribution

[Figure: reference distribution with $t_0 = -2.20$ marked]

(17)

The Two-Sample (Pooled) t-Test

• A value of $t_0$ between –2.101 and 2.101 is consistent with equality of means
• A value of $t_0$ outside that range (above 2.101 or below –2.101) indicates a significant difference in means
• Could also use the P-value approach

(18)

The Two-Sample (Pooled) t-Test

[Figure: reference distribution with $t_0 = -2.20$ marked]

• The P-value is the risk of wrongly rejecting the null hypothesis of equal means (it measures rareness of the event)
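Both the ±2.101 cutoff and the P-value can be checked numerically. The sketch below is one possible way to do it using only the Python standard library: it integrates the Student t density with Simpson's rule rather than calling a statistics package. The function names, the step count, and the choice of a two-sided P-value with 18 degrees of freedom are assumptions made for this illustration.

```python
from math import gamma, pi, sqrt


def t_pdf(x, df):
    """Density of the Student t distribution with df degrees of freedom."""
    c = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)


def t_cdf(x, df, steps=2000):
    """P(T <= x), using composite Simpson's rule on [0, |x|] plus symmetry."""
    a = abs(x)
    h = a / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * t_pdf(k * h, df)
    area = total * h / 3
    return 0.5 + area if x >= 0 else 0.5 - area


def two_sided_p(t0, df):
    """Probability of a statistic at least as extreme as t0 when H0 is true."""
    return 2 * (1 - t_cdf(abs(t0), df))


# The tabled cutoff 2.101 should leave about 2.5% in each tail for 18 df.
upper_tail = 1 - t_cdf(2.101, 18)
p_value = two_sided_p(-2.20, 18)
print(f"upper tail beyond 2.101: {upper_tail:.4f}")
print(f"two-sided P-value for t0 = -2.20: {p_value:.4f}")
```

Because $t_0 = -2.20$ falls just outside ±2.101, the P-value comes out a little below 0.05, so the two approaches agree.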

(19)

(20)

Importance of the t-Test

• Provides an objective framework for simple comparative experiments
• Could be used to test all relevant hypotheses in a two-level factorial design, because all of these hypotheses involve the mean response at one "side" of the cube versus the mean response at the opposite "side" of the cube

(21)

What If There Are More Than Two Factor Levels?

• The t-test does not directly apply
• There are lots of practical situations where there are either more than two levels of interest, or there are several factors of simultaneous interest
• The analysis of variance (ANOVA) is the appropriate analysis "engine" for these types of experiments – Chapter 3, textbook
• The ANOVA was developed by Fisher in the early 1920s, and initially applied to agricultural experiments

(22)

An Example (See pg. 60)

• An engineer is interested in investigating the relationship between RF power and etch rate. The objective of an experiment like this is to model the relationship between etch rate and RF power, and to specify the power setting.
• The response variable is etch rate.
• She is interested in a particular gas (C2F6) and gap (0.80 cm), and wants to test four levels of RF power: 160 W, 180 W, 200 W, and 220 W. She decided to test five wafers at each level of RF power.
• The runs are made in random order.

(23)

An Example (See pg. 62)

• Does changing the power change the mean etch rate?
• Is there an optimum level of power?

(24)

The Analysis of Variance (Sec. 3-2, pg. 63)

• In general, there will be a levels of the factor, or a treatments, and n replicates of the experiment, run in random order… a completely randomized design (CRD)
• N = an total runs
• … will be discussed later

(25)

The Analysis of Variance

• The name "analysis of variance" stems from a partitioning of the total variability in the response into components that are consistent with a model for the experiment
• The basic single-factor ANOVA model is

$y_{ij} = \mu + \tau_i + \varepsilon_{ij}, \quad i = 1, 2, \ldots, a, \quad j = 1, 2, \ldots, n$

$\mu$ = an overall mean, $\tau_i$ = ith treatment effect, $\varepsilon_{ij}$ = experimental error

(26)

Models for the Data

There are several ways to write a model for the data:

$y_{ij} = \mu + \tau_i + \varepsilon_{ij}$ is called the effects model

With $\mu_i = \mu + \tau_i$, $y_{ij} = \mu_i + \varepsilon_{ij}$ is called the means model
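The link between the two models can be made concrete by estimating their parameters from data. In the sketch below, the three treatment labels and all nine observations are hypothetical values invented for illustration; the estimates used are the usual ones for a balanced design (grand mean for $\mu$, treatment mean for $\mu_i$, and their difference for $\tau_i$).

```python
# Hypothetical balanced data: a = 3 treatments, n = 3 replicates each.
data = {
    "A": [10.0, 12.0, 11.0],
    "B": [14.0, 15.0, 16.0],
    "C": [20.0, 19.0, 21.0],
}

all_obs = [y for ys in data.values() for y in ys]

# Effects model: overall mean plus one effect per treatment.
mu_hat = sum(all_obs) / len(all_obs)

# Means model: one mean per treatment.
mu_i_hat = {level: sum(ys) / len(ys) for level, ys in data.items()}

# The two are linked by mu_i = mu + tau_i.
tau_hat = {level: m - mu_hat for level, m in mu_i_hat.items()}

print(f"overall mean: {mu_hat:.4f}")
for level in data:
    print(f"{level}: mean = {mu_i_hat[level]:.4f}, effect = {tau_hat[level]:+.4f}")
```

In a balanced design the estimated effects sum to zero, which is why the effects model needs only a - 1 free treatment parameters.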

(27)

The Analysis of Variance

• Total variability is measured by the total sum of squares:

$SS_T = \sum_{i=1}^{a}\sum_{j=1}^{n} (y_{ij} - \bar{y}_{..})^2$

• The basic ANOVA partitioning is:

$\sum_{i=1}^{a}\sum_{j=1}^{n} (y_{ij} - \bar{y}_{..})^2 = \sum_{i=1}^{a}\sum_{j=1}^{n} [(\bar{y}_{i.} - \bar{y}_{..}) + (y_{ij} - \bar{y}_{i.})]^2 = n\sum_{i=1}^{a} (\bar{y}_{i.} - \bar{y}_{..})^2 + \sum_{i=1}^{a}\sum_{j=1}^{n} (y_{ij} - \bar{y}_{i.})^2$

$SS_T = SS_{Treatments} + SS_E$
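The partitioning identity is easy to verify numerically. The following sketch uses a small hypothetical data set (three treatments, three replicates; the numbers are invented, not taken from the text) and computes each sum of squares directly from its definition.

```python
# Hypothetical balanced data: rows are treatments, columns are replicates.
groups = [
    [10.0, 12.0, 11.0],
    [14.0, 15.0, 16.0],
    [20.0, 19.0, 21.0],
]

a = len(groups)     # number of treatments
n = len(groups[0])  # replicates per treatment

grand_mean = sum(sum(g) for g in groups) / (a * n)
treat_means = [sum(g) / n for g in groups]

# Total variability: every observation against the grand mean.
ss_total = sum((y - grand_mean) ** 2 for g in groups for y in g)

# Between-treatment variability: treatment means against the grand mean.
ss_treatments = n * sum((m - grand_mean) ** 2 for m in treat_means)

# Within-treatment variability: observations against their own treatment mean.
ss_error = sum((y - m) ** 2 for g, m in zip(groups, treat_means) for y in g)

print(f"SS_T = {ss_total:.4f}")
print(f"SS_Treatments + SS_E = {ss_treatments:.4f} + {ss_error:.4f}")
```

The cross-product term vanishes because the deviations within each treatment sum to zero, so the two printed lines agree up to floating-point rounding.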

(28)

The Analysis of Variance

$SS_T = SS_{Treatments} + SS_E$

• A large value of $SS_{Treatments}$ reflects large differences in treatment means
• A small value of $SS_{Treatments}$ likely indicates no differences in treatment means
• Formal statistical hypotheses are:

$H_0: \mu_1 = \mu_2 = \cdots = \mu_a$
$H_1:$ At least one mean is different

(29)

The Analysis of Variance

• While sums of squares cannot be directly compared to test the hypothesis of equal means, mean squares can be compared.
• A mean square is a sum of squares divided by its degrees of freedom:

$df_{Total} = df_{Treatments} + df_{Error}$
$an - 1 = (a - 1) + a(n - 1)$

$MS_{Treatments} = \frac{SS_{Treatments}}{a - 1}, \quad MS_E = \frac{SS_E}{a(n - 1)}$

• If the treatment means are equal, the treatment and error mean squares will be (theoretically) equal.
• If treatment means differ, the treatment mean square will be larger than the error mean square.

(30)

Analysis of Variance: Summarized

• Computing… see text, pp 66-70
• The reference distribution for $F_0$ is the $F_{a-1,\,a(n-1)}$ distribution
• Reject the null hypothesis (equal treatment means) if $F_0 > F_{\alpha,\,a-1,\,a(n-1)}$
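A compact sketch of the whole single-factor test, assuming a balanced design and taking $F_0$ as the ratio of the treatment mean square to the error mean square. The data are hypothetical, the function name `one_way_anova` is invented for this example, and the critical value 5.14 is the standard tabled $F_{0.05,\,2,\,6}$ rather than something computed here.

```python
def one_way_anova(groups):
    """Return (MS_Treatments, MS_E, F0) for a balanced single-factor design."""
    a = len(groups)
    n = len(groups[0])
    grand_mean = sum(sum(g) for g in groups) / (a * n)
    treat_means = [sum(g) / n for g in groups]

    ss_treatments = n * sum((m - grand_mean) ** 2 for m in treat_means)
    ss_error = sum((y - m) ** 2 for g, m in zip(groups, treat_means) for y in g)

    ms_treatments = ss_treatments / (a - 1)  # a - 1 degrees of freedom
    ms_error = ss_error / (a * (n - 1))      # a(n - 1) degrees of freedom
    return ms_treatments, ms_error, ms_treatments / ms_error


# Hypothetical data: a = 3 treatments, n = 3 replicates.
groups = [
    [10.0, 12.0, 11.0],
    [14.0, 15.0, 16.0],
    [20.0, 19.0, 21.0],
]

F_CRIT = 5.14  # tabled F_{0.05, 2, 6}; an assumed table lookup, not computed

ms_treatments, ms_error, f0 = one_way_anova(groups)
reject_h0 = f0 > F_CRIT
print(f"MS_Treatments = {ms_treatments:.3f}, MS_E = {ms_error:.3f}, F0 = {f0:.2f}")
print("reject H0" if reject_h0 else "fail to reject H0")
```

With unequal sample sizes the treatment sum of squares would weight each squared deviation by its own $n_i$; the sketch deliberately leaves that case out.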

(31)

(32)

(33)

ANOVA calculations are usually done via computer

• Calculations can be done in Minitab, NCSS, Excel, Matlab, Scilab, etc.

(34)

Model Adequacy Checking in the ANOVA

Text reference, Section 3-4, pg. 75

• Checking assumptions is important
• Normality
• Constant variance
• Independence
• Have we fit the right model?
• Later we will talk about what to do if some of these assumptions are violated

(35)

Model Adequacy Checking in the ANOVA

• Residuals (see text, Sec. 3-4, pg. 75)

$e_{ij} = y_{ij} - \hat{y}_{ij} = y_{ij} - \bar{y}_{i.}$

• NCSS generates the residuals
• Residual plots are very useful
• Normal probability plot
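For readers without NCSS, the residuals and the coordinates of a normal probability plot can be sketched with the Python standard library. The data below are hypothetical, and the plotting position (k - 0.5)/N is one common convention assumed here, not necessarily the one NCSS uses.

```python
from statistics import NormalDist, mean

# Hypothetical balanced data: 3 treatments, 3 replicates each.
data = {
    "A": [10.0, 12.0, 11.0],
    "B": [14.0, 15.0, 16.0],
    "C": [20.0, 19.0, 21.0],
}

# Residual e_ij = y_ij - ybar_i. (the fitted value is the treatment mean).
residuals = {
    level: [y - mean(ys) for y in ys] for level, ys in data.items()
}

# Normal probability plot coordinates: ordered residuals against the
# standard normal quantiles of their plotting positions.
ordered = sorted(e for es in residuals.values() for e in es)
N = len(ordered)
normal_scores = [NormalDist().inv_cdf((k - 0.5) / N) for k in range(1, N + 1)]

for e, z in zip(ordered, normal_scores):
    print(f"residual = {e:6.2f}   normal score = {z:7.3f}")
```

If the normality assumption is reasonable, the printed pairs fall close to a straight line when plotted. Plotting the same residuals against fitted values or run order is the usual check for constant variance and independence.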

(36)

(37)

Post-ANOVA Comparison of Means

• The ANOVA tests the hypothesis of equal treatment means
• Assume that residual analysis is satisfactory
• If that hypothesis is rejected, we don't know which specific means are different
• Determining which specific means differ following an ANOVA is called the multiple comparisons problem
• There are lots of ways to do this… see text, Section 3-5, pg. 87
• We will use pairwise t-tests on means… sometimes called Fisher's Least Significant Difference (LSD) method
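The pairwise comparison can be sketched as follows for a balanced design, where two means are declared different when their absolute difference exceeds $LSD = t_{\alpha/2,\,N-a}\sqrt{2\,MS_E/n}$. The data and treatment labels are hypothetical, and the critical value 2.179 is the standard tabled $t_{0.025,\,12}$, entered by hand as an assumption rather than computed.

```python
from itertools import combinations
from math import sqrt

# Hypothetical balanced data: a = 3 treatments, n = 5 replicates each.
data = {
    "P1": [10.0, 12.0, 11.0, 9.0, 13.0],
    "P2": [12.0, 11.0, 13.0, 12.0, 12.0],
    "P3": [16.0, 18.0, 17.0, 15.0, 19.0],
}

T_CRIT = 2.179  # tabled t_{0.025, 12}; N - a = 15 - 3 = 12 error df

a = len(data)
n = len(next(iter(data.values())))

means = {level: sum(ys) / n for level, ys in data.items()}

# Error mean square pooled over all treatments.
ss_error = sum((y - means[level]) ** 2 for level, ys in data.items() for y in ys)
ms_error = ss_error / (a * (n - 1))

# Least significant difference for any pair of treatment means.
lsd = T_CRIT * sqrt(2 * ms_error / n)

differs = {}
for g1, g2 in combinations(data, 2):
    differs[(g1, g2)] = abs(means[g1] - means[g2]) > lsd
    print(f"{g1} vs {g2}: |diff| = {abs(means[g1] - means[g2]):.2f}, "
          f"LSD = {lsd:.2f}, different = {differs[(g1, g2)]}")
```

The comparisons are normally run only after the overall F test has rejected equal means; applying them unconditionally inflates the chance of a false positive somewhere among the pairs.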

(38)

Two-Factor Multiple Levels Experiment

(39)

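Extension of the ANOVA to Factorials

$\sum_{i=1}^{a}\sum_{j=1}^{b}\sum_{k=1}^{n} (y_{ijk} - \bar{y}_{...})^2 = bn\sum_{i=1}^{a} (\bar{y}_{i..} - \bar{y}_{...})^2 + an\sum_{j=1}^{b} (\bar{y}_{.j.} - \bar{y}_{...})^2 + n\sum_{i=1}^{a}\sum_{j=1}^{b} (\bar{y}_{ij.} - \bar{y}_{i..} - \bar{y}_{.j.} + \bar{y}_{...})^2 + \sum_{i=1}^{a}\sum_{j=1}^{b}\sum_{k=1}^{n} (y_{ijk} - \bar{y}_{ij.})^2$

$SS_T = SS_A + SS_B + SS_{AB} + SS_E$

df breakdown:

$abn - 1 = (a - 1) + (b - 1) + (a - 1)(b - 1) + ab(n - 1)$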
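The two-factor decomposition can be checked on a tiny example. The sketch below assumes a balanced 2 x 2 layout with two replicates per cell; the observations are invented solely to exercise the formulas, and each term is computed straight from its definition.

```python
# Hypothetical data: y[i][j] holds the n replicates for level i of A, level j of B.
y = [
    [[1.0, 3.0], [5.0, 7.0]],
    [[4.0, 6.0], [12.0, 14.0]],
]

a, b, n = len(y), len(y[0]), len(y[0][0])

grand = sum(obs for row in y for cell in row for obs in cell) / (a * b * n)
cell_mean = [[sum(cell) / n for cell in row] for row in y]
row_mean = [sum(cell_mean[i]) / b for i in range(a)]                       # ybar_i..
col_mean = [sum(cell_mean[i][j] for i in range(a)) / a for j in range(b)]  # ybar_.j.

ss_a = b * n * sum((m - grand) ** 2 for m in row_mean)
ss_b = a * n * sum((m - grand) ** 2 for m in col_mean)
ss_ab = n * sum(
    (cell_mean[i][j] - row_mean[i] - col_mean[j] + grand) ** 2
    for i in range(a)
    for j in range(b)
)
ss_e = sum(
    (obs - cell_mean[i][j]) ** 2
    for i in range(a)
    for j in range(b)
    for obs in y[i][j]
)
ss_t = sum((obs - grand) ** 2 for row in y for cell in row for obs in cell)

# Degrees of freedom follow the same additive pattern.
df_parts = [a - 1, b - 1, (a - 1) * (b - 1), a * b * (n - 1)]

print(f"SS_T = {ss_t}, SS_A + SS_B + SS_AB + SS_E = {ss_a + ss_b + ss_ab + ss_e}")
print(f"total df = {a * b * n - 1}, sum of parts = {sum(df_parts)}")
```

The interaction term measures how far each cell mean departs from what the two main effects alone would predict.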

(40)

• NCSS and Minitab will perform the computations
• Text gives details of manual computing – see pp.

(41)

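Analysis of Variance Table

Source Term        DF   Sum of Squares   Mean Square   F-Ratio   Prob Level   Power (Alpha=0.05)
A: C2               2   900801.2         450400.6      2563.41   0.000000*    1.000000
B: C3               2   420599.2         210299.6      1196.90   0.000000*    1.000000
AB                  4   809992.1         202498        1152.50   0.000000*    1.000000
S                  18   3162.667         175.7037
Total (Adjusted)   26   2134555
Total              27

* Term significant at alpha = 0.05

Means and Effects Section

Term        Count   Mean       Standard Error   Effect
All         27      478.2592                    478.2592
A: C2
  1         9       468.7778   4.418442         -9.481482
  2         9       706.5555   4.418442         228.2963
  3         …       …          …                …
B: C3
  1         9       305.4445   4.418442         -172.8148
  2         9       595.7778   4.418442         117.5185
  3         9       533.5555   4.418442         55.2963
AB: C2,C3
  1,1       3       16.33333   7.652967         -279.6296
  1,2       3       796.6667   7.652967         210.3704
  1,3       3       593.3333   7.652967         69.25926
  2,1       …       …          …                …
  2,2       3       708        7.652967         -116.0741
  2,3       3       873        7.652967         111.1481
  3,1       3       361.3333   7.652967         274.7037
  3,2       3       282.6667   7.652967         -94.2963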
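The entries of the ANOVA table can be cross-checked against the definitions used in this lecture: each mean square is its sum of squares divided by its degrees of freedom, and each F-ratio is a mean square divided by the error mean square (row S). The sketch below uses only the numbers printed in the table; the dictionary layout and variable names are choices made for this illustration.

```python
# (DF, sum of squares) for each term, copied from the ANOVA table.
terms = {
    "A: C2": (2, 900801.2),
    "B: C3": (2, 420599.2),
    "AB": (4, 809992.1),
}
df_error, ss_error = 18, 3162.667  # row "S" is the error term
df_total, ss_total = 26, 2134555   # row "Total (Adjusted)"

ms_error = ss_error / df_error

mean_squares = {}
f_ratios = {}
for name, (df, ss) in terms.items():
    mean_squares[name] = ss / df                 # MS = SS / DF
    f_ratios[name] = mean_squares[name] / ms_error  # F = MS / MS_E
    print(f"{name:6s} MS = {mean_squares[name]:10.1f}   F = {f_ratios[name]:8.2f}")

# The sums of squares and degrees of freedom should add up to the totals.
ss_sum = sum(ss for _, ss in terms.values()) + ss_error
df_sum = sum(df for df, _ in terms.values()) + df_error
print(f"MS_E = {ms_error:.4f}")
print(f"sum of SS = {ss_sum:.1f} (table total {ss_total})")
print(f"sum of DF = {df_sum} (table total {df_total})")
```

The recomputed values match the printed table to the number of digits shown, which is a quick way to confirm that the flattened table has been read back in the right column order.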

Standard All 27 478.2592 478.2592 A: C2 1 9 468.7778 4.418442 -9.481482 2 9 706.5555 4.418442 228.2963 . . - . B: C3 1 9 305.4445 4.418442 -172.8148 2 9 595.7778 4.418442 117.5185 3 9 533.5555 4.418442 55.2963 AB: C2,C3 1,1 3 16.33333 7.652967 -279.6296 1,2 3 796.6667 7.652967 210.3704 1,3 3 593.3333 7.652967 69.25926 , . . . 2,2 3 708 7.652967 -116.0741 2,3 3 873 7.652967 111.1481 3,1 3 361.3333 7.652967 274.7037 3,2 3 282.6667 7.652967 -94.2963