Chapter IV
ANOVA
Chapter Objective
The purpose of this chapter is to enable a decision maker to compare three or more independent sample means and to determine whether there are statistically significant differences between the means of the populations from which the samples are taken. Many analyses involve experiments in which you want to test whether one or more discrete-level factors (the independent variables) influence an outcome measurement (a quantitative dependent variable).
4.1 Introduction
Analysis of variance (ANOVA) is a hypothesis test used to compare the means of a continuous variable across two or more independent comparison groups. ANOVA was developed by Ronald Fisher in 1918 and extends the t-test and the z-test. Before ANOVA, the t-test and z-test were commonly used to compare means, but the t-test cannot be applied directly to more than two groups.
Analysis of variance provides a way to determine whether one or more discrete-level factors (independent variables) influence an outcome measurement (a quantitative dependent variable).
Factor: a characteristic under consideration, thought to influence the measured observations.
In practice, ANOVA lets a decision maker compare three or more independent sample means to see whether there are statistically significant differences between the means of the populations from which the samples are taken.
The null hypothesis, typically, is that all means are equal.
Analysis of variance must have a dependent variable that is metric (measured using an interval or ratio scale).
There must also be one or more independent variables that are all categorical (nonmetric). Categorical independent variables are also called factors.
A particular combination of factor levels, or categories, is called a treatment.
• Analysis of variance compares two or more populations of interval data.
• Specifically, we are interested in determining whether differences exist between the population means.
One-way analysis of variance involves only one categorical variable, or a single factor. In one-way analysis of variance, a treatment is the same as a factor level. When we compare more than two groups, based on one factor (independent variable), this is called one way ANOVA.
Application Example
We wish to conduct a study in the area of mathematics education involving different teaching methods to improve standardized math scores in local classrooms. The study will include four different teaching methods and use fifth-grade students who are randomly sampled from a large urban school district and are then randomly assigned to the four different teaching methods.
The four different teaching methods to be examined are:
1) the traditional teaching method where the classroom teacher explains the concepts and assigns homework problems from the textbook.
2) the intensive practice method, in which students fill out additional work sheets both before and after school.
3) the computer assisted method, in which students learn math concepts and skills from using various computer based math learning programs.
4) the peer assistance learning method, which pairs each fifth grader with a sixth grader who helps them learn the concepts followed by the student teaching the same material to another student in their group.
Students will stay in their math learning groups for an entire academic year. At the end of the spring semester all students will take the Multiple Math Proficiency Inventory (MMPI).
The experiment is designed so that each of the four groups will have the same sample size.
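The random assignment to four equal-sized groups described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the student IDs, the group size of 10, the seed and the helper name assign_to_methods are all hypothetical and do not come from the study design.

```python
import random

def assign_to_methods(student_ids, methods, seed=None):
    """Randomly split students into equal-sized groups, one per teaching method."""
    if len(student_ids) % len(methods) != 0:
        raise ValueError("number of students must be a multiple of the number of methods")
    rng = random.Random(seed)          # a seed makes the assignment reproducible
    shuffled = list(student_ids)
    rng.shuffle(shuffled)
    size = len(shuffled) // len(methods)
    return {m: shuffled[i * size:(i + 1) * size] for i, m in enumerate(methods)}

methods = ["traditional", "intensive practice", "computer assisted", "peer assistance"]
students = [f"S{i:03d}" for i in range(1, 41)]   # 40 hypothetical student IDs
assignment = assign_to_methods(students, methods, seed=2024)

for method, members in assignment.items():
    print(method, len(members))
```

Shuffling the whole sample once and then cutting it into equal slices guarantees both randomness and the equal sample sizes required by the design.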
One-way repeated measures ANOVA
A one-way repeated measures ANOVA is used when you have a single group on which you have measured something more than one time. For example, if you wanted to test students’ understanding of a subject, you could administer the same test at the beginning of the course, in the middle of the course, and at the end of the course. You would then use a one-way repeated measures ANOVA to see if students’ performance on the test changed over time.
Two-way between groups ANOVA
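A two-way between-groups ANOVA examines two factors at the same time and tests three effects: a main effect for each factor and an interaction effect. Each of the main effects is a one-way test. The interaction effect asks whether the two factors acting together (for example, final grade and overseas/local status) make a significant difference in performance.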
4.2 Conducting One-Way Analysis of Variance
1. Identification of the Dependent and Independent (Factor) Variables. The dependent variable should be continuous (measured on an interval or ratio scale), and a factor variable in ANOVA should be categorical. Variables which are experimentally manipulated by an investigator are called independent variables.
2. Decomposition of the Total Variation
3. Measurement of Effects
4. Significance Testing (ANOVA results: Contrasts, Multiple Comparisons, Tests for Trend)
5. Assumptions in Analysis of Variance
6. Interpretation of Results
• Variance can be separated into two major components:
Between groups: differences according to the group or the treatment received.
Within groups: variability or differences within particular groups (individual differences).
Relationship amongst T Test, Analysis of Variance, Analysis of Covariance, & Regression
4.3 Assumptions of ANOVA
The following assumptions must hold before the ANOVA technique can be applied to a decision-making situation (they can be checked using statistical software):
The samples are drawn randomly, and each sample is independent of other samples.
The errors are normally distributed, with a zero mean and a constant variance.
The variances of the errors are equal across groups (homogeneity of variance). This assumption can be tested using tests such as Levene’s test or the Brown-Forsythe test.
• If the Sig. value given by Levene’s test is greater than .05 (p > .05), assume equal variances.
• If the Sig. value given by Levene’s test is less than .05 (p < .05), equal variances cannot be assumed, and a procedure that adjusts for unequal variances should be used.
If the crucial assumptions of ANOVA are not met, one may wish to consider a parallel non-parametric test such as:
• the Kruskal-Wallis procedure, for one-way ANOVA, or
• the Friedman procedure, for two-way ANOVA.
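To make the Kruskal-Wallis alternative concrete, here is a minimal pure-Python sketch of its H statistic with the usual correction for ties. The helper names average_ranks and kruskal_wallis are hypothetical, the scores are the three groups from this chapter's illustrative application, and the closed-form p-value exp(-H/2) is an assumption that holds only for three groups (chi-square with 2 degrees of freedom).

```python
import math

def average_ranks(values):
    """Rank values from smallest to largest; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1            # mean of the 1-based positions i+1 .. j+1
        for pos in range(i, j + 1):
            ranks[order[pos]] = avg
        i = j + 1
    return ranks

def kruskal_wallis(groups):
    """Return (H corrected for ties, degrees of freedom)."""
    pooled = [x for g in groups for x in g]
    n_total = len(pooled)
    ranks = average_ranks(pooled)
    total, start = 0.0, 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        total += rank_sum ** 2 / len(g)
        start += len(g)
    h = 12 / (n_total * (n_total + 1)) * total - 3 * (n_total + 1)
    counts = {}
    for x in pooled:
        counts[x] = counts.get(x, 0) + 1
    tie_term = sum(t ** 3 - t for t in counts.values())
    h /= 1 - tie_term / (n_total ** 3 - n_total)   # tie correction
    return h, len(groups) - 1

inactive = [14, 12, 8, 14, 11, 12]
one_partner = [11, 11, 6, 5, 12, 10]
more_partners = [8, 12, 10, 4, 3, 5]

h_stat, df = kruskal_wallis([inactive, one_partner, more_partners])
p_value = math.exp(-h_stat / 2)   # valid only when df == 2
print(round(h_stat, 3), df, round(p_value, 3))
```

Because the test works on ranks rather than raw scores, it does not rely on the normality assumption.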
It is important to note that ANOVA is not robust to violations of the assumption of independence. That is, if the assumptions of homogeneity or normality are violated, there are statistical procedures that still enable you to conduct the ANOVA, but there is no such remedy when independence is violated. In general, when homogeneity is violated the study can probably carry on if the groups are of equal size.
4.4 The procedural Steps for an ANOVA Test
Step 1. State the Hypotheses
In general, one-way ANOVA techniques can be used to study the effect of k (> 2) levels of a single factor. To determine whether different levels of the factor affect the measured observations differently, the following hypotheses are tested.
H0: μ1 = μ2 = … = μk (that is, “all population means are equal”), where k = the number of groups or populations under study
Ha: At least one μi is different (at least one mean differs from the others)
Step 2. Select the Level of Significance
A criterion for rejection of H0 is necessary, and tests are typically made with α specified as .01, .05 or .10.
Step 3. Determine the Test Distribution to Use
An F distribution is used in an ANOVA test.
The null hypothesis may be tested by the F statistic, which is the ratio of two estimates of variance: the between-group variation divided by the within-group variation:
F = MSB / MSW = [SSB / (k - 1)] / [SSW / (N - k)]
where SSB and SSW are the sums of squares between and within groups, and MSB and MSW are the corresponding mean squares.
A large F is evidence against H0, since it indicates that there is more difference between groups than within groups.
This statistic follows the F distribution with (k - 1) and (N - k) degrees of freedom (df), where k = number of groups and N = total number of observations.
Step 4. Making a Decision and Interpreting the Result of the Test
• If the null hypothesis of equal category means is not rejected, then the factor or independent variable does not have a significant effect on the dependent variable.
• On the other hand, if the null hypothesis is rejected, then the effect of the independent variable is significant.
(If the test statistic F exceeds the critical value of F, we can reject H0.)
Decision rule with Sig. or p-value:
• If the Sig. is less than the level of significance (for example, α = .05), we should reject the null hypothesis.
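Operational formulas
(1) Total sum of squares: SST = ΣX² - (ΣX)² / N
(2) Sum of squares between groups: SSB = Σ(Tj² / nj) - (ΣX)² / N
(3) Sum of squares within groups: SSW = SST - SSB
Here ΣX is the grand total of all scores, ΣX² is the sum of the squared scores, Tj and nj are the total and the number of scores in group j, and N is the total number of subjects.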
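The three sums of squares can be checked numerically. The following is a minimal Python sketch and not part of the SPSS workflow used in this chapter: it assumes the usual computational forms (sum of squared scores minus the squared grand total over N, and squared group totals over group sizes), the helper name sums_of_squares is hypothetical, and the scores are those of the illustrative application in this chapter.

```python
def sums_of_squares(groups):
    """Return (total, between, within) sums of squares from raw scores."""
    scores = [x for group in groups for x in group]
    n_total = len(scores)
    correction = sum(scores) ** 2 / n_total               # (grand total)^2 / N
    ss_total = sum(x * x for x in scores) - correction
    ss_between = sum(sum(g) ** 2 / len(g) for g in groups) - correction
    ss_within = ss_total - ss_between
    return ss_total, ss_between, ss_within

inactive = [14, 12, 8, 14, 11, 12]
one_partner = [11, 11, 6, 5, 12, 10]
more_partners = [8, 12, 10, 4, 3, 5]
groups = [inactive, one_partner, more_partners]

ss_total, ss_between, ss_within = sums_of_squares(groups)
k = len(groups)                                  # number of groups
n_total = sum(len(g) for g in groups)            # total number of subjects
ms_between = ss_between / (k - 1)
ms_within = ss_within / (n_total - k)
f_ratio = ms_between / ms_within

print(round(ss_total, 3), round(ss_between, 3), round(ss_within, 3), round(f_ratio, 3))
```

The printed values can be compared with the Sum of Squares and F columns of the SPSS ANOVA table for the same data.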
Illustrative Applications of One-way ANOVA
Are sexually active teenagers better informed about AIDS and other potential health problems related to sex than teenagers who are sexually inactive?
A 15-item test of general knowledge about sex and health was administered to random samples of teens who are sexually inactive, teens who are sexually active but with only a single partner, and teens who are sexually active with more than one partner. Is there any significant difference in the test scores?
The data are:
Sexually inactive    Active, one partner    Active, more than one partner
14                   11                     8
12                   11                     12
8                    6                      10
14                   5                      4
11                   12                     3
12                   10                     5
We illustrate the analysis of variance procedure for this example using the software SPSS. The results are presented as follows:
Steps for the Illustrative Application of One-way ANOVA
Null and alternative hypotheses
Ho: μ1 = μ2 = μ3, that is, “the group means are all equal”
Ha: At least one µi is different, that is, “at least one mean differs from the others”
Level of significance: α = .05
Grand total (add all of the scores together, then square the total): ΣX = 168, so (ΣX)² = 28,224
Square each individual score and then add up all of the squared scores: ΣX² = 1,770
Total number of subjects: N = 18
Procedure in SPSS
Output from SPSS - Descriptives
Descriptives
General knowledge about sex and health (AIDS) in teenagers
                                                                          95% Confidence Interval for Mean
Group                          N    Mean     Std. Deviation   Std. Error   Lower Bound   Upper Bound   Minimum   Maximum
Inactive                       6    11.833   2.229            0.910        9.495         14.172        8.00      14.00
Active-One partner             6    9.167    2.927            1.195        6.095         12.238        5.00      12.00
Active-More than one partner   6    7.000    3.578            1.461        3.245         10.755        3.00      12.00
Total                          18   9.333    3.447            0.812        7.619         11.048        3.00      14.00
This table reports the mean and standard deviation of the score in general knowledge about sex and health (AIDS) for each group of teenagers.
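The per-group rows of the Descriptives table can be reproduced with Python's standard statistics module. This is a sketch rather than the SPSS procedure: the critical t value 2.571 (two-sided 95%, 5 degrees of freedom) is an assumption taken from a standard t table, and it applies only because every group has n = 6.

```python
import math
import statistics

T_CRIT_DF5 = 2.571   # assumption: t table value for 95% confidence and 5 df

groups = {
    "Inactive": [14, 12, 8, 14, 11, 12],
    "Active-One partner": [11, 11, 6, 5, 12, 10],
    "Active-More than one partner": [8, 12, 10, 4, 3, 5],
}

summary = {}
for name, scores in groups.items():
    n = len(scores)
    mean = statistics.mean(scores)
    sd = statistics.stdev(scores)          # sample standard deviation (divides by n - 1)
    se = sd / math.sqrt(n)                 # standard error of the mean
    half_width = T_CRIT_DF5 * se           # half-width of the 95% confidence interval
    summary[name] = (n, mean, sd, se, mean - half_width, mean + half_width)
    print(f"{name}: n={n} mean={mean:.3f} sd={sd:.3f} se={se:.3f} "
          f"CI=({mean - half_width:.3f}, {mean + half_width:.3f})")
```

Small differences in the third decimal of the confidence bounds can arise because the table value of t is rounded.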
ANOVA
General knowledge about sex and health (AIDS) in teenagers
                 Sum of Squares   df   Mean Square   F       Sig.
Between Groups   70.333           2    35.167        4.006   .040
Within Groups    131.667          15   8.778
Total            202.000          17
Test statistic: F = 35.167 / 8.778 = 4.006, with 2 and 15 degrees of freedom.
Making a Decision and Interpreting the Result of the Test
The significance (Sig.) value of the F test in the ANOVA table is .040 < .05. Thus, you must reject the null hypothesis that the average scores on knowledge about sex and health are equal across the groups of teenagers. Now that you know the groups differ in some way, you need to learn more about the structure of the differences.
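The Sig. value in the ANOVA table can be checked by hand in this particular case. The sketch below relies on a closed form of the F distribution's upper tail that holds only when the numerator degrees of freedom equal 2 (that is, for three groups); for other designs a statistical package or an F table is needed. The function name f_pvalue_df1_2 is hypothetical.

```python
def f_pvalue_df1_2(f_value, df2):
    """Upper-tail probability P(F > f_value) for an F distribution with (2, df2) df.

    Closed form valid only for numerator df = 2: (1 + 2f/df2) ** (-df2/2).
    """
    return (1 + 2 * f_value / df2) ** (-df2 / 2)

f_observed = 4.006    # F from the ANOVA table
df_within = 15        # N - k = 18 - 3
alpha = 0.05

p_value = f_pvalue_df1_2(f_observed, df_within)
reject_h0 = p_value < alpha
print(round(p_value, 3), reject_h0)
```

The decision rule is the same whichever way the p-value is obtained: reject H0 when it falls below the chosen level of significance.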
The means plot helps you to "see" this structure. Teenagers who are sexually inactive have a higher score than their counterparts.
Which means are different?
You can directly compare the subgroups using “Post Hoc” tests.
4.6 Post Hoc Tests: Multiple Comparisons
Post-hoc tests allow you to determine where significant differences lie.
When the ANOVA is found to be significant, one must examine which pairs of groups differ significantly; post-hoc tests therefore look at the mean differences between the different pairs of groups.
Post-hoc testing usually involves multiple comparisons.
There are several multiple comparison tests that can be conducted that will control the type one error rate.
• If you are concerned about violations of the assumptions, use Scheffé’s test.
• If you are not concerned about violations of the assumptions and are testing both compound and pairwise comparisons, use Dunn’s test or the modified Bonferroni test.
• If you are not concerned about violations of the assumptions and are just comparing the treatments to the control, use Dunnett’s test.
• Games-Howell does not assume that population variances or sample sizes are equal, so it is a good alternative when they are unequal.
All of these tests will ensure that the Type I error rate remains under control as was established by the researcher and will tell you exactly which groups are different from one another.
Note: When the null hypothesis is rejected, the conclusion is that at least one population mean is different from at least one other mean. However, the ANOVA does not reveal which means differ from which, so it offers less specific information than a post-hoc analysis such as the Tukey HSD test. The Tukey HSD is therefore preferable to ANOVA in this situation. Some textbooks introduce the Tukey test only as a follow-up to an ANOVA. However, there is no logical or statistical reason why you should not use the Tukey test even if you do not compute an ANOVA.
You might be wondering why you should learn about ANOVA when the Tukey test is better. One reason is that there are complex types of analyses that can be done with ANOVA and not with the Tukey test. A second is that ANOVA is by far the most commonly-used technique for comparing means, and it is important to understand ANOVA in order to understand research reports.
Post Hoc output: Multiple Comparisons
Dependent Variable: General knowledge about sex and health (AIDS) in teenagers
Tukey HSD
                                                                                                          95% Confidence Interval
(I) Group                      (J) Group                      Mean Difference (I-J)   Std. Error   Sig.    Lower Bound   Upper Bound
Inactive                       Active-One partner             2.667                   1.711        0.293   -1.776        7.110
Inactive                       Active-More than one partner   4.833*                  1.711        0.032   0.390         9.276
Active-One partner             Inactive                       -2.667                  1.711        0.293   -7.110        1.776
Active-One partner             Active-More than one partner   2.167                   1.711        0.435   -2.276        6.610
Active-More than one partner   Inactive                       -4.833*                 1.711        0.032   -9.276        -0.390
Active-More than one partner   Active-One partner             -2.167                  1.711        0.435   -6.610        2.276
*. The mean difference is significant at the 0.05 level.
Interpretation: Looking at the data, the researcher asks:
• Are the inactive group and the active-more than one partner group really different? (Sig. = 0.032 < .05; therefore the difference between the two groups is statistically significant.)
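The logic behind the Tukey HSD comparisons can be sketched from the quantities already reported: the group means, the within-groups mean square (8.778) and the common group size (6). The critical studentized range value 3.67 (α = .05, 3 groups, 15 df) is an assumption read from a standard table, not something computed here, and the variable names are illustrative only.

```python
import math
from itertools import combinations

means = {
    "Inactive": 11.833,
    "Active-One partner": 9.167,
    "Active-More than one partner": 7.000,
}
n_per_group = 6
ms_within = 8.778     # Mean Square Within Groups from the ANOVA table
Q_CRIT = 3.67         # assumption: studentized range table, alpha = .05, k = 3, df = 15

se_diff = math.sqrt(ms_within * 2 / n_per_group)     # standard error of a mean difference
hsd = Q_CRIT * math.sqrt(ms_within / n_per_group)    # smallest significant difference

significant = []
for a, b in combinations(means, 2):
    diff = means[a] - means[b]
    flag = abs(diff) > hsd
    if flag:
        significant.append((a, b))
    print(f"{a} vs {b}: diff={diff:.3f} significant={flag}")

print(round(se_diff, 3), round(hsd, 3))
```

A pair of means is declared different only when its absolute difference exceeds the HSD value, which is why a single pair is flagged in the SPSS output.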
Homogeneous Subsets
General knowledge about sex and health (AIDS) in teenagers
Tukey HSD(a)
                                    Subset for alpha = 0.05
Group                          N    1         2
Active-More than one partner   6    7.0000
Active-One partner             6    9.1667    9.1667
Inactive                       6              11.8333
Sig.                                .435      .293
Means for groups in homogeneous subsets are displayed.
a. Uses Harmonic Mean Sample Size = 6.000.
According to the preceding table, sexually inactive teenagers score higher than their counterparts; the inactive group and the active-more than one partner group fall in different homogeneous subsets.
Required conditions or assumptions:
1. The populations tested are normally distributed.
Assumption: Normality
Each group should be approximately normal. Check this by looking at histograms, boxplots or normal Q-Q plots, or use the Kolmogorov-Smirnov test if the sample is large or the Shapiro-Wilk test if the sample is small (< 30).
Tests of Normality
General knowledge about sex and health (AIDS) in teenagers
                               Kolmogorov-Smirnov(a)          Shapiro-Wilk
Group                          Statistic   df   Sig.          Statistic   df   Sig.
Inactive                       .196        6    .200*         .890        6    .316
Active-One partner             .279        6    .159          .838        6    .126
Active-More than one partner   .212        6    .200*         .935        6    .619
Ho: The error terms follow a normal distribution
Ha: The error terms do not follow a normal distribution
Decision and interpretation: the p-values (Sig.) are all greater than .05, so we do not reject Ho; the normality assumption may therefore be taken as valid.
Steps to find Test of Normality
Assumption: homogeneity of variance
This table (Levene’s test) tests the assumption of equal variances for the ANOVA.
Ho: The population variances are equal
Ha: The population variances are not equal
Test of Homogeneity of Variances
General knowledge about sex and health (AIDS) in teenagers
Levene Statistic   df1   df2   Sig.
1.750              2     15    .207
Decision and interpretation: the Sig. or p-value (.207) is above .05, so the assumption of constant variances is not rejected. We therefore proceed on the basis that the population variances are equal.
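Levene's statistic is itself a one-way ANOVA F ratio computed on the absolute deviations of each score from its group mean. The sketch below assumes this mean-based form of the test and uses the scores of the illustrative application; the function name levene_statistic is hypothetical.

```python
def levene_statistic(groups):
    """Return (W, df1, df2) for Levene's test using deviations from group means."""
    # Absolute deviation of every score from the mean of its own group
    z = [[abs(x - sum(g) / len(g)) for x in g] for g in groups]
    k = len(z)
    n_total = sum(len(g) for g in z)
    z_means = [sum(g) / len(g) for g in z]
    z_grand = sum(sum(g) for g in z) / n_total
    # Between- and within-group sums of squares of the deviations
    between = sum(len(g) * (m - z_grand) ** 2 for g, m in zip(z, z_means))
    within = sum((v - m) ** 2 for g, m in zip(z, z_means) for v in g)
    w = (between / (k - 1)) / (within / (n_total - k))
    return w, k - 1, n_total - k

inactive = [14, 12, 8, 14, 11, 12]
one_partner = [11, 11, 6, 5, 12, 10]
more_partners = [8, 12, 10, 4, 3, 5]

w_stat, df1, df2 = levene_statistic([inactive, one_partner, more_partners])
print(round(w_stat, 3), df1, df2)
```

A small W means the spread of scores is similar from group to group, which is what the equal-variance assumption requires.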
Notes on interpreting Sig. for the main hypothesis:
- Sig. or p-value: if the value is less than .05, the test is significant at the 5% level of significance, and we would usually say there is evidence to reject the null hypothesis.
If the Sig. or p-value is less than 0.1 but greater than 0.05, there is weak evidence in favor of the alternative hypothesis. Finally, if the p-value is greater than 0.1, we would usually say there is no evidence to reject the null hypothesis. Never accept the null hypothesis and conclude that it is true, as this would be incorrect; we either reject or do not reject the null.
4.7 Multivariate Analysis of Variance
• Multivariate analysis of variance (MANOVA) is similar to analysis of variance (ANOVA), except that instead of one metric dependent variable we have two or more, which are analyzed together on the basis of their relationships to categorical and scale predictors.
• In MANOVA, the null hypothesis is that the vectors of means on multiple dependent variables are equal across groups.
• Multivariate analysis of variance is appropriate when there are two or more dependent variables that are correlated.
Review problems of chapter
Follow the procedures covered in this chapter to generate the appropriate output and answer the following questions:
1. What necessary assumptions must be met for an analysis of variance test to be valid?
2. In a one-way ANOVA, if the Sig. or p-value is greater than the level of significance, you:
a. Reject Ho because there is evidence all the means differ
b. Reject Ho because there is evidence at least one of the means differs from the others
c. Do not reject Ho because there is no evidence of a difference in the means
d. Do not reject Ho because one mean is different from the others
3. In a one-way ANOVA, the null hypothesis is always:
a. All the population means are different
b. Some of the population means are different
c. Some of the population means are the same
d. All of the population means are the same
The following should be used to answer Question 4 through 7
4. Three groups of students involved in the same curriculum are required to study for 15 minutes, 30 minutes, or 45 minutes a night for 8 weeks before taking a mathematics test. Their scores are as follows:
15 minutes: 43 39 55 56 73
30 minutes: 55 58 66 79 62
45 minutes: 51 66 85 86 89
a. Are the differences statistically significant? (Conduct an ANOVA test.) Answer: F = 3.354
b. If F is significant, which group(s) is (are) significantly different from which? (See Multiple Comparisons.)
c. Which group is better?
e. What is the independent or factor variable, and what is the data scale of the independent variable?
f. What is the dependent variable, and what is the data scale of the dependent variable?
5. A random sample of 15 nations from three levels of development has been selected. “Least developed” nations are largely agricultural and have the lowest quality of life. “Developed” nations are industrial and the most affluent and modern. “Developing” nations are between these extremes. Are these general characteristics reflected in differences in life expectancy (the number of years the average citizen can expect to live at birth) between the three categories?
The data for 15 nations:
Least developed                 Developing                       Developed
Nation     Life expectancy      Nation        Life expectancy    Nation           Life expectancy
Cambodia   56.8                 China         71.6               Australia        79.9
Mali       47                   Indonesia     68.3               Belgium          78
Nepal      58.2                 Pakistan      61.5               Japan            80.8
Niger      41.6                 South Korea   74.7               Russia           67.3
Sudan      56.9                 Turkey        71.2               United Kingdom   77.8
Source: U.S. Bureau of the Census 2003. Statistical Abstract of the United States, 2002. P. 829. Washington, D.C.: U.S. Government Printing Office.
Are there statistically significant differences in life expectancy between nations at different levels of economic development? Answer: F=22.048, sig =.000
6. What type of person is most involved in the neighborhood and the community? Who is more likely to volunteer for an organization such as Umuganda, Scouts or Little League? Random samples of 15 people were asked for the number of times they participated during 6 months in the voluntary community organization. What differences are significant?
What is the p_value for those tests? Interpret it and tell what decision you would make from it.
Number of times they participated, by education:
Less than High School   High School   College
0                       1             0
1                       3             3
2                       3             4
3                       4             4
4                       5             4
Number of times they participated, by length of residence in present community:
Less than 2 years   5 years   More than 5 years
0                   0         1
1                   2         3
3                   3         3
4                   4         4
4                   5         4
Number of times they participated, by number of children:
None   One Child   More Than One Child
0      2           0
1      3           3
1      4           4
3      4           4
3      4           5
Answers:
Membership by education: F = .805, sig .470
Membership by length of residence in present community: F = .165, sig .850
Membership by number of children: F = 2.32, sig .141