
RESEARCH STRATEGY 6.1 Introduction

6.7 Data Analysis

6.7.7 Two-Way factorial analysis of variance

The main computations in the prospective study would be derived from analysis of variance. Technically, the appropriate model is called a two-way between-groups (independent groups) factorial analysis of variance (Kirkpatrick & Feeney, 2007, pp. 49-57). This advanced statistical method is powerful because it assesses main effects and interaction effects while also accounting for error factors. One issue that researchers have to contend with is that results on computer printouts and textbook explanations vary somewhat, although more experienced researchers would notice the similarities. The discussion that follows is a general overview of the method that introduces willing learners to the computations and concepts associated with three-way classification analysis of variance (Ferguson & Takane, 1989, pp. 297-320). Calculation procedures are done in six steps that are illustrated and explained by means of one univariate frequency table, several cross-tabulations, and an array of formulas that are presented and discussed sequentially as the statistical model evolves. The intact data set, irrespective of being ordered or unordered, is used for the calculations (see Table 6.2).

                                       1        2        3        4        5
Long-term    Frequency                 f11      f12      f13      f14      f15      Row freq1
contract     Row percentage            %R11     %R12     %R13     %R14     %R15     Row%
             Column percentage         %C11     %C12     %C13     %C14     %C15     Column freq2
             Percentage of sample      Tots11   Tots12   Tots13   Tots14   Tots15   Total%1
Short-term   Frequency                 f21      f22      f23      f24      f25      Row Total2
contract     Row percentage            %R21     %R22     %R23     %R24     %R25
             Column percentage         %C21     %C22     %C23     %C24     %C25     Column freq2
             Percentage of sample      Tots21   Tots22   Tots23   Tots24   Tots25   Total%2
                                       Row f1   Row f2   Row f3   Row f4   Row f5   Grand Total

Table 6.2 Listing of Individual Scores on the First Dependent Variable

Step 1 requires the intact data set. A key statistic in analysis of variance is the grand mean, which is calculated across all N observations (N being the number of subjects involved in the study). The raw scores of the 128 subjects (X1 to X128) are added up and the sum is divided by N. The formula for the grand mean is:

$$ \bar{X} = \frac{\sum_{i=1}^{N} X_i}{N} \qquad \text{(Formula 12)} $$

In step 2 the researcher uses the grand mean to calculate the total sum of squares, which is denoted as SS Total. The formula for this statistic reads:

$$ SS_{\text{Total}} = \sum_{i=1}^{N} (X_i - \bar{X})^2 \qquad \text{(Formula 13)} $$

The grand mean is subtracted from the raw score of every subject (X1 to X128). The 128 difference scores are squared and added up. SS Total can then be partitioned into its additive parts. The degrees of freedom of SS Total are equal to N – 1.
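Steps 1 and 2 can be sketched in a few lines of Python. This is a minimal illustration rather than the study's actual procedure: the function names and the five scores are hypothetical stand-ins for the 128 raw scores X1 to X128.

```python
from statistics import fmean


def grand_mean(scores):
    """Formula 12: the sum of all raw scores divided by N."""
    return fmean(scores)


def ss_total(scores):
    """Formula 13: sum of squared deviations of every score from the grand mean."""
    mean = grand_mean(scores)
    return sum((x - mean) ** 2 for x in scores)


# Hypothetical raw scores (assumption: the real data set has N = 128).
scores = [4.0, 6.0, 8.0, 10.0, 12.0]

print("Grand mean:", grand_mean(scores))
print("SS Total:", ss_total(scores))
print("Degrees of freedom (N - 1):", len(scores) - 1)
```

For these five illustrative scores the deviations from the grand mean are -4, -2, 0, 2 and 4, so the squared deviations add up to 40 with 4 degrees of freedom.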

From this point onward, the intact (original) data set is split up and rearranged in terms of the three independent variables: firstly, according to the two contract subgroups, then in terms of the two gender subgroups, and finally according to the three age cohort groups. This split-up is illustrated in Table 6.3.

Total Sample     X1 . XN     N = 128     Grand Mean

Note that the intact data set was rearranged into three subsets, in accordance with the categories of the three independent variables. This data transformation would assist the researcher in computing statistical outputs that could answer three critical questions with regard to the effects of the independent variables on the dependent variable:

• Did the test scores of executives with long-term contracts differ significantly from the test scores of their fellow executives with short-term contracts?

• Did the test scores of male executives differ significantly from the test scores of female executives?

• Did the test scores of executives in the 21-40 year, 41-50 year, and ≥ 51 year age cohort groups differ significantly?

Table 6.3. Listing of Individual Scores on the First Dependent Variable, According to the Categories of the Independent Variables Contract Term, Gender, and Age Cohort Group

Main Effects: Independent Variables

Contract Term                       Gender
Long-Term         Short-Term        Male              Female
X1                X1                X1                X1
X2                X2                X2                X2
.                 .                 .                 .
XN                XN                XN                XN
Nj1 = 76          Nj2 = 52          Nj1 = 86          Nj2 = 42
Mean L-T          Mean S-T          Mean M            Mean F
Subgroup                            Subgroup
Mean T.1.1        Mean T.1.2        Mean T.2.1        Mean T.2.2

Age Cohort Group
21-40 Years       41-50 Years       ≥ 51 Years
X1                X1                X1
X2                X2                X2
XN                XN                XN
Nj1 = 54          Nj2 = 42          Nj3 = 32
Mean 21-40        Mean 41-50        Mean ≥ 51
Subgroup

The categories of the three independent variables represent the main effects of the study. Since statistical manipulation of one independent variable did not involve either of the two remaining independent variables, testing for statistical significance was restricted to examination of the differences among the categories of that variable. In step 3, three formulas, one for each independent variable, are used:

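$$ SS_{\text{Contract}} = n_{C} \sum (\bar{X}_{C} - \bar{X})^2 \qquad \text{(Formula 14)} $$

$$ SS_{\text{Gender}} = n_{G} \sum (\bar{X}_{G} - \bar{X})^2 \qquad \text{(Formula 15)} $$

$$ SS_{\text{Age Cohort Group}} = n_{A} \sum (\bar{X}_{A} - \bar{X})^2 \qquad \text{(Formula 16)} $$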

The three subgroup means for the above data split are (mean L-T + mean S-T)/2, (mean M + mean F)/2, and (mean 21-40 + mean 41-50 + mean ≥ 51)/3. The three sums of squares for the independent variables (SS IV) are designated as SS Contract, SS Gender, and SS Age Cohort Group.
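The step 3 calculation for one independent variable might be sketched as follows. The sketch uses the general size-weighted form, in which each squared deviation is multiplied by the size of its own category; this is an assumption on our part, chosen because it reduces to a single multiplier outside the summation when all categories are of equal size. The function name and the scores are hypothetical.

```python
from statistics import fmean


def ss_main_effect(groups):
    """Sum of squares for one independent variable.

    `groups` maps a category label to the list of scores in that category.
    Each (category mean - grand mean)^2 is weighted by the category size.
    """
    all_scores = [x for scores in groups.values() for x in scores]
    grand = fmean(all_scores)
    return sum(len(s) * (fmean(s) - grand) ** 2 for s in groups.values())


# Hypothetical scores split by Contract Term (two categories of equal size).
contract = {
    "long_term": [10.0, 12.0, 14.0],
    "short_term": [4.0, 6.0, 8.0],
}

print("SS Contract:", ss_main_effect(contract))
print("Degrees of freedom (categories - 1):", len(contract) - 1)
```

Here the category means are 12 and 6 and the grand mean is 9, so each category contributes 3 x 9 = 27 and the sum of squares is 54. The same function would apply unchanged to the Gender and Age Cohort Group splits.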

The pending study examined three biographical variables: the primary variable Contract Term (Long-Term or Short-Term), the secondary variable Gender (Male and Female), and the tertiary variable Age Cohort Group (categories 21-40 Years, 41-50 Years, and ≥ 51 Years). The main effects determine whether the differences between the two or more categories of each biographical variable are statistically significant or not. The data split for the three main effects is illustrated in Table 6.3.

The calculations in step 4 are interim procedures that are related to analyses of the sets of interactions among the independent variables examined in the current study. The following formulas are used for this purpose:

The calculation procedures for the interim statistics are similar, although the number of categories per variable might differ. The grand mean is subtracted from each of the two or three group means; the differences are then squared and subsequently multiplied by the number of subjects (N) and the number of categories (g) of the specific independent variable.

A second set of interim statistics is calculated in step 5 to yield the sum of squares for the cells. The formula reads as follows:

$$ SS_{\text{Cells}} = n \sum (\bar{X}_{\text{cell}} - \bar{X})^2 \qquad \text{(Formula 17)} $$

SS Cells is derived from the above formula. The grand mean is subtracted from the seven category means, namely the means for L-T, S-T, M, F, 21-40, 41-50 and ≥ 51; the differences are then squared and multiplied by n.

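In step 6 the interaction effects are calculated by means of the following formulas:

$$ SS_{\text{Contract} \times \text{Gender}} = SS_{\text{Cells}} - SS_{\text{Contract}} - SS_{\text{Gender}} \qquad \text{(Formula 18)} $$

$$ SS_{\text{Contract} \times \text{Age Cohort Group}} = SS_{\text{Cells}} - SS_{\text{Contract}} - SS_{\text{Age Cohort Group}} \qquad \text{(Formula 19)} $$

$$ SS_{\text{Gender} \times \text{Age Cohort Group}} = SS_{\text{Cells}} - SS_{\text{Gender}} - SS_{\text{Age Cohort Group}} \qquad \text{(Formula 20)} $$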
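The subtraction in Formula 18 can be illustrated with a small balanced example. The sketch below rests on two assumptions that are ours rather than the study's: SS Cells is computed over the cells formed by crossing the two variables in question (here Contract Term and Gender), and every cell holds the same number of scores. The helper names and the data are hypothetical.

```python
from statistics import fmean


def ss_between(groups, grand):
    """Size-weighted sum of squared deviations of group means from the grand mean."""
    return sum(len(s) * (fmean(s) - grand) ** 2 for s in groups.values())


def pool(cells, position):
    """Collapse the cells onto one variable (0 = contract, 1 = gender)."""
    pooled = {}
    for key, scores in cells.items():
        pooled.setdefault(key[position], []).extend(scores)
    return pooled


# Hypothetical balanced data: (contract, gender) -> scores, two scores per cell.
cells = {
    ("long", "male"): [10.0, 12.0],
    ("long", "female"): [14.0, 16.0],
    ("short", "male"): [4.0, 6.0],
    ("short", "female"): [2.0, 4.0],
}

grand = fmean([x for scores in cells.values() for x in scores])
ss_cells = ss_between(cells, grand)
ss_contract = ss_between(pool(cells, 0), grand)
ss_gender = ss_between(pool(cells, 1), grand)

# Formula 18: what remains of SS Cells after removing both main effects.
ss_contract_x_gender = ss_cells - ss_contract - ss_gender

print("SS Cells:", ss_cells)
print("SS Contract:", ss_contract)
print("SS Gender:", ss_gender)
print("SS Contract x Gender:", ss_contract_x_gender)
```

With these illustrative scores, SS Cells is 182, SS Contract is 162 and SS Gender is 2, which leaves 18 for the interaction. With unequal cell sizes this simple subtraction no longer partitions the variance cleanly, which is why statistical packages fall back on general linear modelling in that case.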

Table 6.4 demonstrates analysis of variance of a three-way classification.

Table 6.4. Listing of Individual Scores on the Interaction Contract x Gender, Contract Term x Age Cohort Group, and Gender x Age Cohort Group

Interaction Effects: Independent Variables

Contract Term x Gender and Contract Term x Age Cohort Group

Contract    Gender                              Age Cohort Group
Term        Male              Female            21-40 Years       41-50 Years       ≥ 51 Years
Long-Term   X1                X1                X1                X1                X1
            X2                X2                X2                X2                X2
            .                 .                 .                 .                 .
            XN                XN                XN                XN                XN
            N L-T x M =       N L-T x F =       N L-T x 21-40 =   N L-T x 41-50 =   N L-T x ≥ 51 =
            Mean L-T x M      Mean L-T x F      Mean L-T x 21-40  Mean L-T x 41-50  Mean L-T x ≥ 51
Short-Term  X1                X1                X1                X1                X1
            X2                X2                X2                X2                X2
            .                 .                 .                 .                 .
            XN                XN                XN                XN                XN
            N S-T x M =       N S-T x F =       N S-T x 21-40 =   N S-T x 41-50 =   N S-T x ≥ 51 =
            Mean S-T x M      Mean S-T x F      Mean S-T x 21-40  Mean S-T x 41-50  Mean S-T x ≥ 51

Gender x Age Cohort Group

            Age Cohort Group
Gender      21-40 Years       41-50 Years       ≥ 51 Years
Male        X1                X1                X1
            X2                X2                X2
            .                 .                 .
            XN                XN                XN
            N M x 21-40 =     N M x 41-50 =     N M x ≥ 51 =
            Mean M x 21-40    Mean M x 41-50    Mean M x ≥ 51
Female      X1                X1                X1
            X2                X2                X2
            .                 .                 .
            XN                XN                XN
            N F x 21-40 =     N F x 41-50 =     N F x ≥ 51 =
            Mean F x 21-40    Mean F x 41-50    Mean F x ≥ 51

The degrees of freedom for the interaction effects are the number of categories of the first independent variable minus 1, multiplied by the number of categories of the second independent variable minus 1.
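As a worked check, the degrees of freedom of an interaction are conventionally the product of (number of categories minus 1) over the variables involved. The symbols a, b and c below are our own shorthand for the number of categories of Contract Term (2), Gender (2) and Age Cohort Group (3):

$$ df_{A \times B} = (a - 1)(b - 1) $$

$$ df_{\text{Contract} \times \text{Gender}} = (2 - 1)(2 - 1) = 1 $$

$$ df_{\text{Contract} \times \text{Age Cohort Group}} = (2 - 1)(3 - 1) = 2 $$

$$ df_{\text{Gender} \times \text{Age Cohort Group}} = (2 - 1)(3 - 1) = 2 $$

$$ df_{\text{Contract} \times \text{Gender} \times \text{Age Cohort Group}} = (2 - 1)(2 - 1)(3 - 1) = 2 $$

These values agree with the degrees of freedom listed for the interaction terms in Table 6.5.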

Analysis of variance of three-way classification is complex because the effects of three independent variables on a single dependent variable are examined. An interaction effect occurs when two or more independent variables produce a joint effect over and above their main effects (Whitley, Jr., 2002, pp. 204-207). Use of this statistical method assumes subsamples of equal size. Whenever this requirement is violated, the statistical method applies general linear modelling to estimate group and subgroup means.

Table 6.5. Example of Printout with Results of Multivariate Factorial Analysis of Variance

The transformation that the original data set has to undergo to compute statistics that demonstrate the influence of three main effect variables on a dependent variable. The higher level of complexity is clear.

Source                                  Type III Sum    Degrees of   Mean          F Ratio    Level of
                                        of Squares      Freedom      Square                   Significance
Corrected Model                         7873.418        11           414.390       1.313      0.191
Intercept                               123824.607      10           123824.607    392.248    0.000
Contract Term                           36.420          1            36.420        0.115      0.735
Gender                                  19.204          1            19.204        0.061      0.806
Age Cohort Group                        3181.374        2            636.275       2.016      0.082
Contract * Gender                       324.465         1            324.465       1.028      0.313
Contract * Age                          1873.043        2            468.261       1.483      0.212
Gender * Age                            1465.703        2            366.426       1.161      0.332
Contract * Gender * Age Cohort Group    1425.347        2            475.116       1.505      0.217
Error Factor                            34093.361       116          315.679
Total Sum of Squares                    322638.264      128
Corrected Sum of Squares

The primary calculations in the current study were done by means of analysis of variance of three-way classification (refer to Subsections 9.4.1 to 9.4.20 of Chapter 9). An example of a computer printout of an analysis of variance of a three-way classification is presented in Table 6.5.

The researcher examined the main effects and the two-way and three-way interaction effects of Length of Contract Term x Gender x Age Cohort Group on the 20 dependent variables that were selected for the purposes of the study.

Although the prospective study was, technically, a factorial analysis of variance of three-way classification (Ferguson & Takane, 1989, pp. 297-320; Howell, 2004, pp. 399-423), the computation procedures were relatively similar to those required for a two-way between-groups factorial analysis of variance design with independent group comparisons (Kirkpatrick & Feeney, 2007, pp. 49-57). The chosen version of analysis of variance is a powerful statistical method that analyses main effects, that is, effects that are attributed to a specific independent variable. The main effect of a specific independent variable is not influenced by the effects of any of the remaining independent variables in the data set. Interaction occurs when two or more independent variables combine to produce a joint effect over and above their main effects (Whitley, Jr., 2002, pp. 204-207). Use of this statistical method assumes subsamples of equal size. Whenever this requirement is violated, the statistical method applies general linear modelling to estimate group and subgroup means.

The statistics that are required for understanding and interpreting tables of two-way and three-way analysis of variance are the computed F ratio and its associated level of significance, both of which appear on the right-hand side of the ANOVA table. If the computed F ratio is equal to or greater than the F critical value, the contrast is judged to be statistically significant and appropriate for further analysis. In such a case, the significance level would be equal to or less than 0.05. If the computed F ratio is less than the F critical value, the numeric value of the level of significance would exceed p = 0.05. In this case, the contrast would be interpreted as not statistically significant. The critical F value for contrasts that involve Length of Contract Term and Gender is 3.91 with 1 and 127 degrees of freedom, provided that the level of significance is preset at 0.05 and one-tailed (directional) hypothesis testing is done. In the case of the main effect Age Cohort Group, the F critical value is 3.06 with 2 and 127 degrees of freedom, again with one-tailed hypothesis testing and the level of significance preset at 0.05 (the 5% level).
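The decision rule can be sketched as a short Python check. The mean squares are copied from the example printout in Table 6.5 and the critical values (3.91 and 3.06) from the text; the function names and the dictionary layout are our own illustrative choices.

```python
def f_ratio(ms_effect, ms_error):
    """F ratio: mean square of the effect divided by the error mean square."""
    return ms_effect / ms_error


def is_significant(f_value, f_critical):
    """An effect is judged significant when F is at or above the critical value."""
    return f_value >= f_critical


MS_ERROR = 315.679  # error mean square from the example printout

# effect name -> (mean square from the printout, critical F quoted in the text)
effects = {
    "Contract Term": (36.420, 3.91),
    "Gender": (19.204, 3.91),
    "Age Cohort Group": (636.275, 3.06),
}

results = {}
for name, (ms_effect, f_critical) in effects.items():
    f_value = f_ratio(ms_effect, MS_ERROR)
    results[name] = (round(f_value, 3), is_significant(f_value, f_critical))
    print(name, results[name])
```

None of the three main effects in the example printout reaches its critical value, which is consistent with the significance levels above 0.05 reported in Table 6.5.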