5 Analysis of Variance models, complex linear models and Random effects models
In this chapter we will not show any of the theoretical background of the analysis. The focus is on practising how to set up ANOVA models in GenStat. GenStat comes with a very extensive help system and, in addition, several PDF files which serve as documentation and reference. You should also consult secondary literature on statistical modelling to use these procedures properly.
At the beginning of each chapter you will find a short introduction on how to generate the example field trials in GenStat. GenStat is also good software for creating field trials, although it is somewhat limited regarding the number of treatments. For complex designs, specialised software should be used.
The design creation in GenStat always gives good advice on how the analysis model for a certain design should be set up. In any case you should consult a biometrician.
5.1 Basic syntax of ANOVA models
Table 5: Notation of ANOVA models
Operator   Example   Meaning
+          A+B       Main effects of A and B
.          A.B       Interaction of A and B only
*          A*B       = A+B+A.B, factorial structure
/          A/B       = A+A.B, B nested within A (no main effect of B)
Table 6: Examples of ANOVA Models
A*B*C              = A+B+C+A.B+A.C+B.C+A.B.C (full factorial model)
(A+B)*(C+D)        = (A+B)+(C+D)+(A+B).(C+D)
                   = A+B+C+D+A.C+A.D+B.C+B.D
Block/Plot/Subplot = Block+Block.Plot+Block.Plot.Subplot
A/(B*C)            = A+A.B+A.C+A.B.C
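The operator rules of Table 5 can be checked mechanically. The following is a minimal Python sketch (not GenStat code) that expands the examples of Table 6; the helper names `factor`, `plus`, `dot`, `cross`, `nest` and `show` are hypothetical and only illustrate the rules as stated in the tables.

```python
# A model is a list of terms; a term is a tuple of factor names.
# All helper names are hypothetical and only mirror Table 5.

def factor(name):
    return [(name,)]

def plus(left, right):
    """A+B: all terms of both sides, duplicates dropped, order kept."""
    out = list(left)
    for term in right:
        if term not in out:
            out.append(term)
    return out

def dot(left, right):
    """A.B: every term on the left combined with every term on the right."""
    out = []
    for l in left:
        for r in right:
            term = tuple(dict.fromkeys(l + r))  # merge factors, drop repeats
            if term not in out:
                out.append(term)
    return out

def cross(left, right):
    """A*B = A + B + A.B"""
    return plus(plus(left, right), dot(left, right))

def nest(left, right):
    """A/B = A + A.B, where the second A means all left-hand factors combined."""
    all_left = tuple(dict.fromkeys(f for term in left for f in term))
    return plus(left, dot([all_left], right))

def show(terms):
    """Write main effects first, then two-factor terms, and so on."""
    return "+".join(".".join(t) for t in sorted(terms, key=len))

A, B, C, D = (factor(name) for name in "ABCD")
Block, Plot, Subplot = factor("Block"), factor("Plot"), factor("Subplot")

print(show(cross(cross(A, B), C)))           # A*B*C
print(show(cross(plus(A, B), plus(C, D))))   # (A+B)*(C+D)
print(show(nest(nest(Block, Plot), Subplot)))  # Block/Plot/Subplot
print(show(nest(A, cross(B, C))))            # A/(B*C)
```

The four printed lines should agree with the long forms in Table 6, which is a quick way to confirm a model formula before typing it into GenStat.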
Recommendation: Syntax of ANOVA Models
You can reuse any model from the “Input Log” window: copy it into an extra script window, then edit the model to suit your needs. Whenever you are not completely sure about the syntax of the model, you should use the long form, written with “ + ” and “ . ” only.
5.2 Analysis example: Potato yield – Latin square
Please restart the GenStat Server via “Restart Server” (see 1.3.2).
5.2.1 Create Design
To create a design in GENSTAT you use the menu „Stats -> Design -> Generate Standard Design“.
Please choose the base design “Latin Square” from the pull-down menu of the dialog box “Generate a Standard Design”. Also enter names for the row, column and treatment factors as well as the “Number of Levels”:
Graph 102: Create a Latin Square Design
Click the “Run” button and the design will be created in the form of a spreadsheet. What is special about this spreadsheet is that the information about the analysis model is saved within it. You can check the power of the design after you have clicked “Run”: go back to the design creation dialog and click the “Check for Power” button, which is now visible. You need some basic information, namely the hypothesised mean difference (“Size of difference to detect”) and a measure of the residual variability (“Residual Mean Square”). If the power is below 80%, you should rethink the design; adding replications is one way to increase the power.
Please insert a new column for the response values “Yield” via “Spread -> Insert -> Column after current column” and enter some data. Now open the menu “Stats -> Analysis of Variance -> General...”.
You can save the design spreadsheet for later use via the menu „File“ and the „Save“ dialog. You should close the design after checking the results.
5.2.2 Enter or load data
Please restart the GenStat Server via “Restart Server” (see 1.3.2) and load the file “Latin_Square_Data_Potato_Yield.xls” using the Excel Import Wizard (see 2.1).
You should convert Zeile (row), Spalte (column) and Sorte (variety) to factor variables; Ertrag (yield) is the response. The following data will be loaded:
Table 7: data set „Latin_Square_Data_Potato_Yield.xls“
Zeile Spalte Sorte Ertrag
1 1 C 22
1 2 B 20
1 3 A 39
1 4 D 27
1 5 E 34
2 1 E 29
2 2 D 29
2 3 C 25
2 4 A 30
2 5 B 23
3 1 A 29
3 2 E 25
3 3 D 34
3 4 B 26
3 5 C 27
4 1 B 23
4 2 A 27
4 3 E 27
4 4 C 32
4 5 D 41
5 1 D 33
5 2 C 21
5 3 B 24
5 4 E 30
5 5 A 33
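Before analysing the data it is worth confirming that the layout really is a Latin square, that is, every variety occurs exactly once in each row and each column. A minimal Python sketch of this check follows; the variable names are hypothetical, and the layout strings are copied from the Sorte column of Table 7 (one string per Zeile, one letter per Spalte).

```python
# Varieties (Sorte) from Table 7, one string per row (Zeile).
layout = ["CBADE", "EDCAB", "AEDBC", "BAECD", "DCBEA"]
varieties = set("ABCDE")

# Each row and each column must contain every variety exactly once.
rows_ok = all(len(row) == 5 and set(row) == varieties for row in layout)
cols_ok = all(set(col) == varieties for col in zip(*layout))

print("rows ok:", rows_ok, "columns ok:", cols_ok)
```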
5.2.3 Analysis
Graph 103: Analysis of Variance : General
Please start the ANOVA via “Stats -> Analysis of Variance -> General”.
In the following dialog you specify the analysis model.
Graph 104: Latin square ANOVA model
The response „Y-Variate“ of the model is the yield which is named „Ertrag“ in this data file.
The treatment or independent variable is the variety, named „Sorte“.
Built into every Latin square model is the block structure rows*columns, here called „Zeile*Spalte“. Please see chapter 5.1 for more details on the usage of “*”, “.” and “/” in setting up ANOVA models.
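For orientation, the additive model behind this setup can be written out as follows. This is a sketch in standard textbook notation; the symbols are assumptions and do not appear in the GenStat dialogs.

```latex
y_{ij} = \mu + \rho_i + \gamma_j + \tau_{k(i,j)} + \varepsilon_{ij},
\qquad i, j = 1, \dots, 5,
\qquad \varepsilon_{ij} \sim N(0, \sigma^2)
```

Here $y_{ij}$ is the yield (Ertrag) in row $i$ (Zeile) and column $j$ (Spalte), $\mu$ is the grand mean, $\rho_i$ and $\gamma_j$ are the row and column effects, and $\tau_{k(i,j)}$ is the effect of the variety (Sorte) $k$ that the design places in that cell.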
Graph 105: ANOVA Options
Please specify all other settings as shown in Graph 105 and Graph 106, then click “Run”.
In any case you should activate the graphical output as an additional option.
Graph 106: ANOVA Save
After you have “Run” the analysis once, you can click the “Save” button in the “Analysis of Variance” dialog.
To be able to check the assumption of normality of the residuals and run the appropriate test you need to save the residuals first.
To be able to run a multiple comparison test or pairwise means comparison you need to save the means first.
The following script listings are automatically generated as commands in the Input Log (see script listing 4 and script listing 5). You can save the Input Log to reuse the command sequence later.
script listing 4: Create One Way ANOVA Output
"General Analysis of Variance."
BLOCK "No Blocking"
TREATMENTS Sorte+Spalte+Zeile
COVARIATE "No Covariate"
ANOVA [PRINT=aovtable,information,means,%cv; FACT=1; CONTRASTS=7; FPROB=yes; PSE=diff,\
means] Ertrag
APLOT [RMETHOD=simple] fitted,normal,halfnormal,histogram
AGRAPH [METHOD=means]
script listing 5: Saving results of the One Way ANOVA
DELETE [REDEFINE=yes] Kartoffel_Meantab
AKEEP [RESIDUAL=Kartoffel_Residuals; FACT=32] Sorte; MEANS=Kartoffel_Meantab
FSPREADSHEET [SHEET=29548864; METHOD=replace] Kartoffel_Residuals
FSPREADSHEET Kartoffel_Meantab
GenStat creates diagnostic plots and means plots for the treatment automatically. The diagnostics for this example look very good and support the statement that the data comply with the assumption of normal residuals without any further statistical analysis. In the Normal plot as well as in the Half-Normal plot a few values stand out. These are also reported in the list of “large residuals” in the output.
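To see where the residuals come from, they can be recomputed by hand. The following minimal Python sketch (not GenStat code, variable names hypothetical) uses the data of Table 7 and the usual Latin square fitted value, row mean + column mean + variety mean - 2 x grand mean. Units are numbered in the order of the data file.

```python
# Data from Table 7: varieties (Sorte) and yields (Ertrag), one entry per row (Zeile).
layout = ["CBADE", "EDCAB", "AEDBC", "BAECD", "DCBEA"]
yields = [[22, 20, 39, 27, 34],
          [29, 29, 25, 30, 23],
          [29, 25, 34, 26, 27],
          [23, 27, 27, 32, 41],
          [33, 21, 24, 30, 33]]
n = 5

grand = sum(map(sum, yields)) / n**2
row_mean = [sum(row) / n for row in yields]
col_mean = [sum(col) / n for col in zip(*yields)]
var_mean = {v: sum(yields[i][j] for i in range(n) for j in range(n)
                   if layout[i][j] == v) / n
            for v in "ABCDE"}

# Residual = observed - fitted, in data-file order (unit 1 = Zeile 1, Spalte 1).
residuals = [yields[i][j]
             - (row_mean[i] + col_mean[j] + var_mean[layout[i][j]] - 2 * grand)
             for i in range(n) for j in range(n)]

# The two units with the largest absolute residuals.
largest = sorted(range(1, n**2 + 1), key=lambda u: -abs(residuals[u - 1]))[:2]
for unit in sorted(largest):
    print(f"unit {unit}: residual {residuals[unit - 1]:.2f}")
```

The two units found this way are the ones GenStat reports under “large residuals”; both lie in the first row of the field.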
Graph 107: Example Potato yields – Diagnostic plots
Graph 108: Potato yields – Means
The result of the numerical analysis is listed in the following output.
Output 5: Result of the One Way ANOVA
Analysis of variance
Variate: Ertrag
Source of variation d.f. s.s. m.s. v.r. F pr.
Sorte 4 330.00 82.50 5.64 0.009
Spalte 4 150.00 37.50 2.56 0.093
Zeile 4 20.40 5.10 0.35 0.840
Residual 12 175.60 14.63
Total 24 676.00
Message: the following units have large residuals.
*units* 3 6.00
*units* 4 -6.40
Tables of means
Variate: Ertrag
Grand mean 28.40
Sorte A B C D E
31.60 23.20 25.40 32.80 29.00
Spalte 1 2 3 4 5
27.20 24.40 29.80 29.00 31.60
Zeile 1 2 3 4 5
28.40 27.20 28.20 30.00 28.20
Standard errors of means
Table Sorte Spalte Zeile
rep. 5 5 5
d.f. 12 12 12
e.s.e. 1.711 1.711 1.711
Standard errors of differences of means
Table Sorte Spalte Zeile
rep. 5 5 5
d.f. 12 12 12
s.e.d. 2.419 2.419 2.419
Stratum standard errors and coefficients of variation
Variate: Ertrag
d.f. s.e. cv%
12 3.825 13.5
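The figures in Output 5 can be reproduced from the raw data. The following is a minimal Python sketch (not GenStat code, variable names hypothetical) of the sums of squares, variance ratios and standard errors; the F probabilities are left out because they need an F distribution function that the standard library does not provide.

```python
from math import sqrt

# Data from Table 7: varieties (Sorte) and yields (Ertrag), one entry per row (Zeile).
layout = ["CBADE", "EDCAB", "AEDBC", "BAECD", "DCBEA"]
yields = [[22, 20, 39, 27, 34],
          [29, 29, 25, 30, 23],
          [29, 25, 34, 26, 27],
          [23, 27, 27, 32, 41],
          [33, 21, 24, 30, 33]]
n = 5

grand = sum(map(sum, yields)) / n**2
row_mean = [sum(row) / n for row in yields]
col_mean = [sum(col) / n for col in zip(*yields)]
var_mean = [sum(yields[i][j] for i in range(n) for j in range(n)
                if layout[i][j] == v) / n for v in "ABCDE"]

def ss(means):
    """Sum of squares of a factor with n replicates per level."""
    return n * sum((m - grand) ** 2 for m in means)

ss_sorte, ss_spalte, ss_zeile = ss(var_mean), ss(col_mean), ss(row_mean)
ss_total = sum((y - grand) ** 2 for row in yields for y in row)
ss_resid = ss_total - ss_sorte - ss_spalte - ss_zeile

df_factor, df_resid = n - 1, (n - 1) * (n - 2)
ms_resid = ss_resid / df_resid
vr = {name: (s / df_factor) / ms_resid
      for name, s in [("Sorte", ss_sorte), ("Spalte", ss_spalte), ("Zeile", ss_zeile)]}

se = sqrt(ms_resid)            # stratum standard error
ese = sqrt(ms_resid / n)       # standard error of a mean
sed = sqrt(2 * ms_resid / n)   # standard error of a difference of two means
cv = 100 * se / grand          # coefficient of variation in percent

for name, ratio in vr.items():
    print(f"{name}: v.r. = {ratio:.2f}")
print(f"Residual m.s. = {ms_resid:.2f}, s.e. = {se:.3f}, cv% = {cv:.1f}")
print(f"e.s.e. = {ese:.3f}, s.e.d. = {sed:.3f}")
```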
5.2.4 Pairwise comparison
It is not possible in GenStat to run pairwise LS means comparisons using the menus. If the code for LS-means comparisons is generated manually it works fine. This chapter shows an example. Before you can apply the script to a specific analysis you have to find some information in the previous output and paste that into the script.
You have to supply the name of the table of means and information about the standard deviation and degrees of freedom. Besides the overly conservative Bonferroni method you can also run Tukey and Sidak tests.
VSN knows about this problem and will add this option to Version 10 of GENSTAT.
script listing 6: Create a pairwise comparison of means
Data from table „Stratum standard errors and coefficients of variation“:
VARIANCE = s.e. from the Output, squared (here 3.825 squared = 14.631)
DF = d.f. from the Output
ALLPAIRWISE [METHOD=Bonferroni; DIRECTION=descending; PROBABILITY=0.05]\
MEANS=Kartoffel_Meantab; REPLICATION=5; VARIANCE=14.631; DF=12
Output 6: Result of the pairwise comparison on the basis of a One Way ANOVA
All pairwise comparisons are tested.
Variance = 14.6310 with 12 degrees of freedom
Bonferroni test
Experimentwise error rate = 0.0500 Comparisonwise error rate = 0.0050
D E 1.571 No
D C 3.059 No
D B 3.968 Yes
A E 1.075 No
A C 2.563 No
A B 3.472 Yes
E C 1.488 No
E B 2.398 No
C B 0.909 No
Identifier   Mean
D           32.80  |
A           31.60  |
E           29.00  | |
C           25.40  | |
B           23.20    |
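The test statistics in Output 6 are the absolute mean differences divided by the standard error of a difference, and the Bonferroni method divides the experimentwise error rate by the number of comparisons. The following minimal Python sketch (not GenStat code, variable names hypothetical) reproduces this. The critical value 3.428 is an assumption taken from standard t tables (two-sided, 0.005, 12 d.f.), because the standard library has no t distribution function.

```python
from itertools import combinations
from math import sqrt

# Values passed to ALLPAIRWISE in script listing 6.
means = {"D": 32.80, "A": 31.60, "E": 29.00, "C": 25.40, "B": 23.20}
variance, replication, df = 14.631, 5, 12

sed = sqrt(2 * variance / replication)   # standard error of a difference
pairs = list(combinations(means, 2))     # all 10 pairs of 5 varieties
alpha_comparison = 0.05 / len(pairs)     # Bonferroni comparisonwise error rate

# Assumption: two-sided critical t value for alpha = 0.005 and 12 d.f.,
# read from a standard t table.
T_CRIT = 3.428

t_stat = {(a, b): abs(means[a] - means[b]) / sed for a, b in pairs}
significant = [pair for pair in pairs if t_stat[pair] > T_CRIT]

print(f"comparisonwise error rate = {alpha_comparison:.4f}")
for (a, b), t in t_stat.items():
    print(a, b, f"{t:.3f}", "Yes" if (a, b) in significant else "No")
```

Only D and A differ significantly from B, which is what the bars next to the table of means express: means that share a bar do not differ significantly.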
5.2.5 Test if data are Normal
Graph 109: Menu to Test if data are Normal
The Normality test is started via the Graph menu. Please specify the options as shown in Graph 110.
Graph 110: Options of the Normality test
The script command is automatically created by GenStat:
script listing 7: Graphical Test of Normality
DPROBABILITY [PRINT=parameters,tests;DISTRIBUTION=NORMAL;METHOD=quantile;QMETHOD=standardized;\
BANDS=simultaneous;ALPHA=0.95;PLOT=reference] Kartoffel_Residuals
Output 7: Numerical results of the Normality tests
Critical values of test statistics (marginal tests)
Test statistic 15% 10% 5% 2.5% 1%
Anderson-Darling 0.576 0.656 0.787 0.918 1.092
Cramer-von Mises 0.091 0.104 0.126 0.148 0.178
Watson 0.085 0.096 0.116 0.136 0.163
Marginal tests
Variate Anderson-Darling Cramer-von Mises Watson
1 0.3176 0.0415 0.0415
?, *, ** indicate significance at 10%, 5% and 1% levels respectively
The graphical analysis shows that all residuals lie within the confidence limits, which indicates that the residuals follow a Gaussian normal distribution. This is consistent with the numerical analysis: all three test statistics are smaller than even their 15% critical values, so normality is not rejected.
Graph 111: Graphical Output of the Normality test
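To illustrate what a statistic such as Anderson-Darling measures, the following minimal Python sketch (not GenStat code, variable names hypothetical) computes the textbook form of the statistic for the residuals of the Latin square model. It assumes standardisation by the sample mean and the sample standard deviation with divisor n - 1; GenStat may standardise differently or apply a small-sample modification, so the value need not agree exactly with Output 7.

```python
from math import erf, log, sqrt

# Data from Table 7: varieties (Sorte) and yields (Ertrag), one entry per row (Zeile).
layout = ["CBADE", "EDCAB", "AEDBC", "BAECD", "DCBEA"]
yields = [[22, 20, 39, 27, 34],
          [29, 29, 25, 30, 23],
          [29, 25, 34, 26, 27],
          [23, 27, 27, 32, 41],
          [33, 21, 24, 30, 33]]
n = 5

grand = sum(map(sum, yields)) / n**2
row_mean = [sum(row) / n for row in yields]
col_mean = [sum(col) / n for col in zip(*yields)]
var_mean = {v: sum(yields[i][j] for i in range(n) for j in range(n)
                   if layout[i][j] == v) / n for v in "ABCDE"}
residuals = [yields[i][j]
             - (row_mean[i] + col_mean[j] + var_mean[layout[i][j]] - 2 * grand)
             for i in range(n) for j in range(n)]

# Standardise (assumption: sample mean and sample s.d. with divisor m - 1).
m = len(residuals)
mean = sum(residuals) / m
sd = sqrt(sum((r - mean) ** 2 for r in residuals) / (m - 1))
z = sorted((r - mean) / sd for r in residuals)

# Standard normal distribution function at the ordered standardised residuals.
cdf = [0.5 * (1 + erf(v / sqrt(2))) for v in z]

# Anderson-Darling statistic:
# A2 = -m - (1/m) * sum_{i=1..m} (2i - 1) * [ln F(z_i) + ln(1 - F(z_{m+1-i}))]
a2 = -m - sum((2 * i + 1) * (log(cdf[i]) + log(1 - cdf[m - 1 - i]))
              for i in range(m)) / m

print(f"Anderson-Darling A2 = {a2:.4f}")
```

The smaller the statistic, the closer the empirical distribution of the residuals is to the normal distribution; it is compared with the critical values listed at the top of Output 7.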