Environmental Scientists, Biologists, and Resource Managers
C. J. Schwarz
Department of Statistics and Actuarial Science, Simon Fraser University
[email protected]
1 In the beginning... 15
1.1 Introduction . . . 15
1.2 Effective note taking strategies . . . 16
1.3 It’s all Γρκ to me . . . 19
1.4 Which computer package? . . . 19
1.5 FAQ - Frequently Asked Question . . . 28
1.5.1 Accessing journal articles from home . . . 28
1.5.2 Downloading from the web . . . 28
1.5.3 Printing 2 pages per physical page and on both sides of the paper . . . 28
1.5.4 Is there an on-line textbook? . . . 29
2 Introduction to Statistics 30 2.1 TRRGET - An overview of statistical inference . . . 31
2.2 Parameters, Statistics, Standard Deviations, and Standard Errors . . . 34
2.2.1 A review . . . 34
2.2.2 Theoretical example of a sampling distribution . . . 39
2.3 Confidence Intervals . . . 41
2.3.1 A review . . . 42
2.3.2 Some practical advice . . . 48
2.3.3 Technical details . . . 49
2.4 Hypothesis testing . . . 50
2.4.1 A review . . . 50
2.4.2 Comparing the population parameter against a known standard . . . 51
2.4.3 Comparing the population parameter between two groups . . . 58
2.4.4 Type I, Type II and Type III errors . . . 62
2.4.5 Some practical advice . . . 65
2.4.6 The case against hypothesis testing . . . 66
2.4.7 Problems with p-values - what does the literature say? . . . 69
Statistical tests in publications of the Wildlife Society . . . 69
The Insignificance of Statistical Significance Testing . . . 69
Followups . . . 71
2.5 Meta-data . . . 71
2.5.1 Scales of measurement . . . 71
2.5.3 Roles of data . . . 74
2.6 Bias, Precision, Accuracy . . . 74
2.7 Types of missing data . . . 77
2.8 Transformations . . . 79
2.8.1 Introduction . . . 79
2.8.2 Conditions under which a log-normal distribution appears . . . 80
2.8.3 ln() vs. log() . . . 80
2.8.4 Mean vs. Geometric Mean . . . 81
2.8.5 Back-transforming estimates, standard errors, and ci . . . 82
Mean on log-scale back to MEDIAN on anti-log scale . . . 82
2.8.6 Back-transforms of differences on the log-scale . . . 83
2.8.7 Some additional readings on the log-transform . . . 84
2.9 Standard deviations and standard errors revisited . . . 95
2.10 Other tidbits . . . 104
2.10.1 Interpreting p-values . . . 104
2.10.2 False positives vs. false negatives . . . 104
2.10.3 Specificity/sensitivity/power . . . 104
3 Sampling 106 3.1 Introduction . . . 108
3.1.1 Difference between sampling and experimental design . . . 108
3.1.2 Why sample rather than census? . . . 109
3.1.3 Principle steps in a survey . . . 109
3.1.4 Probability sampling vs. non-probability sampling . . . 110
3.1.5 The importance of randomization in survey design . . . 111
3.1.6 Model vs. Design based sampling . . . 114
3.1.7 Software . . . 115
3.2 Overview of Sampling Methods . . . 115
3.2.1 Simple Random Sampling . . . 115
3.2.2 Systematic Surveys . . . 117
3.2.3 Cluster sampling . . . 120
3.2.4 Multi-stage sampling . . . 124
3.2.5 Multi-phase designs . . . 126
3.2.6 Panel design - suitable for long-term monitoring . . . 128
3.2.7 Sampling non-discrete objects . . . 129
3.2.8 Key considerations when designing or analyzing a survey . . . 129
3.3 Notation . . . 131
3.4 Simple Random Sampling Without Replacement (SRSWOR) . . . 131
3.4.1 Summary of main results . . . 132
3.4.2 Estimating the Population Mean . . . 133
3.4.3 Estimating the Population Total . . . 133
3.4.4 Estimating Population Proportions . . . 134
3.4.5 Example - estimating total catch of fish in a recreational fishery . . . 134
What is the population of interest? . . . 136
What is the frame? . . . 137
Excel analysis . . . 137
SASanalysis . . . 140
3.5 Sample size determination for a simple random sample . . . 141
3.5.1 Example - How many angling-parties to survey . . . 144
3.6 Systematic sampling . . . 147
3.6.1 Advantages of systematic sampling . . . 148
3.6.2 Disadvantages of systematic sampling . . . 148
3.6.3 How to select a systematic sample . . . 148
3.6.4 Analyzing a systematic sample . . . 149
3.6.5 Technical notes - Repeated systematic sampling . . . 149
Example of replicated subsampling within a systematic sample . . . 149
3.7 Stratified simple random sampling . . . 152
3.7.1 A visual comparison of a simple random sample vs. a stratified simple random sample154 3.7.2 Notation . . . 162
3.7.3 Summary of main results . . . 162
3.7.4 Example - sampling organic matter from a lake . . . 164
3.7.5 Example - estimating the total catch of salmon . . . 168
What is the population of interest? . . . 169
What is the sampling frame? . . . 169
What is the sampling design? . . . 170
Excel analysis . . . 170
SASanalysis . . . 173
When should the various estimates be used? . . . 175
3.7.6 Sample Size for Stratified Designs . . . 177
3.7.7 Allocating samples among strata . . . 180
3.7.8 Example: Estimating the number of tundra swans. . . 183
3.7.9 Post-stratification . . . 187
3.7.10 Allocation and precision - revisited . . . 189
3.8 Ratio estimation in SRS - improving precision with auxiliary information . . . 190
3.8.1 Summary of Main results . . . 192
3.8.2 Example - wolf/moose ratio . . . 192
Excel analysis . . . 194
SASAnalysis . . . 198
Post mortem . . . 201
3.8.3 Example - Grouse numbers - using a ratio estimator to estimate a population total . . 201
Excel analysis . . . 203
SASanalysis . . . 206
Post mortem - a question to ponder . . . 209
3.9 Additional ways to improve precision . . . 210
3.9.1 Using both stratification and auxiliary variables . . . 210
3.9.2 Regression Estimators . . . 211
3.9.3 Sampling with unequal probability - pps sampling . . . 211
3.10 Cluster sampling . . . 212
3.10.1 Sampling plan . . . 213
3.10.2 Advantages and disadvantages of cluster sampling compared to SRS . . . 219
3.10.4 Summary of main results . . . 220
3.10.5 Example - estimating the density of urchins . . . 221
Excel Analysis . . . 222
SASAnalysis . . . 225
Planning for future experiments . . . 227
3.10.6 Example - estimating the total number of sea cucumbers . . . 227
SASAnalysis . . . 231
3.11 Multi-stage sampling - a generalization of cluster sampling . . . 235
3.11.1 Introduction . . . 235
3.11.2 Notation . . . 236
3.11.3 Summary of main results . . . 237
3.11.4 Example - estimating number of clams . . . 238
Excel Spreadsheet . . . 240
SASProgram . . . 240
3.11.5 Some closing comments on multi-stage designs . . . 242
3.12 Analytical surveys - almost experimental design . . . 242
3.13 References . . . 246
3.14 Frequently Asked Questions (FAQ) . . . 247
3.14.1 Confusion about the definition of a population . . . 247
3.14.2 How is N defined . . . 248
3.14.3 Multi-stage vs. Multi-phase sampling . . . 248
3.14.4 What is the difference between a Population and a frame? . . . 249
3.14.5 How to account for missing transects. . . 249
4 Designed Experiments - Terminology and Introduction 250 4.1 Terminology and Introduction . . . 251
4.1.1 Definitions . . . 251
4.1.2 Treatment, Experimental Unit, and Randomization Structure . . . 252
4.1.3 The Three R’s of Experimental Design . . . 255
4.1.4 Placebo Effects . . . 257
4.1.5 Single and bouble blinding . . . 257
4.1.6 Hawthorne Effect . . . 258
4.2 Applying some General Principles of Experimental Design . . . 258
4.2.1 Experiment 1 . . . 259
4.2.2 Experiment 2 . . . 259
4.2.3 Experiment 3 . . . 259
4.2.4 Experiment 4 . . . 260
4.2.5 Experiment 5 . . . 260
4.3 Some Case Studies . . . 261
4.3.1 The Salk Vaccine Experiment . . . 261
4.3.2 Testing Vitamin C - Mistakes do happen . . . 262
4.4 Key Points in Design of Experiments . . . 262
4.4.1 Designing an Experiment . . . 263
4.4.2 Analyzing the data . . . 264
4.4.3 Writing the Report . . . 264
4.5.1 Introduction . . . 265
4.5.2 Experimental Protocols . . . 265
4.5.3 Some Common Designs . . . 267
5 Single Factor - Completely Randomized Designs (a.k.a. One-way design) 272 5.1 Introduction . . . 273
5.2 Randomization . . . 274
5.2.1 Using a random number table . . . 275
Assigning treatments to experimental units . . . 275
Selecting from the population . . . 276
5.2.2 Using a computer . . . 276
Randomly assign treatments to experimental units . . . 277
Randomly selecting from populations . . . 281
5.3 Assumptions - the overlooked aspect of experimental design . . . 285
5.3.1 Does the analysis match the design? . . . 286
5.3.2 No outliers should be present . . . 286
5.3.3 Equal treatment group population standard deviations? . . . 287
5.3.4 Are the errors normally distributed? . . . 288
5.3.5 Are the errors are independent? . . . 289
5.4 Two-sample t-test- Introduction . . . 289
5.5 Example - comparing mean heights of children - two-sample t-test . . . 290
5.6 Example - Fat content and mean tumor weights - two-sample t-test . . . 297
5.7 Example - Growth hormone and mean final weight of cattle - two-sample t-test . . . 303
5.8 Power and sample size . . . 310
5.8.1 Basic ideas of power analysis . . . 311
5.8.2 Prospective Sample Size determination . . . 312
5.8.3 Example of power analysis/sample size determination . . . 313
Using tables . . . 314
Using a package to determine power . . . 314
5.8.4 Further Readings on Power analysis . . . 319
5.8.5 Retrospective Power Analysis . . . 320
5.8.6 Summary . . . 321
5.9 ANOVA approach - Introduction . . . 322
5.9.1 An intuitive explanation for the ANOVA method . . . 323
5.9.2 A modeling approach to ANOVA . . . 328
5.10 Example - Comparing phosphorus content - single-factor CRD ANOVA . . . 331
5.11 Example - Comparing battery lifetimes - single-factor CRD ANOVA . . . 343
5.12 Example - Cuckoo eggs - single-factor CRD ANOVA . . . 353
5.13 Multiple comparisons following ANOVA . . . 366
5.13.1 Why is there a problem? . . . 366
5.13.2 A simulation with no adjustment for multiple comparisons . . . 367
5.13.3 Comparisonwise- and Experimentwise Errors . . . 369
5.13.4 The Tukey-Adjusted t-Tests . . . 370
5.13.5 Recommendations for Multiple Comparisons . . . 372
5.13.6 Displaying the results of multiple comparisons . . . 373
5.14.1 Using Tables . . . 376
5.14.2 Using SAS to determine power . . . 377
5.14.3 Retrospective Power Analysis . . . 378
5.15 Pseudo-replication and sub-sampling . . . 379
5.16 Frequently Asked Questions (FAQ) . . . 381
5.16.1 What does the F -statistic mean? . . . 381
5.16.2 What is a test statistic - how is it used? . . . 381
5.16.3 What is MSE? . . . 381
5.16.4 Power - various questions . . . 382
What is meant by detecting half the difference? . . . 382
Do we use the std dev, the std error, or root MSE in the power computations? . . . . 382
Retrospective power analysis; how is this different from regular (i.e., prospective) power analysis? . . . 382
What does power tell us? . . . 383
When to use retrospective and prospective power? . . . 383
When should power be reported . . . 383
What is done with the “total sample size” reported by JMP? . . . 384
5.16.5 How to compare treatments to a single control? . . . 384
5.16.6 Experimental unit vs. observational unit . . . 384
5.16.7 Effects of analysis not matching design . . . 385
5.17 Table: Sample size determination for a two sample t-test . . . 388
5.18 Table: Sample size determination for a single factor, fixed effects, CRD . . . 390
5.19 Scientific papers illustrating the methods of this chapter . . . 393
5.19.1 Injury scores when trapping coyote with different trap designs . . . 393
6 Single factor - pairing and blocking 395 6.1 Introduction . . . 396
6.2 Randomization protocol . . . 399
6.2.1 Some examples of several types of block designs . . . 399
Completely randomized design - no blocking . . . 400
Randomized complete block design - RCB design . . . 400
Randomized complete block design - RCB design - missing values . . . 401
Incomplete block design - not an RCB . . . 401
Generalized randomized complete block design . . . 402
6.3 Assumptions . . . 403
6.3.1 Does the analysis match the design? . . . 403
6.3.2 Additivity between blocks and treatments . . . 404
6.3.3 No outliers should be present . . . 406
6.3.4 Equal treatment group standard deviations? . . . 406
6.3.5 Are the errors normally distributed? . . . 407
6.3.6 Are the errors independent? . . . 408
6.4 Comparing two means in a paired design - the Paired t-test . . . 408
6.5 Example - effect of stream slope upon fish abundance . . . 409
6.5.1 Introduction and survey protocol . . . 409
6.5.2 Using a Differences analysis . . . 412
6.5.4 Using a General Modeling analysis . . . 417
6.5.5 Which analysis to choose? . . . 420
6.5.6 Comments about the original paper . . . 420
6.6 Example - Quality check on two laboratories . . . 421
6.7 Example - Comparing two varieties of barley . . . 427
6.8 Example - Comparing prep of mosaic virus . . . 432
6.9 Example - Comparing turbidity at two sites . . . 437
6.9.1 Introduction and survey protocol . . . 437
6.9.2 Using a Differences analysis . . . 439
6.9.3 Using a Matched paired analysis . . . 442
6.9.4 Using a General Modeling analysis . . . 443
6.9.5 Which analysis to choose? . . . 446
6.10 Power and sample size determination . . . 447
6.11 Single Factor - Randomized Complete Block (RCB) Design . . . 449
6.11.1 Introduction . . . 449
6.11.2 The potato-peeling experiment - revisited . . . 449
6.11.3 An agricultural example . . . 450
6.11.4 Basic idea of the analysis . . . 451
6.12 Example - Comparing effects of salinity in soil . . . 453
6.12.1 Model building - fitting a linear model . . . 455
6.13 Example - Comparing different herbicides . . . 461
6.14 Example - Comparing turbidity at several sites . . . 468
6.15 Power and Sample Size in RCBs . . . 474
6.16 Example - BPK: Blood pressure at presyncope . . . 476
6.16.1 Experimental protocol . . . 476
6.16.2 Analysis . . . 479
6.16.3 Power and sample size . . . 484
6.17 Final notes . . . 487
6.18 Frequently Asked Questions (FAQ) . . . 488
6.18.1 Difference between pairing and confounding . . . 488
6.18.2 What is the difference between a paired design and an RCB design? . . . 489
6.18.3 What is the difference between a paired t-test and a two-sample t-test? . . . 489
6.18.4 Power in RCB/matched pair design - what is root MSE? . . . 490
6.18.5 Testing for block effects . . . 490
6.18.6 Presenting results for blocked experiment . . . 491
6.18.7 What is a marginal mean? . . . 491
6.18.8 Multiple experimental units within a block? . . . 492
6.18.9 How does a block differ from a cluster? . . . 492
7 Incomplete block designs 493 7.1 Introduction . . . 493
7.2 Example: Investigate differences in water quality . . . 494
8 Estimating an over all mean with subsampling 501 8.1 Average flagellum length . . . 502
8.1.2 Using the raw measurements . . . 508
8.1.3 Followup . . . 512
9 Single Factor - Sub-sampling and pseudo-replication 513 9.1 Introduction . . . 514
9.2 Example - Fat levels in fish - balanced data in a CRD . . . 514
9.2.1 Analysis based on sample means . . . 516
9.2.2 Analysis using individual values . . . 519
9.3 Example - fat levels in fish - unbalanced data in a CRD . . . 524
9.4 Example - Effect of UV radiation - balanced data in RCB . . . 525
9.4.1 Analysis on sample means . . . 528
9.4.2 Analysis using individual values . . . 531
9.5 Example - Monitoring Fry Levels - unbalanced data with sampling over time . . . 535
9.5.1 Some preliminary plots . . . 538
9.5.2 Approximate analysis of means . . . 540
9.5.3 Analysis of raw data . . . 544
9.5.4 Planning for future experiments . . . 545
9.6 Example - comparing mean flagella lengths . . . 547
9.6.1 Average-of-averages approach . . . 550
9.6.2 Analysis on individual measurements . . . 562
9.6.3 Followup . . . 568
9.7 Final Notes . . . 568
10 Two Factor Designs - Single-sized Experimental units - CR and RCB designs 569 10.1 Introduction . . . 570
10.1.1 Treatment structure . . . 571
Why factorial designs? . . . 572
Why not factorial designs? . . . 573
Displaying and interpreting treatment effects - profile plots . . . 573
10.1.2 Experimental unit structure . . . 579
10.1.3 Randomization structure . . . 581
10.1.4 Putting the three structures together . . . 582
10.1.5 Balance . . . 582
10.1.6 Fixed or random effects . . . 583
10.1.7 Assumptions . . . 584
10.1.8 General comments . . . 585
10.2 Example - Effect of photo-period and temperature on gonadosomatic index - CRD . . . 586
10.2.1 Design issues . . . 587
10.2.2 Preliminary summary statistics . . . 588
10.2.3 The statistical model . . . 593
10.2.4 Fitting the model . . . 594
10.2.5 Hypothesis testing and estimation . . . 595
10.3 Example - Effect of sex and species upon chemical uptake - CRD . . . 603
10.3.1 Design issues . . . 605
10.3.2 Preliminary summary statistics . . . 606
10.3.4 Fitting the model . . . 609
10.4 Power and sample size for two-factor CRD . . . 619
10.5 Unbalanced data - Introduction . . . 624
10.6 Example - Stream residence time - Unbalanced data in a CRD . . . 626
Design issues . . . 627
10.6.1 Preliminary summary statistics . . . 628
10.6.2 The Statistical Model . . . 629
10.6.3 Hypothesis testing and estimation . . . 631
10.6.4 Power and sample size . . . 641
10.7 Example - Energy consumption in pocket mice - Unbalanced data in a CRD . . . 641
10.7.1 Design issues . . . 643
10.7.2 Preliminary summary statistics . . . 643
10.7.3 The statistical model . . . 646
10.7.4 Fitting the model . . . 646
10.7.5 Hypothesis testing and estimation . . . 648
10.7.6 Adjusting for unequal variances? . . . 656
10.8 Example: Use-Dependent Inactivation in Sodium Channel Beta Subunit Mutation - BPK . . 656
10.8.1 Introduction . . . 656
10.8.2 Experimental protocol . . . 656
10.8.3 Analysis . . . 657
10.9 Blocking in two-factor CRD designs . . . 668
10.10FAQ . . . 669
10.10.1 How to determine sample size in two-factor designs . . . 669
10.10.2 What is the difference between a ‘block’ and a ‘factor’? . . . 669
10.10.3 If there is evidence of an interaction, does the analysis stop there? . . . 670
10.10.4 When should you use raw means or LSmeans? . . . 671
11 SAS CODE NOT DONE 673 12 Two-factor split-plot designs 674 12.1 Introduction . . . 674
12.2 Example - Holding your breath at different water temperatures - BPK . . . 675
12.2.1 Introduction . . . 675
12.2.2 Standard split-plot analysis . . . 677
12.2.3 Adjusting for body size . . . 685
12.2.4 Fitting a regression to temperature . . . 687
12.2.5 Planning for future studies . . . 691
12.3 Example - Systolic blood pressure before presyncope - BPK . . . 698
12.3.1 Experimental protocol . . . 698
12.3.2 Analysis . . . 701
12.3.3 Power and sample size determination . . . 707
13 Analysis of BACI experiments 709 13.1 Introduction . . . 710
13.2 Before-After Experiments - prelude to BACI designs . . . 714
13.2.1 Analysis of stream 1 - yearly averages . . . 717
13.2.3 Analysis of all streams - yearly averages . . . 721
13.2.4 Analysis of all streams - individual values . . . 724
13.3 Simple BACI - One year before/after; one site impact; one site control . . . 726
13.4 Example: Change in density in crabs near a power plant - one year before/after; one site impact; one site control . . . 727
13.4.1 Analysis . . . 732
13.5 Simple BACI design - limitations . . . 737
13.6 BACI with Multiple sites; One year before/after . . . 737
13.7 Example: Density of crabs - BACI with Multiple sites; One year before/after . . . 739
13.7.1 Converting to an analysis of differences . . . 741
13.7.2 Using ANOVA on the averages . . . 744
13.7.3 Using ANOVA on the raw data . . . 748
13.7.4 Model assessment . . . 751
13.8 BACI with Multiple sites; Multiple years before/after . . . 752
13.9 Example: Counting fish - Multiple years before/after; One site impact; one site control . . . 754
13.9.1 Analysis of the differences . . . 757
13.9.2 ANOVA on the raw data . . . 761
13.9.3 Model assessment . . . 763
13.10Example: Counting chironomids - Paired BACI - Multiple-years B/A; One Site I/C . . . 764
13.10.1 Analysis of the differences . . . 766
13.10.2 ANOVA on the raw data . . . 768
13.10.3 Model assessment . . . 771
13.11Example: Fry monitoring - BACI with Multiple sites; Multiple years before/after . . . 771
13.11.1 A brief digression . . . 772
13.11.2 Some preliminary plots . . . 775
13.11.3 Analysis of the averages . . . 779
13.11.4 Analysis of the raw data . . . 783
13.11.5 Power analysis . . . 786
13.12Closing remarks about the analysis of BACI designs . . . 787
13.13BACI designs power analysis and sample size determination . . . 788
13.13.1 Introduction . . . 788
13.13.2 Power: Before-After design . . . 791
Single Location studies . . . 792
Multiple Location studies . . . 794
13.13.3 Power: Simple BACI design - one site control/impact; one year before/after; inde-pendent samples . . . 798
13.13.4 Power: Multiple sites in control/impact; one year before/after; independent samples . 803 13.13.5 Power: One sites in control/impact; multiple years before/after; no subsampling . . . 808
13.13.6 Power: General BACI: Multiple sites in control/impact; multiple years before/after; subsampling . . . 811
14 Comparing proportions - Chi-square (χ2) tests 814 14.1 Introduction . . . 815
14.2 Response variables vs. Frequency Variables . . . 816
14.3 Overview . . . 818
14.4.1 Resource selection - comparison to known habitat proportions . . . 820
14.4.2 Example: Homicide and Seasons . . . 826
14.5 Comparing sets of proportions - single factor CRD designs . . . 830
14.5.1 Example: Elk habitat usage - Random selection of points . . . 830
14.5.2 Example: Ownership and viability . . . 834
14.5.3 Example: Sex and Automobile Styling . . . 839
14.5.4 Example: Marijuana use in college . . . 843
14.5.5 Example: Outcome vs. cause of accident . . . 847
14.5.6 Example: Activity times of birds . . . 851
14.6 Pseudo-replication - Combining tables . . . 853
14.7 Simpson’s Paradox - Combining tables . . . 857
14.7.1 Example: Sex bias in admissions . . . 857
14.7.2 Example: - Twenty-year survival and smoking status . . . 858
14.8 More complex designs . . . 859
14.9 Final notes . . . 859
14.10Appendix - how the test statistic is computed . . . 860
14.11Fisher’s Exact Test . . . 862
14.11.1 Sampling Protocol . . . 864
14.11.2 Hypothesis . . . 864
14.11.3 Computation . . . 864
14.11.4 Example: Relationship between Aspirin Use and MI . . . 867
Mechanics of the test . . . 868
14.11.5 Avoidance of cane toads by Northern Quolls . . . 870
15 Correlation and simple linear regression 878 15.1 Introduction . . . 879 15.2 Graphical displays . . . 880 15.2.1 Scatterplots . . . 880 15.2.2 Smoothers . . . 882 15.3 Correlation . . . 886 15.3.1 Scatter-plot matrix . . . 887 15.3.2 Correlation coefficient . . . 889 15.3.3 Cautions . . . 891 15.3.4 Principles of Causation . . . 893 15.4 Single-variable regression . . . 895 15.4.1 Introduction . . . 895
15.4.2 Equation for a line - getting notation straight (no pun intended) . . . 895
15.4.3 Populations and samples . . . 896
15.4.4 Assumptions . . . 897
Linearity . . . 897
Correct scale of predictor and response . . . 897
Correct sampling scheme . . . 897
No outliers or influential points . . . 898
Equal variation along the line . . . 898
Independence . . . 898
X measured without error . . . 899
15.4.5 Obtaining Estimates . . . 900
15.4.6 Obtaining Predictions . . . 902
15.4.7 Residual Plots . . . 903
15.4.8 Example - Yield and fertilizer . . . 903
15.4.9 Example - Mercury pollution . . . 914
15.4.10 Example - The Anscombe Data Set . . . 923
15.4.11 Transformations . . . 924
15.4.12 Example: Monitoring Dioxins - transformation . . . 925
15.4.13 Example: Weight-length relationships - transformation . . . 937
A non-linear fit . . . 945
15.4.14 Power/Sample Size . . . 946
15.4.15 The perils of R2 . . . 947
15.5 A no-intercept model: Fulton’s Condition Factor K . . . 950
15.6 Frequent Asked Questions - FAQ . . . 957
15.6.1 Do I need a random sample; power analysis . . . 957
16 SAS CODE NOT DONE 959 17 SAS CODE NOT DONE 960 18 SAS CODE NOT DONE 961 19 Estimating power/sample size using Program Monitor 962 19.1 Mechanics of MONITOR . . . 963
19.2 How does MONITOR work? . . . 972
19.3 Incorporating process and sampling error . . . 977
19.4 Presence/Absence Data . . . 986
19.5 WARNING about using testing for temporal trends . . . 989
20 SAS CODE NOT DONE 991 21 SAS CODE NOT DONE 992 22 SAS CODE NOT DONE 993 23 Logistic Regression - Advanced Topics 994 23.1 Introduction . . . 994
23.2 Sacrificial pseudo-replication . . . 995
23.3 Example: Fox-proofing mice colonies . . . 996
23.3.1 Using the simple proportions as data . . . 997
23.3.2 Logistic regression using overdispersion . . . 999
23.3.3 GLIMM modeling the random effect of colony . . . 1000
23.4 Example: Over-dispersed Seeds Germination Data . . . 1002
25 A short primer on residual plots 1011
25.1 Linear Regression . . . 1012
25.2 ANOVA residual plots . . . 1013
25.3 Logistic Regression residual plots - Part I . . . 1015
25.4 Logistic Regression residual plots - Part II . . . 1016
25.5 Poisson Regression residual plots - Part I . . . 1017
25.6 Poisson Regression residual plots - Part II . . . 1019
26 SAS CODE NOT DONE 1021 27 Tables 1022 27.1 A table of uniform random digits . . . 1022
27.2 Selected Binomial individual probabilities . . . 1026
27.3 Selected Poisson individual probabilities . . . 1034
27.4 Cumulative probability for the Standard Normal Distribution . . . 1037
27.5 Selected percentiles from the t-distribution . . . 1039
27.6 Selected percentiles from the chi-squared-distribution . . . 1040
27.7 Sample size determination for a two sample t-test . . . 1041
27.8 Power determination for a two sample t-test . . . 1043
27.9 Sample size determination for a single factor, fixed effects, CRD . . . 1045
27.10Power determination for a single factor, fixed effects, CRD . . . 1049
28 THE END! 1053 28.1 Statisfaction - with apologies to Jagger/Richards . . . 1053
28.2 ANOVA Man with apologies to Lennon/McCartney . . . 1055
29 An overview of enviromental field studies 1057 29.1 Introduction . . . 1058
29.1.1 Survey Methods . . . 1065
Simple Random Sampling . . . 1065
Systematic Surveys . . . 1067
Cluster sampling . . . 1070
Multi-stage sampling . . . 1074
Multi-phase designs . . . 1076
Summary comparison of designs . . . 1078
29.1.2 Permanent or temporary monitoring stations . . . 1079
29.1.3 Refinements that affect precision . . . 1080
Stratification . . . 1080
Auxiliary variables . . . 1083
Sampling with unequal probability . . . 1083
29.1.4 Sample size determination . . . 1084
29.2 Analytical surveys . . . 1084
29.3 Impact Studies . . . 1087
29.3.1 Before/After contrasts at a single site . . . 1088
29.3.2 Repeated before/after sampling at a single site. . . 1088
29.3.3 BACI: Before/After and Control/Impact Surveys . . . 1089
29.3.5 Enhanced BACI-P: Designs to detect acute vs. chronic effects or to detect changes
in variation as well as changes in the mean. . . 1094
29.3.6 Designs for multiple impacts spread over time . . . 1096
29.3.7 Accidental Impacts . . . 1099
29.4 Conclusion . . . 1108
29.5 References . . . 1110
29.6 Selected journal articles . . . 1112
29.6.1 Designing Environmental Field Studies . . . 1112
29.6.2 Beyond BACI . . . 1113
29.6.3 Environmental impact assessment . . . 1113
29.7 Examples of studies for discussion - good exam questions! . . . 1114
In the beginning...
Contents
1.1 Introduction . . . 15
1.2 Effective note taking strategies . . . 16
1.3 It’s all Γρκ to me . . . 19
1.4 Which computer package? . . . 19
1.5 FAQ - Frequently Asked Question . . . 28
1.5.1 Accessing journal articles from home . . . 28
1.5.2 Downloading from the web . . . 28
1.5.3 Printing 2 pages per physical page and on both sides of the paper . . . 28
1.5.4 Is there an on-line textbook? . . . 29
1.1
Introduction
To many students, statistics is synonymous with sadistics and is not a subject that is “enjoyable”. Obviously, I think this view is mistaken and hope to present some of the interesting things that can be done with statistics. Statistics is all about discovery - how to extract information in the face of uncertainty. In the past, learning about statistics was tedious because of the enormous amount of arithmetic that needed to be done. Now, we let the computer do the heavy lifting, but it now vitally important that you UNDERSTAND what a computer package is doing – after all, these computer packages don’t have a conceptual understanding of what the data are about. They will quite happily compute the average sex (where 0 codes males and 1 codes females) – only you can decide that this is a meaningless statistic to compute.
analysis might be performed using a statistical package. There is often no unique answer to a problem with several good alternatives always available, so don’t let my notes constrain your thinking.
Statistics is fun! Just ask my family:
1.2
Effective note taking strategies
This section is taken from : The Tomorrow’s Professor Listserv - an email list server on topics of general interest to higher education. It is available at Stanford University at http://sll.stanford.edu/.
It will soon become apparent, if it hasn’t already, that not all lectures are fascinating and stimulating, and that not all lecturers are born with a gift for public speaking.
However, the information and ideas that they are trying to impart are just as important, and any notes that you take in the lectures must be understandable to you, not only five minutes after the lecture has finished,
but in several months’ time, when you come to revise from them. The question, then, is how to retain your concentration and produce a good set of notes.
There are a few misconceptions on the part of students as to what can be expected of a lecture session. Firstly, that the responsibility for the success of the lecture is entirely the instructor’s, and that the student’s role is to sit and listen or to take verbatim notes. Secondly, that the purpose of a lecture is to impart infor-mation which will be needed for an exam question. And thirdly, that attending the lecture, and taking notes, is an individual, even competitive, activity.
This page aims to correct these ideas, and to help you develop successful note-taking strategies.
BEFORE THE LECTURE
• If you know the subject of the lecture, do some background reading beforehand. This way, you will go into the lecture with a better understanding, and find it easier to distinguish the points worth noting. • Read through the notes of the previous lecture in the series just before the present one begins. This helps orient your thoughts to the subject in hand, especially if you have just come from a lecture on a completely different subject.
DURING THE LECTURE
• Think of a lecture as an active, learning process, rather than a passive, secretarial exercise. Writing verbatim what the lecturer says, or copying everything down from overheads, does not involve much thought, and subsequent reading of these notes often makes little sense.
• Pages of continuous prose are the least helpful for revision. Some things said in the lecture are obvi-ously more important than others, and the notes you take should reflect this. Try to give them some structure, by using headings and sub-headings, by HIGHLIGHTING or underlining key ideas and re-alizing the links between them. Alternative noting forms to linear notes, such as flow diagrams or star charts, can be used, although these are often more helpful to revise your notes (see After the Lecture). • In some situations, you may be directed in the amount of note-taking necessary. For example, the lecturer may start off by giving you some references on the subject he/she is to lecture on. A good strategy to adopt in this case would be to note down carefully the references, then just LISTEN to the lecture, making brief notes about the main points or specific examples. Taking notes from books is far easier than in lectures, as you can go at your own pace, stop and think about something, re-read a section etc. Use the lecture to try and understand the concepts being explained.
• Or, the lecturer may give hand-outs to accompany the lecture. In this case, you don’t need to make copious notes of your own. Again, listen to what is being said, and annotate the hand-out with any extra information. It gives you more time to think, and perhaps raise questions of your own.
• On the subject of questions, it is commonly believed by students that lecturers are not to be interrupted when they are in full flow. You may find that this isn’t always the case, and there is nothing wrong
with asking individual lecturers if they mind taking questions during the lecture. It is best to establish this at the beginning of the course of lectures.
• However, there is also the problem of speaking out in front of your peers, perhaps asking something foolish, or not having the time to frame your question well. In this case, write down the question in the margin of your notes, to ask the lecturer later, or check with friends or in a textbook. It is far easier to recall the question you wanted to ask in this way, rather than rely on remembering after the lecture has finished (or even when you come to revise from your notes!)
AFTER THE LECTURE
• The best time to review your lecture notes is immediately following the lecture, although this is not always possible if, for example, you have to go straight to another one. However, the sooner you do it, with the lecture still fresh in your mind, the better chance you have to produce a good set of notes for revision.
• Revising your notes does not mean writing them out neatly!
• Try swapping notes with a friend, to check the accuracy/omissions of your own, and your understand-ing of the key points.
• If you feel that your notes are incomplete, or if you jotted down any questions during the lecture, follow this up by asking your tutor, or by reading round the subject.
• Transforming your lecture notes by using a different noting form can sometimes make them clearer, e.g. a flow diagram
• Highlight key points; produce summaries for revision purposes.
• Think how this topic relates to previous ones, or to other courses that you are studying, and begin to recognize themes and relationships.
• Meet with a few friends after lectures, to discuss the lecture topic and answer each others questions. Discussion with your peers often leads to a better understanding of a subject, which in turn makes it easier to remember. Your group could also establish a reading syndicate, whereby reading lists can be divided between members, who each take notes on their allotted texts and give copies to the rest of the group.
STORING YOUR NOTES
• A little time spent at this stage in organizing your notes will make life much easier when you come to revise from them some months later.
• Numbering pages, making a contents page, or using dividers in your file will all make your notes more accessible.
1.3
It’s all Γρκ to me
There are several common Greek letters and special symbols that are used in Statistics. In this section, we illustrate the common Greek letters and notation used.
Check that the following symbols and small equations.
• α - the Greek letter alpha • β - the Greek letter beta • λ - the Greek letter lambda
• µ - the Greek letter mu (looks like a ‘u’ with a tail in front) • σ - the Greek letter sigma (looks like an ‘o’ with a top line)
• X2, X
2- an X with a superscript 2 and then an X with a subscript 2
• Y - Y-bar, a Y with a bar across the top
• bY - Y-hat, a Y with a circumflex over it
• x
y - a fraction x/y in vertical format
• Z = X−µσ - an equation with the x-mu above the Greek letter sigma
• √n - square root of n
• p - p-hat - a p with a circumflex over itb • 6= - a not-equal sign
• ± - a plus/minus sign • × - a multiplication sign
• Pn
i=1something - a summation of something with the index i ranging from 1 to n.
1.4
Which computer package?
Modern Statistics relies heavily upon computing – many would say that many modern statistical methods would be infeasible without modern software. Rather than spending time on tedious arithmetic, or on trying to reinvent the wheel, many people rely upon modern statistical packages.
• SAS. Available in Windoze and Unix flavours. Modern Macintoshs with the Intel chip can run SAS
under Windoze.1
One of the best packages around. SAS can handle nearly any type of data (dates, times, characters, numbers) with many posible analyses (over 100 different base analyses are currently available) and allows virtually arbitrary input formats and structures. SAS is extremely flexible and powerful but has a very steep learning curve. This is the premier statistical procedure – virtually all statistical analyses
can be done with SAS. This is the package that I, as a Professional Statistician2use regularly in my
job.
Not only does SAS have modern statistical procedures, but is also a premiere database management system. It is designed for heavy duty computing.
Refer to http://www.sas.com for more details on this package. The SAS program includes a module SAS/INSIGHT that is virtually identical to JMP (see below).
• SPSS. This is a fairly powerful package (but not nearly as broad as SAS). It is very popular with Social Sciences researchers, but I personally, prefer SAS.
Refer to http://www.spss.com for more details on this package.
• JMP. JMP was originally developed by one of the two SAS developers (who were the 68 and 138 rich-est people in USA/NA in 2003). John Saul developed JMP. He did this originally for the MacIntosh platform and called it John’s MacIntosh Product ergo the name JMP.
JMPruns on Macintosh, Linux, and Windoze platforms.
JMPis easy to use and fairly powerful package. You should be able to do most things in this course
in JMP.
JMPdoes not have the range of procedures as in SAS, nor can it deal with as complex data structures.
However, my guestimation is that most people can do 80% of their statistical computing using JMP. Refer to http://www.jmp.com for more details on this package.
• SYSTAT. This package has good graphical procedures, a fairly wide range of statistical procedures, but the package is showing its age. I find SYSTAT clumsy compared to using JMP and SAS and everytime I use it, I quickly get frustrated by its limitations and clumsy operations.
A review of SYSTAT is available in Hilbe, J. M. (2008).
Systat 12.2: An overview
American Statistician, 62, 177-178
http://dx.doi.org/10.1198/000313008x299339 Refer to http://www.systat.com for more details.
• STATA I have never used STATA but a nice review of the package is found in
Hilbe, J.M. (2005). A review of Stata 9.0. The American Statistician, 59, 335-348.
1There is a VERY old version of SAS that runs under older Macintoshes. This is a very old version and should not be used. 2The Statistical Society of Canada has undertaken a program to accredit statisticians in Canada. Visit http://www.ssc.ca for
According to this review, Stata would be of interest to biostatisticians, mediacal/health outcomes, econometric, and social science research.
• S-PLUS/R. S-PLUS and R are based on the S-programing language. As the name implies, S-PLUS is an extended version of S with a nice graphical interface. R is a freeware version of S − P lus with the basically the same functionality and can be freely downloaded from the WWW. It does not have the nice graphical interface.
These packages are commonly used by statisticians when developing new statistical procedures. They are yery flexible, but require a somewhat steep learning curve.
Refer to http://www.insightful.com/ for information on S-PLUS and http://www.r-project.
org/for information about R.
• Excel. This is the standard spreadsheet program in the MSOffice Suite. Excel comes with some basic statistics but nothing too fancy. While EXCEL has its uses, you will find quickly that it can’t handly more complex analysis situations and gives wrong results without warning!
Except for very simple statistics, I RECOMMEND AGAINST the use of Excel to do statistical analyses.
People are wedded to Excel for often spurious reasons: – "Its free." So is R and you get a much superior product.
– "It is easy to use". Yes, and easy to get WRONG answers without warnings.
– "It has good graphs". Excel has the largest selection of BAD graphs in the world. Hardly any of them are useful!
The following articles discuss some of the problems with Excel. They can also be accessed directly from the web by clicking on their respective links.
– J. Cryer from the University of Iowa discusses some of the problems with using Excel for ana-lyzing data at http://www.cs.uiowa.edu/~jcryer/JSMTalk2001.pdf.
– Yet more problems with Excel are discussed in : Practical Stat with Excel? available at http:
//www.practicalstats.com/xlsstats/excelstats.html and a copy of which
is included in these notes.
– Yet more problems with Excel: Using Excel in Statistics? available at http://www.umass. edu/acco/statistics/handout/excel.html.
– An article by the Statistical Consulting Service at the University of Reading has a brief discus-sion of the pros and cons of using Excel or analyzing data at http://www.rdg.ac.uk/ ssc/software/excel/home.html. Basically, the graphs presented by Excel are often inappropriate for data presentation and you quickly run into limitations of the analysis routines available.
– How to use the basic functions of Excel for Statistics. This page also has a link to discussion about regression in Excel. Using Excel functions in Statistics available at http://physicsnt. clemson.edu/chriso/tutorials/excel/stats.html
– Spreadsheet Addiction. Some of the problems in Excel and alterntives. Has a long bibliography on the problems with Excel. Very nice summary document - well worth the read. Available at http://www.burns-stat.com/pages/Tutor/spreadsheet_addiction.html.
There are a number of “add ons” available for Excel that seem to be reasonable priced and extend the analyses available.
Nevertheless, the algorithms used in Excel to do the actual computations are flawed and can give INCORRECT results without any warning that something has gone wrong!
For this reason, I generally use Excel only for simple problems - for anything more complex than a simple mean, I reach for a package such a JMP or SAS.
Practical Stats
Make sense of your data!
HomeConsulting Upcoming Classes
Applied Environmental Stats Nondetects (NADA)
Multivariate Relationships Newsletter
Downloads
Statistics With Excel?
Which Test Do I Use? Who Else Has Taken These? Contact Us
Is Microsoft Excel an Adequate Statistics Package?
It depends on what you want to do, but for many tasks, the answer is ‘No’.
Excel is available to many people as part of Microsoft Office. It contains some statistical functions in its basic installation. It also comes with statistical routines in the Data Analysis Toolpak, an add-in found separately on the Office CD. You must install the Toolpak from the CD in order to get these routines on the Tools menu. Once installed, these routines are at the bottom of the Tools menu, in the "Data Analysis" command. People use Excel as their everyday statistics software because they have already purchased it. Excel’s limitations, and occasionally its errors, make this a problem. Below are some of the concerns with using Excel for statistics that are recorded in journals, on the web, and from personal experience.
Limitations of Excel
1. Many statistical methods are not available in Excel.
Excel's biggest problem. Commonly-used statistics and methods NOT available within Excel include: * Boxplots
* p-values for the correlation coefficient
* Spearman’s and Kendall’s rank correlation coefficients * 2-way ANOVA with unequal sample sizes (unbalanced data) * Multiple comparison tests (post-hoc tests following ANOVA)
* p-values for two-way ANOVA * Levene’s test for equal variance
* Nonparametric tests, including rank-sum and Kruskal-Wallis * Probability plots
* Scatterplot arrays or brushing
* Principal components or other multivariate methods * GLM (generalized linear models)
* Survival analysis methods
* Regression diagnostics, such as Mallow’s Cp and PRESS ( it does compute adjusted r-squared) * Durbin-Watson test for serial correlation
* LOESS smooths
Excel's lack of functionality makes it difficult to use for more than computing summary statistics and simple univariate regression. Third-party add-ins to Excel attempt to compensate for these limitations, adding new functionality to the program (see "A Partial Solution", below).
2. Several Excel procedures are misleading.
Probability plots are a standard way of judging the adequacy of the normality assumption in regression. In statistics packages, residuals from the regression are easily, or in some cases automatically, plotted on a normal probability plot. Excel’s regression routine provides a Normal Probability Plot option. However, it produces a probability plot of the Y variable, not of the residuals, as would be expected.
Excel’s CONFIDENCE function computes z intervals using 1.96 for a 95% interval. This is valid only if the population variance is known, which is never true for experimental data. Confidence intervals computed using this function on sample data will be too small. A t-interval should be used instead.
Excel is inconsistent in the type of P-values it returns. For most functions of probabilities, Excel acts like a lookup table in a textbook, and returns one-sided p-values. But in the TINV function, Excel returns a 2-sided p-value. Look carefully at the documentation of any Excel function you use, to be certain you are getting what you want.
Tables of standard distributions such as the normal and t distributions return p-values for tests, or are used to confidence intervals. With Excel, the user must be careful about what is being returned. To compute a 95% t confidence interval around the mean, for example, the standard method is to look up the t-statistic in a textbook by entering the table at a value of alpha/2, or 0.025. This t-statistic is multiplied by the standard error to produce the length of the t-interval on each side of the mean. Half of the error (alpha/2) falls on each side of the mean. In Excel the TINV function is entered using the value of alpha, not alpha/2, to return the same number.
For a one-sided t interval at alpha=0.05, standard practice would be to look up the t-statistic in a textbook for alpha=0.05. In Excel, the TINV function must be called using a value of 2*alpha, or 0.10, to get the value for alpha=0.05. This nonstandard entry point has led several reviewers to state that Excel’s distribution functions are incorrect. If not incorrect, they are certainly nonstandard. Make sure you read the help menu descriptions carefully to know what each function produces.
3. Distributions are not computed with precision.
NEW In reference (1), the authors show that all problems found in Excel 97 are still there in Excel 2000 and
XP. They say that "Microsoft attempted to fix errors in the standard normal random number generator and the inverse normal function, and in the former case actually made the problem worse." From this, you can assume that the problems listed below are still there in the current versions of the software.
Statistical distributions used by Excel do not agree with better algorithms for those distributions at the third digit and beyond. So they are approximately correct, but not as exact as would be desired by an exacting statistician. This may not be harmful for hypothesis tests unless the third digit is of concern (a p-value of 0.056 versus 0.057). It is of most concern when constructing intervals (multiplying a std dev of 35 times 1.96 give 68.6; times 1.97 gives 69.0) As summarized in reference 2:
"…the statistical distributions of Excel already have been assessed by Knusel (1998), to which we refer the
interested reader. He found numerous defects in the various algorithms used to compute several
distributions, including the Normal, Chi-square, F and t, and summarized his results concisely: So one has to warn statisticians against using Excel functions for scientific purposes. The performance of Excel in this area can be judged unsatisfactory."
4. Routines for handling missing data were incorrect.
This was the largest error in Excel, but a 'band-aid' has been added in Office 2000. In earlier versions of Excel, computations and tests were flat out wrong when some of the data cells contained missing values, even for simple summary statistics. See (3) , (5), and page 4 of (6). Error messages are now displayed in Excel 2000 when there are missing values, and no result is given. Although this is still inferior to computing correct results it is somewhat of an improvement.
In reference to pre-2000,
"Excel does not calculate the paired t-test correctly when some observations have one of the measurements
but not the other." E. Goldwater, ref. (5)
5. Regression routines are incorrect for multicollinear data.
This affects multiple regression. A good statistics package will report errors due to correlations among the X variables. The Variance Inflation Factor (VIF) is one measure of collinearity. Excel does not compute collinearity measures, does not warn the user when collinearity is present, and reports parameter estimates that may be nonsensical. See (6) for an example on data from an experiment. Are multicollinear data of concern in ‘practical’ problems? I think so -- I find many examples of collinearity in environmental data sets.
Excel also requires the X variables to be in contiguous columns in order to input them to the procedure. This can be done with cut and paste, but is certainly annoying if many multiple regression models are to be built.
6. Ranks of tied data are computed incorrectly.
When ranking data, standard practice is to assign tied ranks to tied observations. The value of these ranks should equal the median of the ranks that the observations would have had, if they had not been tied. For example, three observations tied at a value of 14 would have had the ranks of 7, 8 and 9 had they not been tied. Each of the three values should be assigned the rank of 8, the median of 7, 8 and 9.
Excel assigns the lowest of the three ranks to all three observations, giving each a rank of 7. This would result in problems if Excel computed rank-based tests. Perhaps it is fortunate none are available.
7. Many of Excel's charts violate standards of good graphics.
Use of perspective and glitz (donut charts?) violate basic principles of graphics. Excel's charts are more suitable to USA Today than to scientific reports. This bothers some people more than others.
"Good graphs should….[a list of traits]…However, Excel meets virtually none of these criteria. The vast
majority of chart types produced by Excel should never be used!" -- Jon Cryer, ref (3).
"Microsoft Excel is an example of a package that does not allow enough user control to consistently make
readable and concise graphs from tables."
- A. Gelman et al., 2002, The American Statistician 56, p.123.
A partial solution:
Some of these difficulties (parts of 1,2, 6 and 7) can be overcome by using a good set of add-in routines. One of the best is StatPlus, which comes with an excellent textbook, "Data Analysis with Microsoft Excel". With StatPlus, Excel becomes an adequate statistical tool., though still not in the areas of multiple regression and ANOVA for more than one factor. Without this add-in Excel is inadequate for anything beyond basic summary statistics and simple regression.
Data Analysis with Microsoft Excel by Berk and Carey published by Duxbury (2000).
Opinion: Get this book if you're going to use Excel for statistics.
(I have no connection with the authors of StatPlus and get no benefit from this recommendation. I'm just a satisfied user.)
Some advice from others:
"If you need to perform analysis of variance, avoid using Excel, unless you are dealing with extremely
simple problems."
- Statistical Services Centre, Univ. of Reading, U.K. (at A, below)
"Enterprises should advise their scientists and professional statisticians not to use Microsoft Excel for
substantive statistical analysis. Instead, enterprises should look for professional statistical analysis software certified to pass the (NIST) Statistical Reference Datasets tests to their users' required level of accuracy."
- The Gartner Group References:
1) On the accuracy of statistical procedures in Microsoft Excel 2000 and Excel XP
B.D. McCullough and B. Wilson, (2002), Computational Statistics & Data Analysis, 40, pp 713 - 721
(2) On the accuracy of statistical procedures in Microsoft Excel ‘97
B.D. McCullough and B. Wilson, (1999), Computational Statistics & Data Analysis, 31, pp 27-37
(3) Problems with using Microsoft Excel for statistics [pdf Download]
J.D. Cryer, (2001), presented at the Joint Statistical Meetings, American Statistical Association, 2001, Atlanta Georgia
[pdf download]
(4) Use of Excel for statistical analysis
Neil Cox, (2000), AgResearch Ruakura
(5) Using Excel for statistical data analysis
Eva Goldwater, (1999), Univ. of Massachusetts Office of Information Technology
[pdf download]
(6) Statistical analysis using Microsoft Excel [pdf download] Jeffrey Simonoff, (2002)
[pdf download]
(7) Spreadsheet addiction
Patrick Burns
(8) On the Accuracy of Statistical Distributions in Microsoft Excel 97 Leo Knuesel
[pdf download]
(9) Statistical flaws in Excel Hans Pottel
[pdf download]
Guides to Excel on the web:
(A) A Beginner's Guide to Excel - Univ. of Reading, UK (B) An Intermediate Guide to Excel - Univ. of Reading, UK
Note: All opinions other than those cited as coming from others are my own. Home >
Statistics With Excel? > © 2007 Practical Stats Email Us
1.5
FAQ - Frequently Asked Question
1.5.1
Accessing journal articles from home
I make reference to several journal articles in these notes. These are often available in e-journals so the link should take you there directly IF you are authorized to access this journal. For example, if you try and access these articles from a computer with an SFU IP address, you should likely be granted permission without problems.
However, if you are trying to access these from home, you must go through the SFU library site and ac-cess the e-journal via the catalogue. You will then be prompted to enter your SFU ACS userid and password to grant you access to this journal
1.5.2
Downloading from the web
When ever I try and download an Excel file, it seems to be corrupted and can’t be opened.
Through out the notes, reference is made to spreadsheets or SAS programs available at my web site. The SAS programs and listings are simple text files and should transfer to your computer without much problem.
If you trying to download an Excel spreadsheet, be sure to specify that the file should be transfered as a source document rather than as text. If you transfer the sheets as text, you will find that the data are corrupted.
1.5.3
Printing 2 pages per physical page and on both sides of the paper
The notes look as if I could print 2 per page. Is this possible, and can I print on both sides of the paper?
Yes, it is possible to print two logical pages per physical page - the text is a bit small, but still readable. On a Macintosh System with a recent OS, when you select Print, it presents the standard print options menu. Under the popdown menu is a Layout option. Select 2 logical pages per physical page. This will work with ALL applications that use the standard print dialogue.
I’m not familiar enough with Windoze machines to offer any advice.
To print on both sides of the paper, you need a printer capable of duplex printing, i.e. on both sides of the paper. I believe that most printers in the public areas of campus are capable of this. You will have to consult
your own printer manual if you are printing at home. Otherwise, you have to print first the odd pages, then take the paper, reverse it, and print the even pages - a recipe for disaster.
1.5.4
Is there an on-line textbook?
Are there any online textbooks in statistics?
Yes, there are several - it is easiest to search the web using google. Beware that some of the advice on the web may be less than perfect.
StatSoft has a highly regarded statistical online text book at http://www.statsoft.com/textbook/ stathome.html.
Contents
1.1 Introduction . . . 15 1.2 Effective note taking strategies . . . 16 1.3 It’s all Γρκ to me . . . 19 1.4 Which computer package? . . . 19 1.5 FAQ - Frequently Asked Question . . . 28 1.5.1 Accessing journal articles from home . . . 28 1.5.2 Downloading from the web . . . 28 1.5.3 Printing 2 pages per physical page and on both sides of the paper . . . 28 1.5.4 Is there an on-line textbook? . . . 29
Introduction to Statistics
Statistics was spawned by the information age, and has been defined as the science of extracting information from data. Technological developments have demanded methodology for the efficient extraction of reli-able statistics from complex databases. As a result, Statistics has become one of the most pervasive of all disciplines.
Theoretical statisticians are largely concerned with developing methods for solving the problems in-volved in such a process, for example, finding new methods for analyzing (making sense of) types of data that existing methods cannot handle. Applied statisticians collaborate with specialists in other fields in ap-plying existing methodologies to real world problems. In fact, most statisticians are involved in both of these activities to a greater or lesser extent, and researchers in most quantitative fields of enquiry spend a great deal of their time doing applied statistics.
The public and private sector rely on statistical information for such purposes as decision making, regu-lation, control and planning.
Ordinary citizens are exposed to many ‘statistics’ on a daily basis. For example:
• “In a poll of 1089 Canadians, 47% were in favor of the constitution accord. This result is accurate to within 3 percentage points, 19 time out of 20.”
• “The seasonally adjusted unemployment rate in Canada was 9.3%”. • “Two out of three dentists recommend Crest.”
What does this all mean?
Our goal is not to make each student a ‘professional statistician’, but rather to give each student a subset of tools with which they can confidently approach many real world problems and make sense of the numbers.
2.1
TRRGET - An overview of statistical inference
Section summary:
1. Distinguish between a population and a sample 2. Why it is important to choose a probability sample
3. Distinguish among the roles of randomization, replication, and blocking 4. Distinguish between an ‘estimate’ or a ‘statistic’ and the ‘parameter’ of interest.
Most studies can be broadly classified into either surveys or experiments.
In surveys, the researcher is typically interested in describing some population - there is usually no attempt to manipulate units within the population. In experiments, units from the population are manipulated in some fashion and a response to the manipulation is observed.
There are four broad phases to the survey or the experiment. These phases define the paradigm of Statistical Inference. These phases will be illustrated in the context of a political poll of Canadians on some issue as illustrated in the following diagram.
The four phases are:
1. What is the population of interest and what is the parameter of interest? This formulates the research question - what is being measured and what is of interest.
In this case, the population of interest is likely all eligible voters in Canada and the parameter of interest is the proportion of all eligible voters in favor of the accord.
It is conceivable, but certainly impractical, that every eligible voter could be contacted and their opin-ion recorded. You would then know the value of the parameter exactly and there would be no need to do any statistics. However, in most real world situations, it is impossible or infeasible to measure every unit in the population.
Consequently, a sample is taken. 2. Selecting a sample
We would like our sample to be as representative as possible - how is this achieved? We would like our answer from our sample to be as precise as possible - how is this achieved? And, we may like to modify our sample selection method to take into account known division of the population - how is this achieved?
Three fundamental principles of Statistics are randomization, replication and blocking.
Randomization This is the most important aspect of experimental design and surveys. Randomiza-tion “makes” the sample ‘representative’ of the populaRandomiza-tion by ensuring that, ‘on average’, the sample will contain a proportion of population units that is about equal, for any variable as found in the population.
If an experiment is not randomized or a survey is not randomly collected, it rarely (if ever) provides useful information.
Many people confuse ‘random’ with ‘haphazard’. The latter only means that the sample was collected without a plan or thought to ensure that the sample obtained is representative of the population. A truly ‘random’ sample takes surprisingly much effort to collect!
E.g. The Gallup poll uses random digit dialing to select at random from all households in Canada with a phone. Is this representative of the entire voting population? How does the Gallup Poll account for the different patterns of telephone use among genders within a household?
A simple random sample is an example of a equal probability sample where every unit in the population has an equal chance of being selected for the sample. As you will see later in the notes, the assumption of equal probability of selection not crucial. What is crucial is that every unit in the population have a known probability of selection, but this probability could vary among units. For example, you may decide to sample males with a higher probability than females.
Replication = Sample Size This ensures that the results from the experiment or the survey will be precise enough to be of use. A large sample size does not imply that the sample is representative - only randomization ensures representativeness.
Do not confuse replication with repeating the survey a second time.
In this example, the Gallup poll interviews about 1100 Canadians. It chooses this number of people to get a certain precision in the results.
Blocking (or stratification) In some experiments or surveys, the researcher knows of a variable that strongly influences the response. In the context of this example, there is strong relationship between the region of the country and the response.
Consequently, precision can be improved, by first blocking or stratifying the population into more homogeneous groups. Then a separate randomized survey is done in each and every stratum and the results are combined together at the end.
In this example, the Gallup poll often stratifies the survey by region of Canada. Within each region of Canada, a separate randomized survey is performed and the results are then combined appropriately at the end.
3. Data Analysis
Once the survey design is finalized and the survey is conducted, you will have a mass of information - statistics - collected from the population. This must be checked for errors, transcribed usually into machine readable form, and summarized.
The analysis is dependent upon BOTH the data collected (the sample) and the way the data was collected (the sample selection process). For example, if the data were collected using a stratified sampling design, it must be analyzed using the methods for stratified designs - you can’t simply pretend after the fact that the data were collected using a simple random sampling design.
We will emphasize this point continually in this course - you must match the analysis with the design! For example, consider a Gallup Poll where 511 out of 1089 Canadians interviewed were in favor of an issue. Then our statistics is that 47% of our sample respondents were in favor.
4. Inference back to the Population
Despite an enormous amount of money spent collecting the data, interest really lies in the population, not the sample. The sample is merely a device to gather information about the population.
How should the information from the sample, be used to make inferences about the population? Graphing A good graph is always preferable to a table of numbers or to numerical statistics. A graph
should be clear, relevant, and informative. Beware of graphs that try to mislead by design or accident through misleading scales, chart junk, or three dimensional effects.
There a number of good books on effective statistical graphics - these should be consulted for
fur-ther information.1 Unfortunately, many people rely upon the graphical tools available in
spread-sheet software such as Excel which invariably leads to poor graphs. As a rule of thumb, Excel has the largest collection of bad graphical designs available in the free world! You may enjoy the artilce on “Using Microsoft Excel to obscure your data and annow your readers” available at http://www.biostat.wisc.edu/~kbroman/presentations/graphs_uwpath08_ handout.pdf.
Estimation The number obtained from our sample is an estimate of the true, unknown, value of the population parameter. How precise is our estimate? Are we within 10 percentage points of the correct answer?
A good survey or experiment will report a measure of precision for any estimate.
In this example, 511 of 1089 people were in favor of the accord. Our estimate of the proportion of all Canadian voters in favor of the accord is 511/1089 = 47%. These results are ‘accurate to
within 3 percentage points, 19 times out of 20’, which implies that we are reasonably confident that the true proportion of voters in favor of the accord is between 47% − 3% = 44% and 47% + 3% = 50%.
Technically, this is known as a 95% confidence interval - the details of which will be explored later in this chapter.
(Hypothesis) Testing Suppose that in last month’s poll (conducted in a similar fashion), only 42% of voters were in favor. Has the support increased? Because each percentage value is accurate to about 3 percentage points, it is possible that in fact there has been no change in support!. It is possible to make a more formal ‘test’ of the hypothesis of no change. Again, this will be explored in more detail later in this chapter.
2.2
Parameters, Statistics, Standard Deviations, and Standard
Er-rors
Section summary:
1. Distinguish between a parameter and a statistic 2. What does a standard deviation measure? 3. What does a standard error measure?
4. How are estimated standard errors determined (in general)?
2.2.1
A review
DDTs is a very persistent pesticide. Once applied, it remains in the environment for many years and tends to accumulate up the food chain. For example, birds which eat rodents which eat insects which ingest DDT contaminated plants can have very high levels of DDT and this can interfere with reproduction. [This is similar to what is happening in the Great Lakes where herring gulls have very high levels of pesticides or what is happening in the St. Lawrence River where resident beluga whales have such high levels of contaminants, they are considered hazardous waste if they die and wash up on shore.] DDT has been banned in Canada for several years, and scientists are measuring the DDT levels in wildlife to see how quickly it is declining.
The Science of Statistics is all about measurement and variation. If there was no variation, there would be no need for statistical methods. For example, consider a survey to measure DDT levels in gulls on Triangle Island off the coast of British Columbia, Canada. If all the gulls on Triangle Island had exactly the same DDT level, then it would suffice to select a single gull from the island and measure its DDT level.
Alas, the DDT level can vary by the age of the gull, by where it feeds and a host of other unknown and uncontrollable variables. Consequently the average DDT level over ALL gulls on Triangle Island seems like