• No results found

PDFbigbook SAS

N/A
N/A
Protected

Academic year: 2021

Share "PDFbigbook SAS"

Copied!
1115
0
0

Loading.... (view fulltext now)

Full text

(1)

Environmental Scientists, Biologists, and Resource Managers

C. J. Schwarz

Department of Statistics and Actuarial Science, Simon Fraser University

[email protected]

(2)

1 In the beginning... 15

1.1 Introduction . . . 15

1.2 Effective note taking strategies . . . 16

1.3 It’s all Γρκ to me . . . 19

1.4 Which computer package? . . . 19

1.5 FAQ - Frequently Asked Question . . . 28

1.5.1 Accessing journal articles from home . . . 28

1.5.2 Downloading from the web . . . 28

1.5.3 Printing 2 pages per physical page and on both sides of the paper . . . 28

1.5.4 Is there an on-line textbook? . . . 29

2 Introduction to Statistics 30 2.1 TRRGET - An overview of statistical inference . . . 31

2.2 Parameters, Statistics, Standard Deviations, and Standard Errors . . . 34

2.2.1 A review . . . 34

2.2.2 Theoretical example of a sampling distribution . . . 39

2.3 Confidence Intervals . . . 41

2.3.1 A review . . . 42

2.3.2 Some practical advice . . . 48

2.3.3 Technical details . . . 49

2.4 Hypothesis testing . . . 50

2.4.1 A review . . . 50

2.4.2 Comparing the population parameter against a known standard . . . 51

2.4.3 Comparing the population parameter between two groups . . . 58

2.4.4 Type I, Type II and Type III errors . . . 62

2.4.5 Some practical advice . . . 65

2.4.6 The case against hypothesis testing . . . 66

2.4.7 Problems with p-values - what does the literature say? . . . 69

Statistical tests in publications of the Wildlife Society . . . 69

The Insignificance of Statistical Significance Testing . . . 69

Followups . . . 71

2.5 Meta-data . . . 71

2.5.1 Scales of measurement . . . 71

(3)

2.5.3 Roles of data . . . 74

2.6 Bias, Precision, Accuracy . . . 74

2.7 Types of missing data . . . 77

2.8 Transformations . . . 79

2.8.1 Introduction . . . 79

2.8.2 Conditions under which a log-normal distribution appears . . . 80

2.8.3 ln() vs. log() . . . 80

2.8.4 Mean vs. Geometric Mean . . . 81

2.8.5 Back-transforming estimates, standard errors, and ci . . . 82

Mean on log-scale back to MEDIAN on anti-log scale . . . 82

2.8.6 Back-transforms of differences on the log-scale . . . 83

2.8.7 Some additional readings on the log-transform . . . 84

2.9 Standard deviations and standard errors revisited . . . 95

2.10 Other tidbits . . . 104

2.10.1 Interpreting p-values . . . 104

2.10.2 False positives vs. false negatives . . . 104

2.10.3 Specificity/sensitivity/power . . . 104

3 Sampling 106 3.1 Introduction . . . 108

3.1.1 Difference between sampling and experimental design . . . 108

3.1.2 Why sample rather than census? . . . 109

3.1.3 Principle steps in a survey . . . 109

3.1.4 Probability sampling vs. non-probability sampling . . . 110

3.1.5 The importance of randomization in survey design . . . 111

3.1.6 Model vs. Design based sampling . . . 114

3.1.7 Software . . . 115

3.2 Overview of Sampling Methods . . . 115

3.2.1 Simple Random Sampling . . . 115

3.2.2 Systematic Surveys . . . 117

3.2.3 Cluster sampling . . . 120

3.2.4 Multi-stage sampling . . . 124

3.2.5 Multi-phase designs . . . 126

3.2.6 Panel design - suitable for long-term monitoring . . . 128

3.2.7 Sampling non-discrete objects . . . 129

3.2.8 Key considerations when designing or analyzing a survey . . . 129

3.3 Notation . . . 131

3.4 Simple Random Sampling Without Replacement (SRSWOR) . . . 131

3.4.1 Summary of main results . . . 132

3.4.2 Estimating the Population Mean . . . 133

3.4.3 Estimating the Population Total . . . 133

3.4.4 Estimating Population Proportions . . . 134

3.4.5 Example - estimating total catch of fish in a recreational fishery . . . 134

What is the population of interest? . . . 136

What is the frame? . . . 137

(4)

Excel analysis . . . 137

SASanalysis . . . 140

3.5 Sample size determination for a simple random sample . . . 141

3.5.1 Example - How many angling-parties to survey . . . 144

3.6 Systematic sampling . . . 147

3.6.1 Advantages of systematic sampling . . . 148

3.6.2 Disadvantages of systematic sampling . . . 148

3.6.3 How to select a systematic sample . . . 148

3.6.4 Analyzing a systematic sample . . . 149

3.6.5 Technical notes - Repeated systematic sampling . . . 149

Example of replicated subsampling within a systematic sample . . . 149

3.7 Stratified simple random sampling . . . 152

3.7.1 A visual comparison of a simple random sample vs. a stratified simple random sample154 3.7.2 Notation . . . 162

3.7.3 Summary of main results . . . 162

3.7.4 Example - sampling organic matter from a lake . . . 164

3.7.5 Example - estimating the total catch of salmon . . . 168

What is the population of interest? . . . 169

What is the sampling frame? . . . 169

What is the sampling design? . . . 170

Excel analysis . . . 170

SASanalysis . . . 173

When should the various estimates be used? . . . 175

3.7.6 Sample Size for Stratified Designs . . . 177

3.7.7 Allocating samples among strata . . . 180

3.7.8 Example: Estimating the number of tundra swans. . . 183

3.7.9 Post-stratification . . . 187

3.7.10 Allocation and precision - revisited . . . 189

3.8 Ratio estimation in SRS - improving precision with auxiliary information . . . 190

3.8.1 Summary of Main results . . . 192

3.8.2 Example - wolf/moose ratio . . . 192

Excel analysis . . . 194

SASAnalysis . . . 198

Post mortem . . . 201

3.8.3 Example - Grouse numbers - using a ratio estimator to estimate a population total . . 201

Excel analysis . . . 203

SASanalysis . . . 206

Post mortem - a question to ponder . . . 209

3.9 Additional ways to improve precision . . . 210

3.9.1 Using both stratification and auxiliary variables . . . 210

3.9.2 Regression Estimators . . . 211

3.9.3 Sampling with unequal probability - pps sampling . . . 211

3.10 Cluster sampling . . . 212

3.10.1 Sampling plan . . . 213

3.10.2 Advantages and disadvantages of cluster sampling compared to SRS . . . 219

(5)

3.10.4 Summary of main results . . . 220

3.10.5 Example - estimating the density of urchins . . . 221

Excel Analysis . . . 222

SASAnalysis . . . 225

Planning for future experiments . . . 227

3.10.6 Example - estimating the total number of sea cucumbers . . . 227

SASAnalysis . . . 231

3.11 Multi-stage sampling - a generalization of cluster sampling . . . 235

3.11.1 Introduction . . . 235

3.11.2 Notation . . . 236

3.11.3 Summary of main results . . . 237

3.11.4 Example - estimating number of clams . . . 238

Excel Spreadsheet . . . 240

SASProgram . . . 240

3.11.5 Some closing comments on multi-stage designs . . . 242

3.12 Analytical surveys - almost experimental design . . . 242

3.13 References . . . 246

3.14 Frequently Asked Questions (FAQ) . . . 247

3.14.1 Confusion about the definition of a population . . . 247

3.14.2 How is N defined . . . 248

3.14.3 Multi-stage vs. Multi-phase sampling . . . 248

3.14.4 What is the difference between a Population and a frame? . . . 249

3.14.5 How to account for missing transects. . . 249

4 Designed Experiments - Terminology and Introduction 250 4.1 Terminology and Introduction . . . 251

4.1.1 Definitions . . . 251

4.1.2 Treatment, Experimental Unit, and Randomization Structure . . . 252

4.1.3 The Three R’s of Experimental Design . . . 255

4.1.4 Placebo Effects . . . 257

4.1.5 Single and bouble blinding . . . 257

4.1.6 Hawthorne Effect . . . 258

4.2 Applying some General Principles of Experimental Design . . . 258

4.2.1 Experiment 1 . . . 259

4.2.2 Experiment 2 . . . 259

4.2.3 Experiment 3 . . . 259

4.2.4 Experiment 4 . . . 260

4.2.5 Experiment 5 . . . 260

4.3 Some Case Studies . . . 261

4.3.1 The Salk Vaccine Experiment . . . 261

4.3.2 Testing Vitamin C - Mistakes do happen . . . 262

4.4 Key Points in Design of Experiments . . . 262

4.4.1 Designing an Experiment . . . 263

4.4.2 Analyzing the data . . . 264

4.4.3 Writing the Report . . . 264

(6)

4.5.1 Introduction . . . 265

4.5.2 Experimental Protocols . . . 265

4.5.3 Some Common Designs . . . 267

5 Single Factor - Completely Randomized Designs (a.k.a. One-way design) 272 5.1 Introduction . . . 273

5.2 Randomization . . . 274

5.2.1 Using a random number table . . . 275

Assigning treatments to experimental units . . . 275

Selecting from the population . . . 276

5.2.2 Using a computer . . . 276

Randomly assign treatments to experimental units . . . 277

Randomly selecting from populations . . . 281

5.3 Assumptions - the overlooked aspect of experimental design . . . 285

5.3.1 Does the analysis match the design? . . . 286

5.3.2 No outliers should be present . . . 286

5.3.3 Equal treatment group population standard deviations? . . . 287

5.3.4 Are the errors normally distributed? . . . 288

5.3.5 Are the errors are independent? . . . 289

5.4 Two-sample t-test- Introduction . . . 289

5.5 Example - comparing mean heights of children - two-sample t-test . . . 290

5.6 Example - Fat content and mean tumor weights - two-sample t-test . . . 297

5.7 Example - Growth hormone and mean final weight of cattle - two-sample t-test . . . 303

5.8 Power and sample size . . . 310

5.8.1 Basic ideas of power analysis . . . 311

5.8.2 Prospective Sample Size determination . . . 312

5.8.3 Example of power analysis/sample size determination . . . 313

Using tables . . . 314

Using a package to determine power . . . 314

5.8.4 Further Readings on Power analysis . . . 319

5.8.5 Retrospective Power Analysis . . . 320

5.8.6 Summary . . . 321

5.9 ANOVA approach - Introduction . . . 322

5.9.1 An intuitive explanation for the ANOVA method . . . 323

5.9.2 A modeling approach to ANOVA . . . 328

5.10 Example - Comparing phosphorus content - single-factor CRD ANOVA . . . 331

5.11 Example - Comparing battery lifetimes - single-factor CRD ANOVA . . . 343

5.12 Example - Cuckoo eggs - single-factor CRD ANOVA . . . 353

5.13 Multiple comparisons following ANOVA . . . 366

5.13.1 Why is there a problem? . . . 366

5.13.2 A simulation with no adjustment for multiple comparisons . . . 367

5.13.3 Comparisonwise- and Experimentwise Errors . . . 369

5.13.4 The Tukey-Adjusted t-Tests . . . 370

5.13.5 Recommendations for Multiple Comparisons . . . 372

5.13.6 Displaying the results of multiple comparisons . . . 373

(7)

5.14.1 Using Tables . . . 376

5.14.2 Using SAS to determine power . . . 377

5.14.3 Retrospective Power Analysis . . . 378

5.15 Pseudo-replication and sub-sampling . . . 379

5.16 Frequently Asked Questions (FAQ) . . . 381

5.16.1 What does the F -statistic mean? . . . 381

5.16.2 What is a test statistic - how is it used? . . . 381

5.16.3 What is MSE? . . . 381

5.16.4 Power - various questions . . . 382

What is meant by detecting half the difference? . . . 382

Do we use the std dev, the std error, or root MSE in the power computations? . . . . 382

Retrospective power analysis; how is this different from regular (i.e., prospective) power analysis? . . . 382

What does power tell us? . . . 383

When to use retrospective and prospective power? . . . 383

When should power be reported . . . 383

What is done with the “total sample size” reported by JMP? . . . 384

5.16.5 How to compare treatments to a single control? . . . 384

5.16.6 Experimental unit vs. observational unit . . . 384

5.16.7 Effects of analysis not matching design . . . 385

5.17 Table: Sample size determination for a two sample t-test . . . 388

5.18 Table: Sample size determination for a single factor, fixed effects, CRD . . . 390

5.19 Scientific papers illustrating the methods of this chapter . . . 393

5.19.1 Injury scores when trapping coyote with different trap designs . . . 393

6 Single factor - pairing and blocking 395 6.1 Introduction . . . 396

6.2 Randomization protocol . . . 399

6.2.1 Some examples of several types of block designs . . . 399

Completely randomized design - no blocking . . . 400

Randomized complete block design - RCB design . . . 400

Randomized complete block design - RCB design - missing values . . . 401

Incomplete block design - not an RCB . . . 401

Generalized randomized complete block design . . . 402

6.3 Assumptions . . . 403

6.3.1 Does the analysis match the design? . . . 403

6.3.2 Additivity between blocks and treatments . . . 404

6.3.3 No outliers should be present . . . 406

6.3.4 Equal treatment group standard deviations? . . . 406

6.3.5 Are the errors normally distributed? . . . 407

6.3.6 Are the errors independent? . . . 408

6.4 Comparing two means in a paired design - the Paired t-test . . . 408

6.5 Example - effect of stream slope upon fish abundance . . . 409

6.5.1 Introduction and survey protocol . . . 409

6.5.2 Using a Differences analysis . . . 412

(8)

6.5.4 Using a General Modeling analysis . . . 417

6.5.5 Which analysis to choose? . . . 420

6.5.6 Comments about the original paper . . . 420

6.6 Example - Quality check on two laboratories . . . 421

6.7 Example - Comparing two varieties of barley . . . 427

6.8 Example - Comparing prep of mosaic virus . . . 432

6.9 Example - Comparing turbidity at two sites . . . 437

6.9.1 Introduction and survey protocol . . . 437

6.9.2 Using a Differences analysis . . . 439

6.9.3 Using a Matched paired analysis . . . 442

6.9.4 Using a General Modeling analysis . . . 443

6.9.5 Which analysis to choose? . . . 446

6.10 Power and sample size determination . . . 447

6.11 Single Factor - Randomized Complete Block (RCB) Design . . . 449

6.11.1 Introduction . . . 449

6.11.2 The potato-peeling experiment - revisited . . . 449

6.11.3 An agricultural example . . . 450

6.11.4 Basic idea of the analysis . . . 451

6.12 Example - Comparing effects of salinity in soil . . . 453

6.12.1 Model building - fitting a linear model . . . 455

6.13 Example - Comparing different herbicides . . . 461

6.14 Example - Comparing turbidity at several sites . . . 468

6.15 Power and Sample Size in RCBs . . . 474

6.16 Example - BPK: Blood pressure at presyncope . . . 476

6.16.1 Experimental protocol . . . 476

6.16.2 Analysis . . . 479

6.16.3 Power and sample size . . . 484

6.17 Final notes . . . 487

6.18 Frequently Asked Questions (FAQ) . . . 488

6.18.1 Difference between pairing and confounding . . . 488

6.18.2 What is the difference between a paired design and an RCB design? . . . 489

6.18.3 What is the difference between a paired t-test and a two-sample t-test? . . . 489

6.18.4 Power in RCB/matched pair design - what is root MSE? . . . 490

6.18.5 Testing for block effects . . . 490

6.18.6 Presenting results for blocked experiment . . . 491

6.18.7 What is a marginal mean? . . . 491

6.18.8 Multiple experimental units within a block? . . . 492

6.18.9 How does a block differ from a cluster? . . . 492

7 Incomplete block designs 493 7.1 Introduction . . . 493

7.2 Example: Investigate differences in water quality . . . 494

8 Estimating an over all mean with subsampling 501 8.1 Average flagellum length . . . 502

(9)

8.1.2 Using the raw measurements . . . 508

8.1.3 Followup . . . 512

9 Single Factor - Sub-sampling and pseudo-replication 513 9.1 Introduction . . . 514

9.2 Example - Fat levels in fish - balanced data in a CRD . . . 514

9.2.1 Analysis based on sample means . . . 516

9.2.2 Analysis using individual values . . . 519

9.3 Example - fat levels in fish - unbalanced data in a CRD . . . 524

9.4 Example - Effect of UV radiation - balanced data in RCB . . . 525

9.4.1 Analysis on sample means . . . 528

9.4.2 Analysis using individual values . . . 531

9.5 Example - Monitoring Fry Levels - unbalanced data with sampling over time . . . 535

9.5.1 Some preliminary plots . . . 538

9.5.2 Approximate analysis of means . . . 540

9.5.3 Analysis of raw data . . . 544

9.5.4 Planning for future experiments . . . 545

9.6 Example - comparing mean flagella lengths . . . 547

9.6.1 Average-of-averages approach . . . 550

9.6.2 Analysis on individual measurements . . . 562

9.6.3 Followup . . . 568

9.7 Final Notes . . . 568

10 Two Factor Designs - Single-sized Experimental units - CR and RCB designs 569 10.1 Introduction . . . 570

10.1.1 Treatment structure . . . 571

Why factorial designs? . . . 572

Why not factorial designs? . . . 573

Displaying and interpreting treatment effects - profile plots . . . 573

10.1.2 Experimental unit structure . . . 579

10.1.3 Randomization structure . . . 581

10.1.4 Putting the three structures together . . . 582

10.1.5 Balance . . . 582

10.1.6 Fixed or random effects . . . 583

10.1.7 Assumptions . . . 584

10.1.8 General comments . . . 585

10.2 Example - Effect of photo-period and temperature on gonadosomatic index - CRD . . . 586

10.2.1 Design issues . . . 587

10.2.2 Preliminary summary statistics . . . 588

10.2.3 The statistical model . . . 593

10.2.4 Fitting the model . . . 594

10.2.5 Hypothesis testing and estimation . . . 595

10.3 Example - Effect of sex and species upon chemical uptake - CRD . . . 603

10.3.1 Design issues . . . 605

10.3.2 Preliminary summary statistics . . . 606

(10)

10.3.4 Fitting the model . . . 609

10.4 Power and sample size for two-factor CRD . . . 619

10.5 Unbalanced data - Introduction . . . 624

10.6 Example - Stream residence time - Unbalanced data in a CRD . . . 626

Design issues . . . 627

10.6.1 Preliminary summary statistics . . . 628

10.6.2 The Statistical Model . . . 629

10.6.3 Hypothesis testing and estimation . . . 631

10.6.4 Power and sample size . . . 641

10.7 Example - Energy consumption in pocket mice - Unbalanced data in a CRD . . . 641

10.7.1 Design issues . . . 643

10.7.2 Preliminary summary statistics . . . 643

10.7.3 The statistical model . . . 646

10.7.4 Fitting the model . . . 646

10.7.5 Hypothesis testing and estimation . . . 648

10.7.6 Adjusting for unequal variances? . . . 656

10.8 Example: Use-Dependent Inactivation in Sodium Channel Beta Subunit Mutation - BPK . . 656

10.8.1 Introduction . . . 656

10.8.2 Experimental protocol . . . 656

10.8.3 Analysis . . . 657

10.9 Blocking in two-factor CRD designs . . . 668

10.10FAQ . . . 669

10.10.1 How to determine sample size in two-factor designs . . . 669

10.10.2 What is the difference between a ‘block’ and a ‘factor’? . . . 669

10.10.3 If there is evidence of an interaction, does the analysis stop there? . . . 670

10.10.4 When should you use raw means or LSmeans? . . . 671

11 SAS CODE NOT DONE 673 12 Two-factor split-plot designs 674 12.1 Introduction . . . 674

12.2 Example - Holding your breath at different water temperatures - BPK . . . 675

12.2.1 Introduction . . . 675

12.2.2 Standard split-plot analysis . . . 677

12.2.3 Adjusting for body size . . . 685

12.2.4 Fitting a regression to temperature . . . 687

12.2.5 Planning for future studies . . . 691

12.3 Example - Systolic blood pressure before presyncope - BPK . . . 698

12.3.1 Experimental protocol . . . 698

12.3.2 Analysis . . . 701

12.3.3 Power and sample size determination . . . 707

13 Analysis of BACI experiments 709 13.1 Introduction . . . 710

13.2 Before-After Experiments - prelude to BACI designs . . . 714

13.2.1 Analysis of stream 1 - yearly averages . . . 717

(11)

13.2.3 Analysis of all streams - yearly averages . . . 721

13.2.4 Analysis of all streams - individual values . . . 724

13.3 Simple BACI - One year before/after; one site impact; one site control . . . 726

13.4 Example: Change in density in crabs near a power plant - one year before/after; one site impact; one site control . . . 727

13.4.1 Analysis . . . 732

13.5 Simple BACI design - limitations . . . 737

13.6 BACI with Multiple sites; One year before/after . . . 737

13.7 Example: Density of crabs - BACI with Multiple sites; One year before/after . . . 739

13.7.1 Converting to an analysis of differences . . . 741

13.7.2 Using ANOVA on the averages . . . 744

13.7.3 Using ANOVA on the raw data . . . 748

13.7.4 Model assessment . . . 751

13.8 BACI with Multiple sites; Multiple years before/after . . . 752

13.9 Example: Counting fish - Multiple years before/after; One site impact; one site control . . . 754

13.9.1 Analysis of the differences . . . 757

13.9.2 ANOVA on the raw data . . . 761

13.9.3 Model assessment . . . 763

13.10Example: Counting chironomids - Paired BACI - Multiple-years B/A; One Site I/C . . . 764

13.10.1 Analysis of the differences . . . 766

13.10.2 ANOVA on the raw data . . . 768

13.10.3 Model assessment . . . 771

13.11Example: Fry monitoring - BACI with Multiple sites; Multiple years before/after . . . 771

13.11.1 A brief digression . . . 772

13.11.2 Some preliminary plots . . . 775

13.11.3 Analysis of the averages . . . 779

13.11.4 Analysis of the raw data . . . 783

13.11.5 Power analysis . . . 786

13.12Closing remarks about the analysis of BACI designs . . . 787

13.13BACI designs power analysis and sample size determination . . . 788

13.13.1 Introduction . . . 788

13.13.2 Power: Before-After design . . . 791

Single Location studies . . . 792

Multiple Location studies . . . 794

13.13.3 Power: Simple BACI design - one site control/impact; one year before/after; inde-pendent samples . . . 798

13.13.4 Power: Multiple sites in control/impact; one year before/after; independent samples . 803 13.13.5 Power: One sites in control/impact; multiple years before/after; no subsampling . . . 808

13.13.6 Power: General BACI: Multiple sites in control/impact; multiple years before/after; subsampling . . . 811

14 Comparing proportions - Chi-square (χ2) tests 814 14.1 Introduction . . . 815

14.2 Response variables vs. Frequency Variables . . . 816

14.3 Overview . . . 818

(12)

14.4.1 Resource selection - comparison to known habitat proportions . . . 820

14.4.2 Example: Homicide and Seasons . . . 826

14.5 Comparing sets of proportions - single factor CRD designs . . . 830

14.5.1 Example: Elk habitat usage - Random selection of points . . . 830

14.5.2 Example: Ownership and viability . . . 834

14.5.3 Example: Sex and Automobile Styling . . . 839

14.5.4 Example: Marijuana use in college . . . 843

14.5.5 Example: Outcome vs. cause of accident . . . 847

14.5.6 Example: Activity times of birds . . . 851

14.6 Pseudo-replication - Combining tables . . . 853

14.7 Simpson’s Paradox - Combining tables . . . 857

14.7.1 Example: Sex bias in admissions . . . 857

14.7.2 Example: - Twenty-year survival and smoking status . . . 858

14.8 More complex designs . . . 859

14.9 Final notes . . . 859

14.10Appendix - how the test statistic is computed . . . 860

14.11Fisher’s Exact Test . . . 862

14.11.1 Sampling Protocol . . . 864

14.11.2 Hypothesis . . . 864

14.11.3 Computation . . . 864

14.11.4 Example: Relationship between Aspirin Use and MI . . . 867

Mechanics of the test . . . 868

14.11.5 Avoidance of cane toads by Northern Quolls . . . 870

15 Correlation and simple linear regression 878 15.1 Introduction . . . 879 15.2 Graphical displays . . . 880 15.2.1 Scatterplots . . . 880 15.2.2 Smoothers . . . 882 15.3 Correlation . . . 886 15.3.1 Scatter-plot matrix . . . 887 15.3.2 Correlation coefficient . . . 889 15.3.3 Cautions . . . 891 15.3.4 Principles of Causation . . . 893 15.4 Single-variable regression . . . 895 15.4.1 Introduction . . . 895

15.4.2 Equation for a line - getting notation straight (no pun intended) . . . 895

15.4.3 Populations and samples . . . 896

15.4.4 Assumptions . . . 897

Linearity . . . 897

Correct scale of predictor and response . . . 897

Correct sampling scheme . . . 897

No outliers or influential points . . . 898

Equal variation along the line . . . 898

Independence . . . 898

(13)

X measured without error . . . 899

15.4.5 Obtaining Estimates . . . 900

15.4.6 Obtaining Predictions . . . 902

15.4.7 Residual Plots . . . 903

15.4.8 Example - Yield and fertilizer . . . 903

15.4.9 Example - Mercury pollution . . . 914

15.4.10 Example - The Anscombe Data Set . . . 923

15.4.11 Transformations . . . 924

15.4.12 Example: Monitoring Dioxins - transformation . . . 925

15.4.13 Example: Weight-length relationships - transformation . . . 937

A non-linear fit . . . 945

15.4.14 Power/Sample Size . . . 946

15.4.15 The perils of R2 . . . 947

15.5 A no-intercept model: Fulton’s Condition Factor K . . . 950

15.6 Frequent Asked Questions - FAQ . . . 957

15.6.1 Do I need a random sample; power analysis . . . 957

16 SAS CODE NOT DONE 959 17 SAS CODE NOT DONE 960 18 SAS CODE NOT DONE 961 19 Estimating power/sample size using Program Monitor 962 19.1 Mechanics of MONITOR . . . 963

19.2 How does MONITOR work? . . . 972

19.3 Incorporating process and sampling error . . . 977

19.4 Presence/Absence Data . . . 986

19.5 WARNING about using testing for temporal trends . . . 989

20 SAS CODE NOT DONE 991 21 SAS CODE NOT DONE 992 22 SAS CODE NOT DONE 993 23 Logistic Regression - Advanced Topics 994 23.1 Introduction . . . 994

23.2 Sacrificial pseudo-replication . . . 995

23.3 Example: Fox-proofing mice colonies . . . 996

23.3.1 Using the simple proportions as data . . . 997

23.3.2 Logistic regression using overdispersion . . . 999

23.3.3 GLIMM modeling the random effect of colony . . . 1000

23.4 Example: Over-dispersed Seeds Germination Data . . . 1002

(14)

25 A short primer on residual plots 1011

25.1 Linear Regression . . . 1012

25.2 ANOVA residual plots . . . 1013

25.3 Logistic Regression residual plots - Part I . . . 1015

25.4 Logistic Regression residual plots - Part II . . . 1016

25.5 Poisson Regression residual plots - Part I . . . 1017

25.6 Poisson Regression residual plots - Part II . . . 1019

26 SAS CODE NOT DONE 1021 27 Tables 1022 27.1 A table of uniform random digits . . . 1022

27.2 Selected Binomial individual probabilities . . . 1026

27.3 Selected Poisson individual probabilities . . . 1034

27.4 Cumulative probability for the Standard Normal Distribution . . . 1037

27.5 Selected percentiles from the t-distribution . . . 1039

27.6 Selected percentiles from the chi-squared-distribution . . . 1040

27.7 Sample size determination for a two sample t-test . . . 1041

27.8 Power determination for a two sample t-test . . . 1043

27.9 Sample size determination for a single factor, fixed effects, CRD . . . 1045

27.10Power determination for a single factor, fixed effects, CRD . . . 1049

28 THE END! 1053 28.1 Statisfaction - with apologies to Jagger/Richards . . . 1053

28.2 ANOVA Man with apologies to Lennon/McCartney . . . 1055

29 An overview of enviromental field studies 1057 29.1 Introduction . . . 1058

29.1.1 Survey Methods . . . 1065

Simple Random Sampling . . . 1065

Systematic Surveys . . . 1067

Cluster sampling . . . 1070

Multi-stage sampling . . . 1074

Multi-phase designs . . . 1076

Summary comparison of designs . . . 1078

29.1.2 Permanent or temporary monitoring stations . . . 1079

29.1.3 Refinements that affect precision . . . 1080

Stratification . . . 1080

Auxiliary variables . . . 1083

Sampling with unequal probability . . . 1083

29.1.4 Sample size determination . . . 1084

29.2 Analytical surveys . . . 1084

29.3 Impact Studies . . . 1087

29.3.1 Before/After contrasts at a single site . . . 1088

29.3.2 Repeated before/after sampling at a single site. . . 1088

29.3.3 BACI: Before/After and Control/Impact Surveys . . . 1089

(15)

29.3.5 Enhanced BACI-P: Designs to detect acute vs. chronic effects or to detect changes

in variation as well as changes in the mean. . . 1094

29.3.6 Designs for multiple impacts spread over time . . . 1096

29.3.7 Accidental Impacts . . . 1099

29.4 Conclusion . . . 1108

29.5 References . . . 1110

29.6 Selected journal articles . . . 1112

29.6.1 Designing Environmental Field Studies . . . 1112

29.6.2 Beyond BACI . . . 1113

29.6.3 Environmental impact assessment . . . 1113

29.7 Examples of studies for discussion - good exam questions! . . . 1114

(16)

In the beginning...

Contents

1.1 Introduction . . . 15

1.2 Effective note taking strategies . . . 16

1.3 It’s all Γρκ to me . . . 19

1.4 Which computer package? . . . 19

1.5 FAQ - Frequently Asked Question . . . 28

1.5.1 Accessing journal articles from home . . . 28

1.5.2 Downloading from the web . . . 28

1.5.3 Printing 2 pages per physical page and on both sides of the paper . . . 28

1.5.4 Is there an on-line textbook? . . . 29

1.1

Introduction

To many students, statistics is synonymous with sadistics and is not a subject that is “enjoyable”. Obviously, I think this view is mistaken and hope to present some of the interesting things that can be done with statistics. Statistics is all about discovery - how to extract information in the face of uncertainty. In the past, learning about statistics was tedious because of the enormous amount of arithmetic that needed to be done. Now, we let the computer do the heavy lifting, but it now vitally important that you UNDERSTAND what a computer package is doing – after all, these computer packages don’t have a conceptual understanding of what the data are about. They will quite happily compute the average sex (where 0 codes males and 1 codes females) – only you can decide that this is a meaningless statistic to compute.

(17)

analysis might be performed using a statistical package. There is often no unique answer to a problem with several good alternatives always available, so don’t let my notes constrain your thinking.

Statistics is fun! Just ask my family:

1.2

Effective note taking strategies

This section is taken from : The Tomorrow’s Professor Listserv - an email list server on topics of general interest to higher education. It is available at Stanford University at http://sll.stanford.edu/.

It will soon become apparent, if it hasn’t already, that not all lectures are fascinating and stimulating, and that not all lecturers are born with a gift for public speaking.

However, the information and ideas that they are trying to impart are just as important, and any notes that you take in the lectures must be understandable to you, not only five minutes after the lecture has finished,

(18)

but in several months’ time, when you come to revise from them. The question, then, is how to retain your concentration and produce a good set of notes.

There are a few misconceptions on the part of students as to what can be expected of a lecture session. Firstly, that the responsibility for the success of the lecture is entirely the instructor’s, and that the student’s role is to sit and listen or to take verbatim notes. Secondly, that the purpose of a lecture is to impart infor-mation which will be needed for an exam question. And thirdly, that attending the lecture, and taking notes, is an individual, even competitive, activity.

This page aims to correct these ideas, and to help you develop successful note-taking strategies.

BEFORE THE LECTURE

• If you know the subject of the lecture, do some background reading beforehand. This way, you will go into the lecture with a better understanding, and find it easier to distinguish the points worth noting. • Read through the notes of the previous lecture in the series just before the present one begins. This helps orient your thoughts to the subject in hand, especially if you have just come from a lecture on a completely different subject.

DURING THE LECTURE

• Think of a lecture as an active, learning process, rather than a passive, secretarial exercise. Writing verbatim what the lecturer says, or copying everything down from overheads, does not involve much thought, and subsequent reading of these notes often makes little sense.

• Pages of continuous prose are the least helpful for revision. Some things said in the lecture are obvi-ously more important than others, and the notes you take should reflect this. Try to give them some structure, by using headings and sub-headings, by HIGHLIGHTING or underlining key ideas and re-alizing the links between them. Alternative noting forms to linear notes, such as flow diagrams or star charts, can be used, although these are often more helpful to revise your notes (see After the Lecture). • In some situations, you may be directed in the amount of note-taking necessary. For example, the lecturer may start off by giving you some references on the subject he/she is to lecture on. A good strategy to adopt in this case would be to note down carefully the references, then just LISTEN to the lecture, making brief notes about the main points or specific examples. Taking notes from books is far easier than in lectures, as you can go at your own pace, stop and think about something, re-read a section etc. Use the lecture to try and understand the concepts being explained.

• Or, the lecturer may give hand-outs to accompany the lecture. In this case, you don’t need to make copious notes of your own. Again, listen to what is being said, and annotate the hand-out with any extra information. It gives you more time to think, and perhaps raise questions of your own.

• On the subject of questions, it is commonly believed by students that lecturers are not to be interrupted when they are in full flow. You may find that this isn’t always the case, and there is nothing wrong

(19)

with asking individual lecturers if they mind taking questions during the lecture. It is best to establish this at the beginning of the course of lectures.

• However, there is also the problem of speaking out in front of your peers, perhaps asking something foolish, or not having the time to frame your question well. In this case, write down the question in the margin of your notes, to ask the lecturer later, or check with friends or in a textbook. It is far easier to recall the question you wanted to ask in this way, rather than rely on remembering after the lecture has finished (or even when you come to revise from your notes!)

AFTER THE LECTURE

• The best time to review your lecture notes is immediately following the lecture, although this is not always possible if, for example, you have to go straight to another one. However, the sooner you do it, with the lecture still fresh in your mind, the better chance you have to produce a good set of notes for revision.

• Revising your notes does not mean writing them out neatly!

• Try swapping notes with a friend, to check the accuracy/omissions of your own, and your understand-ing of the key points.

• If you feel that your notes are incomplete, or if you jotted down any questions during the lecture, follow this up by asking your tutor, or by reading round the subject.

• Transforming your lecture notes by using a different noting form can sometimes make them clearer, e.g. a flow diagram

• Highlight key points; produce summaries for revision purposes.

• Think how this topic relates to previous ones, or to other courses that you are studying, and begin to recognize themes and relationships.

• Meet with a few friends after lectures, to discuss the lecture topic and answer each others questions. Discussion with your peers often leads to a better understanding of a subject, which in turn makes it easier to remember. Your group could also establish a reading syndicate, whereby reading lists can be divided between members, who each take notes on their allotted texts and give copies to the rest of the group.

STORING YOUR NOTES

• A little time spent at this stage in organizing your notes will make life much easier when you come to revise from them some months later.

• Numbering pages, making a contents page, or using dividers in your file will all make your notes more accessible.

(20)

1.3

It’s all Γρκ to me

There are several common Greek letters and special symbols that are used in Statistics. In this section, we illustrate the common Greek letters and notation used.

Check that the following symbols and small equations.

• α - the Greek letter alpha • β - the Greek letter beta • λ - the Greek letter lambda

• µ - the Greek letter mu (looks like a ‘u’ with a tail in front) • σ - the Greek letter sigma (looks like an ‘o’ with a top line)

• X2, X

2- an X with a superscript 2 and then an X with a subscript 2

• Y - Y-bar, a Y with a bar across the top

• bY - Y-hat, a Y with a circumflex over it

• x

y - a fraction x/y in vertical format

• Z = X−µσ - an equation with the x-mu above the Greek letter sigma

• √n - square root of n

• p - p-hat - a p with a circumflex over itb • 6= - a not-equal sign

• ± - a plus/minus sign • × - a multiplication sign

• Pn

i=1something - a summation of something with the index i ranging from 1 to n.

1.4

Which computer package?

Modern Statistics relies heavily upon computing – many would say that many modern statistical methods would be infeasible without modern software. Rather than spending time on tedious arithmetic, or on trying to reinvent the wheel, many people rely upon modern statistical packages.

(21)

• SAS. Available in Windoze and Unix flavours. Modern Macintoshs with the Intel chip can run SAS

under Windoze.1

One of the best packages around. SAS can handle nearly any type of data (dates, times, characters, numbers) with many posible analyses (over 100 different base analyses are currently available) and allows virtually arbitrary input formats and structures. SAS is extremely flexible and powerful but has a very steep learning curve. This is the premier statistical procedure – virtually all statistical analyses

can be done with SAS. This is the package that I, as a Professional Statistician2use regularly in my

job.

Not only does SAS have modern statistical procedures, but is also a premiere database management system. It is designed for heavy duty computing.

Refer to http://www.sas.com for more details on this package. The SAS program includes a module SAS/INSIGHT that is virtually identical to JMP (see below).

• SPSS. This is a fairly powerful package (but not nearly as broad as SAS). It is very popular with Social Sciences researchers, but I personally, prefer SAS.

Refer to http://www.spss.com for more details on this package.

• JMP. JMP was originally developed by one of the two SAS developers (who were the 68 and 138 rich-est people in USA/NA in 2003). John Saul developed JMP. He did this originally for the MacIntosh platform and called it John’s MacIntosh Product ergo the name JMP.

JMPruns on Macintosh, Linux, and Windoze platforms.

JMPis easy to use and fairly powerful package. You should be able to do most things in this course

in JMP.

JMPdoes not have the range of procedures as in SAS, nor can it deal with as complex data structures.

However, my guestimation is that most people can do 80% of their statistical computing using JMP. Refer to http://www.jmp.com for more details on this package.

• SYSTAT. This package has good graphical procedures, a fairly wide range of statistical procedures, but the package is showing its age. I find SYSTAT clumsy compared to using JMP and SAS and everytime I use it, I quickly get frustrated by its limitations and clumsy operations.

A review of SYSTAT is available in Hilbe, J. M. (2008).

Systat 12.2: An overview

American Statistician, 62, 177-178

http://dx.doi.org/10.1198/000313008x299339 Refer to http://www.systat.com for more details.

• STATA I have never used STATA but a nice review of the package is found in

Hilbe, J.M. (2005). A review of Stata 9.0. The American Statistician, 59, 335-348.

1There is a VERY old version of SAS that runs under older Macintoshes. This is a very old version and should not be used. 2The Statistical Society of Canada has undertaken a program to accredit statisticians in Canada. Visit http://www.ssc.ca for

(22)

According to this review, Stata would be of interest to biostatisticians, mediacal/health outcomes, econometric, and social science research.

• S-PLUS/R. S-PLUS and R are based on the S-programing language. As the name implies, S-PLUS is an extended version of S with a nice graphical interface. R is a freeware version of S − P lus with the basically the same functionality and can be freely downloaded from the WWW. It does not have the nice graphical interface.

These packages are commonly used by statisticians when developing new statistical procedures. They are yery flexible, but require a somewhat steep learning curve.

Refer to http://www.insightful.com/ for information on S-PLUS and http://www.r-project.

org/for information about R.

• Excel. This is the standard spreadsheet program in the MSOffice Suite. Excel comes with some basic statistics but nothing too fancy. While EXCEL has its uses, you will find quickly that it can’t handly more complex analysis situations and gives wrong results without warning!

Except for very simple statistics, I RECOMMEND AGAINST the use of Excel to do statistical analyses.

People are wedded to Excel for often spurious reasons: – "Its free." So is R and you get a much superior product.

– "It is easy to use". Yes, and easy to get WRONG answers without warnings.

– "It has good graphs". Excel has the largest selection of BAD graphs in the world. Hardly any of them are useful!

The following articles discuss some of the problems with Excel. They can also be accessed directly from the web by clicking on their respective links.

– J. Cryer from the University of Iowa discusses some of the problems with using Excel for ana-lyzing data at http://www.cs.uiowa.edu/~jcryer/JSMTalk2001.pdf.

– Yet more problems with Excel are discussed in : Practical Stat with Excel? available at http:

//www.practicalstats.com/xlsstats/excelstats.html and a copy of which

is included in these notes.

– Yet more problems with Excel: Using Excel in Statistics? available at http://www.umass. edu/acco/statistics/handout/excel.html.

– An article by the Statistical Consulting Service at the University of Reading has a brief discus-sion of the pros and cons of using Excel or analyzing data at http://www.rdg.ac.uk/ ssc/software/excel/home.html. Basically, the graphs presented by Excel are often inappropriate for data presentation and you quickly run into limitations of the analysis routines available.

– How to use the basic functions of Excel for Statistics. This page also has a link to discussion about regression in Excel. Using Excel functions in Statistics available at http://physicsnt. clemson.edu/chriso/tutorials/excel/stats.html

– Spreadsheet Addiction. Some of the problems in Excel and alterntives. Has a long bibliography on the problems with Excel. Very nice summary document - well worth the read. Available at http://www.burns-stat.com/pages/Tutor/spreadsheet_addiction.html.

(23)

There are a number of “add ons” available for Excel that seem to be reasonable priced and extend the analyses available.

Nevertheless, the algorithms used in Excel to do the actual computations are flawed and can give INCORRECT results without any warning that something has gone wrong!

For this reason, I generally use Excel only for simple problems - for anything more complex than a simple mean, I reach for a package such a JMP or SAS.

(24)

Practical Stats

Make sense of your data!

Home

Consulting Upcoming Classes

Applied Environmental Stats Nondetects (NADA)

Multivariate Relationships Newsletter

Downloads

Statistics With Excel?

Which Test Do I Use? Who Else Has Taken These? Contact Us

Is Microsoft Excel an Adequate Statistics Package?

It depends on what you want to do, but for many tasks, the answer is ‘No’.

Excel is available to many people as part of Microsoft Office. It contains some statistical functions in its basic installation. It also comes with statistical routines in the Data Analysis Toolpak, an add-in found separately on the Office CD. You must install the Toolpak from the CD in order to get these routines on the Tools menu. Once installed, these routines are at the bottom of the Tools menu, in the "Data Analysis" command. People use Excel as their everyday statistics software because they have already purchased it. Excel’s limitations, and occasionally its errors, make this a problem. Below are some of the concerns with using Excel for statistics that are recorded in journals, on the web, and from personal experience.

Limitations of Excel

1. Many statistical methods are not available in Excel.

Excel's biggest problem. Commonly-used statistics and methods NOT available within Excel include: * Boxplots

* p-values for the correlation coefficient

* Spearman’s and Kendall’s rank correlation coefficients * 2-way ANOVA with unequal sample sizes (unbalanced data) * Multiple comparison tests (post-hoc tests following ANOVA)

(25)

* p-values for two-way ANOVA * Levene’s test for equal variance

* Nonparametric tests, including rank-sum and Kruskal-Wallis * Probability plots

* Scatterplot arrays or brushing

* Principal components or other multivariate methods * GLM (generalized linear models)

* Survival analysis methods

* Regression diagnostics, such as Mallow’s Cp and PRESS ( it does compute adjusted r-squared) * Durbin-Watson test for serial correlation

* LOESS smooths

Excel's lack of functionality makes it difficult to use for more than computing summary statistics and simple univariate regression. Third-party add-ins to Excel attempt to compensate for these limitations, adding new functionality to the program (see "A Partial Solution", below).

2. Several Excel procedures are misleading.

Probability plots are a standard way of judging the adequacy of the normality assumption in regression. In statistics packages, residuals from the regression are easily, or in some cases automatically, plotted on a normal probability plot. Excel’s regression routine provides a Normal Probability Plot option. However, it produces a probability plot of the Y variable, not of the residuals, as would be expected.

Excel’s CONFIDENCE function computes z intervals using 1.96 for a 95% interval. This is valid only if the population variance is known, which is never true for experimental data. Confidence intervals computed using this function on sample data will be too small. A t-interval should be used instead.

Excel is inconsistent in the type of P-values it returns. For most functions of probabilities, Excel acts like a lookup table in a textbook, and returns one-sided p-values. But in the TINV function, Excel returns a 2-sided p-value. Look carefully at the documentation of any Excel function you use, to be certain you are getting what you want.

Tables of standard distributions such as the normal and t distributions return p-values for tests, or are used to confidence intervals. With Excel, the user must be careful about what is being returned. To compute a 95% t confidence interval around the mean, for example, the standard method is to look up the t-statistic in a textbook by entering the table at a value of alpha/2, or 0.025. This t-statistic is multiplied by the standard error to produce the length of the t-interval on each side of the mean. Half of the error (alpha/2) falls on each side of the mean. In Excel the TINV function is entered using the value of alpha, not alpha/2, to return the same number.

For a one-sided t interval at alpha=0.05, standard practice would be to look up the t-statistic in a textbook for alpha=0.05. In Excel, the TINV function must be called using a value of 2*alpha, or 0.10, to get the value for alpha=0.05. This nonstandard entry point has led several reviewers to state that Excel’s distribution functions are incorrect. If not incorrect, they are certainly nonstandard. Make sure you read the help menu descriptions carefully to know what each function produces.

3. Distributions are not computed with precision.

NEW In reference (1), the authors show that all problems found in Excel 97 are still there in Excel 2000 and

(26)

XP. They say that "Microsoft attempted to fix errors in the standard normal random number generator and the inverse normal function, and in the former case actually made the problem worse." From this, you can assume that the problems listed below are still there in the current versions of the software.

Statistical distributions used by Excel do not agree with better algorithms for those distributions at the third digit and beyond. So they are approximately correct, but not as exact as would be desired by an exacting statistician. This may not be harmful for hypothesis tests unless the third digit is of concern (a p-value of 0.056 versus 0.057). It is of most concern when constructing intervals (multiplying a std dev of 35 times 1.96 give 68.6; times 1.97 gives 69.0) As summarized in reference 2:

"…the statistical distributions of Excel already have been assessed by Knusel (1998), to which we refer the

interested reader. He found numerous defects in the various algorithms used to compute several

distributions, including the Normal, Chi-square, F and t, and summarized his results concisely: So one has to warn statisticians against using Excel functions for scientific purposes. The performance of Excel in this area can be judged unsatisfactory."

4. Routines for handling missing data were incorrect.

This was the largest error in Excel, but a 'band-aid' has been added in Office 2000. In earlier versions of Excel, computations and tests were flat out wrong when some of the data cells contained missing values, even for simple summary statistics. See (3) , (5), and page 4 of (6). Error messages are now displayed in Excel 2000 when there are missing values, and no result is given. Although this is still inferior to computing correct results it is somewhat of an improvement.

In reference to pre-2000,

"Excel does not calculate the paired t-test correctly when some observations have one of the measurements

but not the other." E. Goldwater, ref. (5)

5. Regression routines are incorrect for multicollinear data.

This affects multiple regression. A good statistics package will report errors due to correlations among the X variables. The Variance Inflation Factor (VIF) is one measure of collinearity. Excel does not compute collinearity measures, does not warn the user when collinearity is present, and reports parameter estimates that may be nonsensical. See (6) for an example on data from an experiment. Are multicollinear data of concern in ‘practical’ problems? I think so -- I find many examples of collinearity in environmental data sets.

Excel also requires the X variables to be in contiguous columns in order to input them to the procedure. This can be done with cut and paste, but is certainly annoying if many multiple regression models are to be built.

6. Ranks of tied data are computed incorrectly.

When ranking data, standard practice is to assign tied ranks to tied observations. The value of these ranks should equal the median of the ranks that the observations would have had, if they had not been tied. For example, three observations tied at a value of 14 would have had the ranks of 7, 8 and 9 had they not been tied. Each of the three values should be assigned the rank of 8, the median of 7, 8 and 9.

Excel assigns the lowest of the three ranks to all three observations, giving each a rank of 7. This would result in problems if Excel computed rank-based tests. Perhaps it is fortunate none are available.

7. Many of Excel's charts violate standards of good graphics.

(27)

Use of perspective and glitz (donut charts?) violate basic principles of graphics. Excel's charts are more suitable to USA Today than to scientific reports. This bothers some people more than others.

"Good graphs should….[a list of traits]…However, Excel meets virtually none of these criteria. The vast

majority of chart types produced by Excel should never be used!" -- Jon Cryer, ref (3).

"Microsoft Excel is an example of a package that does not allow enough user control to consistently make

readable and concise graphs from tables."

- A. Gelman et al., 2002, The American Statistician 56, p.123.

A partial solution:

Some of these difficulties (parts of 1,2, 6 and 7) can be overcome by using a good set of add-in routines. One of the best is StatPlus, which comes with an excellent textbook, "Data Analysis with Microsoft Excel". With StatPlus, Excel becomes an adequate statistical tool., though still not in the areas of multiple regression and ANOVA for more than one factor. Without this add-in Excel is inadequate for anything beyond basic summary statistics and simple regression.

Data Analysis with Microsoft Excel by Berk and Carey published by Duxbury (2000).

Opinion: Get this book if you're going to use Excel for statistics.

(I have no connection with the authors of StatPlus and get no benefit from this recommendation. I'm just a satisfied user.)

Some advice from others:

"If you need to perform analysis of variance, avoid using Excel, unless you are dealing with extremely

simple problems."

- Statistical Services Centre, Univ. of Reading, U.K. (at A, below)

"Enterprises should advise their scientists and professional statisticians not to use Microsoft Excel for

substantive statistical analysis. Instead, enterprises should look for professional statistical analysis software certified to pass the (NIST) Statistical Reference Datasets tests to their users' required level of accuracy."

- The Gartner Group References:

1) On the accuracy of statistical procedures in Microsoft Excel 2000 and Excel XP

B.D. McCullough and B. Wilson, (2002), Computational Statistics & Data Analysis, 40, pp 713 - 721

(2) On the accuracy of statistical procedures in Microsoft Excel ‘97

B.D. McCullough and B. Wilson, (1999), Computational Statistics & Data Analysis, 31, pp 27-37

(3) Problems with using Microsoft Excel for statistics [pdf Download]

J.D. Cryer, (2001), presented at the Joint Statistical Meetings, American Statistical Association, 2001, Atlanta Georgia

[pdf download]

(4) Use of Excel for statistical analysis

Neil Cox, (2000), AgResearch Ruakura

(28)

(5) Using Excel for statistical data analysis

Eva Goldwater, (1999), Univ. of Massachusetts Office of Information Technology

[pdf download]

(6) Statistical analysis using Microsoft Excel [pdf download] Jeffrey Simonoff, (2002)

[pdf download]

(7) Spreadsheet addiction

Patrick Burns

(8) On the Accuracy of Statistical Distributions in Microsoft Excel 97 Leo Knuesel

[pdf download]

(9) Statistical flaws in Excel Hans Pottel

[pdf download]

Guides to Excel on the web:

(A) A Beginner's Guide to Excel - Univ. of Reading, UK (B) An Intermediate Guide to Excel - Univ. of Reading, UK

Note: All opinions other than those cited as coming from others are my own. Home >

Statistics With Excel? > © 2007 Practical Stats Email Us

(29)

1.5

FAQ - Frequently Asked Question

1.5.1

Accessing journal articles from home

I make reference to several journal articles in these notes. These are often available in e-journals so the link should take you there directly IF you are authorized to access this journal. For example, if you try and access these articles from a computer with an SFU IP address, you should likely be granted permission without problems.

However, if you are trying to access these from home, you must go through the SFU library site and ac-cess the e-journal via the catalogue. You will then be prompted to enter your SFU ACS userid and password to grant you access to this journal

1.5.2

Downloading from the web

When ever I try and download an Excel file, it seems to be corrupted and can’t be opened.

Through out the notes, reference is made to spreadsheets or SAS programs available at my web site. The SAS programs and listings are simple text files and should transfer to your computer without much problem.

If you trying to download an Excel spreadsheet, be sure to specify that the file should be transfered as a source document rather than as text. If you transfer the sheets as text, you will find that the data are corrupted.

1.5.3

Printing 2 pages per physical page and on both sides of the paper

The notes look as if I could print 2 per page. Is this possible, and can I print on both sides of the paper?

Yes, it is possible to print two logical pages per physical page - the text is a bit small, but still readable. On a Macintosh System with a recent OS, when you select Print, it presents the standard print options menu. Under the popdown menu is a Layout option. Select 2 logical pages per physical page. This will work with ALL applications that use the standard print dialogue.

I’m not familiar enough with Windoze machines to offer any advice.

To print on both sides of the paper, you need a printer capable of duplex printing, i.e. on both sides of the paper. I believe that most printers in the public areas of campus are capable of this. You will have to consult

(30)

your own printer manual if you are printing at home. Otherwise, you have to print first the odd pages, then take the paper, reverse it, and print the even pages - a recipe for disaster.

1.5.4

Is there an on-line textbook?

Are there any online textbooks in statistics?

Yes, there are several - it is easiest to search the web using google. Beware that some of the advice on the web may be less than perfect.

StatSoft has a highly regarded statistical online text book at http://www.statsoft.com/textbook/ stathome.html.

Contents

1.1 Introduction . . . 15 1.2 Effective note taking strategies . . . 16 1.3 It’s all Γρκ to me . . . 19 1.4 Which computer package? . . . 19 1.5 FAQ - Frequently Asked Question . . . 28 1.5.1 Accessing journal articles from home . . . 28 1.5.2 Downloading from the web . . . 28 1.5.3 Printing 2 pages per physical page and on both sides of the paper . . . 28 1.5.4 Is there an on-line textbook? . . . 29

(31)

Introduction to Statistics

Statistics was spawned by the information age, and has been defined as the science of extracting information from data. Technological developments have demanded methodology for the efficient extraction of reli-able statistics from complex databases. As a result, Statistics has become one of the most pervasive of all disciplines.

Theoretical statisticians are largely concerned with developing methods for solving the problems in-volved in such a process, for example, finding new methods for analyzing (making sense of) types of data that existing methods cannot handle. Applied statisticians collaborate with specialists in other fields in ap-plying existing methodologies to real world problems. In fact, most statisticians are involved in both of these activities to a greater or lesser extent, and researchers in most quantitative fields of enquiry spend a great deal of their time doing applied statistics.

The public and private sector rely on statistical information for such purposes as decision making, regu-lation, control and planning.

Ordinary citizens are exposed to many ‘statistics’ on a daily basis. For example:

• “In a poll of 1089 Canadians, 47% were in favor of the constitution accord. This result is accurate to within 3 percentage points, 19 time out of 20.”

• “The seasonally adjusted unemployment rate in Canada was 9.3%”. • “Two out of three dentists recommend Crest.”

What does this all mean?

Our goal is not to make each student a ‘professional statistician’, but rather to give each student a subset of tools with which they can confidently approach many real world problems and make sense of the numbers.

(32)

2.1

TRRGET - An overview of statistical inference

Section summary:

1. Distinguish between a population and a sample 2. Why it is important to choose a probability sample

3. Distinguish among the roles of randomization, replication, and blocking 4. Distinguish between an ‘estimate’ or a ‘statistic’ and the ‘parameter’ of interest.

Most studies can be broadly classified into either surveys or experiments.

In surveys, the researcher is typically interested in describing some population - there is usually no attempt to manipulate units within the population. In experiments, units from the population are manipulated in some fashion and a response to the manipulation is observed.

There are four broad phases to the survey or the experiment. These phases define the paradigm of Statistical Inference. These phases will be illustrated in the context of a political poll of Canadians on some issue as illustrated in the following diagram.

(33)

The four phases are:

1. What is the population of interest and what is the parameter of interest? This formulates the research question - what is being measured and what is of interest.

In this case, the population of interest is likely all eligible voters in Canada and the parameter of interest is the proportion of all eligible voters in favor of the accord.

It is conceivable, but certainly impractical, that every eligible voter could be contacted and their opin-ion recorded. You would then know the value of the parameter exactly and there would be no need to do any statistics. However, in most real world situations, it is impossible or infeasible to measure every unit in the population.

Consequently, a sample is taken. 2. Selecting a sample

We would like our sample to be as representative as possible - how is this achieved? We would like our answer from our sample to be as precise as possible - how is this achieved? And, we may like to modify our sample selection method to take into account known division of the population - how is this achieved?

Three fundamental principles of Statistics are randomization, replication and blocking.

Randomization This is the most important aspect of experimental design and surveys. Randomiza-tion “makes” the sample ‘representative’ of the populaRandomiza-tion by ensuring that, ‘on average’, the sample will contain a proportion of population units that is about equal, for any variable as found in the population.

If an experiment is not randomized or a survey is not randomly collected, it rarely (if ever) provides useful information.

Many people confuse ‘random’ with ‘haphazard’. The latter only means that the sample was collected without a plan or thought to ensure that the sample obtained is representative of the population. A truly ‘random’ sample takes surprisingly much effort to collect!

E.g. The Gallup poll uses random digit dialing to select at random from all households in Canada with a phone. Is this representative of the entire voting population? How does the Gallup Poll account for the different patterns of telephone use among genders within a household?

A simple random sample is an example of a equal probability sample where every unit in the population has an equal chance of being selected for the sample. As you will see later in the notes, the assumption of equal probability of selection not crucial. What is crucial is that every unit in the population have a known probability of selection, but this probability could vary among units. For example, you may decide to sample males with a higher probability than females.

Replication = Sample Size This ensures that the results from the experiment or the survey will be precise enough to be of use. A large sample size does not imply that the sample is representative - only randomization ensures representativeness.

Do not confuse replication with repeating the survey a second time.

In this example, the Gallup poll interviews about 1100 Canadians. It chooses this number of people to get a certain precision in the results.

(34)

Blocking (or stratification) In some experiments or surveys, the researcher knows of a variable that strongly influences the response. In the context of this example, there is strong relationship between the region of the country and the response.

Consequently, precision can be improved, by first blocking or stratifying the population into more homogeneous groups. Then a separate randomized survey is done in each and every stratum and the results are combined together at the end.

In this example, the Gallup poll often stratifies the survey by region of Canada. Within each region of Canada, a separate randomized survey is performed and the results are then combined appropriately at the end.

3. Data Analysis

Once the survey design is finalized and the survey is conducted, you will have a mass of information - statistics - collected from the population. This must be checked for errors, transcribed usually into machine readable form, and summarized.

The analysis is dependent upon BOTH the data collected (the sample) and the way the data was collected (the sample selection process). For example, if the data were collected using a stratified sampling design, it must be analyzed using the methods for stratified designs - you can’t simply pretend after the fact that the data were collected using a simple random sampling design.

We will emphasize this point continually in this course - you must match the analysis with the design! For example, consider a Gallup Poll where 511 out of 1089 Canadians interviewed were in favor of an issue. Then our statistics is that 47% of our sample respondents were in favor.

4. Inference back to the Population

Despite an enormous amount of money spent collecting the data, interest really lies in the population, not the sample. The sample is merely a device to gather information about the population.

How should the information from the sample, be used to make inferences about the population? Graphing A good graph is always preferable to a table of numbers or to numerical statistics. A graph

should be clear, relevant, and informative. Beware of graphs that try to mislead by design or accident through misleading scales, chart junk, or three dimensional effects.

There a number of good books on effective statistical graphics - these should be consulted for

fur-ther information.1 Unfortunately, many people rely upon the graphical tools available in

spread-sheet software such as Excel which invariably leads to poor graphs. As a rule of thumb, Excel has the largest collection of bad graphical designs available in the free world! You may enjoy the artilce on “Using Microsoft Excel to obscure your data and annow your readers” available at http://www.biostat.wisc.edu/~kbroman/presentations/graphs_uwpath08_ handout.pdf.

Estimation The number obtained from our sample is an estimate of the true, unknown, value of the population parameter. How precise is our estimate? Are we within 10 percentage points of the correct answer?

A good survey or experiment will report a measure of precision for any estimate.

In this example, 511 of 1089 people were in favor of the accord. Our estimate of the proportion of all Canadian voters in favor of the accord is 511/1089 = 47%. These results are ‘accurate to

(35)

within 3 percentage points, 19 times out of 20’, which implies that we are reasonably confident that the true proportion of voters in favor of the accord is between 47% − 3% = 44% and 47% + 3% = 50%.

Technically, this is known as a 95% confidence interval - the details of which will be explored later in this chapter.

(Hypothesis) Testing Suppose that in last month’s poll (conducted in a similar fashion), only 42% of voters were in favor. Has the support increased? Because each percentage value is accurate to about 3 percentage points, it is possible that in fact there has been no change in support!. It is possible to make a more formal ‘test’ of the hypothesis of no change. Again, this will be explored in more detail later in this chapter.

2.2

Parameters, Statistics, Standard Deviations, and Standard

Er-rors

Section summary:

1. Distinguish between a parameter and a statistic 2. What does a standard deviation measure? 3. What does a standard error measure?

4. How are estimated standard errors determined (in general)?

2.2.1

A review

DDTs is a very persistent pesticide. Once applied, it remains in the environment for many years and tends to accumulate up the food chain. For example, birds which eat rodents which eat insects which ingest DDT contaminated plants can have very high levels of DDT and this can interfere with reproduction. [This is similar to what is happening in the Great Lakes where herring gulls have very high levels of pesticides or what is happening in the St. Lawrence River where resident beluga whales have such high levels of contaminants, they are considered hazardous waste if they die and wash up on shore.] DDT has been banned in Canada for several years, and scientists are measuring the DDT levels in wildlife to see how quickly it is declining.

The Science of Statistics is all about measurement and variation. If there was no variation, there would be no need for statistical methods. For example, consider a survey to measure DDT levels in gulls on Triangle Island off the coast of British Columbia, Canada. If all the gulls on Triangle Island had exactly the same DDT level, then it would suffice to select a single gull from the island and measure its DDT level.

Alas, the DDT level can vary by the age of the gull, by where it feeds and a host of other unknown and uncontrollable variables. Consequently the average DDT level over ALL gulls on Triangle Island seems like

References

Related documents