
Springer Texts in Statistics

Advisors:

George Casella Stephen Fienberg Ingram Olkin

Springer Science+Business Media, LLC


Springer Texts in Statistics

Alfred: Elements of Statistics for the Life and Social Sciences
Berger: An Introduction to Probability and Stochastic Processes
Bilodeau and Brenner: Theory of Multivariate Statistics
Blom: Probability and Statistics: Theory and Applications
Brockwell and Davis: An Introduction to Time Series and Forecasting
Chow and Teicher: Probability Theory: Independence, Interchangeability, Martingales, Third Edition
Christensen: Plane Answers to Complex Questions: The Theory of Linear Models, Second Edition
Christensen: Linear Models for Multivariate, Time Series, and Spatial Data
Christensen: Log-Linear Models and Logistic Regression, Second Edition
Creighton: A First Course in Probability Models and Statistical Inference
Dean and Voss: Design and Analysis of Experiments
du Toit, Steyn, and Stumpf: Graphical Exploratory Data Analysis
Durrett: Essentials of Stochastic Processes
Edwards: Introduction to Graphical Modelling, Second Edition
Finkelstein and Levin: Statistics for Lawyers
Flury: A First Course in Multivariate Statistics
Jobson: Applied Multivariate Data Analysis, Volume I: Regression and Experimental Design
Jobson: Applied Multivariate Data Analysis, Volume II: Categorical and Multivariate Methods
Kalbfleisch: Probability and Statistical Inference, Volume I: Probability, Second Edition
Kalbfleisch: Probability and Statistical Inference, Volume II: Statistical Inference, Second Edition
Karr: Probability
Keyfitz: Applied Mathematical Demography, Second Edition
Kiefer: Introduction to Statistical Inference
Kokoska and Nevison: Statistical Tables and Formulae
Kulkarni: Modeling, Analysis, Design, and Control of Stochastic Systems
Lehmann: Elements of Large-Sample Theory
Lehmann: Testing Statistical Hypotheses, Second Edition
Lehmann and Casella: Theory of Point Estimation, Second Edition
Lindman: Analysis of Variance in Experimental Design
Lindsey: Applying Generalized Linear Models
Madansky: Prescriptions for Working Statisticians
McPherson: Applying and Interpreting Statistics: A Comprehensive Guide, Second Edition
Mueller: Basic Principles of Structural Equation Modeling: An Introduction to LISREL and EQS

(continued after index)

Glen McPherson

Applying and

Interpreting Statistics

A Comprehensive Guide

Second Edition

With 64 Figures

Springer

Glen McPherson

School of Mathematics and Physics
The University of Tasmania
Hobart, Tasmania 7001
Australia
[email protected]

Editorial Board

George Casella
Department of Statistics
University of Florida
P.O. Box 118545
Gainesville, FL 32611-8545
USA

Stephen Fienberg
Department of Statistics
Carnegie Mellon University
Pittsburgh, PA 15213-3890
USA

Ingram Olkin
Department of Statistics
Stanford University
Stanford, CA 94305
USA

Cover illustration: A portion of Figure 17.7.2, which illustrates the position of the first canonical variable axis in a multivariate analysis.

Library of Congress Cataloging-in-Publication Data

McPherson, Glen.

Applying and interpreting statistics: a comprehensive guide/Glen McPherson.-2nd ed.

p. cm. - (Springer texts in statistics)

Rev. ed. of: Statistics in scientific investigation. c1990. Includes bibliographical references and index.

1. Research-Statistical methods. 2. Science-Statistical methods. 3. Statistics.

I. McPherson, Glen. Statistics in scientific investigation. II. Title. III. Series.

Q180.55.S7 M36 2001

507'.2-dc21 00-056318

Printed on acid-free paper.

Microsoft, Windows, Excel, and Word are registered trademarks of Microsoft Corporation; MINITAB is a registered trademark of Minitab, Inc.; SAS is a registered trademark of The SAS Institute, Inc.; and SPSS is a registered trademark of SPSS, Inc.

First edition, Statistics in Scientific Investigation: Its Basis, Application, and Interpretation, © 1990 Springer-Verlag New York, Inc.

ISBN 978-1-4419-2879-5 ISBN 978-1-4757-3435-5 (eBook) DOI 10.1007/978-1-4757-3435-5

© 2001 Springer Science+Business Media New York

Originally published by Springer-Verlag New York, Inc. in 2001.

Softcover reprint of the hardcover 2nd edition 2001

All rights reserved. This work may not be translated or copied in whole or in part without the written permission of the publisher Springer Science+Business Media, LLC, except for brief excerpts in connection with reviews or scholarly analysis. Use in connection with any form of information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed is forbidden.

The use of general descriptive names, trade names, trademarks, etc., in this publication, even if the former are not especially identified, is not to be taken as a sign that such names, as understood by the Trade Marks and Merchandise Marks Act, may accordingly be used freely by anyone.

Production managed by Francine McNeill; manufacturing supervised by Jerome Basma. Photocomposed copy prepared from the author's LaTeX files.

9 8 7 6 5 4 3 2 1

SPIN 10751768


Affectionately dedicated to Wendy, Ewen, and Meegan, to my mother,

and to the memory of my father.


Preface to the Second Edition

In the period since the first edition was published, I have appreciated the correspondence from all parts of the world expressing thanks for the presentation of statistics from a user's perspective. It has been particularly pleasing to have been invited to contribute to course restructuring and development based on the approach to learning and applying statistics that underlies this book. In addition, I have taken account of suggestions and criticisms, and I hope that this new edition will address all major concerns.

The range of readily accessible statistical methods has greatly expanded over the past decade, particularly with the growing accessibility of comprehensive statistical computing packages. The approach adopted in this book has anticipated the changes by its emphasis on building understanding and skills in method selection and interpretation of findings. There has been a reduction in computational formulas to reflect the fact that basic statistical analyses are now almost universally undertaken on computers. This has allowed the inclusion of a more general coverage of unifying methodology, particularly generalized linear methodology, which permits users to more accurately match their requirements to statistical models and methods.

A major addition is a chapter on the commonly used multivariate methods. The aim of this chapter is to provide a practical introduction to selected methods that aids in selection, application, and interpretation of results. This chapter was requested by readers who find many of the specialist books in this area rather daunting or who do not wish to spend time studying the methods in depth.

The opening chapter has been revised to reflect the ever-broadening role of statistics in our society. Increasingly, my statistical colleagues and I are asked to provide our consulting expertise in areas that previously did not make use of statistical tools. I highlight the growing role for statistics in a world where accountability is the key, noting how statistics can play a pivotal role for researchers seeking grants, and for research and development departments seeking funds or trying to ensure that management understands the importance of their findings and their contributions.

This edition maintains the aim of making statistics accessible to readers who lack calculus or matrix algebra. However, for mathematically advanced readers, the later sections of Chapter 5 contain the basic elements of probability theory that underlie the statistical development of the book.

As in the first edition, real-life examples from a broad range of areas of application are the lifeblood of this book. They continue to reflect the basic premise underlying the book, namely, the need for statistics to be seen as an integral component of investigation that examines questions and provides answers that can be expressed in plain English rather than in mathematical jargon.

My thanks go to the readers and colleagues who have provided valuable comment on the book, and to Springer-Verlag New York, Inc., and their staff for the exceptionally high quality of publishing support they have provided. But all of this would be to no avail without the deep and continuing support and understanding of my wife, Wendy, who had to suffer the (figurative) loss of a husband for periods during book writing and editing. To Wendy, I express a deep debt of gratitude.

Hobart, Tasmania, Australia Glen McPherson

Preface to the First Edition

In this book, I have taken on the challenge of providing an insight into statistics and a blueprint for statistical application for a wide audience. For students in the sciences and related professional areas and for researchers who may need to apply statistics in the course of scientific experimentation, the development emphasizes the manner in which statistics fits into the framework of the scientific method. Mathematics students will find a unified, but nonmathematical, structure for statistics that can provide the motivation for the theoretical development found in standard texts on theoretical statistics. For statisticians and students of statistics, the ideas contained in the book and their manner of development may aid in the development of better communication between scientists and statisticians.

The demands made of readers are twofold:

• a minimal mathematical prerequisite, which is simply an ability to comprehend formulas containing mathematical variables, such as those derived from a high-school course in algebra or the equivalent; and

• a grasp of the process of scientific modeling, which comes with either experience in scientific experimentation or practice with solving mathematical problems.

The base on which this book is developed differs from that found in the myriad of standard introductory books on "applied statistics." The common approach takes what might be termed the statistician's view of statistics. The reader is presented with the tools of statistics and applications of those tools based on a methodological taxonomy. There is a danger with this approach that the essential structure and function of statistics will not be grasped, leading to misuse of statistics. In books divided on methodological grounds, the opportunity for comparing alternative methodologies is diminished, lessening the guidance in method selection.

The opening chapters of this book provide a user's view of statistics in the form of a unified and cohesive framework for statistics on which to hang the theory and applications that follow. The role of probability, examples of statistical models, and the general theoretical constructs are fitted within the basic structure. In considering applications of statistics, grouping is based on experimental questions rather than on methodological considerations.

The material in this book is a portion of a comprehensive structure of statistics that I have constructed during my twenty years of academic life as a teacher and consultant. The remaining material is to be published in two more books that are in preparation. The structure has been guided by two basic principles:

• a need for a simple but comprehensive view of statistics; and

• the requirement that any description be understandable by readers in their own language.

The comprehensive aspect is difficult to satisfy, because there are a number of schools of statistics. The contents of this book naturally reflect my leanings, but I believe that a blend of different approaches is optimal and have attempted to develop a view of statistics that presents the several approaches as variations on one theme rather than as antagonists to that theme.

Since my aim is to explain statistics in the language of the experimenter, I make many references to real-life experiments. With few exceptions, these are based on data from my own consulting work, although in some cases I have used a little "statistical" license because the experimenter did not wish to have the material published, because I was critical of the experimental practice or the statistical analysis employed, or because a modification of question or data suited my needs.

I have been forced to simplify or vary the scientific explanation of many of the examples in order that the statistical point not be submerged in the description. I apologize in advance for the scientific incompleteness of the descriptions and urge that the conclusions and interpretations be considered only in the context of examples of statistical applications. The use of real-life examples is necessary to establish the importance of experience in any application of statistics and to emphasize that real-life applications of statistics involve more than a routine process of putting numbers into formulas and mechanically drawing conclusions.

The decision of which topics to include and the depth to which chosen topics should be covered has been one of the most difficult aspects in the preparation of the book. I am conscious of the fact that others would have chosen differently, and I would appreciate comments from readers regarding areas that they believe should be included, expanded, or perhaps excluded. While I have made every endeavor to make the book error free, I am only too aware that in a book of this size and breadth there will be typing and possibly statistical errors. While I acknowledge the assistance of others in the preparation of the book, I take sole responsibility for these errors.

Computational Considerations

Computers are now an integral part of statistical analysis. The presentation of methods in this book is based on the assumption that readers have access to at least one general-purpose statistical package. In general, methods have been restricted to those that are widely available. Only where the simplicity of methods makes the use of electronic calculators feasible, or where computational facilities may not be readily available, are computational details described.

For the various methods and examples in the book, the relevant sets of commands for programming in MINITAB and SAS are available on a set of disks together with the data from all tables, examples, and problems in the book. These may be obtained from the author at the address provided on the copyright page.

Exercises

A selection of exercises is provided to accompany most chapters. The exercises are primarily concerned with the application of statistics in scientific experimentation. There are a limited number of theoretical exercises and some simulation exercises. Many of the exercises require the use of a statistical computing package. Solutions to exercises including the relevant programs written in MINITAB and SAS are available from the author at the address given on the copyright page.

Acknowledgments

Over the many years in which the structure underlying this book was being developed, many people offered valuable advice and engaged in useful exchanges concerning the nature of statistics and the role of statistics in scientific experimentation. I express my gratitude to all of those people. One person who stands alone in having freely offered his extensive wisdom and knowledge throughout the project is my longtime colleague and friend David Ratkowsky. Our many thought-provoking discussions (and arguments) have greatly enriched the content of the book. Jane Watson, who worked with me in teaching courses based on the book, offered many valuable suggestions. Scientists within the University of Tasmania and elsewhere generously allowed me to use material from consulting projects. Many variations and refinements come from my students and scientific colleagues who used earlier versions of the book. The Institute of Actuaries kindly permitted the use of the data in Table 1.2.2 from an article by R.D. Clarke.

Typing of earlier versions of the book was undertaken by several secretaries in the mathematics department at the University of Tasmania, most notably Lorraine Hickey. However, my deepest gratitude goes to the developers of the scientific word-processing package T3, who provided me with a simple means of preparing the final draft myself. Meegan assisted with the production of the figures in the book, Ewen assisted with data checking, and Wendy spent many hours proofreading and checking the final draft.

Hobart, Tasmania, Australia Glen McPherson

Supplementary Material Available on a Web Site

To support the practical value of the book as a guide in choosing and applying statistical methods and in the interpretation of results, the supplementary material described below is available from the following Web site:

http://www.springer-ny.com/detail.tpl?ISBN=0387951105

The supplementary files are referenced in the text at the end of descriptions of the examples to which they apply.

Reports. For examples that require statistical analysis of the data, there is a corresponding Word document that contains the following information: background information and aims; design descriptions, where relevant; full details of statistical analyses; and interpretation of findings. The documents also include additional explanation and practical comments, plus reference to possible alternative approaches to an analysis, if relevant.

An added function of the reports is to illustrate the integration of two or more statistical methods in the analysis of a single data set.

Excel data sets. Tables of data provided in the book, plus additional data sets for which analyses are provided in the book, are available in Excel data sets.

SAS data sets and programs. For statistical analyses that I have undertaken using the SAS statistical package, SAS data sets are provided. An extensive set of macros is also included. These provide comprehensive programs for model and data-checking plus analyses for the examples in the book. These macros can easily be applied to readers' data.

SPSS data sets and programs. For all statistical analyses that I have undertaken using the SPSS statistical package, SPSS data sets are provided. Annotated SPSS programs are provided that apply model and data-checking procedures and analyses for the examples in the book.

Naming Convention and Storage Format

At the end of each example description in the book, the report name and names of files holding data and statistical computing programs are provided. For instance, the end of the description of Example 20.1 is

(Report: Greenhouse effect.doc; data and packages: e21x1).

This identifies the Word document containing the report as "Greenhouse effect," the Excel file containing the data as e21x1.xls, the program to analyze the data using SAS as e21x1.sas with the SAS data set in e21x1.sd2, the program to analyze the data using SPSS as e21x1.sps, and the SPSS data file as e21x1.sav.

All reports are contained in the zip file Reports.zip.

Excel file names are generally defined by the example numbers in which the data are referenced. There are some exceptions that are identified in the example descriptions in which they apply. Data presented in tables that are not part of examples are referenced by the table number, for example, t1x3x1.xls is the Excel file containing the data for Table 1.3.1, or by problem number if they are data provided for problems, for example, p18x4.xls is the Excel file containing the data for Problem 18.4. All Excel data sets are contained in the zip file Excel.zip.

SAS and SPSS data sets and programs are generally defined by the example numbers in which the data are referenced. There are some exceptions that are identified in the example descriptions in which they apply. Standard extensions are used, for example, sav and sps are used for SPSS data sets and programs, respectively. SAS data sets and programs are contained in the zip file SAS.zip, and SPSS data sets and programs are contained in the zip file SPSS.zip.
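The naming convention above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names `base_name` and `file_name` and the two lookup tables are hypothetical and are not part of the supplementary material, and the sketch covers only the regular pattern (a one-letter prefix followed by the number with each period replaced by "x"), not the exceptions noted in individual example descriptions. The text mentions only Excel files for tables and problems, so other extensions for those kinds are an assumption.

```python
# Hypothetical helpers illustrating the regular file-naming pattern.
# The prefixes and extensions are taken from the description above;
# the function and dictionary names are assumptions for illustration.

PREFIXES = {"example": "e", "table": "t", "problem": "p"}

EXTENSIONS = {
    "excel": "xls",         # Excel data file
    "sas_program": "sas",   # SAS program
    "sas_data": "sd2",      # SAS data set
    "spss_program": "sps",  # SPSS program
    "spss_data": "sav",     # SPSS data file
}


def base_name(kind: str, number: str) -> str:
    """Return the base name, e.g. ('table', '1.3.1') gives 't1x3x1'."""
    return PREFIXES[kind] + number.replace(".", "x")


def file_name(kind: str, number: str, file_type: str) -> str:
    """Return the full file name for the given kind, number, and file type."""
    return f"{base_name(kind, number)}.{EXTENSIONS[file_type]}"


if __name__ == "__main__":
    print(file_name("table", "1.3.1", "excel"))   # Table 1.3.1
    print(file_name("problem", "18.4", "excel"))  # Problem 18.4
```

For Table 1.3.1 and Problem 18.4 the sketch reproduces the Excel file names quoted in the text. Any name it produces should still be checked against the example description, since some files do not follow the pattern.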


Plan of the Book

The development of the book traces the manner in which students and investigators might naturally wish to seek knowledge about statistics according to the following scheme:

The Role of Statistics

The opening chapter highlights the areas in which statistics can play a vital role. Additionally, it introduces and illustrates both the descriptive and analytical roles of statistics.

The Analytical Role of Statistics

While the involvement of statistics in investigations and operational activities is varied both with respect to areas of application and forms of statistical analysis employed, there is a general structure that provides a unified view of statistics as a means of comparing data with a statistical model. Chapters 2, 3, and 4 describe this structure.

Probability and Statistics

It becomes apparent in developing a means of measuring the extent of agreement between the data and the statistical model that probabilistic ideas are involved. Chapter 5 provides a link between the role of probability in statistical analysis and the formal basis and rules of probability as a discipline in its own right.

Statistical Models

There prove to be a small number of statistical models that cover a vast majority of scientific applications of statistics. These models are introduced in Chapter 6, and the basic information required in the comparison of these models with data is introduced in Chapter 7.

Mathematics and a Theory of Statistics

While this book is developed from the perspective of the user of statistics, it is important even for nonmathematical users to appreciate the need for a mathematical structure for statistics. Chapter 8 provides a description of the basic approaches and tools used in statistical analysis. There is a consideration of the task faced by statisticians in defining the best statistical method for a given situation.

Examples of Statistical Applications

Chapter 9 is the first of a number of chapters that provide an illustration of statistical analysis. The primary purpose of such chapters is to establish a simple, but generally applicable, approach to the employment of statistical methods. Such examples are designed to give the reader the insight to select and apply statistical methodology.

Model and Data Checking

A statistical model generally contains assumptions that are distinct from the question of interest to the experimenter. These assumptions are statistical expressions of the structure of the experimental setup or population that is under investigation. If there are errors in these assumptions or in the data that provide the factual information, then erroneous conclusions may arise from the analysis. Chapter 10 contains an explanation of the nature of the statistical assumptions that commonly appear in statistical models and presents methods for detecting errors in both these assumptions and in the data employed in the statistical analysis.

Defining Experimental Aims

A valuable role of statistics is to be found in its requirement for a precise statement of experimental aims. Chapters 11 and 12 examine this aspect using two areas of application. In Chapter 11, interest is in the average value of a quantity under study. In Chapter 12, the problem is to decide whether one group is different from another or one treatment is better than another.

Types of Investigation

In Chapter 13, a distinction is made between comparative studies, in which the investigator merely acts as an observer, and designed experiments, in which the experimenter plays a more active role. In the latter type of study, the experimenter assigns subjects or experimental units to treatments for the purpose of comparing the effects of the treatments. A special type of observational study known as a survey is also introduced.

The special methods employed in designed experiments for the selection of sample members and their allocation between treatments are introduced.

Comparisons Involving More Than Two Groups or Treatments

The methods introduced in Chapter 12 for the comparison of two groups or treatments are extended in Chapters 14, 15, and 16 to cover situations in which there are more than two groups or treatments. Chapter 15 presents an important class of experimental designs and associated statistical methods that permit the separation of treatment effects from the effects of factors that are not of interest to the experimenter.

Relationships

In cases where more than one variable is recorded on each object, the relations between the variables may be studied or used to answer questions raised by the investigator. Chapter 17 introduces models and methods for this purpose and discusses practical issues in their implementation. Where a relationship exists, it may be exploited to predict values in a response variable from values of explanatory variables. The means by which this is done are described in Chapter 18.

Variability

There are circumstances in which the measurement of variation or, equivalently, measurement of precision is of interest. Chapter 19 is concerned with the definition and measurement of these quantities.

Cause and Effect

One of the common areas of interest in scientific investigation is the exploration of a cause-effect structure for a collection of variables. The extent to which statistics may be used in this process is the topic of Chapter 20.

Time-Related Changes

The nature of statistical models in studies of changes in response over time and how such models are employed is the topic of Chapter 21.

Using the Book

The book is designed to act primarily as a textbook or a book to learn about statistics, and subsequently as a reference book.

As a textbook, the core material is contained in Chapters 1 to 11, and the minimal introduction is provided by studying the following topics:

1. The roles of statistics

Sections 1.1 and 1.2. (Section 1.3 on descriptive statistics also should be studied by students with no previous background in statistics.)

2. The structure and function of statistics

Section 1.4 and Chapters 2, 3, and 4.

3. The role of probability in statistics

Sections 5.1 and 5.2 provide an informal introduction. More mathematically inclined students should also work through Sections 5.3 and 5.4. Note that these two sections assume a knowledge of calculus.

4. Statistical models

Sections 6.1 and 6.5 introduce two important statistical models, and Sections 7.1 and 7.3 provide basic information for the application of these models.

5. Theoretical constructs

Chapter 8 should be studied in its entirety by mathematically inclined readers. Those with limited mathematical background should read Sections 8.1, 8.2.3, 8.2.5, 8.3.1, 8.3.3, and 8.3.5.

6. A simple application of statistics: Proportions and probabilities

Section 9.1 and Sections 9.2.1 to 9.2.4.

7. Model and data checking

Chapter 10.

8. A simple application of statistics: Averages

Section 11.1 introduces an experimental problem and considers its translation into a statistical problem. Section 11.2.1 discusses the choice of statistical methodology. Section 11.2.2 read in conjunction with Sections 7.7.2 and 7.7.3 provides a method for studying medians based on a distribution-free model. Section 11.2.3 provides a method for studying means based on a distribution-free model. Section 11.2.4 read in conjunction with Section 7.4 provides a method for studying means based on the assumption of a Normal distribution.

Beyond this basic introduction, the course may be extended with a consideration of the following topics:

1. Comparing two groups or treatments

Chapter 12.

2. Design of experiments

Chapter 13.

3. Comparing three or more groups or treatments: An informal introduction

Sections 14.1.1 to 14.1.4.

4. Linear and multiplicative models and methodology

Chapter 14.

5. Generalized linear models

Chapter 14.

6. Experimental and treatment designs and their applications

Chapter 15.

7. Comparing frequency tables

Chapter 16.

8. Multivariate statistical methods

Chapter 17.

9. Use of supplementary information: Regression analysis

Chapter 18.

10. Variability and variance components

Chapter 19.

11. Cause and effect

Chapter 20.

12. Studies involving time or time-related variables

Chapter 21.

Contents

Preface to the Second Edition
Preface to the First Edition
Supplementary Material Available on a Web Site
Plan of the Book
Using the Book
Commonly Used Symbols

1 The Importance of Statistics in an Information-Based World
1.1 The Expanding Role of Statistics
1.2 The Place of Statistics in Investigation and Operational Activities
1.3 Descriptive Statistics
1.4 Analytical Statistics
Problems

2 Data: The Factual Information
2.1 Data Collection
2.2 Types of Data
Problems

3 Statistical Models: The Experimenter's View
3.1 Components of a Model
3.2 The Investigator's Aims and Statistical Hypotheses
3.3 Distributional Assumptions

3.4 Design Structure
Problems

4 Comparing Model and Data
4.1 Intuitive Ideas
4.2 The Role of "Statistics"
4.3 Measuring Agreement Between Model and Data
Problems

5 Probability: A Fundamental Tool of Statistics
5.1 Probability and Statistics
5.2 Sampling Distributions
5.3 Probability: Definitions and Rules
5.4 Random Variables
Appendix. Combinatorial Formulas
Problems

6 Some Widely Used Statistical Models
6.1 The Binomial Model
6.2 The Two-State Population Model
6.3 A Model for Occurrences of Events
6.4 The Multinomial Model
6.5 The Normal Distribution Model
6.6 The Logistic Model
Problems

7 Some Important Statistics and Their Sampling Distributions
7.1 The Binomial Distribution
7.2 The Poisson Distribution
7.3 The Normal Distribution
7.4 The t-Distribution
7.5 The Chi-Squared Distribution
7.6 The F-Distribution
7.7 Statistics Based on Signs and Ranks
7.8 Statistics Based on Permutations and Simulation
Problems

8 Statistical Analysis: The Statistician's View
8.1 The Range in Models and Approaches
8.2 Hypothesis Testing
8.3 Estimation
8.4 Likelihood
8.5 The Bayesian Approach
8.6 Choosing the Best Method
Problems

9 Examining Proportions and Success Rates
9.1 Experimental Aims
9.2 Statistical Analysis
Problems

10 Model and Data Checking
10.1 Sources of Errors in Models and Data
10.2 Detecting Errors in Statistical Models
10.3 Analyzing Residuals
10.4 Checking Data
10.5 Data Modification or Method Modification?
Problems

11 Questions About the "Average" Value
11.1 Experimental Considerations
11.2 Choosing and Applying Statistical Methods
Problems

12 Comparing Two Groups, Treatments, or Processes
12.1 Forms of Comparison
12.2 Comparisons Based on Success Rates and Proportions
12.3 Comparisons of Means
12.4 Comparisons of Medians
12.5 A Consideration of Rare Events
Problems

13 Comparative Studies, Surveys, and Designed Experiments
13.1 Types of Investigations
13.2 Experimental and Treatment Designs
13.3 Paired Comparisons
13.4 Surveys
13.5 Determining Sample Sizes
Appendix 13.1. Steps in Randomly Selecting a Sample from a Population
Appendix 13.2. Steps in Random Allocation of Units Between Treatments
Problems

14 Comparing More Than Two Treatments or Groups
14.1 Approaches to Analysis
14.2 Statistical Models
14.3 Statistical Methods
14.4 Selecting a Method
14.5 Comparisons Based on Medians
Problems

15 Comparing Mean Response When There Are Three or More Treatments ... 359

15.1 Experimental and Statistical Aims ... 359

15.2 Defining and Choosing Designs ... 360

15.3 Model and Data Checking ... 380

15.4 Analysis of Variance and One-Stratum Designs ... 388

15.5 Designs with Experimental Design Structure ... 392

15.6 Factorial Arrangement of Treatments ... 407

15.7 More Detailed Comparison of Treatment Differences ... 414

15.8 Analyzing Counts of Events Using a Generalized Linear Model Approach ... 418

Problems ... 421

16 Comparing Patterns of Response: Frequency Tables ... 424

16.1 The Scope of Applications ... 424

16.2 Statistical Models ... 430

16.3 Statistical Methods ... 434

Problems ... 447

17 Studying Relations Between Variables ... 449

17.1 Univariate Versus Multivariate Applications ... 449

17.2 The Scope of Multivariate Applications ... 449

17.3 The Building Blocks of Multivariate Methods ... 450

17.4 Seeking Evidence of a Relationship Between Two Variables ... 463

17.5 Studying Relations Among Three or More Scaled Variables ... 471

17.6 Relations Between Three or More Categorical Variables ... 484

17.7 Comparing Patterns in Groups or Treatments ... 487

17.8 Studying Relations Between Individual Units or Subjects ... 496

17.9 Classification: Assigning Objects to Predefined Groups ... 502

Problems ... 508

18 Prediction and Estimation: The Role of Explanatory Variables ... 512

18.1 Regression Analysis ... 512

18.2 Statistical Models ... 519

18.3 Statistical Methods ... 523

18.4 Practical Considerations ... 530

18.5 Statistical Analysis ... 533

18.6 Applications of Regression Analysis ... 538

18.7 Experimental Design and Regression Analysis ... 551

18.8 Generalized Linear Models and Their Analysis ... 556

Problems ... 561

19 Questions About Variability ... 566

19.1 Variability: Its Measurement and Application ... 566

19.2 Variance Components ... 573

Problems ... 583

20 Cause and Effect: Statistical Perspectives ... 586

20.1 The Allocation of Causality: Scientific Aims and Statistical Approaches ... 586

20.2 Statistical Methods in Use ... 589

21 Studying Changes in Response over Time ... 593

21.1 Applications ... 593

21.2 Time Series ... 596

21.3 Time Series: Statistical Models and Methods ... 603

21.4 Statistical Quality Control ... 609

A Tables for Some Common Probability Distributions ... 613

A.1 The Normal Distribution ... 613

A.2 The t-Distribution ... 618

A.3 The Chi-Squared Distribution ... 620

References ... 625

Index ... 629

Commonly Used Symbols

Mathematical Symbols

exp The exponential function, e.g., exp(y).

ln or log The natural logarithm function, e.g., ln(y), log(y).

| | Two bars enclosing a symbol, a number, or an expression indicate that the number or the value of the symbol should be treated as a positive value, e.g., |+2| and |-2| are both equal to 2.

≈ Approximately equal to.

n! Read "n factorial." Defined in the appendix to Chapter 5.

$\binom{n}{x}$ Read "n choose x." Defined in the appendix to Chapter 5.

min(x, y) The smaller of the values contained in parentheses.

Statistical Variables

x, y, etc. Lowercase symbols in italic are generally used to represent response variables.

$y_i$, $y_1$, etc. A subscript is employed where it is necessary to distinguish among variables.

$y_{ij}$ Multiple subscripts are introduced where there is a need to identify variables associated with different groups (Section 10.3.1).

$y_0$, $x_0$, etc. The subscript 0 is an abbreviation for the word observed and indicates the numerical value observed in the data for the response variable that it subscripts.

$\bar{y}$ The statistic denoting the arithmetic mean of the response variable y in the sample (Section 1.3.6).

$\tilde{y}$ The statistic denoting the median of the response variable y in the sample (Section 1.3.5).

$\hat{y}$ A hat appearing over a variable indicates an estimator of the variable it covers.

Parameters

A, α, etc. Uppercase letters or Greek alphabetic characters in italic denote parameters of probability distributions or component equations.

$\hat{A}$, $\hat{B}$, etc. A hat appearing over a parameter indicates an estimator (i.e., statistic) of the parameter it covers.

A, a A lowercase italic letter when used in conjunction with its uppercase form denotes an estimate of the value of the parameter.

Sampling Distributions and Probability Distributions

$\pi(\ )$ The probability distribution of a random variable (Section 5.4.1).

$\pi_s(\ )$ The joint probability distribution of random variables (Section 5.4.4).

Special Distributions

$N(\mu, \sigma^2)$ Normal distribution with mean $\mu$ and variance $\sigma^2$ (Sections 6.5.4 and 7.3).

$N(0, 1)$ Standard Normal distribution (Section 7.3.2).

$z$ Widely used to denote a variable with a Standard Normal distribution (Section 7.3.2).

$z_\alpha$ Value of a Standard Normal variable that satisfies the probability statement $\Pr(z > z_\alpha) = \alpha$ (Section 7.3.2).

$t(\nu)$ t-statistic with $\nu$ degrees of freedom (Section 7.4).

$t_\alpha(\nu)$ Value of a $t(\nu)$-statistic that satisfies the probability statement $\Pr\{|t(\nu)| \ge |t_\alpha(\nu)|\} = \alpha$ (Section 7.4.2).

$\chi^2(\nu)$ Chi-squared statistic with $\nu$ degrees of freedom (Section 7.5.4).

$\chi^2_\alpha(\nu)$ Value of a $\chi^2(\nu)$-statistic that satisfies the probability statement $\Pr\{\chi^2(\nu) \ge \chi^2_\alpha(\nu)\} = \alpha$ (Section 7.5.2).

$F(\nu_1, \nu_2)$ F-statistic with $\nu_1$ and $\nu_2$ degrees of freedom (Section 7.6.2).

$F_\alpha(\nu_1, \nu_2)$ Value of an $F(\nu_1, \nu_2)$-statistic that satisfies the probability statement $\Pr\{F(\nu_1, \nu_2) \ge F_\alpha(\nu_1, \nu_2)\} = \alpha$ (Section 7.6.2).
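As an illustration of the tail-probability definitions above, the following minimal sketch computes the Standard Normal critical value defined by Pr(z > z_alpha) = alpha. It is not taken from the book: the function names z_alpha and upper_tail are hypothetical, and it assumes only Python's standard-library statistics.NormalDist class.

```python
from statistics import NormalDist

# Standard Normal distribution N(0, 1).
STANDARD_NORMAL = NormalDist(mu=0.0, sigma=1.0)


def z_alpha(alpha: float) -> float:
    """Return the value z_alpha satisfying Pr(z > z_alpha) = alpha.

    Hypothetical helper for illustration only; the upper-tail
    probability alpha must lie strictly between 0 and 1.
    """
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    # Pr(z > c) = alpha is equivalent to Pr(z <= c) = 1 - alpha.
    return STANDARD_NORMAL.inv_cdf(1.0 - alpha)


def upper_tail(value: float) -> float:
    """Return Pr(z > value) for a Standard Normal variable z."""
    return 1.0 - STANDARD_NORMAL.cdf(value)


if __name__ == "__main__":
    for alpha in (0.10, 0.05, 0.025):
        c = z_alpha(alpha)
        print(f"alpha = {alpha}: z_alpha = {c:.4f}, Pr(z > z_alpha) = {upper_tail(c):.4f}")
```

The critical values for the t, chi-squared, and F statistics follow the same pattern, with the appropriate inverse distribution function in place of the Normal one. The standard library does not provide those distributions, so they are left out of this sketch.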
