TABLE OF CONTENTS
CHAPTER TITLE PAGE
COVER i
DECLARATION ii
DEDICATION iii
ACKNOWLEDGEMENTS iv
ABSTRACT v
ABSTRAK vi
TABLE OF CONTENTS vii
LIST OF TABLES xi
LIST OF FIGURES xii
LIST OF SYMBOLS xiv
LIST OF APPENDICES xv
1 INTRODUCTION 1
1.1 Count Data 1
1.2 Statement of the Problem 3
1.3 Objectives of the Study 4
1.4 Scope of the Study 4
1.5 Significance of the Study 4
1.6 Outline of the Study 5
1.7 Analysis Flow Chart 7
2 LITERATURE REVIEW 8
2.1.1 Random component 8
2.1.2 Systematic component 9
2.1.3 Link 9
2.2 Principles of Statistical Modelling 10
2.2.1 Exploratory Data Analysis 10
2.2.2 Model Fitting 10
2.3 Poisson Distribution 12
2.4 Poisson Regression 14
2.5 Problems in Poisson Regression 16
2.5.1 Truncation and Censoring 16
2.5.2 Excess Zero 17
2.5.3 Overdispersion 19
2.6 Alternative Count Models 20
2.7 Negative Binomial Regression 21
3 POISSON REGRESSION ANALYSIS 23
3.1 The Model 23
3.2 Estimation of Parameters Using Maximum Likelihood Estimation (MLE) 24
3.3 Standard Errors for Regression Coefficients 31
3.4 Interpretation of Coefficients 31
3.5 Elasticity 34
3.6 Model Checking Using Pearson Chi-Squares and Deviance 35
3.6.1 Pearson Chi-Squares 36
3.6.2 Deviance 36
3.7 Model Residuals 37
3.8 Inference 38
3.8.1 Test of Significance 38
3.8.2 Confidence Intervals 39
3.9 Handling Overdispersion 39
3.9.1 Quasi-Likelihood Method 39
3.9.1.1 Estimating the Overdispersion Parameter 43
3.9.1.2 Testing for Overdispersion 44
3.9.2 Negative Binomial Regression Analysis 45
3.10 Example 46
4 ANALYSIS OF POISSON REGRESSION USING SAS 58
4.1 Introduction 58
4.2 Nursing Home Data 58
4.3 Choosing the Right Model 65
4.4 Results and Discussion 67
4.5 Negative Binomial Regression 76
4.6 Results and Discussion 77
5 SIMULATION STUDY 85
5.1 Data Simulation 85
5.2 Analysis of Data with No Overdispersion 86
5.2.1 Results and Discussion 93
5.2.1.1 Goodness-of-fit 93
5.2.1.2 Significance, Confidence Intervals, and Standard Errors 94
5.3 Analysis of Data with Overdispersion 94
5.3.1 Results and Discussion 101
5.3.1.1 Goodness-of-fit 101
5.3.1.2 Significance, Confidence Intervals, and Standard Errors 102
5.5 The Simulation Codes 103
5.5.1 Codes for the simulation of non-overdispersed data 104
5.5.2 Codes for the simulation of overdispersed data 105
6 SUMMARY AND CONCLUSIONS 106
6.1 Summary 106
6.2 Conclusions 107
6.3 Recommendations 107
REFERENCES 108
LIST OF TABLES
TABLE NO. TITLE PAGE
3.1 Elephant’s mating success regarding age 47
3.2 Iterative reweighted least squares results 49
3.3 Residuals for elephant’s mating success data 53
3.4 Adjusted standard errors 57
4.1 Nursing home data 59
4.2 Log likelihood and deviance for model M1, M2, and M3 65
4.3 Elasticities of the explanatory variables in nursing home data for model M3 69
4.4 Pearson residuals and adjusted residuals for nursing home data 70
4.5 Comparison among standard errors for ordinary Poisson regression and corrected Poisson regression 75
4.6 Elasticities of the explanatory variables in nursing home data for negative binomial regression model 80
4.7 Residuals for negative binomial regression 81
5.1 Pearson chi-square and deviance for Poisson regression and negative binomial regression obtained from data that has no overdispersion 93
5.2 Pearson chi-square and deviance for Poisson regression and negative binomial regression obtained from overdispersed data
LIST OF FIGURES
FIGURE NO. TITLE PAGE
2.1 Steps in model fitting 11
3.1 Steps in quasi-likelihood approach 40
3.2 SAS result for analysis of elephant’s mating success data 50
4.1 SAS output for model (M1) 62
4.2 SAS output for model (M2) 63
4.3 SAS output for model (M3) 64
4.4 SAS output for corrected Poisson regression 74
4.5 SAS output for negative binomial regression 79
5.1 Poisson regression SAS output of non-overdispersed data for µ = 10 87
5.2 Poisson regression SAS output of non-overdispersed data for µ = 20 88
5.3 Poisson regression SAS output of non-overdispersed data for µ = 50 89
5.4 Negative binomial regression SAS output of non-overdispersed data for µ = 10 90
5.5 Negative binomial regression SAS output of non-overdispersed data for µ = 20 91
5.6 Negative binomial regression SAS output of non-overdispersed data for µ = 50 92
5.7 Poisson regression SAS output of overdispersed data for µ = 10 95
5.8 Poisson regression SAS output of overdispersed data for µ = 20 96
5.9 Poisson regression SAS output of overdispersed data for µ = 50 97
5.10 Negative binomial regression SAS output of overdispersed data for µ = 10
5.11 Negative binomial regression SAS output of overdispersed data for µ = 20 99
5.12 Negative binomial regression SAS output of overdispersed data for µ = 50
LIST OF SYMBOLS
Y Response variable
X Predictor variable
β Regression coefficient
η Link function
I Information matrix
U Score function
W Weight matrix
E Elasticity
X² Pearson chi-squares statistic
D Deviance statistic
R Pearson residual
Z Wald statistic
LIST OF APPENDICES
APPENDIX TITLE PAGE
A SAS Codes for Elephant’s Mating Success Data 114
B The Values of µ̂_i for Elephant’s Mating Success Data 115
C SAS Codes for Nursing Home Data 117
D SAS Output of Residual Analysis for Poisson Regression in Nursing Home Data 119
E The Values of µ̂_i for Nursing Home Data 124
F SAS Output of Residual Analysis for Negative Binomial Regression in Nursing Home Data 126