Recommendations and Future Directions

Chapter 9 Concluding Remarks

9.3 Recommendations and Future Directions

As with any research endeavour, the end of this research has opened up new questions and new research problems to be answered. Given the importance of small area estimates of poverty measures in aid allocation and in monitoring progress towards the Millennium Development Goals, it is very important that the method used for generating such estimates is mathematically and statistically sound and in agreement with accepted standards for small area estimation. The investigation conducted here is only part of the more extensive examination needed if the ELL method is to justifiably maintain its role as the official method for generating small area estimates of poverty measures in Third World countries. For example, one further area of investigation is a comparison of the different survey fitting procedures under a linear mixed model for household level income or expenditure that accounts for area level effects in addition to cluster and household level effects.

Under the ELL method, the formulation of the linear mixed model for household level income or expenditure is based on the assumption that small areas are independent; however, this assumption is not necessarily true for different small areas (e.g., Philippine municipalities). Correlation between adjacent small areas needs further examination. If there is sufficient evidence that the poverty situation in adjacent small areas (e.g., municipalities) tends to be more similar than in those far apart, or if significant correlation can be established, then linear mixed models with area level random effects may not be sufficient and random effects at an even higher level may be warranted. Other models that can account for the spatial correlation could be considered, e.g. conditional autoregressive (CAR) models, although care is needed with these models, since fitting extra correlated area based random effects can complicate model diagnostics and hide problems with the underlying models.
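One common way to write the spatial structure mentioned above is the proper CAR specification for area level effects. The block below is a sketch of that standard form, not a model fitted in this thesis; the symbols $u_a$, $\rho$, $\sigma_u^2$, $m_a$ and the adjacency relation $a' \sim a$ are notation assumed here for illustration.

```latex
u_a \mid u_{a'},\ a' \neq a \;\sim\;
N\!\left( \frac{\rho}{m_a} \sum_{a' \sim a} u_{a'},\; \frac{\sigma_u^2}{m_a} \right)
```

Here $u_a$ is the random effect of small area $a$, $m_a$ is the number of small areas adjacent to it, and $\rho$ controls the strength of spatial dependence. Setting $\rho = 0$ corresponds to spatially independent area level effects, while values of $\rho$ close to 1 pull each area effect towards the average of its neighbours.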

In this thesis, the ESPREE method has been shown to be better than the ELL-based updating method, both theoretically and in application to real data. However, there are still several avenues for improvement. The current model used for ESPREE could be further improved by including area level effects to account for the correlation between households within small areas (i.e., fitting a GLMM). Another way of improving the model would be to incorporate spatial variation in a similar manner to that mentioned above for the ELL household level income or expenditure model, i.e. fitting a GLM with a CAR model to account for the spatial correlation between small areas. Geographically weighted regression (GWR), a recently proposed model that accounts for spatial heterogeneity, could also be considered. In addition, research should be conducted on effective ways to combine ESPREE based updated estimates with estimates from other data sources or estimation techniques, which could lead to improved updated small area estimates of poverty measures.

Further investigation is also needed on diagnostic and evaluation methods for assessing competing small area updating methods or small area models. For example, while it is already apparent that the ESPREE method has various advantages over the ELL-based updating methods, those arguments and that evidence could be investigated further by developing other statistical diagnostic procedures designed primarily for small area updating models for poverty measures in Third World countries. Some of the diagnostics employed by Brown et al. (2001) and more recently by Inglese et al. (2008) should be explored. Moreover, validation studies comparing key informants’ assessment of the poverty situation in their area with the updated estimates should also be conducted in more regions of the country, for a more extensive assessment of the acceptability of the estimates and their ability to reflect the real poverty situation on the ground.

In Chapter 6, various procedures for estimating the variance of the ESPREE based estimates were presented. The BRR method combined with the bootstrap method was used in the application of the ESPREE method to the Philippine data. These two methods were used in combination for simplicity of implementation, given that the sets of pseudo-census data were available in the form of bootstrap estimates from a previous poverty mapping project (Haslett and Jones, 2005), and that the sampling design and the available survey data conform to what is required for a replication method such as BRR. The different procedures still need to be compared to assess their performance in measuring the variance of the updated small area estimates of poverty measures in Third World countries.
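For reference, the basic BRR variance estimator has the standard form sketched below. The notation ($\hat{\theta}$, $\hat{\theta}^{(r)}$, $R$) is assumed here for illustration, and the version used in Chapter 6, including the way it is combined with the bootstrap pseudo-census estimates, may differ in detail.

```latex
\hat{v}_{\mathrm{BRR}}(\hat{\theta})
  = \frac{1}{R} \sum_{r=1}^{R} \left( \hat{\theta}^{(r)} - \hat{\theta} \right)^{2}
```

Here $\hat{\theta}$ is the estimate computed from the full sample, $\hat{\theta}^{(r)}$ is the same estimate recomputed from the $r$-th balanced half-sample replicate, and $R$ is the number of replicates.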

References

ACTIONAID-International (2006). Participatory poverty assessment, Attapeu, Lao PDR. A publication of the Mekong Wetlands Biodiversity Conservation and Sustainable Use Program.

ADB (2001). Participatory poverty assessment in Cambodia. http://www.adb.org/Documents/Books/Participatory Poverty/Participatory Poverty.pdf.

ADB (2006). Poverty definition, measurement, and analysis. http://www.adb.org/Statistics/Poverty/glossary.asp.

Agresti, A. (2002). Categorical Data Analysis. Wiley Series in Probability and Statistics, Hoboken, New Jersey.

Balisacan, A. M. and Edillon, R. G. (2003). Second poverty mapping and targeting for Phases III and IV of KALAHI-CIDSS: Final report. Unpublished report.

Balisacan, A. M., Edillon, R. G., and Ducanes, G. M. (2002). Poverty mapping and targeting for KALAHI-CIDSS: Final report. Unpublished report.

Battese, G. E., Harter, R. M., and Fuller, W. A. (1988). An error components model for prediction of county crop area using survey and satellite data. Journal of the American Statistical Association, 83:28–36.

Bishop, Y. M., Fienberg, S. E., and Holland, P. W. (1975). Discrete Multivariate Analysis. MIT Press, Cambridge, Massachusetts.

Blackorby, C. and Donaldson, D. (1980). Ethical indices for the measurement of poverty. Econometrica, 48:1053–1060.

Bogue, D. J. (1950). A technique for making extensive postcensal estimates. Journal of the American Statistical Association, 45(5):149–162.

Brooks, S. P. (1998). Markov chain Monte Carlo method and its application. The Statistician, 47:69–100.

Brown, G., Chambers, R., Heady, P., and Heasman, D. (2001). Evaluation of small area estimation methods - an application to unemployment estimates from the UK LFS. Proceedings of Statistics Canada Symposium.

Carlin, B. P. and Louis, T. A. (2008). Bayesian methods for data analysis. CRC Press, Boca Raton, 3rd edition.

Chakravarty, S. R. (1983). A new index of poverty. Mathematical Social Sciences, 6(3):307.

Chambers, R. (2006). What is poverty? Who asks? Who answers? Poverty in Focus, UNDP:December 2006, 3–4.

Cressie, N. (1992). REML estimation in empirical Bayes smoothing of census undercount. Survey Methodology, 18:75–94.

Efron, B. and Tibshirani, R. (1993). An Introduction to the Bootstrap. Chapman and Hall, New York.

Elbers, C., Lanjouw, J., and Lanjouw, P. (2002). Micro-level estimation of poverty and inequality. Research Working Paper 2911, World Bank, Development Research Group, Washington, D.C.

Elbers, C., Lanjouw, J., and Lanjouw, P. (2003). Micro-level estimation of poverty and inequality. Econometrica, 71:355–364.

FORBES (2008). Philippines 2007 overseas workers’ remittances at record 14.4 billion dollars. http://www.forbes.com/feeds/afx/2008/02/15/afx4659876.html.

Foster, J., Greer, J., and Thorbecke, E. (1984). A class of decomposable poverty measures. Econometrica, 52:761–766.

Foster, J. and Shorrocks, A. F. (1991). Subgroup consistent poverty indices. Econometrica, 59:687–709.

Frankel, M. R. (1971). Inference from survey samples. Int. Social Res., Univ. Michigan, Ann Arbor.

Fujii, T. (2003). Commune-level estimation of poverty measures and its application in Cambodia. Preprint.

Ghosh, M. and Rao, J. N. K. (1994). Small area estimation: an appraisal. Statistical Science, 9:55–93.

Godambe, V. (1991). Estimating Functions. Oxford University Press, Inc., New York.

Goldstein, H. (2003). Multilevel Statistical Models. 3rd edition.

Gonzalez, M. E. (1973). Use and evaluation of synthetic estimators. Proceedings of the Social Statistics Section, pages 33–36.

Gonzalez, M. E. and Hoza, C. (1978). Small area estimation with application to unemployment and housing estimates. Journal of the American Statistical Association, 73:7–15.

Grizzle, J. E., Starmer, C. F., and Koch, G. G. (1969). Analysis of categorical data by linear models. Biometrics, 25:489–504.

Haslett, S. (1990). Degrees of freedom and parameter estimability in hierarchical models for sparse complete contingency tables. Computational Statistics and Data Analysis, 9:179–195.

Haslett, S., Green, A., and Zingel, C. (1998). Small area estimation given regular updates of census auxiliary variables. Proceedings of the New Techniques and Technologies for Statistics Conference. Sorrento, Italy.

Haslett, S. and Jones, G. (2004). Local estimation of poverty and malnutrition in Bangladesh. Bangladesh Bureau of Statistics and United Nations World Food Programme.

Haslett, S. and Jones, G. (2005). Local estimation of poverty in the Philippines. Philippine National Statistics Co-ordination Board / World Bank Report, (http://siteresources.worldbank.org/INTPGI/Resources/342674-1092157888460/Local Estimation of Poverty Philippines.pdf).

Haslett, S. J., Isidro, M. C., and Jones, G. (2010). Comparison of survey regression techniques in the context of small area estimation of poverty. To be published in Statistics Canada publication Survey Methodology, Catalogue 12-001-XIE2010002, December 2010, vol. 36 no. 2.

Haslett, S. J. and Jones, G. (2006). Small area estimation of poverty, caloric intake and malnutrition in Nepal. Published: Nepal Central Bureau of Statistics / World Food Programme, United Nations / World Bank, September 2006, 184pp, ISBN 999337018-5.

Haslett, S. J., Noble, A., and Zabala, F. (2007). New approaches to small area estimation of unemployment. Project Report for Official Statistics Research Programme - Statistics New Zealand.

Henderson, C. R. (1953). Estimation of variance and variance components. Biometrics, 9:226–252.

Hoogeveen, J., Emwanu, T., and Okwi, P. (2003). Updating small area welfare indicators in the absence of a new census. Preprint.

Inglese, F., Russo, A., and Russo, M. (2008). Diagnostics of small area model-based estimators. http://www.sis-statistica.it/files/pdf/atti/rs08 spontanee a 2 5.pdf.

Isaki, C. and Fuller, W. (1982). Survey design under the regression superpopulation model. Journal of the American Statistical Association, 77:89–96.

Jiang, J. and Lahiri, P. (2006). Mixed model prediction and small area estimation. Test, 15(1):1–96.

Jitsuchon, S. and Lanjouw, P. (2005). Updating poverty maps: Emerging experience with available methods and data from Thailand. Preprint.

Judkins, D. R. (1990). Fay’s method for variance estimation. Journal of Official Statistics, 6(3):223–239.

Kakwani, N. (1980). On a class of poverty measures. Econometrica, 48(2):438–439.

Kanji, G. K. and Chopra, P. K. (2007). Poverty as a system: Human contestability approach to poverty measurement. Journal of Applied Statistics, 34(9):1135–1158.

Kauermann, G. and Carroll, R. J. (2001). A note on the efficiency of sandwich covariance matrix estimation. Journal of the American Statistical Association, 96(456):1387–1396.

Kott, P. (2001). The delete-a-group jackknife. Journal of Official Statistics, 17:521–526.

Kovar, J. G. (1985). Variance estimation of nonlinear statistics in stratified samples. Methodology Branch Working Paper, (87-004E). Statistics Canada.

Krewski, D. and Rao, J. N. K. (1981). Inference from stratified samples: Properties of the linearization, jackknife and balanced repeated replication methods. The Annals of Statistics, 9:1010–1019.

Lanjouw, P. and van der Weide, R. (2006). Determining changes in welfare distributions at the micro-level: Updating poverty maps. Powerpoint presentation at the NSCB Workshop for the NSCB/World Bank Intercensal Updating Project.

Larbi, G. A. (1999). The new public management approach and crisis states. UNRISD Discussion paper No. 112.

Lohr, S. L. (1999). Sampling: Design and Analysis. Duxbury Press, Brooks/Publishing Company.

Longford, N. T. (1999). Multivariate shrinkage estimation of small area means and proportions. Journal of the Royal Statistical Society Ser. A, 162:227–245.

Marker, D. (1999). Organization of small area estimators using a generalized linear regression framework. Journal of Official Statistics, 15:1–24.

Martinez, C. and Yang, D. (2005). Remittances and poverty in migrants’ home areas: Evidence from the Philippines. SDT - Departamento de Economia, Universidad de Chile, 257.

Mellor, R. W. (1973). Subsample replication variance estimation. Unpublished PhD Thesis, Harvard University, Cambridge, Massachusetts.

Militino, A. F., Ugarte, M. D., Goicoa, T., and Gonzalez-Audicana, M. (2006). Using small area models to estimate the total area occupied by olive trees. Journal of Agricultural, Biological, and Environmental Statistics, 11:450–461.

Minot, N., Baulch, B., and Epprecht, M. (2003). Poverty and inequality in Vietnam: Spatial patterns and geographical determinants. International Food Policy Research Institute, Washington, D.C. and Institute of Development Studies. University of Sussex.

Nelder, J. A. and Wedderburn, R. W. M. (1972). Generalized linear models. Journal of the Royal Statistical Society, 135(3):370–384.

Neri, L., Ballini, F., and Betti, G. (2005). Poverty and inequality mapping in transition countries. Statistics in Transition, 7(1):135–157.

Noble, A. (2003). Small area estimation through GLM. Unpublished PhD Thesis, Massey University, New Zealand.

Noble, A., Haslett, S. J., and Arnold, G. (2002). Small area estimation via generalized linear models. Journal of Official Statistics, 18:45–60.

NRC (2000). Small-area estimates of school-age children in poverty: Evaluation of current methodology. C. F. Citro and G. Kalton (Eds.), Committee on National Statistics, Washington, DC: National Academy Press.

NSCB (2000). Profile of censuses and surveys. National Statistical Coordination Board, Philippines.

NSCB (2005). Estimation of local poverty in the Philippines. National Statistical Coordination Board, Philippines.

NSCB (2009). 2003 city and municipal level poverty estimates. National Statistical Coordination Board, Philippines.

NSCBR-I (2000). Provincial poverty maps. http://www.nscb.gov.ph/ru1/poverty maps.htm.

Osberg, L. and Xu, K. (2008). How should we measure poverty in a changing world? Methodological issues and Chinese case study. Review of Development Economics, 12(2):419–441.

Pfeffermann, D. (1993). The role of sampling weights when modeling survey data. International Statistical Review, 61:317–337.

Pfeffermann, D., Moura, F. A., and Silva, P. L. (2006). Multi-level modelling under informative sampling. Biometrika, 93:949–959.

Pfeffermann, D., Skinner, C. J., Holmes, D. J., Goldstein, H., and Rasbash, J. (1998). Weighting for unequal selection probabilities in multilevel models. J. R. Statist. Soc B, 60:23–40.

Prasad, N. G. N. and Rao, J. N. K. (1990). The estimation of the mean squared error of small area estimators. Journal of the American Statistical Association, 85:163–171.

Purcell, N. J. (1979). Efficient estimation for small domains: a categorical data analysis approach. Unpublished PhD Thesis. University of Michigan.

Purcell, N. J. and Kish, L. (1979). Estimation for small domains. Biometrics, 35:365–384.

Purcell, N. J. and Kish, L. (1980). Postcensal estimates for local areas. International Statistical Review, 48:3–18.

Purcell, N. J. and Linacre, S. (1976). Techniques for the estimation of small area characteristics. Unpublished Paper, Australian Bureau of Statistics, Canberra.

Qiao, C. (2006). Small area estimates produced using loglinear models in SAS - discussion and implementation. http://www.msd.govt.nz/documents/about-msd-and-our-work/publications-resources/working-papers/wp-01-06-report-loglinear-modelling.doc.

Quenouille, M. H. (1949). Approximate tests of correlation in time-series. J. R. Statist. Soc. B, 11:68–84.

Rao, J. N. K. (1999). Some recent advances in model-based small area estimation. Survey Methodology, 25:175–186.

Rao, J. N. K. (2003). Small Area Estimation. Wiley Series in Methodology, New York.

RDC-I (2008). Approving the common masterlist of priority program beneficiaries (cmbpp) for Region I poverty reduction program. RDC Resolution no. 57 S. 2008.

Robinson, G. K. (1991). That BLUP is a good thing: The estimation of random effects. Statistical Science, 6(1):15–51.

Saei, A., Zhang, L. C., and Chambers, R. (2005). Generalized structure preserving estimation for small areas. Statistics in Transition, 7(3):685–696.

Schaible, W. L. (1996). Use of Small Area Statistics in U.S. Federal Programs. Springer-Verlag, Inc., New York.

Sen, A. K. (1976). Poverty: An ordinal approach to measurement. Econometrica, 44(2):219–231.

Siddhartha, C. and Greenberg, E. (1995). Understanding the Metropolis-Hastings algorithm. The American Statistician, 49(4):327–335.

Simonoff, J. S. (2003). Analyzing Categorical Data. Springer texts in Statistics, New York.

Skinner, C. J., Holt, D., and Smith, T. M. F. (1989). Analysis of Complex Survey Data. John Wiley, Chichester.

Stukel, D. M. and Rao, J. N. K. (1999). Small area estimation under two-fold nested error regression models. Journal of Statistical Planning and Inference, 78:131–147.

Taylor, M. F. (2001). British Household Panel Survey user manual, volume B1. Codebook.

Tukey, J. W. (1958). Bias and confidence in not-quite large samples. Ann. Math. Statist., 29.

UN-website (2009). Millennium development goals. http://www.un.org/milleniumgoals/bkdg.shtml.

Wolter, K. M. (1985). Introduction to Variance Estimation. Springer-Verlag, New York.

Xiaoyun, L. and Remenyi, J. (2008). Making poverty mapping and monitoring participatory. Development in Practice, 18(4-5):599–610.

Yang, D. and Choi, H. (2007). Are remittances insurance? Evidence from rainfall shocks in the Philippines. The World Bank Economic Review, 21(2):219–248.

You, Y. and Rao, J. N. K. (2002). A pseudo-empirical best linear unbiased prediction approach to small area estimation using survey weights. Survey Methodology, 30:431–439.

You, Y. and Rao, J. N. K. (2003). Pseudo hierarchical Bayes small area estimation combining unit level models and survey weights. Journal of Statistical Planning and Inference, 111:197–208.

You, Y., Rao, J. N. K., and Kovacevic, M. (2003). Estimating fixed effects and variance components in a random intercept model using survey data. Statistics Canada International Symposium Series-Proceedings.

Zhang, L. C. and Chambers, R. L. (2004). Small area estimates for cross-classifications. J. R. Statist. Soc. B, 66:479–496.

Zhao, Q. (2006). User manual for PovMap version 1.1a. http://siteresources.worldbank.org/INTPGI/Resources/342674-1092157888460/Zhao ManualPovMap.pdf.

Appendices

Do Files in Stata for Different Survey Fitting Methods

ELL no hetero.do
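In mathematical notation, the variance components computed in this do-file can be written as follows. This is a transcription of the Stata code rather than a formula quoted from the ELL paper (a comment in the code notes that the two differ), and the symbols are assumptions chosen to mirror the variable names `upsilon`, `taub2` and `wb`.

```latex
\hat{\sigma}_{\upsilon}^{2}
  = \frac{\sum_{b} w_b \,\bar{\upsilon}_b^{2} \;-\; \sum_{b} w_b (1 - w_b)\, \hat{\tau}_b^{2}}
         {\sum_{b} w_b (1 - w_b)},
\qquad
\hat{\tau}_b^{2}
  = \frac{1}{n_b (n_b - 1)} \sum_{h \in b} \left( \hat{\varepsilon}_{bh} - \bar{\varepsilon}_b \right)^{2},
\qquad
\hat{\sigma}_{\varepsilon}^{2} = \hat{\sigma}_{\mathrm{OLS}}^{2} - \hat{\sigma}_{\upsilon}^{2}
```

Here $b$ indexes clusters (`bcode`), $n_b$ is the number of sampled households in cluster $b$, $w_b$ is the cluster's share of the total re-scaled survey weight, $\bar{\upsilon}_b$ is the mean OLS residual in cluster $b$, $\hat{\varepsilon}_{bh}$ is the household residual minus $\bar{\upsilon}_b$, and $\hat{\sigma}_{\mathrm{OLS}}^{2}$ is the squared root mean squared error of the weighted OLS fit. When $\hat{\sigma}_{\upsilon}^{2}$ is negative, the code treats the location effect as absent and sets $\hat{\sigma}_{\varepsilon}^{2} = \hat{\sigma}_{\mathrm{OLS}}^{2}$.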

// This program fits the regression model using the ELL method, with random errors assumed to be homoscedastic.
set mem 100m
set matsize 7000
cd "C:\SAE1\Results ELL"
// Load the survey data, discarding any data already in memory
use "E:\SAE\SURVEY2 all.dta", clear

/* Re-scale the weights so that they sum to the sample size */
egen totwt=total(sswgthh)
gen rwt=sswgthh*(_N/totwt)
egen trwt=total(rwt)

#delimit ;
/* OLS regression using re-scaled survey weights */
global Xvars "famsize famsizesqc type_mult per_kids roof_light per_61up roof_strong wall_light wall_salvaged wall_strong fa_xs fa_s fa_l fa_xl fa_xxl fa_xxxl all_eled all_hsed all_coed dom_help head_male no_spouse hou_9600 hea_rel_mus Per_eng Hou_coelpg Hou_own_ref Hou_own_tel Per_wor_prh Per_ind_52";
#delimit cr
regress lnincpp $Xvars [pweight=rwt]
global rmseB=e(rmse)

/* Save residuals, coefficients and covariance matrix */
predict resB, resid
matrix beta=e(b)
matrix VarB=e(V)

/* Calculate the number of regressors */
global nx=0
foreach var of varlist $Xvars {
    global nx=$nx+1
}
display $nx

/* Calculate the variance component - cluster level (var-upsilon) */
by bcode, sort: gen nb=_N
global nb=nb
egen upsilon=mean(resB), by(bcode)
egen wtb=total(rwt), by(bcode)
gen epsilon=resB-upsilon
egen epsmn=mean(epsilon), by(bcode)
gen df_eps2=(epsilon-epsmn)*(epsilon-epsmn)
gen wb=wtb/trwt

/* Collapse to one record per cluster; the household level data are restored afterwards */
preserve
collapse nb trwt upsilon (sum) sumdeps2=df_eps2 wtb=rwt u=resB, by(bcode)
gen taub2=(1/(nb*(nb-1)))*sumdeps2 /* computation based on the SAS program; it differs from the one in the appendix of the ELL paper */
gen wb=wtb/trwt
gen nume1=wb*(upsilon*upsilon)
gen denom=wb*(1-wb)
gen nume2=denom*taub2
egen Snume1=total(nume1)
egen Snume2=total(nume2)
egen Sdenom=total(denom)
gen var_ups=(Snume1-Snume2)/Sdenom
egen totres=total(u) /* just to check the sum of the residuals */
display totres
egen twb=total(wb) /* just to check the sum of the adjusted weights (=1) */
display twb
display var_ups
scalar vU=var_ups[1]
global Nbcode=_N /* number of bcode or clusters */
restore

gen wtdres=resB*rwt
egen twtdres=total(wtdres)
display twtdres
egen totres1=total(resB)
display totres1

gen var_eps=($rmseB*$rmseB)-vU /* no heteroscedasticity model with location effect */

/* if no location effect */
replace epsilon=resB if vU<0
replace var_eps=$rmseB*$rmseB if vU<0 /* no heteroscedasticity model & no location effect */
scalar vE=var_eps[1]
display vE

/*-----------------GLS estimation---------------------*/
egen bcode1=group(bcode)
by bcode1, sort: gen numbcode=_N
gen const=1
global XvarsC "$Xvars const"
global mb=$nx+1
matrix XWBX=J($mb,$mb,0)
matrix XWBWX=J($mb,$mb,0)
matrix XWBY=J($mb,1,0)

In document Intercensal updating of small area estimates : a thesis presented in partial fulfillment of the requirements for the degree of Doctor of Philosophy in Statistics at Massey University, Palmerston North, New Zealand (Page 171-200)