
Chapter 3. Data and methodology

3.5 Modelling approach

My objective in the chapters that follow is to examine the level of post-move satisfaction that movers report and also the factors that are associated with different levels of post-move satisfaction. I consider both across the range of satisfaction domains. In order to do so, I start with the following generic equation for each post-move satisfaction domain:

$$ y_i = \alpha + \beta X_i + \varepsilon_i \tag{3.1} $$

where $y_i$ is the estimated level of satisfaction for the $i$-th individual, $X_i$ is the vector of the included independent variables, and $\alpha$, $\beta$ and $\varepsilon_i$ are estimates of the constant, slopes and error terms respectively.

I apply my estimation model using both OLS and logit regression in order to estimate the association that each factor has with my ordered, discrete dependent variable.
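As a minimal sketch of how the generic equation is estimated by OLS, the example below fits the single-regressor case in closed form. The data are hypothetical, and the five-point coding of satisfaction (1 = very dissatisfied, 5 = very satisfied) and the helper name `ols_simple` are assumptions made for illustration, not the coding or software used in the thesis.

```python
from statistics import mean


def ols_simple(x, y):
    """Closed-form OLS for y = alpha + beta * x + error with one regressor."""
    x_bar, y_bar = mean(x), mean(y)
    # Sum of cross-products and sum of squares around the means.
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    beta = sxy / sxx                 # slope
    alpha = y_bar - beta * x_bar     # constant
    return alpha, beta


# Hypothetical data (NOT from the DMM survey): log distance moved and
# satisfaction coded 1..5 under an assumed five-point scale.
ln_distance = [0.0, 1.0, 2.0, 3.0, 4.0]
satisfaction = [3, 4, 4, 5, 4]

alpha, beta = ols_simple(ln_distance, satisfaction)
print(f"alpha = {alpha:.2f}, beta = {beta:.2f}")
```

In this toy sample each one-unit increase in log distance is associated with a 0.3-point higher satisfaction score, which is the kind of direct reading that makes OLS coefficients easy to interpret.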

Multiple Regression

Agreement over the most appropriate method of estimating the determinants of subjective wellbeing, happiness or post-move satisfaction is by no means fixed within the literature. In this section I outline my decision to use OLS regression instead of alternative methods of estimating post-move satisfaction and discuss the methods used in existing post-move satisfaction studies.

Within the post-move satisfaction literature, the predominant method of analysing cross-sectional data has been the logit model. Lundholm and Malmberg (2006) used ordinal logit models in their study of the post-move satisfaction of movers. Multinomial logit models have also been used, effectively treating the satisfaction responses as neither cardinal nor ordinal, but simply discrete.

Barcus (2004), De Jong et al. (2002) and Lu (2002) each used multinomial logit models to measure the probability of movers being better off, worse off, or about the same.

The remaining studies of post-move satisfaction used linear regression models for their ease of interpretation. The standard OLS models have been the predominant method of analysing residential satisfaction as well (Lu, 1999). Although the ordinal logit (or probit) models have been considered by some to be superior (Lu, 1999, McKelvey and Zavoina, 1975), as others have concluded, the results of the two methods are fundamentally comparable (Clark et al., 2008a, Ferrer-i-Carbonell and Frijters, 2004). The ease of interpreting its estimated coefficients is a key factor in researchers deciding to utilise OLS regression (Nowok et al., 2011, Ryan, 2012).

When researchers use OLS regression they assume that the dependent variable of satisfaction is cardinal in nature. For this to be the case, movers must have a common interpretation of each level of post-move satisfaction and also interpret the difference between successive levels of satisfaction equally (Ferrer-i-Carbonell and Frijters, 2004, Ng, 1997). For example, movers should have a common understanding of how satisfied ‘very satisfied’ is and also evaluate the difference between ‘equally satisfied’, ‘dissatisfied’ and ‘satisfied’ as being the same degree of difference as that between ‘satisfied’ and ‘very satisfied’.
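The equal-spacing part of the cardinality assumption can be illustrated with a small sketch. It fits the same hypothetical responses twice: once with the usual equally spaced codes 1 to 5, and once with a recoding that keeps the order of the categories but stretches the gap between the top two. The data, the five-point scale and the `STRETCHED` mapping are all assumptions made for illustration.

```python
from statistics import mean


def ols_slope(x, y):
    """OLS slope of y on a single regressor x."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    return sxy / sxx


# Hypothetical responses on an assumed five-point scale (1..5).
ln_distance = [0.0, 1.0, 2.0, 3.0, 4.0]
responses = [3, 4, 4, 5, 4]

# Order-preserving recoding: only the step from category 4 to 5 is widened.
STRETCHED = {1: 1, 2: 2, 3: 3, 4: 4, 5: 7}
recoded = [STRETCHED[r] for r in responses]

beta_equal = ols_slope(ln_distance, responses)
beta_stretched = ols_slope(ln_distance, recoded)
print(f"equal spacing: {beta_equal:.2f}, stretched top gap: {beta_stretched:.2f}")
```

The ordering of the answers is identical in both runs, yet the slope moves from 0.3 to 0.5. OLS results therefore depend on the spacing assigned to the categories, which is why the assumption of equal intervals matters.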

Ferrer-i-Carbonell and Frijters (2004) cite studies by Sandvik et al. (1993) and Diener et al. (1999), which show that, by observing others, individuals are able to determine and then interpret their emotional state, suggesting that “there is a common human ‘language’ of satisfaction and that satisfaction is roughly observable and comparable among individuals” (Ferrer-i-Carbonell and Frijters, 2004: p. 644).

Furthermore, individuals associate similar numerical values with satisfaction levels, and the distribution of these numerical values tends to be relatively evenly spaced (van Praag, 1991). Movers have even been considered to evaluate their level of satisfaction as if the question asked was cardinal in nature (Schwartz, 1995).

McKelvey and Zavoina (1975) suggest that multiple regression does not model the true relation between the ordinal dependent variables and the independent variables. They conclude that the lumped nature of the dependent variable introduces a bias “into the estimate of β which is dependent on the distribution of the independent variable” (McKelvey and Zavoina, 1975: p. 119). This bias may underestimate the effect of some variables. In measuring the residential satisfaction of individuals, Lu (1999) considered the use of both ordered logit and regression models. While he concluded that the “results from multiple regression models should be accepted with a grain of salt” (Lu, 1999: p. 284) for the reasons stated by McKelvey and Zavoina (1975), both models showed largely similar results.

It is with these considerations in mind that I utilise the OLS regression model in my statistical modelling. For much of my initial exploratory work I utilised an ordered probit model and throughout my analysis, I continued to compare my OLS regression results with the probit model, and found no significant differences between the results of the two models.
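For reference, the ordered probit model used as the comparison can be written in its standard latent-variable form. This is the textbook formulation rather than a specification taken from the thesis; the latent satisfaction $y_i^{*}$, the thresholds $\mu_j$, the number of response categories $J$ and the standard normal distribution function $\Phi$ are notation introduced here for illustration.

```latex
\begin{align}
  y_i^{*} &= \beta X_i + \varepsilon_i, \qquad \varepsilon_i \sim N(0, 1) \\
  y_i &= j \quad \text{if} \quad \mu_{j-1} < y_i^{*} \le \mu_j,
        \qquad j = 1, \dots, J, \quad \mu_0 = -\infty, \ \mu_J = +\infty \\
  \Pr(y_i = j \mid X_i) &= \Phi(\mu_j - \beta X_i) - \Phi(\mu_{j-1} - \beta X_i)
\end{align}
```

Unlike OLS, this form estimates the thresholds between categories rather than assuming they are equally spaced, so the two sets of coefficients are on different scales and are best compared by sign and significance rather than by magnitude.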

Independent variables

I anticipate that the factors influencing the post-move satisfaction of movers will be both complex and contingent. Correspondingly, my initial model starts with only a single independent variable, distance. As I consider subsequent factors that may be associated with different levels of post-move satisfaction, I add further independent variables to the model and consider their association both with post-move satisfaction and with the independent variables already in the model. With each variable added, my model increases in complexity. The order in which I add additional factors is provided in Table 3.3, under three categories of variables: those pertaining to the move, the mover and the area. A full list of independent variables is given in Table 3.4.

Table 3.3: Order that the independent variables are added, by category

Move                 Order   Mover                   Order   Area                        Order
Distance             1       Age                     4       Neighbourhood deprivation   9
LLM change           2       Ethnicity               5       Urban hierarchy             10
Time since move      3       Sex                     6
Reasons for moving   11      Cohabitation            7
                             Socio-economic status   8
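The sequential approach summarised in Table 3.3 can be sketched as a series of nested model specifications, each adding the next factor to those already included. The entry order below is transcribed from the table. The function name `nested_specifications` and the formula-style printout are illustrative assumptions, and in practice a single factor such as socio-economic status may enter as several variables at once.

```python
# Entry order of each factor, as listed in Table 3.3.
ENTRY_ORDER = {
    "Distance": 1,
    "LLM change": 2,
    "Time since move": 3,
    "Age": 4,
    "Ethnicity": 5,
    "Sex": 6,
    "Cohabitation": 7,
    "Socio-economic status": 8,
    "Neighbourhood deprivation": 9,
    "Urban hierarchy": 10,
    "Reasons for moving": 11,
}


def nested_specifications(entry_order):
    """Return cumulative lists of factors: model k contains the first k factors."""
    ordered = sorted(entry_order, key=entry_order.get)
    return [ordered[: k + 1] for k in range(len(ordered))]


specs = nested_specifications(ENTRY_ORDER)
for step, factors in enumerate(specs, start=1):
    print(f"Model {step}: satisfaction ~ " + " + ".join(factors))
```

Comparing the coefficient on an earlier factor, such as distance, across successive specifications shows how its estimated association changes as later factors are introduced.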

Table 3.4: Description of the independent variables used in the analysis of post-move satisfaction

Variable                                 Categories                         Frequency (%) or mean
Distance                                 km [ln(km)]                        61.4 (mean)
                                         9-12 months                        13.2
                                         1-2 years*                         34.4
                                         Three                              22.3
                                         Four                               16.5
                                         Five or more*                      41
Length at previous address               Years [ln(years)]                  0.74 (mean)
                                         Not otherwise identified (n.i.e)   4.6
Place of birth                           Overseas                           22.8
Cohabitation status                      Existing couple*                   53
                                         New couple                         5.2
                                         Different couple                   0.6
                                         Still single                       35.6
                                         Newly single                       5.5
Education                                None                               22.7
                                         Secondary                          24
                                         Post-School*                       38.4
                                         Bachelor or higher                 15
Income                                   Unknown                            5.4
Income change (compared with one year    Increased                          26.4
Labour force status                      Not in labour force*               25.7
                                         Unemployed                         6.1
                                         Managers and professionals         26.3
                                         Trades and services                22.5
                                         Primary and secondary              13.1
                                         Unknown                            6.4
Forced moves                             Voluntary*                         81
                                         Forced                             19
                                         Economic*                          32.9
                                         Housing                            18.3
                                         Environment                        9.3
                                         Other                              3.1
Multiple motives                         One*                               67.9
                                         Multiple                           32.1

* Denotes reference category

Source: Statistics New Zealand, 2007

3.6 Summary

In this chapter I have introduced the scope of the study and discussed the sources of my data and my analytical approach. At the centre of my study is the 2007 Survey of Dynamics and Motivations for Migration in New Zealand, a cross-sectional analysis of a sample of over 26,000 New Zealanders. From this sample, the characteristics of approximately 4,900 movers and their moves are available. The DMM survey was supplemented by a number of additional datasets in order to provide information not only about the mover, but also about the characteristics of the areas that they move from and to, and about the move itself.

The distribution of responses to the post-move satisfaction question was found to be negatively skewed. The negative skew was particularly pronounced for overall post-move satisfaction, with 87% of respondents reporting a degree of satisfaction with the outcomes of their move. This distribution was expected and notably similar to the distribution of responses found in other surveys.

I then introduced my statistical model and reviewed the relative advantages and disadvantages of using OLS regression. I chose OLS regression over ordinal logit (or probit) and multinomial alternatives because its results are easier to interpret than those of models such as ordinal logit, which treat the dependent variable differently. In order to understand the probability of a positive satisfaction outcome, I utilise logistic regression in key areas.
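As a hedged illustration of what a logistic model of a positive outcome estimates, the sketch below converts between a probability and its log-odds. It uses the 87% share of respondents reporting some degree of overall satisfaction quoted above. In a logistic regression with no covariates the fitted intercept equals the log-odds of the sample share, which is the only property shown here; the helper names are hypothetical and no model from the thesis is reproduced.

```python
import math


def log_odds(p):
    """Log-odds (logit) of a probability p strictly between 0 and 1."""
    return math.log(p / (1.0 - p))


def inverse_logit(z):
    """Map a log-odds value back to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


# Share of movers reporting some degree of overall satisfaction (from the text).
p_satisfied = 0.87

# Intercept that an intercept-only logistic regression would recover.
intercept = log_odds(p_satisfied)
print(f"log-odds of a positive outcome: {intercept:.2f}")
print(f"back-transformed probability: {inverse_logit(intercept):.2f}")
```

Coefficients on added covariates shift this log-odds value up or down, so they are read as changes in the odds of a positive outcome rather than as changes in a satisfaction score.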

In order to understand the association between post-move satisfaction and each of the independent variables that I analysed, I outlined the categories and the order in which I would progressively add them to my models. In the following chapter, I start by only considering the association between post-move satisfaction and the most salient geographical variable, distance. In subsequent chapters I introduce successive variables. This sequential approach allows me to assess the degree to which each new variable is associated with the post-move satisfaction as well as how it contributes to altering the statistical influence of variables already in the model.
