Table 4.7 GHS Summary Wage Equations
4.3 Statistical Methodology
4.3.1 Matching Earnings into the LFS
In recent research it has been noted that, while there is considerable virtue in having large datasets for microeconometric analysis, such datasets often contain relatively limited information.
However, other datasets are available which contain more detailed information on smaller samples. Imbens and Lancaster (1994) have recently developed a methodology for exploiting the availability of such additional detailed information, either at the aggregate level or for smaller selected samples. This methodology is a generalisation of that used by Arellano and Meghir (1992) for matching complementary datasets. The merit of their methodology is that it avoids the need to collect expensive new data containing all the variables of interest for a large sample.
Thus, while the LFS data do not contain earnings or income information, the General Household Survey (GHS) allows us to estimate, for each year, the relationship between (log) earnings and a vector, X, of characteristics for any occupation. This can then be used to predict the earnings of individuals surveyed in the LFS with those same characteristics in any occupation. That is,
$\log \hat{\omega}_{ijt} = \hat{\beta}_{jt}' X_{it}$ [4.3.1]
is the predicted (log) wage for individual i when in occupation j in year t, where $\hat{\beta}_{jt}$ is a coefficient vector estimated from the GHS and the j subscript indicates that the equation is estimated from the subsample of individuals who are employed in occupation j.
The intuition behind the above matching process is as follows. An individual's wage is made up of two components. The first can be attributed to the individual's observable characteristics, which include age, qualifications, sex and ethnicity. The second is unobservable and includes factors such as personal motivation, organisational abilities and interpersonal skills. In this paper we predict earnings for persons in the LFS by using only the first component of earnings, i.e. that based on individual observable characteristics. The GHS informs us of the payments to each observable characteristic, from which we can construct earnings (the observable part only) of individuals in the LFS. The advantage of ignoring the unobservable component of wages is that such a fixed effect may well be correlated with transitions, and we thereby effectively eliminate this source of simultaneity between wages and transitions.
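The matching step can be illustrated with a minimal Python sketch. It assumes a toy GHS sample with a constant and a single hypothetical characteristic (years of schooling). The function names `ols`, `fit_by_occupation` and `predict_log_wage`, and all of the numbers, are illustrative assumptions and not the estimation code or data used in this chapter.

```python
from collections import defaultdict

def ols(X, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(row[a] * row[b] for row in X) for b in range(k)]
         + [sum(row[a] * yi for row, yi in zip(X, y))] for a in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [v - factor * p for v, p in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]

def fit_by_occupation(ghs_rows):
    """One coefficient vector per occupation from (occupation, x, log_wage) rows."""
    groups = defaultdict(lambda: ([], []))
    for occupation, x, log_wage in ghs_rows:
        groups[occupation][0].append(x)
        groups[occupation][1].append(log_wage)
    return {occ: ols(X, y) for occ, (X, y) in groups.items()}

def predict_log_wage(beta_j, x):
    """Observable part of the log wage only: beta_j' x (no residual is added)."""
    return sum(b * v for b, v in zip(beta_j, x))

# Hypothetical GHS rows for one year: (occupation, [constant, schooling], log earnings).
ghs = [
    ("manual", [1.0, 10.0], 1.5), ("manual", [1.0, 12.0], 1.6), ("manual", [1.0, 16.0], 1.8),
    ("professional", [1.0, 10.0], 2.0), ("professional", [1.0, 12.0], 2.2),
    ("professional", [1.0, 16.0], 2.6),
]
beta = fit_by_occupation(ghs)

# Hypothetical LFS respondent with 14 years of schooling; no earnings are observed.
lfs_person = [1.0, 14.0]
predicted = {occ: predict_log_wage(b, lfs_person) for occ, b in beta.items()}
```

Here `predicted` holds one predicted log wage per occupation for the same LFS respondent, which is the input needed to form occupational wage differentials. In practice the fit would be repeated for each year of the GHS.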
4.3.2 Construction of the Alternative Earnings
To analyse the transitions we require counterfactual earnings or wage rates for each individual, whether or not they move, in order to construct their expected differential from moving. Even if the LFS collected earnings information we would still need to construct the same-occupation wage for those who move occupations and the move wage for those who stay in the same occupation. Indeed, for simultaneity reasons we may wish to instrument the observed wage, in effect predicting both components of the wage differential for each individual.
Our procedure is to use the detailed information in each corresponding year of the GHS to predict the alternative earnings measures. We use these to construct the differentials for each individual in each year of the LFS. What matters for transitions, therefore, is the difference in relative differentials between movers and stayers.
In order to compute the expected earnings for any individual we need to weight the observed job-specific earnings by the probabilities that each individual would be able to attain such an occupation. To do so, we count the proportion of individuals in each occupation group at time t who share a particular set of characteristics; that is, we compute the proportion of individuals in occupation k of type i at each period ($\pi_{ikt}$). Then we define our offer weights as
[4.3.2]
for an individual of type i currently in occupation k at time t. The idea is to capture the chances of an individual with particular characteristics moving to a higher occupation (j) by the proportion of like individuals already in a higher occupation.
Thus, these two equations allow us to compute occupational wage differentials for each individual in each year of the LFS data and, moreover, to do so in a way which allows for estimated changes in the structure of rewards over time.
For self-employment the alternative wage does not involve any such implicit weighting. It is simply based on the predicted self-employed earnings given individual observable characteristics.
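The weighting idea can be sketched in Python as follows. The exact form of the offer weights in [4.3.2] is not reproduced in this sketch; the normalisation used here (shares of like individuals in higher-ranked occupations, rescaled to sum to one) is an assumption made purely for illustration, as are the occupation ranking, the function names and all of the numbers. Averaging predicted log wages is also a simplification.

```python
from collections import Counter

# Hypothetical ranking of occupations, lowest to highest.
RANK = {"low": 0, "mid": 1, "high": 2}

def occupation_shares(sample):
    """Share of each type's members found in each occupation, keyed by (type, occupation)."""
    by_type = Counter(t for t, _ in sample)
    by_cell = Counter(sample)
    return {(t, occ): n / by_type[t] for (t, occ), n in by_cell.items()}

def offer_weights(shares, person_type, current_occ, rank=RANK):
    """ASSUMED weights: shares of like individuals in higher occupations, rescaled to sum to 1."""
    higher = {occ: shares.get((person_type, occ), 0.0)
              for occ in rank if rank[occ] > rank[current_occ]}
    total = sum(higher.values())
    return {occ: (s / total if total > 0 else 0.0) for occ, s in higher.items()}

def expected_move_log_wage(weights, predicted_log_wage):
    """Weighted average of predicted log wages over the attainable higher occupations."""
    return sum(w * predicted_log_wage[occ] for occ, w in weights.items())

# Hypothetical year-t sample of (type, occupation) pairs for one type of individual.
sample = [("A", "low")] * 5 + [("A", "mid")] * 3 + [("A", "high")] * 2
shares = occupation_shares(sample)
weights = offer_weights(shares, "A", "low")

# Hypothetical predicted log wages for a type-A individual in each occupation.
predicted_log_wage = {"low": 1.5, "mid": 2.0, "high": 2.5}
move_wage = expected_move_log_wage(weights, predicted_log_wage)
differential = move_wage - predicted_log_wage["low"]
```

The resulting `differential` is the expected gain from moving for an individual currently in the lowest occupation. Because the shares are recomputed from each year's sample, the weights can change over time along with the estimated returns.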
4.3.3 Model Specification
The statistical framework we use here is an adaptation of the mover-stayer model. The mobility decision is based on a comparison of two prospective income streams: that associated with moving to a new job with a higher wage (or moving into self-employment) and that associated with staying with the existing one. Mobility from occupation k to occupation j is desired if
$\omega_{ij} - \omega_{ik} > C_i$ [4.3.3]
where $\omega_{ij}$ and $\omega_{ik}$ are the wage incomes associated with moving to occupation j and staying with occupation k, and $C_i$ are the costs of moving from one to the other.
Note that, in defining the offer weights, we cannot use the proportions of type i who move from j to k since this would not only be very imprecise (since only small numbers are involved) but also endogenous. In any case, we might expect the proportions who move into an occupation to be the same as the proportions in the occupation in steady state.
However, individuals may not be able to obtain a job with a higher wage. Thus, the probability that individual i is observed to move to a higher wage job is given by
$\sum_{j} \pi_{ij} \Pr(\omega_{ij} - \omega_{ik} > C_i)$ [4.3.4]
where $\pi_{ij}$ is the probability that individual i is able to secure a job in occupation j.
To construct the likelihood function for the observed labour market transitions of a group of employed individuals, we assume that the probability of moving upward to higher occupational class employment, as opposed to staying with the same type of occupation, is described by a simple Logit model; we thus exclude those individuals who transited to self-employment, unemployment, or retirement. The model is then extended to a Multinomial Logit model in which individuals face the multiple decision of moving into self-employment, staying in the same job, or moving upward.
For each individual, the probability of upward movement depends on a vector $Z_i$ comprising both variables that influence individual preferences, such as marital status and dependent children, and variables reflecting the earnings opportunities in the current and alternative job and indicators of the local labour environment. It is the inclusion of the earnings variables, which are treated as endogenous to the transition decision, that defines our "structural" model. Therefore, for individual i,
$\Pr(\text{move up}_i = 1) = \Lambda(\gamma' Z_i)$ [4.3.5]
which, given the Logit specification, becomes
$\Pr(\text{move up}_i = 1) = \dfrac{\exp(\gamma' Z_i)}{1 + \exp(\gamma' Z_i)}$ [4.3.6]
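A minimal Python sketch of the Logit probability and the corresponding log-likelihood is given below. The function names and the numerical values are hypothetical. The multinomial version takes "stay" as the base outcome with its index normalised to zero, which is an assumed normalisation and is not stated in the text.

```python
import math

def logit_prob(gamma, z):
    """Pr(move up = 1) = exp(gamma'z) / (1 + exp(gamma'z))."""
    index = sum(g * x for g, x in zip(gamma, z))
    return 1.0 / (1.0 + math.exp(-index))

def log_likelihood(gamma, data):
    """Sum of log probabilities of the observed outcomes; data holds (z, moved) pairs."""
    total = 0.0
    for z, moved in data:
        p = logit_prob(gamma, z)
        total += math.log(p) if moved else math.log(1.0 - p)
    return total

def multinomial_probs(gammas, z):
    """Multinomial Logit; 'stay' is the ASSUMED base outcome with index 0."""
    scores = {"stay": 0.0}
    for outcome, gamma in gammas.items():
        scores[outcome] = sum(g * x for g, x in zip(gamma, z))
    denom = sum(math.exp(s) for s in scores.values())
    return {outcome: math.exp(s) / denom for outcome, s in scores.items()}

# Hypothetical Z vectors: [constant, predicted wage differential, married indicator].
data = [([1.0, 0.7, 1.0], 1), ([1.0, 0.1, 0.0], 0)]
gamma = [0.0, 0.0, 0.0]
ll_at_zero = log_likelihood(gamma, data)

# Hypothetical coefficients for the two non-base outcomes.
probs = multinomial_probs(
    {"move up": [-1.0, 2.0, 0.1], "self-employment": [-2.0, 1.0, 0.0]},
    [1.0, 0.7, 1.0],
)
```

Estimation would maximise `log_likelihood` over `gamma` with a numerical optimiser; that step is omitted here.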
The earnings equation has the form
$\log \omega_{ijt} = \beta_{jt}' X_{it} + \varepsilon_{ijt}$ [4.3.7]
which is used to match GHS earnings into the LFS datasets as described above. The vector $X_{it}$ contains variables which will affect the individual's earnings in occupation j, including detailed educational qualifications, age, experience, experience squared, region, and whether the individual is non-white or has any kind of health limitation. Note that since we wish to compute wages for the same individual in different occupations, the wage equations are estimated for each occupation separately. It would, in any case, be inappropriate to include occupation dummies, since occupation is endogenous and it would also imply that the differentials were the same across individuals. Moreover, since we want to capture the effects of changes in returns to characteristics over time, we estimate the wage equations for each year separately.