
4.3 Preliminary Analysis

4.3.4 Results

4.3.4.1 Model Selection

The first step of the cross-sectional multilevel analysis was to estimate the model in equation 4.1 considering only the set of fixed main effects as explanatory variables. The covariates included in the model were those described in Table 4.1 and their effects were assessed in terms of their significance. It is worth mentioning that the variable for metropolitan region is the only cluster level covariate included at this stage as it is an important variable for the estimation of labour income in Brazil. All other variables considered at this stage were level one covariates.

The first column of Table 4.3 presents the results for this first model, estimated using the Stata command xtmixed. The variables for the number of household members and for proxy respondents were not significant at the 5% level and were therefore not included in the model. Recall that the variables for age and duration of employment are now centred around the age of 40 and the average employment duration, respectively.
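The centring mentioned above can be illustrated with a minimal sketch. The ages and durations below are hypothetical values chosen for illustration only, and the helper `centre` is not part of the original analysis, which was carried out in Stata.

```python
from statistics import mean


def centre(values, around=None):
    """Subtract a fixed reference value, or the sample mean if none is given."""
    reference = mean(values) if around is None else around
    return [v - reference for v in values]


# Hypothetical ages (years) and employment durations (months), illustration only.
ages = [25, 40, 58]
durations = [12, 60, 108]

age_c = centre(ages, around=40)   # centred around the age of 40
duration_c = centre(durations)    # centred around the sample mean

print(age_c)
print(duration_c)
```

With this coding, the intercept refers to a head of household aged 40 with average employment duration, rather than to the implausible case of age zero and zero duration.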

Table 4.3 also presents the estimated variance components for this model. Both variance terms are statistically significant. The between-PSU variance $\hat{\sigma}^2_u$ of 0.05 is relatively small compared with the within-PSU variance $\hat{\sigma}^2_e$ of 0.30, but it is still significantly different from zero. This gives an estimated intra-cluster correlation coefficient $\hat{\rho}$ of around 15%. This indicates that, conditional on the covariates, the incomes of heads of household within the same PSU are correlated and that the PSU level should be considered in this analysis.
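In a two-level variance components model, the intra-cluster correlation is the share of the total residual variance that lies between clusters. The sketch below reproduces the figure for model (1) from the rounded variance estimates reported in Table 4.3; the function name is illustrative only.

```python
def intra_cluster_correlation(var_between, var_within):
    """Share of total residual variance attributable to the cluster (PSU) level."""
    return var_between / (var_between + var_within)


# Rounded variance estimates for model (1), as reported in Table 4.3.
VAR_BETWEEN_PSU = 0.055
VAR_WITHIN_PSU = 0.301

rho = intra_cluster_correlation(VAR_BETWEEN_PSU, VAR_WITHIN_PSU)
print(f"{rho:.3f}")
```

Because the inputs are rounded to three decimal places, the result may differ in the last digit from a value computed on the unrounded estimates.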

Before attempting to interpret this initial model, residual diagnostics were performed. Figure 4.3 displays the level one residuals in the first row and the level two residuals in the second row.

Figure 4.3: Residual Diagnostic - Main Effects Model

Observe that the residuals at the head of household level seem to be normally distributed. However, the level two residuals show some extreme positive values, which may indicate a violation of the normality assumption. The fourth plot shows some evidence of non-constant variance for the PSU level residuals. This might also indicate the presence of unmeasured cluster effects or that the level two residuals are correlated with explanatory variables. The inclusion of additional PSU level variables or contextual effects may address this problem.

As identified earlier in this chapter, there is some evidence of gender and race discrimination in the Brazilian labour market. Before the inclusion of contextual variables, the model selection proceeded with the inclusion of interaction terms between level one variables. For this purpose, only the interaction terms between the dummy variables for males and whites and all other variables were considered and tested in the model. Column (2) of Table 4.3 presents the results for the model fitted with only the significant interaction terms. The inclusion of these terms improved the model fit, as indicated by the likelihood ratio test (LRT) ($L^2 = 556.43$ with 19 degrees of freedom). However, the residual diagnostic plots for this model presented the same patterns as those of the previous model.
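The likelihood ratio statistic is the difference between the -2×log-likelihood values of the two nested models. The sketch below uses the rounded values reported in Table 4.3, so it gives 556 rather than the 556.43 quoted in the text. The critical value is the usual tabulated upper 5% point of the chi-square distribution with 19 degrees of freedom, and the names are illustrative only.

```python
# -2 x log-likelihood values as reported (rounded) in Table 4.3.
DEVIANCE_MAIN_EFFECTS = 93_056   # model (1)
DEVIANCE_INTERACTIONS = 92_500   # model (2)

# Upper 5% point of the chi-square distribution with 19 degrees of freedom,
# taken from standard statistical tables.
CHI2_CRITICAL_19DF_5PCT = 30.144


def likelihood_ratio_statistic(deviance_reduced, deviance_full):
    """Difference in -2 x log-likelihood between two nested models."""
    return deviance_reduced - deviance_full


lrt = likelihood_ratio_statistic(DEVIANCE_MAIN_EFFECTS, DEVIANCE_INTERACTIONS)
reject_null = lrt > CHI2_CRITICAL_19DF_5PCT

print(lrt, reject_null)
```

The statistic is far above the critical value, which is consistent with the conclusion that the interaction terms jointly improve the fit.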

To try to improve the fit of this model, the next step was to include contextual effects. Unfortunately, PSU level variables other than the metropolitan region were not available in the data set. Because of confidentiality protection, such variables are not immediately available in any of the official surveys. The alternative adopted was to construct PSU level variables from the monthly PME data, by pooling data from 2004 and 2005 for all interviewed individuals. Population means and proportions for specific variables were calculated for each PSU, taking the sampling design and sampling weights into account. For simplicity, the contextual variables were calculated for the variables initially considered as covariates in the model. Before deciding which of these variables to include in the analysis, the level two residuals were plotted against the average PSU values of the explanatory variables in the cross-sectional data set. Some of these plots are presented in Figure 4.4, and behaviour similar to that in the fourth plot of Figure 4.3 was observed. Further model selection was then performed; the plots in Figure 4.4 are those for the contextual variables that were significant in the model in column (3) of Table 4.3.
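The construction of a contextual variable can be sketched as a weighted mean within each PSU. The records, PSU identifiers and weights below are hypothetical, and the sketch covers only the use of sampling weights; it does not reproduce the full treatment of the sampling design described above.

```python
from collections import defaultdict


def weighted_psu_means(records):
    """Weighted mean of a variable within each PSU.

    `records` is an iterable of (psu_id, value, sampling_weight) tuples.
    A proportion is obtained by passing a 0/1 indicator as the value.
    """
    totals = defaultdict(lambda: [0.0, 0.0])  # psu -> [sum of w*x, sum of w]
    for psu, value, weight in records:
        totals[psu][0] += weight * value
        totals[psu][1] += weight
    return {psu: wx / w for psu, (wx, w) in totals.items()}


# Hypothetical pooled records: (PSU id, years of education, sampling weight).
records = [
    ("psu_a", 8, 100.0),
    ("psu_a", 12, 300.0),
    ("psu_b", 4, 50.0),
    ("psu_b", 6, 50.0),
]

psu_mean_education = weighted_psu_means(records)
print(psu_mean_education)
```

Each head of household in the cross-sectional data set would then be assigned the value computed for his or her PSU.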

Column (3) of Table 4.3 presents the final cross-sectional multilevel model for the log of real labour income of employed heads of household. Figure 4.5 presents the residual diagnostic plots for this model. Notice that the inclusion of PSU level variables improved the shape of the distribution of the level two residuals. This model can then be interpreted, starting from the estimates of the fixed part of the model, as presented in the following sub-section.
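The effect of the contextual variables can also be seen in the variance components of Table 4.3. As a rough illustration based on the rounded estimates, the sketch below computes the proportional reduction in the between-PSU variance from model (2) to model (3); the function name is illustrative only.

```python
# Estimated between-PSU variances as reported (rounded) in Table 4.3.
VAR_BETWEEN_MODEL_2 = 0.055  # interaction terms, no additional contextual effects
VAR_BETWEEN_MODEL_3 = 0.011  # with contextual effects


def proportional_reduction(var_before, var_after):
    """Share of the earlier variance removed by the later model."""
    return (var_before - var_after) / var_before


reduction = proportional_reduction(VAR_BETWEEN_MODEL_2, VAR_BETWEEN_MODEL_3)
print(f"{reduction:.0%}")
```

On these rounded figures, the contextual variables account for roughly four fifths of the between-PSU variance that remained in model (2).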

Figure 4.4: Level Two Residuals by Significant Contextual Effects

Note: Level two residuals on the vertical axis.

Table 4.3: Cross-sectional Multilevel Modelling: Two-level Variance Components Model

                                              (1)               (2)               (3)
                                         Coeff       SE    Coeff       SE    Coeff       SE
Constant (intercept)                     4.334    0.038    4.319    0.056    4.753    0.162
Month                                   -0.004    0.001   -0.004    0.001   -0.004    0.001
  Squared term                           0.213†   0.056†   0.205†   0.056†   0.216†   0.055†
Males                                    0.379    0.006    0.598    0.070    0.594    0.070
White                                    0.109    0.006    0.092    0.010    0.071    0.010
Age                                      5.333†   0.272†   2.293†   0.513†   1.427†   0.509†
  Squared term                          -0.351†   0.015†  -0.293†   0.028†  -0.311†   0.027†
Education                               -0.011    0.002   -0.029    0.004   -0.030    0.004
  Squared term                           5.711†   0.133†   6.454†   0.231†   6.055†   0.229†
Type of Worker (Employer as baseline)
  Informal                              -0.545    0.012   -0.615    0.027   -0.593    0.027
  Formal                                -0.360    0.010   -0.435    0.025   -0.412    0.025
  Military service                      -0.247    0.015   -0.369    0.030   -0.339    0.029
  Self-Employed                         -0.635    0.011   -0.806    0.026   -0.783    0.026
Type of Activity (Manufacturing as baseline)
  Building                              -0.069    0.010    0.181    0.052    0.180    0.052
  Commerce                              -0.114    0.008   -0.001    0.017   -0.001    0.017
  Financial                             -0.024    0.009    0.154    0.019    0.147    0.019
  Social Services                       -0.044    0.011    0.082    0.018    0.081    0.018
  Domestic Services                     -0.135    0.012   -0.067    0.017   -0.059    0.017
  Other Services (a)                    -0.025    0.008    0.035    0.017    0.031    0.016
  Other Activities (b)                  -0.272    0.027   -0.080    0.082   -0.088    0.081
Metropolitan Region (Recife as baseline)
  Salvador                               0.116    0.023    0.117    0.023    0.077    0.016
  Belo Horizonte                         0.310    0.021    0.311    0.021    0.241    0.014
  Rio de Janeiro                         0.311    0.021    0.313    0.021    0.218    0.015
  São Paulo                              0.461    0.021    0.460    0.021    0.342    0.016
  Porto Alegre                           0.362    0.022    0.360    0.022    0.224    0.021
Duration of Employment (×120)            0.216    0.005    0.239    0.009    0.234    0.009
  Squared term (×120)                   -0.049    0.002   -0.068    0.005   -0.068    0.005
Working Hours (in Log)                   0.459    0.008    0.494    0.011    0.498    0.011

Interaction Terms of Male and:
  White                                      -        -    0.024    0.011    0.024    0.011
  Age                                        -        -    0.004    0.001    0.004    0.001
    Squared term                             -        -    0.000    0.000    0.000    0.000
  Education                                  -        -    0.026    0.005    0.026    0.005
    Squared term                             -        -   -0.001    0.000   -0.001    0.000
  Type of Worker (Employer as baseline)
    Informal                                 -        -    0.074    0.030    0.072    0.030
    Formal                                   -        -    0.079    0.027    0.080    0.027
    Military service                         -        -    0.167    0.035    0.166    0.034
    Self-Employed                            -        -    0.217    0.028    0.215    0.028
  Type of Activity (Manufacturing as baseline)
    Building                                 -        -   -0.298    0.053   -0.286    0.053
    Commerce                                 -        -   -0.152    0.019   -0.156    0.019
    Financial                                -        -   -0.239    0.022   -0.240    0.021
    Social Services                          -        -   -0.196    0.023   -0.199    0.023
    Domestic Services                        -        -   -0.312    0.037   -0.312    0.037
    Other Services                           -        -   -0.086    0.019   -0.085    0.019
    Other Activities                         -        -   -0.257    0.087   -0.241    0.086
  Duration of Employment (×120)              -        -   -0.034    0.010   -0.034    0.010
    Squared term (×120)                      -        -    0.027    0.005    0.027    0.005
  Working Hours (in Log)                     -        -   -0.094    0.016   -0.092    0.016

Contextual Effects
  Proportion of White                        -        -        -        -    0.104    0.029
  Average Age                                -        -        -        -    0.009    0.002
  Proportion of Formal Workers               -        -        -        -   -1.324    0.111
  Proportion of Informal Workers             -        -        -        -   -1.606    0.131
  Proportion of Military Workers             -        -        -        -   -1.866    0.124
  Proportion of Self-Employed Workers        -        -        -        -   -1.574    0.125
  Proportion with Proxy Respondent           -        -        -        -    0.157    0.040
  Average Education                          -        -        -        -    0.073    0.004

$\hat{\sigma}^2_u$                       0.055    0.002    0.055    0.002    0.011    0.001
$\hat{\sigma}^2_e$                       0.301    0.002    0.298    0.002    0.297    0.002
$\hat{\rho}$                             0.154             0.156             0.035
Number of Observations                  54,663            54,663            54,663
-2×Log-Likelihood                       93,056            92,500            90,084

(1) Model with level one main effects and metropolitan region variable.

(2) Model adding interaction terms.

(3) Model adding other contextual variables.

(a) Other Services include services such as post offices, housing, food, personal, urban cleaning and aerial transportation.

(b) Other Activities include all other activities not yet classified, such as agriculture, fishing, forestry, international organizations and non-specified activities.

† Values at $10^{-3}$.