4.3 Preliminary Analysis
4.3.1 Data Selected for Cross-sectional Analysis
Chapter 3 presented a description of the PME survey and its drawbacks in re- spect to the matching of households and individual records throughout the survey months. One alternative found in that chapter was to make use of the subset of the heads of household in what was defined as the general working data set. This working data set is formed by pooling data from the 2004 and 2005 PME surveys. In order to conduct a cross-sectional analysis of the PME data, where the number of measurement occasions equals one, the cross-sectional working data set to be used in this chapter only considers the first interview of these heads of households. In this sense, only the PSU and the heads of household levels are accounted for.
The analysis sample in this chapter, therefore, contains the first occasion data of the 57,412 employed heads of household. The sample distribution of the labour income, as expected, was shown to be very skewed to the right. Therefore, the logarithm transformation is applied and Figure 4.1 presents the sample histogram of the log-transformed income. It is worth mentioning that this labour income is expressed in real terms, that is, it is inflated to prices of September 20064.
Figure 4.1: Sample Distribution of the Log-transformed Real Labour Income
0 .2 .4 .6 .8 Density 2 4 6 8 10 12 Log−Income (Real)
4This was the price index available at the time this data set was constructed. The nominal income was adjusted for inflation relative to the base month of September 2006.
The PME data set has limited choices of individual and household variables. Table 4.1 presents the distribution of the heads of household in the analysis sample according to some of their characteristics and their jobs characteristics. A better explanation of some of the variables in Table 4.1 is presented in Table 4.2 at the end of this sub-section. The results in this table were generated taking the PME sampling weights and the design variables into account. It was performed using Stata svy commands (STATA Press, 2005). Note that, due to the way the cross- sectional data was selected, the survey weights available for each head of household are from different survey periods. As the records selected in the working cross- sectional data set are from January 2004 to December 2005, the percent of the population and the average real income, as presented in Table 4.1, are over a 24 month period.
The PME covers the urban areas of the six main metropolitan areas of Brazil. S˜ao Paulo is the largest metropolitan area. In the PME, the employment status of the population is investigated for those aged 10 or older. Table 4.1 shows the age distribution of the selected heads of household. They are more concentrated in the age group between 40 and 49 years of age. In addition, employed heads of household are in the majority well educated; 43% have 11 or more years of schooling, which represents the completion of the equivalent of high school or over. The second most frequent education level is for those that have from 4 to 7 years of schooling. This represents attending basic education, but not completing it.
Table 4.1: Distribution of Employed Heads of Households and Average Labour Income by Selected Covariates
Distribution of Average employed HoHH (%) Income Metropolitan Regions Recife 6.07 853 Salvador 6.85 994 Belo Horizonte 9.65 1,165 Rio de Janeiro 26.57 1,126 S˜ao Paulo 42.07 1,438 Porto Alegre 8.79 1,172 Age Group 10 to 19 0.37 491 20 to 29 13.47 812 30 to 39 28.68 1,137 40 to 49 30.92 1,368 50 to 59 18.96 1,452 60 or over 7.60 1,439 Gender Men 73.30 1,375 Women 26.70 878 Skin Colour White 56.47 1,604 Black 9.09 663 Yellow 0.78 2,933 Parda (Mixed) 33.55 765 Indigenous 0.11 749
Collapsed Skin Colour
Others 43.53 783 White 56.47 1,604 Education in Years of Schooling Less than 1 3.56 415 1 to 3 years 7.25 496 4 to 7 years 27.92 613 8 to 10 years 17.97 757 11 years or over 43.09 2,089 Not defined 0.21 501 Occupation Employee 68.30 1,167 Self-Employed 24.15 911 Employer 7.43 3,095 Unpaid Workers 0.11 - Informality Formal 71.73 1,189 Informal 28.27 786 Employees in the Private Sector 82.33 1,144 Public Sector 17.67 1,781
Sector & Informality
Formal Private Sector 62.92 1,216 Informal Private Sector 19.42 910 Formal Public Sector 3.23 1,540 Informal Public Sector 1.61 1,242 Military Services 12.83 1,918
Table 4.1 – continued from previous page
Distribution of Average employed HoHH Income Activities Manufacturing 18.60 1,328 Building 9.93 824 Commerce 18.71 1,055 Financial 13.84 1,822 Social Services 13.88 1,771 Domestic Services 6.31 359 Other Services 18.00 1,082 Other Activities 0.73 913 Duration of Employment
Average duration in months 98 - Hours Worked
Average hours worked per week 44 - Average Number
of Household Members 3 -
Total - 1,243
A recent study from IBGE pointed to an increase in the selection of women as the reference person in the household, with this proportion being almost 30% in August 2006 (IBGE, 2006b). Table 4.1 shows that the proportion of female heads of household in the analysis sample is 26.7%. The variable that measures skin colour in the PME is self-declared and classifies the population into five sub-groups: White, Black, Yellow (Asian descendants), Mixed and Indigenous. Table 4.1 shows that more than half of the heads of household classify themselves as white. A common practice adopted in studies of the Brazilian population is to create a dichotomous variable that represents white against others, collapsing all other categories. This is also presented in Table 4.1.
On job characteristics, Table 4.1 shows that these heads of household are in the majority employees (68%) or self-employed (24%). There is a quite small share of unpaid workers and only 7% are employers. The share of the employee heads of household can be further classified as formal (72%) and informal (28%) employees. Furthermore, Table 4.1 shows that few employees are in the public sector. The majority are formal employees in the private sector and around 13% are in military service. These characteristics are, however, only for employees.
Looking again at all employed heads of households, Table 4.1 shows that they are mostly engaged in the commerce and manufacturing activities as well as other services. Table 4.1 also presents average values for some continuous characteristics for this sub-set of employed heads of household, such as: their average duration of employment is of 98 months and they work on average 44 hours per week. The
last variable presented in Table 4.1 is the total number of household members. This variable is discrete with an average of 3 members in the household.
Table 4.1 also presents the average labour income according to these same characteristics. It can be observed from this table that employed heads of house- hold in the Southern metropolitan regions - Belo Horizonte, Rio de Janeiro, S˜ao Paulo and Porto Alegre - have higher average income than those in the northeast regions. S˜ao Paulo is the metropolitan region with higher average income. There is income differential for gender and skin colour: female heads of household have lower average income than males and whites higher than others. Older heads of household have higher average income as well as employers, and informal employ- ees earn less than formal ones. Heads of households employed in the public sector have higher average earnings than those in the private sector; both formal and informal heads of household earn more in the public sector than in the private sector. However, those in the military services have the highest earnings on aver- age. Heads of household engaged in financial activities earn on average more than on any other activity. To conclude, Table 4.1 shows that education returns are as expected, with higher average income for the best educated.
Table 4.2 presents the set of variables considered as explanatory variables in the models to follow. These variables were selected based on the review presented earlier in this chapter. These are almost all the variables from Table 4.1, excluding those that were applied to employee heads of household only. Furthermore, the variable for type of worker now breaks the employee category into two: formal and informal heads of household. Table 4.2 provides further explanation for each of the variables. In summary they are: interview month, dummies for male and white, age, education, number of members in the household, type of worker, type of activity, metropolitan region, duration of employment and working hours. There is also a dummy indicating whether the heads of household had their questionnaire completed by another member of the household and not themselves. Furthermore, squared terms for age and education are also considered. It is worth mentioning that, in this analysis sample, the income variable contains 4.8% of missing values. Therefore, heads of household with missing income are not included in the analysis that follows. Also note that the small sub-set of unpaid workers, which have zero labour income, is not included either. This brings the final data set to a total of 54,663 heads of households.
Table 4.2: Variables Included in the Analyses
Variables Categories Definition Metropolitan Regions Recife This is a cluster level variable.
Salvador It is the stratifying variable used Belo Horizonte in the design of the PME. Rio de Janeiro This variable is included in S˜ao Paulo the model as a control variable. Porto Alegre Recife is the baseline category.
Month of First Interview Represented in the model as a continuous variable from 0 to 23 starting with January 2004 till December 2005. Gender Females An indicator for male heads of household
Males is considered in the model. Age Group It is used in a continuous form.
The squared term is also considered. Under the mincer equation age is a proxy for experience. In the multilevel model age is centred around 40.
Skin Colour White This categorical variable is Black later collapsed to represent an Yellow indicator for whites against all Parda (Mixed) other categories, referred to as Indigenous others.
Education in It is used in a continuous form Years of Schooling from 0 to 17 years of schooling.
The squared term is also considered.
Type of Worker Employer Employer is the baseline category. Informal Employee Employees are now separated into Formal Employee formal and informal employees. Military Service
Self-Employed
Type of Activity Manufacturing Manufacturing is the baseline Building category. Commerce Financial Services Social Services Domestic Services Other Services Other Activities
Duration of Employment Continuous variable in months. Serves as a proxy for experience in the firm. Working Hours Working hours expressed on the natural
logarithm scale.
Proxy Respondent This variable is used as a control for hard to count heads of household. Number of Household Members This variable is used as a control for