5.4.1 Sample sizes and mean characteristics

The individual characteristics used for matching are the same as in the one-way matching methodology described in Section 5.2. In order to obtain sufficient accuracy in the matching, and at the same time to retain sufficient sample sizes, young men and women in rural and urban areas have been considered together for this two-way matching.¹⁷ Not surprisingly, the characteristics of those eligible for EMA and those who are ineligible tend to be significantly different in a number of respects, particularly income and socio-economic status. For this reason, once people are matched according to their propensity to be eligible for EMA, as well as according to their propensity to be in a pilot area, there are quite a number of individuals who cannot be matched at all. The smaller matched samples contain individuals who are much more similar to each other than the unmatched groups and there is, therefore, considerable convergence in mean characteristics of the different groups once matching has taken place.

The original unmatched sample sizes and the sample sizes after matching are shown in Appendix 5.3. Details are included for two sets of matching rules, which differ in how similar individuals and their nearest match are required to be for them to be included in the analysis. The first set uses a maximum closeness of 0.7, while the second imposes a maximum closeness of 0.3.¹⁸ Considering the group of eligible individuals in a pilot area to whom each of the other individuals has been matched, the original unmatched sample contains 4,716 young people. Once all those who could not be closely enough matched to a young person in each of the other three groups have been dropped from the analysis, the sample size is reduced to 2,373 using the 0.7 matching rule, or approximately one half of the original sample. Using the 0.3 rule, the usable matched sample decreases even more, to 1,457, or less than one-third of the original sample. Such large losses in sample size arise because only those eligibles who look most like the ineligibles in their characteristics (i.e. have the lowest propensity score to be eligible for EMA) will be similar enough to be matched to an ineligible individual. Similarly, amongst the ineligible groups, only those who look most like the eligibles in their characteristics (i.e. have the highest propensity score to be eligible for the EMA) can be used as matches. Again, as with the one-way matching, the numbers in each cell (pilots, controls, eligibles and ineligibles) are identical after matching.

16 Raw effect is the effect from comparing education participation in a matched sample of those eligible for the EMA in the pilot areas with those who would have been eligible in the control areas, i.e. before any area specific effects are controlled for.

17 Unfortunately this means that it is not possible to calculate disaggregated results looking at the effect of
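The matching rule described above can be illustrated with a minimal sketch. It is not the evaluation's actual procedure: the distance measure behind the 0.7 and 0.3 thresholds is not stated in this section, so Euclidean distance over the two propensity scores is an assumption, as are the function names, the toy scores and the toy thresholds. The sketch only shows the logic of keeping an eligible pilot individual when a close enough match exists in each of the other three groups.

```python
import math
from typing import Dict, Optional, Sequence, Tuple

# Hypothetical representation: (propensity to be eligible, propensity to be in a pilot area).
Score = Tuple[float, float]


def nearest_within_caliper(score: Score, pool: Sequence[Score], caliper: float) -> Optional[int]:
    """Return the index of the closest pool member, or None if even the
    closest one is further away than `caliper` (assumed Euclidean distance)."""
    if not pool:
        return None
    best = min(range(len(pool)), key=lambda j: math.dist(score, pool[j]))
    return best if math.dist(score, pool[best]) <= caliper else None


def match_to_all_groups(
    targets: Sequence[Score],
    comparison_groups: Dict[str, Sequence[Score]],
    caliper: float,
) -> Dict[int, Dict[str, int]]:
    """Keep a target only if every comparison group contains a close-enough match."""
    kept: Dict[int, Dict[str, int]] = {}
    for i, score in enumerate(targets):
        matches = {
            name: nearest_within_caliper(score, pool, caliper)
            for name, pool in comparison_groups.items()
        }
        if all(m is not None for m in matches.values()):
            kept[i] = matches
    return kept


# Toy data, invented purely for illustration.
eligible_pilot = [(0.90, 0.50), (0.60, 0.50), (0.55, 0.45)]
others = {
    "eligible_control": [(0.88, 0.40), (0.60, 0.48)],
    "ineligible_pilot": [(0.50, 0.50), (0.30, 0.60)],
    "ineligible_control": [(0.52, 0.44), (0.20, 0.40)],
}

loose = match_to_all_groups(eligible_pilot, others, caliper=0.15)
strict = match_to_all_groups(eligible_pilot, others, caliper=0.08)
print(len(loose), len(strict))
```

In this toy example the individual with the highest eligibility score has no close ineligible counterpart and is dropped under both thresholds, and tightening the threshold removes a further individual. This mirrors the pattern in the text, where the stricter rule leaves the smaller sample.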

The sample selection employed by matching can be seen clearly by the difference in mean characteristics of the samples before and after matching has taken place. Whereas there are wide divergences in the average characteristics of the unmatched samples between eligibles and ineligibles, the differences are much smaller amongst those who have been successfully matched. The stricter the matching rule, the closer the balance of characteristics amongst the final matched samples. Table A.5.3 in Appendix 5.3 also shows the mean of some of the characteristics upon which matching has taken place in each of the groups, both in the matched and unmatched samples.

The first characteristic described in the table is average family income in each of the groups. Note that this variable was not used in the matching process, because income determines eligibility through the EMA means test, so no matching could take place on this basis. However, through the process of matching on other related variables, average family incomes in the different groups are also drawn considerably closer. Prior to matching, the mean family income amongst the group of eligibles in the pilots and controls was approximately £15,400 per annum, compared to approximately £46,000 amongst the ineligible groups. After matching, it is mostly young people at the upper end of the income scale amongst eligibles who have been used (in practice these young people will tend to be on the EMA taper rather than eligible for the full EMA). Therefore, the average income of the group of eligibles used in both pilots and controls rises to approximately £18,400 when the 0.7 rule is applied, and to £19,500 when the 0.3 rule is in operation. The average incomes of the ineligibles who have been used as matches are also considerably lower than amongst the unmatched group, at approximately £40,000.
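The convergence in incomes can be made concrete with a short calculation on the rounded figures quoted above. The text gives a single figure of approximately £40,000 for matched ineligibles without distinguishing between the two rules, so applying it to both is an assumption of this sketch; the variable and function names are illustrative only.

```python
# Approximate mean family incomes (GBP per annum) as quoted in the text.
# Assumption: the ~40,000 figure for matched ineligibles applies under both rules.
reported_mean_income = {
    "unmatched": {"eligible": 15_400, "ineligible": 46_000},
    "0.7 rule": {"eligible": 18_400, "ineligible": 40_000},
    "0.3 rule": {"eligible": 19_500, "ineligible": 40_000},
}


def income_gap(means: dict) -> int:
    """Difference between ineligible and eligible mean family income."""
    return means["ineligible"] - means["eligible"]


gaps = {sample: income_gap(means) for sample, means in reported_mean_income.items()}
for sample, gap in gaps.items():
    print(f"{sample}: gap of about {gap:,} per annum")
```

On these rounded figures the gap narrows by roughly a third after matching but remains substantial, which is consistent with income not having been a matching variable.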

Examining some of the other characteristics upon which the matching has taken place, prior to matching as many as 31 per cent of the EMA eligible population in the pilot areas lived in council or housing association accommodation, compared to only two per cent of the ineligible pilots and controls. By contrast, only seven per cent of the 0.7 rule sample of eligibles in pilot areas are of this tenure type, and just four per cent of those who have been matched using the 0.3 rule. The proportion of those in the matched sample of ineligibles who live in such accommodation is also higher than in the unmatched samples, so that matching achieves near convergence in the sample characteristics. A similar pattern can be seen in many of the other characteristics. For example, the proportion of eligible young people in pilot areas whose father is in full-time work is 41 per cent in the unmatched sample, compared to approximately 85 per cent of those ineligibles in pilot and control areas. Using the samples matched according to the 0.3 rule, the proportion of each group whose father works full-time is more or less equal, at between 68 and 70 per cent.

The implications of dropping such a large proportion of eligible young people from the sample, and using this smaller group of relatively better-off eligible individuals and relatively worse-off ineligibles to test for area effects, are not necessarily as serious as they might seem. This smaller sample is being used only to see whether it is likely that unobserved area effects are biasing the estimated EMA effects. Given that it is unobserved area effects, rather than unobserved individual effects, that we are trying to control for, it is unlikely that these effects will vary greatly for different groups of individuals. This, however, is untestable in these data. If we are willing to assume that these area effects, such as differences in school quality between areas, impact similarly on all groups, then the estimate of the unobserved area effect can be used for the eligible EMA population at large, not just the sub-group from which the effect is estimated.
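The reasoning above can be sketched as a simple calculation. It assumes, as the text does, that area effects impact similarly on all groups, and it further assumes that the pilot-control gap amongst matched ineligibles (who do not receive EMA) is what estimates the area effect. The participation rates below are invented for illustration and are not results from the evaluation; the function names are hypothetical.

```python
def raw_effect(eligible_pilot: float, eligible_control: float) -> float:
    """Pilot-control gap in participation amongst eligibles, before any
    area-specific effects are controlled for."""
    return eligible_pilot - eligible_control


def area_effect(ineligible_pilot: float, ineligible_control: float) -> float:
    """Pilot-control gap amongst matched ineligibles, taken here (by assumption)
    as the estimate of the unobserved area effect."""
    return ineligible_pilot - ineligible_control


def adjusted_effect(eligible_pilot: float, eligible_control: float,
                    ineligible_pilot: float, ineligible_control: float) -> float:
    """Raw effect net of the estimated area effect."""
    return (raw_effect(eligible_pilot, eligible_control)
            - area_effect(ineligible_pilot, ineligible_control))


# Hypothetical participation rates in percentage points, NOT figures from the report.
example = adjusted_effect(
    eligible_pilot=70.0, eligible_control=65.0,
    ineligible_pilot=82.0, ineligible_control=81.0,
)
print(example)
```

If the area effect did differ between eligibles and ineligibles, this subtraction would over-correct or under-correct the raw effect, which is why the assumption matters even though it cannot be tested in these data.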