
2.5 Results

2.5.2 Causal analysis

However, these results are contingent on the exogeneity of PGi, an assumption that is relaxed in Table 2.3. The first column corrects for measurement error in PGi by exploiting the correlation between the DSM definition of PGi and the score from the alternative screen (PGSI). The Staiger and Stock (1997) rule of thumb that the first-stage F-statistic should exceed 10 is easily satisfied for the results in the first column of Table 2.3, suggesting that the PGSI score is a very strong instrument. The second column also includes parental PG in the instrument set in an attempt to deal with the second potential source of endogeneity. The F-statistic still satisfies the rule of thumb. In each case presented in Table 2.3, the PGi coefficient is substantially larger in magnitude than that in Table 2.2. The crucial welfare effect, δ/γ, is virtually unchanged irrespective of whether Parental PG is added to the instrument set. However, column 3, which uses ParentalPGi alone as an IV, does not produce as large an F-statistic in the first stage as the other cases. Because a relatively weak instrument can leave the IV estimate even more biased than the OLS estimate, it is not surprising that the estimate of δ becomes much larger in this column.
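The mechanics of correcting measurement error with a second noisy measure can be sketched as follows. This is a minimal illustration on simulated data, not the estimation behind Table 2.3: the variable names, the data-generating process, and the continuous (rather than binary) PG measure are all assumptions made for the example.

```python
import numpy as np

def ols(X, y):
    """OLS coefficients and residuals."""
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    return beta, y - X @ beta

def first_stage_f(endog, exog, instruments):
    """F-test that the excluded instruments are jointly zero in the first stage."""
    full = np.column_stack([exog, instruments])
    _, e_r = ols(exog, endog)   # restricted: controls only
    _, e_u = ols(full, endog)   # unrestricted: controls plus instruments
    q = full.shape[1] - exog.shape[1]
    dof = len(endog) - full.shape[1]
    return ((e_r @ e_r - e_u @ e_u) / q) / (e_u @ e_u / dof)

def two_sls(y, endog, exog, instruments):
    """2SLS coefficients; the last element is the coefficient on endog."""
    Z = np.column_stack([exog, instruments])
    fitted = Z @ ols(Z, endog)[0]            # first-stage fitted values
    return ols(np.column_stack([exog, fitted]), y)[0]

# Simulated data: every number below is hypothetical.
rng = np.random.default_rng(0)
n = 20_000
latent = rng.normal(size=n)                  # true, unobserved gambling severity
pg = latent + rng.normal(size=n)             # first screen, measured with error
score = latent + rng.normal(size=n)          # second screen, independent error
ln_y = rng.normal(size=n)
w = 0.5 * ln_y - 2.0 * latent + rng.normal(size=n)

exog = np.column_stack([np.ones(n), ln_y])
beta_ols = ols(np.column_stack([exog, pg]), w)[0]
beta_iv = two_sls(w, pg, exog, score)
f_stat = first_stage_f(pg, exog, score)
print(f"OLS delta: {beta_ols[-1]:.2f}, IV delta: {beta_iv[-1]:.2f}, F: {f_stat:.0f}")
```

In this simulated design the OLS coefficient is attenuated towards zero while the IV coefficient recovers the value used to generate the data, which mirrors the pattern described above.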

Forrest (2016) is rightly suspicious of the ability of these data to yield a valid instrument for PG. Wardle et al. (2011) note (in their Table 6.3) that individuals with parents who were problem gamblers were themselves five times more likely to be problem gamblers (as defined by DSM) than those whose parents were not. But it is not sufficient that the instrument be correlated with the endogenous variable. It must also be the case that the only transmission route by which ParentalPGi affects Wi is through its effect on PGi. In the just-identified case, whether the instrument has a direct effect on the dependent variable of interest, rather than an effect solely via the endogenous variable, is not something that can be readily inferred, so the validity of the instrument(s) remains an article of faith. One might argue that ParentalPGi in the past has an effect on current own well-being other than through its effect on own PGi which, if true, would invalidate the instrument. However, the difference in W for those with ParentalPG = 1 compared to 0 is statistically insignificant in the raw data.

Table 2.3: IV estimated parameters of interest
Dependent variable: Wi

Instruments:               PGSI score     Parental PG and PGSI score     Parental PG

Ln Y (γ)                   0.529***       0.529***                       0.458***
                           (0.0589)       (0.0567)                       (0.0873)
PG (δ)                     -2.482***      -2.498***                      -18.072*
                           (0.6079)       (0.5972)                       (9.5863)
δ/γ                        -4.695***      -4.724***                      -39.462*
                           (1.2869)       (1.2512)                       (25.0554)
CV (£b, pa)                46.0           46.3                           386

First stage F-statistic    4890.97***     2488.75***                     28.41***

Notes: Estimated standard errors, obtained from bootstrapping, are in parentheses. ***/**/* indicates statistical significance at the 1%/5%/10% level. The F-statistic follows the Stock-Yogo definition; using the Windmeijer definition for multiple instruments in column 2 produces very similar results. Female, age, age squared, marital status, and ethnicity are included as control variables, and full results are presented in Appendix Table A2.15. The first stage estimates for this specification, and for the alternative PGSI definition of PG, are presented in Appendix Table A2.16. Corresponding results where PG is defined using the PGSI score and instrumented using the DSM score are provided in Appendix Table A2.17.
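Because δ/γ is a ratio of two estimated coefficients, a bootstrap is a natural way to obtain its standard error. The sketch below shows one way this could be done with a pairs bootstrap around a just-identified IV estimator. The data are simulated, the function names `iv_ratio` and `bootstrap_se` are hypothetical, and the resampling scheme is an assumption; the notes above do not say which bootstrap variant was used.

```python
import numpy as np

def iv_ratio(w, ln_y, pg, z):
    """Just-identified IV estimate of delta/gamma, with z instrumenting pg."""
    X = np.column_stack([np.ones_like(w), ln_y, pg])   # regressors
    Z = np.column_stack([np.ones_like(w), ln_y, z])    # instruments
    beta = np.linalg.solve(Z.T @ X, Z.T @ w)           # (Z'X)^{-1} Z'w
    gamma, delta = beta[1], beta[2]
    return delta / gamma

def bootstrap_se(w, ln_y, pg, z, reps=200, seed=0):
    """Pairs bootstrap: resample whole observations with replacement."""
    rng = np.random.default_rng(seed)
    n = len(w)
    draws = np.empty(reps)
    for r in range(reps):
        idx = rng.integers(0, n, size=n)
        draws[r] = iv_ratio(w[idx], ln_y[idx], pg[idx], z[idx])
    return draws.std(ddof=1)

# Simulated data: hypothetical values, chosen so that delta/gamma = -4.
rng = np.random.default_rng(0)
n = 20_000
latent = rng.normal(size=n)
pg = latent + rng.normal(size=n)        # noisy measure used as the regressor
score = latent + rng.normal(size=n)     # second noisy measure used as the instrument
ln_y = rng.normal(size=n)
w = 0.5 * ln_y - 2.0 * latent + rng.normal(size=n)

ratio_hat = iv_ratio(w, ln_y, pg, score)
se_hat = bootstrap_se(w, ln_y, pg, score)
print(f"delta/gamma: {ratio_hat:.2f} (bootstrap s.e. {se_hat:.2f})")
```

Resampling whole observations keeps the joint distribution of outcome, regressors, and instrument intact, which is why it is a common default for ratios of IV coefficients.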

It has been argued that in the over-identified case, where one or more valid instruments already exist, it is possible to test the validity of a second instrument, conditional on the validity of the first, using the Sargan-Hansen test (see Sargan, 1958, and Hansen, 1982). It seems likely that the PGSI score is a valid IV for PG, as defined by the DSM screen, since both screens have been designed with the objective of assigning PG status and inspection of the questions in the appendix below suggests a lot of overlap across the two screens. Indeed, the overlap is so great that the conclusions of this chapter do not depend greatly on which screen is used (see Appendix Table A2.17 for PGSI results). Moreover, even though parental PG is found to be a statistically valid IV, and so yields a consistent estimate of the effect of own PG on W, there is still the question of how one interprets the resulting estimate. In a model with heterogeneous effects, while OLS estimation yields a biased estimate of the average effect of PG on W, this is not the case with IV. However, while an IV estimate is consistent, IV does not necessarily yield an estimate of an average effect in the same way as OLS does. In particular, IV estimates are best interpreted as the causal effect of the treatment (in this case, PG) on individuals who are treated by virtue of the instrument. This is referred to in the literature as a Local Average Treatment Effect (LATE). In the present case, exogenous variation in PG occurs only for the group of individuals who are PG by virtue of parental PG, the complier group.
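The LATE interpretation can be illustrated with a small simulation. The sketch below assumes a binary instrument, no defiers, and no covariates. The group shares and effect sizes are hypothetical and are not estimates from the data used in this chapter.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000
z = rng.integers(0, 2, size=n)                         # binary instrument
kind = rng.choice(["complier", "always", "never"], size=n, p=[0.2, 0.1, 0.7])

# Treatment status: always-takers are treated regardless of z,
# compliers are treated only when the instrument is switched on.
d = (kind == "always") | ((kind == "complier") & (z == 1))

# Heterogeneous effects: -3 for compliers, -1 for everyone else.
effect = np.where(kind == "complier", -3.0, -1.0)
y = effect * d + rng.normal(size=n)

# Wald (IV) estimator: reduced form divided by first stage.
wald = (y[z == 1].mean() - y[z == 0].mean()) / (d[z == 1].mean() - d[z == 0].mean())

# Average effect among everyone who is actually treated.
att = effect[d].mean()
print(f"Wald/IV estimate: {wald:.2f}, average effect on the treated: {att:.2f}")
```

In this design the IV estimate is close to the complier effect rather than to the average effect among all treated individuals, which is the sense in which the estimate is local.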

Some applied econometricians have argued that, while the IV analysis here does not necessarily obtain an estimate of the average effect of the treatment in question on a readily identifiable population, it nonetheless estimates something that is still relevant for policy. In contrast, others argue that a LATE estimate is not useful and must be augmented with something else (e.g. a structural econometric model) to produce economically meaningful parameters. Unlike the schooling reform instrument used by Harmon and Walker (1995) to estimate the causal effect of education on wages, it is difficult to argue here that the adults were so affected by their parents' PG that they became PG themselves, especially because this group is so small. In particular, it is quite conceivable that Parental PG makes some people more likely to be PG (compliers) through some common environment or even genes, while others are defiers, that is, people who observed that their parents were PG and were determined not to become like that. Formally, IV LATE estimates are a weighted average of the defier and complier estimates.
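The weighted-average statement can be written out for the simplest case of a binary instrument Z_i (here ParentalPG_i) and no covariates. The notation is introduced for this illustration only: \pi_c and \pi_d denote the population shares of compliers and defiers, and \Delta_c and \Delta_d the average effect of PG on W within each group.

```latex
\[
\frac{E[W_i \mid Z_i = 1] - E[W_i \mid Z_i = 0]}
     {E[PG_i \mid Z_i = 1] - E[PG_i \mid Z_i = 0]}
  = \frac{\pi_c \, \Delta_c - \pi_d \, \Delta_d}{\pi_c - \pi_d}
  = \frac{\pi_c}{\pi_c - \pi_d}\,\Delta_c
    \; - \; \frac{\pi_d}{\pi_c - \pi_d}\,\Delta_d .
\]
```

The two weights sum to one, but the weight on the defier effect is negative whenever \pi_c > \pi_d > 0, so the IV estimand need not lie between \Delta_c and \Delta_d. With no defiers (\pi_d = 0) the expression collapses to \Delta_c, the complier LATE.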

In the specification using both the PGSI score and parental problem gambling as instruments, Hansen's J-test for over-identification fails to reject the null hypothesis that parental PG is a valid instrument, conditional on the validity of PGSI. However, the Hansen test (and the earlier Sargan test) is not generally applicable in a model with heterogeneous effects. Fortunately, it is not necessary to rely on this test because Table 2.3 suggests that nothing much hangs on the case for using Parental PG as an IV: the same results are obtained when Parental PG is omitted and the alternative PGSI score is the only instrument. The welfare-relevant parameter, δ/γ, is virtually the same in both columns. The suggestion is that it is the measurement error in PG that accounts for much of the bias in the OLS estimate in column 1 of Table 2.2. This is fortunate since it implies that we can extrapolate from the IV estimates. Thus, taking δ/γ to be -4.7 implies an average welfare effect of around £110k pppa, which aggregates to approximately £37b pa.
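The logic of an over-identification test can be sketched with the homoscedastic Sargan version, which is simpler to write down than Hansen's J: regress the 2SLS residuals on the full instrument set and compare n times the R-squared with a chi-squared distribution. The code below is an illustration on simulated data with homogeneous effects. It is not the test reported above, and all names and parameter values are hypothetical.

```python
import math
import numpy as np

def sargan(y, X, Z):
    """Sargan statistic and degrees of freedom.

    X holds the regressors (including the endogenous one); Z holds the full
    instrument set (exogenous regressors plus excluded instruments).
    """
    n = len(y)
    X_hat = Z @ np.linalg.lstsq(Z, X, rcond=None)[0]    # first-stage projections
    beta = np.linalg.lstsq(X_hat, y, rcond=None)[0]     # 2SLS coefficients
    u = y - X @ beta                                    # residuals use the original X
    u_fit = Z @ np.linalg.lstsq(Z, u, rcond=None)[0]
    stat = n * (u_fit @ u_fit) / (u @ u)                # n times R-squared
    return stat, Z.shape[1] - X.shape[1]

# Simulated data: two instruments, one endogenous regressor.
rng = np.random.default_rng(2)
n = 5_000
z1, z2, v, eps = rng.normal(size=(4, n))
d = z1 + z2 + v                     # endogenous regressor
e = 0.5 * v + eps                   # structural error, correlated with d through v
y_ok = -2.0 * d + e                 # both instruments valid
y_bad = y_ok + 1.0 * z2             # z2 now has a direct effect: invalid

const = np.ones(n)
X = np.column_stack([const, d])
Z = np.column_stack([const, z1, z2])
stat_ok, df = sargan(y_ok, X, Z)
stat_bad, _ = sargan(y_bad, X, Z)

# With one over-identifying restriction, the chi-squared(1) upper tail is erfc(sqrt(x/2)).
p_ok = math.erfc(math.sqrt(stat_ok / 2))
p_bad = math.erfc(math.sqrt(stat_bad / 2))
print(f"valid: J={stat_ok:.2f}, p={p_ok:.3f}; invalid: J={stat_bad:.1f}, p={p_bad:.3g}")
```

The test has power here only because the two instruments imply different coefficients once one of them is invalid; it cannot detect a violation that shifts both instruments' implied estimates in the same way.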

A possible alternative to the instrumental variables procedure for causal identification described above can be found in Lewbel (2012), who builds upon earlier work by Klein and Vella (2006). This method uses a subset of the regressors from the model which are uncorrelated with the product of the heteroscedastic errors to construct instruments in a two-stage routine to identify coefficients on the endogenous variable(s). Because these instruments are, by construction, uncorrelated with the error terms, this approach also facilitates the use of the Sargan-Hansen test discussed above when there are not enough instrument candidates to achieve over-identification. As such, a secondary benefit of this approach is to provide evidence in support of otherwise suspect instruments.
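The construction of heteroscedasticity-based instruments can be sketched in three steps: estimate the first-stage residuals, multiply them by a demeaned exogenous regressor, and use the product as an instrument. The simulation below is a minimal illustration under assumptions of its own: a continuous endogenous regressor, a single exogenous regressor, and a data-generating process in which the identifying conditions hold by design. It does not reproduce the specification in Appendix Table A2.18.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 20_000
x = rng.normal(size=n)                            # exogenous regressor
u = rng.normal(size=n)                            # common shock: the source of endogeneity
e1 = u + rng.normal(size=n)                       # outcome-equation error
e2 = u + np.exp(0.5 * x) * rng.normal(size=n)     # first-stage error, heteroscedastic in x

gamma_true = 1.0                                  # hypothetical coefficient of interest
y2 = 1.0 + x + e2                                 # endogenous regressor
y1 = 1.0 + x + gamma_true * y2 + e1               # outcome

exog = np.column_stack([np.ones(n), x])

# Step 1: residuals from regressing the endogenous variable on the exogenous regressors.
e2_hat = y2 - exog @ np.linalg.lstsq(exog, y2, rcond=None)[0]

# Step 2: constructed instrument, the demeaned regressor times the first-stage residual.
z_gen = (x - x.mean()) * e2_hat

# Step 3: just-identified IV using the constructed instrument.
X = np.column_stack([exog, y2])
Z = np.column_stack([exog, z_gen])
gamma_iv = np.linalg.solve(Z.T @ X, Z.T @ y1)[-1]
gamma_ols = np.linalg.lstsq(X, y1, rcond=None)[0][-1]
print(f"OLS: {gamma_ols:.2f}, constructed-instrument IV: {gamma_iv:.2f}")
```

Identification here rests entirely on the first-stage error variance depending on x while the product of the two errors does not, which is an assumption built into the simulation rather than something the data can confirm.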

However, when the endogenous variable is binary, as in the present case, the utility of using heteroscedasticity per Lewbel (2012) to achieve identification has come under scrutiny (see, for example, Emran, Robano and Smith, 2012). Lewbel (2016) demonstrates that exploiting heteroscedasticity to identify the model is possible when the endogenous regressor is binary but notes that it requires a very strong distributional restriction on the error term. Moreover, the required assumption is not testable and is as much an article of faith as regular IV techniques. As such, using this approach, even solely as a means to conduct the aforementioned Sargan-Hansen test of instrument validity, will likely yield misleading results. Nonetheless, the null hypothesis of this test is again not rejected. The resultant CV estimates, though smaller than those obtained using PGSI and parental PG as instruments alone, are still greater than those obtained via OLS. This provides further evidence for the conclusion of this chapter that the OLS estimates provide a lower bound on the cost of PG. The results from using Lewbel's identification methodology therefore make only a marginal contribution to the evidence already presented; the estimates can be found in Appendix Table A2.18.