Conclusions Concerning the Boston Fed Study

In conclusion, the key finding of the Boston Fed Study is unaffected when the model includes a complete accounting for the endogeneity of LTV, or of any other variable. In all of the models examined here, the impact of minority status on loan denial is statistically significant and close to the value in the Boston Fed Study’s equations. Moreover, we find that accounting for the endogeneity of special program choice actually boosts the impact of minority status on loan denial.

Munnell et al. (1996), otherwise known as the Boston Fed Study, has been subjected to a phenomenal, and in our experience unprecedented, volume of criticism. Critics have argued that the study’s main finding, that minority applicants are more than 80 percent more likely to be turned down for a loan than are equivalent white applicants (an indication of discrimination in mortgage lending), is biased upward because the study (a) omitted variables; (b) used data with errors in the explanatory variables; (c) used a dependent variable that misclassifies loan outcomes; (d) used the wrong algebraic form, also known as misspecification; and (e) failed to account for endogeneity in several different explanatory variables.

DOES DISCRIMINATION IN MORTGAGE LENDING EXIST? THE BOSTON FED STUDY AND ITS CRITICS

THE URBAN INSTITUTE

We have examined all of these arguments and, where possible, explored them with the public-use version of the Boston Fed Study’s data. In some cases, we find that the critics are simply wrong: The problem they identify does not exist, or the bias involved is empirically insignificant. In several cases, however, we agree with the critics that a limitation in the Boston Fed Study could potentially lead to a serious overstatement of discrimination, and we have explored these cases in detail. Moreover, we find that the literature has raised several important issues concerning the interpretation of the Boston Fed Study’s results.

This analysis leads us to six main conclusions.

First, we conclude that the large differences in loan denial between minority and white applicants identified by Munnell et al. cannot be explained by data errors, omitted variables, or the endogeneity of loan terms. No study has identified a reasonable procedure for dealing with any of these potential problems that eliminates the large positive impact of minority status on loan denial. One cannot, of course, prove that no bias exists in any particular equation, but one can examine all of the potential sources of bias identified by scholars for that case. Scholars have been unusually creative in identifying potential biases in the Boston Fed Study, but our analysis, based on the best data currently available, reveals that none of these potential biases can explain why the estimated minority-status coefficient in a loan denial equation is so large.

For example, some scholars have claimed that the “meets guidelines” variable should be included to correct for elements of applicants’ credit histories that are omitted from the explanatory variables in the Boston Fed Study. If this variable does indeed capture such omitted elements, however, then the unobserved factors influencing “meets guidelines” will be correlated with the unobserved factors influencing loan denial. We show that this is not the case. It follows that the “meets guidelines” variable does not correct for omitted variables. In addition, we find that accounting for the endogeneity of various loan terms never results in a substantial reduction in the estimated minority-status coefficient, and in some plausible cases this step actually makes that coefficient larger.

Second, we conclude that no study has demonstrated either the presence or the absence of disparate treatment discrimination in loan approval, at least not in a large sample of lenders.⁵⁵ This conclusion puts us at odds both with the authors of the Boston Fed Study, who claim that they measure disparate treatment discrimination, and with several of their critics, who claim that there is no discrimination at all.

In our view, the Boston Fed Study’s results measure disparate treatment discrimination only under the assumption that all lenders use the same underwriting guidelines. With this assumption, any group-based difference in treatment after controlling for underwriting variables implies that the guidelines are applied differentially across groups, which is, by definition, disparate treatment discrimination. Because virtually all lenders sell some of their loans in the secondary mortgage market, they have some incentive to use the underwriting guidelines that institutions in that market, such as Fannie Mae, have established. However, many loans are not sold in the secondary market, and the lending process often involves many individuals in the same lending institution, who may not all have the same incentives. Even on conceptual grounds, therefore, the same-guidelines assumption is a strong one, and no existing empirical study can confirm (or deny) it.

MORTGAGE LENDING DISCRIMINATION: A REVIEW OF EXISTING EVIDENCE

Third, we conclude that no study has demonstrated either the presence or the absence of disparate impact discrimination in loan approval. The Boston Fed Study’s results measure disparate impact discrimination only under the assumptions that (a) different lenders use different underwriting guidelines; (b) existing guidelines are accurately linked to loan profitability, on average; and (c) existing deviations from average guidelines cannot be justified on the basis of business necessity. These assumptions could be satisfied, for example, if underwriting guidelines vary across lenders solely for idiosyncratic reasons or if some lenders purposefully develop guidelines that have a disparate impact on minority applicants. However, no existing study sheds light on whether these assumptions are met.

Fourth, following the logic of civil rights legislation, the Boston Fed Study establishes the presumption that in 1990 lenders in Boston engaged in either disparate treatment discrimination, disparate impact discrimination, or both. This presumption can be rebutted only with evidence that the observed minority-white differences in loan approval can be entirely explained by profit-based differences in the underwriting guidelines used by the lenders to which minorities and whites applied. To use the legal term, the Boston Fed Study builds a prima facie case that discrimination exists. If such a case were made in a courtroom setting, the burden of proof would shift to lenders. To escape the conclusion that they are discriminating, lenders would have to prove that their actions were based on “business necessity,” that is, that they used underwriting guidelines with a clear connection to the return on loans, that they applied these guidelines equally to all groups, and that no equally profitable guidelines without a disparate impact on minority applicants were available. Our conclusion builds on the spirit of this legal standard. In our view, the Boston Fed Study builds a strong prima facie case for discrimination, and no scholar has come close to showing that the observed intergroup differences in loan approval in Boston can be justified in business terms.

In fact, the available evidence, while far from conclusive, suggests that business necessity is unlikely to explain a large share of the observed minority-white difference in loan denial. In particular, legitimate differences in underwriting guidelines must be associated with real differences in lenders’ experiences. They are therefore most likely to arise between lenders that specialize in groups of borrowers with different average creditworthiness. Thus, if differences across lenders in legitimate underwriting criteria have a major impact on the observed minority-white difference in loan denial, then allowing underwriting criteria to vary across lenders should dramatically lower the estimated minority-status coefficient. This turns out not to be the case. Munnell et al. (1995) can reject the hypothesis that the underwriting model is different for single-family houses, multifamily houses, or condominiums. Moreover, both Munnell et al. and Hunter and Walker (1996) find little evidence that individual underwriting variables receive different weights for minority and white applicants. In addition, Munnell et al. show that the minority-status coefficient is virtually the same when separate regressions (and hence separate underwriting guidelines) are estimated for lenders that specialize in lending to minorities and for other lenders. Finally, Browne and Tootell (1995) show that the minority-status coefficient is literally unaffected if one excludes two minority lenders, which together account for half of the minority applications in the Boston Fed Study’s sample.

As explained earlier, the “meets guidelines” variable might be related to the issue of business necessity. If we assume that minority households do a poorer job than white households in selecting lenders that meet their credit needs, then including the “meets guidelines” variable (and treating it as endogenous) can be interpreted as a way to account for legitimate differences in underwriting guidelines across the lenders visited by minorities and whites. In this case, we find that roughly 37.5 percent [(7.7 – 5.6)/5.6 from the first row of table 1] of the minority-white difference in loan denial is due to business necessity, not discrimination. However, this assumption is not consistent with the results in the previous paragraph. If minority households simply do a poorer job finding just the right lender, then, contrary to this evidence, the minority-white difference in loan approval should disappear for lenders that specialize in lending to minorities.
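The bracketed arithmetic in the preceding paragraph can be checked with a short sketch. The variable names are ours, and the reading of the two numbers is an assumption: we take 7.7 to be the denial gap (in percentage points) without the “meets guidelines” variable and 5.6 to be the gap once that variable is included and treated as endogenous.

```python
# Denial gaps in percentage points, as quoted in the text (first row of table 1).
# Which gap belongs to which specification is our assumption, not stated here.
gap_without_meets_guidelines = 7.7
gap_with_meets_guidelines = 5.6

# Reduction in the gap when "meets guidelines" is included and treated as endogenous.
reduction = gap_without_meets_guidelines - gap_with_meets_guidelines

# The text's expression, (7.7 - 5.6) / 5.6, uses the smaller gap as the base.
share_text = 100 * reduction / gap_with_meets_guidelines

# For comparison only: the same reduction expressed relative to the larger gap.
share_alt = 100 * reduction / gap_without_meets_guidelines

print(f"Reduction: {reduction:.1f} percentage points")
print(f"Relative to 5.6: {share_text:.1f} percent")
print(f"Relative to 7.7: {share_alt:.1f} percent")
```

The first ratio reproduces the 37.5 percent figure quoted above. The second ratio is shown only to make clear that the result depends on which gap is used as the base.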

Fifth, we conclude that the best way to determine whether the observed minority-white differences in loan denial are the result of underwriting practices justified by business necessity would be to conduct a replication of the Boston Fed Study in other locations with the addition of loan performance data. This approach would make it possible to determine which observed application characteristics are accurate predictors of loan returns and therefore which underwriting guidelines are legitimate. Minority-white differences in loan denial that remain after accounting for legitimate underwriting guidelines are evidence of discrimination. Research along these lines is particularly important for policy purposes because credit-scoring and other automated underwriting schemes, which are becoming increasingly popular, have enormous potential to lessen disparate treatment discrimination while at the same time magnifying disparate impact discrimination.

Unfortunately, this approach would not be able to distinguish between disparate treatment and disparate impact discrimination. A combination of application data, including credit history, and performance data should make it possible to identify legitimate underwriting guidelines and even to determine if those guidelines vary by location or by some other variable. However, these data would contain only a few observations for each individual lender and therefore could not be used to identify each lender’s actual underwriting guidelines. As a result, a researcher could not determine whether remaining minority-white differences in loan denial for an individual lender are due to that lender’s use of different guidelines for minorities and whites (disparate treatment discrimination) or its use of illegitimate guidelines that place minority applicants at a disadvantage (disparate impact discrimination).

Sixth, we conclude that the best, and perhaps the only, way to measure disparate treatment discrimination is with audit methodology. In an audit, two applicants with the same credit histories and in need of the same type of loan would apply for a mortgage at the same lender. Disparate treatment discrimination exists if minority applicants are systematically treated less favorably in a large sample of audits. Audits of this type would shed no light on disparate impact discrimination, because they would compare the treatment of identically qualified minority and white applicants at the same lender. Thus, observed differences in treatment could not be due to underwriting guidelines that illegitimately magnify differences in credit characteristics between minorities and whites, that is, to disparate impact discrimination.

Unfortunately, an audit study of loan approval faces many major practical challenges. Perhaps the most important is that it would be difficult, and might even be illegal, to assign false credit characteristics to auditors as a means of ensuring that audit teammates had identical loan qualifications. This step would be difficult because it would require the cooperation of the firms that maintain the credit records that lenders refer to. It might be illegal because laws prohibit false statements on credit applications with intent to defraud. We do not believe that auditing is a fraudulent activity. But the courts have not yet ruled on this matter, and any group that pushes audits into the loan approval stage of the mortgage process might face high legal bills, if not something worse. It might be possible to conduct audits using auditors’ actual credit characteristics, but this approach would be administratively difficult because auditors would still have to be matched to have the same credit qualifications. As a result, a very large pool of potential auditors would be necessary.

By collecting, analyzing, and releasing their data, the authors of the Boston Fed Study have made an enormous contribution to the literature on lending discrimination, but their study is certainly not the last word on the subject. Can similar evidence of discrimination be found in urban areas other than Boston? Has the estimated level of discrimination declined? Do lenders engage in disparate treatment discrimination, or disparate impact discrimination, or both? To what extent can observed minority-white differences in loan denial, controlling for applicants’ credit histories, be explained by legitimate differences in underwriting guidelines across lenders instead of by discrimination? Given the potential importance of lending discrimination as a barrier to homeownership for minority households and the range of questions about lending discrimination that remain unanswered, further research on these questions is urgently needed.

Notes

1. Stephen L. Ross is an assistant professor of economics at the University of Connecticut and John Yinger is a professor of economics and public administration at the Maxwell School, Syracuse University. They are grateful to Anthony Yezer and Geoffrey Tootell for helpful comments. The views expressed in this and the following two chapters should not be attributed to anyone but the authors.

2. For a description of the lenders covered by the HMDA data, see Avery, Beeson, and Sniderman (1996).

3. The 1991 figures are presented in Yinger (1995), and the 1997 figures come from Federal Financial Institutions Examination Council (1998). By way of comparison, the Hispanic/white loan rejection ratio over this period declined from 1.74 to 1.47.

4. The original version of the Boston Fed Study was released as a working paper in 1992 and the final version was published in 1996. Because many of the criticisms focused on the original version and were published in 1994 or 1995, Munnell et al. (1996) includes considerable material responding to the critics. Subsets of the authors of the Boston Fed Study also have published additional responses. See Browne and Tootell (1995) and Tootell (1996b).

5. A discussion of these potential flaws draws on several econometric theorems. A brief discussion of the key econometric concepts is provided in the technical appendix to chapters 3 through 5, which appears following chapter 5.

6. Another variable is considered by Hunter and Walker (1996), who argue that loan denial may depend on how “thick” an applicant’s file is, as measured by whether there are two or more credit checks in the file. This variable proves to be insignificant.

7. This effect was described to us in correspondence from Geoffrey Tootell. Note that our regressions do not include lender dummies because, to protect confidentiality, they are not included in the public-use data set. Because they omit these variables, our regressions overstate the minority-status coefficient by 20 percent. However, the public-use data set also does not, for the same reason, include census tract dummies, which raise the estimated minority-status coefficient. By coincidence, these two effects almost exactly offset each other, so our estimate of the minority/white denial gap using the methodology that is closest to the Boston Fed Study’s, 7.7 percentage points, is almost the same as the Boston Fed Study’s estimate, 8.2 percentage points.

8. This list is similar to the set of variables in the baseline estimation of Munnell et al. (1996), except that it substitutes census tract characteristics for tract dummies and excludes lender dummies. Munnell et al. could not reject the hypothesis that a separate coefficient for black applicants was the same as one for Hispanic applicants.

9. As discussed in the technical appendix following chapter 5, this is the percentage of the variance in the underlying latent variable that is explained by the model.

10. In a probit model (as used here) or the similar logit model (in Munnell et al. 1996), a percentage impact is determined by comparing the average predicted probability for all observations at two different values of the variable in question. In this case, we compare the average predicted probabilities of denial for minority applications with the minority-status coefficient set to zero and to its estimated value.

11. This technique is called a bivariate probit with recursion. Equations describing the model are presented in the technical appendix.

12. This correlation also might reflect omitted variables.

13. To make them comparable with the simultaneous equations procedures in the technical appendix that follows chapter 5, the single-equation results we present here and elsewhere are based on bivariate probit models, not the related logit models used by Munnell et al. However, logit and probit results for comparable equations are similar.

14. The reader may be puzzled by the fact that the minority-status coefficient is larger in the first specification but the percentage impact on loan denial is smaller. This apparent contradiction arises because predicted probabilities from a two-equation model reflect not only the estimated coefficients but also the estimated correlation between unobserved factors across equations.

15. An alternative response to these results would be to include “unable to verify” and treat it as endogenous. We implemented this alternative approach for many of the models discussed later in this chapter and found that the results are very similar to those using the simple approach of dropping this variable altogether. Consequently, we present only the results from the simpler approach.

16. The equations for this model, along with a discussion of econometric issues it raises, including the identification of the model, can be found in the technical appendix following chap-
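The calculation described in note 10 can be illustrated with a minimal sketch for a single-equation probit. Everything numeric here is hypothetical: the index values, the coefficient, and the names `other_index` and `minority_coef` are ours and are not taken from the Boston Fed Study’s data or from our estimates.

```python
from statistics import NormalDist

# Standard normal CDF, which maps a probit index into a probability.
PHI = NormalDist().cdf


def average_predicted_denial(other_index, minority_coef):
    """Average of Phi(x'b + minority_coef) over a set of applications.

    other_index holds, for each application, the part of the probit index
    that comes from all explanatory variables except minority status.
    """
    return sum(PHI(xb + minority_coef) for xb in other_index) / len(other_index)


# Hypothetical index values for six minority applications.
other_index = [-1.8, -1.5, -1.2, -1.0, -0.7, -0.3]

# Hypothetical estimated minority-status coefficient.
minority_coef = 0.4

# Average predicted denial probability at the estimated coefficient
# and with the coefficient set to zero, as note 10 describes.
p_estimated = average_predicted_denial(other_index, minority_coef)
p_zero = average_predicted_denial(other_index, 0.0)

gap_points = 100 * (p_estimated - p_zero)           # percentage points
gap_relative = 100 * (p_estimated - p_zero) / p_zero  # percent of the baseline

print(f"Average denial probability, estimated coefficient: {p_estimated:.3f}")
print(f"Average denial probability, coefficient set to zero: {p_zero:.3f}")
print(f"Gap: {gap_points:.1f} percentage points ({gap_relative:.0f} percent)")
```

Averaging predicted probabilities over the observations, rather than evaluating the model at mean characteristics, matters because the probit link is nonlinear. The sketch does not cover the two-equation case in note 14, where predicted probabilities also depend on the estimated correlation across equations.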
