The previous section established that attention heterogeneity prevents point
identification of deadweight loss using aggregate data, even if we already know the distribution of preferences. To facilitate identification, we require more granular data.35 If we use
(repeated) cross-sectional data, we also require additional structure on tax salience or the consumption set. Alternatively, we can use (long) panel data to identify deadweight loss without any structural assumptions.
34 One can identify a first order approximation trivially; it is zero.
35 Technically, one could simply impose a distribution of ๐๐ . For instance, one could assume homogeneous perceived tax-inclusive prices. We do not recommend this.
41 4.1 Cross-Sectional Data
We maintain the linear structure on preferences, as well as the assumption that ๐ is constant with respect to ๐, as in section 3.3. Formally:
๐ = ๐ผ + ๐ฝ๐ฬ + ๐ฝฬ๐ + ๐
Here ๐ผ is a constant whose identification we generally assume.36 The econometrician observes the distribution of ๐ conditional on (๐ฬ , ๐) for all values in ๐ ๐ข๐๐(๐ฬ , ๐). The data comes from an underlying data-generating process, which we assume yields a well-defined and finite value for deadweight loss. The data identifies expected deadweight loss from a (non-zero) tax ๐ if any underlying distributions of(๐ฝ, ๐ฝฬ, ๐) across the population consistent with the observed distribution of(๐ฬ , ๐) and conditional distributions of ๐ yield the same value of๐ผ [๐ฝฬ2
๐ฝ]. One approach is to identify the joint distribution of (๐ฝ, ๐ฝฬ) across individuals, then integrate to obtain the desired expected value. To that end, we can use the following lemma from Masten (2017):
Lemma 3: If ๐ ๐ข๐๐(๐ฬ , ๐) contains an open ball in โ2, then the joint distribution of (๐ฝ, ๐ฝฬ, ๐) is identified if and only if:
1. The joint distribution of (๐ฝ, ๐ฝฬ, ๐) is determined by its moments.
2. All absolute moments of (๐ฝ, ๐ฝฬ, ๐) are finite.
Proof: Apply lemma 2 from Masten (2017). โ
36 It is identified if the econometrician observes ๐ฬ = ๐ = 0 or non-collinear variation in ๐ฬ and ๐.
42
It follows as an immediate corollary that the conditions of the lemma also identify deadweight loss, since one can integrate out the marginal distribution of (๐ฝ, ๐ฝฬ) and then calculate ๐ผ [๐ฝฬ๐ฝ2]. The intuition is that one can find the moments of the random coefficients by running increasingly higher-order regressions of the form:
๐ผ[๐๐๐|๐ฬ , ๐] = โฏ + ๐ผ[๐ฝ๐]๐ฬ ๐ + โฏ + ๐ผ[๐ฝฬ๐]๐๐+ โฏ
The enumerated conditions are satisfied, for instance, when (๐ฝ, ๐ฝฬ) has finite support.
However, one might want to be able to identify deadweight loss without making assumptions on the distribution of (๐ฝ, ๐ฝฬ), beyond assuming that ๐ผ [๐ฝฬ๐ฝ2] is well-defined and finite. The following theorem does not impose such assumptions, but does require unbounded support in regressors.
Theorem 4: Suppose that for every pair (๐1, ๐2) โ (โ โ {0}) ร โ, there is a sequence (๐ฬ ๐, ๐๐)๐=1โ contained within the support of (๐ฬ , ๐) such that:
1. lim
๐โโโ(๐ฬ ๐, ๐๐)โ = โ 2. lim
๐โโ
๐๐ ๐ฬ ๐=๐2
๐1
In addition, suppose that there is another sequence (๐ฬ ๐, ๐๐)๐=1โ contained within the support of (๐ฬ , ๐) such that:
1. lim
๐โโ|๐ฬ ๐| < โ 2. lim
๐โโ๐๐ = โ
Then ๐ผ[๐๐ค๐๐] is identified.
43
We make note of a special case in which these results apply. Consider a binary choice problem, so that ๐๐ = {0,1}. Each agent ๐ has quasi-linear utility ๐ข๐1 fro consuming the good, so that for tax-inclusive price ๐:
๐ข๐1 = ๐๐โ ๐
This is an expression of the agent's preference for the taxed good scaled so that an additional dollar yields one unit of utility. We do not assume that ๐ผ[๐] = 0. However, we normalize utility in the absence of the taxed good to zero:
๐ข๐0 = 0
For a price increase from ๐ฬ to ๐๐ , the change in the expenditure function is:
๐(๐๐ ) โ ๐(๐ฬ ) = {
0, ๐๐ โ ๐ฬ โค 0 ๐๐โ ๐ฬ , ๐๐ โฅ ๐๐ > ๐ฬ
๐๐ โ ๐ฬ , ๐๐ โ ๐๐ > 0
Each agent ๐ has a tax salience ๐๐ independent of the tax rate. The agent perceives utility from buying the taxed good:
๐ข๐1๐ = ๐๐โ ๐ฬ โ ๐๐๐ (8)
They purchase the good if ๐ข๐1๐ > 0. This yields deadweight loss for that agent of:
๐๐ค๐๐ = ๐(๐๐ ) โ ๐(๐ฬ ) โ [๐๐ โ ๐ฬ ]๐(๐ผ โ ๐๐ + ๐๐ > 0) = {
0, ๐๐โ ๐ฬ โค 0 ๐๐โ ๐ฬ , ๐๐ โฅ ๐๐ > ๐ฬ
0, ๐๐โ ๐๐ > 0 s
Let ๐น๐.๐โ denote the true distribution of (๐, ๐). For simplicity, we assume that โ(๐ข1๐ = 0) and โ(๐ข1 = 0) are always zero. We assume the econometrician observes only aggregate data, so that the only values observed are triplets (๐ฬ , ๐, โ(๐ข1๐ > 0|๐ฬ , ๐)). The challenge is to infer deadweight loss, which takes the form:
๐ท๐๐ฟ = โซ ๐(๐ข1 โฅ 0 > ๐ข1๐ )[๐๐ โ ๐ฬ ]๐๐น๐,๐โ (๐๐, ๐๐)
๐๐,๐๐ (9)
44
For (2017) shows that, if ๐ฬ has full support, then one can use aggregate demand to infer the CDF of ๐ โ ๐๐, allowing one to apply Masten (2017). Thus, we can identify deadweight loss even with aggregate data if we are willing to assume that tax salience does not depend on the tax rate and the choice set is binary.
One might hope that the assumption that ๐ โฅ (๐ฬ , ๐) is not required for identification, as it is generally a rather strong assumption. Unfortunately, one cannot do away entirely with this restriction. For instance, consider a population in which โ(๐ = 2) = โ(๐ = 3) = 0.5 and โ(๐ = ๐) = 1. Consider rationalizing the data in two ways. In the first case, ๐ = 0.5 with probability one. This yields aggregate demand:
๐ท = {
1, ๐ฬ + 0.5๐ โค 2 0.5, 3 โฅ ๐ฬ + 0.5๐ > 2ฬ
0, ๐ฬ + 0.5๐ > 3 Aggregate deadweight loss takes the form:
๐ท๐๐ฟ = {
0, ๐ฬ + 0.5๐ โค 2 0.5, 3 โฅ ๐ฬ + 0.5๐ > 2ฬ
1.5, ๐ฬ + 0.5๐ > 3
In the second case, suppose for a moment that ๐ฬ โ (0,1). Then agents with ๐ = 2 have tax salience ๐2(๐ฬ ) = 2โ๐ฬ
6โ2๐ฬ , while agents with ๐ = 3 have tax salience ๐3(๐ฬ ) = 3โ๐ฬ
4โ2๐ฬ . If ๐ฬ โ (0,1), then set ๐2(๐ฬ ) = ๐3(๐ฬ ) = 0.5. One can check that this yields the same aggregate demand, yet aggregate deadweight loss with ๐ฬ โ (0,1) now takes the form:
๐ท๐๐ฟ = {
0, ๐ฬ + 0.5๐ โค 2 0.75, 3 โฅ ๐ฬ + 0.5๐ > 2ฬ
1.5, ๐ฬ + 0.5๐ > 3
45 4.2 Panel Data
Now suppose that one can follow specific individual for a long period of time. For every individual ๐, the econometrician observes triplets (๐ฬ ๐, ๐๐, ๐๐) with full support for ๐ฬ when ๐ = 0.
Then the econometrician can identify the demand function ๐๐(๐; 0). Assuming there are no income effects, deadweight loss for an individual ๐ with perceived price ๐๐ takes the form:
๐๐ค๐๐ = โซ ๐(๐; 0)๐๐
๐๐
๐ฬ
โ ๐๐(๐๐ )
If one can observe tax ๐, then one can determine deadweight loss using the inferred demand function via:37
๐๐ = ๐โ1(๐๐(๐ฬ , ๐))
For instance, with binary choice data, demand for agent ๐ is ๐(๐ผ โ ๐๐๐ + ๐๐). This yields deadweight loss for agent ๐ of:38
๐๐ค๐๐ = ๐(๐๐๐ > ๐ผ + ๐๐ > ๐ฬ ) [๐ผ โ ๐ฬ ๐๐] This yields aggregate deadweight loss:
๐ท๐๐ฟ = โซ ๐(๐๐๐ > ๐ผ + ๐๐ > ๐ฬ ) [๐ผ โ ๐ฬ + ๐๐] ๐๐น๐โ๐ ,๐(๐๐๐ , ๐๐)
๐๐๐ ,๐๐