
2.3 COMPARISON BETWEEN FE AND PD ESTIMATOR

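$$\hat{\boldsymbol{\beta}}_{FE} = \Big[\sum_g \tilde{\boldsymbol{X}}_g' \tilde{\boldsymbol{X}}_g\Big]^{-1} \sum_g \tilde{\boldsymbol{X}}_g' \tilde{\boldsymbol{y}}_g;$$

$$\hat{\boldsymbol{\beta}}_{PD} = \Big[\sum_g N_g \tilde{\boldsymbol{X}}_g' \tilde{\boldsymbol{X}}_g\Big]^{-1} \sum_g N_g \tilde{\boldsymbol{X}}_g' \tilde{\boldsymbol{y}}_g.$$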

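Comparing the formulae of FE and PD, $\hat{\boldsymbol{\beta}}_{PD}$ applies the density weights $N_g$ to both $\tilde{\boldsymbol{X}}_g' \tilde{\boldsymbol{X}}_g$ (the matrix denominator) and $\tilde{\boldsymbol{X}}_g' \tilde{\boldsymbol{y}}_g$ (the matrix numerator) of $\hat{\boldsymbol{\beta}}_{FE}$ for each cluster $g$. Because of the matrix inverse, it is hard to see directly how this weighting makes $\hat{\boldsymbol{\beta}}_{PD}$ deviate from $\hat{\boldsymbol{\beta}}_{FE}$.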
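To make the two formulae concrete, the following is a minimal NumPy sketch that computes both estimators from per-cluster data. It assumes that the tilde denotes within-cluster demeaning and that $N_g$ is the number of observations in cluster $g$. The function name `fe_and_pd`, the cluster sizes, and the simulated data are hypothetical and are not taken from the paper.

```python
import numpy as np

def fe_and_pd(X_list, y_list):
    """Return (beta_FE, beta_PD) from per-cluster arrays.

    X_list[g] has shape (N_g, K); y_list[g] has shape (N_g,).
    Assumption: the tilde means within-cluster demeaning, and the
    weight N_g is the number of observations in cluster g.
    """
    K = X_list[0].shape[1]
    den_fe, num_fe = np.zeros((K, K)), np.zeros(K)
    den_pd, num_pd = np.zeros((K, K)), np.zeros(K)
    for X, y in zip(X_list, y_list):
        n_g = X.shape[0]
        Xt = X - X.mean(axis=0)       # demean regressors within the cluster
        yt = y - y.mean()             # demean the outcome within the cluster
        den_fe += Xt.T @ Xt
        num_fe += Xt.T @ yt
        den_pd += n_g * (Xt.T @ Xt)   # same terms, weighted by N_g
        num_pd += n_g * (Xt.T @ yt)
    beta_fe = np.linalg.solve(den_fe, num_fe)
    beta_pd = np.linalg.solve(den_pd, num_pd)
    return beta_fe, beta_pd

# Hypothetical simulated data: three clusters of very different sizes,
# each with its own intercept (the cluster index g).
rng = np.random.default_rng(0)
beta_true = np.array([1.0, -0.5])
sizes = [5, 20, 80]
X_list = [rng.normal(size=(n, 2)) for n in sizes]
y_list = [X @ beta_true + g + rng.normal(size=X.shape[0])
          for g, X in enumerate(X_list)]

beta_fe, beta_pd = fe_and_pd(X_list, y_list)
print("FE:", beta_fe)
print("PD:", beta_pd)
```

Both estimators are solved with `np.linalg.solve` rather than an explicit inverse, which is the usual numerically safer choice for this kind of normal-equation system.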

This section discusses the two equivalence conditions between FE and PD, and how PD deviates from FE as the conditions are relaxed. The discussion is inspired by a pair of concepts from number theory, the mediant and the weighted mediant, which are analogous to FE and PD. This pair of concepts is introduced in the next subsection.

2.3.1 Mediant and Weighted Mediant

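A mediant $m$ for a sequence of $n$ fractions $\frac{a_1}{b_1}, \frac{a_2}{b_2}, \dots, \frac{a_n}{b_n}$ is defined as

$$m = \frac{a_1 + a_2 + \cdots + a_n}{b_1 + b_2 + \cdots + b_n} = \Big(\sum_i b_i\Big)^{-1} \sum_i a_i, \quad \text{for } i = 1 \dots n,$$

where $a_i$ is a nonnegative real number and $b_i$ is a positive real number. If $w_1, w_2, \dots, w_n$ are $n$ positive real numbers, then a weighted mediant is defined as

$$m_w = \frac{w_1 a_1 + w_2 a_2 + \cdots + w_n a_n}{w_1 b_1 + w_2 b_2 + \cdots + w_n b_n} = \Big(\sum_i w_i b_i\Big)^{-1} \sum_i w_i a_i, \quad \text{for } i = 1 \dots n.$$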
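The two definitions translate directly into code. The sketch below is only an illustration; the function names `mediant` and `weighted_mediant` and the example fractions are chosen here and do not come from the paper.

```python
def mediant(a, b):
    """Mediant of the fractions a[i] / b[i]: sum of numerators over sum of denominators."""
    return sum(a) / sum(b)

def weighted_mediant(a, b, w):
    """Weighted mediant: numerator and denominator of fraction i are both scaled by w[i]."""
    num = sum(wi * ai for wi, ai in zip(w, a))
    den = sum(wi * bi for wi, bi in zip(w, b))
    return num / den

# Illustrative fractions 1/2 and 2/3 with weights 1 and 3.
a, b, w = [1, 2], [2, 3], [1, 3]
m = mediant(a, b)                 # (1 + 2) / (2 + 3) = 3/5
m_w = weighted_mediant(a, b, w)   # (1*1 + 3*2) / (1*2 + 3*3) = 7/11
print(m, m_w)
```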

Interestingly, $\hat{\boldsymbol{\beta}}_{FE}$ and $\hat{\boldsymbol{\beta}}_{PD}$ can be viewed as the matrix-form mediant and weighted mediant of the sequence of $[\tilde{\boldsymbol{X}}_g' \tilde{\boldsymbol{X}}_g]^{-1} \tilde{\boldsymbol{X}}_g' \tilde{\boldsymbol{y}}_g$. Unfortunately, the literature on the mediant deals with numbers, not vectors or matrices. To avoid applying mediant theory to the comparison between FE and PD without justification, this paper selects the relevant results and then conducts Monte Carlo simulations to verify that they carry over to the matrix setting.

2.3.2 Relevant Theories from Mediant Theory

In general, there are two relevant conclusions from the mediant theory. The first (equivalence conditions) states two sufficient conditions for the equivalence between the mediant and the weighted mediant. The second (deviation condition) describes how the difference between them is related to the association between the weights and the values of the fractions.

Equivalence Conditions: the mediant equals the weighted mediant if either of the following conditions holds (see proofs in Appendix C):

a. (Equal-weight Condition) all weights are equal: $w_i = w$ for all $i = 1 \dots n$ and $w \in \mathbb{R}_{>0}$;

b. (Equal-fraction Condition) all fractions are equal: $\frac{a_i}{b_i} = \frac{a}{b}$ for all $i = 1 \dots n$ and $a \in \mathbb{R}_{\ge 0}$, $b \in \mathbb{R}_{>0}$.
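Both conditions can be checked numerically. The sketch below is a quick sanity check on randomly drawn inputs, not a substitute for the proofs in Appendix C; the sample size, the ranges of the draws, and the common fraction value 0.75 are arbitrary choices made for this illustration.

```python
import numpy as np

def mediant(a, b):
    return np.sum(a) / np.sum(b)

def weighted_mediant(a, b, w):
    return np.sum(w * a) / np.sum(w * b)

rng = np.random.default_rng(42)
n = 6
b = rng.uniform(0.5, 5.0, size=n)    # positive denominators
a = rng.uniform(0.0, 5.0, size=n)    # nonnegative numerators
w = rng.uniform(0.1, 10.0, size=n)   # positive, unequal weights

# (a) Equal-weight condition: every weight is the same constant.
w_equal = np.full(n, 2.5)
gap_equal_weight = weighted_mediant(a, b, w_equal) - mediant(a, b)

# (b) Equal-fraction condition: a_i / b_i = 0.75 for every i,
#     while the weights stay unequal.
a_equal_frac = 0.75 * b
gap_equal_fraction = weighted_mediant(a_equal_frac, b, w) - mediant(a_equal_frac, b)

# Both gaps should be zero up to floating-point rounding.
print(gap_equal_weight, gap_equal_fraction)
```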

Deviation Condition: if a relatively larger fraction $\frac{a_i}{b_i}$ is associated with a relatively larger weight $w_i$, then $m_w > m$. In other words, if the covariance between $\frac{a_i}{b_i}$ and $w_i$ is positive, then $m_w > m$; otherwise, $m_w < m$. A simple case with $n = 2$ and a simulation with $n > 2$ are shown in Appendix C.
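For $n = 2$ the condition can be checked by hand. Expanding the definitions gives $m_w - m = \frac{(w_2 - w_1)(a_2 b_1 - a_1 b_2)}{(w_1 b_1 + w_2 b_2)(b_1 + b_2)}$, so the sign of the gap is the sign of $(w_2 - w_1)\big(\frac{a_2}{b_2} - \frac{a_1}{b_1}\big)$. The sketch below illustrates this with two made-up fractions; it is not a reproduction of the Appendix C material, and the helper `gap_n2` is a name introduced here.

```python
def mediant(a, b):
    return sum(a) / sum(b)

def weighted_mediant(a, b, w):
    num = sum(wi * ai for wi, ai in zip(w, a))
    den = sum(wi * bi for wi, bi in zip(w, b))
    return num / den

def gap_n2(a, b, w):
    """Closed-form m_w - m for exactly two fractions."""
    (a1, a2), (b1, b2), (w1, w2) = a, b, w
    return (w2 - w1) * (a2 * b1 - a1 * b2) / ((w1 * b1 + w2 * b2) * (b1 + b2))

# Illustrative fractions: 1/4 (smaller) and 3/4 (larger).
a, b = [1.0, 3.0], [4.0, 4.0]
m = mediant(a, b)                               # 4/8 = 0.5

# Larger weight on the larger fraction pulls the weighted mediant up.
m_w_up = weighted_mediant(a, b, [1.0, 3.0])     # 10/16 = 0.625
# Larger weight on the smaller fraction pulls it down.
m_w_down = weighted_mediant(a, b, [3.0, 1.0])   # 6/16 = 0.375

print(m, m_w_up, m_w_down)
```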

2.3.3 Extension to Vector and Matrix

For the matrix setting, the mediant and weighted mediant could be defined as

$$M = (B_1 + B_2 + \cdots + B_n)^{-1}(A_1 + A_2 + \cdots + A_n) = \Big(\sum_i B_i\Big)^{-1} \sum_i A_i,$$

$$M_w = (w_1 B_1 + w_2 B_2 + \cdots + w_n B_n)^{-1}(w_1 A_1 + w_2 A_2 + \cdots + w_n A_n) = \Big(\sum_i w_i B_i\Big)^{-1} \sum_i w_i A_i,$$

in which the “fraction” is $\boldsymbol{f}_i = B_i^{-1} A_i$. Since FE and PD are vectors, we only consider the situations in which $M$, $M_w$ and $\boldsymbol{f}_i$ are vectors. Thus, if $B_i$ is, say, a $K \times K$ invertible matrix, then $A_i$ is a $K \times 1$ vector and the resulting $M$, $M_w$ and $\boldsymbol{f}_i$ are all $K \times 1$ vectors.
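A minimal NumPy sketch of these matrix definitions follows. The array layout (all $A_i$ stacked into an `(n, K)` array and all $B_i$ into an `(n, K, K)` array), the function names, and the random test inputs are assumptions made for this illustration.

```python
import numpy as np

def matrix_mediant(A, B):
    """M = (sum_i B_i)^{-1} sum_i A_i.

    A has shape (n, K): one K-vector per fraction.
    B has shape (n, K, K): one K x K matrix per fraction.
    """
    return np.linalg.solve(B.sum(axis=0), A.sum(axis=0))

def weighted_matrix_mediant(A, B, w):
    """M_w = (sum_i w_i B_i)^{-1} sum_i w_i A_i."""
    w = np.asarray(w, dtype=float)
    B_w = np.tensordot(w, B, axes=1)   # sum_i w_i B_i, shape (K, K)
    A_w = w @ A                        # sum_i w_i A_i, shape (K,)
    return np.linalg.solve(B_w, A_w)

# Hypothetical inputs: B_i = Z_i' Z_i is invertible for generic Z_i.
rng = np.random.default_rng(0)
n, K = 4, 3
Z = rng.normal(size=(n, 10, K))
B = np.einsum("nij,nik->njk", Z, Z)
A = rng.normal(size=(n, K))
w = rng.uniform(0.5, 5.0, size=n)

M = matrix_mediant(A, B)
M_w = weighted_matrix_mediant(A, B, w)
f = np.linalg.solve(B, A[..., None])[..., 0]   # row i is the fraction B_i^{-1} A_i
print(M, M_w)
```

With $K = 1$ these functions reduce to the scalar mediant and weighted mediant of the previous subsection.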

It can be shown that both equivalence conditions hold: in the matrix setting, the mediant equals the weighted mediant when either all weights are equal or all matrix fractions are equal. For the equal-weight condition, if $w_i = w$ for all $i = 1 \dots n$,

$$M_w = \Big(\sum_i w_i B_i\Big)^{-1} \sum_i w_i A_i = \Big(\sum_i w B_i\Big)^{-1} \sum_i w A_i = \Big(\sum_i B_i\Big)^{-1} \sum_i A_i = M.$$

This condition generalizes the case (in section 2.2) in which FE and PD are equivalent when cluster sizes are equal. For the equal-fraction condition, if $B_i^{-1} A_i = B^{-1} A$ for all $i$, then $A_i = B_i B^{-1} A$, and

$$M_w = \Big(\sum_i w_i B_i\Big)^{-1} \sum_i w_i A_i = \Big(\sum_i w_i B_i\Big)^{-1} \Big(\sum_i w_i B_i\Big) B^{-1} A = B^{-1} A = \Big(\sum_i B_i\Big)^{-1} \Big(\sum_i B_i B^{-1} A\Big) = \Big(\sum_i B_i\Big)^{-1} \Big(\sum_i A_i\Big) = M.$$

This condition suggests that if $\boldsymbol{f}_i = B_i^{-1} A_i$ is the same for all $i$, the weighted mediant does not deviate from the mediant even when the weights are very different. In the case of $\hat{\boldsymbol{\beta}}_{FE}$ and $\hat{\boldsymbol{\beta}}_{PD}$, the fraction is $\hat{\boldsymbol{\beta}}_g = [\tilde{\boldsymbol{X}}_g' \tilde{\boldsymbol{X}}_g]^{-1} \tilde{\boldsymbol{X}}_g' \tilde{\boldsymbol{y}}_g$, which is the local OLS estimate of $\boldsymbol{\beta}$ using only the observations in cluster $g$.

The equal-fraction condition implies that if the local estimates are the same, $\hat{\boldsymbol{\beta}}_{FE}$ equals $\hat{\boldsymbol{\beta}}_{PD}$ no matter how different the cluster sizes are.
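The two matrix equivalence conditions can also be checked numerically. The sketch below uses randomly generated $B_i$ and a made-up common fraction `f`; the dimensions, weights, and names are hypothetical, and the check only illustrates the algebra above rather than adding to it.

```python
import numpy as np

def weighted_matrix_mediant(A, B, w):
    """(sum_i w_i B_i)^{-1} sum_i w_i A_i; w = ones gives the plain mediant M."""
    w = np.asarray(w, dtype=float)
    return np.linalg.solve(np.tensordot(w, B, axes=1), w @ A)

rng = np.random.default_rng(1)
n, K = 5, 3
Z = rng.normal(size=(n, 12, K))
B = np.einsum("nij,nik->njk", Z, Z)   # B_i = Z_i' Z_i, invertible for generic Z_i
ones = np.ones(n)

# Equal-weight condition: any common weight gives back the plain mediant.
A = rng.normal(size=(n, K))
gap_equal_weight = (weighted_matrix_mediant(A, B, np.full(n, 7.0))
                    - weighted_matrix_mediant(A, B, ones))

# Equal-fraction condition: set A_i = B_i f so that B_i^{-1} A_i = f for all i,
# then apply very unequal weights.
f = np.array([1.0, -0.5, 2.0])
A_same = B @ f                        # shape (n, K)
w = rng.uniform(0.1, 50.0, size=n)
M = weighted_matrix_mediant(A_same, B, ones)
M_w = weighted_matrix_mediant(A_same, B, w)

# gap_equal_weight should be zero, and M and M_w should both equal f,
# up to floating-point rounding.
print(gap_equal_weight, M, M_w)
```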

Unfortunately, the deviation condition no longer holds. Whether the vector deviations are measured by element-wise comparison or by the Euclidean distance between vectors, no correlation can be found between assigning larger weights to “larger” matrix fractions and “larger” changes from the mediant vector to the weighted mediant vector.²⁰

In regression analysis, even under the assumption that $\boldsymbol{\beta}$ is stationary, the local estimates $\hat{\boldsymbol{\beta}}_g$ could still deviate from the true $\boldsymbol{\beta}$ and vary across clusters because of differences in error structures and cluster sizes.
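A small simulation can illustrate why local estimates differ even when the true coefficient vector is common to all clusters. The sketch below assumes i.i.d. standard normal errors and regressors, within-cluster demeaning, and made-up cluster sizes; none of these settings are taken from the paper's own Monte Carlo design.

```python
import numpy as np

def local_ols(X, y):
    """Local OLS estimate for one cluster after within-cluster demeaning."""
    Xt = X - X.mean(axis=0)
    yt = y - y.mean()
    return np.linalg.solve(Xt.T @ Xt, Xt.T @ yt)

rng = np.random.default_rng(2)
beta_true = np.array([1.0, -0.5])   # hypothetical common coefficient vector
sizes = [5, 20, 80, 320]            # hypothetical cluster sizes

local_estimates = []
for n_g in sizes:
    X = rng.normal(size=(n_g, 2))
    y = X @ beta_true + rng.normal(size=n_g)   # same beta, fresh noise per cluster
    local_estimates.append(local_ols(X, y))
local_estimates = np.array(local_estimates)    # row g is the estimate for cluster g

print(local_estimates)
```

Each row differs from `beta_true` and from the other rows only because of sampling noise, so the equal-fraction condition should not be expected to hold exactly in finite samples.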

The variance-covariance matrices in (2.8) and (2.9) are not in the form of a matrix mediant and weighted mediant. Together with $\hat{\boldsymbol{\beta}}_{FE}$ and $\hat{\boldsymbol{\beta}}_{PD}$, the variance estimators are compared through a Monte Carlo simulation in the next section.

In document Three Essays on Housing Markets (Page 57-60)