
2.3 Estimating Peer Effects in Networks

2.3.2 Peer Effects Model

The linear-in-means peer effects model was popularized by Manski (1993) and has subsequently been explored in a variety of contexts. This model takes the form:

$$y_{iv} = \alpha + \beta \bar{y}_{(i)} + \gamma x_i + \delta \bar{x}_{(i)} + \lambda_v + \varepsilon_{iv} \tag{2.1}$$

where an individual’s outcome is a function of their peer group’s mean outcome $\bar{y}_{(i)}$ (the “endogenous peer effect”), an individual observable $x_i$, and the peer group’s mean observable $\bar{x}_{(i)}$ (the “exogenous peer effect”).10 Manski’s original insight (as well as subsequent work by Moffitt et al. (2001)) is one of non-identification, i.e., that without further restrictions it is impossible to separately identify the parameters on the endogenous and exogenous peer effects. There are three reasons for this: the reflection problem (an individual’s outcome enters on both sides of the model), contextual effects (individuals in a peer group may share unobservable shocks that can be confounded with peer effects), and

10Manski (1993) originally calculated the peer mean including individual i; however, subsequent work (see Moffitt et al. (2001)) suggests a formulation in which individual i is excluded from the mean. I follow the second convention here, where $\bar{y}_{(i)}$ is the mean of the peer group excluding individual i.

endogenous group membership (individuals may select peer groups because they share similar attributes). Any attempt to estimate peer effects must credibly address these three problems. Consequently, a variety of approaches have been developed to resolve them, including varying group sizes (Lee, 2007), variance-covariance restrictions (Graham, 2008), properties of binary response models (Brock and Durlauf, 2007), and social networks (Bramoullé et al., 2009; De Giorgi et al., 2010).
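As a minimal sketch of the peer means that enter equation (2.1) under the leave-one-out convention of footnote 10, the following computes each individual's peer-group mean excluding the individual. The function name `peer_means` and the toy data are hypothetical illustrations, not taken from the paper's data or code.

```python
import numpy as np

def peer_means(values, groups):
    """Leave-one-out group mean: for each individual, the mean of `values`
    over the other members of the same group (NaN for singleton groups)."""
    values = np.asarray(values, dtype=float)
    groups = np.asarray(groups)
    out = np.full(values.shape, np.nan)
    for g in np.unique(groups):
        members = groups == g
        n = members.sum()
        if n > 1:
            # Group total minus own value, divided by the number of peers.
            out[members] = (values[members].sum() - values[members]) / (n - 1)
    return out

# Hypothetical toy data: two peer groups, "a" and "b".
y = [1.0, 2.0, 3.0, 10.0, 20.0]
groups = ["a", "a", "a", "b", "b"]
y_bar = peer_means(y, groups)
print(y_bar)
```

The same function applied to an observable gives the exogenous peer term; applied to the outcome it gives the endogenous term, which is why the outcome appears on both sides of the model.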

Bramoullé et al. (2009) show that peer effects can be separately identified using a social network as a peer group, as long as the social network has certain properties.11 This result nests the results of Manski (1993), Moffitt et al. (2001), and Lee (2007) as special cases and shares the identification properties of those models. Additionally, as long as multiple networks are observed, it is possible to account for contextual effects using a network-level fixed effect. Formally, the condition for identification is that the identity matrix, the adjacency matrix, and the adjacency matrix squared must be linearly independent. In practice, this means that the network must possess intransitive triads: sets of three individuals in an undirected network where individual i is connected to individual j and j is connected to k, but i is not connected to k.12 Additionally, a sufficient condition is that the longest “shortest” path (known as a geodesic) between any

11Although similar results were proven by De Giorgi et al. (2010), I rely on the formulation and results of Bramoullé et al. (2009) throughout the remainder of this paper.

12An intransitive triad is contrasted with a triangle, where all individuals are connected. The conditions for a directed network are slightly different, but because the networks in question are not directed, these conditions will not be explored in this work.

two individuals has length longer than two (or three in the case of the model including fixed effects).13 This condition is satisfied in all of the networks in question.14
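The conditions described above can be checked numerically. The sketch below is illustrative only: the function names and the two toy networks (a four-node path and a triangle) are hypothetical, and it assumes an undirected, unweighted adjacency matrix that is row-normalized before testing linear independence.

```python
import numpy as np

def row_normalize(A):
    """Divide each row by its degree so that G @ v returns peer means."""
    A = np.asarray(A, dtype=float)
    deg = A.sum(axis=1, keepdims=True)
    return np.divide(A, deg, out=np.zeros_like(A), where=deg > 0)

def matrices_independent(G, powers=(0, 1, 2)):
    """True if the listed powers of G (I, G, G^2 by default) are linearly independent."""
    stacked = np.column_stack([np.linalg.matrix_power(G, p).ravel() for p in powers])
    return bool(np.linalg.matrix_rank(stacked) == len(powers))

def has_intransitive_triad(A):
    """True if some i, k are linked through a common neighbor j but not directly."""
    A = np.asarray(A) > 0
    two_step = (A.astype(int) @ A.astype(int)) > 0
    np.fill_diagonal(two_step, False)
    return bool((two_step & ~A).any())

def longest_geodesic(A):
    """Longest finite shortest path, by breadth-first search from every node at once."""
    A = np.asarray(A) > 0
    n = A.shape[0]
    reached = np.eye(n, dtype=bool)
    frontier = reached.copy()
    longest = 0
    for step in range(1, n):
        # Nodes first reached at exactly `step` links from each source node.
        frontier = ((frontier.astype(int) @ A.astype(int)) > 0) & ~reached
        if not frontier.any():
            break
        reached |= frontier
        longest = step
    return longest

# Hypothetical toy networks: a path 0-1-2-3 and a fully connected triangle.
path = np.array([[0, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 1], [0, 0, 1, 0]])
triangle = np.array([[0, 1, 1], [1, 0, 1], [1, 1, 0]])
for name, A in [("path", path), ("triangle", triangle)]:
    G = row_normalize(A)
    print(name, matrices_independent(G), has_intransitive_triad(A), longest_geodesic(A))
```

In the path network the three matrices are linearly independent and an intransitive triad exists, while in the triangle everyone is connected to everyone else and the independence condition fails, which corresponds to the group setting in which the reflection problem arises.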

Importantly, the Bramoullé et al. (2009) framework allows for network links to be endogenous (formed on the basis of observable attributes); however, this requires the assumption that the attributes that affect network formation are included in the model. If they are not, the peer effect estimation may simply be identifying the effect of sorting along unobserved attributes, and the estimate will be biased. Prior research using microfinance (as well as the DoM project specifically) has done little to explore this specification problem. My primary contribution in this paper is to carefully address how the peer effects models are specified, relying on previous results regarding social networks to resolve the reflection problem and contextual factors. This is done by explicitly controlling for peer observable attributes and by showing that effects persist across a variety of specifications of the network. I also put forth a placebo test to show that the peer effects identified are not simply due to endogenous sorting. Additionally, relying on the results of Banerjee et al. (2013), I specify the model using network centrality as a proxy for the probability of being informed. This is important to show that the peer effect is not due simply to a higher likelihood of being informed about the program;

13The longest geodesic is calculated by finding the shortest path between every pair of nodes and taking the longest of these. If a geodesic exists that is longer than 2, there exists at least one intransitive triad.

14Because the networks are sampled, the longest geodesic is difficult to calculate: many links are missing, and those links might provide shorter paths between individuals. However, even when the network is restricted to only sampled individuals, the longest geodesic is still at least 4 in all cases (it is at least 6 when using the star sub-networks). Additionally, the conditions are satisfied in the graphically reconstructed networks, which should be similar to the whole network.

however, this requires special care when using a sampled network to ensure that the network statistics are estimated properly.

Peer effects must also be estimated using an instrumental variables strategy because the outcomes appear on both sides of the model. Following work in spatial econometrics, Bramoullé et al. (2009) suggest that the adjacency matrix raised to a power and then multiplied by an exogenous covariate vector is a valid instrument. This exploits the fact that the square of an adjacency matrix indicates the friends of an individual’s friends. Since every individual’s outcome in a peer effects model is a function of their friends’ attributes, this is a valid instrument as long as friends of friends do not directly affect an individual. If this is the case, friends of friends only influence an individual through the individual’s friends’ outcomes. The proper estimation strategy is a matter of some dispute: Lee (2007) and Bramoullé et al. (2009) prefer maximum likelihood estimation, while Goldsmith-Pinkham and Imbens (2013) suggest Bayesian models. I rely on two-stage least squares (2SLS). The results vary little depending on the estimation technique; however, one reason to prefer 2SLS in this setting is computational feasibility, as the graphical reconstruction process requires estimating the model many times. Additionally, while the results should apply generally, Chandrasekhar and Lewis (2012) only prove the efficacy of their corrections for 2SLS and Generalized Method of Moments estimators.
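To make the instrumenting strategy concrete, the following is a minimal simulated sketch, not the estimation code used in this paper. It assumes a single exogenous covariate, a row-normalized adjacency matrix G, no fixed effects, and a hypothetical Erdős–Rényi random network; all function names and parameter values are illustrative. The peer mean outcome Gy is instrumented by G²x, the mean attribute of friends of friends.

```python
import numpy as np

def random_network(n, p, rng):
    """Undirected random graph, row-normalized so that G @ v gives peer means."""
    upper = np.triu(rng.random((n, n)) < p, k=1)
    A = (upper | upper.T).astype(float)
    deg = A.sum(axis=1, keepdims=True)
    return np.divide(A, deg, out=np.zeros_like(A), where=deg > 0)

def simulate_outcomes(G, x, alpha, beta, gamma, delta, noise_sd, rng):
    """Solve y = alpha + beta*G y + gamma*x + delta*G x + eps for y."""
    n = len(x)
    eps = noise_sd * rng.standard_normal(n)
    rhs = alpha + gamma * x + delta * (G @ x) + eps
    return np.linalg.solve(np.eye(n) - beta * G, rhs)

def tsls_peer_effects(y, x, G):
    """2SLS estimates of (alpha, beta, gamma, delta), instrumenting G y with G^2 x."""
    ones = np.ones_like(x)
    X = np.column_stack([ones, G @ y, x, G @ x])        # regressors
    Z = np.column_stack([ones, x, G @ x, G @ (G @ x)])  # instruments
    # First stage: project the regressors onto the instrument space.
    X_hat = Z @ np.linalg.lstsq(Z, X, rcond=None)[0]
    # Second stage: regress the outcome on the fitted regressors.
    return np.linalg.lstsq(X_hat, y, rcond=None)[0]

# Hypothetical simulation settings, chosen only for illustration.
rng = np.random.default_rng(0)
G = random_network(n=400, p=0.02, rng=rng)
x = rng.standard_normal(400)
y = simulate_outcomes(G, x, alpha=1.0, beta=0.4, gamma=1.0, delta=0.5,
                      noise_sd=1.0, rng=rng)
alpha_hat, beta_hat, gamma_hat, delta_hat = tsls_peer_effects(y, x, G)
print(beta_hat)
```

Because this estimator only requires two least-squares solves, it is cheap to repeat many times, which is the computational argument for 2SLS made above. The sketch omits network fixed effects and any correction for sampled networks.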