B Selection of initial seeders - Diffusion of Rival Information in the Field *

The experimental game is initiated with 100 initial seeders. To have variation in the type of seeders, I selected the 100 nodes using 5 different criteria (20 from each). These criteria were chosen to reflect standard methods used by policy makers to select the recipients of a development program.

When facing the choice to target a limited number of households in a village, a standard approach is to randomly select a group of representative households. This can be done using a random walk sampling technique, where enumerators walk away from the center of the enumeration area and select households using a predefined sampling interval. Twenty of our initial nodes are chosen using this technique.2 When the geographic location of households is considered to play a role for the program (for example, if it is an agriculture program where plot characteristics matter, or in the case where social ties matter and researchers proxy those with geographic distance), a possible method is to do a geographic randomization of the village. Since there is evidence that geographic proximity is correlated with social proximity,3 we use a spatial randomization method (grid-randomization) to select 20 additional households.

2 Five enumerators selected 4 households each, using a sampling interval of 45 houses.

3 Empirical evidence includes Fafchamps and Gubert (2007), Ambrus et al. (2014) and Chandrasekhar and Lewis (2011).

The remaining 60 seeders are not selected with a representative objective in mind, but target the most influential members of the community. Twenty households were named by the village leader as the most influential. Another 20 households were named by the villagers as the most influential (this question was part of the baseline survey). The last 20 households are the ones with the highest eigenvector centrality score. If the same household is selected by more than one criterion, I select one extra element using one of the repeated criteria. So, if one node is named by the leader and by the community, I consider the 21st most influential node named by the leader or by the community (I tossed a coin to decide which).
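The overlap rule described above can be illustrated with a short Python sketch. It is only a sketch under stated assumptions: each criterion is assumed to provide a ranked list of household identifiers, and the names `find_clash`, `select_seeders` and the toy rankings are hypothetical, not taken from the study.

```python
import random

def find_clash(picks):
    """Return (household, criterion_a, criterion_b) for the first overlap, or None."""
    owner = {}
    for name, households in picks.items():
        for household in households:
            if household in owner:
                return household, owner[household], name
            owner[household] = name
    return None

def select_seeders(ranked, quota, rng):
    """Take the top `quota` households per criterion and resolve overlaps.

    ranked: dict mapping a criterion name to its ranked list of household ids
            (assumed input format, hypothetical).
    When a household appears under two criteria, a coin toss picks the
    criterion that gives it up and takes its next-ranked household instead.
    """
    picks = {name: list(lst[:quota]) for name, lst in ranked.items()}
    next_rank = {name: quota for name in ranked}
    clash = find_clash(picks)
    while clash is not None:
        household, first, second = clash
        extended = rng.choice([first, second])  # the coin toss
        picks[extended].remove(household)
        if next_rank[extended] >= len(ranked[extended]):
            raise ValueError("ranking for %r is too short" % extended)
        picks[extended].append(ranked[extended][next_rank[extended]])
        next_rank[extended] += 1
        clash = find_clash(picks)
    return picks

# Toy example: household 2 is named by both the leader and the community.
toy = {"leader": [1, 2, 3, 4], "community": [2, 5, 6, 7]}
print(select_seeders(toy, quota=2, rng=random.Random(0)))
```

Whichever way the coin falls, the toy example ends with two households per criterion and four distinct households overall.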

Figure B1 shows the position in the village network of the initial seeders of the first session of the lab game. The households who randomly received the rival game are represented in red, and the seeders of the non-rival game are in blue. All 100 seeders are members of the large component of the graph.

Figure B1: Position of the initial seeders in the village network

Note. Rival initial seeders are in red and non-rival initial seeders are in blue. This figure represents the randomization for the first session of the monetary lab game.

C Simulation of sharing in the network

The theoretical framework for the simulations follows a standard Susceptible, Infected, Removed (SIR) model, first developed by Kermack and McKendrick (1927). SIR is used to study the epidemic of a disease that behaves as follows: one or more infected people are introduced into a community of individuals who are susceptible to the disease, and the disease spreads from infected to susceptible people. The SIR model (or an extended version, the Susceptible Infected Susceptible model) is commonly used to approximate the spread of information and the adoption of behavior by individuals (for a review see Goyal (2012), for example). Although in the SIR model people interact with each other randomly, I explicitly consider the network structure of the community, so that connected people in the network are more likely to transmit. This is in line with the work by Rapoport (1953a,b), Rapoport (1954) and Kretzschmar and Morris (1996), who extended the SIR model to allow for different network structures.

Let the community be represented by the graph $(N, g)$. $N = \{1, \dots, n\}$ is the set of individuals who are involved in a network, also known as the nodes of the graph. $g$ is the $n \times n$ adjacency matrix where each entry $g_{ij}$ represents the existence or not of a link, meaning a relation between two nodes. Time is discrete, and at each period people interact and diffusion (or infection in the original terminology) can happen between two linked nodes. $y_t$ is the total number of people informed at each period $t = 1, \dots, T$. The final period $T$ is the moment when there are no new people to be informed. There exists a set of nodes who are initially informed, $y_1$, which are the initial seeders of the information. How these seeders became informed is exogenous to the model. In the standard version of the model, informed nodes transmit the information to someone with whom they share a link with fixed probability $q$. To introduce the strategic decision to share when the information is rival, the sharing probability $q(\phi)$ is a function of how rival the nodes perceive the information to be.
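The standard version of the model, with a fixed probability $q$, can be sketched in a few lines of Python. This is a minimal illustration and not the R code used for the paper: the function name `simulate_diffusion`, the adjacency-list input and the toy path graph are assumptions, and the sketch assumes that a node tries to share only in the period right after it is informed.

```python
import random

def simulate_diffusion(neighbors, seeds, q, rounds, rng):
    """Return the cumulative number of informed nodes, period by period.

    neighbors: dict mapping each node to the list of nodes it is linked to
               (an adjacency-list view of the matrix g; assumed format).
    seeds:     initially informed nodes (the set written y_1 in the text).
    q:         probability that an informed node informs a given neighbor.
    rounds:    maximum number of periods.
    """
    informed = set(seeds)
    active = set(seeds)          # nodes informed in the previous period
    counts = [len(informed)]
    for _ in range(rounds):
        newly_informed = set()
        for node in active:
            for other in neighbors[node]:
                if other not in informed and rng.random() < q:
                    newly_informed.add(other)
        if not newly_informed:   # nobody new to inform: diffusion stops
            break
        informed |= newly_informed
        active = newly_informed  # assumption: each node shares only once
        counts.append(len(informed))
    return counts

# Toy network: a path 0 - 1 - 2 - 3, seeded at node 0.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
print(simulate_diffusion(path, seeds=[0], q=0.5, rounds=5, rng=random.Random(1)))
```

Averaging the last entry of `counts` over many runs with different random draws gives the kind of summary reported for a combination of $q$ and number of rounds.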

The simulations start by seeding each treatment to the respective 50 initial seeders. The probability of sharing in the rival game, $q_r$, and the probability of sharing in the non-rival game, $q_n$, are subject to:

$q_r < q_n \le \max(d)$ (4)

where $d$ is the average number of connections (degree). Notice that the second part of the inequality assumes that sharing only occurs with previous neighbors, so that the game cannot be used to generate new connections.

To narrow the values of $q_r$ and $q_n$, I use the results of empirical studies that estimate the probability of diffusion in other contexts. For $q_r$ I consider the results on sharing invitations to play a dictator game from Banerjee et al. (2012). In their setting the invitations are rival, so I use their estimates of a differential speaking model. They estimate that the probability that someone with experience playing the game informs a friend is $q_{past} = 0.18$, and the probability that someone without experience informs a friend is $q_{not} = 0.1$. Based on this, the simulations set $q_r \in [0.1; 0.2]$.

To define the sharing probability of the non-rival treatment I look at the diffusion of microfinance, studied by the same authors. Although one can discuss the non-rivalry of microfinance, it seems reasonable to consider it less rival than the invitations for the game previously discussed. From their estimates I take the probability that someone who participates in the program informs a neighbor at each round, $q_p$.4 Depending on the set of moments used in the simulated generalized method of moments, the authors estimate $q_p$ to be between 0.45 and 0.65. The estimation of $q_p$ in Banerjee and co-authors is for sharing at each round. To adapt their estimates to my design, where players can only share once, I use the overall probability of sharing at least once. So the upper and lower bounds for $q_n$ are given by $1 - \Pr\{\text{not informed in any round}\} = 1 - (1 - q_p)^T$, where $T$ is the number of rounds. The interval for the non-rival probability is therefore set as $q_n \in [0.8; 0.95]$.
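The conversion from a per-round probability to a probability of sharing at least once can be checked numerically. The Python sketch below applies $1 - (1 - q_p)^T$ to the two bounds of $q_p$; the helper name `prob_share_at_least_once` is hypothetical, and the values of $T$ tried here are assumptions, since the text does not state which number of rounds underlies the interval $[0.8; 0.95]$.

```python
def prob_share_at_least_once(q_p, rounds):
    """Probability of sharing in at least one of `rounds` independent rounds."""
    return 1.0 - (1.0 - q_p) ** rounds

# Bounds of the per-round estimate reported in the text.
Q_P_LOW, Q_P_HIGH = 0.45, 0.65

for rounds in (2, 3, 4):  # illustrative values of T only
    low = prob_share_at_least_once(Q_P_LOW, rounds)
    high = prob_share_at_least_once(Q_P_HIGH, rounds)
    print(f"T={rounds}: q_n between {low:.3f} and {high:.3f}")
```

With three rounds the bounds come out at roughly 0.83 and 0.96, which is in the neighborhood of the interval used in the simulations.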

Table C2 presents the results of simulating diffusion across the empirical network using the model and parameters mentioned before.5 Each simulation used the same set of initial seeders and was run 100 times. In Table C2 I present the average of the 100 simulations. The target value for $q_n$ was around 0.875 and for $q_r$ around 0.15.6 Simulating the diffusion for $q_r = 0.15$ was straightforward, and I present the average number of people invited, the share of people invited and the average sharing rate for $t = 4, 5, 6, 7$. In column (2) we see that if we set the number of rounds between 4 and 7, around 21.65 percent of the nodes end up being invited. This is a considerable share of the village and represents a large enough sample for estimation purposes.

Table C2: Results of the simulations for different levels of $q$ and $t$.

Note. Simulations using a random sample of 50 nodes as the diffusion points. Each combination of the probability of sharing ($q$) and number of rounds ($t$) was run 100 times. Here I present the average values over the 100 simulations. Column (1) shows the total number of people invited at the end of the game (after the last round is played). Share of people invited, in column (2), is the number of people in (1) divided by the total number of nodes. Column (3) is the average rate of sharing, that is, the average number of people invited by each player. Note that this considers the number of people invited and not only the number of people who played the games. NA means that every node invited to play the final round had played before.

4 The authors also estimate the same probability for agents who do not participate, but since I am expecting a participation rate very close to 100 percent I do not consider this estimate.

5 The R code for the simulations is available upon request. It can be easily adapted to other empirical networks and sets of parameters (including initial seeders and number of rounds).

The simulation sequence for the non-rival parameter is not as straightforward. For sharing probabilities as high as 0.7, every node invited to play the third round had already played in previous rounds. The high sharing probability makes the diffusion simulation stop after completing the second round, because no player invited for the third round had not played the game before. A possible way to address this problem would be to choose initial seeders who are distant enough from one another that they do not all invite the same people. Hence, there was an upper limit on the number of rounds, $t$. At the same time, $t$ should be high enough that the sample of players in the rival game is large enough for estimation. In the end, $t$ is set to 5 for both games. This also means that on average a node can reach any other node while playing the game, since the average shortest path in the village is 4.4.
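The comparison between the number of rounds and the average shortest path can be reproduced on any network with a breadth-first search. The Python sketch below is an illustration under assumptions: the function name `average_shortest_path` and the adjacency-list input are hypothetical, and the average is taken over ordered pairs of nodes that can reach each other.

```python
from collections import deque

def average_shortest_path(neighbors):
    """Mean shortest-path length over all ordered pairs of connected nodes.

    neighbors: dict mapping each node to the list of nodes it is linked to
               (assumed input format).
    """
    total_length = 0
    reachable_pairs = 0
    for source in neighbors:
        distance = {source: 0}
        queue = deque([source])
        while queue:                      # breadth-first search from source
            node = queue.popleft()
            for other in neighbors[node]:
                if other not in distance:
                    distance[other] = distance[node] + 1
                    queue.append(other)
        total_length += sum(distance.values())
        reachable_pairs += len(distance) - 1   # exclude the source itself
    return total_length / reachable_pairs if reachable_pairs else 0.0

# Toy network: a path 0 - 1 - 2.
print(average_shortest_path({0: [1], 1: [0, 2], 2: [1]}))
```

A number of rounds at or above this average means that, on average, information starting at one node has enough periods to travel to another.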

Another parameter to be defined is the maximum number of people invited per player. It should be large enough that the game will circulate across the network, but small enough to keep players of the non-rival game from simply naming every friend they have as a game strategy. Considering that the average sharing is 5.206 (col. (3) of Table C2) and that the average degree is 5.73, I allow players to invite up to 4 people. There is an additional advantage to setting this limit. If players do not invite everyone they know, some nodes will be left to be invited in subsequent rounds, which will attenuate the issue with diffusion of the non-rival game explained before.

The last parameter to define is the amount of the prize in the rival game. This amount must ensure that the expected value of the rival game (prize amount over the expected number of players) is the same as that of the non-rival game (50 meticais). Table C3 shows the results of re-running the simulations using the final samples of initial seeders used in each game. Sample 2 is selected to receive the rival game and sample 1 the non-rival game. With 217 nodes invited to play the rival game, column (1), the prize must be 217 × 50 mts = 10,850 mts. For implementation reasons, the prize is set to 10,000 meticais.
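The prize arithmetic can be written out explicitly. The short Python sketch below uses only the figures given in the text (217 invited nodes, 50 meticais per player and the rounded prize); the function names are hypothetical.

```python
def rival_prize(expected_players, value_per_player):
    """Prize that equalizes the expected value per player across games."""
    return expected_players * value_per_player

def expected_value_per_player(prize, expected_players):
    """Expected value of the rival game for one invited player."""
    return prize / expected_players

exact_prize = rival_prize(217, 50)                 # 10850 meticais
implied = expected_value_per_player(10000, 217)    # about 46.08 meticais
print(exact_prize, round(implied, 2))
```

Rounding the prize down for implementation therefore leaves the expected value of the rival game at roughly 46 meticais per invited player, slightly below the 50 meticais of the non-rival game.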

Table C3: Results of the simulations with final parameters for selected initial seeders.

Note. Simulations using the final sample of initial seeders. Sample 1 received the non-rival game and sample 2 the rival game. Each combination of the probability of sharing ($q$) and number of rounds ($t$) was run 100 times. Here we present the average values over the 100 simulations. Column (1) shows the total number of people invited at the end of the game (after the last round is played). Share of people invited, in column (2), is the number of people in (1) divided by the total number of nodes. In column (3) we have the average rate of sharing, that is, the average number of people invited by each player. Note that this considers the number of people invited and not only the number of people who played the games.
