Block Random Graph Models - Network Formation: Basic Random Graph Model

Chapter 4 Social Network Formation in the Workplace:

4.4 Network Formation: Basic Random Graph Model

4.4.4 Block Random Graph Models

Figure 4.4: Degree Distributions in ‘Meeting Friends’ Model

Figures show histograms of in-degree and the power distributions from the meeting-friends model of Jackson and Rogers [2007], as shown in equation 4.7, with $m$ fitted to the empirical density of the networks and $m_r = 0.95m$.

The attempts so far to capture both the density and the clustering levels of the line chief network with random graph models ignored the fact that ties are reported disproportionately to line chiefs from the same floor, a tendency which automatically increases clustering. We can incorporate floors into random graph models using block models, which date back to Holland et al. (1983). In these models, all nodes in a network are allocated to blocks $B$, a mutually exclusive and exhaustive partition of the nodes. The probability that a link is formed between nodes $i$ and $j$ then depends on the pair-block $B_{rs}$ which contains $i$ and $j$, with pair-block $B_{rs}$ being the pair of blocks $B_r$ and $B_s$ such that $B_r$ contains node $i$ and $B_s$ node $j$. Essentially, nodes located in the same block have a distinct probability of forming ties with each other (which can itself vary from block to block), compared to nodes that are not in the same block (where the probability can also vary across the pairs of blocks that contain the nodes).
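To make the definition concrete, the following is a minimal sketch of how one directed network could be drawn from a block random graph model. It is an illustration under stated assumptions, not the procedure used in this chapter: the function name `sample_block_graph`, the list `block_of` (node index to block label) and the dictionary `link_prob` (ordered pair of blocks to link probability) are all hypothetical, and links are assumed to be drawn independently of each other.

```python
import random


def sample_block_graph(block_of, link_prob, seed=0):
    """Draw one directed graph from a block random graph model.

    block_of:  list mapping node index -> block label (hypothetical format)
    link_prob: dict mapping (r, s) -> probability of a link from a node in
               block r to a node in block s (hypothetical format)
    Returns the adjacency matrix as a list of lists of 0/1.
    """
    rng = random.Random(seed)
    n = len(block_of)
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue  # no self-links
            # The link probability depends only on the pair-block of (i, j).
            p = link_prob[(block_of[i], block_of[j])]
            adj[i][j] = 1 if rng.random() < p else 0
    return adj


# Usage: two blocks of two nodes, dense within blocks and sparse across them.
blocks = [0, 0, 1, 1]
probs = {(0, 0): 0.6, (1, 1): 0.5, (0, 1): 0.05, (1, 0): 0.05}
graph = sample_block_graph(blocks, probs, seed=42)
```

Because each link is drawn independently given the blocks, any clustering in a network simulated this way comes from the block structure alone.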

The concept of blocks naturally lends itself to sewing floors in the line chief network. I estimate a restricted version of the block model with the data from the network, in which I consider only two different probabilities: probability $p_S$ for a link being formed between two line chiefs from the same floor (regardless of which floor), and probability $p_R$ for a link being formed between two line chiefs from different floors (regardless of the pair of floors). I estimate this restricted version to see whether the within/across-floor variation in network densities alone can account for the observed levels of clustering. The probabilities of link formation in the block model between two nodes in blocks $r$ and $s$ can be estimated by the pair-block density $\hat{p}_{rs}$, the density of the network in which each (potential) link connects a node from block $r$ with a node from block $s$, where either $r = s$ or $r \neq s$.

Table 4.9 shows the estimated probabilities $\hat{p}_S$ and $\hat{p}_R$ for each factory-year network. Between 23% and 68% of all possible directed links within floors are reported in the networks, with an average of 36% across the factories (column 1; the average is weighted by the number of possible within-floor links per factory-year). For Factories 2 and 3, the estimates of $p_S$ are now very close to the within-floor clustering coefficient, as shown in column 3, while for Factory 1 the estimate is roughly half of the coefficient and for Factory 4 roughly two-thirds.¹⁴ This is a marked improvement compared to Table 4.8, where the overall clustering coefficient was ca. eight times as large as the overall density. Cross-floor density ranges from essentially zero at Factory 1 in 2014 (only two links reported across the six floors with 53 line chiefs) to 13% at Factory 2. This density is well in line with cross-floor clustering at Factories 1, 2 (at least in 2013) and 4, while it is out of line at Factory 3. However, the discrepancy between cross-floor density and clustering at Factory 3 could be due to small-sample bias.¹⁵ Therefore, it seems that static block random graph models can capture both the density and the clustering of the line chief networks well.
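The pair-block density estimator for the restricted model can be sketched as follows. This is a hedged illustration rather than the code behind Table 4.9: the function name `pair_block_density` and the input format (a 0/1 adjacency matrix plus a list `floor_of` giving each line chief's floor) are assumptions, and the toy network is invented purely to show the arithmetic.

```python
def pair_block_density(adj, floor_of, same_floor):
    """Share of possible directed links that are present, counting only
    ordered pairs (i, j) on the same floor (same_floor=True) or on
    different floors (same_floor=False)."""
    present = 0
    possible = 0
    n = len(adj)
    for i in range(n):
        for j in range(n):
            if i == j:
                continue  # self-links are not possible links
            if (floor_of[i] == floor_of[j]) != same_floor:
                continue  # pair belongs to the other group of pair-blocks
            possible += 1
            present += adj[i][j]
    return present / possible if possible else float("nan")


# Toy network (made up): nodes 0 and 1 on floor "A", nodes 2 and 3 on floor "B".
adj = [
    [0, 1, 1, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 0, 0],
]
floor_of = ["A", "A", "B", "B"]

p_S_hat = pair_block_density(adj, floor_of, same_floor=True)   # 3 of 4 within-floor links
p_R_hat = pair_block_density(adj, floor_of, same_floor=False)  # 1 of 8 cross-floor links
```

In the toy network, three of the four possible within-floor links and one of the eight possible cross-floor links are present, so the two estimates are 0.75 and 0.125.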

To conclude this section on basic random graph models: as in many empirical settings, the basic random graph model due to Erdős and Rényi (1959) and subsequent growing random graph models struggle to model both the observed density and clustering at the same time. Variations of the model explicitly designed to reconcile sparsity and high clustering, such as the ‘meeting-friends’ model of Jackson and Rogers (2007), do not provide a good fit either, as their implied degree distribution is at odds with the empirical one. However, a simple block random graph model gets close to reconciling observed network density and clustering.

Table 4.9: Block-Random Graph Models

| Factory | Year | (1) $\hat{p}_S$ (within floors) | (2) $\hat{p}_R$ (across floors) | (3) Clustering within floors | (4) Clustering across floors |
|---|---|---|---|---|---|
| 1 | 2013 | 0.231 | 0.004 | 0.414 | 0.000 |
| 1 | 2014 | 0.334 | 0.001 | 0.577 | 0.000 |
| 2 | 2013 | 0.617 | 0.128 | 0.590 | 0.093 |
| 2 | 2014 | 0.683 | 0.127 | 0.622 | 0.226 |
| 3 | 2014 | 0.496 | 0.005 | 0.461 | 0.330 |
| 4 | 2014 | 0.412 | 0.018 | 0.649 | 0.000 |
| All | | 0.359 | 0.009 | 0.516 | 0.126 |

Notes: The table compares, for each network at the factory-year level, the empirical (directed) network density within floors ($\hat{p}_S$) and across floors ($\hat{p}_R$) against the empirical (directed) clustering coefficient as defined in equation 4.3, calculated within and across floors.

¹⁴ The within-floor clustering coefficient refers to the directed clustering coefficient, as defined in equation 4.3, on the network that ignores any pairs of nodes which are not on the same floor.

¹⁵ At Factory 3, only 17 out of 3,314 possible cross-floor links were reported, and there are only three instances in which one directed cross-floor link can be followed by another cross-floor link from that node to a third node on yet another floor. In one of these three cases, however (see the graph of Factory 3 in Appendix E), the first node also has a direct link to the third node, resulting in a directed clustering coefficient of 0.33.

This result contributes to a debate on the reasons for the ubiquitous high levels of clustering observed in empirical networks. Is it due to homophily, meaning that nodes of similar characteristics have a higher likelihood of forming links, thus generating clusters among themselves? Or is it due to network externalities, such that links which are part of a cluster yield higher utility? For example, friendship with another node could be more enjoyable if one shares third friends with this node. These two scenarios are often difficult to discriminate empirically (Graham [2015a]). However, in the line chief network, in which we observe very high levels of clustering, we do have a simple measure of homophily: whether line chiefs work on the same floor. Once we allow for a differential likelihood of social ties being formed within floors, static random graph models, which ignore network externalities, explain the observed level of clustering. It could be that in other empirical networks with high clustering, too, a lot of the variation in clustering could be captured by block random graph models if we had better information on the underlying group structures of nodes.
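The homophily argument can be illustrated with a small simulation, sketched here under explicit assumptions. Links are drawn independently with one probability within floors and another across floors, so there are no network externalities by construction. The clustering measure is an assumed stand-in for equation 4.3, which is not reproduced in this section: it is the share of directed paths from a first node to a second and on to a third for which the first node also links directly to the third, consistent with the 0.33 example in footnote 15. All function names, the floor sizes and the two probabilities are hypothetical choices.

```python
import random


def sample_floor_graph(floor_of, p_same, p_diff, seed=0):
    """Directed graph with independent links: probability p_same within a
    floor and p_diff across floors (no network externalities)."""
    rng = random.Random(seed)
    n = len(floor_of)
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                p = p_same if floor_of[i] == floor_of[j] else p_diff
                adj[i][j] = 1 if rng.random() < p else 0
    return adj


def density(adj):
    """Share of all possible directed links that are present."""
    n = len(adj)
    return sum(map(sum, adj)) / (n * (n - 1))


def directed_clustering(adj):
    """Assumed definition: share of directed paths i -> j -> k (distinct
    nodes) that are closed by a direct link i -> k."""
    n = len(adj)
    paths = 0
    closed = 0
    for i in range(n):
        for j in range(n):
            if i == j or not adj[i][j]:
                continue
            for k in range(n):
                if k == i or k == j or not adj[j][k]:
                    continue
                paths += 1
                closed += adj[i][k]
    return closed / paths if paths else float("nan")


# Usage: five hypothetical floors with ten line chiefs each.
floors = [f for f in range(5) for _ in range(10)]
g = sample_floor_graph(floors, p_same=0.4, p_diff=0.01, seed=1)
print("overall density:   ", round(density(g), 3))
print("overall clustering:", round(directed_clustering(g), 3))
```

With these parameter choices, most directed two-step paths run within a floor, where the closing link has a high probability of being present, so the overall clustering coefficient should come out well above the overall density even though no link depends on any other.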

In document Training, organizational learning and productivity : three essays on the Bangladeshi garment industry (Page 114-117)