Chapter 7: Using networks to identify differences between disciplinary and interdisciplinary authors

7.6. Structural holes

Structural holes in collaboration networks have been linked with diversity of information. As such, it would be expected that IDR would be characterised by many structural holes. A structural hole is defined by the absence of a link between two of a node’s neighbours. For instance, if j and k are neighbours of node i but are not linked to one another, they create a structural hole.

Figure 7.13. The two structures show that node A has three structural holes, whereas node B does not have any. Despite the second graph being denser, it is reasoned that node A benefits from greater diversity. There is greater redundancy in the structure on the right.
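The definition above can be sketched directly as a count of unlinked neighbour pairs. This is a minimal illustration, not the thesis implementation: the function name `structural_holes` and the two small graphs are assumptions, chosen to match the caption of Figure 7.13 (a node with three unconnected neighbours, and a node whose three neighbours are all linked to one another).

```python
from itertools import combinations

def structural_holes(adjacency, node):
    """Count unordered pairs of neighbours of `node` that are not linked."""
    neighbours = sorted(adjacency[node])
    return sum(1 for u, v in combinations(neighbours, 2)
               if v not in adjacency[u])

# Hypothetical graphs (assumed shapes, based on the caption of Figure 7.13).
# Left: A is linked to x, y and z, which are not linked to one another.
star = {"A": {"x", "y", "z"}, "x": {"A"}, "y": {"A"}, "z": {"A"}}
# Right: B is linked to x, y and z, which are all linked to one another.
clique = {"B": {"x", "y", "z"}, "x": {"B", "y", "z"},
          "y": {"B", "x", "z"}, "z": {"B", "x", "y"}}

print(structural_holes(star, "A"))    # 3
print(structural_holes(clique, "B"))  # 0
```

Counting unordered pairs gives three holes for node A; the matrix-based expressions later in this section count each pair in both directions.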

The original concept was defined as individuals having complementary knowledge but not being directly connected (Burt 2004, Burt 2009). This concept is based on the same underlying reasoning as the strength of weak ties, which suggests that stronger, more homophilous ties are more likely to overlap in neighbours. That is to say, different knowledge travels through ‘bridges’, which are unlikely to have redundant connections.

It is important to note that such network structures are hypothetical and based on a small-world concept; in practice, entire communities that are only weakly connected to other communities are rare.

The number of paths of length k leading from vertex i to j can be given by $A^{k}_{i,j}$. Therefore, the number of triangles is given by the following expression.

$$\text{number of triangles at node } i = A^{3}_{i,i} \tag{7.1}$$
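As a rough check of equation (7.1), the diagonal of the cubed adjacency matrix can be computed for a small graph. The sketch below uses plain Python lists; the helper names and the example graph are assumptions made for illustration. Note that the diagonal entry counts each undirected triangle twice, once per direction of travel, which matches the directional counting used for the maximum number of structural holes later in this section.

```python
def matmul(a, b):
    """Multiply two square matrices given as lists of lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def closed_walks_3(adj):
    """Return the diagonal of A^3: closed paths of length 3 at each node."""
    cube = matmul(matmul(adj, adj), adj)
    return [cube[i][i] for i in range(len(adj))]

# Hypothetical example: a triangle 0-1-2 with a pendant node 3 attached to 0.
A = [[0, 1, 1, 1],
     [1, 0, 1, 0],
     [1, 1, 0, 0],
     [1, 0, 0, 0]]

print(closed_walks_3(A))  # [2, 2, 2, 0]
```

Nodes 0, 1 and 2 each sit on one triangle, traversed in two directions; node 3 sits on none.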

A variation was found to provide better fits when indirect closures were also included in the measure. An indirect closure measures not only whether there is a direct link between neighbours, but also whether there is an indirect link between them via another node.

Figure 7.14. Two different structures are considered in this figure. The circles represent nodes, solid lines represent links, and the dashed lines represent structural holes affecting node A. Consider the structure shown on the left. There are two structural holes. If a node were to be added as shown on the right, it is arguable that there is an indirect closure affecting the structural holes, lessening their impact.

This therefore accounts not only for triangles, but for rectangles as well. The number of rectangles associated with a node can be calculated by considering that paths of length 4 starting and finishing at the same node consist of either repeated movements of the second order (stepping back and forth along links) or squares. Thus, subtracting the repeated movements from $A^{4}_{i,i}$ yields the number of directional rectangles associated with a given node. Using the following equation, the number of rectangles can be found.

$$\text{number of rectangles at node } i = A^{4}_{i,i} - \left( \left(A^{2}_{i,i}\right)^{2} + \sum_{j=1,\, j \neq i}^{N} A^{2}_{i,j} \right) \tag{7.2}$$
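Equation (7.2) can be checked on two small graphs: a single square, where one rectangle traversed in two directions is expected, and a simple path, where none is. This is a sketch under assumed names (`directional_rectangles`, `square`, `path`), not the code used in the thesis.

```python
def matmul(a, b):
    """Multiply two square matrices given as lists of lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def directional_rectangles(adj, i):
    """Evaluate equation (7.2) at node i."""
    n = len(adj)
    a2 = matmul(adj, adj)
    a4 = matmul(a2, a2)
    # Paths i-a-i-b-i: two back-and-forth steps from node i.
    back_and_forth = a2[i][i] ** 2
    # Paths i-a-j-a-i with j != i: out to a second-order node and back.
    out_and_back = sum(a2[i][j] for j in range(n) if j != i)
    return a4[i][i] - (back_and_forth + out_and_back)

# Hypothetical examples: a 4-cycle 0-1-2-3-0 and a path 0-1-2.
square = [[0, 1, 0, 1],
          [1, 0, 1, 0],
          [0, 1, 0, 1],
          [1, 0, 1, 0]]
path = [[0, 1, 0],
        [1, 0, 1],
        [0, 1, 0]]

print(directional_rectangles(square, 0))  # 2
print(directional_rectangles(path, 1))    # 0
```

The value 2 for the square reflects the same rectangle traversed clockwise and anticlockwise.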

However, this does not account for multiple intermediate nodes connecting the same two neighbours (i.e. many different indirect paths between two neighbours). Each additional indirect path would increase the number of rectangles, which is not desired when calculating structural holes, where the only concern is whether or not there is a closure. Therefore, the number of rectangles forming unique indirect closures will be given by the following expression.

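$$\text{number of rectangles at node } i = A^{4}_{i,i} - \left( \left(A^{2}_{i,i}\right)^{2} + \sum_{j=1,\, j \neq i}^{N} \begin{cases} A^{2}_{i,j} & \text{if } A^{2}_{i,j} \le 2 \\ 2 & \text{if } A^{2}_{i,j} > 2 \end{cases} \right) \tag{7.3}$$

$$\text{number of rectangles at node } i = A^{4}_{i,i} - \left( \left(A^{2}_{i,i}\right)^{2} + \sum_{j=1,\, j \neq i}^{N} A^{2}_{i,j,\, x \le 2} \right) \tag{7.4}$$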

Having established how to find the number of closures, it is then important to calculate the number of possible closures given the number of neighbours. By subtracting the number of closures from this number, the number of structural holes is found. The maximum number of structural holes is given by the following expression.

$$\sigma_{\max}\Big|_{k \ge 2} = \frac{k!}{(k-2)!} \tag{7.5}$$
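As a brief worked example of equation (7.5), offered only as an illustration, the factorial ratio simplifies to the number of ordered neighbour pairs:

$$\frac{k!}{(k-2)!} = k(k-1), \qquad \text{so for } k = 3: \quad \frac{3!}{1!} = 3 \cdot 2 = 6.$$

A node with three neighbours therefore has three distinct neighbour pairs, each counted once per direction. This matches the directional counting of closures given by $A^{3}_{i,i}$.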

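This is the number of ordered pairs (permutations) that can be formed from the k neighbours. Therefore, the numbers of triangular and rectangular structural holes are given by the following expressions.

$$\sigma^{\mathrm{tri}}_{i}\Big|_{k_i \ge 2} = \frac{k!}{(k-2)!} - A^{3}_{i,i} \tag{7.6}$$

$$\sigma^{\mathrm{quad}}_{i}\Big|_{k_i \ge 2} = \frac{k!}{(k-2)!} - \left( A^{4}_{i,i} - \left( \left(A^{2}_{i,i}\right)^{2} + \sum_{j=1,\, j \neq i}^{N} A^{2}_{i,j,\, x \le 2} \right) \right) \tag{7.7}$$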

Whilst both can be combined to calculate a single number, these should be weighted. It is proposed that the structural hole contribution should be dependent on the path length. The dependency should furthermore be exponential on the path length of the closures (3 and 4 for triangular and rectangular closures respectively). Normalising by $(l - 2)^{2}$, which equals 1 for $l = 3$ and 4 for $l = 4$, provides the following expressions.

$$\sigma^{\mathrm{tri}}_{i}\Big|_{k_i \ge 2} = \frac{k!}{(k-2)!} - A^{3}_{i,i} \tag{7.8}$$

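$$\sigma^{\mathrm{quad}}_{i}\Big|_{k_i \ge 2} = \frac{1}{4} \left( \frac{k!}{(k-2)!} - \left( A^{4}_{i,i} - \left( \left(A^{2}_{i,i}\right)^{2} + \sum_{j=1,\, j \neq i}^{N} A^{2}_{i,j,\, x \le 2} \right) \right) \right) \tag{7.9}$$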

Both are inversions analogous to the closed triplets clustering coefficient (Wasserman and Faust 1994).

$$C = \frac{3 \cdot \text{number of closed triplets}}{\text{number of connected triplets}} \tag{7.10}$$

Summing these two provides a single measure that can be used to measure structural holes. This measure is expected to be proportional to the metrics of success. The full measure is defined as the following expression.

$$\sigma^{\mathrm{indirect}}_{i}\Big|_{k_i \ge 2} = \frac{k!}{(k-2)!} - A^{3}_{i,i} + \frac{1}{4} \left( \frac{k!}{(k-2)!} - \left( A^{4}_{i,i} - \left( \left(A^{2}_{i,i}\right)^{2} + \sum_{j=1,\, j \neq i}^{N} A^{2}_{i,j,\, x \le 2} \right) \right) \right) \tag{7.11}$$
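The full measure in equation (7.11) can be sketched for an unweighted, undirected network as follows. This is an illustrative reading of the expression as written, not the thesis code: the function name `sigma_indirect`, the `cap` parameter (standing for the upper limit of 2 on each second-order term), and the two test graphs are all assumptions.

```python
from math import factorial

def matmul(a, b):
    """Multiply two square matrices given as lists of lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def sigma_indirect(adj, i, cap=2):
    """Evaluate equation (7.11) at node i of an unweighted adjacency matrix."""
    n = len(adj)
    k = sum(adj[i])  # degree of node i
    if k < 2:
        raise ValueError("the measure is only defined for k >= 2")
    a2 = matmul(adj, adj)
    a3 = matmul(a2, adj)
    a4 = matmul(a2, a2)
    sigma_max = factorial(k) // factorial(k - 2)           # equation (7.5)
    capped = sum(min(a2[i][j], cap) for j in range(n) if j != i)
    rectangles = a4[i][i] - (a2[i][i] ** 2 + capped)       # equation (7.4)
    sigma_tri = sigma_max - a3[i][i]                       # equation (7.8)
    sigma_quad = (sigma_max - rectangles) / 4              # equation (7.9)
    return sigma_tri + sigma_quad

# Hypothetical graphs: node 0 with three unlinked neighbours, and a
# fully connected graph on four nodes.
star = [[0, 1, 1, 1],
        [1, 0, 0, 0],
        [1, 0, 0, 0],
        [1, 0, 0, 0]]
clique = [[0, 1, 1, 1],
          [1, 0, 1, 1],
          [1, 1, 0, 1],
          [1, 1, 1, 0]]

print(sigma_indirect(star, 0))    # 7.5
print(sigma_indirect(clique, 0))  # 0.0
```

The node with unlinked neighbours scores 6 + 6/4 = 7.5, whereas the fully connected node scores 0, in line with the contrast drawn in Figure 7.13.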

Using this measure, it is possible to investigate the effect of structural holes on research and IDR. It is important to note that this assumes unweighted networks. Applying weights to clustering is not straightforward. Opsahl and Panzarasa (2009) outline that various approaches to weighted clustering have been proposed. The method proposed by Barrat, Barthelemy et al. (2004) suggests a measure using the arithmetic means of triplets. This method is robust, but does not take into consideration the weight of the closure (a closure has a length of 3, a triplet has a length of 2). As no suitable weighted measure could be found, an unweighted measure is used, although this could certainly be improved upon.

7.6.1. Model validity

The model validity is based on two things. The first is that the entire premise is based on structural holes representing heterogeneous knowledge. The second is that heterogeneity is associated with better academic outputs.

Therefore Hypothesis 7.1 needs to be extended to include both.

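i. The proportion of structural holes, $\sigma_i$, is greater in $N_{inter}$ than in $N_{intra}$:

$$H_0: \mu(\sigma_{inter}) \le \mu(\sigma_{intra})$$

$$H_A: \mu(\sigma_{inter}) > \mu(\sigma_{intra})$$

ii. The model, $X_{it}$, is positively correlated to the metrics of success, $Y_{it}$.

$$Y_{it} - \bar{Y}_{i} = \beta_1 (X_{it} - \bar{X}_{i}) + \gamma_t - \bar{\gamma} + u_{it} - \bar{u}_{i}$$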
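The demeaning step in the fixed effects expression of part ii. can be sketched as below. This is a simplified illustration under stated assumptions: the time effects $\gamma_t$ are omitted, the function name `within_beta` is hypothetical, and the panel data are invented purely to show the mechanics. It is not the estimation procedure or data used in the thesis.

```python
from collections import defaultdict

def within_beta(panel):
    """Estimate the slope beta_1 after demeaning X and Y per author.

    `panel` is a list of (author, x, y) observations. Time effects are
    left out of this sketch (an assumption made for brevity).
    """
    by_author = defaultdict(list)
    for author, x, y in panel:
        by_author[author].append((x, y))
    sxy = sxx = 0.0
    for observations in by_author.values():
        x_bar = sum(x for x, _ in observations) / len(observations)
        y_bar = sum(y for _, y in observations) / len(observations)
        for x, y in observations:
            sxy += (x - x_bar) * (y - y_bar)
            sxx += (x - x_bar) ** 2
    return sxy / sxx

# Invented panel: both authors follow y = 2x plus an author-specific offset.
panel = [("a", 1, 12), ("a", 2, 14), ("a", 3, 16),
         ("b", 2, -1), ("b", 4, 3), ("b", 6, 7)]

print(within_beta(panel))  # 2.0
```

Subtracting each author's own means removes the author-specific offsets, so the common slope is recovered despite the different baselines.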

Part i. was tested on the University of Bath network 2000-2017 using a cross-sectional two-tailed t-test. The results are shown in Table 7.10. They indicate that there is no statistically significant difference between the mean structural holes measures for disciplinary and interdisciplinary authors.

This means that the premise that interdisciplinary authors have access to more diverse knowledge is not perceptible through the network structure.
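For readers who wish to reproduce a comparison of this kind, a two-sample t-statistic can be computed as sketched below. The source does not state whether equal variances were assumed, so the unequal-variance (Welch) form used here is an assumption, as are the function name `welch_t` and the invented samples. Taking the disciplinary sample first gives a negative statistic when the interdisciplinary mean is larger, as is the case for the t-values reported in Table 7.10.

```python
from math import sqrt
from statistics import mean, variance

def welch_t(sample_a, sample_b):
    """Return the Welch t-statistic and its approximate degrees of freedom."""
    va = variance(sample_a) / len(sample_a)  # squared standard error of mean a
    vb = variance(sample_b) / len(sample_b)  # squared standard error of mean b
    t = (mean(sample_a) - mean(sample_b)) / sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(sample_a) - 1)
                           + vb ** 2 / (len(sample_b) - 1))
    return t, df

# Invented structural holes values, for illustration only.
sigma_intra = [1, 2, 3, 4, 5]
sigma_inter = [2, 4, 6, 8, 10]

t_value, dof = welch_t(sigma_intra, sigma_inter)
print(round(t_value, 4), round(dof, 4))
```

The p-value then follows from the t-distribution with the returned degrees of freedom, which requires a statistics library or tables and is not computed here.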

Part ii. is tested to see if there is a positive correlation between the structural holes measure and the impact factor. As can be seen in Figure 7.15, there is a positive trend. The statistical results are given in Table 7.11. The F-statistic P-value is below the 0.05 threshold, and the $R^2$-value is 0.4768.

The null hypothesis is rejected and Hypothesis 7.1 ii. is corroborated. This implies that the number of structural holes a node has is positively correlated to the impact factor.

Table 7.10. T-test statistical analysis of structural holes measure.

Disciplines        Mean disciplinary measure, 〈σ_intra〉   Mean interdisciplinary measure, 〈σ_inter〉   T-value    P-value
Department-based   9.3461                                  11.4954                                      -1.5854    0.11322
Content-based      9.1620                                  10.4617                                      -1.3186    0.18762

$\beta_1$ is 33720, thereby confirming the null hypothesis and rejecting the alternative hypothesis. The model that high eigenvector centrality will be less successful is disproved for the University of Bath co-authorship network 2000-2010 to 2000-2017.

However, as there is a positive trend, it is proposed that the null hypothesis model replaces the model.

Figure 7.15. Scatter plot for all points across all time periods; instead of individual points, bars showing the spread are shown. The clear blue band shows the 95% confidence interval for the chosen regression. The scatter plot shows the structural holes vs. impact factor from 2000-2010 to 2000-2017 (i.e. 8 time-periods). A positive correlation can be seen.

Table 7.11. The statistical results of the fixed effects panel data analysis of structural holes vs impact factor from 2000-2010 to 2000-2017. The R-squared values show that there is a relatively strong positive trend based on the structural holes (between) and along time (within).

7.6.2. Department-based disciplinarity differences

The box-plot of the correlation is shown in Figure 7.16. There appears to be a positive trend. However, this is not statistically significant as can be seen in Table 7.12.

If the trend were statistically significant, it would imply that interdisciplinary authors stand to benefit more from structural holes.

However, as it is not statistically significant, the null hypothesis cannot be rejected, and Hypothesis 7.2 is not corroborated.

Table 7.12. The statistical results of the fixed effects panel data analysis of betweenness vs impact factor from 2000-2010 to 2000-2017. The R-squared values show that there is no trend between and a weak trend within.

Figure 7.16. The box-plot for interdisciplinary authors as determined by department-based disciplines with $k \le 30$ (as fewer points cause greater deviation) separated by time. The box-plot shows the interdisciplinary authors’ structural holes vs. the interdisciplinary authors’ impact factor normalised by the disciplinary trend from 2000-2010 to 2000-2017 (i.e. 8 time-periods). An overall negative trend can be seen between, and inconclusive trends can be seen within.

7.6.3. Content-based disciplinarity differences

The content-based disciplines show similar results, albeit with more randomness. Hypothesis 7.2 is therefore not corroborated.

Table 7.13. The statistical results of the fixed effects panel data analysis of structural holes vs impact factor from 2000-2010 to 2000-2017. The R-squared values show that there is a relatively strong positive trend based on the structural holes (between) and along time (within).

Figure 7.17. The box-plot for interdisciplinary authors as determined by department-based disciplines with 𝑘 ≤ 30 (as

7.6.4. Model discussion

The structural holes measure provided a clear linear positive trend to corroborate Hypothesis 7.1. However, the reasoning as to why this would be better for interdisciplinary authors does not hold. Therefore, the model is validated only in the sense that having structural holes is an advantage. It is not validated in the sense that interdisciplinary authors have this structural advantage.

Hypothesis 7.2 is rejected as the findings were statistically insignificant. However, there was a positive trend, suggesting that it is possible that interdisciplinary researchers gain additional benefits from structural holes in comparison to disciplinary nodes.

In document Sustaining Interdisciplinary Research: A Multilayer Perspective (Page 143-153)