
Chapter 7: Using networks to identify differences between disciplinary and interdisciplinary authors

7.4. Betweenness centrality

Betweenness centrality has been used to determine how far an individual lies in between all other nodes in a network. It is calculated by determining the number of shortest paths that go through the node (Freeman 1977). Betweenness centrality is given by the following expression.

$$C_{\mathrm{betweenness},i} = \sum_{j=1,\, j \neq i}^{N} \; \sum_{k=1,\, k \neq j \neq i}^{N} \frac{\varepsilon_{jk}(i)}{\varepsilon_{jk}} \qquad (7.1)$$

Where $\varepsilon_{jk}$ is the number of shortest paths between nodes $j$ and $k$ (1 if there is a unique shortest path), and $\varepsilon_{jk}(i)$ is the number of those shortest paths going through node $i$. This requires two sets of


In cases where there is no unique path, all shortest paths between a pair of nodes must be known. To efficiently find all paths and their dependencies, the number and length of shortest paths must be known. These can be calculated using matrix multiplication, as $\mathbf{A}^n$ provides the connectivity matrix for path length $n$. Increasing $n$ by 1 at a time and storing the first non-zero entry for each node pair yields the shortest path length, $n$, and the number of shortest paths, $\mathbf{A}^n_{ij}$. However, this is computationally expensive, requiring $O(nN^3)$ calculations, where $N$ is the number of nodes. Furthermore, matrix operations cannot store path dependencies.
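The matrix-power approach described above can be sketched as follows. This is a minimal illustration only, assuming that $\mathbf{A}$ is the adjacency matrix of an unweighted network stored as a list of lists; the function names and the four-node example network are hypothetical and are not taken from the data set.

```python
def mat_mul(a, b):
    """Multiply two square matrices given as lists of lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def shortest_paths_by_powers(adj):
    """Return (length, count) matrices found from successive powers of adj.

    length[i][j] is the smallest n for which (A^n)[i][j] is non-zero and
    count[i][j] is that first non-zero entry, i.e. the number of shortest
    paths. Unreachable pairs and the diagonal are left as None.
    """
    n = len(adj)
    length = [[None] * n for _ in range(n)]
    count = [[None] * n for _ in range(n)]
    power = [row[:] for row in adj]          # A^1
    for step in range(1, n):                 # a shortest path has fewer than N links
        for i in range(n):
            for j in range(n):
                if i != j and length[i][j] is None and power[i][j] > 0:
                    length[i][j] = step
                    count[i][j] = power[i][j]
        power = mat_mul(power, adj)          # A^(step + 1)
    return length, count


# Hypothetical 4-node ring: 0-1, 1-2, 2-3, 3-0
ring = [
    [0, 1, 0, 1],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [1, 0, 1, 0],
]
length, count = shortest_paths_by_powers(ring)
print(length[0][2], count[0][2])  # opposite corners: length 2, two shortest paths
```

Each matrix product costs $O(N^3)$ operations, which is where the overall $O(nN^3)$ cost comes from. The sketch returns only lengths and counts, not which nodes the paths pass through.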

Dijkstra's algorithm and Breadth-First Search (BFS) are well suited to storing the path dependencies. Furthermore, because shortest paths form trees, predecessor paths can be used to determine shortest paths, as shown in Figure 7.6.

Figure 7.6. Shortest paths create trees, utilising $\mathrm{path}(A, C) = \mathrm{path}(A, D) + \mathrm{path}(D, C)$. The shortest path from A to D is straightforward. The shortest paths from D to C and from D to E diverge. The shortest path(A, C) is equal to the shortest path(A, D) + path(D, C). This can be taken advantage of to reduce the computational cost.

Using such methods, it is also possible to reduce the computational cost of finding the shortest paths to $O(Ne + N^2 \log N)$ in weighted networks and $O(Ne)$ in unweighted networks, where $e$ is the number of links (Brandes 2001).
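A minimal sketch of the BFS-based approach for unweighted networks, in the spirit of Brandes (2001), is given below. It is not the implementation used in this work. The dictionary-of-neighbours representation, the function name and the small example network (A, C and E each linked only to D) are assumptions made for illustration. The double sum in Equation 7.1 runs over ordered pairs, so each undirected pair is counted twice, and the sketch follows that convention.

```python
from collections import deque


def betweenness(graph):
    """Betweenness of every node in an unweighted network.

    graph maps each node to a list of its neighbours (both directions
    listed for an undirected link). Scores follow the ordered-pair sum
    of Equation 7.1, so each undirected pair contributes twice.
    """
    score = {v: 0.0 for v in graph}
    for source in graph:
        # BFS from source: distances, shortest-path counts and predecessors
        dist = {source: 0}
        sigma = {v: 0 for v in graph}
        sigma[source] = 1
        preds = {v: [] for v in graph}
        order = []
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in graph[v]:
                if w not in dist:              # first visit fixes the distance
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:     # v lies on a shortest path to w
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Walk back from the farthest nodes, accumulating dependencies
        delta = {v: 0.0 for v in graph}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w])
            if w != source:
                score[w] += delta[w]
    return score


# Hypothetical network: A, C and E are each linked only to D
example = {
    "A": ["D"],
    "C": ["D"],
    "E": ["D"],
    "D": ["A", "C", "E"],
}
scores = betweenness(example)
print(scores["D"])  # all 6 ordered pairs among A, C and E pass through D
```

The predecessor lists play the role of the shortest-path trees: each node records only which neighbours precede it, so whole paths are never stored explicitly.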

Betweenness has been argued to be important to developing academic knowledge, as it indicates how many different ideas flow through a node, thereby increasing the node's overall centrality in the knowledge network (Nahapiet and Ghoshal 1998, Li, Liao et al. 2013). This assumes heterogeneous knowledge flows, but a positive correlation to academic outputs has been found.


7.4.1. Model validity

Model validity tests whether the model holds for this data set; this is done by testing Hypothesis 7.1. As can be seen in Figure 7.7, there is a positive linear trend. The statistical results are given in Table 7.1. The p-value of the F-statistic is below the 0.05 threshold, and the R² value is 0.2390. This is a relatively weak correlation, and the model does not perform as well as the degree centrality model.

β1 is 0.0031, thereby rejecting the null hypothesis and accepting the alternative hypothesis. The model is validated.
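To make the roles of β1 and R² concrete, the following sketch fits a simple ordinary least squares line of impact factor against betweenness. It is a simplification: the analysis reported here is a fixed effects panel analysis, whereas the sketch is a single cross-sectional fit, and the function name and all numbers in it are made up for illustration rather than taken from the data set.

```python
def ols_fit(x, y):
    """Return (beta0, beta1, r_squared) for the line y = beta0 + beta1 * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    beta1 = sxy / sxx                      # slope
    beta0 = mean_y - beta1 * mean_x        # intercept
    ss_res = sum((yi - (beta0 + beta1 * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return beta0, beta1, 1.0 - ss_res / ss_tot


# Made-up values for illustration only, not taken from the data set
betweenness_values = [0.0, 100.0, 200.0, 300.0, 400.0]
impact_values = [1.0, 1.5, 1.4, 2.2, 2.1]

beta0, beta1, r2 = ols_fit(betweenness_values, impact_values)
print(beta1 > 0)  # a positive slope corresponds to the alternative hypothesis
```

A positive β1 on its own is not sufficient; the F-statistic p-value indicates whether the slope is distinguishable from zero, and R² indicates how much of the variation the line explains.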

This trend holds through all networks from 2000-2010 to 2000-2017.

Figure 7.7. Scatter plot for all points across all time periods; instead of individual points, bars showing the spread are shown. The clear blue band shows the 95% confidence interval for the chosen regression. The scatter plot shows the betweenness vs. impact factor from 2000-2010 to 2000-2017 (i.e. 8 time-periods). A positive correlation can be seen.


Table 7.4. The statistical results of the fixed effects panel data analysis of betweenness vs impact factor from 2000-2010 to 2000-2017. The R-squared values show that there is a relatively strong positive trend based on the betweenness (between) and along time (within).

7.4.2. Department-based disciplinarity differences

As the model is deemed valid, it is possible to test for differences in disciplinary and interdisciplinary authors.

The box-plot of the correlation is shown in Figure 7.8. No trend can be seen between, and a small positive increase can be seen within. The statistical analysis in Table 7.5 confirms that the trend is very small. However, the fit is not valid, as the between R² value is negative. This means that a horizontal fit is better for the time-averaged values. Given that an overall trend can be seen, but is not well represented by an OLS trend, no conclusion can be drawn.
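A negative R² can arise when R² is computed as one minus the ratio of the residual to the total sum of squares and the fitted line is applied to data it was not optimised for, such as time-averaged values. The sketch below illustrates this under that assumed definition, with made-up numbers and a hypothetical slope; it does not reproduce the values in Table 7.5.

```python
def r_squared(observed, predicted):
    """R^2 = 1 - SS_res / SS_tot; negative when predictions beat nothing."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Made-up time-averaged values for illustration only
x = [1.0, 2.0, 3.0, 4.0]
y = [2.0, 2.1, 1.9, 2.0]

# Horizontal fit: predict the mean of y everywhere
horizontal = [sum(y) / len(y)] * len(y)

# Hypothetical sloped line taken from a different estimator
sloped = [1.0 + 0.5 * xi for xi in x]

r2_horizontal = r_squared(y, horizontal)
r2_sloped = r_squared(y, sloped)
print(r2_sloped < r2_horizontal)  # the sloped line does worse than the mean
```

The horizontal fit scores zero by construction, so any line with larger residuals than the mean scores below zero.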

The null hypothesis cannot be rejected. Hypothesis 7.2 is rejected and no discernible difference between disciplinary and interdisciplinary authors can be identified for the department-based disciplines for the betweenness model.


Table 7.5. The statistical results of the fixed effects panel data analysis of betweenness vs impact factor from 2000-2010 to 2000-2017. The R-squared values show that there is no trend between and a weak trend within.

Figure 7.8. The box-plot for interdisciplinary authors as determined by department-based disciplines with 𝑘 ≤ 30 (as fewer points cause greater deviation) separated by time. The box-plot shows the interdisciplinary authors’ betweenness vs. the interdisciplinary authors’ impact factor normalised by the disciplinary trend from 2000-2010 to 2000-2017 (i.e. 8 time-periods). An overall negative trend can be seen between, and inconclusive trends can be seen within.


7.4.3. Content-based disciplinarity differences

The content-based disciplinarity again shows less variation between disciplinary and interdisciplinary authors, as shown in Table 7.6.

The null hypothesis cannot be rejected, and therefore Hypothesis 7.2 is rejected. There is no discernible difference between disciplinary and interdisciplinary nodes for content-based disciplines for the betweenness model.

Table 7.6. The statistical results of the fixed effects panel data analysis of betweenness vs impact factor from 2000-2010 to 2000-2017. The R-squared values show that there is a relatively strong positive trend based on the betweenness (between) and along time (within).


7.4.4. Model discussion

For the betweenness model, Hypothesis 7.1 is corroborated whilst Hypothesis 7.2 is rejected. Hypothesis 7.1 being corroborated means that the findings from similar work (Li, Liao et al. 2013) hold for the University of Bath co-authorship network.

This suggests that the betweenness is a suitable indicator for performance in research organisations when hard boundaries are drawn around the organisation. However, it has a weaker correlation to impact factor than the degree centrality does.

As with the degree centrality, Hypothesis 7.2 is rejected and no statistically significant differences between disciplinary and interdisciplinary authors can be found. This means that both interdisciplinary and disciplinary authors could be judged equally by their betweenness centrality. As with the degree centrality, this can be useful as there is a trend, but it provides us with no further knowledge regarding the differences between disciplinary and interdisciplinary authors.

In document Sustaining Interdisciplinary Research: A Multilayer Perspective (Page 130-136)