2 Literature Review
2.3 Internet Market Structuring
2.3.2 Network Formation Network Models
Network Science generally studies the forces that shape the development and structure of networks. Networks are composed of vertices (representing the network’s agents) and the edges linking them, which represent relationships between these vertices. According to Schneider and Bauer (2016), empirical networks on the Internet are considered to be neither regular nor random. Regular networks refer to network graphs where each vertex has the same number of neighbouring vertices and every vertex has the same number of In- and Out-Degrees, representing incoming and outgoing relationships between vertices. Networks may be studied based on their edges being of a directed or undirected nature. Directed networks refer to relationships amongst vertices that are directed (e.g. vertex A links to vertex B, but vertex B does not link to vertex A). Undirected networks merely acknowledge whether or not there is a linkage (and possibly its number of occurrences) between vertices. The literature covers a number of network formation models that follow specific structural properties. Random Networks employ probability distributions (Bollobás, 2001) and were first defined by Erdős and Rényi (1959) and independently by Gilbert (1959). Watts and Strogatz (1998) and Watts (1999) propose a Small-World Network model where the vertices in so-called sub-graphs (subsections of network graphs) are densely interconnected amongst each other. Watts and Strogatz (1998) are credited with this model but base their work on earlier models of Simon (1962). Albert and Barabási (2002) describe the preferential attachment (‘rich-get-richer’) effect of vertices and edges in so-called Scale-Free Network models. These preferential attachment models allow researchers to simulate the emergence of growth in networks, which is discussed in economics in terms of network effects and network externalities; Katz and Shapiro (1985) and Economides (1995), amongst others, discuss the implications of network externalities for telecommunications market structures.
Power-law Degree Distributions
At the aggregate level, the preferential attachment modality of establishing connections between vertices in a network leads to power-law degree distributions (Albert and Barabási, 2002; Barabási Labs, 2013). These power-law degree distributions are typical indicators of a hierarchical network structure, since a few vertices have many edges directly linking them with other vertices, whereas many vertices have only a few edges. This feature is typically captured in the distribution of edges across vertices, which follows a power law (Pareto, 1906), a form that has also proven useful in modelling income distributions (Reed, 2001). Faloutsos, Faloutsos and Faloutsos (1999) find that the Internet structure follows power-law degree distributions at the Autonomous System level. The work of Dall’Asta et al. (2005) supports this finding. When comparing different tools for generating network structures, Medina, Matta and Byers (2000) argue that power-laws can only be found in dynamical growth models such as the one of Barabási and Albert (1999), which adds new vertices and edges to a network. Hence, Medina, Matta and Byers (2000) provide sufficient proof that the outgoing connectivity of a vertex (see section 3.4.2 below) and rank exponents (preferential attachment of edges and vertex growth (Barabási and Albert, 1999)) provide ‘useful means’ for testing the structure of the Internet. Before that, Crovella and Bestavros (1996) find that the Internet at the World Wide Web level also displays power-law degree distributions. This is supported by the findings of Albert, Jeong and Barabási (1999), Huberman and Adamic (1999) as well as Kumar et al. (1999). Caldarelli, Marchetti and Pietronero (2000) then find, on the basis of Internet mapping efforts by Cheswick, Burch and Branigan (2000), that an analysis at Router-Level (Internet Protocol granularity) from an end-user perspective also shows power-law degree distributions as well as Scale-Free Network properties. While Pastor-Satorras, Vázquez and Vespignani (2001) and subsequently Vega-Redondo (2003) support these findings, Knight et al. (2011) argue that power-law degree distributions seem convincing but lack accurate data, since the data used was not published together with the articles. Lakhina et al. (2003) also argue against Faloutsos, Faloutsos and Faloutsos (1999), saying that power-law functions are an illusion created by biased data. More recently, Willinger and Roughan (2013) also challenge the power-law analysis, saying that traceroute detections at Internet Protocol (IP) Level reflect network specifics (opaque layer 2 cloud networks), and add that traceroutes are unable to reveal the actual vertex degree of any router. They conclude that the absence or presence of power-law degree distributions cannot be justified with reasonable statistical confidence. While taking a pragmatic stance on this issue, we recognise the importance of choosing the most appropriate granularity of analysis.
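The link between preferential attachment and heavy-tailed degree distributions discussed above can be illustrated with a minimal Python sketch. It is not taken from any of the cited studies: the function names, the network size, the number of links per new vertex and the cut-off k_min are all assumptions chosen for illustration, and the exponent is obtained with a commonly used approximate maximum-likelihood estimator for discrete power-law tails, which gives only a rough indication on a sample of this size.

```python
import math
import random
from collections import Counter
from statistics import median


def preferential_attachment(n_vertices, m_links, seed=0):
    """Grow an undirected network in which each new vertex links to m_links
    existing vertices chosen with probability proportional to their degree."""
    rng = random.Random(seed)
    # Start from a small, fully connected core of m_links + 1 vertices.
    edges = [(i, j) for i in range(m_links + 1) for j in range(i)]
    # Every vertex appears here once per incident edge, so a uniform draw
    # from this list is a degree-proportional draw over vertices.
    endpoints = [v for edge in edges for v in edge]
    for new in range(m_links + 1, n_vertices):
        targets = set()
        while len(targets) < m_links:
            targets.add(rng.choice(endpoints))
        for target in targets:
            edges.append((new, target))
            endpoints.extend((new, target))
    return edges


def degree_of(edges):
    """Count the number of edges incident to each vertex."""
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return degree


def tail_exponent(degrees, k_min=5):
    """Approximate maximum-likelihood exponent for the tail k >= k_min
    (k_min is an illustrative assumption, not a fitted value)."""
    tail = [k for k in degrees if k >= k_min]
    return 1.0 + len(tail) / sum(math.log(k / (k_min - 0.5)) for k in tail)


# Illustrative parameters only.
edges = preferential_attachment(n_vertices=5000, m_links=2)
degrees = list(degree_of(edges).values())
print("median degree:", median(degrees))
print("maximum degree:", max(degrees))
print("estimated tail exponent:", round(tail_exponent(degrees), 2))
```

In such a run, most vertices keep a degree close to m_links while a handful of early vertices accumulate a far larger number of edges, which is the hierarchical pattern described in the text. The sketch says nothing about whether measured Internet data follow the same pattern, which is precisely the point contested by Lakhina et al. (2003) and Willinger and Roughan (2013).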
Insight 6: Based on the research findings stated in the above literature, we expect our case study networks to display power-law degree distributions for primary collected active Internet periphery measurements. Given the economic nature of the upstream Internet market, these power-law degree distributions signal the presence of a Tier-Model of Internet Service Provider relationships (see e.g. Luckie et al., 2013) and can be used to explore the presence of hierarchical structuring in the upstream Internet access market.
Levels, or Granularities, of Analysis
Research in the Computer Sciences does not seem to be reaching a consensus on the most appropriate level of granularity for the analysis of Internet networks. Faloutsos, Faloutsos and Faloutsos (1999) mention two possible levels of analysis, namely the Router level (Internet Protocol) and the Inter-Domain Level (Autonomous Systems). Vega-Redondo (2003) agrees on these two levels of analysis. Others such as Huffaker, Fomenkov and Claffy (2016) from CAIDA define six possible granularities of analysis, namely the Fiber, IP address, Router, Points-of-Presence, Autonomous System and Internet Service Provider. Just like Faloutsos, Faloutsos and Faloutsos (1999), Willinger and Roughan (2013) mention the Router level but elaborate further on the Switch granularity (IP Links between hubs and switches), the Physical level (including all Layer 1 devices), the Point-of-Presence Level, the Application Layer such as HTTP and HTML and finally the Autonomous System Level. When analysing the economics of Internet routes, Kagami, Tsuji and Giovannetti (2004) differentiate between three layers of analysis: the end-user level, the Internet Service Provider level and the major Internet backbone providers. These layers can be divided into the traditional supply and demand sides in economics. This represents a valuable departure from the more technical approaches of Computer Science. Given all these levels of granularity, a thorough structural analysis becomes impossible, since one might study links and flows between physical objects as well as information (Willinger and Roughan, 2013). By analysing the difficulties of simulating the Internet, Floyd and Paxson (2001) also reveal the great heterogeneity involved in studying the individual links of network traffic or the information flow through the protocols on top, and argue that the structure of the Internet is difficult to characterise due to its ever-changing dynamics. Nevertheless, the study of the Internet structure at different granularities, especially the Router and Autonomous System granularities, is considered to be equally important and fruitful (Faloutsos, Faloutsos and Faloutsos, 1999). Moreover, the results may represent a Complex Network architecture composed of many vertices and few relationships amongst the vertices (Vega-Redondo, 2003). Gorman and Malecki (2000) argue that applying Network Analysis to the study of the Internet structure is a surprisingly under-researched field. However, one has to choose the most appropriate granularity of analysis, given the research problem at hand.
Insight 7: Discussing the different granularities assures us that our exploration should be most valuable using the Internet Protocol and Autonomous System granularities, following best practices of the early Computer Science literature such as Faloutsos, Faloutsos and Faloutsos (1999). Moreover, we believe that the Autonomous System granularity allows us to shed light on economic relationships amongst Internet Service Providers.
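To make the difference between the two granularities concrete, the following Python sketch collapses one IP-level path into an AS-level path. It is a simplified illustration under stated assumptions, not the procedure of any cited study: the prefix-to-AS table is hypothetical (the prefixes are documentation address ranges and the AS numbers come from the range reserved for documentation), the hop list is invented, and hops that cannot be mapped to an AS are simply skipped.

```python
import ipaddress

# Hypothetical prefix-to-AS table (documentation prefixes and AS numbers).
PREFIX_TO_AS = {
    "192.0.2.0/24": 64496,
    "198.51.100.0/24": 64497,
    "203.0.113.0/24": 64498,
}
NETWORKS = [(ipaddress.ip_network(p), asn) for p, asn in PREFIX_TO_AS.items()]


def ip_to_as(ip):
    """Return the AS of the longest matching prefix, or None if unmapped."""
    address = ipaddress.ip_address(ip)
    matches = [(net.prefixlen, asn) for net, asn in NETWORKS if address in net]
    return max(matches)[1] if matches else None


def as_path(ip_hops):
    """Collapse an IP-level path into an AS-level path by mapping each hop
    to its AS and merging consecutive hops inside the same AS."""
    path = []
    for hop in ip_hops:
        asn = ip_to_as(hop)
        if asn is not None and (not path or path[-1] != asn):
            path.append(asn)
    return path


def edges_of(path):
    """Return the edges between consecutive vertices of a path."""
    return list(zip(path, path[1:]))


# Invented hop list standing in for one traceroute measurement.
hops = ["192.0.2.1", "192.0.2.254", "198.51.100.7", "203.0.113.9", "203.0.113.10"]
ip_edges = edges_of(hops)
as_edges = edges_of(as_path(hops))
print("IP-level edges:", len(ip_edges))
print("AS-level edges:", as_edges)
```

The same measurement thus yields a denser graph at IP granularity and a coarser one at AS granularity, where each remaining edge stands for a relationship between two networks rather than between two interfaces.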
Detailed research that relates to our case study is very limited and is mainly attributable to the following research papers. Using four different datasets, Barnett and Park (2012) investigate the structure of the World Wide Web (WWW) using Network Analysis. Their findings indicate that the Internet consists of a series of Small-World Networks, which only seems applicable to WWW networks. However, fully interconnected sub-graphs at IP or Autonomous System granularity appear counter-intuitive, given the connectivity role played by Tier-1 Internet Service Providers as described in section 2.1.2 above. More recent, and most closely related to this dissertation, is the work of UC Davis researchers Ruiz and Barnett (2015), who study the International Internet Service Provider (ISP) ownership network at company and national levels. Their approach relies on secondary Telegeography Autonomous Systems data for 113 companies and captures the number of Internet Service Provider relationships, their vertex degrees as well as the Eigenvector and Betweenness Centralities. The findings of Ruiz and Barnett (2015) show that Level 3 Communications, Century Link, Telia Sonera, AT&T and Cogent Communications are the most central companies in their limited dataset. This finding was to be expected, given the role of the Tier-1 Internet Service Providers that their study finds, as well as the CAIDA (2016a) AS-Rank data. However, their study fails to employ additional relevant network metrics and therefore lacks an in-depth analysis of structural network phenomena. Moreover, the work of Ruiz and Barnett (2015) is based on secondary data rather than primary collected active Internet periphery measurements representing end-user access to the upstream Internet market. Also closely related to this work is the pilot case experiment by Giovannetti and Sigloch (2015), who study the incumbent Bhutanese Mobile Broadband operator network at IP and AS granularity using active Internet periphery measurements. Their analysis of primary active Internet periphery measurements using the Clustering Coefficient metric reveals the structural properties of the upstream Internet market, while also indicating previously hidden upstream Autonomous System relationships. While this pilot work opens up an entirely new field of research, it does not link these findings to end-user affordability.
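As a brief illustration of the Clustering Coefficient metric mentioned above, the following Python sketch computes the local clustering coefficient of each vertex in a small undirected graph, i.e. the share of pairs of a vertex's neighbours that are themselves linked. The graph is a hypothetical toy example (a hub labelled T1 with four neighbours, two of which are also linked to each other) and is not drawn from the data of any cited study.

```python
from itertools import combinations


def adjacency(edges):
    """Build an undirected adjacency map from a list of vertex pairs."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def local_clustering(adj, vertex):
    """Share of neighbour pairs of `vertex` that are directly linked."""
    neighbours = adj[vertex]
    k = len(neighbours)
    if k < 2:
        return 0.0  # fewer than two neighbours: no pairs to close
    linked = sum(1 for a, b in combinations(neighbours, 2) if b in adj[a])
    return 2.0 * linked / (k * (k - 1))


# Hypothetical toy graph: hub T1 with four neighbours, plus one A-B link.
toy_edges = [("T1", "A"), ("T1", "B"), ("T1", "C"), ("T1", "D"), ("A", "B")]
adj = adjacency(toy_edges)
for vertex in sorted(adj):
    print(vertex, "degree:", len(adj[vertex]),
          "clustering:", round(local_clustering(adj, vertex), 3))
```

In this toy graph the hub has the highest degree but a low clustering coefficient, because few of its neighbours are linked to one another, whereas the two mutually linked neighbours score the maximum value. Such a contrast is one way in which the metric can point towards hierarchical rather than fully interconnected structures.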
Insight 8: Giovannetti and Sigloch (2015) find previously hidden Autonomous System relationships that were not visible in the CAIDA (2016b) dataset. These hidden relationships were identified using traceroute analysis at IP and AS granularity for a Bhutanese mobile broadband operator. We infer that an extension of the employed analytical approaches should also reveal hidden AS-relationships for the mobile broadband operators in this case study (see below).
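The idea of a hidden relationship can be expressed as a simple set comparison: an AS-level edge observed in the measurements but absent from a reference dataset. The Python sketch below illustrates this under explicit assumptions. Both edge lists are invented, the AS numbers come from the range reserved for documentation, and relationships are treated as undirected adjacencies, which ignores the relationship types recorded in real reference data.

```python
def undirected(edges):
    """Normalise (a, b) pairs so that (a, b) and (b, a) compare as equal;
    self-loops are dropped."""
    return {frozenset(edge) for edge in edges if edge[0] != edge[1]}


# Invented example data: AS-level edges derived from measurements and
# AS-level edges listed in a reference dataset.
measured_edges = [(64496, 64497), (64497, 64498), (64498, 64499)]
reference_edges = [(64497, 64496), (64498, 64497)]

# Edges seen in the measurements but missing from the reference data.
hidden = undirected(measured_edges) - undirected(reference_edges)
for edge in sorted(tuple(sorted(e)) for e in hidden):
    print("not in reference data:", edge)
```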