
2 LITERATURE REVIEW

2.3 Internet Market Structuring

2.3.4 Graph Visualisations and Layouts

Graph Visualisations

Because of its sheer size, mapping the Internet often involves difficult abstractions of the real world (Danesh et al., 2001). The first graph visualisation of the Internet may have been the backbone drawing of the early ARPANET from 1969, shown in Figure 2-3. This graph visualisation includes the first four institutions of the ARPANET (see section 2.1.1 above) as well as their respective connections.

Figure 2-3: ARPANET 1969 graph visualisation, Source: The Ocp (2016).

With the increasing structural complexity of the Internet, Burch and Cheswick (1999) attempted to map the Internet by studying 88,000 Internet Protocol (IP) addresses and their associated routers, and found critical indications for hop distances between their local Carnegie Mellon University and Lycos, an important search engine at that time. Using paths from a local test host containing 90,000 networks towards another host on a destination network, Cheswick, Burch and Branigan (2000) visualise network vertices using a force-directed graph visualisation layout (see below). Their visualisation reveals a number of interesting Internet Service Providers. However, their work also mentions the high complexity of the graph visualisation, which makes it hard for them to draw conclusions with great confidence (Cheswick, Burch and Branigan, 2000).

Moreover, a relevant group of authors aims to map the Internet infrastructure from a geographical perspective. Important examples include Lakhina et al. (2003), who use the CAIDA dataset from 20 ‘Skitter’ monitors (see project description above) with the CAIDA NetGeo IxMapper. Another example is the work of Shavitt and Zilberman (2012), who utilise the DIMES database to map Point-of-Presence connectivity (vertices between networks) with the ArcGIS (2016) mapping software. Roberts et al. (2011) elaborate on one of the few approaches to map the Indian Autonomous System landscape. However, their study covers 100 countries using the CAIDA (2016b) AS-Relationship data and does not explicitly focus on the Indian landscape. Dimitropoulos et al. (2007) reveal the country-level Autonomous Systems with the greatest network control. Their work presents the resulting graph visualisations in a Circular Layout using the Flare Toolkit (2010) for China, Russia, the Republic of Korea, the Islamic Republic of Iran, Egypt, Sweden, Ukraine, Angola and India, as well as a comparison between those graphs. It identifies four Autonomous Systems with great control over the 17.98 million analysed Indian IP addresses, but neither names these Autonomous Systems nor considers the structural properties of their composition. However, Dimitropoulos et al. (2007) indicate that the number of Indian Autonomous Systems with great network control is fairly low compared to the other countries in their study. Their work also neglects an end-user perspective and builds only on secondary data. Notably, there are hardly any research efforts mapping upstream Autonomous System relationships, and especially upstream connectivity structuring, from an Internet Periphery perspective.

Caldarelli, Marchetti and Pietronero (2000) analyse, on the basis of the data obtained from Cheswick, Burch and Branigan (2000), some network indicators from an end-user perspective at IP granularity, and find signs of hierarchical structural ordering between end-users and providers. Tangmunarunkit et al. (2002), independently of Caldarelli, Marchetti and Pietronero (2000) and Cheswick, Burch and Branigan (2000), show that while the Internet embodies a hierarchical structuring, graphs are better modelled without explicitly constructing hierarchies. This refers to network visualisations using the Directed Acyclic Graph layout, amongst others. While Giovannetti and Sigloch (2015) explore the upstream network connectivity structure of the incumbent Bhutanese mobile broadband provider, B-Mobile, using the Internet Periphery Analysis introduced by Faggiani et al. (2012), their generated graph visualisation supports only a rudimentary analysis.

Graph Layouts

Graph visualisations are considered well suited to displaying agents and their relationship information in networks (Eick, 1996). Moreover, according to Bastian, Heymann and Jacomy (2009), the visualisation of network graphs helps to understand network structures and their data, while the process of graph visualisation analysis is best suited to exploratory strategies (Perer and Shneiderman, 2006). Graph visualisations rely on graph layouts, which form the spatial foundation of a visualisation by determining the positions of the vertices and of the edges among them. Graph layouts are therefore used to highlight specific but highly relevant graph characteristics (Brath and Jonker, 2015). Figure 2-4 below shows a randomly generated example network consisting of 200 vertices and 1,333 edges, drawn using a Random Layout. As expected, the Random Layout fails to display specific network characteristics.

Figure 2-4: Random Layout graph visualisation with 200 vertices and 1333 edges, elaborated using Gephi (2016).
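A network of the same size as the example in Figure 2-4 can be sketched in a few lines. The snippet below is a minimal illustration and not the Gephi (2016) procedure behind the figure: the function names `random_graph` and `random_layout`, the seed and the uniform choice of edges are assumptions made for illustration only.

```python
import random

def random_graph(n, m, seed=0):
    """Pick m distinct undirected edges uniformly among n vertices (assumed model)."""
    rng = random.Random(seed)
    all_pairs = [(u, v) for u in range(n) for v in range(u + 1, n)]
    return rng.sample(all_pairs, m)

def random_layout(n, seed=0):
    """Give every vertex an independent uniform position in the unit square."""
    rng = random.Random(seed)
    return {v: (rng.random(), rng.random()) for v in range(n)}

# Same size as the example network: 200 vertices and 1,333 edges.
edges = random_graph(200, 1333, seed=42)
positions = random_layout(200, seed=42)
```

Because the positions are drawn independently of the edges, no structural property of the network can influence the drawing, which is why such a layout is uninformative.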

Nevertheless, the choice of a specific graph layout depends on the research questions addressed. While the literature offers a large set of possible graph visualisations, only a few are valuable for the exploratory analysis of Internet network structures. Among the visualisations considered in this work is the Layered Layout by Kuchar (2012), which places vertices in different layers depending on specifically chosen attributes. According to Kuchar (2012), this layout is particularly appropriate for studying Small-World Network phenomena, which represent (completely dense) connections between any vertices in a network. This layout therefore helps in testing whether any of our mobile broadband operators displays Small-World Network characteristics. Given the importance of large Tier-1 Internet Service Providers, we would expect to see no Small-World Network effects. Nevertheless, this analysis helps to obtain indicators of the interconnection efficiency of a network. If every Autonomous System were connected to every other Autonomous System in the network, the network would not display hierarchical structural features, which is highly unlikely considering the aforementioned tier-ordering of the Internet. Figure 2-5 below illustrates the same random example network as above using a Layered Layout.

Figure 2-5: Layered Layout graph visualisation with 200 vertices and 1333 edges, elaborated using Gephi (2016).
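The idea of placing vertices in layers according to a chosen attribute can be sketched as follows. This is only an illustration of the general principle and not Kuchar's (2012) implementation: the function `layered_layout`, the Autonomous System labels and the `tier` attribute are hypothetical.

```python
from collections import defaultdict

def layered_layout(layer_of, layer_gap=1.0, width=1.0):
    """Place vertices on horizontal lines, one line per attribute value."""
    layers = defaultdict(list)
    for vertex, layer in layer_of.items():
        layers[layer].append(vertex)
    positions = {}
    for row, layer in enumerate(sorted(layers)):
        members = sorted(layers[layer])
        for i, vertex in enumerate(members):
            # Spread the vertices of one layer evenly across the width.
            x = width * (i + 1) / (len(members) + 1)
            positions[vertex] = (x, row * layer_gap)
    return positions

# Hypothetical vertices with an assumed tier attribute (1 = top layer).
tier = {"AS1": 1, "AS2": 1, "AS3": 2, "AS4": 2, "AS5": 3}
pos = layered_layout(tier)
```

Edges that skip layers or stay within one layer then become visible at a glance, which is what makes the attribute choice the central design decision.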

A different graph visualisation is the Fruchterman-Reingold Layout, which focuses on visualisation aesthetics, meaning that edges have more or less the same length and cross each other as little as possible. This is achieved by applying forces to the vertices based on their relative position in the network (Fruchterman and Reingold, 1991): vertices joined by an edge attract each other like springs, in the spirit of Hooke’s law, while all pairs of vertices repel each other. This graph layout therefore helps when analysing the importance of specific edges in a network. Chan et al. (2003) use the Fruchterman-Reingold Layout, rather than other layouts, to visualise the structure of Border Gateway Protocol routing networks, while proposing a new layout. Moreover, they consider the Fruchterman-Reingold Layout particularly useful to capture and visualise the presence of power-law degree distributions in networks. Figure 2-6 below depicts the same example network as above using a Fruchterman-Reingold Layout.

Figure 2-6: Fruchterman-Reingold Layout graph visualisation with 200 vertices and 1333 edges, elaborated using Gephi (2016).
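The force-directed principle behind this layout can be sketched in plain Python. The snippet uses the attractive force d²/k along edges and the repulsive force k²/d between all vertex pairs described by Fruchterman and Reingold (1991), where d is the distance between two vertices and k the ideal spacing. The unit-square frame, the starting temperature, the linear cooling schedule and the function name are assumptions for illustration, and the sketch is not the Gephi (2016) implementation used for Figure 2-6.

```python
import math
import random

def fruchterman_reingold(n, edges, iterations=50, seed=0):
    """Return a list of [x, y] positions inside the unit square."""
    rng = random.Random(seed)
    pos = [[rng.random(), rng.random()] for _ in range(n)]
    k = math.sqrt(1.0 / n)   # ideal vertex spacing for a frame of area 1
    temperature = 0.1        # largest move allowed in one iteration (assumed)
    cooling = temperature / (iterations + 1)

    def delta(u, v):
        dx, dy = pos[u][0] - pos[v][0], pos[u][1] - pos[v][1]
        return dx, dy, max(math.hypot(dx, dy), 1e-9)

    for _ in range(iterations):
        disp = [[0.0, 0.0] for _ in range(n)]
        # Repulsion between every pair of vertices: magnitude k^2 / d.
        for u in range(n):
            for v in range(u + 1, n):
                dx, dy, d = delta(u, v)
                f = k * k / d
                disp[u][0] += dx / d * f
                disp[u][1] += dy / d * f
                disp[v][0] -= dx / d * f
                disp[v][1] -= dy / d * f
        # Attraction along every edge: magnitude d^2 / k.
        for u, v in edges:
            dx, dy, d = delta(u, v)
            f = d * d / k
            disp[u][0] -= dx / d * f
            disp[u][1] -= dy / d * f
            disp[v][0] += dx / d * f
            disp[v][1] += dy / d * f
        # Move each vertex, capped by the temperature, and keep it in the frame.
        for v in range(n):
            length = max(math.hypot(disp[v][0], disp[v][1]), 1e-9)
            step = min(length, temperature)
            for axis in (0, 1):
                moved = pos[v][axis] + disp[v][axis] / length * step
                pos[v][axis] = min(1.0, max(0.0, moved))
        temperature -= cooling
    return pos

# Small illustrative input: a ring of eight vertices.
ring = [(i, (i + 1) % 8) for i in range(8)]
layout = fruchterman_reingold(8, ring, iterations=50, seed=1)
```

The pairwise repulsion loop makes each iteration quadratic in the number of vertices, which is acceptable for a 200-vertex example but is one reason why faster variants are preferred for large networks.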

To increase the intuitive usage of general layouts, Jacomy et al. (2014) introduce the Force Atlas 2 layout. This layout is considered useful in supporting an intuitive spatial visualisation of networks. Compared to the Fruchterman-Reingold Layout, the Force Atlas 2 layout shows better performance and usability with strongly clustered networks. This is important since performance ultimately adds to the readability of the graph visualisation. Moreover, the Force Atlas 2 layout can prevent vertices from overlapping, which is particularly interesting when trying to identify vertex clusters or white spaces of unconnected vertices in the network structure. In terms of its application, Hasani and Mehdipour (2015) use the Force Atlas 2 layout for visualising traffic in an Internet Protocol (IP) address network. Figure 2-7 below illustrates the same random example network as above (200 vertices and 1,333 edges) using a Force Atlas 2 Layout.

Figure 2-7: Force Atlas 2 Layout graph visualisation with 200 vertices and 1333 edges, elaborated using Gephi (2016).

More closely related to the analysis of Autonomous System networks, Alvarez-Hamelin et al. (2005b) introduce and use a k-core decomposition for the analysis of the World Wide Web and the Internet. Carmi et al. (2005) and Alvarez-Hamelin et al. (2008) use the k-core decomposition in communication networks such as the Internet at Autonomous System granularity. The k-core decomposition separates the network vertices into so-called k-cores (see coloured k-cores in Figure 2-8 below), or sub-graphs, based on the connection densities amongst vertices: a k-core is the largest sub-graph in which every vertex is connected to at least k other vertices of that sub-graph. This means that the most densely connected vertices are situated in the highest k-core of the network visualisation, whereas less densely connected vertices are situated increasingly in the periphery of the visualisation. Hence, the k-core decomposition indicates the hierarchically most important vertices of a given network. Figure 2-8 below depicts the same random example network as above (200 vertices and 1,333 edges) using the k-core decomposition.

Figure 2-8: k-core decomposition graph visualisation with 200 vertices and 1333 edges, elaborated using R (2016).
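The k-core decomposition can be computed by repeatedly peeling off the vertex with the lowest remaining degree. The sketch below illustrates this standard peeling idea on a small hypothetical network; it is not the R (2016) code used for Figure 2-8, and the function name `core_numbers` and the vertex labels are assumptions for illustration.

```python
from collections import defaultdict

def core_numbers(edges):
    """Return {vertex: highest k such that the vertex belongs to the k-core}."""
    neighbours = defaultdict(set)
    for u, v in edges:
        if u != v:  # ignore self-loops
            neighbours[u].add(v)
            neighbours[v].add(u)
    degree = {v: len(nbrs) for v, nbrs in neighbours.items()}
    remaining = set(degree)
    core = {}
    k = 0
    while remaining:
        # Peel the vertex with the smallest remaining degree.
        v = min(remaining, key=degree.get)
        k = max(k, degree[v])
        core[v] = k
        remaining.remove(v)
        for w in neighbours[v]:
            if w in remaining:
                degree[w] -= 1
    return core

# Hypothetical network: a fully connected group A-D, with E attached to
# A and B, and F attached only to E.
example_edges = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C"), ("B", "D"),
                 ("C", "D"), ("E", "A"), ("E", "B"), ("F", "E")]
core = core_numbers(example_edges)
# core == {"A": 3, "B": 3, "C": 3, "D": 3, "E": 2, "F": 1}
```

In the example, the fully connected group forms the innermost 3-core, while E and F fall into progressively more peripheral cores, mirroring the centre-to-periphery ordering of the visualisation. The linear scan for the minimum keeps the sketch short; bucket-based variants are faster on large networks.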

Insight 11: Given its distinct applicability to studying the structure of the Internet (see Alvarez-Hamelin et al., 2005b), we consider the k-core decomposition and its resulting graph visualisation the best algorithm to discover influential Autonomous System vertices in our networks. Given the economic nature of these networks, we expect the most densely connected Autonomous Systems to be Tier-1 Internet Service Providers. Moreover, we expect that other graph layout visualisations, such as Force Atlas 2, provide valuable exploratory insights on structural features. These indications will be useful to explain and compare the three mobile broadband operator graph visualisations and their general structural features. Our work is the first to apply such a broad spectrum of graph visualisations to the exploratory analysis of active Internet periphery data.

In document Mobile Internet connectivity, exploring structural bottlenecks in Tamil Nadu using active Internet periphery measurements (Page 83-90)