3.5 Methods using probe packets

3.5.5 Implementations

In this section we provide a brief overview of topology discovery implementations based on probe packets that have been proposed over the years.

Early network topology discovery systems combined SNMP queries with ping and traceroute measurements and were mostly oriented towards intra-domain topology discovery [120, 75]. The first large-scale effort made to map and visualise the node-level topology of the Internet was started by Cheswick et al. in 1998 [16]. The authors used hop-limited probes to obtain node-level paths from a central location to each of a list of prefixes obtained from the RADB routing registry. They then used the paths obtained to induce a topology. The probes were sent to a randomly determined IP address in every prefix; since the objective was to determine paths and not count hosts, whether the target IP address responded or not was irrelevant. The topology was then visualised using graph-drawing techniques, with the colours of the nodes initially being obtained from their IP addresses in a very simple way (the first three octets of the IP address determined the red, green, and blue colour components respectively). The authors found that such a simple algorithm was already sufficient to make networks and ISPs recognisable on the resulting graphs. The project is still ongoing, as one of its goals is to collect topology data over time. This data has been used for various purposes; for example, the authors used it to evaluate the impact on Internet connectivity of major events such as the bombing of Yugoslavia in the spring of 1999, showing that the resulting damage is reflected in their Internet topologies.
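The colouring rule described above is simple enough to sketch in a few lines. The following is only an illustration of the octet-to-colour mapping; the function names `address_to_rgb` and `rgb_to_hex` are hypothetical and are not taken from [16].

```python
import ipaddress


def address_to_rgb(address):
    """Map an IPv4 address to an (R, G, B) triple.

    The first three octets give the red, green and blue components;
    the fourth octet is ignored.
    """
    octets = ipaddress.IPv4Address(address).packed  # four raw bytes
    return octets[0], octets[1], octets[2]


def rgb_to_hex(rgb):
    """Format an (R, G, B) triple as a '#rrggbb' string for a drawing tool."""
    return "#{:02x}{:02x}{:02x}".format(*rgb)


if __name__ == "__main__":
    # Two addresses sharing their first three octets get the same colour.
    print(rgb_to_hex(address_to_rgb("195.158.236.122")))
    print(rgb_to_hex(address_to_rgb("195.158.236.218")))
```

Because addresses sharing their first three octets receive identical colours, nodes of one network tend to form visually coherent regions, which is consistent with the authors' observation that networks and ISPs become recognisable.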

An obvious limitation of this method is the fact that it only collects the outgoing paths from one location and thus does not discover cross-links. Subsequent work addressed this limitation by using source routing and, later, by using multiple probing locations. Furthermore, this method does not attempt to resolve aliases; thus, the maps generated do not correspond to actual network topologies.
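The cross-link limitation can be illustrated with a small sketch that induces a topology from a set of measured paths. The five-node network, the node names, and the function `induce_topology` are hypothetical and serve only as an example; nodes stand for interface addresses, with no alias resolution applied.

```python
def induce_topology(paths):
    """Return the set of undirected links implied by a collection of paths.

    Each path is a sequence of hops as reported by hop-limited probes;
    every pair of consecutive hops is taken to be a link.
    """
    links = set()
    for path in paths:
        for a, b in zip(path, path[1:]):
            links.add(frozenset((a, b)))
    return links


# Hypothetical network with links S-A, S-B, A-B, A-X and B-Y.
# Paths measured from the single source S never traverse the link A-B.
paths_from_s = [["S", "A", "X"], ["S", "B", "Y"]]

# A second vantage point at X measures a path that crosses A-B.
paths_from_x = [["X", "A", "B", "Y"]]

if __name__ == "__main__":
    single = induce_topology(paths_from_s)
    combined = induce_topology(paths_from_s + paths_from_x)
    print(frozenset(("A", "B")) in single)    # cross-link not seen from S
    print(frozenset(("A", "B")) in combined)  # cross-link seen with X added
```

In this example, the map built from S alone is a tree rooted at S; the link between A and B appears only once paths from a second location are added.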

Mercator

Govindan and Tangmunarunkit address both these problems in [53]. They introduce a topology discovery system called Mercator, which uses hop-limited probe packets to determine paths and infers the topology from those paths. Mercator also explores the network from a single location, but it makes use of source routing to increase the number of links found. It verifies whether a router allows source-routed packets by trying to use it as an intermediate hop to probe addresses from which Mercator has previously received replies; if these probes are answered, Mercator assumes that the router allows source routing. Each source-routing-capable router thus discovered provides the capacity to probe the network from another location. The authors report that source routing is more useful than they expected: about 8% of routers they found (i.e., nearly 10,000 routers) permitted the use of source routing.

In order to explore the IP address space without requiring any input and in a manner more efficient than exhaustive probing, Mercator uses a technique the authors name “informed random address probing”. Starting from a seed prefix, Mercator maintains a list of the prefixes seen so far, and randomly and repeatedly probes prefixes adjacent to the prefixes in the list, under the assumption that Internet address registries allocate space sequentially. Whenever it sees a probe response from an IP address in a previously unknown prefix, it adds the corresponding prefix to the list. The prefix length is determined using a crude heuristic according to the class [102] of the IP address: the natural prefix lengths (/8 and /16) for class A and B addresses respectively, and /19 for class C addresses.
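The following is a minimal sketch of informed random address probing as described above, and not Mercator's actual implementation. The function names are hypothetical, and `probe` is an assumed callable that takes a target address and returns the addresses from which replies were received; how the probes are sent is outside the scope of the sketch.

```python
import ipaddress
import random


def heuristic_prefix(address):
    """Guess the prefix containing an address from its class."""
    ip = ipaddress.IPv4Address(address)
    first_octet = ip.packed[0]
    if first_octet < 128:      # class A: natural /8
        length = 8
    elif first_octet < 192:    # class B: natural /16
        length = 16
    else:                      # class C: /19 (higher ranges are handled
        length = 19            # the same way here, for simplicity)
    return ipaddress.ip_network("{}/{}".format(ip, length), strict=False)


def adjacent_prefixes(prefix):
    """Return the prefixes of equal length just below and just above."""
    size = prefix.num_addresses
    base = int(prefix.network_address)
    neighbours = []
    for start in (base - size, base + size):
        if 0 <= start <= 2 ** 32 - size:
            neighbours.append(ipaddress.IPv4Network((start, prefix.prefixlen)))
    return neighbours


def explore(seed, probe, rounds, rng):
    """Grow the list of known prefixes by probing neighbours of known ones."""
    known = [seed]
    for _ in range(rounds):
        target_prefix = rng.choice(adjacent_prefixes(rng.choice(known)))
        offset = rng.randrange(target_prefix.num_addresses)
        target = target_prefix.network_address + offset
        for responder in probe(target):
            prefix = heuristic_prefix(responder)
            if prefix not in known:
                known.append(prefix)
    return known


if __name__ == "__main__":
    # Stand-in for real probing: pretend that every target answers itself.
    fake_probe = lambda target: [str(target)]
    seed = heuristic_prefix("195.158.236.122")
    for prefix in explore(seed, fake_probe, rounds=5, rng=random.Random(0)):
        print(prefix)
```

Passing the random number generator explicitly keeps a run reproducible, which is convenient when comparing exploration strategies.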

Although Mercator’s goal is not to collect data over time, but to provide a snapshot of the Internet at a given moment, it takes several weeks for it to capture a complete topology.

Skitter

Skitter [62] is a distributed topology discovery system developed by CAIDA [19] that performs hop-limited probes from a number of strategically placed locations around the global Internet. It uses alias resolution mechanisms similar to those used by Mercator, but the addresses probed are not randomly determined; instead, they are specified by predefined lists of IP addresses. The lists are generated by taking a list of DNS root server clients, a list of web servers, and a list of IP addresses obtained from various sources, then selecting one IP address for every prefix announced in BGP.
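The last step of the list generation, selecting one address per announced prefix, might be sketched as follows. This is an illustration under stated assumptions and not Skitter's actual procedure: the function name `select_targets` is hypothetical, each candidate is attributed to its longest matching prefix, and the first candidate seen for a prefix is kept.

```python
import ipaddress


def select_targets(candidates, bgp_prefixes):
    """Choose at most one candidate address for every announced prefix.

    Assumption: an address belongs to the longest prefix that contains it,
    and the first candidate encountered for a prefix wins.
    """
    networks = sorted(
        (ipaddress.ip_network(p) for p in bgp_prefixes),
        key=lambda net: net.prefixlen,
        reverse=True,  # try the most specific prefixes first
    )
    chosen = {}
    for candidate in candidates:
        ip = ipaddress.ip_address(candidate)
        for net in networks:
            if ip in net:
                chosen.setdefault(net, ip)
                break
    return chosen


if __name__ == "__main__":
    # Documentation address ranges stand in for real candidate lists.
    candidates = ["192.0.2.1", "192.0.2.200", "198.51.100.7", "203.0.113.9"]
    prefixes = ["192.0.2.0/24", "198.51.100.0/24"]
    for net, ip in select_targets(candidates, prefixes).items():
        print(net, ip)
```

The linear scan over prefixes is adequate for an example; a full BGP table would call for a longest-prefix-match structure such as a trie.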

Skitter is an ongoing project: every Skitter probe periodically cycles through its list of IP addresses and then starts again. Results are uploaded to a central location and made available to academic researchers and CAIDA sponsors. Skitter data has been used since 1998 for various purposes including visualising interdomain topology, determining which ISPs provide transit service in the Pacific Rim region, and evaluating the quality of data provided by traceroutes [66]; for an overview, see [62]. Since Skitter probes are co-located with some of the DNS root servers, it has also been used to study the quality of placement of the root servers themselves [81].

Rocketfuel

Rocketfuel [123] is a tool for measuring router-level topologies of ISPs. It aims to capture complete topologies by using as few measurements as possible and without relying on confidential information. Rocketfuel is based on distributed hop-limited probes, but uses routing information to reduce the number of path measurements that must be performed. By using BGP routing information to perform only traceroutes that are likely to transit the target ISP and suppressing traceroutes that are likely to follow redundant paths through the ISP’s network, it is able to reduce the number of measurements required by three orders of magnitude compared to a brute-force, all-to-all approach. Rocketfuel also pioneered the IP identifier method for alias resolution (see Section 3.5.2). Finally, it applies empirical data and heuristics to DNS names in order to divide an ISP’s topology into access components (PoPs, complete with their geographical location), and backbone.
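The idea of selecting measurements from BGP information can be sketched as follows. This is a simplified illustration and not Rocketfuel's algorithm: the function `directed_probes` and the input format (a mapping from a vantage point and destination prefix to an AS path) are hypothetical, and the redundancy criterion used here, namely that traces entering the target from the same neighbouring AS and leaving towards the same neighbouring AS are treated as redundant, is an assumption chosen for brevity.

```python
def directed_probes(as_paths, target_asn):
    """Select (vantage point, prefix) pairs worth tracing for one ISP.

    as_paths maps (vantage, prefix) to the AS path expected for that pair.
    Pairs whose path avoids the target AS are skipped; of the remaining
    pairs, only the first one per (entry AS, exit AS) combination is kept.
    """
    selected = []
    seen = set()
    for (vantage, prefix), path in sorted(as_paths.items()):
        if target_asn not in path:
            continue  # unlikely to transit the target ISP
        first = path.index(target_asn)
        last = len(path) - 1 - path[::-1].index(target_asn)
        entry = path[first - 1] if first > 0 else None
        exit_ = path[last + 1] if last + 1 < len(path) else None
        if (entry, exit_) in seen:
            continue  # assumed to follow a redundant path through the ISP
        seen.add((entry, exit_))
        selected.append((vantage, prefix))
    return selected


if __name__ == "__main__":
    # AS 1755 is the target; the other AS numbers and all prefixes are
    # taken from ranges reserved for documentation.
    as_paths = {
        ("vp1", "192.0.2.0/24"): [64500, 1755, 64510],
        ("vp1", "198.51.100.0/24"): [64500, 1755, 64510],
        ("vp1", "203.0.113.0/24"): [64500, 64502],
        ("vp2", "192.0.2.0/24"): [64501, 1755, 64510],
    }
    print(directed_probes(as_paths, 1755))
```

Of the four candidate measurements in the example, one is dropped because its AS path does not contain the target, and one because it shares its entry and exit neighbours with a measurement that has already been selected.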

Figure 3.6: Example topologies obtained by Rocketfuel in 2002: (a) the European backbone of AS 3257 (Tiscali International); (b) the Frankfurt PoP of AS 1755 (Ebone).

each ISP to evaluate the topologies generated for its network, scanning the ISPs’ address space to verify whether any routers had been overlooked, and comparing the data found to ORV table dumps and Skitter data; they report very good results. The maps and the raw data were made available to the research community at [113].

For an example of the topologies obtained by Rocketfuel, see Fig. 3.6. Fig. 3.6(a) shows the European backbone topology of AS 3257 drawn by CAIDA’s GeoPlot [99] program; Fig. 3.6(b) shows the PoP-level map of the Frankfurt, Germany PoP of AS 1755 (Ebone; since dissolved) drawn by the Graphviz [54] software. Note that Rocketfuel only obtains the topology; it does not perform any visualisation or geographical placement itself.

3.5.6 Evaluation

Obtaining the network topology by using probe packets has many advantages. Firstly, it provides up-to-date, reliable information. Although the topologies it discovers are not complete, this is also true of other methods such as those using BGP tables; furthermore, the topologies may be made more complete by using distributed probing or source routing. Secondly, it is the only method for discovering node-level topologies in the Internet at large.

Its main disadvantage is long probing times. This is not only a performance concern: probing times are sufficiently long that network changes during probing can significantly affect the results. Furthermore, distributed probing requires many measurement points throughout the network, and the marginal utility of each additional probe is low. Distributed measurement also requires either the deployment of a measurement infrastructure or interaction with public traceroute servers, which is inconvenient. On the other hand, for research purposes it is also