Flow Specifications - How To Analyse Internet Traffic With A Java Applet

2.3 Flow Specifications

2.3.1 What are Flow Specifications?

In the previous section, we described the basic idea behind a traffic flow, which is to aggregate traffic that matches certain criteria. What we have not yet discussed is how those criteria can be defined.

The most important advantage of our very general definition is that the communicating entities are not specified any further. Until now, we have only assumed that they are two machines on the network that communicate with each other. However, the entities do not have to correspond to machines (i.e. network addresses); they can just as well be whole subnets of machines, classes of applications or autonomous systems. In fact, anything that can be used to distinguish data packets is a potential criterion. In the following sections we will further specify which criteria can be used for the so-called “flow specification”.

A first proposal for a flow specification was given by Partridge in RFC 1363 [29]. This proposal, however, is mainly motivated by resource reservation. Therefore, Partridge's flow specification focuses on Quality of Service (QoS) parameters, which are to be used for reserving bandwidth for certain kinds of multimedia traffic. The flow specification he proposes is to be used by the network for admission control and resource allocation purposes.

For network measurement and analysis, our main interest, we need a different approach. Our aim is to aggregate as much information as possible about the large amount of data that is transferred, while using as little memory as possible for this aggregation. Since we usually do not know in advance what we are searching for and what we want to measure, we need a flexible way to define what kinds of traffic we want to aggregate.

2.3.2 Flow Directionality

First, one can define a flow as unidirectional or bidirectional. TCP traffic is always connection-oriented and therefore always bidirectional, yet it often exhibits strong asymmetries between the traffic profiles of the two directions. Each TCP flow from A to B also generates a reverse flow from B to A, at the very least for small acknowledgment packets.

For data aggregation, we may or may not be interested in measuring those two flows separately. Measurement and analysis applications should therefore ideally be configurable with regard to this parameter.

In the Internet environment it is possible to use a unidirectional definition of flows, i.e., bidirectional traffic between A and B is seen as two separate flows: traffic from A to B, and traffic from B to A. This gives interesting insights for the analysis of routing issues or traffic characteristics. Aggregating those two flows into one bidirectional flow could, on the other hand, be sufficient for accounting. It therefore makes sense to allow the unidirectional definition, since unidirectional flows can always be merged into a bidirectional flow later.
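The choice between the two definitions can be sketched as a single switch in a flow table. The following is a minimal illustration and not part of the measurement tool described here: the class `DirectionalFlowTable`, its methods and the endpoint names "A" and "B" are hypothetical, and it assumes Java 16 or later for `record`. In bidirectional mode the two endpoints are put into a canonical order, so both directions end up under the same key.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical flow table that counts bytes per flow, per direction or merged. */
final class DirectionalFlowTable {
    /** A flow key is a pair of endpoint names. */
    record FlowKey(String first, String second) {}

    private final boolean bidirectional;
    private final Map<FlowKey, Long> byteCounts = new HashMap<>();

    DirectionalFlowTable(boolean bidirectional) {
        this.bidirectional = bidirectional;
    }

    /** In bidirectional mode the endpoints are sorted, so A->B and B->A share one key. */
    FlowKey keyFor(String source, String destination) {
        if (bidirectional && source.compareTo(destination) > 0) {
            return new FlowKey(destination, source);
        }
        return new FlowKey(source, destination);
    }

    /** Adds the size of one observed packet to the matching flow. */
    void add(String source, String destination, long bytes) {
        byteCounts.merge(keyFor(source, destination), bytes, Long::sum);
    }

    long bytes(String source, String destination) {
        return byteCounts.getOrDefault(keyFor(source, destination), 0L);
    }

    int flowCount() {
        return byteCounts.size();
    }

    public static void main(String[] args) {
        DirectionalFlowTable uni = new DirectionalFlowTable(false);
        DirectionalFlowTable bi = new DirectionalFlowTable(true);
        for (DirectionalFlowTable table : new DirectionalFlowTable[] {uni, bi}) {
            table.add("A", "B", 1500); // data segment from A to B
            table.add("B", "A", 40);   // small acknowledgment from B to A
        }
        System.out.println("unidirectional flows: " + uni.flowCount());
        System.out.println("bidirectional flows:  " + bi.flowCount());
    }
}
```

Keeping the unidirectional counters loses nothing: summing the two directions afterwards yields the same value as the bidirectional table.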

2.3.3 One vs. Two Endpoint Aggregations of Traffic

The second aspect of a flow is related to its endpoints. As mentioned, the model allows us not only to examine the data exchange between two entities on the network; it is also possible to aggregate all traffic that originates from a specified entity or that is addressed to a specified entity. Such flow specifications are called “single endpoint flows”, in contrast to “double endpoint flows”, where both source and destination addresses are specified.

An example where a single endpoint flow is interesting is the aggregation of all data transferred from a given network number. Those measurements could be compared to the traffic aggregated between this network number and a given second network number to calculate the percentage of traffic from the given network that goes to the other network.
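The comparison described above can be sketched as follows. This is only an illustration under stated assumptions: the class `EndpointShare`, the `Packet` record, the network labels and the byte counts are all hypothetical, and Java 16 or later is assumed. The single endpoint flow sums everything sent from one network, the double endpoint flow sums only what goes to one particular second network, and the share is the ratio of the two.

```java
import java.util.List;

/** Hypothetical comparison of a single endpoint flow with a double endpoint flow. */
final class EndpointShare {
    /** One observed packet, reduced to its source network, destination network and size. */
    record Packet(String srcNet, String dstNet, long bytes) {}

    /** Single endpoint flow: all bytes that originate from srcNet. */
    static long singleEndpointBytes(List<Packet> packets, String srcNet) {
        return packets.stream()
                .filter(p -> p.srcNet().equals(srcNet))
                .mapToLong(Packet::bytes)
                .sum();
    }

    /** Double endpoint flow: only the bytes sent from srcNet to dstNet. */
    static long doubleEndpointBytes(List<Packet> packets, String srcNet, String dstNet) {
        return packets.stream()
                .filter(p -> p.srcNet().equals(srcNet) && p.dstNet().equals(dstNet))
                .mapToLong(Packet::bytes)
                .sum();
    }

    /** Share of part in total as a percentage; 0 if nothing was observed. */
    static double sharePercent(long part, long total) {
        return total == 0 ? 0.0 : 100.0 * part / total;
    }

    public static void main(String[] args) {
        // Made-up sample data, for illustration only.
        List<Packet> packets = List.of(
                new Packet("net-1", "net-2", 250),
                new Packet("net-1", "net-3", 750),
                new Packet("net-4", "net-2", 500));
        long total = singleEndpointBytes(packets, "net-1");
        long part = doubleEndpointBytes(packets, "net-1", "net-2");
        System.out.println("share of net-1 traffic going to net-2: "
                + sharePercent(part, total) + " %");
    }
}
```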

2.3.4 Types of Flow Endpoints

An aspect already mentioned above, and certainly the most important criterion for flow specifications, is the flow endpoints. The endpoint specification has to describe the communicating entities. Potential granularities for this description include traffic by

Hosts, identified by
  – Network layer address (e.g. IP address)
  – Link layer address (e.g. Ethernet address)
  – Symbolic hostname

Networks, identified by
  – Network number
  – Domain name

Arbitrary groups of hosts

Traffic sharing a common path on the network, identified by
  – Interface number on a backbone node
  – ATM connection identifiers (VPI/VCI)

Various additional granularities could be defined, depending on the type of the local network installation and the demands of the user. The only common criterion that all granularities have to fulfill is that it must be possible to check, for each received packet, whether it matches the criteria for the flow or not.
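The requirement that every granularity must be checkable per packet suggests modelling an endpoint specification as a predicate over packets. The sketch below is a hypothetical illustration, not the design used in this work: the class `EndpointMatchers`, the `Packet` record and the addresses are assumptions, IPv4 addresses are stored as 32-bit integers, and Java 16 or later is assumed. It shows a host granularity and a network granularity behind the same interface.

```java
import java.util.function.Predicate;

/** Hypothetical endpoint specifications expressed as per-packet predicates. */
final class EndpointMatchers {
    /** One observed packet with IPv4 source and destination addresses as 32-bit values. */
    record Packet(int srcAddr, int dstAddr, int bytes) {}

    /** Packs four octets into one 32-bit IPv4 address. */
    static int ipv4(int a, int b, int c, int d) {
        return (a << 24) | (b << 16) | (c << 8) | d;
    }

    /** Host granularity: the source address must match exactly. */
    static Predicate<Packet> fromHost(int addr) {
        return p -> p.srcAddr() == addr;
    }

    /** Network granularity: the first prefixLength bits of the source address must match. */
    static Predicate<Packet> fromNetwork(int network, int prefixLength) {
        // A shift by 32 is a no-op for int in Java, so prefix length 0 is handled separately.
        int mask = prefixLength == 0 ? 0 : -1 << (32 - prefixLength);
        return p -> (p.srcAddr() & mask) == (network & mask);
    }

    public static void main(String[] args) {
        Packet packet = new Packet(ipv4(192, 168, 1, 7), ipv4(10, 0, 0, 1), 100);
        Predicate<Packet> host = fromHost(ipv4(192, 168, 1, 7));
        Predicate<Packet> network = fromNetwork(ipv4(192, 168, 1, 0), 24);
        System.out.println("matches host spec:    " + host.test(packet));
        System.out.println("matches network spec: " + network.test(packet));
    }
}
```

Further granularities, such as link layer addresses or interface numbers, would fit the same shape as long as the packet record carries the corresponding field.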

[Figure 2.3 diagram: Station A, Router, Flow Meter, Bridge and Station B shown against the layers Application, Transport (TCP, UDP, ...), Internetwork (IP) and Network Technology (Ethernet, ...)]

Figure 2.3: Flow–Measurement in the layered model of the Internet

Figure 2.3 illustrates where the flow endpoints could be positioned in the layered communication model of the Internet. If flows are to be specified with a granularity that reaches the application layer, the measuring entity must of course also know the format of the data of this layer. In the TCP/IP model, for example, defining flows based on the transport layer requires analyzing the port fields of the TCP header.
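As an illustration of what analyzing the port fields involves, the following sketch reads both ports from the start of a raw TCP header. The source port occupies the first two bytes and the destination port the next two, both in network byte order. The class `TcpPorts`, its method names and the sample bytes are hypothetical; a real meter would first have to locate the TCP header behind the link layer and IP headers, which is not shown here.

```java
/** Hypothetical helper that extracts the port fields from a raw TCP header. */
final class TcpPorts {
    /** Source port: bytes 0 and 1 of the TCP header. */
    static int sourcePort(byte[] tcpHeader) {
        return unsigned16(tcpHeader, 0);
    }

    /** Destination port: bytes 2 and 3 of the TCP header. */
    static int destinationPort(byte[] tcpHeader) {
        return unsigned16(tcpHeader, 2);
    }

    /** Reads an unsigned 16-bit big-endian value starting at the given offset. */
    private static int unsigned16(byte[] data, int offset) {
        if (data.length < offset + 2) {
            throw new IllegalArgumentException("header too short");
        }
        // Mask with 0xFF because Java bytes are signed.
        return ((data[offset] & 0xFF) << 8) | (data[offset + 1] & 0xFF);
    }

    public static void main(String[] args) {
        // First four bytes of a made-up TCP header: source port 0xC001, destination port 0x0050.
        byte[] header = {(byte) 0xC0, 0x01, 0x00, 0x50};
        System.out.println("source port:      " + sourcePort(header));
        System.out.println("destination port: " + destinationPort(header));
    }
}
```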

The granularities do not necessarily have an inherent order, as a single user or application might straddle several hosts or even several network numbers. In general, flow criteria do not have to be restricted to a single network layer. It is also possible to specify a flow using a combination of criteria on several layers; for example, one could aggregate all traffic that is generated by a specific application on a specific machine.
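Combining criteria from several layers can be sketched by composing per-packet predicates. This is a hypothetical illustration only: the class `CombinedFlowSpec`, the `Packet` record, the host names and the use of the source port as a stand-in for "the application" are all assumptions, and Java 16 or later is assumed. The standard `Predicate.and` method joins a network layer criterion with a transport layer criterion into one flow specification.

```java
import java.util.List;
import java.util.function.Predicate;

/** Hypothetical flow specification that combines criteria from different layers. */
final class CombinedFlowSpec {
    /** One observed packet, reduced to source host, source port and size. */
    record Packet(String srcHost, int srcPort, long bytes) {}

    /** Network layer criterion: traffic sent by one machine. */
    static Predicate<Packet> fromHost(String host) {
        return p -> p.srcHost().equals(host);
    }

    /** Transport layer criterion, used here as an approximation of "one application". */
    static Predicate<Packet> fromPort(int port) {
        return p -> p.srcPort() == port;
    }

    /** Aggregates the bytes of all packets that match the flow specification. */
    static long totalBytes(List<Packet> packets, Predicate<Packet> spec) {
        return packets.stream().filter(spec).mapToLong(Packet::bytes).sum();
    }

    public static void main(String[] args) {
        // Made-up sample data, for illustration only.
        List<Packet> packets = List.of(
                new Packet("alpha", 80, 1200),
                new Packet("alpha", 25, 300),
                new Packet("beta", 80, 900));
        Predicate<Packet> spec = fromHost("alpha").and(fromPort(80));
        System.out.println("bytes in combined flow: " + totalBytes(packets, spec));
    }
}
```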

The possibility of defining a flow in such a variable way is the great advantage and strength of this model. When developing measurement applications, it often turns out that one does not know exactly in advance what should be measured. The configurability of this model reflects this and by keeping as general and abstract as
