Model Analysis - The Experimental Model and Analysis

3.2 The Experimental Model and Analysis

3.2.3 Model Analysis

The network tomography simplified linear model is given by the equation

$Y_t = A X_t$ (3.1)

To use this model, two of the three quantities must be known in order to find the third. In the formula, $Y_t$ represents the end-to-end delay, $A$ is the network topology and $X_t$ is the hidden internal delay. The end-to-end delay of a path is the time it takes a packet to travel from one end of the path to the other. It consists of transmission delay, processing delay, queue delay and propagation delay. Transmission delay is the time taken to place all the bits of a packet onto a communication link. Processing delay is the time between when a packet is received by a node and when the packet is transmitted. Propagation delay is the time the signal takes to cross the link, which is determined by the properties of the transmission medium. Queue delay is the time difference between when a packet enters a queue and when it leaves the queue. We divide the end-to-end delay into two components, propagation delay and servicing delay. Servicing delay consists of processing delay, transmission delay and queue delay.
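As a small illustration of equation (3.1), the sketch below computes end-to-end delays from link delays on a two-leaf tree. The topology, the delay values and the names `A`, `X_t`, `Y_t` and `end_to_end_delays` are assumptions chosen for the example and are not taken from the experiment.

```python
# Hypothetical two-leaf tree: link 0 is shared, links 1 and 2 lead to the leaves.
# Each row of A is one path; a 1 marks a link that the path uses.
A = [
    [1, 1, 0],  # path to leaf 1 uses links 0 and 1
    [1, 0, 1],  # path to leaf 2 uses links 0 and 2
]
X_t = [2.0, 3.0, 5.0]  # illustrative internal link delays in ms (assumed values)


def end_to_end_delays(topology, link_delays):
    """Return Y_t = A X_t: each path delay is the sum of the delays of its links."""
    return [sum(a * x for a, x in zip(row, link_delays)) for row in topology]


Y_t = end_to_end_delays(A, X_t)
print(Y_t)  # [5.0, 7.0]
```

In tomography the direction is reversed: `Y_t` and `A` are known, and `X_t` has to be inferred.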

The end-to-end delay is needed for the estimation of the internal link delay distribution. There are two main approaches to finding the values of $Y_t$: multicast and unicast. The multicast approach is known to be very successful [9]. In both cases, however, probing is required to achieve results. The new approach here aims to achieve results similar to those of the multicast approach without the need for any form of probing.

We reiterate that network tomography fails when probing packets travel at different rates, since passive end-to-end delay measurement does not take into consideration the randomness associated with the movement of a packet along a path.

We design an experiment that aims at solving this problem. There are four main assumptions in this methodology:

• packets generated by users from one end of a path are Poisson distributed

• enough packets will traverse the path during the short period of measurement

• the network topology is a tree structure

• the minimum norm least squares solution is the correct estimate of the internal link delay

When the link delays along a path are statistically independent, the end-to-end delay densities are related to the link delay densities through a convolution, or combination, of the individual link delay densities. Convolution methods are used to estimate the internal delay densities from the end-to-end delay. The idea of convolution can be compared to Fourier transformation, where signals can be combined to pass through a communication medium and split at the end of the communication channel. Some of the convolution methods are minimum norm least squares, transformation of the convolution into a more tractable matrix operator via discretization of the delays, the expectation maximization (EM) algorithm, the iterative proportional fitting algorithm [28], and estimation of the link delay cumulant generating function (CGF) from the end-to-end delay CGF [29].
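The convolution relation can be made concrete with discretized delays. The following is a minimal sketch, assuming two independent links whose delays have been binned into equal-width bins. The probability values and the names `convolve`, `link1`, `link2` and `path` are illustrative assumptions.

```python
def convolve(p, q):
    """Discrete convolution of two probability mass functions over delay bins."""
    out = [0.0] * (len(p) + len(q) - 1)
    for i, p_i in enumerate(p):
        for j, q_j in enumerate(q):
            # A delay of i bins on one link and j bins on the other gives i + j bins.
            out[i + j] += p_i * q_j
    return out


# Assumed link delay distributions: index k is the probability of a delay of k bins.
link1 = [0.5, 0.5]
link2 = [0.25, 0.5, 0.25]

# Distribution of the path delay when the two link delays are independent.
path = convolve(link1, link2)
print(path)  # [0.125, 0.375, 0.375, 0.125]
```

Inference runs the other way: the path distribution is observed and the link distributions must be recovered, which is why the estimation methods listed above are needed.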

The assumption of independent link delays does not hold strictly in a real network because of the temporal correlations in network traffic. The use of disjoint passive packets could minimize the dependencies in $X$ and hence make the model more realistic. The assumption of temporal independence is feasible as long as the interval between probes is large enough [30]. This highlights one of the merits of our proposed approach.


The linear equation $Y = AX$ is "under-determined" since $i < j$ for the $i \times j$ matrix $A$; that is, there are fewer equations than unknowns. An under-determined linear system has either no solution or infinitely many solutions, so an algorithm is required to choose the best-fit solution. Among the infinitely many values of $X$, the least squares approach adopts the minimum norm solution as the best fit. The minimum norm solution is the unique solution $x$ that minimizes $\|x\|_2$.
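The minimum norm solution can be sketched with the Moore-Penrose pseudoinverse. The example below assumes a hypothetical two-path, three-link tree and uses NumPy's `numpy.linalg.pinv`. The matrix and delay values are illustrative only, and this is one possible way to compute the solution rather than the procedure used in the experiment.

```python
import numpy as np

# Assumed topology: 2 paths (rows) over 3 links (columns), so the system is under-determined.
A = np.array([[1.0, 1.0, 0.0],
              [1.0, 0.0, 1.0]])
y = np.array([5.0, 7.0])  # illustrative end-to-end delays in ms

# Minimum norm least squares solution: x = A^+ y, with A^+ the pseudoinverse of A.
x_min = np.linalg.pinv(A) @ y

# Another vector that also satisfies A x = y, but with a larger Euclidean norm.
x_other = np.array([2.0, 3.0, 5.0])

print(x_min)                                            # approximately [4. 1. 3.]
print(np.linalg.norm(x_min), np.linalg.norm(x_other))   # about 5.10 versus about 6.16
```

Both vectors reproduce the same end-to-end delays, which shows why the fourth assumption matters: the minimum norm solution is only one of infinitely many candidates, and it is taken to be the correct estimate by assumption.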

The internal delay distributions can be estimated directly or indirectly. The indirect approach estimates the internal distributions without directly solving the linear equation for all the values of $X$. The direct approach solves the linear equation in order to obtain the distribution of $X$.

The accuracy of inference depends on which solution method is used to estimate the parameters or maximize the distributions of $X$. As mentioned before, there are several algorithms that can be used to estimate $X$, including the maximum likelihood estimator, least squares and pseudo-likelihood.

The maximum likelihood estimator (MLE) is a better parameter estimator than the least squares estimator [25]. Usually, the MLE is consistent with the true parameter value that generated the data. The MLE is expensive because it requires the computation of all the values of $X$, as in the direct approach. The goal here is not to find out which estimator achieves the best results, but to determine how the passive network tomography approach performs compared with approaches based on multicast and unicast correlation properties.

We will use the MLE to estimate the internal delay distribution. This means we will estimate the values of X to obtain the internal link delay distribution. MLE will be computed directly from the data without any algorithm.
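One way to read "computed directly from the data" is sketched below. It assumes the estimated link delays are binned into equal-width bins, in which case the maximum likelihood estimate of each bin probability is simply its relative frequency. The binning, the sample values and the name `mle_pmf` are assumptions made for the example and are not taken from the experiment.

```python
from collections import Counter


def mle_pmf(samples, bin_width):
    """Relative frequency of each delay bin: the multinomial MLE of the bin probabilities."""
    counts = Counter(int(s // bin_width) for s in samples)
    n = len(samples)
    return {b: c / n for b, c in sorted(counts.items())}


# Illustrative estimated delays of one link in ms (assumed values).
delays_ms = [0.4, 1.2, 1.7, 0.9, 2.3, 1.1, 0.2, 1.9]

pmf = mle_pmf(delays_ms, bin_width=1.0)
print(pmf)  # {0: 0.375, 1: 0.5, 2: 0.125}
```

No iterative optimization is needed in this setting, because the estimate has a closed form in terms of the counts.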

In document Link Delay Inference in ANA Network (Page 38-40)