7.5.3 Test of the Spatial Markov Assumption

To further examine the goodness of the spatial Markov assumption, we use the relative entropy between two distributions p(x, t) and q(x, t), as defined in [20]:

$$D(p \| q) = \sum_{x} p(x, t) \log \frac{p(x, t)}{q(x, t)}. \tag{146}$$

The relative entropy is a measure of the distance between two distributions p(x, t) and q(x, t). If q(x, t) is "closer" to p(x, t), $D(p \| q)$ is smaller; and $D(p \| q) = 0$ if and only if p = q.
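As a small illustration of Equation (146), the following sketch computes the relative entropy of two discrete distributions at a fixed time t. The function name `relative_entropy`, the dictionary representation, and the toy distributions are assumptions made for this example; the natural logarithm is used because the text does not state a base.

```python
import math

def relative_entropy(p, q):
    """Return D(p||q) = sum_x p(x) * log(p(x) / q(x)).

    p and q map each outcome x to its probability at a fixed time t.
    The natural logarithm is an assumption of this sketch.
    """
    total = 0.0
    for x, px in p.items():
        if px == 0.0:
            continue  # the term 0 * log(0 / q) is taken as 0 by convention
        qx = q.get(x, 0.0)
        if qx == 0.0:
            return math.inf  # p puts mass on an outcome that q excludes
        total += px * math.log(px / qx)
    return total

# Toy distributions (hypothetical values, not taken from the simulations).
p = {"00": 0.5, "01": 0.25, "10": 0.25}
q_near = {"00": 0.4, "01": 0.3, "10": 0.3}
q_far = {"00": 0.1, "01": 0.1, "10": 0.8}

d_same = relative_entropy(p, p)       # zero, since the distributions coincide
d_near = relative_entropy(p, q_near)  # small, since q_near is close to p
d_far = relative_entropy(p, q_far)    # larger, since q_far is far from p
```

In this toy case the three values are ordered as the text describes: the closer q is to p, the smaller the relative entropy.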

For our case, $p(x, t) = P(X_{N_i}(t) = x_{N_i}(t) \mid X_i(t) = 0)$ is the joint distribution of the statuses of node i's neighbors given that node i is susceptible at time t. For the independent model, $q_1(x, t) = \prod_{j \in N_i} P(X_j(t) = x_j(t))$; for the Markov model, $q_2(x, t) = \prod_{j \in N_i} P(X_j(t) = x_j(t) \mid X_i(t) = 0)$. We obtain the relative entropies $D(p \| q_1)$ and $D(p \| q_2)$ through simulation on a four-neighbor two-dimensional lattice with 10,000 nodes, β = 0.1, δ = 0.1, and 1,000 individual runs. As described in Section 7.4.3, each node is represented by its coordinate, and the worm begins to spread from node (0,0). Node i is chosen to be node (1,1). Figure 33 shows how the relative entropies $D(p \| q_1)$ and $D(p \| q_2)$ change with time. The relative entropies are initially close to 0 but increase with time. $D(p \| q_2)$ is smaller than $D(p \| q_1)$ for all time t, suggesting that the spatial Markov model is indeed a better approximation than the spatial independent model. On the other hand, when t > 60, $D(p \| q_2) > 0.5$. This explains the performance gap between the Markov model and the simulation observed in Figures 30 and 32. Hence, a model that incorporates more spatial dependence than the Markov model may result in a smaller relative entropy.
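The estimation of the two relative entropies from simulation runs can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure used to produce Figure 33: the function name `estimate_entropies`, the sample format (one pair of node i's status and a tuple of its neighbors' statuses per run, at a fixed time t), the status encoding (0 for susceptible, 1 for infected), and the toy data are all hypothetical.

```python
import math
from collections import Counter

def estimate_entropies(samples):
    """Estimate (D(p||q1), D(p||q2)) at one time t from simulation runs.

    samples: list of (x_i, neighbor_statuses) pairs, one per run, where
    neighbor_statuses is a tuple of 0/1 values (0 = susceptible).
    """
    cond = [nbrs for x_i, nbrs in samples if x_i == 0]  # runs with X_i(t) = 0
    if not cond:
        raise ValueError("node i is not susceptible in any run")
    k = len(cond[0])

    # p: empirical joint distribution of the neighbors given X_i(t) = 0.
    p = {x: c / len(cond) for x, c in Counter(cond).items()}

    def marginals(rows):
        # Empirical P(X_j(t) = 1) for each neighbor j.
        return [sum(r[j] for r in rows) / len(rows) for j in range(k)]

    m_all = marginals([nbrs for _, nbrs in samples])  # unconditional, for q1
    m_cond = marginals(cond)                          # given X_i(t) = 0, for q2

    def product_prob(x, m):
        # Probability of x under a product of per-neighbor marginals.
        prob = 1.0
        for xj, mj in zip(x, m):
            prob *= mj if xj == 1 else 1.0 - mj
        return prob

    def kl_to_product(m):
        total = 0.0
        for x, px in p.items():
            qx = product_prob(x, m)
            if qx == 0.0:
                return math.inf
            total += px * math.log(px / qx)
        return total

    return kl_to_product(m_all), kl_to_product(m_cond)

# Toy data for a node with two neighbors (hypothetical, for illustration only).
toy = [(0, (0, 0))] * 4 + [(0, (1, 1))] * 2 + [(1, (1, 1))] * 3 + [(1, (1, 0))]
d_q1, d_q2 = estimate_entropies(toy)
```

Because $q_2$ is the product of the marginals of p itself, it minimizes the relative entropy from p over all product distributions, so $D(p \| q_2) \le D(p \| q_1)$ holds for any data, which is consistent with the ordering observed in Figure 33. The size of $D(p \| q_2)$ then measures how much dependence among the neighbors remains after conditioning on node i.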

7.6 Final Size of Infection

The final size of infection corresponds to the equilibrium state of a worm network, that is, the average number of infected nodes as time t approaches infinity, i.e., $\lim_{t \to +\infty} n(t)$. The final size of infection characterizes the potential damage as a

Figure 33: Relative entropies in a two-dimensional lattice with 10,000 nodes, β = 0.1, and δ = 0.1. (Relative entropy versus time; curves: independent model, $D(p \| q_1)$, and Markov model, $D(p \| q_2)$.)

stage of worm spreading, the potential damage can be assessed, and preventive actions can be taken accordingly. In this section, we compare our proposed models with the simulation results and the other models in estimating the final size of infection in homogeneous and complex networks. Each simulation scenario has 100 individual runs and is averaged over the cases in which the worm survives. The final size of infection is sampled at time t = 2000.
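The sampling procedure described above (run the simulation to a fixed time, discard runs in which the worm dies out, and average the number of infected nodes over the surviving runs) can be sketched as below. The update rule is an assumption of this example, since the simulation itself is specified elsewhere in the chapter: in each time step an infected node infects each susceptible neighbor with probability β and recovers with probability δ. The names `final_size` and `ring_graph` and the small ring topology are likewise hypothetical.

```python
import random

def ring_graph(n):
    """Adjacency lists of a ring with n nodes (toy topology for the example)."""
    return {v: [(v - 1) % n, (v + 1) % n] for v in range(n)}

def final_size(neighbors, beta, delta, t_end, runs, start=0, seed=0):
    """Average number of infected nodes at time t_end over surviving runs.

    Assumed discrete-time dynamics: per step, an infected node infects each
    susceptible neighbor with probability beta and recovers (becomes
    susceptible again) with probability delta.
    Returns (average final size, number of surviving runs).
    """
    rng = random.Random(seed)
    sizes = []
    for _ in range(runs):
        infected = {start}
        for _ in range(t_end):
            nxt = set()
            for v in sorted(infected):
                if rng.random() >= delta:
                    nxt.add(v)  # v does not recover in this step
                for u in neighbors[v]:
                    if u not in infected and rng.random() < beta:
                        nxt.add(u)  # v infects its susceptible neighbor u
            infected = nxt
            if not infected:
                break  # the worm has died out in this run
        if infected:
            sizes.append(len(infected))  # only surviving runs are averaged
    average = sum(sizes) / len(sizes) if sizes else 0.0
    return average, len(sizes)

# Small usage example; the text uses 100 runs sampled at t = 2000.
avg, survived = final_size(ring_graph(10), beta=0.5, delta=0.1, t_end=50, runs=20)
```

Averaging only over surviving runs matters because a run in which the worm dies out early contributes a final size of zero, which would pull the average below the level of the endemic state that the models are meant to estimate.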

Figure 34(a) shows a comparison of the Epidemiological model, the AAWP model, the independent model, the Markov model, and the simulation results on a connected ER random graph with 10,000 nodes, k = 10, and δ = 0.1. When compared with the simulation results, the Epidemiological model over-predicts the final size of infection when β 0.02, whereas the AAWP model and the independent model slightly over-predict it. The results of the Markov model and the simulation overlap for β between 0.001 and 1. Therefore, the Markov model is the most accurate among all these models. Figure 34(b) gives another comparison of the UCN model, the independent model, the Markov model, and the simulation results on a BA network with 10,000 nodes, k = 4, and β = 0.1. The UCN model over-predicts the final size of infection, whereas the independent model slightly over-predicts it. The results of the Markov model and the simulation overlap for δ between 0.001 and 1. Therefore, both the independent model and the Markov model are shown to be good estimators of the final size of infection, and the Markov model is more accurate than the independent model.

Figure 34: Performance comparisons in estimating the final size of infection. (a) An ER random graph with 10,000 nodes, k = 10, and δ = 0.1 (final size of infection versus birth rate; curves: simulation, Epidemiological model, AAWP model, independent model, Markov model). (b) A BA network with 10,000 nodes, k = 4, and β = 0.1 (final size of infection versus death rate; curves: simulation, UCN model, independent model, Markov model).

7.7 Summary

In this chapter, we have presented a spatial-temporal model to study the dynamic spreading of worms that employ different scanning methods. Making use of this model, we have studied the impact of the underlying topology on worm propagation. We show that the detailed topology information and the spatial dependence are key factors in modeling the spread of worms. The independent model incorporates the detailed topology information and thus outperforms the previous models. Our Markov model incorporates both the detailed topology information and the simple spatial dependence, and thus achieves a greater accuracy than the independent model, especially when both the birth rate and the average nodal degree are small. Moreover, when the graph is dense, each node fluctuates independently about its mean value, and thus both models perform well. These results are validated through analysis and extensive simulations on large networks using real and synthesized topologies.

The class of models we have investigated is biased, i.e., of reduced complexity. Hence, the accuracy of such models is important. The relative entropy is used as a performance measure and shows that a performance gap still exists between the Markov model and reality. Formulations are needed to incorporate more spatial dependence into the model. Furthermore, as both models are motivated by the spirit of the mean-field approximation in machine learning, a formal treatment of the mean-field approximation to include the temporal dependence will be studied in our future work. As part of the ongoing work, we also plan to estimate the parameters of worm propagation (e.g., the birth rate and the death rate) and use our proposed models to study the countermeasures for controlling the spread of worms. Our modeling approach may also help to understand a wide range of information propagation behaviors in the Internet, such as BGP update streams and file sharing in peer-to-peer applications.

CHAPTER VIII

CONCLUSIONS AND FUTURE RESEARCH DIRECTIONS