CONCLUSION


We introduced a load-based cascade model to study the vulnerability of complex networks under random single-node attacks, where an ER random graph of finite size was used to represent the network. We assumed that the capacity of a node is proportional to its initial load and that the load of a failed node is redistributed to its neighbors according to their capacities. The average failure ratio at each step was used to quantify the damage experienced by the network, and a step-by-step estimation of the average failure ratio was provided. The accuracy of these estimations was validated by numerical results. Our analysis for finite-size networks revealed a phase-transition phenomenon in network reactions to single-node attacks, where the average value of the failure ratio drops quickly within a short interval of the load margin. We characterized this interval by finding the critical value of the tolerance parameter at which the failure ratio takes its median value and is most sensitive to variation of the tolerance parameter. We also derived the threshold interval within which this phase transition occurs. Our findings shed light on how to set the load margin for both robustness and efficient use of resources in designing networks resilient to random single-node attacks.

Reprinted from [11]: D. Lv, A. Eslami, and S. Cui, "Load-Dependent Cascading Failures in Finite-Size Erdős–Rényi Random Networks," IEEE Transactions on Network Science and Engineering (TNSE), 2017.

REFERENCES

[1] S. Boccaletti, V. Latora, Y. Moreno, M. Chavez, and D. U. Hwang. Complex networks: Structure and dynamics. Physics Reports, 424(4–5):175–308, Jan. 2006.

[2] Paolo Crucitti, Vito Latora, and Massimo Marchiori. Model for cascading failures in complex networks. Phys. Rev. E, 69(4):045104, Apr. 2004.

[3] I. Dobson, B. A. Carreras, and D. E. Newman. A branching process approximation to cascading load-dependent system failure. In Proceedings of the 37th Annual Hawaii International Conference on System Sciences, pages 1–10, Hawaii, Jan. 2004.

[4] P. Erdős and A. Rényi. On random graphs I. Publ. Math. Debrecen, 6:290–297, 1959.

[5] Paul Erdős and A. Rényi. On the evolution of random graphs. Publ. Math. Inst. Hungar. Acad. Sci., 5:17–61, 1960.

[6] E. N. Gilbert. Random graphs. The Annals of Mathematical Statistics, 30(4):1141–1144, 1959.

[7] Ake J. Holmgren. Using graph models to analyze the vulnerability of electric power networks. Risk Analysis, 26(4):955–969, Aug. 2006.

[8] R. Kinney, P. Crucitti, R. Albert, and V. Latora. Modeling cascading failures in the North American power grid. The European Physical Journal B - Condensed Matter and Complex Systems, 46(1):101–107, July 2005.

[9] Jure Leskovec and Andrej Krevl. SNAP Datasets: Stanford large network dataset collection. http://snap.stanford.edu/data, June 2014.

[10] Daqing Li, Bowen Fu, Yunpeng Wang, Guangquan Lu, Yehiel Berezin, H. Eugene Stanley, and Shlomo Havlin. Percolation transition in dynamical traffic network with evolving critical bottlenecks. Proceedings of the National Academy of Sciences, 112(3):669–672, 2015.

[11] D. Lv, A. Eslami, and S. Cui. Load-dependent cascading failures in finite-size Erdős–Rényi random networks. IEEE Transactions on Network Science and Engineering, 4(2):129–139, April 2017.

[12] Yi-Hua Ma and Dong-Li Zhang. Cascading network failure based on local load distribution and non-linear relationship between initial load and capacity. In Machine Learning and Cybernetics (ICMLC), 2012 International Conference on, volume 3, pages 935–940, July 2012.

[13] Adilson E. Motter and Ying-Cheng Lai. Cascade-based attacks on complex networks. Phys. Rev. E, 66(6):065102, Dec. 2002.

[14] M. E. J. Newman, S. H. Strogatz, and D. J. Watts. Random graphs with arbitrary degree distributions and their applications. Phys. Rev. E, 64:026118, Jul. 2001.

[15] Mark E. J. Newman. Random graphs as models of networks. arXiv preprint cond-mat/0202208, Feb. 2002.

[16] Ke Sun and Zhen-Xiang Han. Analysis and comparison on several kinds of models of cascading failure in power system. In Transmission and Distribution Conference and Exhibition: Asia and Pacific, IEEE/PES, pages 1–7, Dalian, China, Aug. 2005.

[17] JianWei Wang and LiLi Rong. Cascade-based attack vulnerability on the US power grid. Safety Science, 47(10):1332–1336, Dec. 2009.

[18] JianWei Wang and LiLi Rong. Robustness of the western United States power grid under edge attack strategies due to cascading failures. Safety Science, 49(6):807–812, July 2011.

[19] Duncan J. Watts. A simple model of global cascades on random networks. Proceedings of the National Academy of Sciences, 99(9):5766–5771, Apr. 2002.

[20] ZhiXi Wu, Gang Peng, WenXu Wang, Sammy Chan, and Eric WingMing Wong. Cascading failure spreading on weighted heterogeneous networks. Journal of Statistical Mechanics: Theory and Experiment, 2008(05):P05013, May 2008.

APPENDIX A

SOME PROOFS FOR THEOREMS, COROLLARIES, LEMMAS

Lemma 2.3.1 Consider a random single-node attack applied to G(n, p). Let node a be the attacked node, and node e be an arbitrary node in V\{a}. Let P_d be the probability that the shortest path from e to a has length d, and Pr{B_d} be the probability that at least one path from e directly through a node in V\{a ∪ e} to a has length less than or equal to d. Then E[|V_d|], d ≥ 1, the average size of V_d, is simply

E[|V_d|] = (n − 1)P_d,

where P_d, d ≥ 1, can be obtained recursively as

P_1 = p,
P_2 = (1 − p)(1 − (1 − p^2)^{n−2}),
P_d = (1 − p) Pr{B_d} − Σ_{j=2}^{d−1} P_j,  d > 2.

In the numerical calculation, we assume that Pr{B_d}, d > 2, can be approximated recursively as

Pr{B_d} ≈ 1 − ((1 − p) + p · (1 − Σ_{j=1}^{d−1} P_j))^{n−2}.
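The recursion above is easy to evaluate numerically. A minimal Python sketch (the function name and parameters are ours, not from the text), assuming the stated approximation for Pr{B_d}:

```python
def avg_shell_sizes(n, p, d_max):
    """E[|V_d|] = (n - 1) * P_d for d = 1..d_max, per Lemma 2.3.1."""
    P = [0.0, p]                                           # P[1] = p
    P.append((1 - p) * (1 - (1 - p ** 2) ** (n - 2)))      # P[2]
    for d in range(3, d_max + 1):
        # Approximation: Pr{B_d} ~ 1 - ((1 - p) + p*(1 - sum_{j<d} P_j))^(n-2)
        B_d = 1 - ((1 - p) + p * (1 - sum(P[1:d]))) ** (n - 2)
        P.append((1 - p) * B_d - sum(P[2:d]))              # P_d, d > 2
    return [(n - 1) * P[d] for d in range(1, d_max + 1)]
```

For instance, `avg_shell_sizes(100, 0.05, 4)` starts with E[|V_1|] = 99 · 0.05 = 4.95, and the later shells grow before the cumulative probability saturates.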

proof 4 (Proof of Lemma 2.3.1) Suppose that we randomly pick a node a from the ER graph G(n, p), where any two nodes are connected with probability p. For an arbitrary node e ∈ V\{a}, we define a family of probabilities as P_d = Pr{the shortest path between nodes e and a has length d} and P_{[i,j]} = Pr{the shortest path between nodes e and a has length within [i, j]}, satisfying

P_{[i,j]} = Σ_{d=i}^{j} P_d,  0 < i ≤ j.

Since P_d is the same for all the nodes in V\{a}, the expectation of |V_d| over all possible topologies can be calculated as

E[|V_d|] = (n − 1)P_d. (5.1)

We use induction to obtain P_d. We first have P_1 = p. When d ≥ 2, we find P_d as

P_d = P_{[2,d]} for d = 2,  and  P_d = P_{[2,d]} − Σ_{j=2}^{d−1} P_j for d > 2, (5.2)

where P_{[2,d]} remains to be found. The event "the shortest distance from e to a is within [2, d]" is true if the following two independent events happen at the same time: A = "e is not directly connected to a" and B_d = "at least one path from e directly through a node in V\{a ∪ e} to a has length less than or equal to d". It can be easily seen that Pr{A} = 1 − p. Then P_{[2,d]}, d ≥ 2, can be obtained as

P_{[2,d]} = Pr{A} Pr{B_d} = (1 − p) Pr{B_d}. (5.3)

Now we aim to obtain Pr{B_d}, d ≥ 2. Given d = 2, node e can go through any of the (n − 2) nodes in V\{a ∪ e} to reach a along a two-hop path, each such path being present with probability p^2 independently; B_2 happens if any of these paths is connected. Therefore,

Pr{B_2} = 1 − (1 − p^2)^{n−2}. (5.4)

Combining (5.2), (5.3) and (5.4) we have

P_2 = (1 − p)(1 − (1 − p^2)^{n−2}). (5.5)
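Formula (5.5) can be sanity-checked by simulating only the edges relevant to the event "shortest distance from e to a equals 2". A small Monte Carlo sketch (function names are ours), relying on the edge independence of G(n, p):

```python
import random

def p2_exact(n, p):
    """P_2 = (1 - p)(1 - (1 - p^2)^(n-2)), i.e., equation (5.5)."""
    return (1 - p) * (1 - (1 - p * p) ** (n - 2))

def p2_monte_carlo(n, p, trials, seed=0):
    """Estimate Pr{shortest distance from e to a equals 2} in G(n, p)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        direct = rng.random() < p                  # is the edge e-a present?
        # a two-hop path e-v-a exists iff both of v's edges are present (prob p^2)
        two_hop = any(rng.random() < p and rng.random() < p
                      for _ in range(n - 2))
        hits += (not direct) and two_hop
    return hits / trials
```

For G(20, 0.3), the exact value is about 0.572 and the Monte Carlo estimate lands close to it.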

When d > 2, substituting (5.3) into (5.2), we have

P_d = (1 − p) Pr{B_d} − Σ_{j=2}^{d−1} P_j,  d > 2. (5.6)

Combining P_1 = p, (5.5) and (5.6) yields Lemma 2.3.1.

Now we aim to verify the approximation

Pr{B_d} ≈ 1 − ((1 − p) + p · (1 − Σ_{j=1}^{d−1} P_j))^{n−2},  d > 2. (5.7)

Since directly obtaining Pr{B_d} is complicated, we first focus on its complement B̄_d = "the lengths of the (n − 2) shortest paths from e directly through a node in V\{a ∪ e} to a are all greater than d". Therefore, Pr{B_d} can be obtained through 1 − Pr{B̄_d}. Now our goal is to approximate Pr{B̄_d}. For notational simplicity, let l_1, …, l_{n−2} denote the lengths of the (n − 2) shortest paths from e directly through a node in V\{a ∪ e} to a, such that Pr{B̄_d} is the joint probability that l_1, …, l_{n−2} are all greater than d, which can be expressed

as

Pr{B̄_d} = Pr{ ∩_{i=1}^{n−2} {l_i > d} }. (5.8)

To obtain (5.8), we start from analyzing the joint probability that two arbitrary path lengths from l_1, …, l_{n−2} are greater than d. Let us randomly pick two nodes v_1, v_2 ∈ V\{a ∪ e} and consider paths e → v_1 ⇢ a and e → v_2 ⇢ a, where "⇢" stands for the shortest path between the two nodes. Let l_1 denote the length of e → v_1 ⇢ a and l_2 denote the length of e → v_2 ⇢ a. In an ER random graph, e is connected to each node independently with probability p. We can list the following three scenarios regarding the connectivity between e and v_1, v_2:

1. e is not directly connected to v_1, such that l_1 = ∞. In this case, l_1 and l_2 will not be affected by each other, such that they are independent.

2. e is not directly connected to v_2. Similarly, l_1 and l_2 are independent.

3. e is directly connected to both v_1 and v_2. In this case l_1 and l_2 become dependent.

Let C_1 = "Scenario 3): e is directly connected to both v_1 and v_2". According to scenarios 1) and 2), we have l_1 ⊥ l_2 | C̄_1. Now we aim to calculate the joint probability Pr{l_1 > d, l_2 > d}. Given d > 2, Pr{l_1 > d, l_2 > d} can be rewritten according to the law of total probability:

Pr{l_1 > d, l_2 > d}
= Pr{l_1 > d, l_2 > d | C_1} Pr{C_1} + Pr{l_1 > d, l_2 > d | C̄_1} Pr{C̄_1}
= Pr{l_1 > d, l_2 > d | C_1} p^2 + Pr{l_1 > d, l_2 > d | C̄_1} (1 − p^2)
= Pr{l_1 > d, l_2 > d | C_1} p^2 + Pr{l_1 > d | C̄_1} Pr{l_2 > d | C̄_1} (1 − p^2), (5.9)

where the last equality uses l_1 ⊥ l_2 | C̄_1, and

where Pr{l_1 > d | C̄_1} can be obtained by

Pr{l_1 > d | C̄_1} = (Pr{l_1 > d} − Pr{l_1 > d | C_1} Pr{C_1}) / Pr{C̄_1}
= (Pr{l_1 > d} − Pr{l_1 > d | C_1} p^2) / (1 − p^2). (5.10)

Similar to (5.10), we can obtain Pr{l_2 > d | C̄_1}. Substituting them into (5.9), we obtain

Pr{l_1 > d, l_2 > d} = Pr{l_1 > d, l_2 > d | C_1} p^2
+ (Pr{l_1 > d} − Pr{l_1 > d | C_1} p^2) · (Pr{l_2 > d} − Pr{l_2 > d | C_1} p^2) / (1 − p^2)
≈ Pr{l_1 > d} Pr{l_2 > d}, (5.11)

where the approximation cancels the p^2 terms when p^2 ≈ 0.

The approximation (5.11) holds when p^2 is small, which is true under typical values of p in networks of typical sizes, e.g., n ≥ 20. In addition, simulations were conducted to test the accuracy of approximation (5.11), and the results support our analysis. The simulations were conducted as follows: in a network of size n, four nodes a, e, v_1, and v_2 were selected arbitrarily. Then we counted l_1 and l_2 in 100,000 realizations of the ER random graph. Based on the numerical results of l_1 and l_2, we calculated Pr{l_1 > d, l_2 > d} and Pr{l_1 > d} Pr{l_2 > d}. Under varied values of n, p and d, these two probabilities are always approximately equal, which indicates the assumption is valid even in small networks with relatively large p, e.g., n = 20, p = 0.3. Partial results are shown in Table 5.1.

Now we apply this result to expand the joint probability in (5.8). First we rewrite (5.11) as

Pr{l_1 > d, l_2 > d} = Pr{l_1 > d | l_2 > d} Pr{l_2 > d} ≈ Pr{l_1 > d} Pr{l_2 > d}.

Since Pr{l_2 > d} ≠ 0, canceling Pr{l_2 > d} on both sides gives

Pr{l_1 > d | l_2 > d} ≈ Pr{l_1 > d}. (5.12)

Since an ER random graph is a homogeneous network and each node has an identical statistical property, result (5.12) also applies to the other path lengths l_1, …, l_{n−2}. In a finite-size network G(n, p), according to the chain rule, the joint probability Pr{B̄_d} = Pr{∩_{i=1}^{n−2} {l_i > d}} can be expanded as

Pr{B̄_d} = Π_{i=1}^{n−2} Pr{l_i > d | ∩_{j=1}^{i−1} {l_j > d}} ≈ Π_{i=1}^{n−2} Pr{l_i > d}, (5.13)

where the approximation follows from (5.12).

In order to find (5.13), we need Pr{l_i > d}, i = 1, …, n − 2. Assume l_1 is the length of the path through node v_1, so l_1 > d happens when either e is not connected to v_1 (l_1 = ∞) or e is connected to v_1 but the distance between v_1 and a is greater than d − 1. Because the ER random graph is a homogeneous network and l_1, …, l_{n−2} have the identical statistical property, Pr{l_i > d}, i = 1, 2, …, n − 2, can be obtained by

Pr{l_i > d} = (1 − p) + p · (1 − Σ_{j=1}^{d−1} P_j). (5.14)

Substituting (5.14) into (5.13), we have

Pr{B̄_d} ≈ ((1 − p) + p · (1 − Σ_{j=1}^{d−1} P_j))^{n−2},  d > 2. (5.15)

Thus, the approximation in (5.7) is justified.

Table 5.1: Simulation results to test the dependence of two path lengths.

Network       d    Pr{l_1 > d, l_2 > d}    Pr{l_1 > d} · Pr{l_2 > d}
G(100, 0.05)  4    0.9287                  0.9286
G(100, 0.05)  10   0.9054                  0.9051
G(20, 0.3)    4    0.8265                  0.8270
G(20, 0.3)    8    0.4926                  0.4922

Theorem 3.2.1 Consider a random single-node attack applied to G(n, p). We assume the conditional distribution of k(V_1) given |V_1| = x is approximately normal with mean µ and variance σ^2, where

µ = x + x(x − 1)p + (n − x − 1)xp,
σ^2 = (2x(x − 1) + (n − x − 1)x)p(1 − p).

Then E[f_1], i.e., the average failure ratio at step 1, can be approximated as

E[f_1] ≈ (1/n) (1 + Σ_{x=1}^{n−1} C(n−1, x) p^x (1 − p)^{n−1−x} Φ((x/(α−1) − µ)/σ)),

where C(n−1, x) denotes the binomial coefficient.
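The closed-form estimate above is straightforward to evaluate. A minimal Python sketch (function and variable names are ours), using the standard-normal CDF via `math.erf`:

```python
from math import comb, erf, sqrt

def phi(z):
    """Standard normal CDF."""
    return 0.5 * (1 + erf(z / sqrt(2)))

def expected_f1(n, p, alpha):
    """Evaluate the Theorem 3.2.1 estimate of E[f_1] for G(n, p)."""
    total = 1.0  # the attacked node itself
    for x in range(1, n):
        mu = x + x * (x - 1) * p + (n - x - 1) * x * p
        sigma = sqrt((2 * x * (x - 1) + (n - x - 1) * x) * p * (1 - p))
        p_v1 = comb(n - 1, x) * p ** x * (1 - p) ** (n - 1 - x)  # Pr{|V_1| = x}
        total += p_v1 * phi((x / (alpha - 1) - mu) / sigma)
    return total / n
```

Lowering the tolerance parameter α toward 1 shrinks the load margin and raises the step-1 failure ratio, as the formula predicts.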

proof 5 (Proof of Theorem 3.2.1) Since E[f_1] can be obtained as E[f_1] = (1/n)(1 + E[|F_1|]), it is enough to find E[|F_1|]. Based on Corollary 1, we have the failure condition for V_1 as k(V_1) < k(V_0)/(α − 1). According to the failure condition, the distribution of |F_1| can be expressed

as

Pr{|F_1| = x} = Pr{k(V_1) < k(V_0)/(α − 1) | |V_1| = x} · Pr{|V_1| = x}, (5.16)

and the expectation of |F_1| can be obtained by the law of total probability as

E[|F_1|] = Σ_{x=1}^{n−1} x · Pr{k(V_1) < k(V_0)/(α − 1) | |V_1| = x} · Pr{|V_1| = x}, (5.17)

where |V_1| is the number of nodes in V_1, which obeys a binomial distribution. That is, |V_1| ∼ B(n − 1, p) and

Pr{|V_1| = x} = C(n−1, x) p^x (1 − p)^{n−1−x}. (5.18)

Now to find (5.17), we need to calculate the conditional distribution of k(V_1) given |V_1| = x. The links adjacent to nodes in V_1 can be divided into three categories: edges between V_0 and V_1, within V_1, and between V_1 and V_2, denoted by the sets E(V_0, V_1), E(V_1), and E(V_1, V_2), respectively. Such a partition of edges is illustrated in Fig. 5.1. We then have

k(V_1) = |E(V_0, V_1)| + 2|E(V_1)| + |E(V_1, V_2)|. (5.19)

Figure 5.1: Partition of edges within and adjacent to V_1: E(V_0, V_1) = {1, 2, 3}, E(V_1) = {4}, E(V_1, V_2) = {5, 6, 7, 8}.

It can be easily seen that |E(V_0, V_1)| = x, while |E(V_1)| and |E(V_1, V_2)| depend on the connectivity of nodes. In an ER random graph, each pair of nodes is connected with a probability p independent of other pairs [6]. Thus, |E(V_1)| follows a binomial distribution B(C(x, 2), p) when x ≥ 2 and |E(V_1)| = 0 when x = 1, while |E(V_1, V_2)| follows a binomial distribution B((n − x − 1)x, p). Under typical settings, |E(V_1)| is much smaller than |E(V_1, V_2)|. For example, in G(100, 0.05), given |V_1| = 5, we have E[|E(V_1)|] = 0.5, whereas E[|E(V_1, V_2)|] = 94 × 5 × 0.05 = 23.5. Therefore, 2|E(V_1)| + |E(V_1, V_2)| ≈ |E(V_1)| + |E(V_1, V_2)|, which follows B((1/2)x(x − 1) + (n − x − 1)x, p). This binomial distribution is approximately normal since (n − x − 1)x · p and (n − x − 1)x · (1 − p) are both greater than 5 under typical settings in networks with practical sizes (we usually have np ≥ ln n in a connected graph G(n, p) [5]). Therefore, k(V_1) is approximately normal given |V_1| = x. Let µ and

σ^2 denote the conditional mean and variance of k(V_1) given |V_1| = x, respectively. They

can be obtained by

µ = x + x(x − 1)p + (n − x − 1)xp,
σ^2 = (2x(x − 1) + (n − x − 1)x)p(1 − p).

Then Pr{k(V_1) < k(V_0)/(α − 1) | |V_1| = x} in (5.17) can be approximated as

Pr{k(V_1) < k(V_0)/(α − 1) | |V_1| = x} ≈ Φ((x/(α−1) − µ)/σ). (5.20)

After substituting (5.18) and (5.20) into (5.17), we obtain

E[|F_1|] ≈ Σ_{x=1}^{n−1} C(n−1, x) p^x (1 − p)^{n−1−x} Φ((x/(α−1) − µ)/σ).

By definition, E[f_1] = (1/n)(1 + E[|F_1|]), which yields Theorem 3.2.1.

Theorem 3.3.1 Consider a random single-node attack applied to G(n, p). We assume:

1. We only consider the failures propagating in the forward direction; i.e., at step t, only the nodes in V\ ∪_{d=0}^{t−1} V_d are considered as potential nodes to fail.

2. The set F_{t−1} is considered as a large virtual node that redistributes its load to its alive neighbors at step t with the rule defined in (3).

3. n is large enough such that the variance of |V̂_t| is small and |V̂_t| can be approximated by E[|V̂_t|].

4. E[|V̂_t| | |F_{t−1}| = E[|F_{t−1}|]] is applied to approximate E[|V̂_t|].

5. Given |V̂_t| = E[|V̂_t|], L_t(F_{t−1}) and (α − 1)L_0(V̂_t) are independent and approximately normal. L_t(F_{t−1}) has conditional mean µ̃ = E[|T_{t−1}|](n − 1)p and unknown conditional variance σ̃^2. (α − 1)L_0(V̂_t) has conditional mean µ̂ = (α − 1)(n − 1)E[|V̂_t|]p and conditional variance σ̂^2 = (α − 1)^2 (n − 1)E[|V̂_t|]p(1 − p). Φ((µ̃ − µ̂)/σ̂) is applied as an approximation of the failure probability of V̂_t.

Then an estimate of the average failure ratio E[f_t] for step t ≥ 2 is obtained recursively as

E[f_t] ≈ (1/n) Φ((µ̃ − µ̂)/σ̂) E[|V̂_t|] + E[f_{t−1}],

where

E[|V̂_t|] = (n − Σ_{d=0}^{t−1} E[|V_d|]) · (1 − (1 − p)^{E[|F_{t−1}|]}),
E[|T_{t−1}|] = nE[f_{t−1}],
E[|F_{t−1}|] = n(E[f_{t−1}] − E[f_{t−2}]),

E[|V_0|] = 1 by definition, and E[|V_d|], d ≥ 1, are given by Lemma 2.3.1.
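The full recursion can be sketched end to end. A compact Python sketch (all names are ours) that chains Lemma 2.3.1, Theorem 3.2.1, and the recursion above, taking E[f_0] = 1/n for the attacked node:

```python
from math import comb, erf, sqrt

def phi(z):
    """Standard normal CDF."""
    return 0.5 * (1 + erf(z / sqrt(2)))

def shell_sizes(n, p, d_max):
    """[E[|V_0|], E[|V_1|], ...] via Lemma 2.3.1 (E[|V_0|] = 1)."""
    P = [0.0, p, (1 - p) * (1 - (1 - p ** 2) ** (n - 2))]
    for d in range(3, d_max + 1):
        B_d = 1 - ((1 - p) + p * (1 - sum(P[1:d]))) ** (n - 2)
        P.append((1 - p) * B_d - sum(P[2:d]))
    return [1.0] + [(n - 1) * P[d] for d in range(1, d_max + 1)]

def f1(n, p, alpha):
    """E[f_1] via Theorem 3.2.1."""
    total = 1.0
    for x in range(1, n):
        mu = x + x * (x - 1) * p + (n - x - 1) * x * p
        sigma = sqrt((2 * x * (x - 1) + (n - x - 1) * x) * p * (1 - p))
        total += (comb(n - 1, x) * p ** x * (1 - p) ** (n - 1 - x)
                  * phi((x / (alpha - 1) - mu) / sigma))
    return total / n

def failure_ratios(n, p, alpha, t_max):
    """Recursive estimates [E[f_0], ..., E[f_{t_max}]] via Theorem 3.3.1."""
    EV = shell_sizes(n, p, t_max)
    f = [1.0 / n, f1(n, p, alpha)]                 # E[f_0], E[f_1]
    for t in range(2, t_max + 1):
        EF_prev = n * (f[t - 1] - f[t - 2])        # E[|F_{t-1}|]
        EVt = (n - sum(EV[:t])) * (1 - (1 - p) ** EF_prev)
        mu_t = n * f[t - 1] * (n - 1) * p          # tilde-mu, E[|T_{t-1}|] = n E[f_{t-1}]
        mu_h = (alpha - 1) * (n - 1) * EVt * p     # hat-mu
        sd_h = max(sqrt((alpha - 1) ** 2 * (n - 1) * EVt * p * (1 - p)), 1e-12)
        f.append(phi((mu_t - mu_h) / sd_h) * EVt / n + f[t - 1])
    return f
```

Since each step adds a nonnegative increment Φ(·)E[|V̂_t|]/n, the estimated failure ratio is nondecreasing in t, matching the cumulative definition of f_t.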

proof 6 (Proof of Theorem 3.3.1) After step 1, E[f_t] depends on the random variables |V_1|, |V_2|, …, |V_{t−1}|, as well as k(V_1), k(V_2), …, k(V_{t−1}). However, finding the joint distribution of all these random variables is very difficult. Therefore, we need to make several simplifying assumptions and approximations to obtain a closed-form result, as listed in the theorem. In this proof, we will first use these assumptions and approximations to derive the approximation of E[f_t], and then justify all the assumptions and approximations point-by-point at the end of the proof.

According to assumption 1), we only consider the failures propagating in the forward direction; i.e., at step t only the nodes in V\ ∪_{d=0}^{t−1} V_d are considered as potential nodes to fail. We define the set of target nodes V̂_t at step t as the nodes in V\ ∪_{d=0}^{t−1} V_d connected to F_{t−1}, as illustrated in Fig. 5.2. Since each node in V\ ∪_{d=0}^{t−1} V_d has probability p to be connected with a node in F_{t−1} independently, |V̂_t| obeys a binomial distribution B(n − Σ_{d=0}^{t−1} |V_d|, 1 − (1 − p)^{|F_{t−1}|}). The target nodes will receive the redistributed load from F_{t−1}.

Figure 5.2: At step t, F_{t−1} redistributes its load to the set V̂_t. V_{t−1} is the set of nodes with shortest distance t − 1 to node a; F_{t−1} is the set of nodes failing at step t − 1; V̂_t is the set of target nodes at step t, defined as the nodes in V\ ∪_{d=0}^{t−1} V_d connected to F_{t−1}.

Furthermore, instead of analyzing the load redistribution for every node in F_{t−1}, we make assumption 2): the set F_{t−1} is considered as a large virtual node that redistributes its load to its alive neighbors at step t. This assumption makes the analysis mathematically tractable. The problem now becomes "a single node redistributing its load to its neighbors". Similar to step 1, by applying Corollary 1 to this equivalent setting, the failure condition for V̂_t can be found as

L_t(F_{t−1}) + L_0(V̂_t) > αL_0(V̂_t), (5.21)

and E[|F_t|] can be obtained as

E[|F_t|] = Σ_z z · Pr{V̂_t fails | |V̂_t| = z} Pr{|V̂_t| = z}. (5.22)

Recall that our goal is to obtain E[f_t] through

E[f_t] = E[|F_t|]/n + E[f_{t−1}],

where E[f_{t−1}] is estimated in the previous step's analysis. Now we aim to find E[|F_t|] in

(5.22). However, it is difficult to find the exact distribution of |V̂_t| in (5.22), as it requires the joint distribution of |V_1|, |V_2|, …, |V_{t−1}| and k(V_1), k(V_2), …, k(V_{t−1}). According to approximation 3), E[|V̂_t|] is applied to approximate |V̂_t|, so we have

E[|F_t|] ≈ Pr{V̂_t fails | |V̂_t| = E[|V̂_t|]} E[|V̂_t|], (5.23)

where E[| ˆVt|] depends on random variable |Ft−1|:

E[| ˆVt|] =

y

E[| ˆVt| | |Ft−1| = y] · Pr{|Ft−1| = y}. (5.24)

And by V̂_t's definition, we have

E[|V̂_t| | |F_{t−1}| = y] = (n − Σ_{d=0}^{t−1} E[|V_d|]) · (1 − (1 − p)^y). (5.25)

Now we aim to find E[|V̂_t|]. To avoid finding Pr{|F_{t−1}| = y} and the summation in (5.24), we make approximation 4): E[|V̂_t| | |F_{t−1}| = E[|F_{t−1}|]] is applied to approximate E[|V̂_t|]. This approximation leads to

E[|V̂_t|] ≈ (n − Σ_{d=0}^{t−1} E[|V_d|]) · (1 − (1 − p)^{E[|F_{t−1}|]}), (5.26)

where E[|F_{t−1}|] = n(E[f_{t−1}] − E[f_{t−2}]),

and E[f_{t−1}] is given by the previous step's analysis; E[|V_0|] = 1 by definition, and E[|V_d|], d ≥ 1, are given by Lemma 2.3.1. We now estimate the probability Pr{V̂_t fails | |V̂_t| = E[|V̂_t|]} in (5.23). According to the failure condition (5.21), it can be obtained by

Pr{V̂_t fails | |V̂_t| = E[|V̂_t|]} = Pr{L_t(F_{t−1}) > (α − 1)L_0(V̂_t) | |V̂_t| = E[|V̂_t|]}. (5.27)

According to assumption 5), the above probability can be approximated by

Pr{L_t(F_{t−1}) > (α − 1)L_0(V̂_t) | |V̂_t| = E[|V̂_t|]} ≈ Φ((µ̃ − µ̂)/σ̂), (5.28)

where

µ̃ = E[|T_{t−1}|](n − 1)p,
µ̂ = (α − 1)(n − 1)E[|V̂_t|]p,
σ̂^2 = (α − 1)^2 (n − 1)E[|V̂_t|]p(1 − p).

By substituting (5.28) and (5.26) into (5.23), we find E[|F_t|]. Then E[f_t] can be obtained as E[f_t] = E[|F_t|]/n + E[f_{t−1}], which yields Theorem 3.3.1.

Now we show the point-by-point justifications of all assumptions and approximations used:

1. We only considered failures propagating in the forward direction, i.e., at step t, we only considered the nodes in V\ ∪_{d=0}^{t−1} V_d as potential nodes to fail.

This assumption will be discussed in Section 3.4.

2. The set Ft−1 is considered as a large virtual node that redistributes its load to its

alive neighbors at step t with the rule defined in (3).


Figure 5.3: Example of load redistribution. White numbers located inside of circles are degrees, and numbers outside of circles are received load amounts.

This assumption makes the problem mathematically analyzable. Without this assumption, we would need the joint distribution of all node degrees in F_{t−1}, as well as the link connections between F_{t−1} and V̂_t, which is analytically complicated, especially when t is large.

This assumption is appropriate in an ER random graph because such a graph is homogeneous by construction. In a typical realization of an ER graph, the loads of the nodes in F_{t−1} have small variation. After the nodes in F_{t−1} distribute their loads to their alive neighbors according to their degrees, a neighbor node with a higher degree tends to receive more load, and vice versa. Such an example is shown in Fig. 5.3. The given partial network is a typical realization of an ER random graph. Nodes in V_1 redistribute their loads to their neighbors in V_2. The scatter plot of the received load amounts and degrees in V_2 is also shown in Fig. 5.3. The linear correlation of the received load amounts and degrees in V_2 is 0.9556, indicating a strong linear relationship, which matches the assumed case.

3. n is large enough such that the variance of |V̂_t| is small and |V̂_t| can be approximated by E[|V̂_t|].

By the definition of target nodes, we have |V̂_t| ∼ B(n − Σ_{d=0}^{t−1} |V_d|, 1 − (1 − p)^{|F_{t−1}|}), with variance

Var = (n − Σ_{d=0}^{t−1} |V_d|)(1 − (1 − p)^{|F_{t−1}|})(1 − p)^{|F_{t−1}|} = E[|V̂_t|] · (1 − p)^{|F_{t−1}|}.

For t ≥ 2 and before the steady state, we have |F_{t−1}| ≫ 0, (1 − p)^{|F_{t−1}|} ≈ 0, and Var ≈ 0 under typical settings of finite networks. For example, given p = 0.06, |F_{t−1}| = 100, and E[|V̂_t|] = 40, we have Var = 0.0822. With Chebyshev's inequality, we have Pr{||V̂_t| − 40| > 3√0.0822} ≤ 1/9. We can see that |V̂_t| stays close to its mean with high probability, such that it can be approximated by its mean, as long as the network's size is large enough (still finite). In addition, for the asymptotic case, we also have (1 − p)^{|F_{t−1}|} → 0 and Var → 0 as |F_{t−1}| → ∞.

4. E[|V̂_t| | |F_{t−1}| = E[|F_{t−1}|]] is applied to approximate E[|V̂_t|].

For notational simplicity, let Y = |F_{t−1}|. Now our goal is to show that E[|V̂_t|] ≈ E[|V̂_t| | Y = E[Y]]. According to (5.24), we have

E[|V̂_t|] = Σ_y E[|V̂_t| | Y = y] P(Y = y), (5.29)

where E[|V̂_t| | Y = y] can be rewritten as a function of y:

f(y) = c · (1 − (1 − p)^y), (5.30)

Figure 5.4: Histogram of L_t(F_{t−1}) and fitted normal distribution. Skewness and kurtosis indicate that this distribution is quite symmetric and not heavily tailed.

with c = (n − Σ_{d=0}^{t−1} E[|V_d|]) as a constant. Combining (5.29) with (5.30), we have

E[| ˆVt|] = E[f(Y )]. (5.31)

We now aim to show that E[f(Y)] ≈ f(E[Y]). First we look at the derivative of f(y):

f′(y) = −c(1 − p)^y ln(1 − p),

where c is a positive constant, (1 − p)^y < 1, and ln(1 − p) ≈ 0 under typical settings in finite networks (where p is a small number). So f′(y) is a very small positive number. For example, given c = 10, p = 0.08, and y = 30, we have f′(y) = −10 · 0.92^{30} · ln(0.92) = 0.0683, which is close to zero. Since f′(y) is close to zero, f(y) is approximately constant over y, and E[f(Y)] can be approximated by

E[f(Y)] ≈ f(y*),

where y* is an arbitrary point within f(y)'s domain. Let us pick y* = E[Y] such that we have E[f(Y)] ≈ f(E[Y]), where E[Y] is obtained from the previous step's analysis. According to the simulation results, Y usually has a symmetric distribution and E[Y] is the median of Y. Since f(y) is a monotonically increasing function, f(E[Y]) must be the median of f(Y). Thus f(E[Y]) is a reasonable approximation of E[f(Y)]. Based on (5.31), we have

E[| ˆVt|] = E[f(Y )] ≈ f(E[Y ]).

According to the definition of f(y) in (5.30), we have

E[|V̂_t|] ≈ f(E[Y]) = E[|V̂_t| | Y = E[Y]] = E[|V̂_t| | |F_{t−1}| = E[|F_{t−1}|]].
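Both numerical claims in this justification are easy to check: the derivative value f′(30) ≈ 0.0683, and E[f(Y)] ≈ f(E[Y]) when Y is symmetric. A small sketch (the normal distribution for Y is our illustrative choice, not from the text):

```python
import random
from math import log

c, p = 10, 0.08
f = lambda y: c * (1 - (1 - p) ** y)           # f(y) from (5.30)
fprime = -c * (1 - p) ** 30 * log(1 - p)       # f'(30); the text gives 0.0683

# E[f(Y)] vs f(E[Y]) for a symmetric toy Y ~ Normal(30, 3)
rng = random.Random(0)
ys = [rng.gauss(30, 3) for _ in range(50_000)]
lhs = sum(f(v) for v in ys) / len(ys)          # E[f(Y)]
rhs = f(sum(ys) / len(ys))                     # f(E[Y])
```

Because f is nearly flat over the range where Y concentrates, `lhs` and `rhs` agree closely, which is exactly the argument made above.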

5. Given |V̂_t| = E[|V̂_t|], L_t(F_{t−1}) and (α − 1)L_0(V̂_t) are independent and approximately normal. L_t(F_{t−1}) has conditional mean µ̃ = E[|T_{t−1}|](n − 1)p and unknown variance. (α − 1)L_0(V̂_t) has conditional mean µ̂ = (α − 1)(n − 1)E[|V̂_t|]p and conditional variance σ̂^2 = (α − 1)^2 (n − 1)E[|V̂_t|]p(1 − p). Φ((µ̃ − µ̂)/σ̂) is applied as an approximation of Pr{L_t(F_{t−1}) > (α − 1)L_0(V̂_t) | |V̂_t| = E[|V̂_t|]}.

For notational simplicity, let R_1 and R_2 denote the random variables L_t(F_{t−1}) and (α − 1)L_0(V̂_t), conditional on |V̂_t| = E[|V̂_t|], respectively. R_1 has mean µ̃ and variance σ̃^2; R_2 has mean µ̂ and variance σ̂^2. Thus, obtaining the probability Pr{L_t(F_{t−1}) > (α − 1)L_0(V̂_t) | |V̂_t| = E[|V̂_t|]} is equivalent to obtaining Pr{R_1 > R_2}.

First we look at R_1, which does not depend on |V̂_t|; it equals the summation of the initial loads over T_{t−1}. Since all the nodes' initial load amounts are i.i.d., this sum would be well-behaved for a fixed |T_{t−1}|. However, |T_{t−1}| is itself a random variable with an unknown distribution; only E[|T_{t−1}|] is obtained from the previous step's analysis. Thus, µ̃ = E[|T_{t−1}|](n − 1)p is known, while σ̃^2 and the distribution of R_1 remain unknown. To get an idea of what distribution R_1 looks like, we show a histogram of L_t(F_{t−1}) from the simulation results in Fig. 5.4, which has the following settings: n = 100, p = 0.06, α = 1.1, t = 4, and a sample size of 50,000. The case where no failures are triggered at step 1 was excluded from the histogram, since no failures will happen at step 2 either in this scenario. From Fig. 5.4, we see that R_1 is approximately normal.

Then we look at R_2. Given that |V̂_t| = E[|V̂_t|], we have L_0(V̂_t) ∼ B((n − 1)E[|V̂_t|], p), which is approximately normal because E[|V̂_t|] ≫ 0 before the steady state, such that (n − 1)E[|V̂_t|]p and (n − 1)E[|V̂_t|](1 − p) are both greater than 5 under typical settings in a practical-size network. Given that α − 1 is a constant, (α − 1)L_0(V̂_t) is also approximately normal. R_2's mean µ̂ and variance σ̂^2 can be obtained as µ̂ = (α − 1)(n − 1)E[|V̂_t|]p and σ̂^2 = (α − 1)^2 (n − 1)E[|V̂_t|]p(1 − p).

Note that R_1 and R_2 are independent, because L_0(V̂_t) and L_t(F_{t−1}) will not affect each other's distribution for a given |V̂_t|. Then,

Pr{R_1 > R_2} = Pr{R_2 − R_1 < 0} ≈ Φ((µ̃ − µ̂)/√(σ̃^2 + σ̂^2)), (5.32)

where σ̃^2 remains unknown. We notice that Pr{R_1 > R_2} does not depend on σ̃^2 in

the following three cases:

(a) µ̃ − µ̂ ≫ 0, in which case (5.32) ≈ 1;
(b) µ̃ − µ̂ ≪ 0, in which case (5.32) ≈ 0;
(c) µ̃ − µ̂ = 0, in which case (5.32) = 0.5.
