Network-Aware Worm Spreading Ability - MODELING AND DEFENDING AGAINST INTERNET WORM ATTACKS

How can we quantify the spreading speed of a network-aware worm, given information about the vulnerable-host distribution? We characterize the spread of a network-aware worm at an early stage by deriving the infection rate.

6.5.1 Infection Rate

The infection rate, denoted by $\alpha$, is defined as the average number of vulnerable hosts that can be infected per unit time by one infected host during the early stage of worm propagation [79]. The infection rate is an important metric for studying network-aware worm spreading ability for two reasons. First, since the number of infected hosts increases exponentially with the rate $1 + \alpha$ during the early stage, a worm with a higher infection rate can spread much faster at the beginning and thus infect a large number of hosts in a shorter time [10]. Second, while it is generally difficult to derive a closed-form solution for dynamic worm propagation, we can obtain a closed-form expression of the infection rate for different worm-scanning methods.
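To see why a higher infection rate matters early on, the growth law stated above can be iterated directly. The following is a minimal sketch, assuming a discrete-time recursion $I_{t+1} = (1 + \alpha) I_t$; the function name and the two values of $\alpha$ are hypothetical and chosen only for illustration.

```python
def early_infected(i0, alpha, steps):
    """Return [I_0, I_1, ..., I_steps] under I_{t+1} = (1 + alpha) * I_t.

    This only models the early stage, where almost every vulnerable host
    is still uninfected; it is not a full propagation model.
    """
    counts = [float(i0)]
    for _ in range(steps):
        counts.append(counts[-1] * (1.0 + alpha))
    return counts


# Hypothetical infection rates (assumptions, not values from the text).
slow = early_infected(i0=10, alpha=0.05, steps=60)
fast = early_infected(i0=10, alpha=0.10, steps=60)
print(f"alpha=0.05: {slow[-1]:.0f} infected, alpha=0.10: {fast[-1]:.0f} infected")
```

Because the growth is geometric, doubling $\alpha$ more than doubles the infected population after the same number of time units.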

Let $R$ denote the (random) number of vulnerable hosts that can be infected per unit time by one infected host during the early stage of worm propagation. The infection rate is the expected value of $R$, i.e., $\alpha = E[R]$. Let $s$ be the scanning rate, i.e., the number of scans sent by an infected host per unit time, $N$ be the number of vulnerable hosts, and $\Omega$ be the size of the scanning space (i.e., $\Omega = 2^{32}$).

For random scanning (RS) [79, 10], an infected host sends out $s$ random scans per unit time, and the probability that one scan hits a vulnerable host is $N/\Omega$. Thus, $R$ follows a binomial distribution $B(s, N/\Omega)$ (see footnote 2), resulting in

$$\alpha_{RS} = E[R] = \frac{sN}{\Omega}. \tag{88}$$

6.5.2 Importance Scanning

We derive the infection rate of importance scanning (IS) [10, 16]. An infected host scans /l subnet $i$ with probability $q_g^{(l)}(i)$. $q_g^{(l)}(i)$ is called the group scanning distribution and is to be chosen with respect to the group distribution $p_g^{(l)}(i)$. If a worm scan hits /l subnet $i$, it has a probability of $\frac{N p_g^{(l)}(i)}{2^{32-l}}$ of finding a vulnerable host. Thus, a worm scan hits a vulnerable host with a likelihood of

$$\sum_{i=1}^{2^l} \left( q_g^{(l)}(i) \cdot \frac{N p_g^{(l)}(i)}{2^{32-l}} \right).$$

Similar to random scanning, $R$ of IS follows a binomial distribution $B\left(s, \sum_{i=1}^{2^l} \frac{N p_g^{(l)}(i)\, q_g^{(l)}(i)}{2^{32-l}}\right)$, which leads to

$$\alpha_{IS} = E[R] = \frac{sN}{2^{32-l}} \sum_{i=1}^{2^l} p_g^{(l)}(i)\, q_g^{(l)}(i). \tag{89}$$

The same result was derived in [10] but by a different approach.
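Equations (88) and (89) are easy to evaluate numerically. The sketch below is illustrative only: the scanning rate, the number of vulnerable hosts, and the four-subnet group distribution are hypothetical values, and the function names are not from the source. Subnets are indexed from 0 in the code rather than from 1.

```python
import math

OMEGA = 2 ** 32  # size of the IPv4 scanning space


def alpha_rs(s, n):
    """Equation (88): infection rate of random scanning."""
    return s * n / OMEGA


def alpha_is(s, n, l, p, q):
    """Equation (89): infection rate of importance scanning.

    p[i] is the group distribution and q[i] the group scanning
    distribution over the 2**l subnets of prefix length l.
    """
    if len(p) != 2 ** l or len(q) != 2 ** l:
        raise ValueError("p and q must each have 2**l entries")
    return s * n / 2 ** (32 - l) * sum(pi * qi for pi, qi in zip(p, q))


# Hypothetical inputs (assumptions for illustration only).
S, N, L = 358, 360_000, 2
P = [0.7, 0.2, 0.1, 0.0]          # toy group distribution over four /2 subnets

uniform_q = [0.25] * 4            # scanning every subnet equally often
focused_q = [1.0, 0.0, 0.0, 0.0]  # scanning only the densest subnet

print("RS:", alpha_rs(S, N))
print("IS, uniform q:", alpha_is(S, N, L, P, uniform_q))
print("IS, focused q:", alpha_is(S, N, L, P, focused_q))
```

With a uniform $q_g^{(l)}$, the sum in Equation (89) equals $2^{-l}$ and the expression reduces to Equation (88); concentrating scans on dense subnets raises the rate.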

We now consider a special case of IS, where the group scanning distribution $q_g^{(l)}(i)$ is chosen to be proportional to the number of vulnerable hosts in group $i$, i.e., $q_g^{(l)}(i) = p_g^{(l)}(i)$. This results in sub-optimal IS [10], called /l IS. Thus, the infection rate is

$$\alpha_{IS}^{(l)} = \frac{sN}{2^{32-l}} \sum_{i=1}^{2^l} \left(p_g^{(l)}(i)\right)^2 = \alpha_{RS} \cdot \beta^{(l)}, \tag{90}$$

where, comparing with Equation (88), $\beta^{(l)} = 2^l \sum_{i=1}^{2^l} \left(p_g^{(l)}(i)\right)^2$ is the non-uniformity factor. Compared with RS, this /l IS can increase the infection rate by a factor of $\beta^{(l)}$.

Such an infection rate can be considered as a benchmark for comparison with other network-aware worms.
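Equating the two forms in Equation (90) with $\alpha_{RS} = sN/2^{32}$ from Equation (88) gives $\beta^{(l)} = 2^l \sum_i (p_g^{(l)}(i))^2$. The following sketch computes this factor from per-subnet host counts; the counts and function names are hypothetical and serve only as an illustration.

```python
import math


def group_distribution(counts):
    """Turn per-subnet vulnerable-host counts into the distribution p(i)."""
    total = sum(counts)
    if total == 0:
        raise ValueError("at least one vulnerable host is required")
    return [c / total for c in counts]


def nonuniformity_factor(p):
    """beta = 2**l * sum_i p(i)**2, where len(p) == 2**l."""
    return len(p) * sum(pi * pi for pi in p)


# Hypothetical host counts in four /2 subnets (assumption for illustration).
clustered = group_distribution([700, 200, 100, 0])
uniform = group_distribution([250, 250, 250, 250])

beta_clustered = nonuniformity_factor(clustered)
beta_uniform = nonuniformity_factor(uniform)
print("beta (clustered):", beta_clustered)
print("beta (uniform):  ", beta_uniform)
```

A uniform distribution gives a factor of 1, so /l IS offers no gain over RS in that case; the more clustered the hosts, the larger the factor.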

6.5.3 Localized Scanning

Localized scanning (LS) has been used by real worms such as Code Red II and Nimda [49, 8]. We first consider a simplified version of LS, called /l LS, which scans the Internet as follows:

• $p_a$ ($0 \le p_a \le 1$) of the time, an address with the same first $l$ bits is chosen as the target,

• $1 - p_a$ of the time, a random address is chosen.

Footnote 2: In our derivation, we ignore the dependency between the events that different scans hit the same target.

Assume that an initially infected host is randomly chosen from the vulnerable hosts. Let $I_g$ denote the subnet where an initially infected host is located. Thus, $P(I_g = i) = p_g^{(l)}(i)$, where $i = 1, 2, \cdots, 2^l$. For an infected host located in /l subnet $i$, a scan from this host probes globally with probability $1 - p_a$ and hits /l subnet $j$ ($j \ne i$) with likelihood $\frac{1 - p_a}{2^l}$. Thus, the group scanning distribution for this host is

$$q_g^{(l)}(j) = \begin{cases} p_a + \dfrac{1 - p_a}{2^l}, & \text{if } j = i; \\[2mm] \dfrac{1 - p_a}{2^l}, & \text{otherwise,} \end{cases} \tag{91}$$

where $j = 1, 2, \cdots, 2^l$. Given the subnet location of an initially infected host, we can apply the results of IS. Specifically, putting Equation (91) into Equation (89), we have

$$E[R \mid I_g = i] = \frac{sN}{2^{32-l}} \left( p_a\, p_g^{(l)}(i) + \frac{1 - p_a}{2^l} \right). \tag{92}$$

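Therefore, we can compute the infection rate of /l LS as

$$\alpha_{LS}^{(l)} = E[R] = E\big[E[R \mid I_g]\big] = \sum_{i=1}^{2^l} p_g^{(l)}(i)\, E[R \mid I_g = i], \tag{93}$$

resulting in

$$\alpha_{LS}^{(l)} = \alpha_{RS} \left( 1 - p_a + p_a \beta^{(l)} \right). \tag{94}$$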
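As a sanity check on the step from Equations (91) to (93) to the closed form in Equation (94), both sides can be evaluated for a toy distribution. This is a sketch under stated assumptions: the values of $s$, $N$, $p_a$, and the four-subnet distribution are hypothetical, the function names are illustrative, and subnets are indexed from 0 in the code.

```python
import math


def ls_scan_distribution(i, l, pa):
    """Equation (91): group scanning distribution for a host in subnet i."""
    m = 2 ** l
    q = [(1.0 - pa) / m] * m   # share of global scans landing in each subnet
    q[i] += pa                 # local scans stay in the host's own subnet
    return q


def alpha_ls_direct(s, n, l, p, pa):
    """Equation (93): average E[R | I_g = i] over P(I_g = i) = p[i]."""
    scale = s * n / 2 ** (32 - l)
    total = 0.0
    for i, pi in enumerate(p):
        q = ls_scan_distribution(i, l, pa)
        # Equation (89) evaluated with the q of Equation (91).
        conditional = scale * sum(pj * qj for pj, qj in zip(p, q))
        total += pi * conditional
    return total


def alpha_ls_closed(s, n, p, pa):
    """Equation (94): alpha_RS * (1 - pa + pa * beta)."""
    beta = len(p) * sum(x * x for x in p)
    return s * n / 2 ** 32 * (1.0 - pa + pa * beta)


# Hypothetical inputs (assumptions for illustration only).
S, N, L, PA = 358, 360_000, 2, 0.5
P = [0.7, 0.2, 0.1, 0.0]

direct = alpha_ls_direct(S, N, L, P, PA)
closed = alpha_ls_closed(S, N, P, PA)
print("direct:", direct, "closed form:", closed)
```

The two values agree up to floating-point error, which is what the algebra behind Equation (94) predicts.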

Since $\beta^{(l)} > 1$ ($\beta^{(l)} = 1$ corresponds to a uniform distribution and is excluded here), $\alpha_{LS}^{(l)}$ increases with $p_a$. Specifically, when $p_a \to 1$, $\alpha_{LS}^{(l)} \to \alpha_{RS}\,\beta^{(l)} = \alpha_{IS}^{(l)}$. Thus, /l LS has an infection rate comparable to that of /l IS. In reality, $p_a$ cannot be 1. This is because an LS worm begins spreading from one infected host that is located in a specific subnet, and if $p_a = 1$, the worm can never spread out of this subnet. Therefore, we [...]

Next, we further consider another LS, called two-level LS (2LLS), which has been used by the Code Red II and Nimda worms [82, 83]. 2LLS scans the Internet as follows:

• $p_b$ ($0 \le p_b \le 1$) of the time, an address with the same first byte is chosen as the target,

• $p_c$ ($0 \le p_c \le 1 - p_b$) of the time, an address with the same first two bytes is chosen as the target,

• $1 - p_b - p_c$ of the time, a random address is chosen.
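For simulation purposes, the three rules above amount to a weighted choice followed by bit masking. The sketch below only generates 32-bit integers as a model of target selection; the function name, the example source address, and the use of Python's `random.Random` are assumptions for illustration and are not taken from any worm's code.

```python
import random


def two_level_ls_target(src, pb, pc, rng):
    """Draw one 2LLS target address (as a 32-bit integer) for a host at src."""
    if not (0.0 <= pb <= 1.0 and 0.0 <= pc and pb + pc <= 1.0 + 1e-12):
        raise ValueError("need 0 <= pb <= 1 and 0 <= pc <= 1 - pb")
    u = rng.random()  # uniform in [0, 1)
    if u < pb:
        # Same first byte: keep the top 8 bits, randomize the low 24 bits.
        return (src & 0xFF000000) | rng.getrandbits(24)
    if u < pb + pc:
        # Same first two bytes: keep the top 16 bits, randomize the low 16.
        return (src & 0xFFFF0000) | rng.getrandbits(16)
    # Otherwise pick a uniformly random address from the whole space.
    return rng.getrandbits(32)


rng = random.Random(0)   # fixed seed so the simulation is repeatable
SRC = 0xC0A80101         # hypothetical source address 192.168.1.1

same_byte = [two_level_ls_target(SRC, 1.0, 0.0, rng) for _ in range(100)]
same_two_bytes = [two_level_ls_target(SRC, 0.0, 1.0, rng) for _ in range(100)]
print(hex(same_byte[0]), hex(same_two_bytes[0]))
```

Setting `pb = 1` or `pc = 1` isolates one rule at a time, which makes the masking easy to check.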

For example, for the Code Red II worm, $p_b = 0.5$ and $p_c = 0.375$ [82]; for the Nimda worm, $p_b = 0.25$ and $p_c = 0.5$ [83]. Using an analysis similar to that for /l LS, we can derive the infection rate of 2LLS:

$$\alpha_{2LLS} = \alpha_{RS} \left( 1 - p_b - p_c + p_b \beta^{(8)} + p_c \beta^{(16)} \right). \tag{95}$$

Since $\beta^{(16)} \ge \beta^{(8)} \ge 1$ from Theorem 4, $\alpha_{2LLS}$ stays the same or increases when both $p_b$ and $p_c$ increase. Specifically, when $p_c \to 1$, $\alpha_{2LLS} \to \alpha_{RS}\,\beta^{(16)} = \alpha_{IS}^{(16)}$. Thus, 2LLS has an infection rate comparable to that of /16 IS. Moreover, $\beta^{(16)}$ is much larger than $\beta^{(8)}$, as shown in Table 4 for the collected distributions. Hence, $p_c$ is more significant than $p_b$ for 2LLS.
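Equation (95) can be evaluated for the two parameter settings quoted above. In this sketch the values of $\beta^{(8)}$ and $\beta^{(16)}$ are hypothetical placeholders, not the values from Table 4, and the function name is illustrative; only the $(p_b, p_c)$ pairs come from the text.

```python
import math


def alpha_2lls_ratio(pb, pc, beta8, beta16):
    """Equation (95) divided by alpha_RS: the speed-up of 2LLS over RS."""
    if not (0.0 <= pb <= 1.0 and 0.0 <= pc and pb + pc <= 1.0 + 1e-12):
        raise ValueError("need 0 <= pb <= 1 and 0 <= pc <= 1 - pb")
    return 1.0 - pb - pc + pb * beta8 + pc * beta16


# Placeholder non-uniformity factors (assumptions, NOT taken from Table 4).
BETA8, BETA16 = 10.0, 50.0

code_red_ii = alpha_2lls_ratio(0.5, 0.375, BETA8, BETA16)   # pb, pc from [82]
nimda = alpha_2lls_ratio(0.25, 0.5, BETA8, BETA16)          # pb, pc from [83]
print("Code Red II speed-up over RS:", code_red_ii)
print("Nimda speed-up over RS:      ", nimda)
```

Whenever $\beta^{(16)}$ exceeds $\beta^{(8)}$, shifting probability mass from $p_b$ to $p_c$ raises the ratio, which is the sense in which $p_c$ is the more significant parameter.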

6.5.4 Modified Sequential Scanning

The Blaster worm is a real worm that exploits sequential scanning in combination with localized scanning. A sequential-scanning worm studied in [81, 30] begins to scan addresses sequentially from a randomly chosen starting IP address and has a propagation speed similar to that of a random-scanning worm. The Blaster worm selects its starting point locally as the first address of its Class-C subnet with probability 0.4 [85, 81]. To analyze the effect of sequential scanning, we do not incorporate localized scanning. Specifically, we consider our /l modified sequential-scanning (MSS) worm, which scans the Internet as follows:

• Newly infected host $A$ begins with random scanning until finding a vulnerable host with address $B$.

• After infecting the target $B$, host $A$ continues to sequentially scan IP addresses $B+1, B+2, \cdots$ (or $B-1, B-2, \cdots$) in the /l subnet where $B$ is located.

Such a sequential worm-scanning strategy is similar in spirit to the nearest neighbor rule, which is widely used in pattern classification [19]. The basic idea is that if the vulnerable hosts are clustered, the neighbor of a vulnerable host is likely to be vulnerable as well.

Such a /l MSS worm has two stages. In the first stage (called MSS 1), the worm uses random scanning and has an infection rate of $\alpha_{RS}$, i.e., $\alpha_{MSS1} = \alpha_{RS}$. In the second stage (called MSS 2), the worm scans sequentially in a /l subnet. The first $l$ bits of a target address are fixed, whereas the last $32 - l$ bits of the address are generated additively or subtractively, modulo $2^{32-l}$. Let $I_g$ denote the subnet where $B$ is located. Thus, $P(I_g = i) = p_g^{(l)}(i)$, where $i = 1, 2, \cdots, 2^l$. Since a sequential worm scan in subnet $i$ has a probability of $\frac{N_i^{(l)}}{2^{32-l}}$ of hitting a vulnerable host, where $N_i^{(l)} = N p_g^{(l)}(i)$ is the number of vulnerable hosts in subnet $i$,

$$E[R \mid I_g = i] = \frac{N_i^{(l)}}{2^{32-l}}\, s = \alpha_{RS} \cdot 2^l\, p_g^{(l)}(i),$$

which leads to

$$\alpha_{MSS2} = E[R] = E\big[E[R \mid I_g]\big] = \alpha_{RS} \cdot \beta^{(l)}. \tag{96}$$

Therefore, the infection rate of /l MSS is between $\alpha_{RS}$ and $\alpha_{RS}\,\beta^{(l)}$.
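The two stage rates can be computed side by side to see the bounds just stated. This is a minimal sketch with hypothetical inputs ($s$, $N$, and a toy four-subnet distribution); the function name is illustrative, and the code indexes subnets from 0.

```python
import math


def mss_stage_rates(s, n, p):
    """Return (alpha_MSS1, alpha_MSS2) for a group distribution p of length 2**l."""
    alpha_rs = s * n / 2 ** 32
    m = len(p)  # number of /l subnets, i.e. 2**l
    # E[R | I_g = i] = alpha_RS * 2**l * p(i) for a sequential scan in subnet i.
    conditional = [alpha_rs * m * pi for pi in p]
    # Average over P(I_g = i) = p(i), which gives Equation (96).
    alpha_mss2 = sum(pi * ci for pi, ci in zip(p, conditional))
    return alpha_rs, alpha_mss2


# Hypothetical inputs (assumptions for illustration only).
S, N = 358, 360_000
P = [0.7, 0.2, 0.1, 0.0]

stage1, stage2 = mss_stage_rates(S, N, P)
print("MSS 1 (random stage):    ", stage1)
print("MSS 2 (sequential stage):", stage2)
```

For this toy distribution the second stage is faster than the first by the non-uniformity factor, so the overall rate of the worm lies between the two printed values.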

In summary, the infection rates of all three network-aware worms (IS, LS, and MSS) can be far larger than that of an RS worm, depending on the non-uniformity factors.
