Target Selection Algorithm - Prentice.hall.Malware.fighting.malicious.code.Nov.2003.ISBN.013101

Once the worm is running on the victim machine, the target selection algorithm starts looking for new victims to attack. Each address identified by the target selection algorithm will later be scanned to determine if a suitably vulnerable victim is using that address. Drawing on the resources of the victim machine, the worm can use a variety of target selection techniques chosen by its author, such as these:

E-Mail Addresses: A worm could dump e-mail addresses from the victim machine's e-mail reader or mail server. Anyone who sent e-mail to or received a message from the current victim is then a potential target.

Host Lists: Some worms harvest addresses from various lists of machines on the local host, such as those stored in the local host files (/etc/hosts on UNIX and LMHOSTS on Windows).

Trusted Systems: On a UNIX victim, the worm could look for trust relationships between the current victim machine and others, by analyzing the /etc/hosts.equiv file and users' individual .rhosts files. These trust relationships, which are sometimes set up so users can access one machine from another without providing a password, are very insecure, offering the worm a leg up in conquering the new victims.

Network Neighborhood: On a Windows network, some worms explore the network neighborhood to find new potential victims. Acting like a user looking for nearby file servers, the worm attempts to find systems by sending queries using Microsoft's NetBIOS and SMB protocols.

DNS Queries: The worm could connect to the local Domain Name System (DNS) server associated with the victim machine, and query it for the network addresses of other victims. DNS servers turn domain names (like www.counterhack.net) into IP addresses (e.g., 10.1.1.15), among other functions. Therefore, DNS servers act as excellent repositories of potential target addresses for a worm.

Randomly Selecting a Target Network Address: Finally, a worm could just randomly select a target address, utilizing an algorithm to calculate a reasonable value to try to infect.

The targeting engines found in most worms have been pretty lame. Many worms merely select IP addresses at random to scan for victims. However, random targeting yields very poor results, based on the distribution of IP addresses on the Internet. Because IP addresses are 32 bits long in the current widely used IP version 4, there are over 4 billion possible addresses on the Internet. However, these addresses were assigned very inefficiently.

Twenty or more years ago, almost no one thought that the cute little Internet and its associated TCP/IP protocol suite would grow into the world-encompassing behemoth we see today. Without this foresight, huge swaths of address space were assigned to single organizations. Way back in the olden days, the potential IP address space was carved into Class A, B, and C networks, described in Table 3.4. Class D and E address spaces also exist, but they are used for multicast and experimental purposes, respectively.

Table 3.4. IP Address Assignment Based on Class

Class     IP Address Range                                        Number of Networks    Number of IP Addresses
                                                                  in This Class         in Each Network

Class A   First octet ranges from 1 to 126, other octets are      126                   16,777,214
          zero to 255: [1-126].x.y.z

Class B   First octet ranges from 128 to 191, other octets are    16,384                65,534
          zero to 255: [128-191].x.y.z

Class C   First octet ranges from 192 to 223, other octets are    2,097,152             254
          zero to 255: [192-223].x.y.z
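
The counts in Table 3.4 follow from how classful addressing divides the 32 address bits. As a rough check, the Python sketch below recomputes them; the function name and dictionary layout are illustrative and do not come from the original text. The leading bits identify the class, the rest of the network prefix numbers the networks, and each network gives up two host addresses (the all-zeros network address and the all-ones broadcast address).

```python
# Classful layout: (fixed leading bits, total network-prefix bits).
CLASS_LAYOUT = {"A": (1, 8), "B": (2, 16), "C": (3, 24)}

def classful_counts(cls):
    """Return (number of networks, usable host addresses per network)."""
    leading_bits, prefix_bits = CLASS_LAYOUT[cls]
    networks = 2 ** (prefix_bits - leading_bits)
    if cls == "A":
        # First octets 0 and 127 (loopback) are reserved, leaving 1 to 126.
        networks -= 2
    # Drop the all-zeros and all-ones host IDs in each network.
    hosts = 2 ** (32 - prefix_bits) - 2
    return networks, hosts

for cls in "ABC":
    networks, hosts = classful_counts(cls)
    print(f"Class {cls}: {networks:,} networks, {hosts:,} addresses each")
```

The printed values match the last two columns of the table for each class.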

Class A networks have more than 16 million possible addresses, yet many of these ranges were given to a single organization, such as a government agency, corporation, or university. Very few of these organizations utilize such large gobs of address space. Therefore, the addresses associated with the original Class A networks are very sparsely populated, looking more like ghost towns than busy cities on a global network. Class B networks contain 65,534 possible addresses. That's a little more reasonable, but still, most organizations don't even have that number of hosts. Finally, we have the little Class C networks with 254 possible addresses. These workhorses are much more densely populated, and are assigned to organizations of all sizes. Today, these class-based address schemes have given way to a different method for assigning address space, called Classless Inter-Domain Routing (CIDR), pronounced cider, as in apples [6]. Although CIDR is much more efficient, some organizations that were originally assigned whole Class As are holding on to their original address assignments, even though much of that space remains completely unused. So, even in today's CIDR world, address usage is still heavily weighted to the traditional Class C networks.

Now, suppose a worm's targeting mechanism generates a new potential target address completely at random. Some worms do just that, thereby implementing a very inefficient spread. If the worm's randomly selected target falls into the old Class A space, there is a significant likelihood that there won't be any valid targets in that range, because it's so sparsely populated. Likewise, a lot of Class B space lies fallow. However, if the worm gets lucky, it'll come up with an address that falls into the Class C space, where there are many victims ripe for the picking. If a worm selects a nonresponsive address, valuable scanning time will be wasted.
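
To see why uniform random selection lands so often in sparse space, it helps to work out what share of the 32-bit address space each class covers. The Python sketch below does only arithmetic on the first-octet ranges from Table 3.4; the helper names are hypothetical and are not taken from the original text.

```python
def address_class(address):
    """Return the classful category of a dotted-quad IPv4 address."""
    first_octet = int(address.split(".")[0])
    if 1 <= first_octet <= 126:
        return "A"
    if 128 <= first_octet <= 191:
        return "B"
    if 192 <= first_octet <= 223:
        return "C"
    return "other"  # 0, 127, and the Class D/E ranges

def class_share(first_low, first_high):
    """Fraction of all 2**32 addresses whose first octet is in the range."""
    return (first_high - first_low + 1) * 2 ** 24 / 2 ** 32

shares = {
    "A": class_share(1, 126),
    "B": class_share(128, 191),
    "C": class_share(192, 223),
}
for cls, share in shares.items():
    print(f"Class {cls}: {share:.1%} of the address space")
```

Under these assumptions, a uniformly random 32-bit address falls in the old Class A space about 49% of the time, in Class B space 25% of the time, and in the densely populated Class C space only 12.5% of the time.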

Remember the famous quip from the old-time gangster, Willie Sutton? When asked why he robbed banks, Sutton replied, "Because that's where the money is!" In a similar way, worms want to carefully select target addresses based on where the machines are. For a far more efficient spread, more sophisticated worm targeting engines focus on the very active ranges of addresses in use, such as the Class C range or even parts of the Class B range. By optimizing the targeting mechanism so that it chooses these types of addresses, the initial spread can occur much more quickly. More efficient (and therefore successful) worms usually target various Class C and Class B ranges.

Furthermore, because of network latency, spreading over a local area network is far quicker than spreading a worm halfway across the planet. Therefore, some targeting engines are designed to generate addresses very near the address of the current worm segment, in the hopes of dominating the local network quickly. After all systems on the local network have been vanquished, the targeting mechanism turns its attention to spreading across a wider area.

Of course, sometimes the victim machine is on a nonpublic address space (i.e., the private IP addresses defined in RFC 1918 that are not routable across the Internet). In such cases, the local address of the victim will fall into certain specified ranges (10.0.0.0 to 10.255.255.255, 172.16.0.0 to 172.31.255.255, and 192.168.0.0 to 192.168.255.255). Many worms, when installed on systems with such addresses, choose targets within the same private range for rapid propagation.
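
For an analyst or defender, checking whether a given host address sits in one of these private ranges is straightforward. The following is a minimal sketch using Python's standard ipaddress module; the function name and the sample addresses are illustrative assumptions, and the three blocks are written in CIDR notation equivalent to the ranges listed above.

```python
import ipaddress

# The three private blocks listed in RFC 1918, in CIDR notation.
RFC1918_BLOCKS = [
    ipaddress.ip_network("10.0.0.0/8"),       # 10.0.0.0 to 10.255.255.255
    ipaddress.ip_network("172.16.0.0/12"),    # 172.16.0.0 to 172.31.255.255
    ipaddress.ip_network("192.168.0.0/16"),   # 192.168.0.0 to 192.168.255.255
]

def rfc1918_block(address):
    """Return the RFC 1918 block containing the address, or None."""
    ip = ipaddress.ip_address(address)
    for block in RFC1918_BLOCKS:
        if ip in block:
            return block
    return None

for sample in ("10.1.1.15", "172.31.255.255", "172.32.0.1", "192.168.0.7"):
    print(sample, "->", rfc1918_block(sample))
```

The explicit block list is used here rather than the module's is_private attribute, because that attribute also covers reserved ranges beyond the three named in RFC 1918.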
