ZeroAccess Case study: A P2P Botnet Protocol


2.1.2 ZeroAccess Case study: A P2P Botnet Protocol

ZeroAccess is a recent botnet, discovered around July 2011 by Symantec, that infects Windows operating systems [123]. Its primary motivation is to make money through Bitcoin mining and pay-per-click advertising. In the third quarter of 2012, its size was estimated at around one million active bots out of at least nine million infected systems.

The malware spreads through various attack vectors. Among them, ZeroAccess was found in apparently legitimate files that users download from infected websites. It also relies on classical “drive-by-download” attacks distributed by the Blackhole Exploit Toolkit and the Bleeding Life Toolkit [66]. Once executed on a computer, the malware behaves as a typical rootkit to hide and persist on the compromised system. Typically, it infects the Master Boot Record (MBR) of its host and disables the Windows Security Center service and, with it, the user firewall and antivirus provided by Windows 7. It also downloads other malware and lures the user into downloading fake antivirus applications.

Moreover, it opens a backdoor to connect to its network. Its command and control (C&C) channel is used to distribute updates and malicious files among all the botnet members. ZeroAccess has seen multiple updates. In the following, we focus on the C&C protocol it has operated since its update in the second quarter of 2012. To our knowledge, the latest observed protocol update occurred on 29 June 2013 and included small improvements.

Figure 2.2 – UDP traffic generated by a host infected by ZeroAccess

The C&C protocol of the ZeroAccess botnet is a P2P protocol. It enables the creation of a distributed directory of all the infected hosts by means of UDP exchanges. Each bot uses this directory to identify the peers from which it can download malicious files or updates. The protocol does not cover file transfer. As illustrated in Figure 2.2, an infected host constantly contacts other peers to update its peer list and to discover new files to download. In turn, each bot is also constantly contacted by other peers. Thus, a ZeroAccess bot plays the role of both a server and a client.

To avoid easy detection, each message is encrypted by means of a rotated XOR. It encrypts (or decrypts) the message four bytes at a time using a four-byte key, which is rotated left by one bit after each four-byte block. The initial key value is “ftp2”. The routine given in Listing 2.2 can be used to decrypt ZeroAccess communications. The protocol vocabulary is made of three different types of binary messages (getL, retL and newL). In the following, we detail their formats.

    import struct

    def decryptZeroAccessMessage(encryptedMessage):
        # Initial key: the ASCII string "ftp2" read as a 32-bit integer.
        key = 0x66747032
        result = []
        # Process the message in little-endian 32-bit words; the message
        # length is expected to be a multiple of four bytes.
        for i in range(0, len(encryptedMessage), 4):
            subData = struct.unpack("<I", encryptedMessage[i:i+4])[0]
            xoredSubData = subData ^ key
            result.append(struct.pack("<I", xoredSubData))
            # Rotate the key left by one bit within 32 bits.
            key = ((key << 1) & 0xffffffff) | (key >> 31)
        decryptedMessage = b"".join(result)
        return decryptedMessage

Listing 2.2 – Python decryption routine of ZeroAccess messages
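Because XOR is its own inverse and the key schedule does not depend on the data, the same routine can also serve to encrypt a message. The following is a minimal sketch of that property, not code taken from the malware; the function name rotating_xor is hypothetical, and the sketch assumes the message length is a multiple of four bytes.

```python
import struct

def rotating_xor(data: bytes, key: int = 0x66747032) -> bytes:
    """Apply the rotating XOR; the same call encrypts and decrypts."""
    if len(data) % 4:
        raise ValueError("message length must be a multiple of 4 bytes")
    out = []
    for (word,) in struct.iter_unpack("<I", data):
        out.append(struct.pack("<I", word ^ key))
        # Rotate the 32-bit key left by one bit for the next word.
        key = ((key << 1) & 0xFFFFFFFF) | (key >> 31)
    return b"".join(out)

# Encrypting eight zero bytes exposes the first two key values.
keystream = rotating_xor(b"\x00" * 8)
# Applying the routine twice restores the original message.
round_trip = rotating_xor(rotating_xor(b"abcdefgh"))
```

Here the first four bytes of keystream are the little-endian encoding of the initial key, and round_trip equals the original input.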

The getL message is the first message an infected host emits to a predefined list of peers. With this message, the infected host requests a new list of peer IP addresses. As illustrated in Figure 2.3, a getL message is made of four fields, each four bytes long. The first field contains the CRC32 value of the message and the second field contains the message command name (i.e. “getL”). The third field is filled with zeros, while the last field contains a randomly generated number, the unique identifier of the bot.

Field      CRC32        Command               Zero         Bot UID
Example    DA61 DDE5    4C74 6567 (“getL”)    0000 0000    A846 E280
Size       4 bytes      4 bytes               4 bytes      4 bytes

Figure 2.3 – ZeroAccess getL message format
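The fixed layout of Figure 2.3 can be unpacked in a single call. The sketch below is illustrative only: the names GetL and parse_getl are hypothetical, and it assumes that the hexadecimal values of the figure are shown in wire order after decryption, with integers stored in little-endian order as in Listing 2.2 (which is why the command bytes appear reversed).

```python
import struct
from typing import NamedTuple

class GetL(NamedTuple):
    crc32: int
    command: str
    zero: int
    bot_uid: int

def parse_getl(decrypted: bytes) -> GetL:
    """Split a decrypted 16-byte getL message into its four fields."""
    if len(decrypted) != 16:
        raise ValueError("a getL message is 16 bytes long")
    crc32, command, zero, bot_uid = struct.unpack("<I4sII", decrypted)
    # Assumption: the command is a little-endian 32-bit value, so its
    # characters are reversed on the wire ("Lteg" for "getL").
    return GetL(crc32, command[::-1].decode("ascii"), zero, bot_uid)

# Example bytes copied from Figure 2.3.
example = bytes.fromhex("DA61DDE5" "4C746567" "00000000" "A846E280")
msg = parse_getl(example)
```

The CRC32 field is only extracted here, not verified, since the text does not specify over which bytes it is computed.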

The retL message is sent in response to a getL message. Figure 2.4 illustrates its format. It contains a list of IP addresses of other botnet members and a list of files that can be downloaded. As in the getL format, the first and second fields hold the CRC32 value and the command name of the message; in this case, the second field always contains the “retL” value. The four-byte value stored in the third field is often referred to as the “broadcast flag”, which may indicate whether the receiver must propagate the list of IP addresses contained in this message to its own peers. The fourth field contains the number of IP/timestamp pairs stored in the fifth field (denoted “IP Entries” in Figure 2.4). Each pair consists of two four-byte values: the IP address of a peer and its timestamp. Right after this sequence of IP/timestamp pairs, the sixth field gives the number of file entries contained in the last field (denoted “File Entries” in Figure 2.4). A file entry is made of four values. The first value is a four-byte file name, followed by the file creation date, also a four-byte value. The third value is the file size, while the last value is 32 bytes long and might represent the file signature.

Field              Size                            Example / content
CRC32              4 bytes                         1fd085eF
Command            4 bytes                         4C746572 (“retL”)
Broadcast Flag     4 bytes                         00000000
Number of IPs      4 bytes                         10000000
IP Entries         ${Number of IPs} x 8 bytes      IP(0) (4 bytes), TS(0) (4 bytes), ...
Number of files    4 bytes                         03000000
File Entries       ${Number of files} x 44 bytes   Name(0) (4 bytes), Date(0) (4 bytes), Size(0) (4 bytes), Signature(0) (32 bytes), ...

Figure 2.4 – ZeroAccess retL message format
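The retL layout mixes fixed-size fields with two sequences whose sizes depend on previously parsed counters. A possible parser is sketched below under several assumptions: the name parse_retl and the returned dictionary keys are hypothetical, integers are taken as little-endian as in Listing 2.2, and the four bytes of each IP address are assumed to be in network order. The CRC32 is extracted but not verified, and the sample message is synthetic.

```python
import socket
import struct

def parse_retl(decrypted: bytes) -> dict:
    """Parse a decrypted retL message following the layout of Figure 2.4."""
    crc32, command, broadcast, n_ips = struct.unpack_from("<I4sII", decrypted, 0)
    if command[::-1] != b"retL":
        raise ValueError("not a retL message")
    offset = 16
    peers = []
    for _ in range(n_ips):  # each IP entry is 8 bytes
        raw_ip, timestamp = struct.unpack_from("<4sI", decrypted, offset)
        peers.append((socket.inet_ntoa(raw_ip), timestamp))
        offset += 8
    (n_files,) = struct.unpack_from("<I", decrypted, offset)
    offset += 4
    files = []
    for _ in range(n_files):  # each file entry is 44 bytes
        name, date, size, signature = struct.unpack_from("<4sII32s", decrypted, offset)
        files.append({"name": name, "date": date, "size": size, "signature": signature})
        offset += 44
    return {"crc32": crc32, "broadcast": broadcast, "peers": peers, "files": files}

# Synthetic message: one peer and one file entry, placeholder CRC32 of 0.
sample = (
    struct.pack("<I4sII", 0, b"retL"[::-1], 0, 1)
    + socket.inet_aton("192.0.2.1") + struct.pack("<I", 1340000000)
    + struct.pack("<I", 1)
    + struct.pack("<4sII32s", bytes.fromhex("01000080"), 1340000000, 1024, bytes(32))
)
parsed = parse_retl(sample)
```

A truncated message makes struct.unpack_from raise struct.error, which a caller may want to treat as a malformed packet.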

When a bot receives a retL message it checks the file names and creation dates declared in it against the files it has. If it discovers that the remote peer possesses a file it does not have, it tries to obtain a copy of it. To achieve this, it initiates a TCP session to the peer on the same port number as the UDP exchange and downloads the file by means of another protocol.
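The comparison described above amounts to a set difference between the advertised and the locally held files. The sketch below is a simplification with a hypothetical function name, files_to_fetch; it assumes a file is identified by its (name, creation date) pair, since the text does not state how a bot treats a known name with a different date.

```python
def files_to_fetch(advertised, local):
    """Return the advertised (name, date) pairs that are not held locally."""
    held = set(local)
    return [entry for entry in advertised if entry not in held]

advertised = [(b"\x01\x00\x00\x80", 1340000000), (b"\x02\x00\x00\x80", 1341000000)]
local = [(b"\x01\x00\x00\x80", 1340000000)]
missing = files_to_fetch(advertised, local)
```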

The newL message propagates a new peer address across the botnet. When an infected host receives a retL message with the broadcast flag set, it forwards the received peer list to its own peers through a set of newL messages. This message follows a format similar to that of the getL message. As illustrated in Figure 2.5, a newL message is made of four fields. The first and second fields respectively contain the CRC32 value and the command name (“newL”) of the message. The meaning of the third field is unclear; it usually contains the value “8”. The fourth field contains the peer IP address the sender wants to propagate.

Field      CRC32        Command               Unknown      Peer IP
Example    f321 C30E    4C77 656e (“newL”)    0000 0008    xxxx xxxx
Size       4 bytes      4 bytes               4 bytes      4 bytes

Figure 2.5 – ZeroAccess newL message format

The ZeroAccess P2P protocol relies on binary messages. These messages can be grouped into three types according to the value of their second field. Based on this value, the parser expects a specific format for the remaining data. This format mixes fixed-size fields with fields whose size is computed from previously parsed field values. Another notable aspect is its encryption, whose objective is not to ensure the confidentiality of the exchanges but rather to prevent detection through signature-based mechanisms. As detailed in [22], reverse engineering this protocol is easily achieved once a preliminary cryptanalysis has broken the encryption mechanism. It then requires identifying field boundaries and clustering messages that share a similar format.
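The dependency between the second field and the rest of the format can be illustrated by computing the length each message type should have. This is a sketch under the same assumptions as Listing 2.2 (little-endian integers, command bytes reversed on the wire); the names FIXED_LENGTH and expected_length are hypothetical.

```python
import struct

# getL and newL are made of four 4-byte fields.
FIXED_LENGTH = {b"getL": 16, b"newL": 16}

def expected_length(decrypted: bytes) -> int:
    """Return the length implied by the command and counter fields."""
    command = decrypted[4:8][::-1]
    if command in FIXED_LENGTH:
        return FIXED_LENGTH[command]
    if command == b"retL":
        # The number of IP entries sits in the fourth field (offset 12).
        (n_ips,) = struct.unpack_from("<I", decrypted, 12)
        files_count_offset = 16 + 8 * n_ips
        (n_files,) = struct.unpack_from("<I", decrypted, files_count_offset)
        return files_count_offset + 4 + 44 * n_files
    raise ValueError(f"unknown command {command!r}")

# Example bytes from Figure 2.3 and a synthetic retL (2 peers, 1 file).
getl = bytes.fromhex("DA61DDE5" "4C746567" "00000000" "A846E280")
retl = (
    struct.pack("<I4sII", 0, b"retL"[::-1], 0, 2)
    + bytes(16)
    + struct.pack("<I", 1)
    + bytes(44)
)
lengths = (expected_length(getl), expected_length(retl))
```

Comparing the returned value with the actual datagram length gives a cheap sanity check before any field-by-field parsing.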

In document Exploiting Semantic for the Automatic Reverse Engineering of Communication Protocols. (Page 33-35)