5.2 DICE pseudocode

To summarise, in DICE the adversary authenticates successfully only on a probabilistic basis, and only if their credentials meet the policy enforced by the honeypot operator. This approach covertly creates a unique authtoken for each adversary, which is used to track when they return from different origins. The honeypot authentication process is therefore no longer based on the default passwords supplied with the honeypot, or on a selection of passwords hand-picked by the operator.
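The summary above can be sketched in Python, the language of the prototype. This is a minimal illustration under stated assumptions, not the thesis implementation: the class name `DiceAuthenticator`, the `meets_policy` helper and the minimum password length are hypothetical, and the authtoken is modelled simply as the accepted credential pair, which is assumed to keep working when the adversary returns. The blocked root account follows the policy described later in this chapter.

```python
import random

# Hypothetical policy values; only the blocked root account comes from the text.
BLOCKED_USERS = {"root"}
MIN_PASSWORD_LENGTH = 8  # assumed threshold, for illustration only


def meets_policy(username, password):
    """Return True if the credential pair satisfies the assumed operator policy."""
    if username in BLOCKED_USERS:
        return False
    return len(password) >= MIN_PASSWORD_LENGTH


class DiceAuthenticator:
    """Probabilistic, policy-gated login that records an authtoken per success."""

    def __init__(self, probability, rng=None):
        self.probability = probability      # operator-controlled dice probability
        self.rng = rng or random.Random()
        self.authtokens = {}                # (username, password) -> set of origins

    def login(self, username, password, origin):
        token = (username, password)
        if token in self.authtokens:
            # Assumption: a known authtoken is always accepted, so that a
            # returning adversary can be observed from a new origin.
            self.authtokens[token].add(origin)
            return True
        if not meets_policy(username, password):
            return False                    # weak or blocked credentials are declined
        if self.rng.random() >= self.probability:
            return False                    # the dice roll failed this time
        self.authtokens[token] = {origin}   # covertly create the authtoken
        return True
```

With `probability=0.1`, for example, roughly one policy-compliant attempt in ten would be accepted, and every later login with the same pair would add its origin to the recorded set.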

5.2 Evaluation of DICE

Figure 5.1: DICE paths as a flow chart

Peisert and Bishop (2007) describe the following general scientific principles for experimental research in the computer security field: falsifiable hypotheses, scientific controls, reproducible results and data quality. Hypothesis 4.2, first introduced in Chapter 4, states that an adversary’s authentication credentials can be used to track their interactions with a deceptive cyber system. Hypothesis 4.2 speculates that the authentication credentials that an adversary submits to a honeypot can be used to gather further attribution information that is otherwise not available, as was discussed in Section 5.1. A controlled experiment is used to create evidence that the hypothesis could be valid. Hypothesis 4.2 can be measured by evaluating the technique first in a controlled environment and then in a live deployment. To ensure that results are reproducible and can be used by other researchers, the hardware, software and configuration used in the experiments are documented thoroughly, and source code is provided in Appendices A. The data sets, i.e. honeypot observation logs, are also available for other researchers to evaluate. To protect adversary anonymity, IP addresses have been sanitised to addresses in the RFC 1918 reserved ranges, as shown in Table 5.2. Sanitisation does not adversely affect analysis, as the authtoken links between adversaries remain visible. By using real data sets and adversary attacks, problems associated with artificially and synthetically created data sets are bypassed (Peisert and Bishop, 2007).
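One way to sanitise addresses while keeping authtoken links intact is to map each real address to a stable replacement in a reserved range. The sketch below is an assumption about how this could be done, not the procedure used for the published data sets; the class name `Rfc1918Sanitiser` and the sequential allocation scheme are hypothetical.

```python
import ipaddress


class Rfc1918Sanitiser:
    """Map each real address to a stable address inside an RFC 1918 range."""

    def __init__(self, network="10.0.0.0/8"):
        self.network = ipaddress.ip_network(network)
        self.mapping = {}  # real address -> sanitised address

    def sanitise(self, real_ip):
        real = ipaddress.ip_address(real_ip)
        if real not in self.mapping:
            # Allocate sequentially, skipping the network address itself.
            offset = len(self.mapping) + 1
            if offset >= self.network.num_addresses - 1:
                raise ValueError("sanitised range exhausted")
            self.mapping[real] = self.network.network_address + offset
        return str(self.mapping[real])
```

Because the same real address always maps to the same sanitised address, two log entries from one origin remain linked after sanitisation, while distinct origins stay distinct.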

IP Address Range                 CIDR Notation   Number of Addresses
10.0.0.0 - 10.255.255.255        /8              16,777,216
172.16.0.0 - 172.31.255.255      /12             1,048,576
192.168.0.0 - 192.168.255.255    /16             65,536

Table 5.2: Anonymised IP address ranges (RFC 1918)

To implement a DICE prototype, an existing medium-interaction Secure Shell (SSH) honeypot was modified. This honeypot, known as Kippo (Desaster, 2013), is written in the Python programming language and uses the Twisted framework for SSH protocol simulation. Kippo simulates a computer file system using the Python pickle module, allowing a file system to be built very quickly. To evaluate the DICE prototype and the hypothesis, two experiments were designed: a controlled experiment and a live experiment.

i. Controlled Experiment

The controlled experiment confirmed that the three principles outlined in Section 5.1 (dice probability, policy enforcement and authtokens) performed as expected in controlled conditions. This provided evidence to support Hypothesis 4.2. In this experiment the researcher assumed the role of both the adversary and the honeypot operator. Automated brute-force tools were used against DICE to evaluate the first two techniques, dice probability and policy enforcement. The third technique, authtokens, was evaluated by submitting successful authentication credentials from numerous systems in the controlled environment, which simulated an adversary using different origins. Figure 5.2(a) shows the network topology for experiment (i). The controlled experiment took place in the cyber security laboratories at the University. Firewall rules and network routing were used to ensure that attack traffic did not leak beyond the perimeter of the network and that the honeypots were only visible internally.

ii. Live Experiment

While useful in proving the approach, experiment (i) did not accurately reflect Internet conditions. For example, protocols that are common in Internet environments may not be observed. Similarly, the techniques and tools used by real adversaries cannot be studied. The experiment was therefore expanded by evaluating DICE against real attacks. In experiment (ii) the researcher assumed only the role of the honeypot operator; the honeypot was deployed on a cloud-based infrastructure that was visible to adversaries on the Internet. Experiment (ii) further evaluated Hypothesis 4.2, but this time in a live environment with real attack data.

The adversaries themselves, the times at which they choose to interact, whether or not they interact at all, and the types of interaction are all challenging to control. However, they do not need to be controlled in order to measure the interactions that are required to provide evidence towards Hypothesis 4.2. The same outcomes as were seen in experiment (i) were expected, except that attacks would be carried out by real adversaries, targeting the honeypot from any geographic location. As before, an operator-controlled probabilistic element should affect adversary logins (dice probability), weak passwords should be declined (password policy), and relationships between different origins should be visible (authtoken). Figure 5.2(b) shows the network topology for experiment (ii).
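The expectation that relationships between different origins should be visible can be illustrated by grouping successful logins by credential pair. The following is a sketch only: the log record layout (`username`, `password`, `src_ip`, `success`) and the function name `link_origins` are hypothetical and do not reflect the actual Kippo or DICE log format.

```python
from collections import defaultdict


def link_origins(log_entries):
    """Return credential pairs that were used successfully from more than one origin.

    Each entry is assumed to be a dict with the keys
    'username', 'password', 'src_ip' and 'success'.
    """
    origins = defaultdict(set)
    for entry in log_entries:
        if entry["success"]:
            token = (entry["username"], entry["password"])
            origins[token].add(entry["src_ip"])
    # Only pairs seen from two or more origins indicate a link between origins.
    return {token: ips for token, ips in origins.items() if len(ips) > 1}
```

Since DICE makes an accepted credential pair effectively unique to one adversary, a pair that reappears from a second address suggests the same adversary, or someone they shared the credentials with, returning from a new origin.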

Figure 5.2: Experiment diagram: (a) controlled; (b) live.

5.2.1 Closing potential loopholes

A loophole that could subvert DICE arises if the adversary creates a new user account for their activities or changes the password of the compromised account. To address this common behaviour, two techniques were implemented in DICE. First, adversaries are not allowed to log in with the root account, so they do not possess the privileges necessary to create new accounts. The policy stated in Table 5.5 shows that root accounts are blocked. Second, adversaries are allowed to change the password for the account that they have compromised. This was implemented in DICE by treating the password change as a new account and linking each new account to the initial compromised account. This is described as a “chained” account and is visible in Code Listing 5.1. Another loophole is that an adversary could repeatedly submit a single credential pair to identify the system as a honeypot. This is illustrated in the pseudo-login in Code Listing 5.3.

    ssh user@192.168.1.15
    Password: <foFreOle129!>
    failed
    Password: <foFreOle129!>
    failed
    Password: <foFreOle129!>
    Succeeded
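The "chained" account idea described in this section could be sketched as follows. This is an illustrative assumption rather than the code in the thesis listing: the class `AccountChain` and its methods are hypothetical, and each account is modelled as a (username, password) pair that points back to the pair it replaced.

```python
class AccountChain:
    """Link each changed password back to the initially compromised account."""

    def __init__(self):
        self.parent = {}  # new credential pair -> the pair it replaced

    def _ancestry(self, token):
        """Return the chain from this pair back to the initial account."""
        chain = [token]
        while chain[-1] in self.parent:
            chain.append(self.parent[chain[-1]])
        return chain

    def change_password(self, username, old_password, new_password):
        old = (username, old_password)
        new = (username, new_password)
        # Skip the link if it would create a cycle, e.g. when the
        # adversary changes the password back to an earlier value.
        if new not in self._ancestry(old):
            self.parent[new] = old
        return new

    def initial_account(self, username, password):
        """Resolve any chained pair to the initially compromised account."""
        return self._ancestry((username, password))[-1]
```

Resolving every login to its initial account means that a password change does not break the authtoken link: activity under the new password is still attributed to the originally compromised pair.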
