3.4 Solution and Algorithm Design
3.4.1 Characterisation Algorithm
The characterisation algorithm extracts timing behaviours from embedded machines. It is based on Hypothesis 1 and represented in Figure 3.5.
Hypothesis 1. Every VES introduces delays caused by translating and executing instructions from the host to the guest architecture. This causes differences in timing behaviours between a VES and an REM.
The characterisation algorithm pings the localhost (127.0.0.1) of the system under consideration many times (e.g. 1000) in order to characterise behaviours related to the ping response time (P.), timestamp values (T.) and CPU usage (C.) for each ping, as shown in Figure 3.6. Timestamp values are used as control information: if an attacker tries to fake the ping response time, the algorithm will rely on changes in timestamp values, and vice versa. To fake one or both of these values, an attacker must implement new functionalities, which are executed only after they are translated from the host architecture to the virtualised or emulated system. This translation takes time, which increases both the time between two pings (T.) and the overall characterisation time. These timing discrepancies can be used by the trustee to detect that the attacker is trying to fake the characterisation result.
Figure 3.4: Representation of security mechanisms in MAP for characterising IoT-embedded machines.
The ping response time was obtained with the “ping” command, the timestamp values with the “date” command, and the CPU usage by extracting information from the “/proc/stat” file or with the “iostat” command, depending on the OS used. However, CPU usage is reported only for reference and is not used for detecting forged embedded machines, as it could easily be faked by running several concurrent applications.
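As an illustration of how the three quantities could be extracted, the following is a minimal Python sketch, not the implementation used in this work. It assumes the reply-line format printed by the Linux iputils ping (a `time=... ms` field) and the aggregate `cpu` line of `/proc/stat`; the function names `parse_ping_time_ms`, `read_cpu_counters` and `cpu_usage_percent` are hypothetical.

```python
import re

# Assumption: each reply line contains a field such as "time=0.045 ms",
# as printed by the Linux iputils ping.
PING_TIME_RE = re.compile(r"time[=<]([\d.]+)\s*ms")


def parse_ping_time_ms(line):
    """Return the ping response time (P.) in ms, or None if the line has none."""
    match = PING_TIME_RE.search(line)
    return float(match.group(1)) if match else None


def read_cpu_counters(stat_text):
    """Return (busy, total) jiffies from the aggregate 'cpu' line of /proc/stat."""
    for line in stat_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "cpu":
            # First eight counters: user nice system idle iowait irq softirq steal.
            values = [int(v) for v in fields[1:9]]
            idle = values[3] + (values[4] if len(values) > 4 else 0)
            total = sum(values)
            return total - idle, total
    raise ValueError("no aggregate 'cpu' line found")


def cpu_usage_percent(before, after):
    """CPU usage (C.) in percent between two (busy, total) samples."""
    busy = after[0] - before[0]
    total = after[1] - before[1]
    return 100.0 * busy / total if total > 0 else 0.0
```

In such a sketch, the timestamp (T.) for each reply could be taken with `time.time()` at the moment the line is read, which plays the role of the “date” command.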
Figure 3.5: Hypothesis behind the characterisation algorithm concerning the difference in timing behaviours for translating and executing instructions in REMs and VESs. (Panels: Real Embedded System, in which the application and the embedded operating system execute directly on real embedded hardware such as ARM or MIPS; Virtualised or Emulated Embedded System, in which a host operating system/hypervisor on real hardware such as x86 or x86-64 translates instructions for the virtualised or emulated embedded hardware, introducing delays.)
Figure 3.6: Characterisation algorithm flowchart. The ping command is used locally and information from ping response time (P., in ms), timestamp (T., in s) and CPU usage (C., in %) is stored for as long as the system is still pinging.
all systems with Internet connectivity, including IoT/M2M devices. It cannot, however, be applied to IoT-embedded devices that do not support the ping command; for these devices a different method for extracting timing behaviour is required. Secondly, it is used to obtain precise timing information that is strictly related to the networking stack used by IoT/M2M devices. Another reason is that it will be difficult for attackers to fake the timing information provided by pinging locally, as ping uses network sockets for managing its packets. These sockets reside in the kernel space of the OS, so only a modification of the kernel would allow an attacker to fake the timing information. Attackers that try to fake the ping characterisation results must actively modify part of the kernel of a VES so that it checks when a network socket is created and handles it every time. This task can increase the overall delay and decrease system performance, especially in embedded environments. Furthermore, modifying the kernel is not always possible, especially if the system is not open source.
In fact, when the system in a VES receives ping packets, it will automatically create ping request and response packets, and manage these in both kernel and user space. As described previously, these operations require translating every instruction from the host to the guest architecture. Therefore, it is unlikely that the behaviours will be the same in VESs and REMs.
Moreover, if the attacker tries to slow down the VES response time in order to fake the characterisation, the agency that requests the characterisation can use its local time to detect this. This can be achieved by measuring the difference between the request ($T_{Req}$) and the response ($T_{Resp}$) of the characterisation as follows:
$$T_{Resp} - T_{Req} \approx T_{Ch} + \Delta T_{Ch} + \text{℮}(\Delta T_{RTT}) \tag{3.1}$$
where $T_{Ch}$ is the time required for the characterisation, $\Delta T_{Ch}$ is a very small amount of time (around 2 seconds) in which the IoT mobile agent receives the request, launches the characterisation, receives the results and sends them back to the trustor IoT agency, and $\text{℮}(\Delta T_{RTT})$ is the estimated round-trip time after $T_{Ch}$ plus $\Delta T_{Ch}$ seconds have elapsed.
For example, if the characterisation is supposed to take 3 minutes, the measured difference should be no shorter than 3 minutes and no longer than 3 minutes plus the network communication delays and the trustee IoT mobile agent delays.
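The check described by Equation 3.1 can be sketched as follows. This is a minimal Python illustration under stated assumptions: the function name `characterisation_time_plausible` and the `tolerance` parameter (slack allowed above the expected duration) are hypothetical and do not come from the experiments; the default of 2 seconds for the agent delay follows the text.

```python
def characterisation_time_plausible(t_req, t_resp, t_ch,
                                    delta_t_ch=2.0, est_rtt=0.0,
                                    tolerance=1.0):
    """Check the trustor-side timing of a characterisation (all values in s).

    t_req, t_resp : local times at which the request was sent and the
                    response was received by the trustor
    t_ch          : time the characterisation is supposed to take
    delta_t_ch    : agent handling delay (around 2 s in the text)
    est_rtt       : estimated round-trip time
    tolerance     : assumed slack above the expected duration (hypothetical)
    """
    elapsed = t_resp - t_req
    expected = t_ch + delta_t_ch + est_rtt
    # The response must not arrive before the characterisation could have
    # finished, nor much later than the expected duration.
    return t_ch <= elapsed <= expected + tolerance
```

For a 3-minute characterisation (`t_ch=180`), a response observed after 182.3 s with an estimated round-trip time of 0.3 s would be accepted, whereas one observed after 200 s would be flagged.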
For each target, eight tests were performed in which the ping command was run with different options, both with and without stressing the CPU (when stressed, CPU usage was maintained at around 100%), as shown in Table 3.2.
Table 3.2: List of ping characterisation tests performed.
Test#   Ping option                  CPU stress
1       -c 1000 -i 0.2               No
2       -c 1000 -i 0.2               Yes
3       -c 1000                      No
4       -c 1000                      Yes
5       -c 1000 -s 20000             No
6       -c 1000 -s 20000             Yes
7       -c 1000 -s 20000 -i 0.2      No
8       -c 1000 -s 20000 -i 0.2      Yes
-c: stop after sending n ping packets
-i: wait n seconds between sending each packet
-s: specifies the number of data bytes to be sent
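The eight tests of Table 3.2 can be enumerated programmatically. The sketch below is only an illustration: the function name `build_ping_tests` and the dictionary layout are hypothetical, while the option values and the localhost target are those given in the text.

```python
def build_ping_tests(count=1000, target="127.0.0.1"):
    """Return the eight test configurations of Table 3.2, in table order."""
    option_sets = [
        ["-c", str(count), "-i", "0.2"],
        ["-c", str(count)],
        ["-c", str(count), "-s", "20000"],
        ["-c", str(count), "-s", "20000", "-i", "0.2"],
    ]
    tests = []
    for options in option_sets:
        # Each option set is run once without and once with CPU stress.
        for cpu_stress in (False, True):
            tests.append({
                "test": len(tests) + 1,
                "argv": ["ping", *options, target],
                "cpu_stress": cpu_stress,
            })
    return tests
```

Each `argv` list can be passed directly to `subprocess.run` or `subprocess.Popen` without going through a shell.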
In order to stress the CPU, the “dd” command was employed using the random device as input and the NULL device as output (dd if=/dev/urandom of=/dev/null). Multiple instances of the “dd” command were executed on embedded machines with multi-core CPUs. This stressed not only the CPU but also the kernel, because in Linux the random and NULL devices are managed in kernel space. During the simulations, the systems maintained SSH connections with a server, which was used to request the characterisation and collect its results. The test-bed used during the simulations is shown in Figure 3.7, in which the systems were connected to a switch via Ethernet cables. High-priority traffic was not considered during the simulations, and it cannot be excluded that this may have affected the kernel behaviour.
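The CPU stress described above could be driven from a script along the following lines. This is a hedged sketch: the text only states that multiple “dd” instances were run on multi-core machines, so starting one instance per core is an assumption, and the helper names `stress_commands`, `start_cpu_stress` and `stop_cpu_stress` are hypothetical.

```python
import os
import subprocess

# The dd invocation given in the text.
DD_ARGV = ["dd", "if=/dev/urandom", "of=/dev/null"]


def stress_commands(n_instances=None):
    """Return one dd argv per instance (assumption: one instance per core)."""
    n = n_instances if n_instances is not None else (os.cpu_count() or 1)
    return [list(DD_ARGV) for _ in range(n)]


def start_cpu_stress(n_instances=None):
    """Start the dd instances in the background and return their handles."""
    return [
        subprocess.Popen(argv,
                         stdout=subprocess.DEVNULL,
                         stderr=subprocess.DEVNULL)
        for argv in stress_commands(n_instances)
    ]


def stop_cpu_stress(processes):
    """Terminate all dd instances and wait for them to exit."""
    for process in processes:
        process.terminate()
    for process in processes:
        process.wait()
```

A stressed test would call `start_cpu_stress()` before launching ping and `stop_cpu_stress(...)` once the ping run has finished.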
Figure 3.7: Test-bed used during the simulations for performing characterisation requests and collecting their results from REM and VES systems. (Elements shown: REM 1 to REM N, VESs and the trustor, connected to a switch via Ethernet.)