On the Performance of Detecting Injection of Fabricated Messages into the CAN Bus


4-23-2020


Lotfi ben Othmane

Iowa State University

Lalitha Dhulipala

Iowa State University

Nicholas Multari

Pacific Northwest National Laboratories

Manimaran Govindarasu

Iowa State University


The complete bibliographic information for this item can be found at https://lib.dr.iastate.edu/ece_pubs/247. For information on how to cite this item, please visit http://lib.dr.iastate.edu/howtocite.html.

This Article is brought to you for free and open access by the Electrical and Computer Engineering at Iowa State University Digital Repository. It has been accepted for inclusion in Electrical and Computer Engineering Publications by an authorized administrator of Iowa State University Digital Repository.


Abstract

There have been several public demonstrations of attacks on connected vehicles showing the ability of an attacker to take control of a targeted vehicle by injecting messages into its CAN bus. In this paper, using speed reading and RPM reading messages injected into an in-motion vehicle, we examine the ability of the Pearson correlation and the unsupervised learning methods k-means clustering and HMM to differentiate the 'no-attack' and 'under-attack' states of the given vehicle. We found that the Pearson correlation distinguishes the two states, that the k-means clustering method has an acceptable accuracy but a high false positive rate, and that HMM detects attacks with an acceptable detection rate but has a high false positive rate in detecting attacks from speed readings when there is no attack. The accuracy of these unsupervised learning methods is comparable to that of the supervised learning methods used by CAN bus IDS suppliers. In addition, the paper shows that studying CAN anomaly detection techniques using off-vehicle test facilities may not properly evaluate the performance of the detection techniques. The results suggest using features other than the data content of the CAN messages and integrating knowledge about how the ECUs collaborate when building effective techniques for detecting the injection of fabricated messages.

Disciplines

Systems and Communications

Comments

This is a manuscript of an article published as ben Othmane, Lotfi, Lalitha Dhulipala, Nicholas Multari, and Manimaran Govindarasu. "On the Performance of Detecting Injection of Fabricated Messages into the CAN Bus." IEEE Transactions on Dependable and Secure Computing (2020). DOI: 10.1109/TDSC.2020.2990192. Posted with permission.


On the Performance of Detecting Injection of Fabricated Messages into the CAN Bus

Lotfi ben Othmane∗, Lalitha Dhulipala∗, Nicholas Multari‡, Manimaran Govindarasu†

∗ Secure Development of Smart Cyber-Physical Systems Laboratory, Iowa State University, Ames, IA USA

† Iowa State University, Ames, IA USA

‡ Pacific Northwest National Laboratories, Richland, WA USA

Abstract: There have been several public demonstrations of attacks on connected vehicles showing the ability of an attacker to take control of a targeted vehicle by injecting messages into its Controller Area Network (CAN) bus. In this paper, using speed reading and Revolutions Per Minute (RPM) reading messages injected into an in-motion vehicle, we examine the ability of the Pearson correlation and the unsupervised learning methods k-means clustering and Hidden Markov Model (HMM) to differentiate the 'no-attack' and 'under-attack' states of the given vehicle. We found that the Pearson correlation distinguishes the two states, that the k-means clustering method has an acceptable accuracy but a high false positive rate, and that HMM detects attacks with an acceptable detection rate but has a high false positive rate in detecting attacks from speed readings when there is no attack. The accuracy of these unsupervised learning methods is comparable to that of the supervised learning methods used by CAN bus Intrusion Detection System (IDS) suppliers. In addition, the paper shows that studying CAN anomaly detection techniques using off-vehicle test facilities may not properly evaluate the performance of the detection techniques. The results suggest using features other than the data content of the CAN messages and integrating knowledge about how the Electronic Control Units (ECUs) collaborate when building effective techniques for detecting the injection of fabricated messages.

I. Introduction

On newer vehicles, Electronic Control Units (ECUs) communicate using in-vehicle networks to control their behaviors, as shown in Figure 1. For example, the figure shows that the engine control unit communicates with other controllers such as the odometer and brakes over the Controller Area Network (CAN) [1], among other bus systems. Normal and safety critical communications occur in this manner.

In addition to the above, several Intelligent Transportation Systems (ITSs) applications have been proposed and implemented in recent years to improve drivers' experiences and road safety. Examples of these applications include infotainment systems, fleet management systems, parking assistance, remote diagnostics, eCall, remote engine start, and Cooperative Adaptive Cruise Control (CACC) systems [2], [3], [4]. In order to operate correctly, these ITSs must communicate frequently using the CAN bus to exchange messages between the ECUs, sensors and applications.

Fig. 1: Example of connected vehicle [2].

The CAN was designed to operate as a closed network system, that is, all the communicating nodes are trusted [2]. The CAN does not support security mechanisms, such as authentication, to identify and isolate malicious nodes. This design was not a problem when automobiles were standalone entities. Today, however, ITS applications communicate with external entities, such as other vehicles in the case of CACC. This makes it possible for attackers to exploit vulnerabilities in these applications and to attack the connected vehicles.

Many studies in the last decade showed the feasibility of cyber-attacks on connected vehicles. Zhao [5], Wolf et al. [6], and Hoppe et al. [7] were among the pioneers in identifying and discussing attacks on connected vehicles. Koscher et al. demonstrated the feasibility of such attacks [8], such as unlocking doors and controlling the lights, horn, and wipers of a vehicle. In addition, Checkoway et al. [9] experimented with a set of attack vectors including CD players, Bluetooth, and radio. Valasek and Miller demonstrated at DEFCON the ability to remotely take control of a new Jeep Cherokee by accessing the vehicle's steering, brakes, and transmission [10]. Other demonstrations of remote attacks on connected vehicles included the attacks on a Chrysler vehicle [11] and the attack on a Tesla X [12]. Recently, automotive manufacturers, such as Toyota, began planning to equip new models with Vehicle-to-Vehicle (V2V) devices [13], providing new attack vectors for attackers [14]. These demonstrations and events attracted significant news coverage, raising awareness of and concern over the issue. By 2015, 78% of car buyers believed that vehicles could be hacked [15]. Security researchers believe these attacks, once confined to controlled environments, are now available for use by malicious actors [16].

The community has proposed several prevention and detection solutions to address cyber security threats to connected vehicles. The preventive solutions typically require the modification of the CAN protocol, which is impractical for used and new automobiles [2], [3]. The detection-based solutions (or Intrusion Detection Systems, IDSs), e.g., [7], [17], [18], [19], often use artificial data, e.g., [20],¹ data collected from simulators [21], or data related to researchers' devices to detect malicious behaviors, e.g., [22], which limits the practicality of the results. In addition, most of the anomaly-based IDSs use supervised learning techniques, such as neural-network-based techniques [23]. The learning phase of these techniques requires large datasets and extensive time to develop high-performance IDS models from labeled data [24].² Note that the profiles are already specific to every vehicle make and model; they depend on the features of the vehicle's systems [24].

We explore in this paper the capabilities of unsupervised machine learning methods to detect injection attacks of CAN messages using data collected from an in-motion vehicle. Unsupervised techniques do not require extensive time to develop detection models from labeled data and may not depend on the context of the data, such as the make and model of the vehicle and the driving behavior of the driver. We choose in this evaluation the Pearson correlation because it is commonly used to explore the data. We choose k-means clustering and Hidden Markov Model (HMM) because they are commonly used unsupervised learning methods and were used with excellent results [22].

We focus in this study on the vehicle's speed and RPM readings because (1) it is easy to observe the physical effects of injecting speed and RPM readings (the pointers of the speedometer and tachometer move to indicate the injected values) and (2) they are safe to inject while the vehicle is in motion; changing them does not harm the driver, provided the driver is aware of the injection. (Manipulating speed and RPM readings while using vehicle features that rely on these readings, such as cruise control, is not safe.) To detect the message injection attacks, we distinguish two states: no-attack, where no speed or RPM reading messages are injected, and under-attack, where fabricated speed or RPM reading messages are injected. We aim to identify the state of the vehicle from the data exchanged on the in-vehicle network.

¹ The researchers log the messages exchanged between the ECUs in a normally operating vehicle. Then, they randomly inject a set of fabricated messages into the dataset.

² Each record is flagged as related to attack or no-attack states.

The main contributions of the paper are:

1) Capture of the message log of the CAN bus for (1) normal driving behavior, (2) injection of fabricated speed reading messages onto the CAN bus, and (3) injection of fabricated RPM reading messages onto the CAN bus of an in-motion Ford Transit 500 2017.

2) Evaluate the performance of (1) Pearson correlation [25], (2) k-means clustering [26], and (3) Hidden Markov Models [27] to detect injection of speed and RPM readings attacks using datasets collected from an in-motion vehicle. Note that k-means and HMM clustering are commonly used unsupervised learning methods.

3) Show that studying CAN anomaly detection techniques using off-vehicle test facilities may not properly evaluate the performance of the detection techniques.

The paper is organized as follows. First, we give a primer on CAN in Section II and discuss the threat model in Section III. Then, we discuss related work in Section IV. Next, we describe the experiment setup and the collected data in Section V. Afterwards, we discuss the application of the detection techniques and the obtained results in Section VI. Then, we discuss the results along with their impacts and limitations in Section VII and conclude the paper in Section VIII.

II. Background - Primer on CAN protocol

The Controller Area Network (CAN) is a serial communication protocol supporting real-time control with high levels of safety. It was developed by Robert Bosch and officially released at the Society of Automotive Engineers (SAE) conference in Detroit, Michigan in 1986 [1]. The protocol is used by the On-Board Diagnostics (OBD) standard, which has been mandatory for all cars and light trucks sold in the United States since 1996 and in the European Union since 2004. In the remainder of this section, we discuss the format of the CAN message frame, the arbitration mechanism used, and the OBD-II standard.

A. Message format

The CAN is a time-synchronized broadcast, multi-cast reception bus that uses two wires to connect ECUs together. One wire is the CAN high (aka CANH) and the other wire is the CAN low (CANL). There are four types of message frames exchanged on the CAN bus: (1) Data Frame, a frame used to exchange data between the nodes, (2) Remote Frame, a frame requesting the transmission of a specific identifier, (3) Error Frame, a frame transmitted by any node detecting an error, and (4) Overload Frame, a frame to inject a delay between data or remote frames [1]. The data frame, which is used in this study, is composed of seven fields: (1) Start of Frame, which indicates the start of a frame, (2) Arbitration field, which includes the Identifier of the message (stored in the Identifier field) and service bits, (3) Control field, which includes the length of the data contained in the message and two reserved bits, (4) Data field, which is 0 to 8 bytes long and includes the data to be transferred, (5) CRC field, which is used to identify errors, (6) ACK field, which is used to acknowledge receiving the message, and (7) End of Frame, which consists of seven recessive (value 1) bits that indicate the end of the frame [1].

Fig. 2: OBD pin-out diagram [29]

TABLE I: OBD-II Pin Specification

Pin  Function               Pin  Function
1    Manufacturer Specific  9    Manufacturer Specific
2    J1850 Bus (+)          10   J1850 Bus (-)
3    Manufacturer Specific  11   Manufacturer Specific
4    Ground                 12   Manufacturer Specific
5    Ground                 13   Manufacturer Specific
6    CAN High               14   CAN Low
7    K-Line (ISO 9141-2)    15   L-Line (ISO 9141-2)
8    Manufacturer Specific  16   12V Battery Power

B. Arbitration

Transmission on the CAN bus follows the Carrier Sense Multiple Access with Collision Detection (CSMA/CD) protocol [28], where any network node can start transmitting whenever the bus is free [1]. If two nodes (or more) start transmitting at the same time, the conflict is resolved by bit-wise arbitration of the message Identifiers. As a result, the identifier is also called the Priority ID. Nodes detect collisions by comparing each transmitted bit to the bit monitored on the bus. A node continues the transmission if the bits are equal; otherwise, it detects a collision and stops transmitting. The ECU that sends the message with the lowest Identifier wins the arbitration and continues to transmit its message, while the losing nodes retry once the bus is free again. Note that recent vehicles ignore messages with an identifier of "0", the highest priority identifier, based on experiments that we conducted on several 2017 vehicles.
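The bit-wise arbitration described above can be illustrated with a small simulation. This is a minimal sketch and not part of the original study: the function name `arbitrate`, the 11-bit identifier width, and the example identifiers are assumptions chosen for illustration. The bus is modeled as a wired-AND line on which a dominant bit (0) overrides a recessive bit (1).

```python
def arbitrate(identifiers, width=11):
    """Return the identifier that wins bit-wise arbitration.

    Assumption: 11-bit identifiers sent most significant bit first.
    """
    contenders = set(identifiers)
    for bit in range(width - 1, -1, -1):
        # Each node still competing puts its current identifier bit on the bus.
        sent = {ident: (ident >> bit) & 1 for ident in contenders}
        # Wired-AND bus: a dominant 0 from any node overrides recessive 1s.
        bus_level = min(sent.values())
        # A node that sent 1 but reads back 0 has lost and stops transmitting.
        contenders = {ident for ident in contenders if sent[ident] == bus_level}
    return next(iter(contenders))


# Hypothetical identifiers competing for the bus at the same time.
winner = arbitrate([0x254, 0x115, 0x3FF])
print(hex(winner))  # the lowest identifier, 0x115, wins
```

Because a 0 bit always overrides a 1 bit, the outcome is the numerically lowest identifier, which is why the identifier doubles as a priority.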

C. On-Board Diagnostics (OBD)

On-Board Diagnostics (OBD) is implemented in the vehicles to provide ways to report self-diagnosed errors and malfunctions [29]. The SAE introduced a standardized diagnostic port connector and a set of protocols for OBD in 1988. By 1996, OBD-II was mandated nationwide in the United States.

The OBD-II standard specifies the connector type and its pin-outs, signaling protocols, and the format of the messages. Figure 2 shows the connector and Table I provides the pin specification.

All cars manufactured in the USA after 2008 support the Controller Area Network (CAN) protocol and offer a means to interface with it through the OBD-II port, which is usually located under the steering wheel. This port gives direct access to the CAN network. We used the OBD-II port in our experiment to collect data from the CAN bus (see Section V). Specifically, we used pin 6 and pin 14 of the connector. Pin 6 carries CAN High at 3.75 V and pin 14 carries CAN Low at 1.25 V.

III. Threat Model

Although claimed to be a secure protocol by its creator, the CAN protocol is acknowledged to be insecure³ [30], [10]. It was designed to be a closed network system [2]. The security weaknesses include [31]:

• Lack of segmentation and boundary defense: The ECUs of a vehicle are often organized in sections; however, all sections of the bus are connected through gateways. Thus, an intruder may, for example, gain access to the Anti-lock Braking System through the infotainment system.

• Lack of device authentication: Data broadcast by a CAN node reaches all nodes in the network, with no authentication of the sender node, that is, any node connected to the network can send any CAN identifier it wishes. In addition, any device can participate in the network communications solely by connecting to the bus.

• Un-encrypted traffic: Messages exchanged between the nodes of a CAN are plain-text; they are not considered confidential.

• Availability check: The CAN network checks for the availability of the carrier before every message is sent. If the carrier is busy, all the ECUs are halted and the carrier is rechecked after a period. This mechanism can be used by attackers to carry out Denial of Service (DoS) attacks.

Access to the CAN of a vehicle could be local or remote [32], [9], [2]. Local attacks can be initiated through, for example, the Passive Anti-Theft System (PATS), Tire Pressure Monitoring System (TPMS), Remote Keyless Entry / Start (RKE), Bluetooth, and OBD connector [32]. The attack surface for remote access includes Radio Data System, Telematics, Cellular, and WiFi [9], [2], [32]. Infotainment systems are among the main entry points [33].

We used in our experiment the OBD-II port as it gives direct physical access to the CAN network. We injected fabricated speed and RPM reading messages [19] into the bus. The port has been used previously to inject messages into the CAN bus of vehicles [10], [19]. Injection of fabricated speed and RPM messages can trick the driver and autonomous driving systems (if they use the speed reading in their decision making) and may lead to wrong decisions.

³ Robert Bosch claims in [1] that the protocol "supports distributed real-time control with a very high level of security." This is merely an incorrect translation, in our opinion. The German term "Sicherheit" means both security and safety, as do the corresponding terms in other languages such as French and Arabic.

Fig. 3: OBD-II port.

The paper proposes a set of IDS techniques to detect injection of speed and RPM reading messages into the CAN bus, which are Pearson correlation, k-means, and HMM. IDSs are commonly used as a protection control to mitigate injection of messages into in-vehicle networks [34]. Potential attackers need to bypass the IDS, if installed on the in-vehicle network of a vehicle, to be able to successfully inject speed or RPM reading messages.

IV. Related Work

This section discusses related work on attack prevention and attack detection methods.

A. Automobile attacks and prevention techniques

The absence of an authentication mechanism in the CAN protocol is the main root cause of most of the reported attacks. Several authentication protocols were proposed to address this weakness. For example, Groza et al. [35] proposed CAN Auth, a protocol that uses a 15-byte pre-shared key to authenticate transmitted messages. The protocol also uses a counter to prevent replay attacks. Other solutions include the Light Weight CAN Authentication Protocol (LCAP) [36], the Light Weight Broadcast Authentication CAN Protocol (LiBrA CAN) [35], and the CAN ID shuffling using the network address shuffling technique [37]. Like most prevention solutions, these solutions require vehicle modifications to support the proposed modified CAN protocols.

B. Attack detection and mitigation

Several anomaly-based solutions to detect attacks on the in-vehicle network have been proposed. For example, Müter et al. [17] proposed logical sensors that use the network protocol specifications, the redundancy of data sources, and the cooperation mechanisms of the devices to detect attacks on the CAN bus. Although this technique does not produce false positives as it uses unambiguous and reliable information, it does produce a high number of false negatives as a result of errors caused by faulty hardware. Müter et al. [18] also proposed the use of a coincidence metric, which is the relative entropy of the CAN messages when there is no attack versus when there is an attack. The metric has a low value when there is no attack and a high value when there are attacks, due to the repetition of messages. The technique fails, however, to detect attacks when the data values used in the attack deviate slightly from the legitimate values. In addition, it is difficult to identify a threshold to distinguish a legitimate state from an attack state.

Several Machine Learning (ML)-based attack detection techniques have been proposed. For instance, Taylor et al. [20] proposed the use of Long Short Term Memory (LSTM) networks to detect attacks on the CAN bus. The authors collected CAN bus data from the normal operation of a 2012 Subaru Impreza and injected fabricated messages into the dataset to 'mimic' attacks. Then, they trained a detection model to predict the next data byte originating from each of the ECUs, such that deviations from the model are declared attacks. The model detects anomalies with very low false positive rates. Also, Levi et al. [21] proposed a system for monitoring the different interfaces of the vehicle, extracting relevant information, and sending the data to a trained generative model to detect deviations from the normal behavior. The system uses HMM to generate the model of normal vehicle behavior and a regression model to calibrate the likelihood threshold of anomalies. Cho and Shin proposed Clock-based IDS (CIDS) [19], an anomaly-based intrusion detection system that fingerprints ECUs based upon their periodic in-vehicle messages. CIDS detects attacks and identifies their sources by identifying deviations in these fingerprints. The technique does not, however, work for ECUs that send aperiodic messages.

Lokman et al. [23] surveyed the literature of intrusion detection methods and classified them into anomaly-based, frequency-based, machine learning-based, statistical-based, specification-based, signature-based, and hybrid-based methods. They found that most of the methods achieved high accuracy, require labeled data, and focus on detecting attacks from periodic malicious messages.

Tomlinson et al. [22] also surveyed the literature of intrusion detection methods for the CAN bus and analyzed their practicalities. They classified the methods into signature-based, statistical, knowledge-based, clustering-based, neural network-based, and HMM-based approaches. The authors found that the surveyed studies typically used data fabricated by manipulating the CAN logs collected from vehicles or simulators, e.g., [21]. Tomlinson et al. identified several problems with these methods, including that the fabricated dataset typically does not account for the response and feedback of ECUs/fault detection mechanisms within the vehicle, as well as an unrealistic expectation that the simulators will faithfully replicate the true behavior of the vehicles.⁴ The results, therefore, "might be less realistic" [22]. To address these identified shortfalls, we use an actual in-motion vehicle to perform our experiments.

Seo et al. proposed an IDS for in-vehicle networks that uses convolutional neural networks and generative adversarial nets [38] to identify injection attacks in the CAN bus. The model encodes the CAN IDs of the messages captured from the attacked vehicle as one-hot vectors, which allows the CAN messages to be treated as images. The authors evaluated their technique using data captured from a parked Hyundai TF Sonata; they generated five Neural Network models, each of which detects one of the following attacks: Denial of Service (DoS), injection of fuzzy messages, injection of RPM messages, and injection of GEAR messages. The five models have high attack detection rates (and also high precision and accuracy) on the attacks that they are trained on. The technique requires training a model for each attack type. In addition, the dataset used in the study was collected from a parked vehicle, which limits the scope of the study results because, when parked, only a small subset of the ECUs of the test vehicle inject messages into the CAN bus and the frequencies at which these ECUs send messages are constant.

Iegotov and Fischmeister [39] proposed a technique to mine a task precedence graph from the trace of the messages exchanged in the CAN bus and an anomaly detection algorithm that checks the conformance of the CAN messages received from the vehicle to the patterns of the mined task precedence graph. The assessment of the techniques using Seo et al.’s dataset [38] revealed that the technique falsely detects anomalies. The techniques detect false anomalies when the attacking device tries to send spoofed messages because it changes the message periods of some devices. This issue may indicate that the performance of the techniques would degrade when using in-motion vehicles because there will be devices with noncyclic frequencies of sending messages.

Boumiza and Braham investigated the capability of HMM to detect anomalies [40]. They also used Seo et al.'s dataset [38], which limits the scope of the results as specified earlier.

Hanselmann et al. [41] proposed a supervised Neural Network (NN)-based intrusion detection technique for high-dimensional CAN bus data. The system trains the NN from the data of 20 signals managed by 13 CAN-bus devices and identifies flooding attacks (attacks that send messages of a particular ID with high frequency to the CAN bus), suppression attacks (the device is prevented from sending messages) and payload modification attacks (changing the content of CAN messages). The technique has a good performance for three of the attacks.

⁴ Simulators execute predefined models designed by their authors and focus on a set of specific aspects of the real-world behavior.

Fig. 4: Raspberry PI and PiCAN board connection to the OBD-II port.

Stachowski et al. [24] developed a test suite to assess the performance of IDSs to recognize messages that deviate from the normal cyclic message rate, track messages that contain counters, rationalize message content from system-wide status, detect violations of the Bus Off timer, distinguish anomalous messages from similar messages invoked by common driving scenarios, and distinguish anomalous messages from similar messages invoked by unusual but non-anomalous driving behaviors. The authors tested three IDSs, which were developed by three suppliers, on three different vehicles in stationary and in-motion scenarios. The IDSs have varying performance results. For instance, the accuracy of the three IDSs varies between 50% and 100% in detecting violations of (not specified) cyclic messages. In addition, two of the three systems reported high false positive rates (between 12% and 71%) in detecting anomalous events when there are no attacks.

V. Experiment setup and data collection

This section discusses the setup of our experiments, specifically the devices used, the data collection methods employed, and the datasets collected.

A. Experiment setup

Figure 4 shows the CAN node, composed of a Raspberry Pi 3 [42] and a PiCAN board [43], that we connected to the OBD-II port [29] of our vehicle.⁵ The PiCAN connects to the CAN network through a DB-9 connector to the CAN H and CAN L pins. The DB-9/OBD-II adapter maps the CAN H pin (OBD-II pin 6) to DB-9 pin 3 and the CAN L pin (OBD-II pin 14) to DB-9 pin 5.

⁵ Recall that the OBD-II port provides physical access to the

Fig. 5: Example of collected CAN data stream.

Fig. 6: Flowchart of the process of identifying the PID of the ECU of interest. (Flowchart steps: sniff and log in-vehicle messages; play the log file; check whether the action happened; split the log and play the first half; repeat until the log is down to one PID.)

We used for this experiment the can-utils tool-chain [44] and SocketCAN [45]. The can-utils tool-chain can be cloned from GitHub [46] and installed on the Raspberry Pi. SocketCAN is an open-source driver package integrated into the Linux networking stack. The tool-chain includes additional tools to sniff the CAN network, monitor changes, and inject CAN messages.

B. Collection of CAN Datasets

This subsection describes the collection of CAN data, identification of the PID of the ECUs of interest: the Tachometer and Speedometer, and transformation of the data to human readable values.

Data collection. We monitor the traffic using our data collection devices and archive the CAN messages that the vehicle's ECUs exchange. Figure 5 shows an example of data collected from the CAN bus. Each row of the data corresponds to a message broadcast by an ECU on the CAN bus. The three hexadecimal digits after the term "can0" represent the PID of the ECU that broadcast the message. The hexadecimal digits after the "#" character comprise the data within the CAN message.

Fig. 7: Injection of Speed values messages.

Fig. 8: Injection of RPM messages.
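The row layout described above (the interface name "can0", a three-hexadecimal-digit PID, the "#" character, and the data) can be parsed with a few lines of code. This is a minimal sketch under stated assumptions: the leading parenthesized timestamp follows the candump log style and is assumed here, since the exact layout of Figure 5 is not reproduced in the text, and the function name `parse_line` and the sample rows are hypothetical.

```python
import re

# Assumed row layout: "(timestamp) interface PID#DATA", e.g. a candump-style log.
LINE_RE = re.compile(
    r"^\((?P<ts>\d+\.\d+)\)\s+(?P<iface>\S+)\s+"
    r"(?P<pid>[0-9A-Fa-f]{3})#(?P<data>(?:[0-9A-Fa-f]{2})*)$"
)


def parse_line(line):
    """Split one logged row into timestamp, interface, PID, and data bytes."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None  # the row does not follow the assumed layout
    return {
        "timestamp": float(match["ts"]),
        "interface": match["iface"],
        "pid": match["pid"].upper(),
        "data": bytes.fromhex(match["data"]),
    }


# Hypothetical row; the payload bytes are made up for illustration.
record = parse_line("(1586000000.123456) can0 115#0A1B")
print(record["pid"], record["data"].hex())
```

Keeping the PID as a string preserves the notation used in the log, so rows can be grouped by PID before any value is decoded.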

Identification of PID of interest. Identifying the PID of the CAN messages that belong to the ECU of interest, such as the speedometer or tachometer, requires performing an action related to that ECU and collecting the resulting messages exchanged over the CAN bus [44]. For example, to identify the PID of the ECU responsible for the speed reading, we accelerate and observe the resulting message traffic.

To determine which PID triggers a desired action, we used the record and replay of CAN messages technique [44] to match the selected action to the appropriate PID. Figure 6 shows the flowchart of the method. To identify, for example, the PID corresponding to the tachometer reading (Engine RPM), we push or release the accelerator and log all the messages broadcast over the CAN bus. Then, we play back the log file to check if the tachometer reading changes. Next, we divide the log into two halves and play the first half of the log. We conclude that the PID of concern is in the first half of the log if the tachometer reading changes and in the second half of the log if the tachometer does not change. We repeat the process until we identify the PID that causes the action. We applied this technique to identify the PIDs for tachometer- and speedometer-related ECUs on the Ford Transit 500 and we found them to be 115 and 254 respectively.

(a) Relationship of the change of the speed over time.

(b) Relationship of the change of the RPM over time.

(c) Relationship between speed and RPM.

Fig. 9: Plots of the speedometer and tachometer readings for the normal functioning scenario. (The sequence IDs indicate the order of receiving the reading messages by our CAN bus monitoring node.)
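The halving procedure described above can be sketched as follows. This is an illustration under stated assumptions rather than the tooling used in the experiment: `find_pid` and `action_happens` are hypothetical names, the replay on the vehicle is replaced by a simulated check, and the sample log is made up. The sketch also assumes that replaying the complete log does trigger the action.

```python
def find_pid(log, action_happens):
    """Narrow a recorded log down to the PID that triggers an action.

    log: list of (pid, data) tuples in capture order.
    action_happens: callable that replays a list of messages and returns
        True if the action was observed (a stand-in for replaying the
        messages on the vehicle and watching the dashboard).
    """
    remaining = list(log)
    # Stop once every remaining message carries the same PID.
    while len({pid for pid, _ in remaining}) > 1:
        middle = len(remaining) // 2
        first_half, second_half = remaining[:middle], remaining[middle:]
        # Keep the half that still reproduces the action.
        remaining = first_half if action_happens(first_half) else second_half
    return remaining[0][0]


# Simulated capture: only messages with PID "115" move the tachometer.
sample_log = [("0F1", "00"), ("115", "1A2B"), ("254", "0FFF"),
              ("3B2", "01"), ("115", "1A2C"), ("0F1", "02")]


def tachometer_moves(messages):
    return any(pid == "115" for pid, _ in messages)


print(find_pid(sample_log, tachometer_moves))  # prints 115
```

Each round halves the number of messages to replay, so the number of replays grows only logarithmically with the size of the captured log.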

Transformation of the data to human readable values. Each ECU has a set of rules for storing the data in the data field(s) of CAN messages. In this experiment, we collected the physical data, e.g., the speed shown on the speedometer, and the related messages that the given ECU broadcast [44]. The obtained data is used to infer the storage rules. For example, the formula for the speed is:⁶

$$V_{dec} = \frac{V_{hex} \times 0.62137119}{100}$$

where $V_{dec}$ is the real speed and $V_{hex}$ is the hexadecimal value contained in the message.
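The speed formula above can be applied directly to a hexadecimal value read from a message. The following is a minimal sketch: the function name `speed_mph` is hypothetical, and the text does not state which bytes of the data field hold the value, so the hexadecimal string is taken as given.

```python
KMH_TO_MPH = 0.62137119  # multiplier given in the text (km/h to MPH)


def speed_mph(v_hex):
    """Convert the hexadecimal speed value of a message to MPH."""
    return int(v_hex, 16) * KMH_TO_MPH / 100


# "FFF" is the speed value listed for the injection data set in Table II.
print(round(speed_mph("FFF"), 2))  # prints 25.45
```

The result of about 25.45 MPH for "FFF" appears consistent with the injected speed of 25 reported for Figure 7, although the text does not state this link explicitly.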

Inject fabricated messages into the CAN bus of the vehicle. First, we construct a message that contains the desired PID of the ECU and the data. We also construct a log file with copies of the message. Then, we use the can-utils tool to replay the log file. Figure 7 shows the effect on the speedometer of injecting speed reading messages with the value 25 MPH, and Figure 8 shows the effect on the tachometer of our test vehicle of injecting RPM reading messages with the value 120000: the tachometer shows the maximum value that it can display, 8000. We observed that there is no time-stamp validity check performed on the messages. However, the values must show some increment; otherwise, they are ignored. We also observed that the vehicle ignores the injected messages and may show engine faults in the dashboard if their frequency is low.
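The construction of a log file with copies of a message, as described above, can be sketched as follows. Everything specific here is an assumption made for illustration: the candump-style row layout, the function name `build_replay_log`, the payload "0FFF", and the 10 ms spacing between rows are not taken from the experiment.

```python
def build_replay_log(pid, payloads, interface="can0", start=0.0, period=0.01):
    """Return log rows, one per payload, spaced `period` seconds apart.

    Assumed row layout: "(timestamp) interface PID#DATA".
    """
    rows = []
    for index, payload in enumerate(payloads):
        timestamp = start + index * period
        rows.append(f"({timestamp:.6f}) {interface} {pid}#{payload}")
    return rows


# Three copies of a hypothetical speed message for PID 254.
for row in build_replay_log("254", ["0FFF", "0FFF", "0FFF"]):
    print(row)
```

Passing a list of payloads rather than a single value leaves room for payloads that change from row to row, which matters given the observation that values without some increment are ignored.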

C. Data sets

The experiment consists of injecting speed (PID 254) and RPM reading (PID 115) messages onto the CAN bus from our CAN node simultaneously with the legitimate ECU traffic. We collected logs of the CAN bus for three scenarios:

1) No injection attack, or normal functioning of the vehicle.

2) Injection of fabricated speed reading messages.

3) Injection of fabricated RPM reading messages.

Table II provides the size of the collected CAN bus log for each of the three scenarios. Note that the RPM and speed readings are time-series data. The legitimate values and fabricated values are intertwined.

TABLE II: List of collected data sets.

No  Description                                              # of CAN messages
1   CAN Data for no injection of fabricated messages         23,963
2   CAN Data with injection of "FFF" as the speed reading    88,492
3   CAN Data with injection of "FFFF" as the RPM reading     30,308

Footnote 6: The multiplier 0.62137119 is used to convert km/h to MPH.

The data is available at: https://home.eng.iastate.edu/~othmanel/files/CANData.zip (password "Ford6ransit").

Figure 9 shows the speed readings and RPM readings over time under normal behavior. The sequence ID indicates the order in which our monitoring node received the values from the CAN bus. Sub-figure (a) shows the change of the speed over time. It shows that the driver accelerated until the speed reached almost 30 MPH, then decelerated until the vehicle almost stopped. Sub-figure (b) shows the change of the RPM reading over time. It shows that the driver accelerated until the reading reached almost 2500. Then, they released the accelerator and accelerated again, which we observe as a drop followed by an increase of the RPM readings. They repeated the two operations two times and then released the accelerator. Sub-figure (c) shows the relationship between the speedometer and tachometer readings. The plot shows a strong relationship between the two.

(a) Frequency of the speed readings. (b) Frequency of the RPM readings.

Fig. 10: Frequencies of speedometer readings and tachometer readings for the normal functioning scenario. (We used speed steps as speed is almost continuous.)

(a) Frequency of the speed readings. (b) Frequency of the RPM readings.

Fig. 11: Frequency of the speed readings for the scenario of injection of fabricated speed messages.

(a) Frequency of the speed readings. (b) Frequency of the RPM readings.

Fig. 12: Frequencies of the speed and RPM readings for the scenario of injection of fabricated RPM messages.

(a) No fabricated messages attack. (b) Injection of fabricated RPM reading messages. (c) Injection of fabricated speed reading messages. (There is no correlation in this case.)

Fig. 13: Correlations of the speed and RPM readings in the three scenarios.

Figure 10 shows the frequency of speed and RPM readings under normal behavior (scenario 1). The plots do not show special patterns. Figure 11 shows the frequencies of speed and RPM readings for the scenario of injection of fabricated speed reading messages. The figure shows two groups of values for speed readings: one group for values close to 25 and one group for values between 0 and 15. Figure 12 shows the frequencies of speed and RPM readings for the scenario of injection of fabricated RPM reading messages. Again, the figure shows two groups of values: one group for values around 120000 and one group for values around 1000. These values are in accordance with the hexadecimal values that we injected: "FFF" for the speed reading and "FFFF" for the RPM reading.

The next section discusses the use of three techniques to discriminate the two states: no-attack and under-attack.

VI. Detection of message injection

This section discusses the use of the Pearson correlation [25], k-means clustering [26], and the Hidden Markov Model [27] to detect attacks through injection of fabricated RPM or speed reading messages. Specifically, it evaluates the ability of these techniques to distinguish between the no-attack state (when there is no injection of fabricated messages into the CAN bus) and the under-attack state (when there is injection of fabricated messages) of a vehicle using the data values embedded in the RPM and speed reading messages exchanged over the CAN bus. Recall that the unsupervised machine learning methods do not require an extensive training phase to generate models that may depend on contextual data. Examples of context for CAN data include the behavior of the driver and the make and model of the test vehicle.

A. Detection using correlation

Almost every driving action, such as an increase of speed, requires the collaboration of a set of ECUs, which typically results in a correlation between the data exchanged between the ECUs. For example, the engine RPM and speed increase as the vehicle accelerates and decrease when there is a braking action, shifting of the transmission, or release of the accelerator. The speed and RPM values must show a strong positive relationship because, with the exception of RPM spikes due to transmission downshifts during deceleration, the engine RPM does not increase while the speed is decreasing, and the speed does not drop while the engine RPM is increasing. The relationship becomes weak or no longer exists when data packets are injected to alter the behavior of the speed or engine RPM. Therefore, the correlation of the data exchanged by the ECUs is high when there is no attack and low when there is an attack.

TABLE III: Pearson correlation of the speed and RPM readings.

Description                                         Correlation coefficient   P-value
CAN Data for no injection of fabricated messages    0.85                      2.08e-204
CAN Data with injection of speed readings           -0.013                    0.45
CAN Data with injection of RPM readings             0.33                      9.37e-23

We applied the Pearson correlation [25] to the data collected from the three scenarios (see Section V-C): legitimate messages, injection of fabricated speed reading messages, and injection of fabricated RPM reading messages. Table III provides the correlation coefficients of the RPM readings and speed readings for each of the three scenarios. A detailed analysis of the correlation coefficients follows.

Figure 13 shows the plots of the correlation of the speed and RPM readings for the three scenarios. Sub-figure (a) shows the RPM readings as a function of the speed readings and the linear regression line (orange) of the data for the no-injection scenario. The plot shows a strong correlation between the two features; the coefficient is 0.8562 with p-value 2.08e-204. The results indicate that the RPM reading predominantly increases when the speed reading increases. Sub-figure (b) shows the speed reading as a function of the RPM reading and the linear regression line (orange) of the data for the injection of fabricated RPM reading messages scenario. The plot shows a small correlation between the two features; the coefficient is 0.33 with p-value 9.37e-23. The results indicate that the speed reading slightly increases when the RPM reading increases. Sub-figure (c) shows the RPM reading as a function of the speed reading and the linear regression line (orange) of the data for the injection of fabricated speed reading messages scenario. The plot shows no correlation between the two features; the coefficient is -0.013 with p-value 0.45. This correlation coefficient indicates that the speed and RPM readings are unrelated.

The strength of the correlation coefficients distinguishes the no-attack and under-attack states. The results suggest that the correlation between the speed and RPM readings can be used to distinguish a no-attack state from an under-attack state, i.e., there is no attack when the correlation coefficient is high and there is an attack when the correlation coefficient is low. As a result, we conclude that the Pearson correlation can discriminate the no-attack and under-attack states of a given vehicle for the case of RPM readings and speed readings.
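The decision rule above can be illustrated with a small, dependency-free sketch. The function names, the toy readings, and the threshold of 0.6 are assumptions chosen for illustration (the threshold simply lies between the coefficients 0.33 and 0.85 of Table III); the text does not specify a threshold or how speed and RPM readings are paired. In practice, `scipy.stats.pearsonr` returns both the coefficient and the p-value.

```python
import math
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equally long sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equally long sequences with at least two points")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


def looks_under_attack(speed: Sequence[float], rpm: Sequence[float],
                       threshold: float = 0.6) -> bool:
    """Flag a window as under attack when the correlation drops below threshold."""
    return pearson_r(speed, rpm) < threshold


# Toy windows (hypothetical values): legitimate readings, then the same window
# with every other speed reading replaced by a constant injected value.
rpm = [900, 1300, 1700, 2100, 1700, 1300, 900, 700]
legit_speed = [5, 10, 15, 20, 15, 10, 5, 0]
injected_speed = [5, 25.4, 15, 25.4, 15, 25.4, 5, 25.4]

print(looks_under_attack(legit_speed, rpm))
print(looks_under_attack(injected_speed, rpm))
```

Applying the rule to sliding windows rather than to a whole log is one possible design choice; it is not something the study evaluates.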

B. K-Means clustering

The k-means clustering is commonly used to group data points into a set of clusters; it partitions N given observations into k given clusters such that the total squared Euclidean distance of the observations to their cluster centers is minimized [26].

We applied the k-means clustering algorithm to the data collected from the three scenarios (see Section V-C), namely legitimate messages, injection of fabricated speed reading messages, and injection of fabricated RPM reading messages, to cluster the observed speed (and RPM) readings into two clusters (k is 2): no-attack and under-attack. Note that we rescaled the data to be appropriate for the k-means algorithm. Table IV provides the performance of the algorithm in identifying the injection attacks, in terms of detection rate, false positive rate, and accuracy [47]. (Accuracy measures the fraction of correctly classified cases [47].) A detailed analysis of the results obtained using the algorithm follows.
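The clustering step can be sketched for one-dimensional readings as follows. This is a minimal illustration, not the authors' implementation: min-max rescaling, the deterministic initialization at the smallest and largest values, and the toy speed values are assumptions, since the text does not state which rescaling or initialization was used.

```python
from typing import List, Sequence, Tuple


def min_max_scale(values: Sequence[float]) -> List[float]:
    """Rescale values to the [0, 1] range (one possible rescaling)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def two_means_1d(values: Sequence[float],
                 iterations: int = 100) -> Tuple[List[int], List[float]]:
    """Lloyd's algorithm with k = 2 on one-dimensional data."""
    centers = [min(values), max(values)]  # simple deterministic initialization
    labels = [0] * len(values)
    for _ in range(iterations):
        # Assignment step: each value joins its nearest center.
        labels = [0 if abs(v - centers[0]) <= abs(v - centers[1]) else 1
                  for v in values]
        # Update step: each center moves to the mean of its members.
        new_centers = []
        for k in (0, 1):
            members = [v for v, lab in zip(values, labels) if lab == k]
            new_centers.append(sum(members) / len(members) if members else centers[k])
        if new_centers == centers:
            break
        centers = new_centers
    return labels, centers


# Toy speed readings (hypothetical): legitimate values mixed with a constant
# injected value of about 25.4 MPH.
speeds = [3.0, 8.0, 25.4, 12.0, 25.4, 14.0, 25.4, 9.0]
labels, centers = two_means_1d(min_max_scale(speeds))
print(labels)
```

Because k is fixed to 2, the algorithm splits any non-constant data set into two groups, including attack-free data; this is consistent with the false detections reported for the no-injection data set.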

Figure 14 plots the distinction between the no-attack and under-attack states by the k-means clustering method for the no injection of fabricated messages scenario. Sub-figure (a) shows the two identified clusters from the RPM readings. The algorithm falsely detects 64% of the RPM readings messages as attacks while the dataset does not include fabricated messages. Sub-figure (b) shows the two identified clusters from the speed readings. The algorithm falsely detects 68% of the speed readings messages as attacks. We conclude that the technique is misleading in this case because the dataset contains only legitimate messages.

Figure 15 plots the distinction between the no-attack and under-attack states by the k-means clustering method for the injection of fabricated RPM reading messages scenario. Sub-figure (a) shows the two identified clusters from the RPM readings. The algorithm correctly detects only 71% of the injected reading messages and has an accuracy of 0.71. Sub-figure (b) shows the two identified clusters from the speed readings. The algorithm falsely classified 45% of the speed reading messages as attacks. The technique has good performance in detecting RPM reading injection but falsely detects injection of speed readings. Note that the figure shows two types of lines: a curved line and a straight line. The straight line plots the constant reshaped digital value of "FFFF" injected into the RPM readings.

Figure 16 plots the distinction between the no-attack and under-attack states by the k-means clustering method for the injection of fabricated speed reading messages scenario. Sub-figure (a) shows the two identified clusters from the RPM readings. The algorithm falsely detects 50% of the RPM reading messages as attacks. Sub-figure (b) shows the two identified clusters of speed readings. The algorithm detects 79% of the injected speed reading messages, with an accuracy of 0.79. Note that the figure shows two types of lines: a curved line and a straight line. The straight line plots the constant reshaped digital value of "FFF" injected into the speed readings.
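For reference, the three metrics can be computed from per-message labels as sketched below. The sketch uses the standard confusion-matrix definitions, which may differ from the convention used to fill Table IV; the function name and the toy labels are hypothetical.

```python
from typing import Dict, Optional, Sequence


def detection_metrics(actual: Sequence[bool],
                      predicted: Sequence[bool]) -> Dict[str, Optional[float]]:
    """Detection rate, false positive rate, and accuracy (True means attack)."""
    tp = sum(a and p for a, p in zip(actual, predicted))
    fn = sum(a and not p for a, p in zip(actual, predicted))
    fp = sum(p and not a for a, p in zip(actual, predicted))
    tn = sum(not a and not p for a, p in zip(actual, predicted))
    total = tp + fn + fp + tn
    return {
        # Undefined ratios (empty denominators) are reported as None.
        "detection_rate": tp / (tp + fn) if tp + fn else None,
        "false_positive_rate": fp / (fp + tn) if fp + tn else None,
        "accuracy": (tp + tn) / total if total else None,
    }


# Toy labels (hypothetical): three injected messages among eight.
actual = [False, False, True, False, True, False, True, False]
predicted = [False, True, True, False, True, False, False, False]
print(detection_metrics(actual, predicted))
```

For an attack-free data set the detection rate is undefined under these definitions, because there are no injected messages to detect.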

C. Detection Using Hidden Markov Model

A Hidden Markov Model (HMM) is a doubly stochastic process where a stochastic process is not directly observable but can be inferred through another observable process. The model is named "hidden" since the state of the model at a time instant t is not directly observable [27]. However, each of these hidden states could be associated with a set of observations of the related observable processes.

We used hmmlearn [48], a Python implementation of HMM, to train a model, using two-thirds of each of the datasets (see Section V-C), to predict whether the message data indicates normal behavior or a vehicle under attack. The remaining third of each dataset is used to verify the accuracy of the model's predictions. Table V provides the performance of the algorithm in identifying the injection attacks, in terms of detection rate, false positive rate, accuracy and precision [47]. A detailed analysis of the results obtained using the algorithm follows.
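To show how an HMM assigns a hidden state (no-attack or under-attack) to each reading, the sketch below decodes a short sequence with the Viterbi algorithm for a two-state model with Gaussian emissions. It is not the procedure used in the study: no training takes place, and the start probabilities, transition matrix, emission means, and standard deviations are hand-picked assumptions. With hmmlearn, the parameters would instead be estimated by fitting a model such as `GaussianHMM`, and its `predict` method would return the decoded states.

```python
import math
from typing import List, Sequence


def log_gauss(x: float, mean: float, std: float) -> float:
    """Log density of a univariate Gaussian."""
    return -0.5 * math.log(2 * math.pi * std ** 2) - (x - mean) ** 2 / (2 * std ** 2)


def viterbi(obs: Sequence[float], start: Sequence[float],
            trans: Sequence[Sequence[float]], means: Sequence[float],
            stds: Sequence[float]) -> List[int]:
    """Most likely hidden-state sequence, computed in the log domain."""
    n = len(start)
    score = [math.log(start[s]) + log_gauss(obs[0], means[s], stds[s])
             for s in range(n)]
    back = []
    for x in obs[1:]:
        prev, score, pointers = score, [], []
        for s in range(n):
            # Best predecessor of state s at this time step.
            best = max(range(n), key=lambda p: prev[p] + math.log(trans[p][s]))
            pointers.append(best)
            score.append(prev[best] + math.log(trans[best][s])
                         + log_gauss(x, means[s], stds[s]))
        back.append(pointers)
    # Backtrack from the best final state.
    state = max(range(n), key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    return path[::-1]


# Hypothetical parameters: state 0 = no-attack, state 1 = under-attack.
speeds = [3.0, 8.0, 25.4, 12.0, 25.4, 14.0, 25.4, 9.0]
states = viterbi(speeds, start=[0.5, 0.5],
                 trans=[[0.7, 0.3], [0.3, 0.7]],
                 means=[9.0, 25.4], stds=[5.0, 1.0])
print(states)
```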

Figure 17 plots the RPM and speed readings under normal behavior, i.e., the no-attack scenario. Sub-figure (a) shows that the method classified all the RPM readings correctly as being legitimate. However, sub-figure (b) shows that the method classified the speed readings into both no-attack and under-attack states, as indicated by the gap in speeds from approximately 18 MPH to 30 MPH. The algorithm falsely classified 42.5% of the speed reading messages as attacks, although the dataset does not include fabricated messages.

TABLE IV: Performance of the k-means clustering to detect injection attacks.

                                                   Speed readings                               RPM readings
Description                                        Detection rate  False det. rate  Accuracy    Detection rate  False det. rate  Accuracy
CAN Data for no injection of fabricated messages   0.0             0.68             0.0         0.0             0.64             0.0
CAN Data with injection of RPM reading             0.0             0.45             0.0         0.71            0.0              0.71
CAN Data with injection of speed reading           0.79            0.0              0.79        0.0             0.50             0.0

(a) Based on RPM readings. (b) Based on speed readings.

Fig. 14: Identification of attack/no-attack states for the no injection of fabricated messages scenario based on K-means clustering.

Fig. 15: Identification of attack/no-attack states for the injection of fabricated RPM messages scenario based on K-means clustering.

Fig. 16: Identification of attack/no-attack states for the injection of fabricated speed messages scenario based on K-means clustering.

TABLE V: Performance of the HMM clustering to detect injection attacks.

                                                   Speed readings                               RPM readings
Description                                        Detection rate  False det. rate  Accuracy    Detection rate  False det. rate  Accuracy
CAN Data for no injection of fabricated messages   0.0             0.425            0.0         0.0             0.0              0.0
CAN Data with injection of RPM reading             0.0             0.0              0.0         0.68            0.08             0.75
CAN Data with injection of speed reading           0.80            0.0              0.80        0.0             0.0              0.0

(a) Based on RPM readings. (b) Based on speed readings.

Fig. 17: Identification of attack/no-attack states for the no injection of fabricated messages scenario based on HMM.

Fig. 18: Identification of attack/no-attack states for the injection of fabricated speed messages scenario based on HMM.

Fig. 19: Identification of attack/no-attack states for the injection of fabricated RPM messages scenario based on HMM.

Figure 18 plots the distinction between the no-attack and under-attack states by the HMM clustering method for the injection of fabricated speed reading messages scenario. According to sub-figure (a), the algorithm did not identify an attack from the RPM readings, and according to sub-figure (b), the method identified an attack from the speed readings. The detection rate of fabricated speed reading messages is 80%, with an accuracy of 0.8. The technique succeeds, with an acceptable rate, in this case because the dataset includes fabricated speed messages but does not include fabricated RPM readings.

Figure 19 plots the distinction between the no-attack and under-attack states by the HMM clustering method for the injection of fabricated RPM reading messages scenario. According to sub-figure (a), the method identified an attack from the RPM readings, and according to sub-figure (b), the method did not identify an attack from the speed readings. The detection rate of fabricated RPM reading messages is 68% and the false detection rate is 8%, with an accuracy of 0.75. The technique has good performance in this case because the dataset includes fabricated RPM messages but does not include fabricated speed readings.

The HMM technique distinguishes the no-attack and under-attack states with acceptable accuracy when there is an attack, but has a high false positive rate in detecting attacks from speed readings data when there is no attack.

VII. Discussion

Table VI shows the common data sources that have been used to assess the performance of machine learning techniques in identifying attacks on the CAN bus. Unlike reported studies, the data source of this study is the log of a CAN bus of an in-motion vehicle under fabricated messages injection attack. The results suggest that studying CAN bus traffic anomaly detection techniques using datasets fabricated via simulators, models and similar off-vehicle test facilities may not properly evaluate the performance of the detection techniques, which was already suggested, but not demonstrated, by Tomlinson et al. [22]. Thus, researchers should use data directly captured from the CAN bus of operational vehicles to evaluate the effectiveness of anomaly detection techniques on the CAN bus.

TABLE VI: Examples of reported performance of CAN bus IDS approaches.

Ref.   Data source                                       Detection rate
[19]   Use data of own devices connected to vehicles     100%
[20]   Use synthetic (artificial) data                   100%
[21]   Use data collected from simulators                up to 100%
[38]   Use data from stationary vehicle                  up to 98%
[24]   Use data from in-motion vehicles                  See Table VII

The application of the Pearson correlation [25] using the collected datasets revealed that the correlation distinguishes under-attack and no-attack states of a given vehicle for the case of speed and RPM readings. This confirms the proposition that the relationships between messages exchanged by the ECUs through the CAN bus of vehicles can be used to detect cyber-attacks on connected vehicles [17]. The results suggest that knowledge should be ascertained concerning the relationships and collaboration between a vehicle’s ECUs to detect anomalous behaviors.

The k-means clustering [26] has limited capabilities to distinguish the under-attack and no-attack states of a vehicle using the speed and RPM readings. The method has an acceptable detection rate (and accuracy) but also a high false positive rate (almost 0.5). The results indicate that the method is not appropriate for time-series data, such as the speed and RPM readings.

The HMM [27] has acceptable performance in distinguishing the under-attack and no-attack states of a vehicle using the speed and RPM readings. The method has, however, a high false positive rate in detecting attacks from speed readings when there is no attack. In practice, this issue may be mitigated by using the method in conjunction with the Pearson correlation.

TABLE VII: Performance of K-means and HMM compared to two existing IDS suppliers [24].

IDS solution                                                                            Accuracy
Supplier B
  Detection of cyclic messages injection in stationary vehicle using Supplier B's IDS   0.5
  Detection of cyclic messages injection in in-motion vehicle using Supplier B's IDS    1.0
Supplier C
  Detection of cyclic messages injection in stationary vehicle using Supplier C's IDS   0.67
  Detection of cyclic messages injection in in-motion vehicle using Supplier C's IDS    0.83
K-means clustering
  Detection of RPM reading messages in in-motion vehicle using K-means clustering       0.71
  Detection of speed reading messages in in-motion vehicle using K-means clustering     0.79
HMM
  Detection of RPM reading messages in in-motion vehicle using HMM                      0.80
  Detection of speed reading messages in in-motion vehicle using HMM                    0.75

Table VII compares the accuracy of the unsupervised machine learning techniques that we used, k-means and HMM, with that of two pre-commercial IDSs [24] in detecting fabricated message attacks on the CAN bus. Our study and the one reported in [24] both used datasets captured from in-motion vehicles, but the datasets differ and include messages of different cyclic CAN IDs. The accuracy metric shows that k-means and HMM have performance comparable to that of the supervised machine learning techniques. Note that the IDS discussed in [24] has accuracy 1.0 for an in-motion vehicle and 0.5 for a stationary vehicle, but we do not have results for stationary vehicles because the speed reading is 0 for a stationary vehicle.

Note that we used in this study three datasets related to one driver and one vehicle. This may limit the generalization of the results.

VIII. Conclusion

Many studies in the last decade have shown the feasibility of cyber-attacks on connected vehicles. The community has proposed solutions to prevent such attacks, which require modification of the CAN protocol. Anomaly detection has been proposed as an alternative approach as it does not require modification of the CAN protocol. This study found that the Pearson correlation can be used to detect attacks, the k-means clustering fails to detect attacks, and HMM provides acceptable results in detecting injection of fabricated speed and RPM reading messages. The results suggest that studying anomaly detection techniques for the CAN bus using datasets fabricated via simulators, models and similar off-vehicle test facilities, which is the current practice in the literature, does not indicate the performance of the detection techniques. We suggest that researchers should use real data obtained from operational vehicles in evaluating the efficacy of anomaly detection techniques in the CAN bus. In addition, the results suggest that unsupervised machine learning techniques such as HMM could provide performance in detecting attacks comparable to supervised machine learning techniques when using data from an in-motion vehicle. These methods do not require an extensive training phase to develop models that are tied to the vehicle make and model or to the driving habits of the driver. The results also suggest that knowledge should be ascertained concerning the relationships and collaboration between a vehicle’s ECUs to detect anomalous behaviors.

Acknowledgement

The authors thank Sudharrshan Veeraraghava, Ashraf Shaik Mohammed, and Moataz Abdelkalik for their help in collecting the data. The project is funded by the Pacific Northwest National Laboratories.

References

[1] R. Bosch GmbH, “Can specification v2.0.” http://esd.cs.ucr.edu/webres/can20.pdf, 1991.

[2] L. ben Othmane, H. Weffers, M. M. Mohamad, and M. Wolf, A Survey of Security and Privacy in Connected Vehicles, pp. 217–247. Springer, 2015.

[3] V. H. Le, J. den Hartog, and N. Zannone, “Security and privacy for innovative automotive applications: A survey,” Computer Communications, vol. 132, pp. 17–41, 2018.

[4] L. ben Othmane, V. Alvarez, K. Berner, M. Fuhrmann, W. Fuhrmann, A. Guss, and T. Hartsock, “Demo: A low-cost fleet monitoring system,” in The Fourth IEEE Annual International Smart Cities Conference, (Kansas City, MO), Sep. 2018.

[5] Y. Zhao, “Telematics: safe and fun driving,” IEEE Intelligent Systems, vol. 17, pp. 10–14, Jan 2002.

[6] M. Wolf, A. Weimerskirch, and T. Wollinger, “State of the art: Embedding security in vehicles,” EURASIP Journal on Embedded Systems, vol. 2007, 12 2007.

[7] T. Hoppe, S. Kiltz, and J. Dittmann, “Security threats to automotive can networks–practical examples and selected short-term countermeasures,” Reliability Engineering and System Safety, vol. 96, no. 1, pp. 11–25, 2011. Special Issue on Safecomp 2008.

[8] K. Koscher, A. Czeskis, F. Roesner, S. Patel, T. Kohno, S. Checkoway, D. McCoy, B. Kantor, D. Anderson, H. Shacham, and S. Savage, “Experimental security analysis of a modern automobile,” in 2010 IEEE Symposium on Security and Privacy, pp. 447–462, May 2010.

[9] S. Checkoway, D. McCoy, B. Kantor, D. Anderson, H. Shacham, S. Savage, K. Koscher, A. Czeskis, F. Roesner, and T. Kohno, “Comprehensive experimental analyses of automotive attack surfaces,” in Proceedings of the 20th USENIX Conference on Security, SEC’11, (Berkeley, CA, USA), pp. 6–6, USENIX Association, 2011.

[10] C. Miller and C. Valasek, “Adventures in automotive networks and control units.” http://illmatics.com/car_hacking.pdf, 2013.

[11] R. Brandom, “The scariest thing about the chrysler hack is how hard it was to patch,” Jul 2015.

[12] J. Golson, “Car hackers demonstrate wireless attack on tesla model s.” https://www.theverge.com/2016/9/19/12985120/tesla-model-s-hack-vulnerability-keen-labs, Sep 2016.

[13] S. Abuelsamid, “Toyota has big plans to get cars talking to each other and infrastructure in the U.S..” https://www.forbes.com/sites/samabuelsamid/2018/04/16/toyota-launches-aggressive-v2x-communications-roll-out-from-2021/#35a8e8ad146c, Apr. 2018.

[14] S. Janaarthanan, “Security analysis of vehicle to vehicle arada locomate on board unit,” Master’s thesis, Iowa State University, Ames, USA, 2019.

[15] Kelley Blue Book, “Nearly 80 percent of consumers think vehicle hacking will be frequent problem in near future, according to new kelley blue book survey.” https://www.kbb.com/car-news/all-the-latest/kbbcom-survey-owners-leery-of-hacking/2000012299/, 07 2015.

[16] L. ben Othmane, R. Fernando, R. Ranchal, B. Bhargava, and E. Bodden, “Likelihood of threats to connected vehicles,” International Journal of Next-generation Computing (IJNGC), vol. 5, pp. 290–303, Nov. 2014.

[17] M. Müter, A. Groll, and F. C. Freiling, “A structured approach to anomaly detection for in-vehicle networks,” in Proc. Sixth International Conference on Information Assurance and Security, (Atlanta, GA), pp. 92–98, Aug 2010.

[18] M. Müter and N. Asaj, “Entropy-based anomaly detection for in-vehicle networks,” in Proc. IEEE Intelligent Vehicles Symposium (IV), (Baden-Baden, Germany), pp. 1110–1115, June 2011.

[19] K.-T. Cho and K. G. Shin, “Fingerprinting electronic control units for vehicle intrusion detection,” in Proc. of the 25th USENIX Conference on Security Symposium, (Austin, TX, USA), pp. 911–927, 08 2016.

[20] A. Taylor, S. Leblanc, and N. Japkowicz, “Anomaly detection in automobile control network data with long short-term memory networks,” in Proc. IEEE International Conference on Data Science and Advanced Analytics (DSAA), (Montreal, Canada), pp. 130–139, Oct 2016.

[21] M. Levi, Y. Allouche, and A. Kontorovich, “Advanced analytics for connected car cybersecurity,” in Proc. IEEE 87th Vehicular Technology Conference (VTC Spring), (Porto, Portugal), pp. 1–7, 06 2018.

[22] A. Tomlinson, J. Bryans, and S. Shaikh, “Towards viable intrusion detection methods for the automotive controller area network,” in Proc. 2nd Computer Science in Cars Symposium - Future Challenges in Artificial Intelligence Security for Autonomous Vehicles (CSCS 2018), (Munich, Germany), 9 2018.

[23] S.-F. Lokman, A. T. Othman, and M.-H. Abu-Bakar, “Intrusion detection system for automotive controller area network (can) bus system: a review,” EURASIP Journal on Wireless Communications and Networking, no. 1, 2019.

[24] S. Stachowski, R. Gaynier, and D. J. LeBlanc, “An assessment method for automotive intrusion detection system performance.” https://rosap.ntl.bts.gov/view/dot/41006, April 2019.

[25] K. Pearson, “Notes on regression and inheritance in the case of two parents,” Proc. of the Royal Society of London, vol. 58, June 1895.

[26] D. J. C. MacKay, Information theory, inference and learning algorithms. Cambridge University Press, New York, 2003.

[27] L. Rabiner and B. Juang, “An introduction to hidden markov models,” IEEE ASSP Magazine, vol. 3, pp. 4–16, Jan 1986.

[28] K. Pazul, “Controller area network (can) basics.” https://www.microchip.com/wwwAppNotes/AppNotes.aspx?appnote=en011694, 2005. accessed in Jan. 2019.

[29] B. Electronics, “Obd-ii - on-board diagnostic system.” http://www.obdii.com/connector.html. accessed on Jan. 2019.

[30] “Hacking the can bus: Basic manipulation of a modern automobile through can bus reverse engineering.” https://www.sans.org/reading-room/whitepapers/threats/paper/37825, June 2017.

[31] L. Dhulipala, “Detection of injection attacks on in-vehicle network using data analytics,” Master’s thesis, Iowa State University, Ames, USA, 2018.

[32] C. Miller and C. Valasek, “A survey of remote automotive attack surfaces.” http://illmatics.com/remote%20attack%20surfaces.pdf. accessed in Jan. 2019.

[33] S. Mazloom, M. Rezaeirad, A. Hunter, and D. McCoy, “A security analysis of an in-vehicle infotainment and app platform,” in Proc. 10th USENIX Workshop on Offensive Technologies (WOOT 16), (Austin, TX), 2016.

[34] W. Wu, R. Li, G. Xie, J. An, Y. Bai, J. Zhou, and K. Li, “A survey of intrusion detection for in-vehicle networks,” IEEE Transactions on Intelligent Transportation Systems, pp. 1–15, 2019.

[35] B. Groza, S. Murvay, A. van Herrewege, and I. Verbauwhede, “Libra-can: A lightweight broadcast authentication protocol for controller area networks,” in Cryptology and Network Security (J. Pieprzyk, A.-R. Sadeghi, and M. Manulis, eds.), vol. 7712, pp. 185–200, Springer, 2012.

[36] A. Hazem and H. A. Fahmy, “Lcap - a lightweight can authentication protocol for securing in-vehicle networks,” in 10th escar Embedded Security in Cars Conference, (Berlin, Germany), 2012.

[37] S. Woo, D. Moon, T. Youn, Y. Lee, and Y. Kim, “Can id shuffling technique (cist): Moving target defense strategy for protecting in-vehicle can,” IEEE Access, vol. 7, pp. 15521–15536, Jan. 2019.

[38] E. Seo, H. M. Song, and H. K. Kim, “Gids: Gan based intrusion detection system for in-vehicle network,” in 2018 16th Annual Conference on Privacy, Security and Trust (PST), pp. 1–6, Aug 2018.

[39] O. Iegorov and S. Fischmeister, “Mining task precedence graphs from real-time embedded system traces,” in 2018 IEEE Real-Time and Embedded Technology and Applications Symposium (RTAS), pp. 251–260, April 2018.

[40] S. Boumiza and R. Braham, “An efficient hidden markov model for anomaly detection in can bus networks,” in 2019 International Conference on Software, Telecommunications and Computer Networks (SoftCOM), pp. 1–6, Sep. 2019.

[41] M. Hanselmann, T. Strauss, K. Dorman, and H. Ulmer, “Canet: An unsupervised intrusion detection system for high dimensional can bus data,” 2019. preprint arXiv:1906.02492.

[42] “Raspberry pi 3 model b.” https://www.raspberrypi.org/products/raspberry-pi-3-model-b/. accessed on Jan. 2019.

[43] “Pican 2 - can interface for raspberry pi 2/3.” https://copperhilltech.com/pican-2-can-interface-for-raspberry-pi-2-3/. accessed on Jan. 2019.

[44] C. Smith, The Car Hacker’s Handbook: A Guide for the Penetration Tester. San Francisco, CA, USA: No Starch Press, 1st ed., 2016.

[45] SocketCAN. https://www.kernel.org/doc/Documentation/networking/can.txt. accessed on Jan. 2019.

[46] Github, “Linux-can / socketcan user space applications.” https://github.com/linux-can/can-utils. accessed in November 2018.

[47] L. ben Othmane, A. D. Brucker, S. Dashevskyi, and P. Tsalovski, An Introduction to Data Analytics For Software Security, pp. 69–94. CRC Press, 2018.

[48] S. Lebedev, “hmmlearn 0.2.1.” https://pypi.org/project/hmmlearn/.

Lotfi ben Othmane Dr. Lotfi ben Othmane is Assistant Teaching Professor at the Department of Electrical and Computer Engineering and is leading the Engineering Secure Smart Cyber-Physical Systems Lab at Iowa State University, USA. Previously, he was a Research Scientist and then Head of the Secure Software Engineering department at Fraunhofer SIT, Germany. Lotfi received his Ph.D. from Western Michigan University (WMU), USA, in 2010; the M.S. in computer science from University of Sherbrooke, Canada, in 2000; and the B.S. in information systems from University of Economics and Management of Sfax, Tunisia, in 1995. He works currently on engineering secure cyber-physical systems.


Lalitha Dhulipala Lalitha Dhulipala is currently an Associate Consultant at PwC. She received her B.S. from Osmania University, India in 2016 and M.S. from Iowa State University, USA, in 2018.

Nicholas Multari Nicholas J. Multari is senior technical advisor for research in cybersecurity at the Pacific Northwest National Lab (PNNL) in Richland, Washington. He establishes the direction and leads the execution of the various research projects resulting in a rigorous foundation upon which security concepts are matured and implemented. Prior to joining PNNL, he was the manager of trusted cyber technology research at Boeing Research and Technology in Seattle, Washington. In 2008, he served as a consultant to the USAF Scientific Advisory Board (SAB) investigating the effects of the contested cyber environment on the USAF mission. In addition to being a Senior Security Engineer with Scitor Corporation in Northern Virginia, where he supported the government in addressing enterprise-level information assurance issues, Nick spent 20 years as a computer scientist in the Air Force, retiring as a Lt. Col. In the Air Force, his positions ranged from system acquisitions to networking to computer security management. He received a bachelor's degree in mathematics from Manhattan College, New York; a master's degree in computing and information science from Trinity University, Texas; and a PhD in computer science from the University of Texas at Austin.

Manimaran Govindarasu Manimaran Govindarasu is currently the Mehl Professor of Computer Engineering in the Department of Electrical and Computer Engineering at Iowa State University, USA. He received his Ph.D degree in Computer Science and Engineering from the Indian Institute of Technology (IIT), Madras, India in 1998. He has been on the faculty of Iowa State University since 1999. His research experience is in the areas of cyber-physical system (CPS) security for the smart grid, cybersecurity, real-time systems and networks, and Internet of Things. He has co-authored nearly 200 peer-reviewed research publications, and co-authored the text "Resource Management in Real-time Systems and Networks," MIT Press, 2001. He served as a Guest Co-Editor for several flagship IEEE publications and served as an Associate Editor for IEEE Transactions on Smart Grid and IEEE Transactions on Mobile Computing. He is a Fellow of the IEEE.