This may be the author s version of a work that was submitted/accepted for publication in the following source:

(1)

This may be the author’s version of a work that was submitted/accepted for publication in the following source:

Gruber, Thierry,Larue, Gregoire,Rakotonirainy, Andry, & Poulsen, Niels (2017)

Developing a simulation framework for safe and optimal trajectories con-sidering drivers’ driving style.

IET Intelligent Transport Systems, 11(10), pp. 624-631.

This file was downloaded from: https://eprints.qut.edu.au/115346/

c

2017 The Institution of Engineering and Technology

This work is covered by copyright. Unless the document is being made available under a Creative Commons Licence, you must assume that re-use is limited to personal use and that permission from the copyright owner must be obtained for all other uses. If the docu-ment is available under a Creative Commons License (or other specified license) then refer to the Licence for details of permitted re-use. It is a condition of access that users recog-nise and abide by the legal requirements associated with these rights. If you believe that this work infringes copyright please provide details by email to [email protected]

Notice: Please note that this document may not be the Version of Record (i.e. published version) of the work. Author manuscript versions (as Sub-mitted for peer review or as Accepted for publication after peer review) can be identified by an absence of publisher branding and/or typeset appear-ance. If there is any doubt, please refer to the published source.

(2)

Developing a simulation framework for safe and optimal

trajectories considering drivers’ driving style

Thierry Gruber1,2_{, Grégoire S. Larue}1*_{, Andry Rakotonirainy}1_{, Niels K. Poulsen}2

1. Centre for Accident Research and Road Safety – Queensland, Queensland University of Technology (QUT), Australia

2. Department of Applied Mathematics and Computer Science, Technical University of Denmark (DTU), Denmark

*130 Victoria Park Road, Kelvin Grove QLD 4059, Australia, +61 7 3138 4644, [email protected]

Abstract

Advanced Driving Assistance Systems (ADAS) have huge potential for improving road safety and travel times. However, their take-up in the market is very slow; and these systems should consider driver’s preferences to increase adoption rates. The aim of this paper is to develop a model providing drivers with the optimal trajectory considering the motorist’s driving style in real time. Travel duration and safety are the main parameters used to find the optimal trajectory. A simulation framework to determine the optimal trajectory was developed in which the ego car travels in a highway environment scenario, using an agent-oriented approach. The performance of the algorithm was compared against optimal trajectories computed offline with the hybrid A* algorithm. The new framework provides trajectories close to the optimal trajectory and is computationally achievable. The agents were shown to follow safe and fast trajectories in three tests scenarios: emergency braking, overtaking, and a complex situation with multiple vehicles around the ego vehicle. Different driver profiles were then tested in the complex scenario, showing that the proposed approach can adapt to driver preferences and provide a solution close to the optimal solution given the defined safety constraints.

Keywords:

Road safety, optimal trajectories, Artificial Intelligence. Introduction

Today the world is organised under a mobile paradigm, and transport needs to be efficiently handled. New approaches of transport are explored to ensure its sustainability in a fast-paced changing world, especially considering speed, safety, comfort and environmental concerns. Road transport is the preferred mode of transport as it offers both flexibility and comfort. However, road transport is also associated with negative effects on both human health and the planet, as manifested through road trauma, air pollution and greenhouse gases.

(3)

subsequent increases in gaseous and particulate pollution which have serious health effects, including respiratory and cardiovascular diseases.

Advanced Driver Assistance Systems (ADAS) are designed to increase mobility and comfort whilst preserving safety. Adaptive Cruise Control, Autonomous parking assistance systems, pre-crash systems, drowsiness detection systems, blind spot information systems and lane departure warning systems are examples of ADAS. Each new ADAS is increasingly complex and ultimately contributes towards the design of driverless cars. The complexity of ADAS is mainly driven by the need to provide a reliable safety critical system which can cater for all driving situations. Finding the optimal trajectory is one modelling approach to achieve safe and fast travels.

The current approach with ADAS is generally based on fixed rules of design, which make it difficult to accommodate for very different individual driving styles. There tends to be high variability in driving styles on the road, with factors such as age, gender, culture, personal traits and the history of exposure to various driving styles [4, 5], and this results in two important limitations which may result in such ADAS devices not providing the benefits they intend to deliver: (i) drivers may not accept and use such technology, not being able to adapt to their driving style and with the risk of bothering other road users [6], and (ii) this might result in assisted drivers driving in a unpredictable way by other road users, and this could result in new safety critical conflicts, as highlighted in the research by Preuk, et al. [7]. This is largely due to the fact that what drivers expect to happen is a large factor in how they react, and prepare to react [8-10], and this has been shown to be a contributing factor to rear-end crashes at signalised intersections [11].

(4)

to help drivers achieve in real time the fastest travel which respects safety constraints (as hard constraints) and their driving style characteristics, while adapting to the traffic environment. There are actually different approaches depending on the situation, as shown by the different designs and structures of the cars of the Stanford teams for different DARPA Challenges, organised by the Defense Advanced Research Projects Agency (DARPA): Junior for the urban situation [19] and Stanley for the desert challenge [20]. As a first step, this paper focuses on the particular environment of a highway; where sudden changes are less likely to occur as compared to urban situations. Michon [21]’s model shows that assistance can be provided to drivers at three different levels: strategic (higher level), tactical (medium) and control (lower). For our problem, only goals concerning risk acceptance, depending on the driver's profile, were considered for the higher level. Control is supposed to be perfect so the lower level is not considered. Other driving inherent tasks such as perception are supposed to be perfect. Therefore, this problem focuses on the medium level where tactical choices must be taken to compute an efficient trajectory. For this paper, optimality takes into account time and safety.

The proposed implementation aims at showing feasibility of such a system by providing an adapted simulation framework, and provides a first simple model to find a trajectory and to assess it. The car of interest evolves in an environment with other vehicles equipped with a car-following model to simulate basic car behaviour. Optimal trajectories computation has been adapted to this study framework; they are computed offline and were used as a benchmark to measure the performance of the "intelligent" online computation. Performance measurement of the online algorithms will be based on a series of test scenarios.

State of the art

This section summarises the concepts we are using in our simulation. As mentioned above, the final itinerary optimises safety and time travelled. The simulation of vehicle dynamics is based on existing vehicle models. The safety of the trajectory is obtained by ensuring that surrogate safety measures remain below a critical safety threshold. The critical threshold is selected to ensure low probability of crash risk.

Crash Risk models

Driving is a complex task and the driver is continuously exposed to different level of crash risk. We model crash risk with the probability of being involved in a crash and the severity of that potential crash [22]. The probability of being involved in a road crash, can be predicted by relevant indicators such as

 Headway: headway is the distance between the ego car and the car directly in front. The smaller the headway is, the more dangerous the situation is. However, this indicator only can be highly misleading because danger depends also on the speed of the vehicles. In the study of Vogel [23], two indicators are used. The first one is the time headway. It is measured by taking the difference of time between two vehicles reaching the same position. Vogel shows that short time headway potentially generates dangerous situations therefore can be a surrogate measure of risk; and

(5)

with a car in front if both cars maintain the same speeds. TTC is more accurate predictor of crashes since speed is taken into account.

Crash severity has been extensively studied and researchers accept broadly the common criterion of Energy Equivalent Speed (EES) [24, 25]. It expresses the deformation energy between two vehicles colliding and is expressed as in equation (1).

𝑬𝑺𝑺 = 𝟐.𝒎𝒆𝒈𝒐

𝒎𝒆𝒈𝒐+𝒎𝒇𝒓𝒐𝒏𝒕 . (𝒗𝒆𝒈𝒐(𝒕𝒄𝒓𝒂𝒔𝒉) − 𝒗𝒇𝒓𝒐𝒏𝒕(𝒕𝒄𝒓𝒂𝒔𝒉)) ( 1 )

𝑡𝑐𝑟𝑎𝑠ℎ is the time at which the collision occurs, 𝑚𝑖 is the mass of vehicle i, and 𝑣𝑖 its speed, i is the

index of the vehicle (ego for ego vehicle and front for front vehicle). The higher the EES is, the more serious the crash is. Based empirically on data, a relation between EES and probability of injuries has been given by Glaser [22].

Vehicle models

For our purpose, models on single lane are relevant as lateral position is discretised with lanes. The most common longitudinal models in the literature are Newell’s, Gipps’, Intelligent Driver (IDM), and Field Theory’s models [26]. Other more complex vehicle models exist and they also describe lateral positioning of the vehicle: bicycle, lateral, cognitive, optimal control, and lane-changing models. In this study, however, the lateral positioning is considered to be perfect and lane changes instantaneous. Therefore, these models were not further considered, but would be valuable for more complex implementations.

We tested these vehicle models and compared them to empirical data for acceleration, deceleration and safe distances performance. Newell's model was too simple. While Gipps' and Ni's models gave good performance, they were outperformed by the IDM, which was therefore selected for implementation in this study.

Approaches to process optimal trajectory

Finding an optimal trajectory assumes that the whole space of search is known. In the driving context, objects enter and exit the map dynamically, and as a result the space includes a time dimension. As a consequence, real-time algorithms cannot guarantee to find an optimal trajectory. Here are two ways for retrieving an optimal trajectory for this problem:

 Assuming that the trajectories of the other cars are known and fixed: the whole space of search is known exactly: any scenario could be retrieved a posteriori and the algorithm be ran to find optimal trajectories. However, this is a very strong assumption for this problem as the other cars should react to other vehicles’ movements and may have different behaviours if the ego car has a different behaviour. It is unrealistic and the optimal solution retrieved would not be meaningful.

(6)

Given a simulation model, finding such an optimal trajectory is relevant as it enables to compare and evaluate the performance of a real-time algorithm. Russell and Norvig [27] give a review of AI techniques, including search problems. Most of them suit discrete problems but the driving problem is continuous as most of real problems. This task is complex, time is part of search space and constraints and rules about vehicle, driver and traffic need to be respected. As a result, continuous techniques such as gradient-based, linear programming or convex optimisation are not adapted, and most of the Artificial Intelligence techniques to determine optimal trajectory use a discrete approach [27]. One way to avoid this issue is to discretise the problem. Sampling-based techniques are appropriate to adapt the problem but optimality may be lost depending on the technique. However, by choosing appropriate discretisation steps, the algorithm can compute the solution in a reasonable time and give an acceptable suboptimal solution.

There are different ways to tackle the problem for discretisation when the space is at least two-dimensional, such as cell decomposition, exact cell decomposition or skeletonisation. But the driving environment is particular because longitudinally-oriented; besides for this framework lateral positions are discretised on lanes. Environment sampling-based techniques are particularly adapted for the driving task [28]. Many ITS projects are based on it, including Stanley, the Stanford Robot that won the DARPA Challenge [20]. Longitudinal speed is discretised which gives a set of possible trajectories at any moment for each lane. The discretisation step needs to be well chosen so that the algorithm has a reasonable running time but still outputs an acceptable suboptimal solution.

For a real-time assistance, online algorithms need to be considered. Vanholme, et al. [28] reviewed such algorithms; all of them output locally (sub)optimal solutions, considering only a short time length for the trajectory computation. They are of two kinds: sampling-based or direct. The former discretises the solution space and select a best solution whereas the latter directly outputs a trajectory and leaves out the evaluation step.

An environment-based technique discretises longitudinal and lateral speed, computes several trajectories, evaluates them based on a defined metrics and selects the best one. In the grid algorithm of Glaser, et al. [22], 3 possible manoeuvres are possible per lane: braking, keeping the same speed or accelerating.

(7)

Agent paradigm to model optimal itinerary

The concept of rational agent is widely used in artificial intelligence. An agent perceives its environment through sensors and acts upon that environment through actuators. A human driver is a human agent who meets these requirements: a driver senses the road environment with his eyes, ears and other organs and acts through his hands, legs and so on. A robotic driver can have cameras and other sensors and acts in the environment by actuating motors to help driving the car and display/tell a message to assist the driver. Between perception and action, the agent needs to evaluate the situation and make decisions. It needs to be rational and to do the ‘right thing’. Evaluating this 'right thing' is operationalised using performance measures. However, appropriate metrics remain to be designed in order to properly assess agent's efficiency. A rational agent should choose actions that are expected to maximise the performance measure given its percept sequence, metric for the performance measure and the built-in knowledge it has.

The Belief-Desire-Intention (BDI) model is a common paradigm to describe components of an agent. It is used in many fields including psychology, economics, philosophy or artificial intelligence. The agents have three main concepts (see Figure 1):

 Beliefs: information the agent has about the world. They can be wrong;  Desires: things that an agent would like to achieve. They can conflict; and

 Intentions: things that an agent has committed to achieving. More concrete than desires, they cannot conflict, are possible and persistent. They are usually based on a utility measure (internal performance measure).

Beliefs are based upon percepts and desires possibly re-evaluated upon them. Intentions are derived from beliefs and desires and a plan is built in order to achieve these intentions. The agent executes actions of the plan and impacts on the environment. Then it gets new percepts and updates its beliefs. The program needs to be built so that plans are wisely elaborated (no need to plan too much in advance), possibly repaired (if something goes wrong, need contingency plans). Thus, intentions need to be reconsidered but neither too seldom nor too often. An efficient trade-off needs to be elaborated. Although this paradigm is common, it comports its limitations: for example, this model lacks of learning abilities, and three attitudes may not be sufficient to fully describe a rational agent. Other more flexible and universal models have been suggested [27].

(8)

Proposed integrated framework

The integrated model developed in this paper is based on concepts in the previous sections for the ego car. Models are also used for the simulation of the other vehicles. The online algorithm was implemented using the agent-oriented paradigm. The optimal search was based on the structure of the online search. The simulation structure aimed at being flexible, respecting driving rules and providing realistic driving behaviours.

The optimisation problem was formulated as minimising the time to reach an end destination (goal destination) by ensuring that the maximum attainable speed is chosen at each step (heuristic cost function). The maximum speed is evaluated in a way to ensure that traffic rules, safety distances, and driver characteristics are respected (constraints) at the next step as well as the steps within the prediction horizon. The perceptions of the environment at each time are used as inputs for evaluating safety measures. The algorithm provided the optimal speed (acceleration or deceleration) and lane to choose as outputs.

Environment

The environment is a straight and infinitely long highway with a fixed number of lanes (3) and a fixed speed limit. These are simplifying assumptions for a first model in order to focus on the algorithm itself rather than geometry considerations. The car targets a given abscissa 𝑋_{𝑔𝑜𝑎𝑙}. It is equivalent to being in the curvilinear lane coordinate system. This system is generally used for such problems; for instance, the car of the Stanford team who won the DARPA challenge uses this coordinate system. The transformation from one base to the other is not orthogonal and thus does not keep angles and distances, however for highways where curvature values are low (under 1/500m-1_{) errors introduced} by these transformations are assumed to be negligible with respect to perception and control errors [28].

Other cars are equipped with an agent structure and as such, they are able to react to their close environment. They are equipped with an IDM as this model has shown the best behaviour among longitudinal models considered. They do not change lane for simplification purposes. Each car is given characteristics (mass, maximal acceleration, deceleration and speed), initial conditions (position, speed, acceleration) and a maximum desired speed 𝑣max 𝑑𝑒𝑠𝑖𝑟𝑒𝑑. Each IDM car has a driver with a

reaction time 𝜏.

The ego car is deterministic as it is supposed that the car reacts exactly to the input: there is one outcome from an action decided by the agent. But after an action, the next overall state of the environment is not deterministic as it does not depend only on the ego car but also on other cars' behaviour. Thus, the environment is stochastic. This characteristic is somehow related to the un-observability of what the other drivers are planning to do, and is taken into account in the decision process of the agents.

(9)

Actuators

The actuators for a car are the pedals and the steering wheel which make it move on the road. Since lateral movement is modelled with discrete values, the actions to be taken contain two decisions:

 Acceleration/deceleration (continuous); and  lane changing

o change to right lane (overtake), or o keep lane, or

o change to left lane (finish overtaking by getting back in the left lane). Sensors

A driver can perceive and evaluate the distances with other vehicles, and also their speed or acceleration. Perception may be incomplete as there are blind spots and other vehicles or obstacles can obstruct the field of vision. There exist distance sensors, speed sensors (Doppler radar for example) that can supplement these data; acceleration could be derived from speed sensors. For this model, it is simplified and the car is able to perceive the distance to and the speed of the vehicles directly in front and behind which are located on the left lane, on the same lane and on the right lane, within 200 meters. This value corresponds to the frontal area covered by the radar of Stanley [20]. The radar could cover non-adjacent lanes but it would depend on obstacles on the road (a car could obstruct vision): for simplification purposes, only adjacent lanes are considered in perception. A more complex model could consider the other lanes for more informed decisions.

Performance Measure

Performance measure enables to evaluate the system in accordance to given criteria. In this particular problem, the trajectory aims at being:

 Legal: speed limits respected, lane limits respected, etc.;

 Safe: in compliance with traffic rules (safety distances, possibly keep lower lane etc.); and  Fast: minimise the time t to reach a given location (abscissa 𝑋_{𝑔𝑜𝑎𝑙})

In this implementation, lane keeping is assumed perfect, so that only rear-end collisions are considered. At each moment, a virtual scenario is launched to assess risk: the car in front brakes hardly and the ego car reacts after a reaction time and brakes hardly as well. In case there is no crash, the distance is retrieved between the two cars when both are stopped. In case there is a crash, the EES and the probability of an injury 𝑀𝐴𝐼𝑆 > 2 (Maximum Abbreviated Injury Scale) are retrieved. This information, together with the safety indicators headway and TTC with respect to the car in front of the ego vehicle, provides the safety performance measure of the trajectory. For a given scenario, the most extreme value for each criterion is extracted for comprehensively evaluating the safety of the trajectory.

Driving rules and driver profile

Actions to be taken must respect several sets of rules, considering the vehicle’s limits, human characteristics and traffic rules. Vanholme, et al. [28] give a review of the rules that must be respected, following the Vienna Convention on Road Traffic consensus.

(10)

maximum acceleration (g) and deceleration (d), as well as maximum speed 𝑣𝑚𝑎𝑥.

Cars are systems with physical limits. It must be programmed as constraint in the program as it is not possible to exceed these limits. Cars must comply with the following rules:

o Acceleration should be comprised between a maximum deceleration and a maximum acceleration

o Speed should be inferior to the maximum possible speed of the car o Speed should be superior to the minimum possible speed of the car o Perception should be limited

o Trajectories should be physically feasible for the car

Acceleration is bounded, thus the algorithm can only output accelerations that are in accordance with the car's characteristics. Speed is also upper bounded in the model and cannot exceed the maximum speed of the vehicle: if the acceleration of an action is too high, it is re-adjusted to target the maximum possible speed.

The traffic rules considered in this implementation are the speed limit (the driver should not go over the speed limit at any time), driving on the left-most lane when possible , and safe distances (from the safe distance given by Gipps’ model [29], which is used in the IDM model). At any time, a driver would have time to perform an emergency braking, as the headway is always long enough to let the driver react and then brake until complete stop. It is a consistent safety distance for a long term simulation: if all the cars respect it, there would not be any accident except unexpected circumstances (something blocks suddenly the road for example). However, as the driver acts after a reaction time 𝜏, the agent needs to provide information in advance and needs to predict other cars' trajectories. Those predictions may not be accurate and safe distance could be underestimated. Therefore a second criterion is considered to characterise safe distances on highways: a minimum time headway of 2-3 seconds with the car in front is required [30], value depending on the risk taking trait of the driver. Human rules describe especially the relation and interface between the system and the human driver. In case of fully automated driving, the human driver can choose what style of driving to take (similar to driving profiles) and whether to turn on the automated driving mode. In case of assisted driving, the system must let a degree of freedom to the human driver and takes into account that the reaction from the human driver may not be conform to what was planned. In this paper, simulation is run in a fully automated driving mode as the driver reacts exactly to what is decided by the agent - which enables to focus on the decision process. The system adapts to:

o Different possible driving profiles

o Reaction time of drivers (either human reaction/execution or system execution time) The algorithm has been implemented to consider the following four characteristics of human drivers:

 Reaction time;

 Risk taking trait (from very careful to high risk taker);  Ability and willingness to perform manoeuvres; and  Willingness to go fast.

Online agent-oriented algorithm

(11)

agent-oriented programming: simple reflex, model-based, goal-based, utility-based and learning agents. The utility-based agent suits to the requirements of the problem of study. It simulates human behaviour based on a high level of decision-making, considering performance measure. Moreover, it could be augmented by a learning structure which makes the agent capable of adapting to various environments and situations, and also achieving autonomy and getting better over time. However, at this stage, no good metrics for performance measure could be found in the literature, and a model-based agent was used. This model can then be updated to a utility-based agent once further research has highlighted potential metrics.

Two different agents were considered in this first implementation: a greedy agent and a BDI (Belief-Desire-Intention) agent.

The greedy agent has a straightforward strategy: it evaluates the maximum speed the car can reach, below the maximum desired speed of the driver 𝑣max 𝑑𝑒𝑠𝑖𝑟𝑒𝑑, on each lane, and chooses the fastest

solution. If the agent can go as fast as possible on two lanes, the decision that targets the left-most lane is chosen. As a result, it minimises time locally until the next time step.

The BDI agent with a simple intelligence was implemented to provide a more rational behaviour than the one of the greedy agent. Its characteristics are:

 Beliefs: the history of states of the world he knows through his percepts;  Desires: four different desires have been programmed:

o keepLane: the agent wants to keep the same lane o overtake: the agent wants to overtake the car in front

o keepOvertaking: the agent wants to keep overtaking. As soon as it starts overtaking, its desire switches automatically to keepOvertaking. Then it may want to overtake a car and get to a higher lane again; hence this desire defers from overtake.

o getBack: the agent wants to get back on a lower lane, to finish overtaking; and

 Intentions: In this model desires and intentions are the same: intentions are directly derived from desires.

(12)

Figure 2: Worklfow of the agents’ actions from their perceptions and implemented intelligence

Decision process

In this framework, the sampling time 𝑑𝑡 is fixed and cars take decisions every 𝑑𝑡. This approach simulates a continuous process of decision making. The sampling time needs to be relatively small since the environment in dynamic: cars evolve fast and continuously. However, changes in the road environment on highways are not that frequent: a car at 40 m/s travels only 2 meters in 0.05s, which is the highest frequency at which Stanley was able to compute decisions [20]. The proposed model uses threshold-based decisions on what acceleration to take: at each time step the algorithm computes an acceleration to take from the time 𝑡 to the time 𝑡 + 𝑑𝑡 in order to reach a target acceleration. If 𝑑𝑡 is too small (about 50ms for example) the algorithm will take decisions for very short terms and this results in regular changes in acceleration and rough driving patterns. Further drivers do not react at this time scale, and the algorithm should be able to provide information that drivers are able to implement while driving. Therefore, 𝑑𝑡 = .5𝑠 was used as a compromise between these conflicting requirements.

(13)

that the acceleration is kept constant during this time lapse. These choices of acceleration should give a trajectory that is safe and satisfies the agent's intentions.

The driver has a reaction time 𝜏. Therefore, the agent needs to give information to him at time 𝜏 second before the driver actually acts. As the environment is dynamic, the agent needs to predict at time t what the environment will be like at time 𝑡 + 𝜏. The agent has a built-in model to make it predict the future environment state. Two predictions are made at time t:

 Time 𝑡 + 𝜏: the environment state at the moment the agent will take the action; and  Time 𝑡 + 𝜏 + 𝑑𝑡: the environment state for the other cars at the next time step. The decision process scheme is presented in Figure 3.

Figure 3: Decision process scheme

Optimal search

(14)

Results

A set of scenarios were run to assess the behaviour of the Greedy and BDI agents in basic highway situations. Scenarios included an emergency braking, overtaking scenarios, and a more complex scenario with 8 vehicles around the ego vehicle. Tests were run on Eclipse Juno (4.2.2) with jre 7. Test scenario 1: Emergency braking scenario

In the braking scenario test, there is a car stopped 400m in front of the initial position of the ego car and the ego car travels at the desired speed of the ego driver. Both agents (greedy and BDI) managed to stop 0.75m from the stopped car, after a smooth braking that starts around 15 seconds before the full stop (150m to the obstacle).

Both agents have a similar, realistic and desired behaviour. Results do not depend on risk taking trait of the agent, since the stopping behaviour is obtained from Gipps' safety distance criterion.

Test scenario 2: Overtaking scenario

At the start, the agent is at the driver’s maximum desired speed (90% of the speed limit) on the left lane. There is another car on that lane 250m in front, travelling at 65% of the speed limit. On the middle lane, there is another car 150m in front, going at 70% of the speed limit.

For the Greedy agent takes 106.5s to complete the drive, while the BDI agent takes 95.5s, which is the optimal time found offline by the A* algorithm (see Table 1). The Greedy agent considers changing lane only once it is close to the vehicle in front. At that time, it is unsafe to change lane due to the vehicle in the right lane, and it has to wait for the other vehicle to overtake him to be able to change lanes. The BDI agent chooses to overtake as soon as it perceives the car in front, which makes its trajectory safer but especially faster as it does not get blocked by the car in the adjacent lane. The optimal trajectory is quite close to the BDI's in the sense that it overtakes early enough to avoid being stuck. However, it overtakes later as it was programmed to stay in the left-most lane when possible.

Table 1 - Performance of the agents and offline optimal trajectories for scenario 2

Online Offline (benchmark)

Greedy agent BDI agent hybrid A*

Travel time (s) 106.5 95.5 95.5

Number of lane changes 4 4 4

Min TTC (s) 11.1 12.9 28.1

Min headway (m) 2.3 2.9 2.6

Min time headway (s) 70.8 90.5 81

Safety margin (s) 25.5 31.8 50.5

Test scenario 3: Complex scenario

The scenario is composed of 8 vehicles around the ego-vehicle, with a goal set to 5 km. There are three cars in front of the ego-vehicle (left lane), all driving at lower speeds; 2 cars in the middle lane, one in front (slower), and one behind (faster); and 3 cars on the right lane, 2 being faster than the ego car and one at the same speed.

(15)

lanes, lacking from a structure to plan for future actions (see Table 2). Conversely, the BDI agent overtakes early and gains 3s of time to travel the 5 km. Its trajectory is even safer as it keeps larger headways with cars in front. The optimal solution obtained with the hybrid A* algorithm performs better than the BDI agent because it does not take into account the reaction time to perceive that it is safe to change lanes twice in a row.

Table 2 – Performance of the agents and offline optimal trajectories for scenario 3

Online Offline (benchmark)

Greedy agent BDI agent hybrid A*

Travel time (s) 169.5 166.5 160

Min TTC (s) 19.7 13.9 27.6

Min headway (m) 2.9 2.9 2.6

Min time headway (s) 74.9 75.3 80.8

Safety margin (s) 29.4 37.0 56.4

Test scenario 4: Complex driving scenario with different driver profiles

The same complex road environment as is scenario 3 was used in this scenario, and different driver profiles were tested:

 a very safe driver, who aims at travelling no closer to leading vehicles than a 4s headway, and at a maximum of 80% of the speed limit;

 a safe driver, who aims at travelling no closer than a 3s headway, and at a maximum of 90% of the speed limit, and which was the default driver profile used in the previous scenarios; and  a standard driver, who aims at travelling no closer than a 2s headway, and no faster than the

speed limit.

(16)

Table 3 –Optimal trajectories for scenario 4 depending on the driver profile

Driver profile Very safe Safe Standard

Minimum acceptable headway (s)

4 3 2

Maximum desired speed (% of speed limit)

80 90

100

Travel time (s) 187.7 166.5 150.2

Min TTC (s) 42.6 13.9 11.4

Min headway (m) 3.9 2.9 1.9

Min time headway (s) 108.5 75.3 66.8

Safety margin (s) 71.7 37.0 17.4

Computation cost

For the online algorithm, time elapsed between each time step of the simulation (.5s) were found to take on average 100 microseconds, which is satisfying for a real-time application, provided that the sampling time is at least an order of magnitude larger.

For the offline optimal trajectory, the A* algorithm was used for the most basic scenarios. For the complex scenario, the computation is more complex and time expensive, which resulted in the need to use the hybrid A* algorithm.

Conclusions

This model proves the feasibility of a simulation framework that supports traffic, human and system rules. Its flexibility enables to launch various scenarios and retrieves relevant information about the trajectory, considering both time and safety indicators. Previous works have mainly focused either on the strategic level, such as the global path planner of Junior in the Urban Darpa Challenge, or on a lower tactical level where decisions are taken for shorter terms, such as the collision avoidance program of Stanley in the Desert Darpa Challenge - whereas the model implemented in this paper targets a higher tactical level and aims at taking medium-term decisions.

(17)

such performance measure.

Although the current model has limits, additional features could be plugged in which is desirable for a simulation framework, due to the flexibility of the architecture: the object-oriented architecture of the program enables the model to be improved component per component- especially by augmenting the intelligence of the agents. For example, safety thresholds or system limits can be changed from one car to another. As another example, the simulation framework uses IDM models to simulate other cars' behaviours. Using another model just requires to change the agent's type. For instance, if all the IDM models are changed into Greedy Agents, and if the main agent is a Greedy Agent, the simulation is a macro simulation of Greedy Agents and shows its macroscopic behaviour. Other more complex simulation models could be used with the IDM model, especially for lane changing. Besides, if a performance measure is implemented, it can be plugged to the agent's structure in order to make it take more rational decisions and even learn.

The main focus of improvement should be augmenting the intelligence of the agents. This should be researched in order for the agents to have a higher view of the situation (such as cooperation with the development of cars at various levels of automation) and use the history of their percepts to take the most of it. The efficiency and rationality will depend only on what is programmed if the agent is not able to learn. Thus the agent will not be able to succeed in new environments or situations that were not planned. The implementation of a learning structure seems crucial for such a system.

Finally, it is a first simple model with limitations but it has shown desirable behaviours for simple tests. Besides the implemented solutions are all computationally possible in real-time. This model provides a first simulation framework and algorithms that are able to compute optimal safe trajectories that respect road rules and driver driving style preferences. Implementing such framework could assist drivers for changing lanes, selecting appropriate speeds for the traffic conditions and reduce the likelihood of emergency braking. The assessment of safety effects of EU ITS interventions [31] shows that targeting such issues with new technologies could result in a reduction of 15-20% of road fatalities.

References

[1] Bureau of Infrastructure, Transport and Regional Economics {BITRE], "Cost of road crashes in Australia 2006," CanberraReport 118, 2009.

[2] World Health Organization, "Global status report on road safety 2013: supporting a decade

of action.," 2013, Available:

http://www.who.int/violence_injury_prevention/road_safety_status/2013/en/.

[3] D. Shinar, Psychology on the road, the human factor in traffic safety. New York, USA: John Wiley and Sons, 1978, p. 212.

[4] G. Miller and O. Taubman - Ben-Ari, "Driving styles among young novice drivers—The contribution of parental driving styles and personal characteristics," Accident Analysis & Prevention, vol. 42, no. 2, pp. 558-570, 3// 2010.

(18)

Supplement, pp. S548-S551, 2013.

[6] D. M. L. Rittger, C. Maag, A. Kiesel, "Anger and Bother Experience when Driving with a Traffic Light Assistant: A Multi-driver Simulator Study," in Human Factors in High Reliability Industries, J. S. D. de Waard, S. Röttger, A. Kluge, D. Manzey, C. Weikert, H. Hoonhout, Ed., 2015, pp. 41-52.

[7] K. Preuk, E. Stemmler, C. Schießl, and M. Jipp, "Does assisted driving behavior lead to safety-critical encounters with unequipped vehicles’ drivers?," Accident Analysis & Prevention, vol. 95, Part A, pp. 149-156, 10// 2016.

[8] G. M. Björklund and L. Åberg, "Driver behaviour in intersections: Formal and informal traffic rules," Transportation Research Part F: Traffic Psychology and Behaviour, vol. 8, no. 3, pp. 239-253, 5// 2005.

[9] M. H. Martens, "Stimuli Fixation and Manual Response as a Function of Expectancies," Human Factors, vol. 46, no. 3, pp. 410-423, 2004/09/01 2004.

[10] E. Muhrer and M. Vollrath, "Expectations while car following—The consequences for driving behaviour in a simulated driving task," Accident Analysis & Prevention, vol. 42, no. 6, pp. 2158-2164, 11// 2010.

[11] V. Eis, R. Sferco, and P. A. Fay, "A Detailed Analysis of the Characteristics of European Rear Impacts," Proceedings: International Technical Conference on the Enhanced Safety of Vehicles, vol. 2005, pp. 9p-9p, / 2005.

[12] (2013). Smart Cities and Communities − Key to Innovation Integrated Solution—Cooperative Intelligent Transport Systems and Services (C-ITS). Available:

https://eu-smartcities.eu/sites/all/files/Cooperative%20Intelligent%20Transport%20Syste ms%20and%20Services%20-%20Smart%20Cities%20Stakeholder%20Platform%20january .pdf

[13] B. Saerens and E. Van den Bulck, "Calculation of the minimum-fuel driving control based on Pontryagin’s maximum principle," Transportation Research Part D: Transport and Environment, vol. 24, pp. 89-97, 10// 2013.

[14] F. Mensing, E. Bideaux, R. Trigui, and H. Tattegrain, "Trajectory optimization for eco-driving taking into account traffic constraints," Transportation Research Part D: Transport and Environment, vol. 18, pp. 55-61, 1// 2013.

[15] S. Glaser, B. Vanholme, S. Mammar, D. Gruyer, and L. Nouveliere, "Maneuver-based trajectory planning for highly autonomous vehicles on real road with traffic and driver interaction," Intelligent Transportation Systems, IEEE Transactions on, vol. 11, no. 3, pp. 589-606, 2010.

[16] H. Lim, W. Su, and C. C. Mi, "Distance-Based Ecological Driving Scheme Using a Two-Stage Hierarchy for Long-Term Optimization and Short-Term Adaptation," IEEE Transactions on Vehicular Technology, vol. 66, no. 3, pp. 1940-1949, 2017.

[17] D. Moser, R. Schmied, H. Waschl, and L. d. Re, "Flexible Spacing Adaptive Cruise Control Using Stochastic Model Predictive Control," IEEE Transactions on Control Systems Technology, vol. PP, no. 99, pp. 1-14, 2017.

(19)

Parameter Values," Transportation Research Part A, vol. 39, no. 1, pp. pp. 425-444, 2005. [19] M. Montemerlo et al., "Junior: The Stanford entry in the Urban Challenge," Journal of

Field Robotics, vol. 25, no. 9, pp. 569-597, 2008.

[20] S. Thrun et al., "Stanley: The robot that won the DARPA Grand Challenge," Journal of Field Robotics, vol. 23, no. 9, pp. 661-692, 2006.

[21] J. A. Michon, "A Critical View of Driver Behavior Models: What Do We Know, What Should We Do?," in Human Behavior and Traffic Safety, L. Evans and R. C. Schwing, Eds. Boston, MA: Springer US, 1985, pp. 485-524.

[22] S. Glaser, B. Vanholme, D. Gruyer, and S. Mammar, "Probability and risk based maneuver planning for collision avoidance," presented at the First International Symposium on Future Active Safety Technology toward zero-traffic-accident, Tokyo, Japan, 5-9 Spetember 2011, 2011.

[23] K. Vogel, "A comparison of headway and time to collision as safety indicators," Accident Analysis & Prevention, vol. 35, no. 3, pp. 427-433, 5// 2003.

[24] C. A. Hobbs and P. J. Mills, " Injury probability for car occupants in frontal and side impacts," Transport and Road Research Laboratory. Road Safety Division1984

[25] F. Zeidler, H. Schreier, and R. Stadelmann, "Accident Research and Accident Reconstruction by the EES-Accident Reconstruction Method," SAE Technical Paper, 1985. [26] A. Kesting and M. Treiber, Traffic Flow Dynamics: Data, Models and Simulation (no. Book,

Whole). Berlin, Heidelberg: Springer Berlin Heidelberg, 2013.

[27] S. J. Russell and P. Norvig, Artificial intelligence: a modern approach (no. Book, Whole). Harlow: Pearson, 2014.

[28] B. Vanholme, D. Gruyer, B. Lusetti, S. Glaser, and S. Mammar, "Highly Automated Driving on Highways Based on Legal Safety," Intelligent Transportation Systems, IEEE Transactions on, vol. 14, no. 1, pp. 333-347, 2013.

[29] P. G. Gipps, "A behavioural car-following model for computer simulation," Transportation Research Part B: Methodological, vol. 15, no. 2, pp. 105-111, 1981/04/01 1981.

[30] Queensland Government. (2010). Safe following distances. Available:

http://www.tmr.qld.gov.au/Safety/Queensland-road-rules/Road-rules-refresher/Safe-followi ng-distances.aspx