Computational Urban Network Control:
Using Schedule-Driven Traffic Control
to Mitigate Network Congestion
Submitted in partial fulfillment of the requirements for the degree of
Doctor of Philosophy in
Electrical and Computer Engineering
Hsu-Chieh Hu
B.S., Electrical Engineering, National Taiwan University
M.S., Communication Engineering, National Taiwan University
M.S., Machine Learning, Carnegie Mellon University
Carnegie Mellon University Pittsburgh, PA
© Hsu-Chieh Hu, 2019. All Rights Reserved.
Keywords: online planning and scheduling, network scheduling, multi-agent systems, reinforcement learning, intelligent transportation
Abstract
Recent work in decentralized, schedule-driven traffic control has demonstrated the ability to improve the efficiency of traffic flow in complex urban road networks. In this approach, a scheduling agent is associated with each intersection. Each agent senses the traffic approaching its intersection and, in real time, constructs a schedule that minimizes the cumulative wait time of vehicles approaching the intersection over the current look-ahead horizon. In order to achieve network-level coordination in a scalable manner, scheduling agents communicate only with their direct neighbors. Each time an agent generates a new intersection schedule, it communicates its expected outflows to its downstream neighbors as a prediction of future demand, and these outflows are appended to the downstream agent’s locally perceived demand.
In this thesis, we study how to upgrade the network-level coordination of schedule-driven traffic control to tackle increasingly serious traffic congestion from three aspects: stability, optimality and learnability. For stability, we propose a hybrid approach that incorporates the stability of queueing theory into a schedule-driven control framework. For optimality, the basic coordination protocol is extended to additionally incorporate a complementary flow of information, reflecting an intersection’s current congestion level, to its upstream neighbors. We propose an asynchronous decentralized algorithm in order to approach network-wide optimality. In addition, we show that integrating connected and autonomous vehicles with intersection control can provide further performance benefits. For learnability, we study a parameter learning problem, in which a fully decentralized reinforcement learning algorithm configures the maximum green time, and a timing prediction problem that uses the cluster representation for scheduling approaching vehicles. The goal of this thesis is to demonstrate the capability of schedule-driven traffic control to mitigate urban network congestion. To achieve this goal, we leverage techniques from online planning, distributed optimization and reinforcement learning, and incorporate new technologies, including connected and autonomous vehicles, into the schedule-driven framework.
Acknowledgments
First, I would like to thank my advisor, Stephen Smith. Without his support and guidance, I would not have been able to complete this dissertation. He has always given me the freedom to explore different ideas and taught me how to tell a good story. Moreover, he is a role model for me on how to be a good researcher and how to make the world a better place to live through our research. I will always treasure and remember the time we spent discussing research at his small round table.
I am grateful to my committee members, Anthony Rowe, Carlee Joe-Wong, and Gregory Barlow, for their help with this dissertation. I would especially like to express my gratitude to Anthony. Without his support, I could not have worked on this interesting traffic project with Steve. I also appreciate that he gave me the opportunity to work on the DARPA Spectrum Collaboration Challenge to broaden my vision, and that he introduced Carlee to me. I would like to thank Carlee for her insightful discussions and input on my thesis. I would like to thank Greg for providing a great deal of technical support when I was implementing my ideas.
I would like to thank everyone in the ECE Department and the Robotics Institute for a wonderful experience. I would especially like to thank current and former members of the Intelligent Coordination and Logistics Laboratory: Laura Barbelescu, Zack Rubinstein, Joe Zhou, Isaac Isukapati, Chung-Yao Chuang, Allen Hawkes, Jayanth Krishna Mogali and Rick Goldstein for their support and constructive comments. Special thanks to Laura and Zack for their advice on how to raise a child during Ph.D. life. I thank my fellow graduate students: Oliver, Shih-Chang, Chia-Yin and Ming Hsiao. Additionally, I would like to thank Rushane, my mentor during my internship at Fetch Robotics.
Thanks to all the friends I made in Pittsburgh: Jiun-Ren and Tzu-Yi, Yue-Hsun and Hsiang-Yi, Yi-Ping and Ying-Chih, and Yi-Jiun and Jing-Yi. I would especially like to thank Jiun-Ren for being a wonderful mentor in my early years of CMU life. Yi-Ping and Ying-Chih were the best neighbors I ever had.
I thank my master’s advisor, Professor Ping-Cheng Yeh of National Taiwan University, who showed me how to do research and encouraged me to study abroad. Thanks to my EE friends for those relaxing times.
To all the brothers and sisters in Pittsburgh, thanks for shepherding and caring for me these years. Pittsburgh has become my second home. Special thanks go to Timothy and Grace, Andy and Tanya, Tom and Ellen, Po-Yu and Chia-Chen (and three kids!), William and Nadia, Hazel, Janie, Carol, Blanche, Abraham and Evelyn, Yohan, Yoonah, Joey and Kim, Sun, Liang, Emily, Rui and many more. I would especially like to thank Harvest for praying with me during my hard times.
During my years at CMU, I was supported in part by FHWA (Contract DTFH6117C00014), the CMU Mobility-21 University Transportation Center (Grant 43333.1.2002319), and the CMU Robotics Institute. I would like to express my gratitude to them.
Without my supportive family, none of this would have been possible. I would particularly like to thank my grandparents, Jung-Chun Hu and Mei-Kuei Yeh, my parents, Tsung-Chien Hu and Shao-Min Hui, my brother, Hsu-Chia Hu, my parents-in-law, Ruey-Hong Hsiao and Chang-Hua Chien, and my brother-in-law, Chieh-Chien Hsiao. Most of all, I would like to thank my dear wife and my best friend, Yun-Hao. Yun-Hao has been my greatest supporter during my time at CMU, and I could not have done this without her. No matter what difficulties I encountered, her smile could always comfort and encourage me. Finally, Luke deserves to be on this list for being the regulator of my daily schedule and the source of joy in my life. I love you all.
Contents
1 Introduction
  1.1 Motivation
  1.2 Thesis Statement
  1.3 Thesis Overview and Contributions
2 Background
  2.1 Schedule-Driven Traffic Control
  2.2 System Architecture of Pilot Deployment
I Stability
3 Coping with Large Traffic Volumes in Schedule-Driven Traffic Signal Control
  3.1 Related Work
  3.2 Problem Setting
    3.2.1 A Queueing Network
    3.2.2 Schedule-Driven Traffic Control
  3.3 Stability
    3.3.1 Network Stability
    3.3.2 Backpressure
  3.4 Softpressure
    3.4.1 Weighted Cumulative Delay
    3.4.2 Queue Management Using Mean Field Methods
    3.4.3 Weight Functions
    3.4.4 Theoretical Guarantees of Stability
  3.5 Experimental Results
  3.6 Initial Field Experiment
II Optimality
4 Using Bi-Directional Information Exchange to Improve Decentralized Schedule-Driven Traffic Control
  4.1 Related Work
  4.2 Problem Formulation
  4.3 Bi-Directional Information Exchange
    4.3.1 Congestion Feedback
    4.3.2 Decentralized Congestion Compensation
    4.3.3 Outdated Information Prevention
    4.3.4 Bottleneck Prevention
    4.3.5 Turning Movement Proportion
  4.4 Experimental Evaluation
    4.4.1 Two-Intersection Model
    4.4.2 Urban Network Model
  4.5 Conclusion
5 Cooperative Schedule-Driven Intersection Control with Connected and Autonomous Vehicles
  5.1 Related Work
    5.1.1 Intersection Control
    5.1.2 Intersection Control with CAV
  5.2 Cooperative Algorithm
    5.2.1 Scheduling Information
    5.2.2 Optimizing Schedule via Changing Platoons
    5.2.3 Cluster Size and Delay-Capacity Tradeoff
  5.3 Experimental Evaluation
    5.3.1 Simulation Results
    5.3.2 Partial Penetration of CAV
  5.4 Conclusion
III Learnability
6 A Hierarchical Algorithm Combining Planning and Learning for Mitigating Urban Traffic Congestion
  6.1 Related Work
    6.1.1 RL-based Approach
    6.1.2 Communication in RL and Planning
  6.2 Background: Online Planning
    6.2.2 Maximum Green Time
  6.3 Background: Reinforcement Learning
    6.3.1 Q-Learning and Deep Q-Network
    6.3.2 Deterministic Policy Gradient (DPG) Algorithm
    6.3.3 Coordinated Reinforcement Learning
    6.3.4 Temporal Abstraction and Hierarchical Deep Reinforcement Learning
  6.4 Hierarchical Algorithm
    6.4.1 System Models
    6.4.2 Algorithm
  6.5 Experimental Evaluation
    6.5.1 Two-Intersection Model
    6.5.2 Urban Network Model
  6.6 Conclusion
7 Behavior Prediction of Scheduling Agent
  7.1 Pedestrian Signal Timing Problem
  7.2 Simple Window-based Policy
  7.3 Policy Search for Picking Window Size
  7.4 Results
  7.5 Conclusion
8 Conclusions
  8.1 Future Directions
List of Figures
1.1 50 intersections of Pittsburgh deployment.
1.2 The outline of approaches in this thesis: 1) Stability, 2) Optimality, 3) Learnability.
2.1 A two-phased intersection with its corresponding input cluster sequence C and the produced schedule at certain time point.
2.2 The resulting control flow $(S, C_{CF})$ calculated by scheduling agents: each block represents a vehicular cluster of input cluster sequence C, which combines the road cluster sequences $C_{R,m}$. For instance, (2, 1) represents the first cluster at phase 2. The shaded blocks of $C_{CF}$ represent the delayed clusters.
2.3 The system diagram of implementation.
2.4 The system diagram of implementation.
3.1 An example of queueing networks consisting of scheduling agents A, B, C and D.
3.2 The phased signalized intersection associated with a two-state Ising model
3.3 Map of the 24 intersections in the Baum-Centre neighborhood of Pittsburgh, Pennsylvania
3.4 The CDF of delay on three different optimization schemes.
3.5 Initial field comparison on Baum-Centre corridors - average queue length and cluster size during PM rush hour.
4.1 Intersection scheduling agents allocate green time through exchanging information with neighbors in a transportation network.
4.2 Exchange congestion feedback with neighbor intersections: intersections 3 and 7 belong to phase 2 of intersection 1, and intersections 5 and 9 belong to phase 1. Each intersection sends and receives congestion feedback to and from neighbor intersections.
4.3 The delay of balanced and unbalanced traffic pattern.
4.4 Map of the 24 intersections in the Baum-Centre neighborhood of Pittsburgh, Pennsylvania
4.5 The cumulative distribution function of delay and number of stops.
5.1 The replanning and control cycle
6.1 The hierarchical structure of traffic control systems with the top reinforcement learning module and the bottom planning module.
6.2 Map of the 24 intersections in the Baum-Centre neighborhood of Pittsburgh, Pennsylvania
6.3 Costs of actor and critic networks under high traffic demand.
6.4 The average delay for each episode
6.5 The cumulative distribution function of delay.
7.1 Pedestrian timing for fixed timing and actuated systems.
7.2 The distribution of prediction error for Centre-Beatty.
7.3 The distribution of prediction error for Penn-East Liberty.
List of Tables
3.1 Avg. delay of Baum Centre Model at PM rush
3.2 Average delay under different scenarios.
3.3 Queue length and cluster size of the intersections under high demand traffic
4.1 Summary of Baum Centre Model Results
4.2 Average delay under different scenarios.
5.1 Average delay of single intersection and clustering interval 0 second.
5.2 Average delay of single intersection and clustering interval 3 second.
5.3 Average delay of three intersections and clustering interval 0 second.
5.4 Average delay of three intersections and clustering interval 3 second.
5.5 Average delay under different penetration rates.
6.1 Summary of two intersection model results
6.2 Average delay under different scenarios.
6.3 Summary of Baum Centre Model Results
Chapter 1
Introduction
1.1 Motivation
Over half of the world’s population now lives in cities, and global urbanization continues at a steady pace. As this trend continues, urban mobility is becoming an increasingly critical problem. In US cities alone, the cost of congestion now exceeds $160 billion in lost time and fuel consumption, and congestion is responsible for the release of an additional 50 billion pounds of CO2 into the atmosphere [45]. To emphasize this issue, the National Academy of Engineering published a report in 2008 outlining 14 game-changing goals for improving life on the planet in the 21st century; one of these grand challenges is to improve urban infrastructure, especially for transportation. While the transportation infrastructure of many countries has seen no major innovation for decades, growing computer and communications technologies are already opening up vast stores of knowledge and entertainment. For example, edge computing, the Internet of Things (IoT) and artificial intelligence (AI) have been widely used in smart homes and in internet and streaming services. As remarkable as these engineering achievements are, just as many great challenges and opportunities in the real world remain to be realized.
It is commonly recognized that better optimization of traffic signals could lead to substantial reductions in congestion and travel times, yet how to optimize a large transportation network in a responsive but scalable way remains a problem that continues to attract researchers from different fields. Fundamentally, traffic signals continue to operate in a way that originated several decades ago. In urban environments, traffic signal control is still dominated by the use of fixed timing plans, which are based on average traffic conditions and quickly become outdated as flow characteristics evolve over time. To improve matters, centralized approaches that adjust signal timing plan parameters (e.g., cycle time, green time split) according to actual sensed traffic data [16, 23, 37, 44] have been proposed. However, these approaches are designed to accommodate continuous gradual change in traffic patterns (typically adjusting parameters after integrating information for several minutes), and are not responsive to real-time traffic events and disruptions. Alternatively, decentralized online planning approaches have been proposed [5, 14, 28, 46, 48]. These approaches solve the problem of scalability in principle, but have historically had difficulty computing timing plans in real time with a sufficiently long horizon to achieve network-level coordination. Based on these observations, we can summarize the problems of current traffic signal technologies as follows:
• Current traffic signal technologies tend to optimize for macro traffic conditions, e.g., over several months or a specific period. Actual conditions vary greatly and evolve over time.
• Current traffic control still uses sensors, i.e., loop or camera detectors, in mundane ways and has not advanced in 40 years.
• Real-time optimization at a large urban scale is still lacking.
• Despite new progress in technology, e.g., AI, connected and autonomous vehicles (CAV), and wireless networking, there is still no proper way to integrate these technologies with traffic signals.
Recent work in decentralized online planning, however, has developed a schedule-driven approach to real-time traffic control that overcomes this horizon problem [67, 68]. Key to this approach is a formulation of the core intersection scheduling problem as a single machine scheduling problem, where input jobs are clusters of vehicles in close proximity to each other (i.e., approaching platoons, queues). This aggregate representation allows plans to be efficiently generated with order-of-magnitude longer horizons than was previously possible, and hence enables network-level coordination through exchange of schedule information. Under this approach, the goal is to allocate green time to different signal phases over time, where a signal phase is a compatible traffic movement pattern (e.g., East-West traffic flow). Each intersection asynchronously computes a schedule of green phases that minimizes the cumulative delay of all approaching vehicles through the intersection, and then communicates expected outflows to its downstream neighbors as it begins to execute its schedule. Scalability is ensured by the fact that intersections only communicate with their direct neighbors. However, since the planning horizon is extended, outflow information can propagate to non-local neighbors. Results obtained in an initial field test showed a 25% reduction in travel times, a 40% reduction in wait times and a 30% reduction in the number of stops through the network [2, 50], and the system currently controls a network of 50 intersections in the East End area of Pittsburgh, PA, as shown in Figure 1.1.
Figure 1.1: 50 intersections of Pittsburgh deployment.
Although schedule-driven traffic control has achieved initial success and laid a solid foundation for scalable real-time traffic signal control from the planning and scheduling perspective of the AI community, a better network-level coordination scheme that can handle various congestion levels is still required. For high congestion levels, a method is needed to improve the efficiency of schedule-driven traffic control, which is hindered by large platoons and network gridlock. The field of queueing control has already shed light on this problem by introducing the concept of queueing stability. In addition, if we examine the traffic signal control problem in a network carefully, we find that the formulation of the problem is general. The entire network can be described by a directed acyclic graph (DAG). Each node specifies an intersection and each directed edge specifies a lane connecting two intersections. Under this problem setting, we are usually concerned with a global objective of the network, subject to constraints that define the interdependency between nodes. For example, given the external arrival traffic at the edges of the network, the goal is either to minimize average delay or to maximize average throughput by designing a control policy for each signal controller to adjust its timing. Due to the coupling between intersections, it is impossible to optimize intersections independently; the respective control policies must be considered jointly. We can solve this problem either by changing the optimization model for planning or by learning a joint control policy in a decentralized way to approach global optimality. It is interesting to note that the stability problem and this generality of the network-level optimization problem give us opportunities to explore other engineering concepts and to see whether a composite approach could bring new insights into decentralized schedule-driven systems.
The main motivation of this thesis is to contribute to bridging the gap between a schedule-driven approach and other approaches to network-level control, including:
Control of queueing networks
• The queueing network has arbitrary network topology and multiple servers. Servers are connected by an arbitrary number of queues. The servers are interdependent in that they cannot provide service simultaneously. We consider a system to be stable if the queues do not tend to increase without bound. We wish to find control policies under which the system is stable for given distributions of service rate and arrival rate.
Distributed optimization
• In distributed optimization, the essential part has always been decomposition: based on the specific structure of the objective function and constraints, the problem is decomposed into a number of subproblems. Based on the convexity or non-convexity of those subproblems, they can be solved independently, but typically require a coordinator to ensure that the local decisions converge to the global optimum. Practically, we can replace the coordinator with a message exchange protocol between direct neighbors.
Multi-agent systems
• Agent-based approaches suit the decentralized traffic management problem, given newly developed sensing technologies and historical temporal data, as well as the frequent and flexible interaction between the agents and their environment [4, 12]. A common approach related to control of traffic signals is to let multiple agents learn a policy for mapping states to actions by monitoring traffic flow and selecting actions.
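The stability notion used above for queueing networks (queues that do not tend to increase without bound) can be illustrated with a minimal single-queue sketch. This is not the model used in the thesis: the Bernoulli arrival and service processes, the function name `simulate_queue` and all parameter values are assumptions chosen only for illustration.

```python
import random
from typing import List


def simulate_queue(arrival_prob: float, service_prob: float,
                   steps: int, seed: int = 0) -> List[int]:
    """Simulate one discrete-time queue and return its length after each step.

    Assumption (hypothetical model): at most one arrival and at most one
    departure per step, each drawn as an independent Bernoulli trial.
    """
    rng = random.Random(seed)
    queue = 0
    history = []
    for _ in range(steps):
        if rng.random() < arrival_prob:
            queue += 1          # one vehicle joins the queue
        if queue > 0 and rng.random() < service_prob:
            queue -= 1          # one vehicle is served
        history.append(queue)
    return history


# Arrival rate below service rate: the queue length stays bounded.
stable = simulate_queue(arrival_prob=0.3, service_prob=0.5, steps=20000)
# Arrival rate above service rate: the queue grows roughly linearly.
unstable = simulate_queue(arrival_prob=0.6, service_prob=0.5, steps=20000)

print("stable   max queue:", max(stable))
print("unstable final queue:", unstable[-1])
```

A control policy in a real network must achieve the bounded behavior across coupled queues whose servers cannot all serve at once, which is considerably harder than this single-queue case.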
Within these contexts, the objective of this thesis is to explore the possibility of combining other engineering concepts with schedule-driven traffic control to further improve network-level performance. Beyond the possibilities above, we also investigate how new vehicle technologies such as CAV can collaborate with the schedule-driven approach. The progress of future urban mobility will be highly dependent on this vehicle-to-infrastructure framework, even if traffic signals eventually cease to exist.
1.2 Thesis Statement
The key problem in scalable real-time traffic control as a component of improving urban mobility, and the primary focus of this thesis, is that schedule-driven traffic control, for all its strengths in combinatorial optimization and real-time responsiveness, still lacks a theoretical guarantee of network-level stability as in queueing control, a proper definition of network-level optimality as in distributed optimization, and the capability of self-improvement over a network as in multi-agent systems.
Compared with the above engineering approaches to the traffic control problem, the advantage of schedule-driven traffic control lies in solving a combinatorial optimization problem in real time. However, this advantage raises a barrier to achieving better network-level performance. For example, distributed optimization usually assumes that the subproblems are continuous or satisfy certain smoothness conditions, so that convex analysis or other mathematical theory can be applied to derive the optimal solution. Combinatorial optimization problems do not have such characteristics. It is therefore interesting to explore how to approach global optimality by solving modified subproblems separately. In this thesis, we intend to enhance schedule-driven traffic control with better network-level coordination to tackle urban network congestion. This thesis addresses these challenging issues, summarized in the following statement:
Thesis Statement: In this thesis, we investigate possibilities for improving the network-level performance of decentralized schedule-driven traffic signal control by integrating it with complementary control perspectives and techniques.
The thesis statement can be decomposed into three subproblems related to the stability, optimality and learnability of schedule-driven traffic control, respectively, as shown in Figure 1.2. Later, we empirically demonstrate how the hybrid approach can outperform the original method while retaining its computational advantages.
Figure 1.2: The outline of approaches in this thesis: 1) Stability, 2) Optimality, 3) Learnability.
1.3 Thesis Overview and Contributions
In this thesis, we investigate three aspects of schedule-driven traffic control: stability (Part I), optimality (Part II) and learnability (Part III), following the thesis problem. We begin with a review of schedule-driven traffic control and the system architecture in Chapter 2. The three core parts, which constitute the primary contributions of the thesis, follow. The three parts are summarized in Figure 1.2 through three ordered questions. First, we need stability to ensure that the system does not transition to irreversible states (e.g., unbounded growth of queues). Second, since the system is decentralized, we can leverage tools from multi-agent theory and distributed optimization to better approach global optimality. Finally, utilizing generated data to further improve performance or to benefit multi-modal traffic is inevitable in such data-rich systems, where a great amount of data is generated on an urban scale. Based on the same decentralized information exchange framework, we propose a fully decentralized learning algorithm that coordinates traffic by configuring parameters correctly.
This thesis makes the following specific contributions:
Part I: Stability In queueing networks, stability is a critical issue in control policy design. If the system cannot guarantee stability, it is possible for queues to grow without bound. Although schedule-driven traffic control can guarantee the optimality of the solution given the current representation within a finite horizon, the representation may not be optimal from the perspective of the underlying system. To address this issue, we propose a way to stabilize queues in Chapter 3 while retaining the local optimality of problem solving. The proposed composite approach to real-time traffic control uses sensed information on queue lengths to influence scheduling decisions and gracefully shifts the signal control strategy to queue management in high volume/high congestion settings. The approach is shown to reduce average waiting times by 60% in heavy traffic scenarios.
Part II: Optimality For decentralized systems, optimality usually refers to network-level performance or the social welfare of the whole network. In the current setting of the schedule-driven approach, each intersection solves a single-machine scheduling problem individually, even though the upstream intersection shares its schedule with downstream intersections. To approach global optimality, we propose in Chapter 4 that a reverse-direction information flow be provided by downstream intersections. The objective function of the original scheduling problem is then modified to a multi-hop version of delay. By optimizing this new objective, we can approach global optimality with only a slight modification of the model. This is analogous to imposing a tax or price on agents to achieve better network-level performance. Results show that the new bi-directional information exchange model improves average delay overall in comparison to both the baseline schedule-driven traffic control approach and a cycle-based adaptive traffic signal control approach, and that the solutions provide substantial gains in highly congested scenarios.
In addition to pure intersection control, we explore how CAV can be integrated with the schedule-driven approach to create better traffic flow that can guarantee smaller delay than the baseline schedule-driven traffic control approach. In Chapter 5, we propose to utilize the produced scheduling information to shift platoons by either speeding up or slowing down vehicles. The proposed algorithm can provide an additional 19% reduction in delay compared to the original schedule-driven approach.
Part III: Learnability Learnability means that the system is able to improve itself by utilizing the data or information produced when interacting with its environment or with other agents. The schedule-driven approach can be categorized as model-based intersection optimization, in which planning and scheduling are conducted on the basis of a predefined model. A model defines the scheduling problem to be solved and is embedded with several parameters that need to be configured correctly according to the real traffic conditions. In Chapter 6, we propose a hierarchical framework that integrates reinforcement learning with schedule-driven traffic control, where the top-level module learns a policy to configure parameters, and the bottom-level module generates timing plans that minimize the average delay of vehicles.
Frequent replanning and an extended horizon with outflow information are two keys to the success of schedule-driven traffic control. As a result of these two techniques, a huge amount of data is generated in real time. In real-time traffic signal control, such data has usually been seen as a byproduct of detection. In Chapter 7, we leverage these data to help the scheduler/planner make better decisions on switching various timings in traffic control systems. One example is to apply sequential samples of remaining green time from schedulers to predict the actual green time. In this case, pedestrian signal timing can be extended along with the green time, even though there is a fixed countdown timer for guaranteeing safety.
Conclusions Through examination of network-level control from stability, optimality and learnability perspectives, this thesis has shown how the global performance of schedule-driven traffic signal control systems can be substantially improved. Moreover, these aspects depict a possible direction of system evolution for future transportation infrastructure. At the same time, this thesis is only a first step in enabling AI-controlled transportation infrastructure and developing scalable real-time control algorithms. In Chapter 8, we identify remaining questions and directions for future research in schedule-driven traffic control.
Chapter 2
Background
2.1 Schedule-Driven Traffic Control
As indicated above, the key to the single machine scheduling problem formulation of the schedule-driven approach of [67, 68] is an aggregate representation of traffic flows as sequences of clusters c over the planning (or prediction) horizon. Each cluster c is defined as (|c|, arr, dep), where |c|, arr and dep are the number of vehicles, arrival time and departure time, respectively. Vehicles entering an intersection are clustered together if they are traveling within a pre-specified interval of one another. The clusters become the jobs that must be sequenced through the intersection (the single machine). Once a vehicle moves through the intersection, it is sensed and grouped into a new cluster by the downstream intersection. The sequences of clusters provide short-term variability of traffic flows at each intersection and preserve the non-uniform nature of real-time flows. Specifically, the road cluster sequence $C_{R,m}$ is a sequence of (|c|, arr, dep) triples reflecting each approaching or queued vehicle on entry road segment m and ordered by increasing arr. Since it is possible for more than one entry road to share the intersection in a given phase (a phase is a compatible traffic movement pattern, e.g., East-West traffic flow), the input cluster sequence C as shown in Figure 2.1 can be obtained by combining the road cluster sequences $C_{R,m}$ that can proceed concurrently through the intersection. The travel time on entry road m defines a finite horizon ($H_m$), and the prediction horizon H is the maximum over all roads.
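The cluster representation described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the deployed implementation: the names `Cluster`, `build_road_clusters`, `prediction_horizon` and `max_gap` are hypothetical, vehicles are represented only by their predicted stop-line arrival times, and dep is taken to be the arrival time of the last vehicle in the cluster.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Cluster:
    count: int   # |c|: number of vehicles in the cluster
    arr: float   # arrival time of the first vehicle (s)
    dep: float   # assumed here: arrival time of the last vehicle (s)


def build_road_clusters(arrival_times: List[float], max_gap: float) -> List[Cluster]:
    """Group vehicles on one entry road into clusters ordered by increasing arr.

    A vehicle joins the current cluster if it follows the previous vehicle
    within `max_gap` seconds (the pre-specified interval); otherwise it
    starts a new cluster.
    """
    clusters: List[Cluster] = []
    for t in sorted(arrival_times):
        if clusters and t - clusters[-1].dep <= max_gap:
            clusters[-1].count += 1
            clusters[-1].dep = t
        else:
            clusters.append(Cluster(count=1, arr=t, dep=t))
    return clusters


def prediction_horizon(travel_times: Dict[str, float]) -> float:
    """Return H, the maximum of the per-road horizons H_m."""
    return max(travel_times.values())


# Hypothetical arrival times (s) on one entry road, with a 3-second interval.
road = build_road_clusters([2.0, 3.0, 4.5, 12.0, 13.0, 30.0], max_gap=3.0)
for c in road:
    print(c)
print("H =", prediction_horizon({"north": 12.0, "east": 20.0}))
```

Aggregating vehicles this way shrinks the number of jobs the scheduler must sequence, which is what makes the longer planning horizon tractable.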
Every time the cluster sequences along each approaching road segment are determined, each cluster is viewed as a non-divisible job and a forward-recursion dynamic programming search is executed in a rolling horizon fashion to continually generate a phase schedule that minimizes the cumulative delay of all clusters. Scheduling is invoked once per second to reduce the uncertainty associated with clusters and queues. The process constructs an optimal sequence of clusters that maintains the ordering of clusters along each road segment, and each time a phase change is implied by the sequence, a delay corresponding to the intersection’s yellow/all-red changeover time constraints is inserted based on Algorithm 5. If the resulting schedule is found to violate the maximum green time constraints for any phase (introduced to ensure fairness), then the first offending cluster in the schedule is split, and the problem is re-solved.
Figure 2.1: A two-phased intersection with its corresponding input cluster sequence C and the produced schedule at certain time point.
Formally, the resulting control flow can be represented as a tuple $(S, C_{CF})$, shown in Figure 2.2, where $S$ is a sequence of phase indices, i.e., $(s_1, \cdots, s_{|S|})$, and $C_{CF}$ contains the sequence of clusters $(c_1, \cdots, c_{|S|})$ and the corresponding starting times after being scheduled. More precisely, the delay that each cluster contributes to the cumulative delay $\sum_{k=1}^{|S|} d(c_k)$ is defined as
$$ d(c_k) = |c_k| \cdot (\mathit{ast} - \mathit{arr}(c_k)), \tag{2.1} $$
where ast is the actual start time at which the vehicles are allowed to pass through. It is determined from arr and the permitted start time (pst), as described in Algorithm 1. For a partial schedule $S_k$, the corresponding state variables are defined as a tuple (s, pd, t, d), where s is the phase index, pd is the duration of the last phase, t is the finish time of the kth cluster and d is the accumulated delay of all k clusters. The state variables of $S_k$ can be updated from those of $S_{k-1}$. Algorithm 1 is based on a greedy realization of the planned signal sequence, where MinSwitch(s, i) returns the minimum time required for switching from phase s to phase i and $slt_i$ is the start-up lost time for clearing the queue in phase i. We can use t and MinSwitch(s, i) to derive pst. The optimal sequence (schedule) $C_{CF}^*$ is the one that incurs minimal delay for all vehicles.
Figure 2.2: The resulting control flow $(S, C_{CF})$ calculated by scheduling agents: each block represents a vehicular cluster of the input cluster sequence C, which combines the road cluster sequences $C_{R,m}$. For instance, (2, 1) represents the first cluster at phase 2. The shaded blocks of $C_{CF}$ represent the delayed clusters.
Algorithm 1 Calculate (pd, t, d) of S_k
Require: 1) (s, pd, t, d) of S_{k−1}; 2) s_k
  i = s_k; c = next job of phase i
  pst = t + MinSwitch(s, i)            ▷ Permitted start time of c
  ast = max(arr(c), pst)               ▷ Actual start time of c
  if s ≠ i and pst > arr(c) then
    ast = ast + slt_i
  end if
  t = ast + dep(c) − arr(c)            ▷ Actual finish time of c
  if s ≠ i then
    pd = t − pst
  else
    pd = pd + (t − pst)
  end if
  d = d + |c| · (ast − arr(c))         ▷ Total accumulated delay
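The state update of Algorithm 1 can be transcribed almost line by line. The following is a minimal sketch, assuming a hypothetical `min_switch` callable and a `slt` mapping supplied by the caller; the class and function names are illustrative and are not taken from the deployed system.

```python
from dataclasses import dataclass
from typing import Callable, Mapping


@dataclass
class Cluster:
    size: float  # |c|
    arr: float   # arrival time
    dep: float   # departure time


@dataclass
class State:
    s: int     # phase index of the last scheduled cluster
    pd: float  # duration of the last phase
    t: float   # finish time of the last scheduled cluster
    d: float   # accumulated delay


def update_state(state: State, i: int, c: Cluster,
                 min_switch: Callable[[int, int], float],
                 slt: Mapping[int, float]) -> State:
    """Extend a partial schedule by cluster c of phase i (Algorithm 1)."""
    pst = state.t + min_switch(state.s, i)   # permitted start time of c
    ast = max(c.arr, pst)                    # actual start time of c
    if state.s != i and pst > c.arr:
        ast += slt[i]                        # start-up lost time for a waiting queue
    t = ast + (c.dep - c.arr)                # actual finish time of c
    if state.s != i:
        pd = t - pst                         # a new phase begins
    else:
        pd = state.pd + (t - pst)            # the current phase is extended
    d = state.d + c.size * (ast - c.arr)     # accumulated delay
    return State(s=i, pd=pd, t=t, d=d)
```

As a usage example, with a 5 second changeover and a 2 second start-up lost time, a cluster of four vehicles that arrives at time 18 while the other phase runs until time 20 starts at time 27 and contributes 4 · (27 − 18) = 36 vehicle-seconds of delay.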
To collaborate with neighbor intersections, each intersection receives a projection of expected outflows from its upstream neighbors and plugs it into its local computation. After it starts to execute its schedule, the resulting outflows are communicated to its downstream neighbors. Since a vehicle may enter or leave an intersection via different road segments, the clusters that are propagated to neighbors over the extended look-ahead horizon H are split and weighted by the turning movement proportions. Thus, the weight |c| of a non-local cluster is a fractional number that reflects the uncertainty of movement. The turning movement proportions are estimated by averaging the traffic flow rates for different phases. All approaching vehicles are sensed through the intersection's lane detectors.
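The splitting of outflow clusters by turning movement proportion can be sketched as follows. This is only an illustration under stated assumptions: the `ScheduledCluster` type, the per-road `travel_times` used to shift the cluster to the downstream stop line, and the function name are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class ScheduledCluster:
    size: float   # number of vehicles, possibly fractional
    start: float  # time the cluster starts to pass, in seconds
    end: float    # time the cluster finishes passing, in seconds


def split_outflow(cluster: ScheduledCluster,
                  turn_proportions: Dict[str, float],
                  travel_times: Dict[str, float]) -> Dict[str, ScheduledCluster]:
    """Split one scheduled cluster into fractional clusters per exit road.

    turn_proportions[road] is the estimated share of vehicles leaving via
    `road`; travel_times[road] is an assumed free-flow travel time to the
    downstream intersection.
    """
    outflows = {}
    for road, share in turn_proportions.items():
        shift = travel_times[road]
        outflows[road] = ScheduledCluster(size=cluster.size * share,
                                          start=cluster.start + shift,
                                          end=cluster.end + shift)
    return outflows
```

A cluster of ten vehicles with proportions 0.2 and 0.8 is thus announced downstream as two clusters of weight 2 and 8, which is why non-local cluster weights are generally fractional.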
2.2
System Architecture of Pilot Deployment
The implementation of schedule-driven traffic control is organized as a completely decentralized multi-agent system in which each intersection is controlled by an agent. The agent software runs on a computer located in the traffic cabinet as a multi-threaded service, shown in Figure 2.4. The agent detects vehicles, computes schedules, and controls the traffic signal at the intersection. The agent can be decomposed into four parts: Detector, Communicator, Executor, and Scheduler. After introducing the architecture, we briefly describe the implementation of this thesis.
Detector The job of the Detector service is to manage the interfaces with all sensors located at an intersection, including radar sensors, camera sensors and loop detectors. These sensors produce real-time data; on receiving the data, the Detector service encodes it into a message and forwards the message to the local Scheduler service. If the sensor functions as an advance detector for a neighboring intersection, the message must also be sent to the remote Scheduler for further processing.
Two types of data, vehicle counts and occupancy time at each detector, are processed by the Scheduler service. We define two separate detection zones, a data zone and a presence zone, for the video detection in the pilot system. Data zones need to be small in order to count vehicles, whereas presence zones are large in order to estimate vehicle occupancy. These two types of information correspond to the flow rate and the density of the road segment, respectively. As a vehicle passes a data zone, a message is generated and routed through the Communicator. Occupancy for all presence zones is sensed every 0.1 seconds, aggregated every second, encoded into messages, and sent the same way.
Figure 2.3: The system diagram of implementation.
Communicator As shown in Figure 2.3, all messages are routed through the Communicator at a given intersection. There are two types of routing: internal communication and external communication. Most messages are routed locally, including phase messages and scheduling information. External communication covers the messages sent to neighbors, such as outflow information. All data are encoded as messages of pre-defined types, and can be addressed to any intersection through the local area network. By defining those types, hardware from different manufacturers can be integrated together. Formally, each message can be described as a tuple {type, time, orig, dest, source, data}: the message type, the time at which the message was generated, the intersection where the message originated, a list of destination intersections for the message, the service or detector that created the message, and the content of the message as a JSON (JavaScript Object Notation)-encoded string.
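The message tuple can be pictured with a small sketch. The six field names follow the tuple above; everything else (the `Message` class, the helper names, the example message type "outflow" and the intersection identifiers) is hypothetical and does not describe the wire format of the deployed system.

```python
import json
from dataclasses import asdict, dataclass
from typing import Any, List


@dataclass
class Message:
    type: str        # pre-defined message type
    time: float      # time at which the message was generated
    orig: str        # intersection where the message originated
    dest: List[str]  # destination intersections
    source: str      # service or detector that created the message
    data: str        # content as a JSON-encoded string


def make_message(msg_type: str, timestamp: float, orig: str, dest: List[str],
                 source: str, payload: Any) -> Message:
    """Build a message whose content is stored as a JSON-encoded string."""
    return Message(type=msg_type, time=timestamp, orig=orig, dest=list(dest),
                   source=source, data=json.dumps(payload))


def encode(msg: Message) -> str:
    """Serialize the whole message for transmission (assumed encoding)."""
    return json.dumps(asdict(msg))


def decode(raw: str) -> Message:
    """Rebuild a message from its serialized form."""
    return Message(**json.loads(raw))
```

An outflow message to a downstream neighbor could then be created with `make_message("outflow", 12.5, "I-1", ["I-2"], "Scheduler", {"clusters": [[2.0, 30.0, 34.0]]})`.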
The deployments rely fundamentally on connectivity throughout the road network, but to ensure scalability it is only necessary for an intersection to be able to communicate with direct neighbors. By keeping communication strictly between neighbors, the system can grow incrementally or scale to very large signal networks. All communication is asynchronous and robust to temporary network failure such as maintenance.
Executor In the deployed system, the Executor service is responsible for communicating frequently with the traffic signal controller. Any phase switches, e.g., the start or end of a phase, are forwarded to the Scheduler by the Executor. Because of replanning, the Executor follows the most recent schedule provided by the Scheduler, which reflects current traffic conditions, and sends extension calls to continue the current phase until the phase switch implied by the schedule. When the Scheduler updates the schedule, it may extend the current phase by any amount greater than or equal to the minimum extension (a system parameter). The extension time was set to one second, so that the Scheduler could recompute the schedule as frequently as once per second. When the current phase is extended, the Executor notifies the Scheduler of the upcoming decision time in the schedule, i.e., the point by which a subsequent update to extend the phase must be received. However, error handling is still required, as the Scheduler has to make decisions within extremely short extension times. A schedule may arrive at the Executor too late to extend the current phase. In this situation, the Executor rolls back to the default phase duration computed by the Scheduler. The Executor will only end a phase earlier than the default duration if the Scheduler chooses to terminate the phase. The Executor may also fall back to these phase durations in the case of prolonged sensor or network failure.
Scheduler The Scheduler service is an implementation of the schedule-driven traffic control approach described above. As it continuously receives real-time phase and detection data and scheduled upstream outflow information, the Scheduler builds its abstract model of the traffic approaching the intersection and constructs a new phase schedule. Once a new schedule has been constructed, the leading portion of clusters is sent to the Executor for controlling the traffic light, and part of the schedule is sent to the downstream intersections as outflow information.
If the connection to a neighbor intersection fails, the data from upstream intersections or advance detectors is lost. If the problem has lasted only a short time, the local Scheduler can still work with the most recent data. However, a longer disconnection might cause the link to be under-serviced, since no newly detected vehicle information is forwarded. After detecting a longer disconnection, the system falls back to the original timing plan, and neighbor intersections utilize only local information to compute their schedules. Hence, the Scheduler operates using hybrid information when look-ahead information is only available for some links. The performance of the intersection might be degraded by a prolonged disconnection, but short communication failures do not have major effects on overall system performance.
Implementation of this thesis After introducing the system architecture, we briefly describe where the proposed methods of this thesis are located in Figure 2.3. For Chapter 3, the stabilized algorithm is implemented in the Scheduler service. We define new types of messages to include queue length information. With neighbor queue and local queue information routed by the Communicator, the Scheduler service is able to compute a new schedule that stabilizes the network. For Chapter 4, the changes are also mainly located in the Scheduler service. For Chapter 6, we implement the reinforcement learning algorithm in the Executor, since it is the module that is able to process all scheduling information. We also define a new message type to configure the parameters we want to optimize.
Part I
Stability
Chapter 3
Coping with Large Traffic Volumes in Schedule-Driven Traffic Signal Control
Schedule-driven traffic control has achieved preliminary success in an initial field test [2, 50]. However, there are circumstances when the effectiveness of such a schedule-driven approach to coordination can break down. In particular, in situations of high congestion, where the number of vehicles approaches the physical capacity of interconnected queues in the network, the traffic control problem becomes less of a scheduling problem (e.g., involving just a single cluster spanning the planning horizon along each intersection approach in the extreme case), and more of a problem of managing queues. In this chapter, we propose a composite approach to real-time traffic control that addresses this issue, by using sensed information on queue lengths to influence scheduling decisions and gracefully shift the signal control strategy to queue management in high volume/high congestion settings.
To stabilize queues in a network (i.e., to prevent unbounded growth of queues), the vehicle clusters associated with longer queues should be serviced first. Within the above intersection scheduling framework, one straightforward way of achieving this behavior is to assign higher weights (i.e., higher priority) to these input clusters and compute phase schedules that minimize weighted cumulative delay. To balance the emphasis placed on queue management as a function of network saturation, we propose to use queue-length information (both local to the intersection and non-local from neighbors) to establish the weights. In situations where queue lengths are small, cluster priority will continue to be a function of the cluster size (number of vehicles) as before; however, as the network becomes saturated and queues become longer, clusters associated with longer queues will begin to dominate cluster priority. To ensure scalability, queue information is only exchanged between direct neighbors and the asynchronous nature of local intersection scheduling is preserved.
To derive an appropriate set of weights, the signal phases of a given intersection are viewed as different states of an Ising model [52] and the probabilistic distribution of this model, whose parameters quantify transitions between phases and strength of interactions in terms of queue-length information, is calculated. However, computing the exact distribution is a hard problem [7]. Hence we turn to approximation through mean field methods originating in statistical physics and the graphical model literature [61]. The marginal distribution derived for each phase is then used as the weight of that phase's clusters. We show formally that the proposed composite approach prevents queues from increasing without bound and therefore achieves network stability. We also present simulation results on a real-world traffic network that demonstrate the ability of our approach to effectively integrate queue management into schedule-driven traffic control. The approach is shown to reduce average waiting times by 60% in heavy traffic scenarios. Finally, we report the results from some initial experiments in the field, which verify the ability to reduce queues during heavy traffic periods.
The remainder of this chapter is organized as follows. We first summarize relevant literature and the problem setting. Next, the concept of stability and a related backpressure algorithm are introduced. Then, the mechanisms necessary to achieve decentralized queue management are discussed. Finally, an empirical analysis of the composite approach is presented, and conclusions are drawn.
3.1
Related Work
Techniques that restrict queues from growing in a network are of broad interest in computer network design and manufacturing. Researchers from computer science, operations research, and communication engineering have worked persistently toward queue management techniques that are able to stabilize queues and preserve performance. In the field of scheduling, however, this and related problems are rarely discussed. One exception is the recent research on coupling queueing theory concepts with finite capacity scheduling [56, 58].
In [55], a "backpressure" policy called the max-weight algorithm was introduced to maximize the throughput of a network through stabilization of queues. This approach has been applied mainly to communication networks, but also recently to transportation networks [65]. However, there are two complications with applying backpressure to the problem of real-time traffic control. First, although backpressure maximizes network throughput, the practical version [54] still seems to induce a large average delay [47], which is undesirable in the case of traffic networks. Second, the approach does not consider non-local influence from neighbors and is thus susceptible to myopic decisions. The proposed approach can be seen as a soft version of the backpressure policy, so that the stability of the queues is guaranteed. Furthermore, delay performance is not sacrificed, owing to the scheduling problem formulation. Recent work has also proposed that hybridization of scheduling and queueing techniques can provide synergistic benefits in different application settings [57, 58].
Analytical network models based on queueing theory [41] are another way to approach the problem of network congestion. By solving a large-scale optimization problem [42], signal timing plans can be derived for an urban road network. As mentioned earlier, however, such an approach optimizes from a snapshot of average traffic flow, which is typically quite different than actual traffic flow. Traffic flow prediction work [70] has also concentrated in recent years on dealing with urban congestion. However, the interconnected queues increase the difficulty of predicting arrival rates of coming vehicles.
To our knowledge, statistical physics or graphical models have never been used to solve the network congestion problem. Previous work has focused on using statistical physics to study traffic flow dynamics, although such approaches have received less attention than others. Recently, these approaches have witnessed a resurgence. For instance, [27] recently studied the effect of driver algorithms on traffic flow. [52] applied the Ising model to study chaotic dynamics and serve as a starting point for considering statistical mechanics of traffic signals. In the context of self-organized traffic flow, however, these techniques have been less discussed due to the lack of a suitable model.
3.2
Problem Setting
Consider a dynamic environment where the jobs change dynamically over time and their processing times are affected by various types of uncertainty.
3.2.1
A Queueing Network
The system of interest is a queueing network [20]. The connectivity of the networked system is represented by a directed graph G = (V, E), where V is the set of nodes and E is the set of links. We consider a network consisting of |V| = L nodes and |E| = N links. Each node has a scheduling agent to serve jobs belonging to specific classes. On the link (i, j), the node j has a corresponding queue $Q_{ij}$ to buffer approaching jobs. The scheduling agent can only serve one queue at a time. For instance, Figure 3.1 shows that node C has two queues $Q_{BC}$ and $Q_{AC}$, and at any point it can serve either $Q_{BC}$ or $Q_{AC}$ (but not both). Jobs arrive over time via an arrival process determined by the last visited node. A job may enter the network from any node and leaves the network once it reaches its destination by appropriate routing through the network. We assume that there are K job classes. At each time t, a job i belonging to class k arrives at node l with probability $p_k$, and its service rate and processing time on the agent are denoted by $\mu_{lk}$ and $d_{li}$. We assume the agents in the network are able to serve jobs of all classes. The objective of the agents is to minimize the total cumulative delay of jobs traveling through the network over a given time period, while maintaining the stability of the network.
Figure 3.1: An example of queueing networks consisting of scheduling agents A, B, C and D.
3.2.2
Schedule-Driven Traffic Control
As mentioned earlier, our hypothesis is that the effectiveness of this schedule-driven process degrades as congestion increases near saturation, due to the fact that it becomes more and more difficult to accurately predict when incoming clusters are going to arrive at the intersection when queues become large. Note that a queueing network is considered to be stable if the queues do not tend to increase without bound. To boost the performance of this schedule-driven process in a network that is experiencing high congestion, we introduce a weight into the delay computation. The basic idea is to bias the scheduling search more toward stabilizing local queues (both at the local intersection and at its neighbor intersections) as the level of local congestion increases. To measure the level of congestion, we rely on queue-length information associated with various phases. To provide a low complexity scheme for queue management, we propose to weight each cluster of a given phase equally. The delay incurred by each cluster is thus rewritten as
d(c) = |c| · (ast − arr(c)) · w(p), (3.1)
where w(p) is the weight assigned to the phase p that cluster c belongs to. The important question then becomes: how to set the weights for competing phases.
3.3
Stability
In [57], stability was first introduced to the scheduling community. Informally, a system is considered to be stable if the queues remain bounded over time. This previous work used the definition that the load of each machine, defined as the ratio of the arrival rate to the service rate, is strictly less than 1 as the necessary conditions for stability. However, queueing networks need a more general definition of stability to describe the behaviors of networks.
3.3.1
Network Stability
To discuss stability for networks, we specify the general network model. We consider that a set of scheduling agents controls job flows through a network with the goal of minimizing the average delay, subject to the constraint that only a specified set of links can be activated simultaneously due to the need to share the common resource. The network is assumed to operate in slotted time, i.e., $t \in \{0, 1, 2, \ldots\}$. We assume there are N queues in the network. Let $Q(t) = (Q_1(t), \ldots, Q_N(t)) \in \mathbb{R}_+^N$, $t = 0, 1, 2, \ldots$, be the queue length vector of the network, in units of job processing time. In this chapter, we adopt the following notion of queueing stability for a network [15, 55, 65]:
$$ \bar{Q} \equiv \lim_{T \to \infty} \frac{1}{T} \sum_{t=0}^{T-1} \sum_{i=1}^{N} E[Q_i(t)] < \infty, \tag{3.2} $$
implying that the time-averaged length of queues is bounded. This definition also implies that the Markov process that describes the dynamics of the system is positive recurrent [38]. In queueing theory, establishing network stability is considered to be a prerequisite to more detailed analysis and scheduling policy design.
3.3.2
Backpressure
We start with a description of the backpressure algorithm introduced by Tassiulas and Ephremides [55]. At each time slot t, the agent at each node selects a particular non-conflicting set of links (e.g., a particular signal phase) to serve from the set of all incoming links. Each agent assigns a weight to each such set of links by summing the queue lengths that the set of links proposes to serve, and then chooses the set of links with the largest weight. It has been shown that backpressure is stable and throughput optimal for the general networks considered in queueing systems. The backpressure algorithm is also known to induce a reasonable (polynomial in network size) average queue size for this model, and therefore the induced delay is upper bounded.
The algorithm is sketched as follows: Consider a queueing network with the queue vector $Q(t)$. Let $Q_i(t)$ be the size of queue i at the beginning of time slot t. We denote the feasible set of non-conflicting links by $\mathcal{S} \subset \mathbb{R}_+^N$. In every time slot an activation vector $\boldsymbol{\pi} \in \mathcal{S}$ is chosen; $Q_i(t)$ is given an amount of service $\pi_i$ in that time slot. For simplicity, we will restrict ourselves to $\mathcal{S}$ such that $\mathcal{S} \subset \{0, 1\}^N$; that is, for any $\boldsymbol{\pi} \in \mathcal{S}$, $\pi_i = 1$ ($Q_i(t)$ receives one unit of service) or 0 ($Q_i(t)$ receives no service). The backpressure algorithm chooses a vector $\boldsymbol{\pi}$ such that
$$ \boldsymbol{\pi} \cdot Q(t) = \max_{\boldsymbol{\rho} \in \mathcal{S}} \boldsymbol{\rho} \cdot Q(t), \tag{3.3} $$
where $u \cdot v = \sum_{i=1}^{N} u_i v_i$. The backpressure algorithm chooses the set of links to activate solely on the basis of current queue length and does not need to learn other parameters.
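The selection rule in (3.3) amounts to a maximization over a finite set of 0/1 activation vectors. A minimal sketch, assuming the feasible set is given explicitly as a list of vectors (the function name and the example phases are illustrative):

```python
from typing import Sequence


def backpressure_choice(queues: Sequence[float],
                        feasible_sets: Sequence[Sequence[int]]) -> Sequence[int]:
    """Return the activation vector pi in S that maximizes pi . Q(t).

    queues[i] is Q_i(t); each element of feasible_sets is a 0/1 vector
    describing one non-conflicting set of links (e.g., one signal phase).
    Ties are broken in favor of the first maximizer in the list.
    """
    def weight(pi: Sequence[int]) -> float:
        return sum(p * q for p, q in zip(pi, queues))

    return max(feasible_sets, key=weight)


# Example: four approaches, two phases (East-West serves queues 0 and 2).
east_west = (1, 0, 1, 0)
north_south = (0, 1, 0, 1)
chosen = backpressure_choice([5, 2, 7, 1], [east_west, north_south])
```

Here the East-West phase has weight 5 + 7 = 12 against 2 + 1 = 3 for North-South, so `chosen` is the East-West activation vector. The decision is binary: the losing phase receives no service in this slot regardless of how close the weights are.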
3.4
Softpressure
The backpressure algorithm ensures stability by activating the links with the largest queue length. The activation is binary according to (3.3). Dynamic scheduling, on the other hand, provides no such assurance of stability. In this section, we introduce an integration of schedule-driven control and backpressure that overcomes this deficiency and produces a stable schedule for a general queueing network.
3.4.1
Weighted Cumulative Delay
As mentioned earlier, a queueing network is considered to be stable if the queues do not tend to increase without bound. According to the backpressure algorithm, serving the set of non-conflicting links with the largest queue length establishes this property. Larger queue length implies higher priority. Similarly, in the schedule-driven approach discussed earlier, we can introduce weights into the delay computation as a way to prioritize jobs from different links. The jobs with larger weights are served first. The basic idea is to bias the scheduling search more toward stabilizing local queues (both at the local node and at its neighbor nodes) as the level of local congestion increases. To measure the level of congestion, we rely on queue-length information associated with different links. To provide a low complexity scheme for queue management, we propose to weight each job of a given link equally. The weighted delay associated with job n on link (i, j) can be expressed as
$$ d(n) = (\mathit{ast} - \mathit{arr}(n)) \cdot w_{ij}. \tag{3.4} $$
In the traffic signal control problem, the delay incurred by cluster c is thus rewritten as
d(c) = |c| · (ast − arr(c)) · w(p), (3.5)
where w(p) is the weight assigned to the phase p that cluster c belongs to. The important question then becomes: how to set the weights for competing phases.
3.4.2
Queue Management Using Mean Field Methods
In this section, we introduce a decentralized method for calculating the weights to assign to each phase of a given intersection, so that its queues are locally stabilized in coordination with the queues at neighboring intersections. In brief, we propose a special Ising model in which the phases of a given traffic signal are viewed as different states, and their respective queue lengths are its parameters. Then, the intractable marginal distribution of the phases in each intersection is approximated by appealing to mean field methods. We use the marginal distribution as the weight (priority) of each phase.
Weight in a Form of Probability Function
To stabilize queues in the network, the clusters associated with longer queues should be served with higher priority. Hence, the weight function representing priority should be a function of queue length. The goal is to find a proper function that can reflect such queue dynamics and is suitable for integration with schedule-driven traffic control. Let us assume that the sum of weights of all phases is equal to one,
$$ \sum_{p=1}^{P} w(Q_p) = 1, \tag{3.6} $$
where $Q_p$ is the queue length at the corresponding phase p and P is the number of phases for a single intersection.
A probability function is a reasonable choice as the phase of each intersection is viewed as a random variable. A larger weight for a specific phase implies a higher probability of that phase occurring. Furthermore, the probability function is a continuous function of queue length, matching the continuous variability of vehicular cluster size. In the following sections, we apply techniques from statistical physics and graphical models to derive this probability function.
Boltzmann Distribution for Traffic Signal
The green and red lights of a traffic signal can be viewed as two states of an Ising particle spin. Moreover, intersections in urban environments are interconnected and interact with each other. In statistical physics, a Boltzmann distribution is a probability distribution that assigns the probability that an interacting system is in a certain configuration of states, described by an energy function of the states. A standard form of the Boltzmann distribution with binary states and energy function E(·) is defined as follows:
$$ p(\sigma; \theta) = \frac{1}{Z(\theta)} e^{-\beta E(\sigma; \theta)}, \tag{3.7} $$
where $\sigma \in \{1, 0\}^n$ are the state variables, and $T = 1/\beta$ and $Z(\theta)$ are the temperature and the normalization constant, respectively. We assume that β is assimilated into the θ parameters. Consider a graph G = (V, E), where s ∈ V and t ∈ V are two adjacent nodes (intersections) in G. Then the energy function incorporating interaction takes the following form,
$$ E(\sigma; \theta) = -\sum_{(s,t) \in E} \theta_{st} \sigma_s \sigma_t - \sum_{s \in V} \theta_s \sigma_s, \tag{3.8} $$
where the first term defines the interaction between two intersections and the second term specifies the external field of each intersection. With this form of energy function, a Boltzmann distribution can also be expressed by the following general form of exponential family:
$$ p(\sigma; \theta) = \exp(\langle \sigma, \theta \rangle - A(\theta)), \tag{3.9} $$
where $\langle \cdot, \cdot \rangle$ is the Euclidean inner product of two vectors and $A(\theta) = \log Z(\theta)$. After obtaining the distribution, the marginal probability of phase p at intersection i is calculated, and we set the weight for phase schedule generation to this probability.
Queue-based Energy Function
In an interconnected queueing network, the strength of interaction $\theta_{st}$ and the external potential $\theta_s$ are related to queue length, which corresponds to different phases (states). Suppose that we have two phases: {1} represents the East-West phase, and {0} represents the North-South phase. Taking the East-West phase as an example, the external field $\theta_s > 0$ is the pressure that "pushes" vehicles along the East-West direction. The stronger the external field, the greater the tendency of the vehicles to keep moving East-West. $\theta_s < 0$ is analogous to a repulsive field that prevents vehicles from approaching further. Moreover, the queues of neighbor intersections contributing to the East-West phase are summed together and used to measure the interaction strength $\theta_{st}$. Specifically, $\theta_{st}$ is a measure of the repulsion or attraction faced by an intersection as it synchronizes with another intersection. The sign of $\theta_{st}$ corresponding to the North-South phase is opposite to that of the East-West phase, since the traffic flows in both directions compete with each other. The energy function of a two-phased signalized transportation network is given by
$$ E(\sigma; \theta) = -\sum_{(s,t) \in E} \theta_{st} \sigma_s \sigma_t - \sum_{s \in V} \theta_s \sigma_s = -\sum_{(s,t) \in E} L(Q_{t \to s}, Q_{s \to t}) \sigma_s \sigma_t - \sum_{s \in V} (Q_{s,h} - Q_{s,v}) \sigma_s, \tag{3.11} $$
where $L(Q_{t \to s}, Q_{s \to t})$ is the interaction strength and depends on which phase these queues are contributing to. It is defined as
$$ L(Q_{t \to s}, Q_{s \to t}) = \begin{cases} Q_{t \to s} - Q_{s \to t}, & \text{if } (s, t) \in E_h \\ Q_{s \to t} - Q_{t \to s}, & \text{if } (s, t) \in E_v, \end{cases} \tag{3.12} $$
where h and v denote the East-West and North-South directions. $E_h$ and $E_v$ are the sets of the East-West and North-South road segments, respectively. Specifically, the queues served during the East-West phase contribute to positive terms of the energy function, while those served during the North-South phase contribute to negative terms. The intersection of the signalized transportation network is depicted in Figure 3.2.
Calculation of the Weights
For a distribution associated with a complex graph, especially with loops that typify grid networks, it is intractable to perform probabilistic inference, e.g., to compute the exact marginal distribution of all random variables. The variational approach to probabilistic inference involves converting the inference problem into an optimization problem, approximating the feasible set, and solving the relaxed problem.
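To make the cost of exact inference concrete, the marginals of the Boltzmann distribution (3.7) with the energy (3.8) can be computed by brute-force enumeration, which visits all $2^n$ configurations. The sketch below assumes β = 1 (absorbed into θ, as above); the function name and argument layout are illustrative.

```python
import itertools
import math
from typing import Dict, List, Tuple


def exact_marginals(theta_s: List[float],
                    theta_st: Dict[Tuple[int, int], float]) -> List[float]:
    """Return P(sigma_s = 1) for every node by enumerating all 2^n states.

    theta_s[s] is the external field of node s; theta_st[(s, t)] is the
    interaction strength of edge (s, t), listed once per edge.
    """
    n = len(theta_s)
    z = 0.0
    marginals = [0.0] * n
    for sigma in itertools.product((0, 1), repeat=n):
        energy = -sum(th * sigma[s] * sigma[t] for (s, t), th in theta_st.items())
        energy -= sum(theta_s[s] * sigma[s] for s in range(n))
        weight = math.exp(-energy)          # unnormalized probability
        z += weight                         # accumulates Z(theta)
        for s in range(n):
            if sigma[s] == 1:
                marginals[s] += weight
    return [m / z for m in marginals]
```

For a single isolated intersection the result reduces to the sigmoid of its external field, but the loop body runs $2^n$ times, so even a 10 × 10 grid is far out of reach. This is the motivation for the variational relaxation.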
Figure 3.2: The two-phased signalized intersection associated with a two-state Ising model
A Boltzmann distribution is a member of the exponential family [61]. An appealing feature of the exponential family is that moments of the distribution are obtained from the derivatives of the log normalization function A(θ). For a given tractable subgraph F, mean field methods are based on optimizing over the subset of realizable mean parameters μ that can be obtained by the corresponding subset of exponential family distributions. With the subset $\mathcal{M}_F(G)$ of μ and the corresponding conjugate dual function $A_F^*(\mu)$, A(θ) can be computed by solving the following optimization problem
$$ A(\theta) = \max_{\mu \in \mathcal{M}_F(G)} \langle \mu, \theta \rangle - A_F^*(\mu) \tag{3.13} $$
and the resulting mean parameter is
In this chapter, the approximation is based on choosing product distribu-tion p(σ1, σ2, · · · , σn; θ) = Y s∈V p(σs; θ) (3.15)
as the tractable approximation. It is also referred to as the naive mean field approach. According to this approximation, the optimization problem is rewritten as A(θ) = max µ∈[0,1] X s∈V X t∈Ns L(Qt→s, Qs→t)µsµt +X s∈V (Qs,h− Qs,v)µs −X s∈V [µslog(µs) − (1 − µs) log(1 − µs)] (3.16)
Solving the problem yields a specific form of the mean parameter update µs ←{1 + exp[−(Qs,h− Qs,v) − X t∈Ns L(Qt→s, Qs→t)µt]}−1 =S Qs,h+ X t∈Ns,h (Qt→s− Qs→t)µt − Qs,v− X t∈Ns,v (Qt→s− Qs→t)µt, (3.17)
where $S(x) = \frac{1}{1 + \exp(-x)}$ is the sigmoid function, and $N_{s,h}$ and $N_{s,v}$ are the sets of neighbor intersections corresponding to the East-West and North-South phases. From (3.17), the two terms in the sigmoid function denote the effective queues for each phase, which are defined as
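$$\begin{aligned}
\hat{Q}_{s,h} &= Q_{s,h} + \sum_{t \in N_{s,h}} (Q_{t \to s} - Q_{s \to t})\, \mu_t \\
\hat{Q}_{s,v} &= Q_{s,v} + \sum_{t \in N_{s,v}} (Q_{t \to s} - Q_{s \to t})\, \mu_t
\end{aligned} \tag{3.18}$$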
With the effective queues, the marginal distribution is expressed concisely as
$$\begin{aligned}
P(\sigma_s = 1) &= S(\hat{Q}_{s,h} - \hat{Q}_{s,v}) \\
P(\sigma_s = 0) &= 1 - S(\hat{Q}_{s,h} - \hat{Q}_{s,v}).
\end{aligned} \tag{3.19}$$
Note that the marginal distribution is a Bernoulli distribution whose parameter depends only on the difference between the effective queues. In other words, the weight function used in scheduling is a function of the queue difference.
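The update (3.17) together with the effective queues (3.18) can be sketched as a simple fixed-point iteration. The following is a minimal illustration, not the implementation used in this thesis: the function names, the dictionary layout, and the two-intersection example values are all hypothetical, and the sweep order and the number of sweeps are arbitrary choices made for the sketch.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function S(x) = 1 / (1 + exp(-x))."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def mean_field_sweep(mu, q_h, q_v, flows_h, flows_v):
    """One in-place sweep of the update (3.17) over all intersections.

    mu       -- dict: intersection s -> current estimate of P(sigma_s = 1)
    q_h, q_v -- dict: intersection s -> local East-West / North-South queue
    flows_h, flows_v -- dict: s -> {neighbor t: (Q_{t->s}, Q_{s->t})} for the
                        neighbors on the East-West / North-South phase
    (This data layout is an assumption made for the sketch.)
    """
    for s in mu:
        # Effective queues of (3.18)
        eff_h = q_h[s] + sum((q_ts - q_st) * mu[t]
                             for t, (q_ts, q_st) in flows_h[s].items())
        eff_v = q_v[s] + sum((q_ts - q_st) * mu[t]
                             for t, (q_ts, q_st) in flows_v[s].items())
        # Marginal of (3.19): depends only on the effective queue difference
        mu[s] = sigmoid(eff_h - eff_v)
    return mu


# Hypothetical example: intersections A and B joined by an East-West link.
mu = {"A": 0.5, "B": 0.5}
q_h = {"A": 3.0, "B": 2.0}
q_v = {"A": 1.0, "B": 2.0}
flows_h = {"A": {"B": (1.0, 0.5)}, "B": {"A": (0.5, 1.0)}}
flows_v = {"A": {}, "B": {}}
for _ in range(20):
    mean_field_sweep(mu, q_h, q_v, flows_h, flows_v)
```

In this example the East-West queue dominates at A, so `mu["A"]` settles above 0.5, while the net outflow from B toward A pulls `mu["B"]` slightly below 0.5.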
3.4.3 Weight Functions
In this section, we provide an example of weight functions. To be suitable for combinatorial scheduling, the weights should have the following characteristics: (a) when the queues are empty or balanced, the corresponding links should be served with equal priorities (combinatorial scheduling should dominate performance in this case); and (b) the influence of shorter queues should be reduced in case jobs with larger dli arrive and incur larger cumulative delay. A probability function is a reasonable choice when the queue activation choice is viewed as a multinomial random variable. For instance, a node needs to pick a link to serve from $k$ links, with corresponding probabilities $p_1, \dots, p_k$ and $\sum_{i=1}^{k} p_i = 1$. If a link has a larger weight, it should be served with a higher probability. Furthermore, the probability is a continuous function with parameters that match the variability in processing times.
We propose a softmax function of queue length as the weight function. The softmax formula can be derived analytically, based on a graphical model and statistical physics, if queue lengths are taken as the parameters of the graphical model [24]. Assume that node $i$ has a set of queues $\{Q_{si} \mid s \in N_i^{in}, (s, i) \in E\}$, where $N_i^{in}$ are the neighbors of node $i$ with a directed link $(s, i)$. The weights are
$$w_{si} = \frac{\exp(Q_{si})}{\sum_{j \in N_i^{in}} \exp(Q_{ji})}, \tag{3.20}$$
$$\sum_{s \in N_i^{in}} w_{si} = 1. \tag{3.21}$$
For example, node C in Figure 3.1 has two queues $Q_{AC}$ and $Q_{BC}$. The weight $w_{AC}$ is $\frac{\exp(Q_{AC})}{\exp(Q_{AC}) + \exp(Q_{BC})}$. The softmax function fulfills the two characteristics we propose for the weight functions. However, it takes only the individual node into consideration when calculating the weight; it does not include the effect of the node's upstream and downstream neighbors. Hence, we propose a modification of the weight function that incorporates non-local observations.
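As a small illustration of the local softmax weights (3.20), the sketch below computes the weights for a node with two incoming queues, as in the node C example. The function name and the queue values are hypothetical; subtracting the maximum queue length before exponentiating is a standard numerical safeguard that leaves the resulting weights unchanged.

```python
import math


def softmax_weights(queues):
    """Softmax weights of (3.20).

    queues -- dict: upstream node s -> queue length Q_si on link (s, i);
              must be non-empty.
    Returns a dict: upstream node s -> weight w_si, summing to 1 as in (3.21).
    """
    q_max = max(queues.values())  # shift for numerical stability
    exps = {s: math.exp(q - q_max) for s, q in queues.items()}
    total = sum(exps.values())
    return {s: e / total for s, e in exps.items()}


# Hypothetical queue lengths for the two links entering node C.
weights = softmax_weights({"A": 3.0, "B": 1.0})
balanced = softmax_weights({"A": 2.0, "B": 2.0})
```

With balanced queues both links receive weight 0.5, which matches characteristic (a); with unbalanced queues the longer queue receives the larger weight, and the influence of the shorter queue shrinks exponentially in the queue difference.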
Equation (3.20) applies only local queue-length information to weight the incoming jobs. We can use non-local observations of the queue lengths of neighboring intersections to extend the prediction of queue length and further improve the stability (3.2). In addition to the upstream neighbors $N_i^{in}$, we denote the downstream neighbors as $N_i^{out}$. Then, we define the effective queue length as
$$\hat{Q}_{si} = Q_{si} + \sum_{h \in N_s^{in}} Q_{hs}\, w_{hs}\, \eta^{s}_{hi} - \sum_{k \in N_i^{out}} Q_{ik}\, w_{ik}\, \eta^{i}_{sk}, \tag{3.22}$$
where $\eta^{s}_{hi}$ is the proportion of jobs that are routed from node $h$ to node $i$ through $s$. The second term can be viewed as the pressure that "pushes" jobs along in the $(s, i)$ direction. The larger the value, the greater the tendency for jobs to keep moving. The third term is analogous to a repulsive force that prevents jobs from approaching further.
The weights for a grid network are a special case and can be derived from a probabilistic graphical model by applying the naive mean field method. Note that the weights can be computed in a decentralized way. We assume that each node knows its neighbor nodes and is able to communicate with them. First, the scheduling agent collects its local queue-length information. Once the queue-length information and the calculated weights are received from neighbor intersections, the agent computes its own weights and applies them as job weights for generating the schedule. In the following sections, softpressure is realized as this special case:
$$w_{si} = \frac{\exp(\hat{Q}_{si})}{\sum_{j \in N_i^{in}} \exp(\hat{Q}_{ji})}. \tag{3.23}$$
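The decentralized computation described above can be sketched for a single agent as follows. This is only an illustration under stated assumptions: the function names, the tuple layout for the values received from neighbors, and all numeric values are hypothetical, and the message exchange itself is not modeled.

```python
import math


def effective_queue(q_si, upstream_terms, downstream_terms):
    """Effective queue length of (3.22) for one incoming link (s, i).

    q_si             -- local queue length Q_si
    upstream_terms   -- list of (Q_hs, w_hs, eta) for h in N_s^in, where eta is
                        the proportion of jobs routed from h to i through s
    downstream_terms -- list of (Q_ik, w_ik, eta) for k in N_i^out, where eta is
                        the proportion of jobs routed from s to k through i
    (The tuple layout is an assumption made for this sketch.)
    """
    push = sum(q * w * eta for q, w, eta in upstream_terms)     # second term
    block = sum(q * w * eta for q, w, eta in downstream_terms)  # third term
    return q_si + push - block


def softpressure_weights(effective_queues):
    """Softmax over effective queue lengths, as in (3.23)."""
    q_max = max(effective_queues.values())  # shift for numerical stability
    exps = {s: math.exp(q - q_max) for s, q in effective_queues.items()}
    total = sum(exps.values())
    return {s: e / total for s, e in exps.items()}


# Hypothetical node with incoming links from A and B and equal local queues.
# Link A has upstream demand pushing toward it; both links share the same
# downstream congestion.
q_hat = {
    "A": effective_queue(3.0, [(4.0, 1.0, 0.5)], [(2.0, 1.0, 0.5)]),
    "B": effective_queue(3.0, [], [(2.0, 1.0, 0.5)]),
}
weights = softpressure_weights(q_hat)
```

Although the two local queues are equal, the upstream pressure on link A raises its effective queue (4.0 versus 2.0), so link A receives the larger weight. The local softmax (3.20) would have treated the two links identically.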
3.4.4 Theoretical Guarantees of Stability
In this section, we prove that by applying this weight function to jobs, an upper bound on the expected queue length is achieved. According to Little’s law (queue length is equal to arrival rate multiplied by waiting time) [35], the delay is bounded as well. Furthermore, scheduling is able to provide a tighter bound than simply applying backpressure. We state the following theorem of our algorithm with the weight function (3.20).
Theorem 1. Consider a network with $N$ queues and arrival rates $\lambda = (\lambda_1, \dots, \lambda_N)$. Under the proposed softpressure algorithm, the expected queue length is bounded by
$$\limsup_{t} E\Big[ \sum_{i=1}^{N} Q_i(t) \Big] \le \frac{N^2}{2\epsilon} \tag{3.24}$$
if, for all queues, the arrival rates satisfy $\lambda \le \sum_{j=1}^{K} \alpha_j s_j$ with $\sum_{j=1}^{K} \alpha_j = 1 - \epsilon$, $\epsilon > 0$, and activation vectors $s_j \in S$, $|S| = K$.
Proof. Let the queue dynamics follow $Q_i(t+1) = Q_i(t) - s_i(t)\mathbb{1}_{Q_i(t)>0} + a_i(t) = Q_i(t) + \Delta_i(t)$, where $s_i(t)$ and $a_i(t)$ are the service and arrival rates. Define the Lyapunov function $L(Q(t)) = \sum_i Q_i^2(t)$ and use Lyapunov-Foster theory to write down the expected drift
$$E[L(Q(t+1)) - L(Q(t)) \mid Q(t)] = 2 \sum_{i=1}^{N} E[Q_i(t) \cdot \Delta_i(t) \mid Q_i(t)] + \sum_{i=1}^{N} E[\Delta_i^2(t) \mid Q_i(t)]$$
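The drift expansion rests on the per-queue identity $(Q + \Delta)^2 - Q^2 = 2Q\Delta + \Delta^2$, which holds before any expectation is taken. The short sketch below checks this numerically for one step of the queue dynamics; the function names and the integer queue, service, and arrival values are hypothetical and chosen only for illustration.

```python
def step(queues, service, arrivals):
    """One step of Q_i(t+1) = Q_i(t) - s_i(t) * 1[Q_i(t) > 0] + a_i(t).

    Returns the next queue vector and the increments Delta_i(t).
    """
    delta = [a - (s if q > 0 else 0)
             for q, s, a in zip(queues, service, arrivals)]
    nxt = [q + d for q, d in zip(queues, delta)]
    return nxt, delta


def lyapunov(queues):
    """Quadratic Lyapunov function L(Q) = sum_i Q_i^2."""
    return sum(q * q for q in queues)


# Hypothetical three-queue example; the second queue is empty, so its
# service has no effect in this step.
q = [3, 0, 5]
s = [1, 1, 0]
a = [0, 2, 1]
q_next, delta = step(q, s, a)

drift = lyapunov(q_next) - lyapunov(q)
expanded = (2 * sum(qi * di for qi, di in zip(q, delta))
            + sum(di * di for di in delta))
```

Here `drift` and `expanded` agree, which is the deterministic counterpart of the expected drift expression: the first sum pairs each increment with the queue length at time $t$, and the second sum collects the squared increments.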
Since the larger weight causes jobs to be serviced with higher priority until the queue is cleared, softpressure establishes the same drift criterion as backpressure:
$$E[L(Q(t+1)) \mid Q(t)] \le L(Q(t)) - \frac{2\epsilon}{N} \sum_{i=1}^{N} Q_i(t) + N.$$
The corresponding Lyapunov moment bound is
$$\limsup_{t} E\Big[ \sum_{i=1}^{N} Q_i(t) \Big] \le \frac{N^2}{2\epsilon}.$$
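The step from the drift inequality to the moment bound can be sketched with the standard telescoping argument. The sketch is written for the time-averaged queue length and assumes $E[L(Q(0))] < \infty$; $\epsilon$ denotes the slack in the arrival-rate condition of Theorem 1, and the horizon $T$ is introduced only for this sketch.

```latex
% Take expectations of the drift inequality and sum over t = 0, ..., T-1:
\begin{align*}
E[L(Q(T))] - E[L(Q(0))]
  &\le -\frac{2\epsilon}{N} \sum_{t=0}^{T-1} E\Big[ \sum_{i=1}^{N} Q_i(t) \Big] + NT .
\end{align*}
% Use L(Q(T)) >= 0, rearrange, and multiply by N / (2 \epsilon T):
\begin{align*}
\frac{1}{T} \sum_{t=0}^{T-1} E\Big[ \sum_{i=1}^{N} Q_i(t) \Big]
  &\le \frac{N^2}{2\epsilon} + \frac{N \, E[L(Q(0))]}{2\epsilon T} .
\end{align*}
% Letting T -> infinity removes the initial-condition term,
% leaving the bound N^2 / (2 \epsilon).
```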
Furthermore, as the queues become balanced or empty, the schedule-driven approach improves the service rate further through combinatorial optimization, which means that $\epsilon_S \ge \epsilon_B$, where $\epsilon_S$ and $\epsilon_B$ represent the difference between the service rate and the arrival rate under softpressure and backpressure, respectively.
3.5 Experimental Results
To evaluate our approach, we simulate performance on a real-world network with two-way, multi-lane, and multi-directional traffic flow. The network model is based on the Baum-Centre neighborhood of Pittsburgh, Pennsylvania, as shown in Figure 3.3. The network consists mainly of two-phase intersections and can be seen as a two-way queueing grid network. All simulation runs were carried out according to a realistic traffic pattern from late afternoon through the "PM rush" (4-6 PM). The traffic pattern ramps up volumes over the simulation interval as follows: 0-30 min: 236 cars/hr; 30 min-1 hr: 354 cars/hr; 1-2 hr: 528 cars/hr. This simulation model presents a complex practical application for verifying the effectiveness of the proposed approach.
The simulation model was developed in VISSIM, a commercial microscopic traffic simulation software package. To assess the performance boost provided by our composite softpressure approach, we measure the average waiting time of all vehicles over 5 runs and take the performance of the original schedule-driven traffic control system [67, 68] as our baseline system.