in Mobile Edge Cloud-Networks
via Efficient Resource Allocation
and Optimization
Mike Jia
August 2019
A thesis submitted for the degree of Doctor of Philosophy
This thesis is a presentation of the original work except where otherwise stated. I
completed this work jointly with my supervisor, Professor Weifa Liang. My
contribution to the work is around 80%.
Mike Jia
28 August 2019
This thesis could not have been completed without the help of many people.
I would like to express my sincerest gratitude towards my supervisor, Professor
Weifa Liang for his unwavering support throughout my PhD. He has been a constant
source of wisdom and encouragement, especially during times of doubt and
difficulty. He has put great effort into training me in the scientific field, and under his
expert guidance, I have become a qualified researcher. I will forever be grateful to
him.
I would also like to thank the other members of my supervisor panel, Professor
Brendan McKay, and Professor Song Guo, for their brilliant ideas, and professional
guidance. This thesis would not be possible without them.
The staff at the Research School of Computer Science also deserve my deepest
thanks for their generous help and support for my research. Trina Merrell and Janette
Rawlinson were particularly helpful and deserve to be especially appreciated.
I am grateful to my friends, Meitian Huang, Yu Ma, Yang Liu, Haotian Chang,
Jing Li, Kiki Wang etc. for their kindness and company throughout my study,
especially my best friend Kiki who has been a well-spring of optimism and
understanding.
Finally, I want to express my profound gratitude towards my parents who have
given me their utmost support my entire life. Without their continuous love and
encouragement, this thesis could not have been completed.
Journal
1. Jia, M.; Liang, W.; Xu, Z.; Huang, M.; and Ma, Y., 2019. QoS-aware cloudlet
load balancing in wireless metropolitan area networks. To appear in IEEE Transactions on Cloud Computing, (2019)
2. Jia, M.; Liang, W.; Huang, M.; Xu, Z.; and Ma, Y., 2019. Routing cost
minimization and throughput maximization of NFV-enabled unicasting in
software-defined networks. To appear in IEEE Transactions on Network and Service Management, (2019)
3. Jia, M.; Cao, J.; and Liang, W., 2017. Optimal cloudlet placement and user to
cloudlet allocation in wireless metropolitan area networks. IEEE Transactions on Cloud Computing, 5, 4 (2017), 725–737
4. Xu, Z.; Liang, W.; Xu, W.; Jia, M.; and Guo, S., 2016. Efficient algorithms for
capacitated cloudlet placements. IEEE Transactions on Parallel and Distributed Systems, 27, 10 (2016), 2866–2880
5. Xu, Z.; Liang, W.; Jia, M.; Huang, M.; and Mao, G., 2019. Task offloading with
network function requirements in a mobile edge-cloud network. To appear in IEEE Transactions on Mobile Computing, (2019)
6. Xu, Z.; Liang, W.; Huang, M.; Jia, M.; Guo, S.; and Galis, A., 2019. Efficient
NFV-enabled multicasting in SDNs. To appear in IEEE Transactions on Communications, (2019)
Conference
1. Jia, M.; Liang, W.; Xu, Z.; and Huang, M., 2016. Cloudlet load balancing in
wireless metropolitan area networks. In Computer Communications, IEEE INFOCOM 2016-The 35th Annual IEEE International Conference on, 1–9. IEEE
2. Jia, M.; Liang, W.; Huang, M.; Xu, Z.; and Ma, Y., 2017. Throughput
maximization of NFV-enabled unicasting in software-defined networks. In GLOBECOM 2017-2017 IEEE Global Communications Conference, 1–6. IEEE
3. Jia, M.; Liang, W.; and Xu, Z., 2017. QoS-aware task offloading in distributed
cloudlets with virtual network function services. In Proceedings of the 20th ACM International Conference on Modelling, Analysis and Simulation of Wireless and Mobile Systems, 109–116. ACM
4. Jia, M. and Liang, W., 2018. Delay-sensitive multiplayer augmented reality
game planning in mobile edge computing. In Proceedings of the 21st ACM International Conference on Modeling, Analysis and Simulation of Wireless and Mobile Systems, 147–154. ACM
5. Xu, Z.; Liang, W.; Xu, W.; Jia, M.; and Guo, S., 2015. Capacitated cloudlet
placements in wireless metropolitan area networks. In Local Computer Networks (LCN), 2015 IEEE 40th Conference on, 570–578. IEEE
6. Huang, M.; Liang, W.; Xu, Z.; Jia, M.; and Guo, S., 2016. Throughput
maximization in software-defined networks with consolidated middleboxes. In Local Computer Networks (LCN), 2016 IEEE 41st Conference on, 298–306. IEEE
7. Xu, Z.; Liang, W.; Huang, M.; Jia, M.; Guo, S.; and Galis, A., 2017.
Approximation and online algorithms for NFV-enabled multicasting in SDNs. In
Thanks to advances in wireless communication and mobile computing, the last decade
has seen an explosion of new innovative services on smartphone devices in areas as
diverse as transportation, mobile payment and social media. The ubiquity of mobile
smart devices and their constant presence in everyday life has generated
unprecedented data traffic between end users and the remote cloud. To prepare for the
increasing data traffic in the coming years and the demand for low-latency
computation resources near the user, network service providers are increasingly turning to
Mobile Edge Computing to bring cloud computing capabilities to the edge of the
network.
Mobile Edge Computing (MEC) is a recent network paradigm and is conceived
as consisting of three layers: (1) a layer of users, (2) a layer of small-scale data centers
called cloudlets situated at the network edge that inter-connect to form the edge-cloud,
and (3) geographically distributed data-centers that form the remote cloud with vast
resources in remote locations. Users at the edge of the network offload computation
tasks to the edge-cloud instead of remote clouds, thereby decreasing the response
time for offloaded tasks and reducing congestion in the back-haul network.
In this thesis, we will focus on the provisioning of Delay-Aware Services in MEC
networks by efficiently utilizing various MEC resources to reduce the latency of user
offloaded tasks in different application scenarios, while meeting ever-growing user
demands.
We firstly address how to balance the workload among cloudlets in the
edge-cloud with the aim to minimize the maximum response time of all offloaded tasks.
We propose two algorithms for the problem: one is a fast heuristic, and another is a
distributed genetic algorithm that is capable of delivering a more accurate solution
compared to the heuristic, but at the expense of a longer running time.
We then study policy-aware unicast request admissions with and without
end-to-end delay constraints in a Software Defined Network (SDN). We develop efficient
algorithms for the admission of a single request with and without the end-to-end
delay constraint, and online algorithms with a guaranteed performance for the
dynamic admission of requests without the knowledge of future arrivals. In particular,
we provide the very first online algorithm with a provable competitive ratio for the
problem without the end-to-end delay requirement.
We thirdly investigate the deployment of virtualized network functions among
cloudlets to serve end-users, while meeting the resource demands of mobile users
and their Quality-of-Service (QoS) requirements. We devise an efficient algorithm
for the problem by utilizing VNF instance sharing and cost-effective creation of new
VNF instances, and develop an effective prediction mechanism to predict idle VNF
instance releases and new VNF instance creations for further cost savings over time.
We fourthly envision a scenario in the near future where players wearing AR
heads-up display devices engage with other players over a large area with densely
deployed cloudlets. We propose a novel system model and formulate the
Decentralized Multiplayer Coordination (DMC) Problem with the aim of minimizing the game
frame duration of all players. We then devise an efficient algorithm for the problem.
Finally we conduct extensive experiments to evaluate the effectiveness of each
proposed algorithm, and investigate the impact of various algorithm parameters and
environmental settings. Experimental results show that the proposed algorithms are
1 Introduction
1.1 Mobile Edge Computing
1.1.1 Cloudlets and the Edge-Cloud
1.1.2 Data and Computation Transfer to the Edge-Cloud
1.1.3 The Remote Cloud
1.1.4 Data and Computation Transfer to the Remote Cloud
1.2 Supporting Network Services in MEC
1.3 Augmented Reality: The killer MEC Application
1.4 Research Topics
1.4.1 QoS-Aware Task Load Balancing in the Edge-Cloud
1.4.2 Routing Cost and Throughput Optimization of Requests in the Remote Cloud
1.4.3 QoS-Aware Virtual Network Service Deployment in the Edge-Cloud
1.4.4 Multiplayer Augmented Reality Game Planning in Mobile Edge Computing
1.5 Thesis Contributions
1.6 Thesis Organization
2 QoS-Aware Task Load Balancing in the Edge-Cloud
2.1 Introduction
2.2 Related Works
2.3 Preliminaries
2.3.1 System Model
2.3.2 Problem Definition
2.4 Heuristic Algorithm
2.4.1 Balancing Task Response Time
2.4.2 Minimum-latency Flow
2.5 Distributed Genetic Algorithm
2.5.1 Genetic Algorithm Operations
2.5.2 Distributed Algorithm
2.6 Performance Evaluation
2.6.1 Simulation Environments
2.6.2 Performance Evaluation of Different Algorithms
2.6.3 Impact of Important Parameters on the Performance of Algorithms
2.7 Summary
3 Routing Cost and Throughput Optimization of Requests in the Remote Cloud
3.1 Introduction
3.2 Related Works
3.3 Preliminaries
3.3.1 System model
3.3.2 Problem definitions
3.4 A Generic Optimization Framework
3.4.1 Overview
3.4.2 The construction of the auxiliary directed acyclic graph Hk for request ρk
3.4.3 Operational cost and transmission delay models
3.5 Algorithms for Delay-Aware NFV-enabled Unicasting Problem
3.5.1 Optimal algorithm without the end-to-end delay constraint
3.5.2 Heuristic algorithm with the end-to-end delay constraint
3.6 Online Algorithm for Dynamic Admissions of NFV-enabled Requests Routing
3.6.2 Online algorithm for the online delay-aware NFV-enabled unicasting problem
3.7 Performance Evaluation
3.7.1 Experimental environmental setting
3.7.2 Performance evaluation of the proposed algorithms for a single request
3.7.3 Performance evaluation of the proposed online algorithms
3.7.4 Impact of different parameters
3.8 Summary
4 QoS-Aware Virtual Network Service Deployment in the Edge-Cloud
4.1 Introduction
4.2 Related Works
4.3 Preliminaries
4.3.1 System model
4.3.2 End-to-end delay of offloading requests
4.3.3 The admission cost
4.3.4 Problem definition
4.4 Online Algorithm
4.4.1 Algorithm for offloading requests at each time slot
4.4.2 Online algorithm for the minimum operational cost problem
4.4.3 Algorithm complexity analysis
4.5 Performance Evaluation
4.5.1 Experimental settings
4.5.2 Algorithm performance within a single time slot
4.5.3 Performance evaluation of the proposed online algorithm
4.6 Summary
5 Multiplayer Augmented Reality Game Planning in Mobile Edge Computing
5.1 Introduction
5.2 Related Works
5.3.1 Overview
5.3.2 Optimization Objective
5.3.3 Problem Definition
5.4 Algorithm
5.5 Performance Evaluation
5.6 Summary
6 Conclusion and Future Directions
6.1 Summary of Contributions
6.2 Future Directions
1.1 Mobile Edge Computing Architecture
1.2 Example of an Augmented Reality frame. First, a snapshot of the user view is captured and sent to a nearby MEC cloudlet. Digital elements are then rendered on the cloudlet and transferred back to the user device.
2.1 Cloudlets in the edge-cloud collocated with micro-base station access points (APs)
2.2 The flow of tasks from cloudlet i to cloudlet j
2.3 Finding optimal outgoing and incoming workload for each cloudlet in the edge-cloud
2.4 Auxiliary flow graph is generated to find minimum-latency flow
2.5 Interactions between the supervisor and islands
2.6 Impact of network conditions on performance of algorithms Heuristic and Distributed
2.7 Impact of network size K on the performance of algorithms Heuristic and Distributed
2.8 The impact of important parameters on the performance of algorithm Heuristic
2.9 The impact of important parameters on the performance of algorithm Distributed
3.1 A software-defined network G with a set V = {v1, v2, v3, v4, v5, v6} of SDN-enabled switch nodes and a subset VS = {v1, v4, v5, v6} (VS ⊆ V) of switch nodes attached with data centers.
3.2 A constructed auxiliary graph Hk, where V1, . . . , Vl represent the sets of candidate nodes for each service layer in the service chain
3.3 The performance of algorithms UNICAST, UNICAST_DELAY, ONE_DC, and ONE_DC_DELAY.
3.4 The performance of online algorithms ONLINE, ONLINE_DELAY, and LINEAR.
3.5 The impact of lmax on the performance of algorithms UNICAST, UNICAST_DELAY, ONE_DC and ONE_DC_DELAY.
3.6 The number of requests admitted by online algorithms ONLINE and ONLINE_DELAY with different values of β.
3.7 The number of requests admitted by online algorithms ONLINE and ONLINE_DELAY with and without the admission control threshold σ.
4.1 Performance of algorithms ALG and HRF, by varying the number of requests from 50 to 250, while the number of cloudlets in the network is 10.
4.2 Performance of algorithms ALG and HRF by varying the number of cloudlets in the network from 5 to 25.
4.3 Performance of algorithms ALG and HRF by varying the network size from 50 to 800.
4.4 Performance of algorithms ALG and HRF when the maximum delay requirement of a request is varied from 0.04 to 0.12.
4.5 Performance of algorithms ALG and HRF in a time horizon of 100 time slots.
4.6 (a) - (b) show the performance of algorithms ALG and HRF by varying the number of cloudlets from 5 to 25 for a time horizon of 100 time slots. (c) - (d) show the performance of algorithms ALG and HRF by varying the network size from 50 to 800 for a time horizon of 100 time slots.
4.7 Performance of algorithms ALG and HRF by varying the maximum end-to-end delay requirement of a request from 0.04 to 0.12 for a time horizon of 100 time slots.
4.8 Performance of algorithms ALG and HRF for a time horizon of 100 time slots, by varying the idle cost threshold and the creation cost threshold from 100 to 10,000.
5.1 Partitioning players into groups with overlapping AOIs
5.2 An illustrated overview of a game frame
5.3 The performance of the proposed algorithm and the benchmark.
Introduction
Thanks to advances in wireless communication and mobile computing, the last decade
has seen an explosion of new innovative services on smartphone devices in areas
as diverse as transportation, mobile payment and social media. At the same time,
there has also been a rapid adoption of sensors and wearable devices that make
up an emerging Internet of Things (IoT), with Cisco predicting that over 50 billion
of these devices will be added to the internet by 2020 [27]. As the computing
capacity of mobile and IoT devices is limited due to their portable size, they often
rely on the abundant storage and computation resources on the remote cloud to
process offloaded data and computation tasks. As sensor feedback and mobile
applications become more real-time and interactive, there exists an increasing demand for
low-latency response time when offloading data processing and computation tasks.
However, the geographical distance between end-users and the remote cloud can
result in lengthy delays of up to a second, which is unacceptable for some interactive
applications and delay-sensitive sensors. Furthermore, the ubiquity of mobile smart
devices and their constant presence in everyday life has generated unprecedented
data traffic between end users and the remote cloud. As data exchange between
end-users and the remote cloud continues to rapidly increase, relying solely on cloud
computing can strain mobile network resources, and overwhelm the back-haul
network of mobile service providers [19]. To prepare for the increasing data traffic in
the coming years and the demand for low-latency computation resources near the
user, planners of the next generation of cellular network technology are increasingly
turning to Mobile Edge Computing to bring cloud computing capabilities to the edge
of the network.
1.1 Mobile Edge Computing
Mobile Edge Computing (MEC) is a network paradigm that has recently emerged as
a potential solution to the problem of providing a low latency computing
environment for mobile users. By densely deploying clusters of computers called cloudlets
collocated with micro-base stations in urban areas [3, 82], MEC pushes cloud
computing capabilities to the edge of the network, thus providing a reliable low latency
computing environment for mobile users. MEC is typically conceived as consisting
of three layers as seen in Fig. 1.1: (1) a layer of users, (2) a layer of small-scale data
centers called cloudlets situated at the network edge that inter-connect to form the
edge-cloud, and (3) geographically distributed data centers that form the remote cloud
with vast resources in remote locations. Users at the edge of the network offload
computation tasks to the layer of cloudlets instead of the remote cloud, thereby
decreasing the response time for offloaded tasks and reducing congestion in the back-haul
network. Relevant user information or surplus computation tasks can be transferred
from the cloudlet layer to the remote cloud layer for storage and processing.
1.1.1 Cloudlets and the Edge-Cloud
A cloudlet is a trusted, resource-rich cluster of computers wirelessly connected to its
nearby mobile users [81]. Mobile devices are resource constrained due to their
portability and can struggle with applications that have heavy computation demands, for
example real time video games with high fidelity graphics. By offloading part of
the application to the cloudlet for execution, the user can take advantage of the low
latency computing resources available on the cloudlet and enjoy a better game
experience.
To provide seamless support for mobile users on the go, cloudlets must be
constantly accessible to users while outside. Studies have shown how cloudlets can be
deployed in public wireless metropolitan area networks (WMANs) [43, 102, 103] as a
complementary service to Wi-Fi Internet access, or together with micro-base stations
accessible through mobile cellular networks [3, 50].
Figure 1.1: Mobile Edge Computing Architecture
[80], however a growing body of work has demonstrated the feasibility of managing
a network of cloudlets, and the clear benefits of cloudlet load balancing [48, 49, 76,
86, 92, 105]. By linking cloudlets together, either wirelessly or via wired connections,
they form an Edge-Cloud Network (ECN). Once tasks are uploaded to a cloudlet
within the network, they can be migrated to a different cloudlet for execution in
the case where the former cloudlet has a large workload. Load balancing within
a network of cloudlets is especially important in dense urban environments where
user demand can be particularly heavy and fluctuates over time, and designing a load
balancing algorithm in such an environment while avoiding network congestion is
thus an important challenge.
1.1.2 Data and Computation Transfer to the Edge-Cloud
The mechanism for offloading data processing and computation varies across
different studies, but generally, researchers follow a client-server model. The mobile user
first establishes a wireless connection with a nearby cloudlet, and the task is
encapsulated in a light-weight virtual machine (VM) [18, 20, 35, 57]. This VM capsule is then
uploaded to the cloudlet for execution, and once the task on the VM in the cloudlet
To ensure a high Quality of Service (QoS) for the user, the response time of a task,
that is the time taken for an offloaded task to be remotely executed and returned to
the device, has to be minimized. A major difficulty in optimizing this objective is the
limited wireless bandwidth available between the user and the cloudlet, especially
in the case where the data output of the task is especially heavy. A trade-off must
be made between the gains in computing resources on the cloudlet and the data
transfer delay, when deciding which tasks to offload to the cloudlet. This is further
complicated by the fact that some tasks will have dependencies on other tasks. To
tackle this problem, a common approach is to conceptualize the application as a
graph of task dependencies each with data inputs and outputs. Assuming that some
subset of the tasks in the application task graph can be remotely executed [44, 52,
106], a carefully designed algorithm can take a fine grained approach by offloading
individual tasks within an application to the cloudlet for execution while minimizing
the transfer of data between the mobile device and the cloudlet.
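As a minimal sketch of this task-graph view (not the method of any cited work), the toy example below places each offloadable task of a four-task application either on the device or on the cloudlet. The task names, execution times, data sizes, and bandwidth are all hypothetical, tasks are assumed to run one after another, and brute-force enumeration stands in for a carefully designed algorithm.

```python
from itertools import product

# Hypothetical task graph: name -> (local_time_s, cloudlet_time_s, offloadable)
TASKS = {
    "capture": (0.01, 0.01, False),  # assumed to need the device camera
    "detect": (0.80, 0.08, True),
    "match": (0.50, 0.05, True),
    "display": (0.02, 0.02, False),  # assumed to need the device screen
}
# Hypothetical dependencies (producer, consumer) -> data passed, in megabits
EDGES = {
    ("capture", "detect"): 4.0,
    ("detect", "match"): 0.5,
    ("match", "display"): 0.1,
}
BANDWIDTH_MBPS = 20.0  # assumed wireless bandwidth between device and cloudlet


def completion_time(placement):
    """Total time when tasks run sequentially under the given placement.

    placement maps each task name to "device" or "cloudlet".
    """
    compute = sum(
        TASKS[t][0] if placement[t] == "device" else TASKS[t][1] for t in TASKS
    )
    # Data crosses the wireless link only when an edge joins the two sides.
    transfer = sum(
        mbits / BANDWIDTH_MBPS
        for (u, v), mbits in EDGES.items()
        if placement[u] != placement[v]
    )
    return compute + transfer


def best_placement():
    """Enumerate every placement of the offloadable tasks; keep the fastest."""
    movable = [t for t, (_, _, ok) in TASKS.items() if ok]
    best = None
    for choice in product(("device", "cloudlet"), repeat=len(movable)):
        placement = {t: "device" for t in TASKS}
        placement.update(zip(movable, choice))
        cost = completion_time(placement)
        if best is None or cost < best[0]:
            best = (cost, placement)
    return best


if __name__ == "__main__":
    cost, placement = best_placement()
    print(round(cost, 3), placement)
```

With these assumed numbers, offloading both "detect" and "match" gives the shortest completion time, because the intermediate data between them never crosses the wireless link. Enumeration is exponential in the number of offloadable tasks, so it is only suitable for illustrating the trade-off.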
Many applications that involve machine learning techniques easily lend
themselves to this model; in particular, mobile task offloading techniques have been
studied in the context of facial recognition applications [9, 85]. However, not all
applications with heavy computation demands fit into this mold, and developing an
application with offloadable components can be burdensome for developers. To overcome
this, a recent work [108] has proposed a method to parse Android applications and
automatically classify methods that can be offloaded. In a related study [109], the
authors further present a tool for optimizing mobile computation offloading in
Android applications.
Another challenge to optimizing task offloading in an application is how to
accurately predict the response time of an offloaded task. Calculating the exact
response time of each task offloaded on to a cloudlet is highly complex, especially
in a system where computation resources are shared by other tasks. While
calculating the precise response time for each specific task is infeasible, the average task
response times can be accurately estimated using queueing theory [54], thus many
studies [17, 28, 42, 43, 62, 69, 84] have presented system models that rely on queueing
To control the complex offloading decision, most studies also assume that a task
offloading manager operates in the background on the mobile device, monitoring
network performance, predicting computation requirements of mobile applications,
and estimating execution times on both local devices and the cloud [20, 23]. Using
this information, the task offloading manager can coordinate with cloudlets to decide
which tasks to offload.
Studies so far have mainly focused on optimizing either the throughput of the
application task graph or the total delay under a single-user, single-cloudlet scenario.
However, when considering a more realistic edge-cloud scenario where a network of
cloudlets share the workload from multitudes of users competing for cloudlet
resources, it becomes a challenge to balance the needs of all the users. An obvious
approach is to optimize the average task response time; however, this can be an
inadequate solution as the algorithm may choose to deliberately disadvantage some
users in order to provide other users with resources to improve the global average.
To achieve a more egalitarian solution, it is important for our objective to minimize
the maximum response time of all user requests, ensuring that all users can benefit
from the edge-cloud.
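The difference between the two objectives can be illustrated with a small numerical sketch. It assumes, purely for illustration, that each cloudlet behaves as an M/M/1 queue with mean response time 1/(mu - lambda), and that a fixed total task arrival rate is split between two cloudlets. The service rates, the total arrival rate, and the grid search are hypothetical choices and are not taken from the system model of this thesis.

```python
def mm1_response_time(arrival_rate, service_rate):
    """Mean response time of a stable M/M/1 queue: 1 / (mu - lambda)."""
    if arrival_rate >= service_rate:
        raise ValueError("queue is unstable")
    return 1.0 / (service_rate - arrival_rate)


SERVICE_RATES = (10.0, 4.0)  # assumed tasks per second for cloudlets 1 and 2
TOTAL_ARRIVALS = 8.0         # assumed total task arrival rate to be split


def evaluate(load_1):
    """Return (average, worst) mean response time when cloudlet 1 gets load_1."""
    load_2 = TOTAL_ARRIVALS - load_1
    t1 = mm1_response_time(load_1, SERVICE_RATES[0])
    t2 = mm1_response_time(load_2, SERVICE_RATES[1])
    average = (load_1 * t1 + load_2 * t2) / TOTAL_ARRIVALS
    return average, max(t1, t2)


# Candidate splits in steps of 0.01 that keep both queues stable.
splits = [i / 100 for i in range(401, 801)]
avg_split = min(splits, key=lambda load: evaluate(load)[0])
minmax_split = min(splits, key=lambda load: evaluate(load)[1])

if __name__ == "__main__":
    print("average-optimal split:", avg_split, evaluate(avg_split))
    print("min-max split:", minmax_split, evaluate(minmax_split))
```

Under these assumptions the split that minimizes the average leaves the slower cloudlet with a noticeably longer response time, while the min-max split equalizes the two cloudlets at the price of a slightly higher average.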
1.1.3 The Remote Cloud
While cloudlets in an MEC environment can provide mobile users with a low-latency
computing environment, their resources are limited. During periods of peak
demand, the cloudlet layer can rely on the remote cloud layer and offload some of its
data processing and computation tasks. Remote distributed clouds (remote clouds)
are often defined as a network of small to medium-sized data centers inter-connected
by a Wide Area Network and accessible to mobile users through the Internet [25].
Although remote clouds have a higher latency to end users compared to cloudlets,
they have far more abundant resources. In the case where the cloudlet layer is
overwhelmed by user demands, the remote cloud can act as a fallback and admit
surplus computation tasks from the cloudlet layer for processing. Furthermore, while
collected will be too large to be persisted on the cloudlet layer in the long-term. The
remote cloud is thus an ideal solution for storing rapidly increasing amounts of social
media data and user information collected by cloudlets, due to its rich resources,
reliability, and disaster-resilience [6]. Remote clouds are therefore vital as the
destination for high volumes of data collected by cloudlets that need to be analyzed,
processed, or persisted.
Similar to research done on task offloading systems in cloudlet networks, many
studies that focus on optimizing operations for routing and executing task requests
in the remote cloud also rely on Queueing Theory [96]. This continuity makes it
convenient for the remote cloud and the edge-cloud to be formulated as a single
system [17, 43, 48, 49]. While the edge-cloud and the remote cloud work closely
together, these two layers in the MEC architecture are functionally independent and
sometimes even operated by different service providers. As a result there exists a
challenge in how requests from the edge-cloud should be admitted into the remote
cloud.
1.1.4 Data and Computation Transfer to the Remote Cloud
As the service provider of the edge-cloud network may differ from the service provider
of the remote cloud, requests for offloading data and computation tasks from cloudlets
to the remote cloud will need to be processed by a specified sequence of network
functions such as firewalls, intrusion detection systems (IDSs), deep packet
inspection (DPI), and so on, to protect the user’s data and ensure its integrity. Traditionally
such network functions were performed by hardware specific network devices at the
data center, however these devices are difficult to update and lead to inflexible data
pipelines that become bottlenecks [36]. To overcome the inflexibility of hardware
middle-boxes, Network Function Virtualization (NFV) emerged as a leading
solution. By implementing network functions that previously ran on specific hardware
as software on generic machines, network function instances can be instantiated as
VMs and deployed anywhere within the data center.
Networking (SDN) to deal with unexpected link failures in the network under increasingly
heavy traffic. By reserving a portion of bandwidth to report link failures and other
important information, an SDN central controller can coordinate traffic around
broken links as soon as they occur, avoiding congestion [2, 53, 71]. SDN techniques
can also be applied to manage computation resources in the data center, allowing
specially designed algorithms to deploy NFV instances to meet customer demands.
Unlike in the edge-cloud layer where QoS is delivered by minimizing task response
time, the QoS objective within a data center is typically to minimize operation costs,
or maximize throughput, sometimes with a delay requirement.
Due to the huge volume of data generated by cloudlets and the limited
bandwidth and data processing resources at the data centers, not all data transfer requests
issued by cloudlets can be immediately admitted into the data center. Furthermore,
many requests have an associated delay-constraint that must be met. However, using
a combination of NFV and SDN techniques, data center operators are given
greater control and flexibility to meet the increased demand from the edge-cloud. As
such, designing a request admission policy for data centers with the aim to maximize
the throughput of requests and minimize operation costs poses a new and interesting
challenge.
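To make the idea of routing a request through a specified sequence of network functions concrete, the sketch below finds a minimum-cost route that visits one candidate node per function, in chain order, using a layer-by-layer dynamic program. It is only an illustration under stated assumptions and is not the algorithm developed in this thesis: the node names, link costs, processing costs, and the helper names link_cost and cheapest_chain are hypothetical, and capacity and delay constraints are ignored.

```python
# Hypothetical symmetric routing costs between nodes of a small network.
LINK_COST = {
    ("s", "a"): 1, ("s", "b"): 4,
    ("a", "b"): 2, ("a", "c"): 5, ("b", "c"): 1,
    ("b", "t"): 3, ("c", "t"): 1,
}


def link_cost(u, v):
    """Routing cost between two nodes; staying on the same node is free."""
    if u == v:
        return 0
    return LINK_COST[(u, v)] if (u, v) in LINK_COST else LINK_COST[(v, u)]


def cheapest_chain(source, destination, layers, cost_fn):
    """Cheapest route from source to destination through one node per layer.

    layers is a list of dicts {node: processing_cost}, one dict per network
    function in the order the service chain requires. Returns (cost, path);
    a node appears twice in the path if it hosts two consecutive functions.
    """
    # best[node] = (cost, path) of the cheapest partial route ending at node
    best = {source: (0.0, [source])}
    for layer in layers:
        next_best = {}
        for node, processing in layer.items():
            next_best[node] = min(
                ((c + cost_fn(u, node) + processing, p + [node])
                 for u, (c, p) in best.items()),
                key=lambda cp: cp[0],
            )
        best = next_best
    return min(
        ((c + cost_fn(u, destination), p + [destination])
         for u, (c, p) in best.items()),
        key=lambda cp: cp[0],
    )


# Assumed chain: a firewall hosted at "a" or "b", then an IDS at "b" or "c".
LAYERS = [{"a": 2, "b": 1}, {"b": 3, "c": 1}]

if __name__ == "__main__":
    print(cheapest_chain("s", "t", LAYERS, link_cost))
```

Because each layer only depends on the best partial routes of the previous layer, the running time grows with the product of adjacent layer sizes rather than with the number of possible node combinations.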
1.2 Supporting Network Services in MEC
Previously we discussed how deploying virtualized network functions in data
centers can increase flexibility and responsiveness in distributed data centers. Similarly,
network functions, or network services, can also be deployed on cloudlets at the edge
of the network. The emergence of IoT and the proliferation of sensors deployed in
urban environments demand the support of middle-box software to provide solutions
to problems like heterogeneity, interoperability, security and dependability [41, 90].
Virtual Network Function (VNF) instances deployed on the cloudlets will be close to IoT devices and sensor nodes, allowing cloudlets to generate up-to-date and accurate
information of the local area [73]. This further enables a wide range of context-aware
A potential example is the deployment of security cameras in crowded areas
to spot wanted criminals. The security cameras will stream their live video feed to
nearby cloudlets, where facial-recognition network service instances process multiple
video streams simultaneously, to screen out individuals' faces from a police database.
MEC services could also potentially be deployed to manage fleets of self-driving
vehicles. Autonomous vehicles in urban areas can detect sudden changes in traffic
conditions due to accidents or pedestrian activities and notify a nearby cloudlet [66].
The cloudlet can then cross-reference the information with other vehicles in the area,
and re-optimize the recommended routes of some vehicles to avoid congested roads.
Supporting these kinds of network services requires the traditional client-server
task offloading approach to be re-examined. While the client-server model of discrete
user tasks being offloaded to the cloudlet is accurate in many scenarios, it assumes
that cloudlet resources are dedicated to each individual task. Since there is no sharing
of resources between tasks, when several users are demanding the same service,
cloudlet resources will be inefficiently allocated. On the other hand, network services
deployed as VNF instances on cloudlets can serve multiple users at the same
time, and cross-reference user uploaded information to produce more useful results.
This model for processing multiple user requests simultaneously is also critical to
the success of Augmented Reality, which is one of the most anticipated use-cases for
MEC.
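The saving from sharing can be seen in a small sketch. It assumes, hypothetically, that every VNF instance can serve a fixed number of users at once, and it assigns each incoming request to an existing instance of the requested service with spare capacity, creating a new instance only when none is available. The service names, the capacity value, and the function name assign_requests are illustrative only.

```python
from collections import defaultdict


def assign_requests(requests, capacity_per_instance):
    """Assign requests to shared service instances, first fit.

    requests is a sequence of service names, one per user request.
    Returns {service: [number of users on each instance]}.
    """
    instances = defaultdict(list)
    for service in requests:
        loads = instances[service]
        for i, load in enumerate(loads):
            if load < capacity_per_instance:
                loads[i] += 1  # share an existing instance
                break
        else:
            loads.append(1)  # all instances are full: create a new one
    return dict(instances)


if __name__ == "__main__":
    # Assumed workload: five face-recognition and two routing requests.
    workload = ["face"] * 5 + ["route"] * 2
    shared = assign_requests(workload, capacity_per_instance=4)
    print(shared)
    print("instances with sharing:", sum(len(v) for v in shared.values()))
    print("instances without sharing:", len(workload))
```

In this assumed workload, sharing needs three instances where a dedicated-resource model would need seven, one per request.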
1.3 Augmented Reality: The killer MEC Application
Augmented Reality (AR) is a technology that superimposes interactive digital
elements on top of the real world view of a user device, and has attracted considerable
investment from major technology companies. In 2017 at the F8 developer
conference, Facebook CEO Mark Zuckerberg spoke at length about the potential of AR
and described a future where artists could display digital artwork in public spaces
and friends could share virtual signs and objects [1]. AR could also disrupt the work
environment, with AR headset displays like Microsoft Hololens, Google Glass, and
collaborations among colleagues. However, AR has been particularly successful in games,
as demonstrated by the explosive popularity of the mobile AR game Pokemon Go.
Pokemon Go was released in July 2016 and became the most active mobile game in
the United States while generating more than 160 million US dollars through in-game
purchases before the end of the month [83].
An AR device displays digital elements to a user by proceeding in frames [34, 79, 89]. At the start of a frame, an image is captured on the device’s camera along with data from other sensors, and the user’s precise position and orientation are
aggregated from the raw data stream of sensors like accelerometers, as seen in Fig. 1.2.
The image frame may also be analyzed to identify surfaces, obstacles, and landmarks
that may affect the appearance or behavior of the digital elements. A view of the
digital elements is then rendered according to the aggregated data and integrated
with the captured image. The image is displayed to the user and the next frame
begins. As AR devices are either hand held or worn as a headset, there are
severe weight limitations on the device that strictly limit its computing resources. As
a result, AR devices strongly depend on cloudlets deployed throughout mobile
networks to deliver cached contents and provide low latency computation environments
for the computation intensive steps of tracking the user, analyzing image frames, and
integrating digital elements into the user’s view [11, 15, 37].
Since AR combines virtual elements with real world environments, many AR
games and objects will exist in the context of a specific environment, e.g., digital
fish swimming in a real world fountain. These AR games and elements can thus
be hosted on nearby cloudlets and accessed by users in the area who connect to the
cloudlet [48, 79]. As many users in a single area are likely to request the same AR
service, instead of allocating resources to each individual user on the cloudlet, it is more
efficient to instantiate a service instance on the cloudlet to serve multiple users
simultaneously. However, allocating existing service instances to new users and creating
new service instances where demand arises is a non-trivial task, especially where
user experience demands a very tight delay requirement when processing user
requests for a particular service. Furthermore, public digital objects will be constantly
Figure 1.2: Example of an Augmented Reality frame. First, a snapshot of the user view is captured and sent to a nearby MEC cloudlet. Digital elements are then
rendered on the cloudlet and transferred back to the user device.
computation-intensive and strain cloudlet resources, especially if multiple
users in close proximity to each other are acting simultaneously.
1.4
Research Topics
In this thesis, we study the provisioning of delay-aware services in MEC by
efficiently utilizing various resources of MECs and remote clouds to reduce the latency
of user offloaded tasks in different application scenarios, while meeting ever-growing
user demands. Specifically, we will address the following four main issues: (1) how
to balance the workload among cloudlets in the edge-cloud to reduce the average
response time of user offloaded tasks; (2) how to minimize the operation cost of
service providers while maximizing network throughput of tasks with specified
service chains; (3) how to deploy network service instances (instances of virtual network
functions) to meet the resource demands of mobile users and their Quality-of-Service
(QoS) requirements; and finally (4) how to coordinate a massive multi-player
Augmented Reality (AR) game among players in MEC networks to maximize the quality
1.4.1 QoS-Aware Task Load Balancing in the Edge-Cloud
One major issue that MEC planners face is how to allocate user task requests to
different cloudlets so that the workload among cloudlets is well balanced, thereby
shortening the response time delay of tasks and enhancing user experience in the
use of the service. A typical solution to this is to allocate user requests to their
closest cloudlets to minimize the network delay; however, this approach has been
demonstrated to be inadequate in high user-density areas [43]. Specifically, the vast
number of users in the network means that the workload at each individual cloudlet
will be highly volatile. If a cloudlet is suddenly overwhelmed with user requests,
the task response time at the cloudlet will increase dramatically, causing lag in the
user applications and degrading user experiences. To prevent some cloudlets from
being overloaded, it is crucial to assign user requests to different cloudlets such that
the workload among the cloudlets is well balanced, thereby reducing the maximum
response time of offloaded tasks.
1.4.2 Routing Cost and Throughput Optimization of Requests in the Remote Cloud
In order to ensure data transfer security, system performance, and data integrity, requests to offload data and computation tasks from the cloudlets to the remote cloud must adhere to specific policy enforcement requirements to be admitted into the
remote cloud environment. These policy enforcement requirements consist of a service
chain of network functions the request must pass through before reaching the final
destination of the request in the remote cloud. Using NFV techniques, these network
functions can be instantiated on a VM running on generic hardware, allowing them
to be dynamically created anywhere within a data center, giving the network
administrator great flexibility in handling large throughputs of NFV-enabled requests with
heterogeneous service chain specifications.
However, admitting NFV-enabled requests in an SDN poses great challenges.
First, for each NFV-enabled request, we must determine not only a routing path
Second, since NFV-enabled requests arrive into the system dynamically and
unpredictably, the response to each incoming request by either admitting or rejecting it
is crucial in order to maximize the network throughput. If a request is admitted, a
routing path and a set of data centers on the path should be found for the request
immediately. The dynamic nature of resource allocation in SDNs and
unpredictability of future request arrivals further increase the difficulty in tackling this dynamic
request admission problem.
1.4.3 QoS-Aware Virtual Network Service Deployment in the Edge-Cloud
Generally, the classical model for application offloading systems [20, 23] in mobile
cloud computing consists of a client component on a mobile device, and a server
component on the cloudlet to remotely execute offloaded tasks from the device. As
the options for user applications are too numerous for server components to be stored
in the cloudlet, most existing studies [18, 20, 57] assumed that each user connects to
a dedicated VM in the cloudlet, without consideration of whether an existing VM for
the same application could be used to serve multiple users. However, many popular
emerging applications and services are location-specific, and so it becomes realistic
to assume that multiple users in a local area will request the same computing service
from cloudlets. This is especially the case for AR experiences. For example, in [79], the
authors introduced a mobile task offloading architecture specifically for mobile
augmented reality in a museum setting, where multiple users gaze at the same exhibit.
Meanwhile, Google has created a patent for “Location-based games and augmented
reality systems” [70] allowing players to engage in AR games designed for particular
real-world locations. Since AR technology functions by processing video frames, the
processing of each user video frame can be modeled as an individual task, making it
possible for a Virtual Network Function (VNF) instance on a cloudlet to serve
multiple users. If the VNF instance for that service has already been instantiated, the
offloading cost will be less expensive and the service can be carried out immediately.
However, it then becomes a challenge to assign users to existing VNF instances, or
of each user is met, as they share computing resources on the cloudlet.
1.4.4 Multiplayer Augmented Reality Game Planning in Mobile Edge Computing
Supporting multiplayer interactions between multiple AR users in a shared virtual
environment is a new and difficult challenge that has yet to be addressed. Similar
to AR applications, real-time multiplayer games also proceed frame by frame, where
actions performed by players are taken as input at the beginning of the game frame,
and a sequence of events is generated as output at the end of each game frame.
The duration of a game frame is of critical importance to the user experience, as
it represents the length of time for a player to receive feedback from his or her
action. Long and erratic game frames can irritate the player and potentially render
the game unplayable. While network service instances deployed on cloudlets can
support multiple users, processing interactions between the users has a much higher
computational demand on the cloudlet. If too many users in close proximity to each
other are acting simultaneously, the limited resources of a cloudlet could be
overwhelmed. To support a large number of players at the same time, it is necessary that
the workload of processing user interactions in MEC is evenly distributed among the
cloudlets, to ensure that players receive feedback from their actions with short
delay. However, coordinating a decentralized multiplayer system with large numbers
of users is challenging.
1.5
Thesis Contributions
The main contributions of this thesis are to systematically study the provisioning of
Delay-Aware Services in MEC networks by formulating novel system models and
developing optimization frameworks for the aforementioned problems. By efficiently
managing resources in MEC through developing efficient algorithms, we can
significantly reduce the delay for end-users as well as increase the throughput of requests
• We first address how to balance the workload among cloudlets in the
edge-cloud to optimize mobile application performance. We introduce a novel
system model to capture the response time delays of offloaded tasks and formulate
an optimization problem with the aim to minimize the maximum response time
of all offloaded tasks. We propose two algorithms for the problem: one is a fast
heuristic, and another is a distributed genetic algorithm that is capable of
delivering a more accurate solution compared with the first algorithm, but at the
expense of a much longer running time.
• We then study policy-aware unicast request admissions with and without
end-to-end delay constraints in a Software Defined Network (SDN). We aim to
minimize the operational cost of admitting a single request in terms of both
computing resource consumption for implementing the NFVs in the service chain and
bandwidth resource consumption for routing its data traffic, with a further aim
to maximize the network throughput for a sequence of requests without the
knowledge of future request arrivals. We first formulate four novel
optimization problems and provide a generic optimization framework for the problems.
We then develop efficient algorithms for the admission of a single NFV-enabled
request with and without the end-to-end delay constraint, where NFV-enabled
requests are defined as the requests with policy enforcement requirements. We
also devise online algorithms with a guaranteed performance for dynamic
admissions of requests without the knowledge of future arrivals. In particular, we
provide the very first online algorithm with a provable competitive ratio for the
problem without the end-to-end delay requirement.
• Third, we investigate the deployment of virtualized network functions among
cloudlets to serve end-users, while meeting the resource demands of mobile
users and their Quality-of-Service (QoS) requirements. We formulate a novel
task offloading problem in a cloudlet network, where each offloaded task
requests a specific network function with a maximum tolerable delay, and
different offloading requests may require different network function services. We
instance sharing and cost-effective creation of new VNF instances, and develop
an effective prediction mechanism to predict idle VNF instance releases and
new VNF instance creations for further cost savings over time.
• We finally study how users can interact with each other in an AR game.
Supporting multiplayer interactions in an MEC environment brings many
challenges. Processing user interactions can be computation-intensive especially
when multiple users in close proximity to each other are acting
simultaneously; the limited resources of a cloudlet could be overwhelmed if there are
too many players involved. We envision a scenario in the near future where
players wearing AR heads-up display devices engage with other players over a
large area with densely deployed cloudlets, for which we first propose a novel
system model. We then formulate the Decentralized Multiplayer Coordination
(DMC) Problem with the aim of minimizing the game frame duration of all
players, and devise an efficient algorithm for the problem.
1.6
Thesis Organization
The remainder of this thesis is organized as follows. In Chapter 2, we explore the
topic of load balancing offloaded user tasks among cloudlets in the edge-cloud to
reduce the average response time. In Chapter 3, we study how to minimize
operation cost in a remote cloud while maximizing cloudlet request throughput with
a specified service chain. In Chapter 4, we investigate how cloudlets can deploy
NFV instances to serve end users while meeting their Quality-of-Service (QoS)
requirements. In Chapter 5, we examine how to coordinate a massive multiplayer
Augmented Reality (AR) game among a network of mobile cloudlets and the remote
cloudlet to maximize performance for all participating users. Finally in Chapter 6,
QoS-Aware Task Load Balancing in
the Edge-Cloud
2.1
Introduction
A major problem that Mobile Edge Computing service providers face is how to
allocate user task requests to different cloudlets so that the workload among cloudlets
in the mobile edge network is well balanced, thereby shortening the response time
delay of tasks and enhancing user experience in the use of their service. A typical
solution to this problem is to allocate user requests to their closest cloudlets to minimize
the network delay; however, this approach has been demonstrated to be inadequate
in an urban setting [43]. Specifically, the vast number of users in the network means
that the workload at each individual cloudlet will be highly volatile. If a cloudlet
is suddenly overwhelmed with user requests, the task response time at the cloudlet
will increase dramatically, causing lag in the user applications and degrading user
experiences. To prevent some cloudlets from being overloaded, it is crucial to assign
user requests to different cloudlets such that the workload among the cloudlets is
well balanced, thereby reducing the maximum response time of offloaded tasks.
In this chapter we deal with the QoS-aware load balancing problem among
cloudlets in the edge-cloud in response to the dynamic resource demands of user
requests, by devising efficient algorithms to allocate user requests to different cloudlets.
Specifically, we devise two load balancing algorithms for cloudlets within an
edge-cloud, to reduce the maximum response time of offloaded tasks from mobile users
that consists of queueing and processing time delays at each cloudlet and routing
time delays of packets between users and cloudlets.
The rest of the chapter is organized as follows. Section 2.2 discusses the related
works to this topic. Section 2.3 introduces the system model and problem definition.
Section 2.4 gives a detailed description of the fast heuristic algorithm. Section 2.5
proposes the distributed genetic algorithm. Section 2.6 presents the simulation results,
and a summary is given in Section 2.7.
2.2
Related Works
Although load balancing has been extensively studied in centralized data centers,
there are essential differences in load balancing between cloudlets and centralized
clouds. Specifically, in a centralized data center, there is a centralized queue for all
incoming user requests, and the workload balancing and task allocations among servers in
the data center are performed by a centralized scheduler called the hypervisor [21, 88].
In such a scenario, the task transfer delay and processing delay between servers in
the data center are several orders of magnitude less than the task transfer and
processing delays between different cloudlets in an edge-cloud network [7, 103], since
the bandwidth and computing resources within a data center are usually abundant.
In contrast, in a distributed cloudlet environment, user requests are admitted by the
network through their access points (APs), and cloudlets are usually co-located at
the APs. Associated with each cloudlet is a queue of waiting tasks, and the QoS
requirements of requests are now an important concern. Balancing the workload among
the waiting queues at different cloudlets while minimizing the maximum task
response time (delay) among offloaded tasks is challenging. In addition, due to limited
processing capacities of cloudlets, the wait time delay of a task at a cloudlet is a
non-linear function of the workload (processing capacity) of the cloudlet. How to
consider such a distributed queueing delay feature in the workload balance among
cloudlets is another challenge. Finally, the transfer delay incurred by redistributing tasks from
one cloudlet to another cloudlet cannot be ignored. Although there are several
studies on load balancing a network of cloudlets [43, 94, 103], none of these studies
among cloudlets.
2.3
Preliminaries
In this section we first introduce our system model. We then define the problem
precisely.
2.3.1 System Model
We assume that an edge-cloud service provider has set up $K$ cloudlets $\{1, 2, \ldots, K\}$
at fixed locations in the edge-cloud network. We restrict the scope of the edge-cloud
to just cloudlets at the edge of the network; however, remote distributed
clouds can trivially be included as an offloading destination by adding an additional index.
The cloudlets are co-located at micro-base station access points (APs) in the network,
and all cloudlets are connected to each other via network connection. We assume
user applications are dynamically partitioned into discrete offloadable tasks that can
be processed at any of the $K$ cloudlets. Users will offload tasks to a nearby AP with a cloudlet, and the cloudlet can either choose to add the task to its own queue or
redirect the task to another cloudlet in the network (see Fig. 2.1).
We model the cloudlets as M/M/n queues, where cloudlet $i$ has a number of servers $n_i$ with service rate $\mu_i$, for $i \in \{1, 2, \ldots, K\}$. Due to the rapidly changing nature of user demands, the rate of incoming requests can fluctuate wildly at each
cloudlet over time. As such, we assume that the incoming user tasks at each cloudlet
$i$ arrive randomly according to a Poisson process with arrival rate $\lambda_i$. The average
wait time of a task in cloudlet $i$ consists of the queueing time and the service time of the task at cloudlet $i$. We use $T_i$, which is a function of a given task arrival rate $\lambda$, to calculate the average task wait time at cloudlet $i$:
\[ T_i(\lambda) = \frac{C\left(n_i, \frac{\lambda}{\mu_i}\right)}{n_i \mu_i - \lambda} + \frac{1}{\mu_i}, \tag{2.1} \]
Figure 2.1: Cloudlets in the edge-cloud collocated with micro-base station access points (APs)
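where
\[ C(n, \rho) = \frac{\frac{(n\rho)^n}{n!} \cdot \frac{1}{1-\rho}}{\sum_{k=0}^{n-1} \frac{(n\rho)^k}{k!} + \frac{(n\rho)^n}{n!} \cdot \frac{1}{1-\rho}}. \tag{2.2} \]
Eq. (2.2) is known as Erlang’s C formula [54].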
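As a minimal illustration of Eqs. (2.1) and (2.2), the following Python sketch evaluates the average wait time of one cloudlet. It is not taken from the thesis: the function names are hypothetical, the second argument of $C$ is assumed to be the per-server utilization $\rho = \lambda/(n\mu)$, and the exponent in the numerator is assumed to equal the number of servers $n$, as in the standard form of Erlang's C formula.

```python
from math import factorial


def erlang_c(n: int, rho: float) -> float:
    """Erlang's C formula: probability that an arriving task has to queue.

    n is the number of servers; rho is assumed to be the per-server
    utilisation lambda / (n * mu), which must satisfy 0 <= rho < 1.
    """
    if not 0 <= rho < 1:
        raise ValueError("the queue is only stable for 0 <= rho < 1")
    offered = n * rho  # offered load n * rho
    top = offered ** n / factorial(n) / (1 - rho)
    bottom = sum(offered ** k / factorial(k) for k in range(n)) + top
    return top / bottom


def mean_wait(lam: float, n: int, mu: float) -> float:
    """Average wait time T_i(lambda): queueing time plus service time."""
    rho = lam / (n * mu)
    return erlang_c(n, rho) / (n * mu - lam) + 1 / mu
```

With a single server the expression collapses to the familiar M/M/1 result $1/(\mu - \lambda)$, which gives a quick sanity check of the sketch.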
As task arrival rates at different cloudlets can be significantly different, some
cloudlets may be overloaded while others may be underloaded. We assume that all
cloudlets are reachable from each other, and each cloudlet can redirect a fraction of
its tasks to another cloudlet. We use $f(i,j)$ to denote the amount of task flow from cloudlet $i$ to cloudlet $j$, for $i \neq j$ (see Fig. 2.2). We thus have the following constraints on $f(i,j)$:
\[ f(i,j) = \begin{cases} -f(j,i) & \text{if } i \neq j, \\ 0 & \text{otherwise,} \end{cases} \quad \forall i,j \in \{1, \ldots, K\}, \tag{2.3} \]
\[ \sum_{i=1}^{K} \sum_{j=1}^{K} f(i,j) = 0, \tag{2.4} \]
\[ \sum_{j=1}^{K} \max\{f(i,j), 0\} \leq \lambda_i, \quad \forall i \in \{1, \ldots, K\}. \tag{2.5} \]
Eq. (2.4) ensures that all flow is conserved, while Eq. (2.5) ensures that the sum of all
outgoing task flows from cloudlet $i$ (we ignore the incoming flow by summing the maximum of $f(i,j)$ and 0 for each cloudlet $j$) is no greater than its incoming task arrival rate $\lambda_i$.
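The three constraints can be checked mechanically for a candidate flow matrix. The sketch below is illustrative only: the function name, the list-of-lists representation, the 0-based cloudlet indices, and the numerical tolerance are all assumptions rather than part of the model.

```python
def is_feasible_flow(f, lam, tol=1e-9):
    """Check Eqs. (2.3)-(2.5) for a K x K task-flow matrix f.

    f[i][j] is the flow from cloudlet i to cloudlet j (0-based indices),
    lam[i] is the task arrival rate of cloudlet i.
    """
    K = len(lam)
    # Eq. (2.3): the matrix is antisymmetric with a zero diagonal.
    for i in range(K):
        if abs(f[i][i]) > tol:
            return False
        for j in range(i + 1, K):
            if abs(f[i][j] + f[j][i]) > tol:
                return False
    # Eq. (2.4): all flow is conserved.
    if abs(sum(sum(row) for row in f)) > tol:
        return False
    # Eq. (2.5): a cloudlet cannot redirect more than it receives.
    for i in range(K):
        outgoing = sum(max(x, 0.0) for x in f[i])
        if outgoing > lam[i] + tol:
            return False
    return True
```

For example, with arrival rates `[3.0, 1.0, 1.0]`, redirecting 2 units from the first cloudlet to the second is feasible, while redirecting 4 units violates Eq. (2.5).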
We assume all offloaded packets are of equal size, and so the delay incurred in
transferring any packet between a pair of APs through the network is identical. To
model such a network delay in the edge-cloud, denote by $c \in \mathbb{R}^{K \times K}$ the network delay matrix, where entry $c_{i,j}$ represents the shortest possible communication delay in relaying a task between cloudlet $i$ and cloudlet $j$. We assume that the flow of incoming redirected tasks $f(i,j) < 0$ at cloudlet $i$ has a delay of $-f(i,j) \cdot c_{i,j}$. We can then calculate the sum $T_{net}(i)$ of all network delays of incoming tasks from other cloudlets to cloudlet $i$ as
\[ T_{net}(i) = \sum_{j=1}^{K} \max\{f(j,i), 0\} \cdot c_{j,i}. \tag{2.6} \]
Having Eqs. (2.1) and (2.3), the average task response time $D(i)$ of all tasks that are executed on cloudlet $i$ can be calculated as follows.
\[ D(i) = T_i(\bar{\lambda}_i) + T_{net}(i), \tag{2.7} \]
where $\bar{\lambda}_i$ is the final incoming task flow that will be processed at cloudlet $i$, which is defined as follows.
\[ \bar{\lambda}_i = \lambda_i - \sum_{j=1}^{K} f(i,j). \tag{2.8} \]
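To make the bookkeeping concrete, the sketch below computes the response time of every cloudlet from a flow matrix. It rests on assumptions that are not in the original text: the helper name `response_times` is hypothetical, cloudlets are indexed from 0, the wait-time model is passed in as a callable `wait_time(i, load)`, and the final load of cloudlet $i$ is taken to be its arrival rate minus its net outflow, i.e. the sum of $f(i,j)$ over the destination index $j$.

```python
def response_times(f, lam, c, wait_time):
    """Return [D(0), ..., D(K-1)] for flow matrix f.

    f[i][j]          : task flow from cloudlet i to cloudlet j
    lam[i]           : task arrival rate at cloudlet i
    c[i][j]          : network delay between cloudlets i and j
    wait_time(i, x)  : average wait time at cloudlet i under load x
    """
    K = len(lam)
    result = []
    for i in range(K):
        # Final load processed at cloudlet i: arrivals minus net outflow.
        final_load = lam[i] - sum(f[i][j] for j in range(K))
        # Network delay of the tasks redirected into cloudlet i.
        t_net = sum(max(f[j][i], 0.0) * c[j][i] for j in range(K))
        result.append(wait_time(i, final_load) + t_net)
    return result
```

Passing the wait-time model as a parameter keeps the sketch independent of any particular queueing formula; an M/M/1 stand-in such as `lambda i, x: 1 / (mu[i] - x)` is enough for small experiments.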
2.3.2 Problem Definition
The Cloudlet Load Balancing Problem (CLBP) in an edge-cloud is defined as follows.
Given a set of cloudlets $\{1, \ldots, K\}$, where each cloudlet $i$ with task arrival rate $\lambda_i$ and
Figure 2.2: The flow of tasks from cloudlet $i$ to cloudlet $j$
minimized, i.e.,
\[ \min_{f} \max_{1 \leq i \leq K} D(i). \tag{2.9} \]
In this chapter we propose two algorithms for the problem that strive for the
trade-off between the solution accuracy obtained by each algorithm and the running
time of the algorithm.
2.4
Heuristic Algorithm
In this section we propose a heuristic for CLBP. The approach is to identify a balanced
task response time $D$ and then decide the outgoing or incoming workload of each cloudlet, based on $D$. As we reduce the task response time of overloaded cloudlets by redirecting some of their tasks to underloaded cloudlets, the task response times
of underloaded cloudlets are increased. By carefully directing workload (flow)
between cloudlets, the tasks processed among the cloudlets will have roughly the same
response time. To this end, we compute the task flow from overloaded cloudlets to
underloaded cloudlets, using a transportation algorithm [30]. This procedure
2.4.1 Balancing Task Response Time
To find the balanced task response time $D$ and decide the amount of outgoing/incoming workload for each cloudlet, we guess the value of $D$ and iteratively improve it until it is within a given accuracy bound. We begin by examining the range of the
average task wait time at cloudlets.
Let $T_{max} = \max_{1 \leq i \leq K} \{T_i(\lambda_i)\}$ and $T_{min} = \min_{1 \leq i \leq K} \{T_i(\lambda_i)\}$; then the value of $D$ is in the range between $T_{min}$ and $T_{max}$. We assign $D = \frac{T_{max} + T_{min}}{2}$ as its initial value. We then partition all cloudlets into two disjoint sets, the set $V_s$ of overloaded cloudlets:
\[ V_s = \{ i \mid T_i(\lambda_i) > D \}, \]
and the set $V_t$ of underloaded cloudlets:
\[ V_t = \{ j \mid T_j(\lambda_j) \leq D \}. \]
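The initial guess and the partition can be sketched in a few lines of Python. The function name and the 0-based cloudlet indices are assumptions made for illustration.

```python
def partition_cloudlets(wait_times):
    """Split cloudlets around the initial balanced response time D.

    wait_times[i] holds T_i(lambda_i). Returns (D, V_s, V_t), where V_s
    lists the overloaded and V_t the underloaded cloudlet indices.
    """
    D = (max(wait_times) + min(wait_times)) / 2
    overloaded = [i for i, t in enumerate(wait_times) if t > D]
    underloaded = [j for j, t in enumerate(wait_times) if t <= D]
    return D, overloaded, underloaded
```

A cloudlet whose wait time equals $D$ exactly is placed in $V_t$, matching the non-strict inequality in the definition of $V_t$.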
For each overloaded cloudlet $i \in V_s$, we decide the task demand $\phi_i$ of cloudlet $i$, which is the amount of task flow that should be redirected from its arrival task flow $\lambda_i$, such that its task response time is within a given accuracy $\epsilon$ of $D$,
\[ \left| D - T_i(\lambda_i - \phi_i) \right| \leq \epsilon, \tag{2.10} \]
where $\epsilon$ is a given threshold.
For each underloaded cloudlet $j \in V_t$, we decide the task demand $\phi_j$ of cloudlet
$j$, which is the amount of task flow to arrive at cloudlet $j$ such that its task response time is within the accuracy $\epsilon$ of $D$,
\[ \left| D - T_j(\lambda_j + \phi_j) \right| \leq \epsilon. \tag{2.11} \]
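The text does not prescribe how the task demands are computed. Since the average wait time grows with the load of a cloudlet, one plausible approach is bisection, sketched below. Everything here is an assumption made for illustration: the function name, the callable `wait_time(load)`, the `capacity` argument (standing for $n_j \mu_j$), and the iteration cap.

```python
def solve_task_demand(wait_time, lam, D, eps, overloaded,
                      capacity=None, max_iter=100):
    """Bisection for phi with |D - T(lam -/+ phi)| <= eps.

    wait_time(load) must be increasing in load. For an overloaded cloudlet
    phi is searched in [0, lam]; for an underloaded one in
    [0, capacity - lam], so capacity must then be supplied.
    """
    lo = 0.0
    hi = lam if overloaded else capacity - lam
    phi = 0.0
    for _ in range(max_iter):
        phi = (lo + hi) / 2
        load = lam - phi if overloaded else lam + phi
        t = wait_time(load)
        if abs(D - t) <= eps:
            break
        # Raise phi when an overloaded cloudlet is still too slow, or when
        # an underloaded cloudlet is still faster than the target D.
        if (t > D) == overloaded:
            lo = phi
        else:
            hi = phi
    return phi
```

If the target $D$ cannot be reached within the search interval, the sketch simply returns the last midpoint after `max_iter` steps; a production implementation would need to report that case explicitly.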
Once we have calculated φi for each overloaded cloudlet i ∈ Vs and φj for each
Figure 2.3: Finding optimal outgoing and incoming workload for each cloudlet in the edge-cloud
cloudlets to underloaded cloudlets. Fig. 2.3 is an illustrative diagram of the
calculation of $\phi_i$ and $\phi_j$ for all $i$ and $j$ with $1 \leq i \leq |V_s|$ and $1 \leq j \leq |V_t|$.
Because redirecting tasks from overloaded cloudlets to underloaded cloudlets
will incur network delays at underloaded cloudlets, we should further adjust $D$ such that the sum of the task response time at each underloaded cloudlet $j \in V_t$ and the incoming network delay $T_{net}(j)$ of all tasks to cloudlet $j$ is nearly equal to $D$, i.e., $T_j(\lambda_j) + T_{net}(j) \approx D$. Let $D' = \max_{j \in V_t} \{T_j(\lambda_j)\}$. If the difference between $D$ and $D'$ is within a certain bound of accuracy $\theta$, we are done; otherwise, we need to further refine $D$. If $D < D'$, this means that we must reduce the amount of outgoing tasks from overloaded cloudlets, and we need to increase $D$ by reducing $\phi_i$ for each overloaded cloudlet $i \in V_s$. Otherwise, we should increase the amount of outgoing tasks from overloaded cloudlets, and lower $D$ to allow overloaded cloudlets to redirect more tasks to underloaded cloudlets. We choose $D \leftarrow \frac{1}{2}(D + D')$, and
recursively search for $D$ and continue this procedure until the difference between $D$
and $D'$ is within the accuracy bound $\theta$.
Algorithm 2.1 CLBP-Heuristic Algorithm
Input: NET, θ, ε
Output: f(i,j), i,j ∈ {1, 2, ..., K}.
T_max ← max_{1≤i≤K} {T_i(λ_i)};
T_min ← min_{1≤i≤K} {T_i(λ_i)};
D ← (T_max + T_min)/2;
V_s ← {i | T_i(λ_i) > D};
V_t ← {j | T_j(λ_j) ≤ D};
D′ ← ∞;
while |D − D′| > θ do
    for each i ∈ V_s do
        calculate φ_i such that |(D − T_i(λ_i − φ_i))/D| ≤ ε;
    for each j ∈ V_t do
        calculate φ_j such that |(D − T_j(λ_j + φ_j))/D| ≤ ε;
    Φ ← {φ_k | k ∈ V_s ∪ V_t};
    calculate f(i,j) for each i,j ∈ {1, 2, ..., K} by invoking Procedure minLatencyFlow(V_s, V_t, Φ);
    for each j ∈ V_t do
        calculate D(j) by Eq. (2.7);
    D′ ← max_{j∈V_t} {D(j)};
    D ← (D + D′)/2;
2.4.2 Minimum-latency Flow
Once we have determined the amount of outgoing or incoming task flow $\phi_k$ for each cloudlet $k$, we then determine, for each outgoing task flow from an overloaded cloudlet $i$ to each incoming task flow of underloaded cloudlet $j$, the value of the redirected task flow $f(i,j)$. To this end, we reduce the problem of routing the outgoing task flow from overloaded cloudlets to underloaded cloudlets to the minimum-cost
maximum flow problem in an auxiliary flow graph $G = (V, E)$ derived from the original network as follows (see Fig. 2.4).
We first add a virtual source node $s$ and a virtual sink node $t$ to $V$. We then partition the cloudlets into two disjoint sets: set $V_s$ of overloaded cloudlets and set
Figure 2.4: Auxiliary flow graph is generated to find minimum-latency flow
from node $s$ to each node in $V_s$, and a directed edge from each node in $V_t$ to node $t$. This gives us the set of edges $E = \{\langle s,i \rangle \mid i \in V_s\} \cup \{\langle j,t \rangle \mid j \in V_t\} \cup \{\langle i,j \rangle \mid i \in V_s,\ j \in V_t\}$.
Denote by $u(i,j)$ the capacity of edge $\langle i,j \rangle \in E$. The edge capacity of each edge from the source node $s$ to a cloudlet node $i \in V_s$ is set as $u(s,i) = \phi_i$ for each edge
$\langle s,i \rangle$, and the edge capacity of each edge from an underloaded cloudlet node $j \in V_t$ to the sink node $t$ is set as $u(j,t) = \phi_j$ for each edge $\langle j,t \rangle$. The latency cost of each
edge from the source node $s$ to an overloaded cloudlet node $i \in V_s$ is set as zero, i.e., $c_{s,i} = 0$. Similarly, the latency cost of each edge from each underloaded cloudlet node $j \in V_t$ to the sink node $t$ is set as zero, i.e., $c_{j,t} = 0$. For each edge $\langle i,j \rangle$, $i,j \neq s,t$, from an overloaded cloudlet $i$ to an underloaded cloudlet $j$, its edge capacity is set as $u(i,j) = \min\{u(s,i), u(j,t)\}$.
Having constructed the flow graph $G$, it can be seen that the problem of routing outgoing task flow from overloaded cloudlets to underloaded cloudlets is reduced to
finding a minimum-cost maximum flow in $G$ from $s$ to $t$, i.e.,
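\[ \min \sum_{(i,j) \in E} f(i,j) \cdot c_{i,j} \tag{2.12} \]
subject to the following constraints:
\[ f(i,j) \leq u(i,j), \quad \forall i,j \in V, \tag{2.13} \]
\[ \sum_{i \in V \setminus \{s\}} f(s,i) = \sum_{j \in V \setminus \{t\}} f(j,t), \quad i \neq s \text{ or } j \neq t, \tag{2.14} \]
\[ \sum_{j \in V \setminus \{s,t\}} f(i,j) = 0, \quad i \neq s \text{ or } j \neq t, \tag{2.15} \]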
where $f(i,j) \cdot c_{i,j}$ is the amount of network delay incurred by transferring tasks from cloudlet $i$ to cloudlet $j$. This is clearly an instance of the Hitchcock Transportation Problem [30], and can be solved within $O(K^4)$
time, using a Transportation Algorithm [24, 30], where $K$ is the number of cloudlets. The details are given in Procedure 2.1.
Procedure 2.1 minLatencyFlow
Input: V_s, V_t, {φ_k | k ∈ V_s ∪ V_t}
Output: f(i,j), i,j ∈ {1, 2, ..., K}.
/* Construct the flow network with latency weighted edges. */
V ← {1, 2, ..., K} ∪ {s, t};
E ← ∅;
for each i ∈ V_s do
    E ← E ∪ {⟨s,i⟩};
    u(s,i) ← φ_i;
    c_{s,i} ← 0;
for each j ∈ V_t do
    E ← E ∪ {⟨j,t⟩};
    u(j,t) ← φ_j;
    c_{j,t} ← 0;
for each i ∈ V_s do
    for each j ∈ V_t do
        E ← E ∪ {⟨i,j⟩};
        u(i,j) ← min{u(s,i), u(j,t)};
2.5
Distributed Genetic Algorithm
Although later experimental results indicate that the proposed heuristic Algorithm 2.1 in the previous section can deliver a feasible solution quickly, the solution may not
be sufficiently accurate. In this section we devise a distributed genetic algorithm for
CLBP that improves the accuracy of the solution at the expense of a longer running
time. The key to this distributed algorithm is to perform fine-grain workload
balancing among cloudlets iteratively through gene mutations until the solution converges
on a given threshold.
2.5.1 Genetic Algorithm Operations
Genetic algorithms (GAs) have been widely used for combinatorial optimization
problems. Traditional GAs maintain a population of solutions encoded as “genes”
where only the fittest genes are bred to produce successively fitter generations of
genes. However, genetic algorithms are often computation intensive and scale poorly.
To overcome this, we design a distributed genetic algorithm for CLBP by leveraging
the computing power of cloudlets. We first partition the cloudlets into overloaded
and underloaded cloudlets, using each cloudlet $p \in \{1, \ldots, K\}$ as a partition reference. We then solve the CLBP problem using a distributed GA to find the fittest
gene for each partition. We finally select the fittest gene among all partitions as the
solution to the problem.
We begin by discussing the representation of genes. A gene is simply an encoded
version of the task flow matrix $f$. While it is possible to use $f$ directly as the gene representation, it is not the most efficient way. Many potential solutions using $f$ as the gene will contain cloudlets that both receive and redirect flow, which
guarantees that the solution is sub-optimal. By partitioning the cloudlets into overloaded
and underloaded cloudlets, and representing only the task flow from overloaded to
underloaded cloudlets, the potential solution space is substantially reduced. This in
turn significantly reduces the number of iterations to an acceptable solution. If we
sort the cloudlets according to their local task response times, we can partition the
the sets $V_s$ and $V_t$ of overloaded and underloaded cloudlets as defined in the previous section. Let $g(i,j)$ denote the amount of task flow from cloudlet $i \in V_s$ to cloudlet
$j \in V_t$, for $i \neq j$, where $g$ is a $|V_s| \times |V_t|$ matrix. It can be seen that $g$ has a one-to-one mapping to the task flow variable $f$, i.e., every gene has a unique corresponding flow matrix. We have the following constraints on $g(i,j)$:
\[ \sum_{j \in V_t} g(i,j) \leq \lambda_i, \quad \forall i \in V_s, \tag{2.16} \]
\[ \sum_{i \in V_s} g(i,j) < n_j \cdot \mu_j, \quad \forall j \in V_t, \tag{2.17} \]
where Eq. (2.16) limits any given cloudlet $i \in V_s$ from redirecting more tasks than available according to its task arrival rate, and Eq. (2.17) limits any given cloudlet
$j \in V_t$ from having a total incoming task flow of more than $n_j \cdot \mu_j$, as this would result in an infinite queue time at the cloudlet (see Eq. (2.1)).
Our initial gene population is constructed by randomly populating $g_i$ with uniformly selected random numbers in the range $(0, \lambda_i)$. If a randomly generated gene violates one of the constraints, we randomly decrease values in the relevant row or
column until the constraints are met.
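One way to realize this initialization is sketched below. The text does not say how values are decreased; here an entry chosen at random is scaled by a random factor in $[0, 1)$, which is purely an assumption, as are the function name and the `rng` parameter used to make runs reproducible.

```python
import random


def random_gene(lam_s, cap_t, rng=random):
    """Draw one random gene g and repair it until it is feasible.

    lam_s[i] : arrival rate of the i-th overloaded cloudlet (row bound)
    cap_t[j] : n_j * mu_j of the j-th underloaded cloudlet (column bound)
    """
    rows, cols = len(lam_s), len(cap_t)
    g = [[rng.uniform(0.0, lam_s[i]) for _ in range(cols)]
         for i in range(rows)]
    # Repair rows: a cloudlet cannot redirect more than it receives.
    for i in range(rows):
        while sum(g[i]) > lam_s[i]:
            j = rng.randrange(cols)
            g[i][j] *= rng.random()
    # Repair columns: incoming flow must stay strictly below capacity.
    # Values only shrink here, so the row bounds remain satisfied.
    for j in range(cols):
        while sum(g[i][j] for i in range(rows)) >= cap_t[j]:
            i = rng.randrange(rows)
            g[i][j] *= rng.random()
    return g
```

Passing a seeded `random.Random` instance as `rng` makes the initial population repeatable across experiments.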
Denote by $P$ the given number of genes maintained in the gene population. The fitness of each gene is evaluated using our problem objective (defined in Eq. (2.9))
as our fitness function. We then select only the fittest genes to survive to the next
generation and repopulate the gene pool. Two genes $g_1$ and $g_2$ breed to create an offspring gene $g$ by taking the mean of each value:
\[ g(i,j) = \frac{1}{2}\left(g_1(i,j) + g_2(i,j)\right). \tag{2.18} \]
Denote by $S$ the number of surviving genes that will persist into the next generation. We use roulette selection to select $S$ survivors and randomly crossbreed the survivors until the gene pool population has been replenished. As we aim to
minimize the fitness function, we take the reciprocal of each gene’s fitness metric when
performing roulette selection.
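A single generation step might look like the following sketch. The function names, the `rng` parameter, and the choice to draw survivors with replacement are assumptions; the text only fixes the mean-value crossover and the use of reciprocal fitness as the roulette weight.

```python
import random


def breed(g1, g2):
    """Offspring gene: the element-wise mean of the two parents."""
    return [[(a + b) / 2 for a, b in zip(row1, row2)]
            for row1, row2 in zip(g1, g2)]


def next_generation(population, fitness, S, rng=random):
    """Roulette-select S survivors, then crossbreed back up to size P.

    fitness(g) is the objective value of gene g (lower is better and
    assumed positive), so its reciprocal is used as the selection weight.
    S must be at least 2 so that two parents can be drawn.
    """
    P = len(population)
    weights = [1.0 / fitness(g) for g in population]
    survivors = rng.choices(population, weights=weights, k=S)
    children = []
    while len(survivors) + len(children) < P:
        parent1, parent2 = rng.sample(survivors, 2)
        children.append(breed(parent1, parent2))
    return survivors + children
```

Because the constraints on $g$ are linear, the mean of two feasible genes is itself feasible, so offspring produced this way need no repair step.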