Provisioning Delay Aware Services in Mobile Edge Cloud Networks via Efficient Resource Allocation and Optimization

Mike Jia

August 2019

A thesis submitted for the degree of Doctor of Philosophy

This thesis is a presentation of the original work except where otherwise stated. I completed this work jointly with my supervisor, Professor Weifa Liang. My contribution to the work is around 80%.

Mike Jia

28 August 2019

This thesis could not have been completed without the help of many people.

I would like to express my sincerest gratitude towards my supervisor, Professor Weifa Liang for his unwavering support throughout my PhD. He has been a constant source of wisdom and encouragement, especially during times of doubt and difficulty. He has put great effort into training me in the scientific field, and under his expert guidance, I have become a qualified researcher. I will forever be grateful to him.

I would also like to thank the other members of my supervisory panel, Professor Brendan McKay and Professor Song Guo, for their brilliant ideas and professional guidance. This thesis would not be possible without them.

The staff at the Research School of Computer Science also deserve my deepest

thanks for their generous help and support for my research. Trina Merrell and Janette

Rawlinson were particularly helpful and deserve to be especially appreciated.

I am grateful to my friends, Meitian Huang, Yu Ma, Yang Liu, Haotian Chang, Jing Li, Kiki Wang, and others, for their kindness and company throughout my study, especially my best friend Kiki, who has been a wellspring of optimism and understanding.

Finally, I want to express my profound gratitude towards my parents who have

given me their utmost support my entire life. Without their continuous love and

encouragement, this thesis could not have been completed.

Journal

1. Jia, M.; Liang, W.; Xu, Z.; Huang, M.; and Ma, Y., 2019. QoS-aware cloudlet load balancing in wireless metropolitan area networks. To appear in IEEE Transactions on Cloud Computing, (2019)

2. Jia, M.; Liang, W.; Huang, M.; Xu, Z.; and Ma, Y., 2019. Routing cost minimization and throughput maximization of NFV-enabled unicasting in software-defined networks. To appear in IEEE Transactions on Network and Service Management, (2019)

3. Jia, M.; Cao, J.; and Liang, W., 2017. Optimal cloudlet placement and user to cloudlet allocation in wireless metropolitan area networks. IEEE Transactions on Cloud Computing, 5, 4 (2017), 725–737

4. Xu, Z.; Liang, W.; Xu, W.; Jia, M.; and Guo, S., 2016. Efficient algorithms for capacitated cloudlet placements. IEEE Transactions on Parallel and Distributed Systems, 27, 10 (2016), 2866–2880

5. Xu, Z.; Liang, W.; Jia, M.; Huang, M.; and Mao, G., 2019. Task offloading with network function requirements in a mobile edge-cloud network. To appear in IEEE Transactions on Mobile Computing, (2019)

6. Xu, Z.; Liang, W.; Huang, M.; Jia, M.; Guo, S.; and Galis, A., 2019. Efficient NFV-enabled multicasting in SDNs. To appear in IEEE Transactions on Communications, (2019)

Conference

1. Jia, M.; Liang, W.; Xu, Z.; and Huang, M., 2016. Cloudlet load balancing in wireless metropolitan area networks. In Computer Communications, IEEE INFOCOM 2016-The 35th Annual IEEE International Conference on, 1–9. IEEE

2. Jia, M.; Liang, W.; Huang, M.; Xu, Z.; and Ma, Y., 2017. Throughput maximization of NFV-enabled unicasting in software-defined networks. In GLOBECOM 2017-2017 IEEE Global Communications Conference, 1–6. IEEE

3. Jia, M.; Liang, W.; and Xu, Z., 2017. QoS-aware task offloading in distributed cloudlets with virtual network function services. In Proceedings of the 20th ACM International Conference on Modelling, Analysis and Simulation of Wireless and Mobile Systems, 109–116. ACM

4. Jia, M. and Liang, W., 2018. Delay-sensitive multiplayer augmented reality game planning in mobile edge computing. In Proceedings of the 21st ACM International Conference on Modeling, Analysis and Simulation of Wireless and Mobile Systems, 147–154. ACM

5. Xu, Z.; Liang, W.; Xu, W.; Jia, M.; and Guo, S., 2015. Capacitated cloudlet placements in wireless metropolitan area networks. In Local Computer Networks (LCN), 2015 IEEE 40th Conference on, 570–578. IEEE

6. Huang, M.; Liang, W.; Xu, Z.; Jia, M.; and Guo, S., 2016. Throughput maximization in software-defined networks with consolidated middleboxes. In Local Computer Networks (LCN), 2016 IEEE 41st Conference on, 298–306. IEEE

7. Xu, Z.; Liang, W.; Huang, M.; Jia, M.; Guo, S.; and Galis, A., 2017. Approximation and online algorithms for NFV-enabled multicasting in SDNs. In

Thanks to advances in wireless communication and mobile computing, the last decade has seen an explosion of new innovative services on smartphone devices in areas as diverse as transportation, mobile payment and social media. The ubiquity of mobile smart devices and their constant presence in everyday life has generated unprecedented data traffic between end users and the remote cloud. To prepare for the increasing data traffic in the coming years and the demand for low-latency computation resources near the user, network service providers are increasingly turning to Mobile Edge Computing to bring cloud computing capabilities to the edge of the network.

Mobile Edge Computing (MEC) is a recent network paradigm and is conceived as consisting of three layers: (1) a layer of users, (2) a layer of small-scale data centers called cloudlets situated at the network edge that inter-connect to form the edge-cloud, and (3) geographically distributed data centers that form the remote cloud with vast resources in remote locations. Users at the edge of the network offload computation tasks to the edge-cloud instead of remote clouds, thereby decreasing the response time for offloaded tasks and reducing congestion in the back-haul network.

In this thesis, we will focus on the provisioning of Delay-Aware Services in MEC

networks by efficiently utilizing various MEC resources to reduce the latency of user

offloaded tasks in different application scenarios, while meeting ever-growing user

demands.

First, we address how to balance the workload among cloudlets in the edge-cloud with the aim of minimizing the maximum response time of all offloaded tasks. We propose two algorithms for the problem: one is a fast heuristic, and the other is a distributed genetic algorithm that is capable of delivering a more accurate solution than the heuristic, but at the expense of a longer running time.

We then study policy-aware unicast request admissions with and without end-to-end delay constraints in a Software Defined Network (SDN). We develop efficient algorithms for the admission of a single request with and without the end-to-end delay constraint, and online algorithms with a guaranteed performance for the dynamic admission of requests without the knowledge of future arrivals. In particular, we provide the very first online algorithm with a provable competitive ratio for the problem without the end-to-end delay requirement.

Third, we investigate the deployment of virtualized network functions among cloudlets to serve end-users, while meeting the resource demands of mobile users and their Quality-of-Service (QoS) requirements. We devise an efficient algorithm for the problem by utilizing VNF instance sharing and cost-effective creation of new VNF instances, and develop an effective prediction mechanism to predict idle VNF instance releases and new VNF instance creations for further cost savings over time.

Fourth, we envision a scenario in the near future where players wearing AR heads-up display devices engage with other players over a large area with densely deployed cloudlets. We propose a novel system model and formulate the Decentralized Multiplayer Coordination (DMC) Problem with the aim of minimizing the game frame duration of all players. We then devise an efficient algorithm for the problem.

Finally we conduct extensive experiments to evaluate the effectiveness of each

proposed algorithm, and investigate the impact of various algorithm parameters and

environmental settings. Experimental results show that the proposed algorithms are

Contents

1 Introduction
  1.1 Mobile Edge Computing
    1.1.1 Cloudlets and the Edge-Cloud
    1.1.2 Data and Computation Transfer to the Edge-Cloud
    1.1.3 The Remote Cloud
    1.1.4 Data and Computation Transfer to the Remote Cloud
  1.2 Supporting Network Services in MEC
  1.3 Augmented Reality: The killer MEC Application
  1.4 Research Topics
    1.4.1 QoS-Aware Task Load Balancing in the Edge-Cloud
    1.4.2 Routing Cost and Throughput Optimization of Requests in the Remote Cloud
    1.4.3 QoS-Aware Virtual Network Service Deployment in the Edge-Cloud
    1.4.4 Multiplayer Augmented Reality Game Planning in Mobile Edge Computing
  1.5 Thesis Contributions
  1.6 Thesis Organization

2 QoS-Aware Task Load Balancing in the Edge-Cloud
  2.1 Introduction
  2.2 Related Works
  2.3 Preliminaries
    2.3.1 System Model
    2.3.2 Problem Definition
  2.4 Heuristic Algorithm
    2.4.1 Balancing Task Response Time
    2.4.2 Minimum-latency Flow
  2.5 Distributed Genetic Algorithm
    2.5.1 Genetic Algorithm Operations
    2.5.2 Distributed Algorithm
  2.6 Performance Evaluation
    2.6.1 Simulation Environments
    2.6.2 Performance Evaluation of Different Algorithms
    2.6.3 Impact of Important Parameters on the Performance of Algorithms
  2.7 Summary

3 Routing Cost and Throughput Optimization of Requests in the Remote Cloud
  3.1 Introduction
    3.3.1 System model
    3.3.2 Problem definitions
  3.4 A Generic Optimization Framework
    3.4.1 Overview
    3.4.2 The construction of the auxiliary directed acyclic graph H_k for request ρ_k
    3.4.3 Operational cost and transmission delay models
  3.5 Algorithms for Delay-Aware NFV-enabled Unicasting Problem
    3.5.1 Optimal algorithm without the end-to-end delay constraint
    3.5.2 Heuristic algorithm with the end-to-end delay constraint
  3.6 Online Algorithm for Dynamic Admissions of NFV-enabled Requests Routing
    3.6.2 Online algorithm for the online delay-aware NFV-enabled unicasting problem
    3.7.1 Experimental environmental setting
    3.7.2 Performance evaluation of the proposed algorithms for a single request
    3.7.3 Performance evaluation of the proposed online algorithms
    3.7.4 Impact of different parameters
  3.8 Summary

4 QoS-Aware Virtual Network Service Deployment in the Edge-Cloud
  4.1 Introduction
    4.3.1 System model
    4.3.2 End-to-end delay of offloading requests
    4.3.3 The admission cost
    4.3.4 Problem definition
  4.4 Online Algorithm
    4.4.1 Algorithm for offloading requests at each time slot
    4.4.2 Online algorithm for the minimum operational cost problem
    4.4.3 Algorithm complexity analysis
    4.5.1 Experimental settings
    4.5.2 Algorithm performance within a single time slot
    4.5.3 Performance evaluation of the proposed online algorithm
  4.6 Summary

5 Multiplayer Augmented Reality Game Planning in Mobile Edge Computing
  5.1 Introduction
    5.3.1 Overview
    5.3.2 Optimization Objective
    5.3.3 Problem Definition
  5.4 Algorithm
  5.6 Summary

6 Conclusion and Future Directions
  6.1 Summary of Contributions
  6.2 Future Directions

List of Figures

1.1 Mobile Edge Computing Architecture

1.2 Example of an Augmented Reality frame. First, a snapshot of the user view is captured and sent to a nearby MEC cloudlet. Digital elements are then rendered on the cloudlet and transferred back to the user device.

2.1 Cloudlets in the edge-cloud collocated with micro-base station access points (APs)

2.2 The flow of tasks from cloudlet i to cloudlet j

2.3 Finding optimal outgoing and incoming workload for each cloudlet in the edge-cloud

2.4 Auxiliary flow graph is generated to find minimum-latency flow

2.5 Interactions between the supervisor and islands

2.6 Impact of network conditions on performance of algorithms Heuristic and Distributed

2.7 Impact of network size K on the performance of algorithms Heuristic and Distributed.

2.8 The impact of important parameters on the performance of algorithm Heuristic

2.9 The impact of important parameters on the performance of algorithm Distributed.

3.1 A software-defined network G with a set V = {v1, v2, v3, v4, v5, v6} of SDN-enabled switch nodes and a subset V_S = {v1, v4, v5, v6} (V_S ⊆ V) of switch nodes attached with data centers.

3.2 A constructed auxiliary graph H_k, where V_1, ..., V_l represent the sets of candidate nodes for each service layer in the service chain

3.3 The performance of algorithms UNICAST, UNICAST_DELAY, ONE_DC, and ONE_DC_DELAY.

3.4 The performance of online algorithms ONLINE, ONLINE_DELAY, and LINEAR.

3.5 The impact of l_max on the performance of algorithms UNICAST, UNICAST_DELAY, ONE_DC and ONE_DC_DELAY.

3.6 The number of requests admitted by online algorithms ONLINE and ONLINE_DELAY with different values of β.

3.7 The number of requests admitted by online algorithms ONLINE and ONLINE_DELAY with and without the admission control threshold σ.

4.1 Performance of algorithms ALG and HRF, by varying the number of requests from 50 to 250, while the number of cloudlets in the network is 10.

4.2 Performance of algorithms ALG and HRF by varying the number of cloudlets in the network from 5 to 25.

4.3 Performance of algorithms ALG and HRF by varying the network size from 50 to 800.

4.4 Performance of algorithms ALG and HRF when the maximum delay requirement of a request is varied from 0.04 to 0.12.

4.5 Performance of algorithms ALG and HRF in a time horizon of 100 time slots.

4.6 (a) - (b) show the performance of algorithms ALG and HRF by varying the number of cloudlets from 5 to 25 for a time horizon of 100 time slots. (c) - (d) show the performance of algorithms ALG and HRF by varying the network size from 50 to 800 for a time horizon of 100 time slots.

4.7 Performance of algorithms ALG and HRF by varying the maximum end-to-end delay requirement of a request from 0.04 to 0.12 for a time horizon of 100 time slots.

4.8 Performance of algorithms ALG and HRF for a time horizon of 100 time slots, by varying the idle cost threshold and the creation cost threshold from 100 to 10,000.

5.1 Partitioning players into groups with overlapping AOIs

5.2 An illustrated overview of a game frame

5.3 The performance of the proposed algorithm and the benchmark.

Introduction

Thanks to advances in wireless communication and mobile computing, the last decade has seen an explosion of new innovative services on smartphone devices in areas as diverse as transportation, mobile payment and social media. At the same time, there has also been a rapid adoption of sensors and wearable devices that make up an emerging Internet of Things (IoT), with Cisco predicting that over 50 billion of these devices will be added to the internet by 2020 [27]. As the computing capacity of mobile and IoT devices is limited due to their portable size, they often rely on the abundant storage and computation resources on the remote cloud to process offloaded data and computation tasks. As sensor feedback and mobile applications become more real-time and interactive, there exists an increasing demand for low-latency response time when offloading data processing and computation tasks. However, the geographical distance between end-users and the remote cloud can result in lengthy delays of up to a second, which is unacceptable for some interactive applications and delay-sensitive sensors. Furthermore, the ubiquity of mobile smart devices and their constant presence in everyday life has generated unprecedented data traffic between end users and the remote cloud. As data exchange between end-users and the remote cloud continues to rapidly increase, relying solely on cloud computing can strain mobile network resources, and overwhelm the back-haul network of mobile service providers [19]. To prepare for the increasing data traffic in the coming years and the demand for low-latency computation resources near the user, planners of the next generation of cellular network technology are increasingly turning to Mobile Edge Computing to bring cloud computing capabilities to the edge of the network.

1.1 Mobile Edge Computing

Mobile Edge Computing (MEC) is a network paradigm that has recently emerged as a potential solution to the problem of providing a low latency computing environment for mobile users. By densely deploying clusters of computers called cloudlets collocated with micro-base stations in urban areas [3, 82], MEC pushes cloud computing capabilities to the edge of the network, thus providing a reliable low latency computing environment for mobile users. MEC is typically conceived as consisting of three layers as seen in Fig. 1.1: (1) a layer of users, (2) a layer of small-scale data centers called cloudlets situated at the network edge that inter-connect to form the edge-cloud, and (3) geographically distributed data centers that form the remote cloud with vast resources in remote locations. Users at the edge of the network offload computation tasks to the layer of cloudlets instead of the remote cloud, thereby decreasing the response time for offloaded tasks and reducing congestion in the back-haul network. Relevant user information or surplus computation tasks can be transferred from the cloudlet layer to the remote cloud layer for storage and processing.

1.1.1 Cloudlets and the Edge-Cloud

A cloudlet is a trusted, resource-rich cluster of computers wirelessly connected to its nearby mobile users [81]. Mobile devices are resource constrained due to their portability and can struggle with applications that have heavy computation demands, for example real time video games with high fidelity graphics. By offloading part of the application to the cloudlet for execution, the user can take advantage of the low latency computing resources available on the cloudlet and enjoy a better game experience.

To provide seamless support for mobile users on the go, cloudlets must be constantly accessible to users while outside. Studies have shown how cloudlets can be deployed in public wireless metropolitan area networks (WMANs) [43, 102, 103] as a complementary service to Wi-Fi Internet access, or together with micro-base stations accessible through mobile cellular networks [3, 50].

Figure 1.1: Mobile Edge Computing Architecture

[80], however a growing body of work has demonstrated the feasibility of managing

a network of cloudlets, and the clear benefits of cloudlet load balancing [48, 49, 76,

86, 92, 105]. By linking cloudlets together, either wirelessly or via wired connections,

they form an Edge-Cloud Network (ECN). Once tasks are uploaded to a cloudlet

within the network, they can be migrated to a different cloudlet for execution in

the case where the former cloudlet has a large workload. Load balancing within

a network of cloudlets is especially important in dense urban environments where

user demand can be particularly heavy and fluctuates over time, and designing a load

balancing algorithm in such an environment while avoiding network congestion is

thus an important challenge.

1.1.2 Data and Computation Transfer to the Edge-Cloud

The mechanism for offloading data processing and computation varies across different studies, but generally, researchers follow a client-server model. The mobile user first establishes a wireless connection with a nearby cloudlet, and the task is encapsulated in a light-weight virtual machine (VM) [18, 20, 35, 57]. This VM capsule is then uploaded to the cloudlet for execution, and once the task on the VM in the cloudlet

To ensure a high Quality of Service (QoS) for the user, the response time of a task,

that is the time taken for an offloaded task to be remotely executed and returned to

the device, has to be minimized. A major difficulty in optimizing this objective is the

limited wireless bandwidth available between the user and the cloudlet, especially

in the case where the data output of the task is especially heavy. A trade-off must

be made between the gains in computing resources on the cloudlet and the data

transfer delay, when deciding which tasks to offload to the cloudlet. This is further

complicated by the fact that some tasks will have dependencies on other tasks. To

tackle this problem, a common approach is to conceptualize the application as a

graph of task dependencies each with data inputs and outputs. Assuming that some

subset of the tasks in the application task graph can be remotely executed [44, 52,

106], a carefully designed algorithm can take a fine grained approach by offloading

individual tasks within an application to the cloudlet for execution while minimizing

the transfer of data between the mobile device and the cloudlet.
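The trade-off between faster remote execution and the cost of moving data over the wireless link can be illustrated with a minimal sketch. The four-task chain, its timings, the data sizes, and the link rate below are all hypothetical values chosen for illustration, and the exhaustive search is only practical for very small task graphs; it is not one of the algorithms from the cited studies.

```python
from itertools import product

# Hypothetical application: task -> (local_time_s, cloudlet_time_s).
TASKS = {
    "capture": (0.02, 0.02),
    "detect": (0.40, 0.05),
    "classify": (0.60, 0.08),
    "display": (0.03, 0.03),
}
# Tasks assumed to be tied to the device (camera and screen access).
PINNED_LOCAL = {"capture", "display"}
# Dependency edges: (producer, consumer, data passed in megabits).
EDGES = [
    ("capture", "detect", 8.0),
    ("detect", "classify", 0.5),
    ("classify", "display", 0.1),
]
BANDWIDTH_MBPS = 20.0  # assumed wireless rate between device and cloudlet


def completion_time(placement):
    """Time of one sequential run for a placement {task: 'local' | 'cloudlet'}."""
    compute = sum(
        TASKS[t][0] if placement[t] == "local" else TASKS[t][1] for t in TASKS
    )
    # Data crosses the wireless link only when an edge spans both sides.
    transfer = sum(
        mbit / BANDWIDTH_MBPS
        for producer, consumer, mbit in EDGES
        if placement[producer] != placement[consumer]
    )
    return compute + transfer


def best_placement():
    """Try every placement of the offloadable tasks and keep the fastest."""
    free = [t for t in TASKS if t not in PINNED_LOCAL]
    best = None
    for choice in product(("local", "cloudlet"), repeat=len(free)):
        placement = {t: "local" for t in PINNED_LOCAL}
        placement.update(zip(free, choice))
        cost = completion_time(placement)
        if best is None or cost < best[1]:
            best = (placement, cost)
    return best


placement, cost = best_placement()
print(placement, round(cost, 3))
```

With these made-up numbers, the search keeps the detection task on the device because its input image is expensive to upload, and offloads only the classification task, whose input is small.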

Many applications that involve machine learning techniques easily lend themselves to this model. In particular, mobile task offloading techniques have been studied in the context of facial recognition applications [9, 85]. However, not all applications with heavy computation demands fit into this mold, and developing an application with offloadable components can be burdensome for developers. To overcome this, a recent work [108] has proposed a method to parse Android applications and automatically classify methods that can be offloaded. In a related study [109] the authors further present a tool for optimizing mobile computation offloading in Android applications.

Another challenge to optimizing task offloading in an application is how to accurately predict the response time of an offloaded task. Calculating the exact task response time of each task offloaded on to a cloudlet is highly complex, especially in a system where computation resources are shared by other tasks. While calculating the precise response time for each specific task is infeasible, the average task response times can be accurately estimated using queueing theory [54], thus many studies [17, 28, 42, 43, 62, 69, 84] have presented system models that rely on queueing

To control the complex offloading decision, most studies also assume that a task

offloading manager operates in the background on the mobile device, monitoring

network performance, predicting computation requirements of mobile applications,

and estimating execution times on both local devices and the cloud [20, 23]. Using

this information, the task offloading manager can coordinate with cloudlets to decide

which tasks to offload.

Studies so far have mainly focused on optimizing either the throughput of the application task graph or the total delay under a single-user, single-cloudlet scenario. However, when considering a more realistic edge-cloud scenario where a network of cloudlets shares the workload from multitudes of users competing for cloudlet resources, it becomes a challenge to balance the needs of all the users. An obvious approach is to optimize the average task response time; however, this can be an inadequate solution, as the algorithm may choose to deliberately disadvantage some users in order to provide other users with resources and thereby improve the global average. To achieve a more egalitarian solution, it is important for our objective to minimize the maximum response time of all user requests, ensuring that all users can benefit from the edge-cloud.
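The difference between the two objectives can be seen in a small numerical sketch. It assumes, purely for illustration, that each cloudlet behaves as an M/M/1 queue, whose mean response time is 1/(μ − λ) for service rate μ and arrival rate λ; the two service rates and the total load are made-up values, and the grid search merely stands in for a real optimization algorithm.

```python
def mm1_response_time(service_rate, arrival_rate):
    """Mean response time (waiting plus service) of a stable M/M/1 queue."""
    if arrival_rate >= service_rate:
        raise ValueError("unstable queue: arrival rate must be below service rate")
    return 1.0 / (service_rate - arrival_rate)


MU = (10.0, 4.0)    # assumed service rates of two cloudlets (tasks per second)
TOTAL_LOAD = 8.0    # assumed total task arrival rate (tasks per second)


def evaluate(load_on_first):
    """Return (average, maximum) mean response time for a split of the load."""
    loads = (load_on_first, TOTAL_LOAD - load_on_first)
    times = [mm1_response_time(mu, lam) for mu, lam in zip(MU, loads)]
    average = sum(lam * t for lam, t in zip(loads, times)) / TOTAL_LOAD
    return average, max(times)


# Candidate splits in steps of 0.001 that keep both queues stable.
candidates = [i / 1000 for i in range(4001, 8001)]
best_for_average = min(candidates, key=lambda x: evaluate(x)[0])
best_for_maximum = min(candidates, key=lambda x: evaluate(x)[1])

print("min-average split:", best_for_average, evaluate(best_for_average))
print("min-max split:    ", best_for_maximum, evaluate(best_for_maximum))
```

Under these assumptions, the split that minimizes the average leaves tasks on the slower cloudlet with a noticeably longer mean response time than the split that minimizes the maximum, which equalizes the response times of the two cloudlets.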

1.1.3 The Remote Cloud

While cloudlets in an MEC environment can provide mobile users with a low-latency computing environment, their resources are limited. During periods of peak demand, the cloudlet layer can rely on the remote cloud layer and offload some of its data processing and computation tasks. Remote distributed clouds (remote clouds) are often defined as a network of small to medium-sized data centers inter-connected by Wide Area Network and accessible to mobile users through the Internet [25]. Although remote clouds have a higher latency to end users compared to cloudlets, they have far more abundant resources. In the case where the cloudlet layer is overwhelmed by user demands, the remote cloud can act as a fallback and admit surplus computation tasks from the cloudlet layer for processing. Furthermore, while

collected will be too large to be persisted on the cloudlet layer in the long-term. The remote cloud is thus an ideal solution for storing rapidly increasing amounts of social media data and user information collected by cloudlets, due to their rich resources, reliability, and disaster-resilience [6]. Remote clouds are therefore vital as the destination for high volumes of data collected by cloudlets that need to be analyzed, processed, or persisted.

Similar to research done on task offloading systems in cloudlet networks, many

studies that focus on optimizing operations for routing and executing task requests

in the remote cloud also rely on Queueing Theory [96]. This continuity makes it

convenient for the remote cloud and the edge-cloud to be formulated as a single

system [17, 43, 48, 49]. While the edge-cloud and the remote cloud work closely

together, these two layers in the MEC architecture are functionally independent and

sometimes even operated by different service providers. As a result there exists a

challenge in how requests from the edge-cloud should be admitted into the remote

cloud.

1.1.4 Data and Computation Transfer to the Remote Cloud

As the service provider of the edge-cloud network may differ from the service provider of the remote cloud, requests for offloading data and computation tasks from cloudlets to the remote cloud will need to be processed by a specified sequence of network functions such as firewalls, intrusion detection systems (IDSs), deep packet inspection (DPI), and so on, to protect the user’s data and ensure its integrity. Traditionally, such network functions were performed by hardware-specific network devices at the data center; however, these devices are difficult to update and lead to inflexible data pipelines that become bottlenecks [36]. To overcome the inflexibility of hardware middle-boxes, Network Function Virtualization (NFV) emerged as a leading solution. By implementing network functions that previously ran on specific hardware as software on generic machines, network function instances can be instantiated as VMs and deployed anywhere within the data center.
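The idea of a request passing through a specified sequence of software network functions can be sketched as follows. The packet fields, the filtering rules, and all function names are hypothetical and greatly simplified; real VNFs run as separate VM instances rather than as functions in a single process.

```python
from typing import Callable, List, Optional

Packet = dict
NetworkFunction = Callable[[Packet], Optional[Packet]]


def firewall(packet: Packet) -> Optional[Packet]:
    """Drop traffic to any port other than the (assumed) allowed web ports."""
    return packet if packet["dst_port"] in {80, 443} else None


def intrusion_detection(packet: Packet) -> Optional[Packet]:
    """Flag payloads containing a toy attack signature; never drops."""
    flagged = b"DROP TABLE" in packet["payload"]
    return {**packet, "flagged": flagged}


def deep_packet_inspection(packet: Packet) -> Optional[Packet]:
    """Label the application protocol by looking inside the payload."""
    is_http = packet["payload"].startswith((b"GET ", b"POST "))
    return {**packet, "app": "http" if is_http else "unknown"}


def run_chain(packet: Packet, chain: List[NetworkFunction]) -> Optional[Packet]:
    """Pass the packet through each function in order; None means dropped."""
    for function in chain:
        result = function(packet)
        if result is None:
            return None
        packet = result
    return packet


# The order of the chain is part of the request's policy.
SERVICE_CHAIN = [firewall, intrusion_detection, deep_packet_inspection]

print(run_chain({"dst_port": 80, "payload": b"GET /index.html"}, SERVICE_CHAIN))
print(run_chain({"dst_port": 22, "payload": b""}, SERVICE_CHAIN))
```

Because each function is ordinary software, the chain can be reordered, extended, or replicated without changing any hardware, which is the flexibility that NFV offers over dedicated middle-boxes.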

Networking (SDN) to deal with unexpected link failures in the network under increasingly heavy traffic. By reserving a portion of bandwidth to report link failures and other important information, an SDN central controller can coordinate traffic around broken links as soon as they occur, avoiding congestion [2, 53, 71]. SDN techniques can also be applied to manage computation resources in the data center, allowing specially designed algorithms to deploy NFV instances to meet customer demands. Unlike in the edge-cloud layer where QoS is delivered by minimizing task response time, the QoS objective within a data center is typically to minimize operation costs, or maximize throughput, sometimes with a delay requirement.

Due to the huge volume of data generated by cloudlets and the limited bandwidth and data processing resources at the data centers, not all data transfer requests issued by cloudlets can be immediately admitted into the data center. Furthermore, many requests have an associated delay constraint that must be met. However, using a combination of NFV and SDN techniques, data center operators are given greater control and flexibility to meet the increased demand generated by the edge-cloud. As such, designing a request admission policy for data centers with the aim to maximize the throughput of requests and minimize operation costs poses a new and interesting challenge.
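To make the notion of a request admission policy concrete, the sketch below shows a deliberately naive first-come, first-served rule: a request is admitted only if its route meets its delay bound and enough computing capacity remains. The `Request` fields, the capacity figure, and the policy itself are assumptions for illustration, not the admission algorithms developed later in this thesis.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Request:
    name: str
    demand: float       # computing resource the request consumes
    path_delay: float   # delay of the route chosen for the request (seconds)
    delay_bound: float  # end-to-end delay the request can tolerate (seconds)


def admit_online(requests: List[Request], capacity: float) -> Tuple[List[str], float]:
    """Process requests in arrival order without knowledge of future arrivals."""
    admitted = []
    residual = capacity
    for request in requests:
        meets_delay = request.path_delay <= request.delay_bound
        fits = request.demand <= residual
        if meets_delay and fits:
            residual -= request.demand
            admitted.append(request.name)
    return admitted, residual


arrivals = [
    Request("r1", demand=4.0, path_delay=0.05, delay_bound=0.10),
    Request("r2", demand=3.0, path_delay=0.20, delay_bound=0.10),  # too slow
    Request("r3", demand=5.0, path_delay=0.08, delay_bound=0.10),
    Request("r4", demand=2.0, path_delay=0.02, delay_bound=0.05),  # no room left
]
print(admit_online(arrivals, capacity=10.0))
```

A rule of this kind admits whatever arrives first, so a single large early request can block several smaller later ones; this is one reason why admission policies with performance guarantees are of interest.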

1.2 Supporting Network Services in MEC

Previously we discussed how deploying virtualized network functions in data centers can increase flexibility and responsiveness in distributed data centers. Similarly, network functions, or network services, can also be deployed on cloudlets at the edge of the network. The emergence of IoT and the proliferation of sensors deployed in urban environments demands the support of middle-box software to provide solutions to problems like heterogeneity, interoperability, security and dependability [41, 90]. Virtual Network Function (VNF) instances deployed on the cloudlets will be close to IoT devices and sensor nodes, allowing cloudlets to generate up-to-date and accurate information of the local area [73]. This further enables a wide range of context-aware

A potential example is the deployment of security cameras in crowded areas to spot wanted criminals. The security cameras will stream their live video feed to nearby cloudlets, where facial-recognition network service instances process multiple video streams simultaneously to match individuals' faces against a police database.

MEC services could also potentially be deployed to manage fleets of self-driving

vehicles. Autonomous vehicles in urban areas can detect sudden changes in traffic

conditions due to accidents or pedestrian activities and notify a nearby cloudlet [66].

The cloudlet can then cross-reference the information with other vehicles in the area,

and re-optimize the recommended routes of some vehicles to avoid congested roads.

Supporting these kinds of network services requires the traditional client-server task offloading approach to be re-examined. While the client-server model of discrete user tasks being offloaded to the cloudlet is accurate in many scenarios, it assumes that cloudlet resources are dedicated to each individual task. Since there is no sharing of resources between tasks, when several users are demanding the same service, cloudlet resources will be inefficiently allocated. On the other hand, network services deployed as VNF instances on cloudlets can serve multiple users at the same time, and cross-reference user uploaded information to produce more useful results. This model for processing multiple user requests simultaneously is also critical to the success of Augmented Reality, which is one of the most anticipated use-cases for MEC.

1.3 Augmented Reality: The killer MEC Application

Augmented Reality (AR) is a technology that superimposes interactive digital

ele-ments on top of the real world view of a user device, and has attracted considerable

investment from major technology companies. In 2017 at the F8 developer

confer-ence, Facebook CEO Mark Zuckerberg spoke at length about the potential of AR

and described a future where artists could display digital artwork in public spaces

and friends could share virtual signs and objects[1]. AR could also disrupt the work

environment, with AR headset displays like Microsoft Hololens, Google Glass, and


collaborations among colleagues. However, AR has been particularly successful in games,

as demonstrated by the explosive popularity of the mobile AR game Pokemon Go.

Pokemon Go was released in July 2016 and became the most active mobile game in

the United States while generating more than 160 million US dollars through in-game

purchases before the end of the month [83].

An AR device displays digital elements to a user by proceeding in frames [34, 79, 89]. At the start of a frame, an image is captured on the device’s camera along with data from other sensors, and the user’s precise position and orientation are

aggregated from the raw data stream of sensors like accelerometers, as seen in Fig. 1.2.

The image frame may also be analyzed to identify surfaces, obstacles, and landmarks

that may affect the appearance or behavior of the digital elements. A view of the

digital elements is then rendered according to the aggregated data and integrated

with the captured image. The image is displayed to the user and the next frame

begins. As AR devices are either handheld or worn as a headset, there are

severe weight limitations on the device that strictly limit its computing resources. As

a result, AR devices strongly depend on cloudlets deployed throughout mobile

networks to deliver cached contents and provide low-latency computation environments

for the computation intensive steps of tracking the user, analyzing image frames, and

integrating digital elements into the user’s view [11, 15, 37].

Since AR combines virtual elements with real world environments, many AR

games and objects will exist in the context of a specific environment, e.g., digital

fish swimming in a real world fountain. These AR games and elements can thus

be hosted on nearby cloudlets and accessed by users in the area who connect to the

cloudlet [48, 79]. As many users in a single area are likely to request the same AR

service, instead of allocating resources to each individual user on the cloudlet, it is more

efficient to instantiate a service instance on the cloudlet to serve multiple users

simultaneously. However, allocating existing service instances to new users and creating

new service instances where demand arises is a non-trivial task, especially where

user experience demands a very tight delay requirement when processing user

requests for a particular service. Furthermore, public digital objects will be constantly


Figure 1.2: Example of an Augmented Reality frame. First, a snapshot of the user view is captured and sent to a nearby MEC cloudlet. Digital elements are then

rendered on the cloudlet and transferred back to the user device.

computation-intensive and strain cloudlet resources, especially if multiple

users in close proximity to each other are acting simultaneously.

1.4 Research Topics

In this thesis, we study the provisioning of delay-aware services in MEC by

efficiently utilizing various resources of MECs and remote clouds to reduce the latency

of user offloaded tasks in different application scenarios, while meeting ever-growing

user demands. Specifically, we will address the following four main issues: (1) how

to balance the workload among cloudlets in the edge-cloud to reduce the average

response time of user offloaded tasks; (2) how to minimize the operation cost of

service providers while maximizing network throughput of tasks with specified

service chains; (3) how to deploy network service instances (instances of virtual network

functions) to meet the resource demands of mobile users and their Quality-of-Service

(QoS) requirements; and finally (4) how to coordinate a massive multi-player

Augmented Reality (AR) game among players in MEC networks to maximize the quality


1.4.1 QoS-Aware Task Load Balancing in the Edge-Cloud

One major issue that MEC planners face is how to allocate user task requests to

different cloudlets so that the workload among cloudlets is well balanced, thereby

shortening the response time delay of tasks and enhancing user experience in the

use of the service. A typical solution to this is to allocate user requests to their

closest cloudlets to minimize the network delay; however, this approach has been

demonstrated to be inadequate in high user-density areas [43]. Specifically, the vast

number of users in the network means that the workload at each individual cloudlet

will be highly volatile. If a cloudlet is suddenly overwhelmed with user requests,

the task response time at the cloudlet will increase dramatically, causing lag in the

user applications and degrading user experiences. To prevent some cloudlets from

being overloaded, it is crucial to assign user requests to different cloudlets such that

the workload among the cloudlets is well balanced, thereby reducing the maximum

response time of offloaded tasks.

1.4.2 Routing Cost and Throughput Optimization of Requests in the Remote Cloud

In order to ensure data transfer security, system performance, and data integrity, requests to offload data and computation tasks from the cloudlets to the remote cloud must adhere to specific policy enforcement requirements to be admitted into the

remote cloud environment. These policy enforcement requirements consist of a service

chain of network functions the request must pass through before reaching the final

destination of the request in the remote cloud. Using NFV techniques, these network

functions can be instantiated on a VM running on generic hardware, allowing them

to be dynamically created anywhere within a data center, giving the network

administrator great flexibility in handling large throughputs of NFV-enabled requests with

heterogeneous service chain specifications.

However, admitting NFV-enabled requests in an SDN poses great challenges.

First, for each NFV-enabled request, we must determine not only a routing path


Second, since NFV-enabled requests arrive into the system dynamically and

unpredictably, the response to each incoming request by either admitting or rejecting it

is crucial in order to maximize the network throughput. If a request is admitted, a

routing path and a set of data centers on the path should be found for the request

immediately. The dynamic nature of resource allocation in SDNs and

unpredictability of future request arrivals further increase the difficulty in tackling this dynamic

request admission problem.

1.4.3 QoS-Aware Virtual Network Service Deployment in the Edge-Cloud

Generally, the classical model for application offloading systems [20, 23] in mobile

cloud computing consists of a client component on a mobile device, and a server

component on the cloudlet to remotely execute offloaded tasks from the device. As

the options for user applications are too numerous for server components to be stored

in the cloudlet, most existing studies [18, 20, 57] assumed that each user connects to

a dedicated VM in the cloudlet, without consideration of whether an existing VM for

the same application could be used to serve multiple users. However, many popular

emerging applications and services are location-specific, and so it becomes realistic

to assume that multiple users in a local area will request the same computing service

from cloudlets. This is especially the case for AR experiences. For example, the

authors of [79] introduced a mobile task offloading architecture specifically for mobile

augmented reality in a museum setting, where multiple users gaze at the same exhibit.

Meanwhile, Google has created a patent for “Location-based games and augmented

reality systems”[70] allowing players to engage in AR games designed for particular

real-world locations. Since AR technology functions by processing video frames, the

processing of each user video frame can be modeled as an individual task, making it

possible for a Virtual Network Function (VNF) instance on a cloudlet to serve

multiple users. If the VNF instance for that service has already been instantiated, the

offloading cost will be less expensive and the service can be carried out immediately.

However, it then becomes a challenge to assign users to existing VNF instances, or


of each user is met, as they share computing resources on the cloudlet.

1.4.4 Multiplayer Augmented Reality Game Planning in Mobile Edge Computing

Supporting multiplayer interactions between multiple AR users in a shared virtual

environment is a new and difficult challenge that has yet to be addressed. Similar

to AR applications, real-time multiplayer games also proceed frame by frame, where

actions performed by players are taken as input at the beginning of the game frame,

and a sequence of events is generated as output at the end of each game frame.

The duration of a game frame is of critical importance to the user experience, as

it represents the length of time for a player to receive feedback from his or her

action. Long and erratic game frames can irritate the player and potentially render

the game unplayable. While network service instances deployed on cloudlets can

support multiple users, processing interactions between the users has a much higher

computational demand on the cloudlet. If too many users in close proximity to each

other are acting simultaneously, the limited resources of a cloudlet could be

overwhelmed. To support a large number of players at the same time, it is necessary that

the workload of processing user interactions in MEC is evenly distributed among the

cloudlets, to ensure that players receive feedback from their actions with short

delay. However, coordinating a decentralized multiplayer system with large numbers

of users is challenging.

1.5 Thesis Contributions

The main contributions of this thesis are to systematically study the provisioning of

Delay-Aware Services in MEC networks by formulating novel system models and

developing optimization frameworks for the aforementioned problems. By efficiently

managing resources in MEC through developing efficient algorithms, we can

significantly reduce the delay for end-users as well as increase the throughput of requests


• We first address how to balance the workload among cloudlets in the

edge-cloud to optimize mobile application performance. We introduce a novel

sys-tem model to capture the response time delays of offloaded tasks and formulate

an optimization problem with the aim to minimize the maximum response time

of all offloaded tasks. We propose two algorithms for the problem: one is a fast

heuristic, and another is a distributed genetic algorithm that is capable of

de-livering a more accurate solution compared with the first algorithm, but at the

expense of a much longer running time.

• We then study policy-aware unicast request admissions with and without

end-to-end delay constraints in a Software Defined Network (SDN). We aim to

mini-mize the operational cost of admitting a single request in terms of both

comput-ing resource consumption for implementcomput-ing the NFVs in the service chain and

bandwidth resource consumption for routing its data traffic, with a further aim

to maximize the network throughput for a sequence of requests without the

knowledge of future request arrivals. We first formulate four novel

optimiza-tion problems and provide a generic optimizaoptimiza-tion framework for the problems.

We then develop efficient algorithms for the admission of a single NFV-enabled

request with and without the end-to-end delay constraint, where NFV-enabled

requests are defined as the requests with policy enforcement requirements. We

also devise online algorithms with a guaranteed performance for dynamic

ad-missions of requests without the knowledge of future arrivals. In particular, we

provide the very first online algorithm with a provable competitive ratio for the

problem without the end-to-end delay requirement.

• Third, we investigate the deployment of virtualized network functions among

cloudlets to serve end-users, while meeting the resource demands of mobile

users and their Quality-of-Service (QoS) requirements. We formulate a novel

task offloading problem in a cloudlet network, where each offloaded task

requests a specific network function with a maximum tolerable delay, and

different offloading requests may require different network function services. We


instance sharing and cost-effective creation of new VNF instances, and develop

an effective prediction mechanism to predict idle VNF instance releases and

new VNF instance creations for further cost savings over time.

• We finally study how users can interact with each other in an AR game.

Supporting multiplayer interactions in an MEC environment brings many

challenges. Processing user interactions can be computation-intensive especially

when multiple users in close proximity to each other are acting

simultaneously; the limited resources of a cloudlet could be overwhelmed if there are

too many players involved. We envision a scenario in the near future where

players wearing AR heads-up display devices engage with other players over a

large area with densely deployed cloudlets, for which we first propose a novel

system model. We then formulate the Decentralized Multiplayer Coordination

(DMC) Problem with the aim of minimizing the game frame duration of all

players, and devise an efficient algorithm for the problem.

1.6 Thesis Organization

The remainder of this thesis is organized as follows. In Chapter 2, we explore the

topic of load balancing offloaded user tasks among cloudlets in the edge-cloud to

reduce the average response time. In Chapter 3, we study how to minimize

operation cost in a remote cloud while maximizing cloudlet request throughput with

a specified service chain. In Chapter 4, we investigate how cloudlets can deploy

NFV instances to serve end users while meeting their Quality-of-Service (QoS)

requirements. In Chapter 5, we examine how to coordinate a massive multiplayer

Augmented Reality (AR) game among a network of mobile cloudlets and the remote

cloudlet to maximize performance for all participating users. Finally in Chapter 6,


QoS-Aware Task Load Balancing in

the Edge-Cloud

2.1 Introduction

A major problem that Mobile Edge Computing service providers face is how to

allocate user task requests to different cloudlets so that the workload among cloudlets

in the mobile edge network is well balanced, thereby shortening the response time

delay of tasks and enhancing user experience in the use of their service. A typical

solution to this problem is to allocate user requests to their closest cloudlets to minimize

the network delay; however, this approach has been demonstrated to be inadequate

in an urban setting [43]. Specifically, the vast number of users in the network means

that the workload at each individual cloudlet will be highly volatile. If a cloudlet

is suddenly overwhelmed with user requests, the task response time at the cloudlet

will increase dramatically, causing lag in the user applications and degrading user

experiences. To prevent some cloudlets from being overloaded, it is crucial to assign

user requests to different cloudlets such that the workload among the cloudlets is

well balanced, thereby reducing the maximum response time of offloaded tasks.

In this chapter we deal with the QoS-aware load balancing problem among

cloudlets in the edge-cloud in response to the dynamic resource demands of user

requests, by devising efficient algorithms to allocate user requests to different cloudlets.

Specifically, we devise two load balancing algorithms for cloudlets within an

edge-cloud, to reduce the maximum response time of offloaded tasks from mobile users

that consists of queueing and processing time delays at each cloudlet and routing


time delays of packets between users and cloudlets.

The rest of the chapter is organized as follows. Section 2.2 discusses the related

works to this topic. Section 2.3 introduces the system model and problem definition.

Section 2.4 gives a detailed description of the fast heuristic algorithm. Section 2.5

proposes the distributed genetic algorithm. Section 2.6 presents the simulation results,

and a summary is given in Section 2.7.

2.2 Related Works

Although load balancing has been extensively studied in centralized data centers,

there are essential differences in load balancing between cloudlets and centralized

clouds. Specifically, in a centralized data center, there is a centralized queue for all

incoming user requests, and the workload balancing and task allocations among servers in

the data center are performed by a centralized scheduler called the hypervisor [21, 88].

In such a scenario, the task transfer delay and processing delay between servers in

the data center are several orders of magnitude less than the task transfer and

processing delays between different cloudlets in an edge-cloud network [7, 103], since

the bandwidth and computing resources within a data center are usually abundant.

In contrast, in a distributed cloudlet environment, user requests are admitted by the

network through their access points (APs), and cloudlets are usually co-located at

the APs. Associated with each cloudlet is a queue of waiting tasks, and the QoS

requirements of requests are now an important concern. Balancing the workload among

the waiting queues at different cloudlets while minimizing the maximum task

response time (delay) among offloaded tasks is challenging. In addition, due to limited

processing capacities of cloudlets, the wait time delay of a task at a cloudlet is a

non-linear function of the workload (processing capacity) of the cloudlet. How to

consider such a distributed queueing delay feature in the workload balance among

cloudlets is another challenge. Finally, the transfer delay by redistributing tasks from

one cloudlet to another cloudlet cannot be ignored. Although there are several

studies on load balancing a network of cloudlets [43, 94, 103], none of these studies


among cloudlets.

2.3 Preliminaries

In this section we first introduce our system model. We then define the problem

precisely.

2.3.1 System Model

We assume that an edge-cloud service provider has set up K cloudlets {1, 2, . . . , K}

at fixed locations in the edge-cloud network. We restrict the scope of the edge-cloud

to just cloudlets at the edge of the network, however including remote distributed

clouds as an offloading destination is trivially done by including an additional index.

The cloudlets are co-located at micro-base station access points (APs) in the network,

and all cloudlets are connected to each other via network connection. We assume

user applications are dynamically partitioned into discrete offloadable tasks that can

be processed at any of the K cloudlets. Users will offload tasks to a nearby AP with a cloudlet, and the cloudlet can either choose to add the task to its own queue or

redirect the task to another cloudlet in the network (See Fig. 2.1).

We model the cloudlets as M/M/n queues, where cloudlet $i$ has a number of servers $n_i$ with service rate $\mu_i$, for $i \in \{1, 2, \ldots, K\}$. Due to the rapidly changing nature of user demands, the rate of incoming requests can fluctuate wildly at each

cloudlet over time. As such, we assume that the incoming user tasks at each cloudlet

$i$ arrive randomly according to a Poisson process with arrival rate $\lambda_i$. The average

wait time of a task in cloudlet $i$ consists of the queueing time and the service time of the task at cloudlet $i$. We use $T_i$, which is a function of a given task arrival rate $\lambda$, to calculate the average task wait time at cloudlet $i$:

\[ T_i(\lambda) = \frac{C\left(n_i, \frac{\lambda}{\mu_i}\right)}{n_i \mu_i - \lambda} + \frac{1}{\mu_i}, \qquad (2.1) \]


Figure 2.1: Cloudlets in the edge-cloud collocated with micro-base station access points (APs)

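where

\[ C(n, \rho) = \frac{\dfrac{(n\rho)^n}{n!} \cdot \dfrac{1}{1-\rho}}{\displaystyle\sum_{k=0}^{n-1} \frac{(n\rho)^k}{k!} + \frac{(n\rho)^n}{n!} \cdot \frac{1}{1-\rho}}. \qquad (2.2) \]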

Eq. (2.2) is known as Erlang’s C formula [54].
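As an illustration only, the average wait time of Eq. (2.1) can be evaluated numerically as in the sketch below. It assumes that the second argument of $C$ is the per-server utilization $\rho = \lambda / (n_i \mu_i)$ and that the exponent in Eq. (2.2) is the number of servers $n$, which is the usual convention for Erlang's C formula; the function names `erlang_c` and `avg_wait_time` are hypothetical and not part of the thesis.

```python
from math import factorial


def erlang_c(n: int, rho: float) -> float:
    """Erlang's C formula: probability that an arriving task has to queue.

    n   -- number of servers at the cloudlet
    rho -- per-server utilization, assumed here to be lambda / (n * mu)
    """
    if not 0.0 <= rho < 1.0:
        raise ValueError("utilization must satisfy 0 <= rho < 1")
    load = n * rho  # offered load
    tail = load ** n / factorial(n) / (1.0 - rho)
    head = sum(load ** k / factorial(k) for k in range(n))
    return tail / (head + tail)


def avg_wait_time(n: int, mu: float, lam: float) -> float:
    """T_i(lambda): mean queueing plus service time of an M/M/n cloudlet."""
    if lam >= n * mu:
        return float("inf")  # the queue grows without bound
    rho = lam / (n * mu)  # assumption: utilization convention for C(n, rho)
    return erlang_c(n, rho) / (n * mu - lam) + 1.0 / mu
```

With a single server the sketch reduces to the familiar M/M/1 result $1/(\mu - \lambda)$, which gives a quick sanity check on the formula.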

As task arrival rates at different cloudlets can be significantly different, some

cloudlets may be overloaded while others may be underloaded. We assume that all

cloudlets are reachable from each other, and each cloudlet can redirect a fraction of

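its tasks to another cloudlet. We use $f(i,j)$ to denote the amount of task flow from cloudlet $i$ to cloudlet $j$, for $i \neq j$ (see Fig. 2.2). We thus have the following constraints on $f(i,j)$:

\[ f(i,j) = \begin{cases} -f(j,i) & \text{if } i \neq j, \\ 0 & \text{otherwise,} \end{cases} \qquad \forall i,j \in \{1, \ldots, K\}, \qquad (2.3) \]

\[ \sum_{i=1}^{K} \sum_{j=1}^{K} f(i,j) = 0, \qquad (2.4) \]

\[ \sum_{j=1}^{K} \max\{f(i,j), 0\} \le \lambda_i, \qquad \forall i \in \{1, \ldots, K\}. \qquad (2.5) \]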

Eq. (2.4) ensures that all flow is conserved, while Eq. (2.5) ensures that the sum of all

outgoing task flows from cloudlet $i$ (we ignore the incoming flow by summing the maximum of $f(i,j)$ and 0 for each cloudlet $j$) does not exceed its incoming task arrival rate $\lambda_i$.

We assume all offloaded packets are of equal size, and so the delay incurred in

transferring any packet between a pair of APs through the network is identical. To

model such a network delay in the edge-cloud, denote by $c \in \mathbb{R}^{K \times K}$ the network delay matrix, where entry $c_{i,j}$ represents the shortest possible communication delay in relaying a task between cloudlet $i$ and cloudlet $j$. We assume that the flow of incoming redirected tasks $f(i,j) < 0$ at cloudlet $i$ has a delay of $-f(i,j) \cdot c_{i,j}$. We can then calculate the sum $T_{net}(i)$ of all network delays of incoming tasks from other cloudlets to cloudlet $i$ as

\[ T_{net}(i) = \sum_{j=1}^{K} \max\{f(j,i), 0\} \cdot c_{j,i}. \qquad (2.6) \]

Having Eqs. (2.1) and (2.6), the average task response time $D(i)$ of all tasks that are executed on cloudlet $i$ can be calculated as follows.

\[ D(i) = T_i(\bar{\lambda}_i) + T_{net}(i), \qquad (2.7) \]

where $\bar{\lambda}_i$ is the final incoming task flow that will be processed at cloudlet $i$, which is defined as follows.

\[ \bar{\lambda}_i = \lambda_i - \sum_{j=1}^{K} f(i,j). \qquad (2.8) \]
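The bookkeeping behind Eqs. (2.3) to (2.8) can be sketched as follows. This is a minimal illustration, not the thesis implementation: the flow matrix `f` and delay matrix `c` are plain nested lists, the sum in Eq. (2.8) is taken over the destination cloudlets $j$, and `wait_time(i, rate)` is a hypothetical callable standing in for $T_i$ of Eq. (2.1).

```python
def final_arrival_rate(f, lam, i):
    """Eq. (2.8): arrival rate at cloudlet i after redirection.

    Row i of f holds outgoing flow as positive entries and incoming flow
    as negative entries, so subtracting the row sum nets the two out.
    """
    return lam[i] - sum(f[i])


def network_delay(f, c, i):
    """Eq. (2.6): total network delay of tasks redirected into cloudlet i."""
    return sum(max(f[j][i], 0.0) * c[j][i] for j in range(len(f)))


def response_time(f, c, lam, wait_time, i):
    """Eq. (2.7): D(i) = T_i(final arrival rate) + T_net(i)."""
    return wait_time(i, final_arrival_rate(f, lam, i)) + network_delay(f, c, i)


def is_feasible(f, lam, tol=1e-9):
    """Check the antisymmetry constraint (2.3) and the outflow bound (2.5).

    Eq. (2.4) follows from antisymmetry, so it is not tested separately.
    """
    k = len(f)
    antisymmetric = all(
        abs(f[i][j] + f[j][i]) <= tol for i in range(k) for j in range(k)
    )
    bounded = all(
        sum(max(x, 0.0) for x in f[i]) <= lam[i] + tol for i in range(k)
    )
    return antisymmetric and bounded
```

The problem objective is then the largest value of `response_time` over all cloudlets for a feasible `f`.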

2.3.2 Problem Definition

The Cloudlet Load Balancing Problem (CLBP) in an edge-cloud is defined as follows.

Given a set of cloudlets $\{1, \ldots, K\}$, where each cloudlet $i$ with task arrival rate $\lambda_i$ and


Figure 2.2: The flow of tasks from cloudlet $i$ to cloudlet $j$

minimized, i.e.,

\[ \min_{f} \max_{1 \le i \le K} D(i). \qquad (2.9) \]

In this chapter we propose two algorithms for the problem that strive for the

trade-off between the solution accuracy obtained by each algorithm and the running

time of the algorithm.

2.4 Heuristic Algorithm

In this section we propose a heuristic for CLBP. The approach is to identify a balanced

task response time $D$ and then decide the out-going or in-coming workload of each cloudlet, based on $D$. As we reduce the task response time of overloaded cloudlets by redirecting some of their tasks to underloaded cloudlets, the task response times

of underloaded cloudlets are increased. By carefully directing workload (flow)

between cloudlets, the tasks processed among the cloudlets will have roughly the same

response time. To this end, we compute the task flow from overloaded cloudlets to

underloaded cloudlets, using a transportation algorithm [30]. This procedure


2.4.1 Balancing Task Response Time

To find the balanced task response time $D$ and decide the amount of outgoing/incoming workload for each cloudlet, we guess the value of $D$ and iteratively improve it until it is within a given accuracy bound. We begin by examining the range of the

average task wait time at cloudlets.

Let $T_{max} = \max_{1 \le i \le K} \{T_i(\lambda_i)\}$ and $T_{min} = \min_{1 \le i \le K} \{T_i(\lambda_i)\}$; then the value of $D$ is in the range between $T_{min}$ and $T_{max}$. We assign $D = \frac{T_{max} + T_{min}}{2}$ as its initial value. We then partition all cloudlets into two disjoint sets, the set $V_s$ of overloaded cloudlets:

\[ V_s = \{ i \mid T_i(\lambda_i) > D \}, \]

and the set $V_t$ of underloaded cloudlets:

\[ V_t = \{ j \mid T_j(\lambda_j) \le D \}. \]

For each overloaded cloudlet $i \in V_s$, we decide the task demand $\phi_i$ of cloudlet $i$, which is the amount of task flow that should be redirected from its arrival task flow $\lambda_i$, such that its task response time is within a given accuracy $\epsilon$ of $D$,

\[ |D - T_i(\lambda_i - \phi_i)| \le \epsilon, \qquad (2.10) \]

where $\epsilon$ is a given threshold.

For each underloaded cloudlet $j \in V_t$, we decide the task demand $\phi_j$ of cloudlet $j$, which is the amount of task flow to arrive at cloudlet $j$ such that its task response time is within the accuracy $\epsilon$ of $D$,

\[ |D - T_j(\lambda_j + \phi_j)| \le \epsilon. \qquad (2.11) \]
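The text does not state how the task demands in Eqs. (2.10) and (2.11) are computed. One possible approach, sketched below purely as an assumption, is bisection: because the wait time of a queue grows monotonically with its arrival rate, the shift $\phi$ that brings a cloudlet to within $\epsilon$ of $D$ can be bracketed and halved. The name `solve_demand` and its parameters are hypothetical, and `wait_time(rate)` stands in for $T_i$ of a single cloudlet.

```python
def solve_demand(wait_time, lam, target, overloaded, capacity,
                 eps=1e-6, max_iter=200):
    """Find phi with |target - T(lam - phi)| <= eps for an overloaded cloudlet,
    or |target - T(lam + phi)| <= eps for an underloaded one.

    wait_time -- callable giving the average wait time for an arrival rate
    lam       -- current arrival rate of the cloudlet
    target    -- the balanced response time D
    capacity  -- n * mu, the rate at which the queue becomes unstable
    """
    sign = -1.0 if overloaded else 1.0
    lo = 0.0
    hi = lam if overloaded else capacity - lam
    phi = 0.0
    for _ in range(max_iter):
        phi = 0.5 * (lo + hi)
        gap = wait_time(lam + sign * phi) - target
        if abs(gap) <= eps:
            break
        # Still slower than the target: an overloaded cloudlet must shed
        # more flow, whereas an underloaded cloudlet must accept less.
        if (gap > 0) == overloaded:
            lo = phi
        else:
            hi = phi
    return phi
```

If the target cannot be reached within the bracket, for example when even an empty queue is slower than $D$, the sketch simply returns the nearest end of the interval after `max_iter` halvings.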

Once we have calculated φ_i for each overloaded cloudlet i ∈ Vs and φ_j for each


Figure 2.3: Finding optimal outgoing and incoming workload for each cloudlet in the edge-cloud

cloudlets to underloaded cloudlets. Fig. 2.3 is an illustrative diagram of the

calculation of $\phi_i$ and $\phi_j$ for all $i$ and $j$ with $1 \le i \le |V_s|$ and $1 \le j \le |V_t|$.

Because redirecting tasks from overloaded cloudlets to underloaded cloudlets

will incur network delays at underloaded cloudlets, we should further adjust $D$ such that the sum of the task response time at each underloaded cloudlet $j \in V_t$ and the incoming network delay $T_{net}(j)$ of all tasks to cloudlet $j$ is nearly equal to $D$, i.e., $T_j(\lambda_j) + T_{net}(j) \approx D$. Let $D' = \max_{j \in V_t} \{T_j(\lambda_j)\}$. If the difference between $D$ and $D'$ is within a certain bound of accuracy $\theta$, we are done; otherwise, we need to further refine $D$. If $D < D'$, this means that we must reduce the amount of outgoing tasks from overloaded cloudlets, and we need to increase $D$ by reducing $\phi_i$ for each overloaded cloudlet $i \in V_s$. Otherwise, we should increase the amount of outgoing tasks from overloaded cloudlets, and lower $D$ to allow overloaded cloudlets to redirect more tasks to underloaded cloudlets. We choose $D \leftarrow \frac{1}{2}(D + D')$, and

recursively search for $D$ and continue this procedure until the difference between $D$

and $D'$ is within the accuracy bound $\theta$.

Algorithm 2.1 CLBP-Heuristic Algorithm

Input: NET, θ, ε
Output: f(i,j), i,j ∈ {1, 2, . . . , K}.
 1: T_max ← max_{1≤i≤K} {T_i(λ_i)};
 2: T_min ← min_{1≤i≤K} {T_i(λ_i)};
 3: D ← (T_max + T_min)/2;
 4: V_s ← {i | T_i(λ_i) > D};
 5: V_t ← {j | T_j(λ_j) ≤ D};
 6: D′ ← ∞;
 7: while |D − D′| > θ do
 8:     for each i ∈ V_s do
 9:         calculate φ_i such that |D − T_i(λ_i − φ_i)| / D ≤ ε;
10:     for each j ∈ V_t do
11:         calculate φ_j such that |D − T_j(λ_j + φ_j)| / D ≤ ε;
12:     Φ ← {φ_k | k ∈ V_s ∪ V_t}
13:     calculate f(i,j) for each i,j ∈ {1, 2, . . . , K} by invoking Procedure minLatencyFlow(V_s, V_t, Φ);
14:     for each j ∈ V_t do
15:         calculate D(j) by Eq. (2.7);
16:     D′ ← max_{j∈V_t} {D(j)};
17:     D ← (D + D′)/2;

2.4.2 Minimum-latency Flow

Once we have determined the amount of outgoing or incoming task flow $\phi_k$ for each cloudlet $k$, we then determine for each outgoing task flow from an overloaded cloudlet $i$ to each incoming task flow of underloaded cloudlet $j$, the value of the redirected task flow $f(i,j)$. To this end, we reduce the problem of routing the outgoing task flow from overloaded cloudlets to underloaded cloudlets to the minimum-cost

maximum flow problem in an auxiliary flow graph $G = (V, E)$ derived from the original network as follows (see Fig. 2.4).

We first add a virtual source node s and a virtual sink node t to V. We then partition the cloudlets into two disjoint sets: set Vs of overloaded cloudlets and set


Figure 2.4: Auxiliary flow graph is generated to find minimum-latency flow

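from node $s$ to each node in $V_s$, and a directed edge from each node in $V_t$ to node $t$. This gives us the set of edges $E = \{\langle s,i \rangle \mid i \in V_s\} \cup \{\langle j,t \rangle \mid j \in V_t\} \cup \{\langle i,j \rangle \mid i \in V_s,\ j \in V_t\}$.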

Denote by $u(i,j)$ the capacity of edge $\langle i,j \rangle \in E$. The edge capacity of each edge from the source node $s$ to a cloudlet node $i \in V_s$ is set as $u(s,i) = \phi_i$ for each edge

$\langle s,i \rangle$, and the edge capacity of each edge from an underloaded cloudlet node $j \in V_t$ to the sink node $t$ is set as $u(j,t) = \phi_j$ for each edge $\langle j,t \rangle$. The latency cost of each

edge from the source node $s$ to an overloaded cloudlet node $i \in V_s$ is set as zero, i.e., $c_{s,i} = 0$. Similarly, the latency cost of each edge from each underloaded cloudlet node $j \in V_t$ to the sink node $t$ is set as zero, i.e., $c_{j,t} = 0$. For each edge $\langle i,j \rangle$, $i,j \neq s,t$, from an overloaded cloudlet $i$ to an underloaded cloudlet $j$, its edge capacity is set as $u(i,j) = \min\{u(s,i), u(j,t)\}$.

Having constructed the flow graph $G$, it can be seen that the problem of routing outgoing task flow from overloaded cloudlets to underloaded cloudlets is reduced to

finding a minimum-cost maximum flow in $G$ from $s$ to $t$, i.e.,

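\[ \min \sum_{(i,j) \in E} f(i,j) \cdot c_{i,j} \qquad (2.12) \]

subject to the following constraints:

\[ f(i,j) \le u(i,j), \qquad \forall i,j \in V, \qquad (2.13) \]

\[ \sum_{i \in V \setminus \{s\}} f(s,i) = \sum_{j \in V \setminus \{t\}} f(j,t), \qquad i \neq s \text{ or } j \neq t, \qquad (2.14) \]

\[ \sum_{j \in V \setminus \{s,t\}} f(i,j) = 0, \qquad i \neq s \text{ or } j \neq t, \qquad (2.15) \]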

where $f(i,j) \cdot c_{i,j}$ is the amount of network delay incurred by transferring tasks from cloudlet $i$ to cloudlet $j$. This is clearly an instance of the Hitchcock Transportation Problem [30], and can be solved within $O(K^4)$ time, using a Transportation Algorithm [24, 30], where $K$ is the number of cloudlets. The details are given in Procedure 2.1.

Procedure 2.1 minLatencyFlow

Input: V_s, V_t, {φ_k | k ∈ V_s ∪ V_t}
Output: f(i,j), i,j ∈ {1, 2, . . . , K}.
 1: /* Construct the flow network with latency weighted edges. */
 2: V ← {1, 2, . . . , K} ∪ {s, t};
 3: E ← ∅;
 4: for each i ∈ V_s do
 5:     E ← E ∪ {⟨s,i⟩};
 6:     u(s,i) ← φ_i;
 7:     c_{s,i} ← 0;
 8: for each j ∈ V_t do
 9:     E ← E ∪ {⟨j,t⟩};
10:     u(j,t) ← φ_j;
11:     c_{j,t} ← 0;
12: for each i ∈ V_s do
13:     for each j ∈ V_t do
14:         E ← E ∪ {⟨i,j⟩};
15:         u(i,j) ← min{u(s,i), u(j,t)};
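For illustration, the auxiliary flow graph and a minimum-cost maximum flow on it can be sketched as below. Note that this is not the Transportation Algorithm of [24, 30]: the sketch substitutes a generic successive-shortest-path method with Bellman-Ford, chosen only because it is short and self-contained. The function name `min_latency_flow`, the dictionary-based inputs, and the assumption that each edge from an overloaded cloudlet $i$ to an underloaded cloudlet $j$ carries cost $c_{i,j}$ are all hypothetical choices made for this example.

```python
def min_latency_flow(supply, demand, cost, tol=1e-12):
    """Route flow from overloaded to underloaded cloudlets at minimum latency.

    supply -- {i: phi_i} for overloaded cloudlets
    demand -- {j: phi_j} for underloaded cloudlets
    cost   -- {(i, j): c_ij} network delay per unit of redirected flow
    Returns {(i, j): redirected flow f(i, j)}.
    """
    s, t = "s", "t"
    # Residual edges [tail, head, capacity, cost]; edge k is paired with k ^ 1.
    edges = []

    def add_edge(u, v, cap, w):
        edges.append([u, v, cap, w])
        edges.append([v, u, 0.0, -w])

    for i, phi in supply.items():
        add_edge(s, ("over", i), phi, 0.0)
    for j, phi in demand.items():
        add_edge(("under", j), t, phi, 0.0)
    for i in supply:
        for j in demand:
            add_edge(("over", i), ("under", j),
                     min(supply[i], demand[j]), cost[i, j])

    nodes = {s, t} | {e[0] for e in edges} | {e[1] for e in edges}
    inf = float("inf")
    while True:
        # Bellman-Ford shortest path from s in the residual graph.
        dist = dict.fromkeys(nodes, inf)
        dist[s] = 0.0
        pred = {}
        for _ in range(len(nodes) - 1):
            updated = False
            for k, (u, v, cap, w) in enumerate(edges):
                if cap > tol and dist[u] + w < dist[v] - tol:
                    dist[v], pred[v], updated = dist[u] + w, k, True
            if not updated:
                break
        if dist[t] == inf:
            break  # no augmenting path remains, so the flow is maximum
        path, v = [], t
        while v != s:
            path.append(pred[v])
            v = edges[pred[v]][0]
        push = min(edges[k][2] for k in path)
        for k in path:
            edges[k][2] -= push
            edges[k ^ 1][2] += push

    # The flow on a forward edge equals the capacity of its reverse edge.
    return {(u[1], v[1]): edges[k ^ 1][2]
            for k, (u, v, cap, w) in enumerate(edges)
            if k % 2 == 0 and u != s and v != t}
```

A dedicated transportation solver would be preferable for large $K$; the sketch is meant only to make the graph construction and the role of the capacities $u(\cdot,\cdot)$ concrete.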

2.5 Distributed Genetic Algorithm

Although later experimental results indicate that the proposed heuristic Algorithm 2.1 in the previous section can deliver a feasible solution quickly, the solution may not

be sufficiently accurate. In this section we devise a distributed genetic algorithm for

CLBP that improves the accuracy of the solution at the expense of a longer running

time. The key to this distributed algorithm is to perform fine-grain workload

balancing among cloudlets iteratively through gene mutations until the solution converges

on a given threshold.

2.5.1 Genetic Algorithm Operations

Genetic algorithms (GAs) have been widely used for combinatorial optimization

problems. Traditional GAs maintain a population of solutions encoded as “genes”

where only the fittest genes are bred to produce successively fitter generations of

genes. However, genetic algorithms are often computation intensive and scale poorly.

To overcome this, we design a distributed genetic algorithm for CLBP by leveraging

the computing power of cloudlets. We first partition the cloudlets into overloaded

and underloaded cloudlets, using each cloudlet $p \in \{1, \ldots, K\}$ as a partition reference. We then solve the CLBP problem using a distributed GA to find the fittest

gene for each partition. We finally select the fittest gene among all partitions as the

solution to the problem.

We begin by discussing the representation of genes. A gene is simply an encoded

version of the task flow matrix $f$. While it is possible to use $f$ directly as the gene representation, it is not the most efficient way. Many potential solutions using $f$ as the gene will contain cloudlets that both receive and redirect flow, which

guarantees that the solution is sub-optimal. By partitioning the cloudlets into overloaded

and underloaded cloudlets, and representing only the task flow from overloaded to

underloaded cloudlets, the potential solution space is substantially reduced. This in

turn significantly reduces the number of iterations to an acceptable solution. If we

sort the cloudlets according to their local task response times, we can partition the


the sets $V_s$ and $V_t$ of overloaded and underloaded cloudlets as defined in the previous section. Let $g(i,j)$ denote the amount of task flow from cloudlet $i \in V_s$ to cloudlet $j \in V_t$, for $i \neq j$, where $g$ is a $|V_s| \times |V_t|$ matrix. It can be seen that $g$ has a one-to-one mapping to the task flow variable $f$, i.e., every gene has a unique corresponding flow matrix. We have the following constraints on $g(i,j)$:

\[ \sum_{j \in V_t} g(i,j) \le \lambda_i, \qquad \forall i \in V_s, \qquad (2.16) \]

\[ \sum_{i \in V_s} g(i,j) < n_j \cdot \mu_j, \qquad \forall j \in V_t, \qquad (2.17) \]

where Eq. (2.16) limits any given cloudlet i ∈ Vs from redirecting more tasks than available according to its task arrival rate, and Eq. (2.17) limits any given cloudlet

j ∈ Vt from having a total incoming task flow of more than nj·µj, as this would result in an infinite queue time at the cloudlet (see Eq. (2.1)).

Our initial gene population is constructed by randomly populating $g_i$ with uniformly selected random numbers in the range $(0, \lambda_i)$. If a randomly generated gene violates one of the constraints, we randomly decrease values in the relevant row or

column until the constraints are met.

Denote by $P$ the given number of genes maintained in the gene population. The fitness of each gene is evaluated using our problem objective (defined in Eq. (2.9))

as our fitness function. We then select only the fittest genes to survive to the next

generation and repopulate the genepool. Two genes $g_1$ and $g_2$ breed to create an offspring gene $g$ by taking the mean of each value:

\[ g(i,j) = \frac{1}{2}\left(g_1(i,j) + g_2(i,j)\right). \qquad (2.18) \]

Denote by $S$ the number of surviving genes that will persist into the next generation. We use the roulette selection to select $S$ survivors and randomly crossbreed the survivors until the genepool population has been replenished. As we aim to

minimize the fitness function, we take the reciprocal of each gene’s fitness metric when

performing roulette selection.
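The selection and breeding steps described above can be sketched as follows. This is a minimal illustration under stated assumptions: genes are nested lists, `fitness` is a hypothetical callable returning the objective value of a gene (lower is better), survivors are drawn with replacement, and at least two survivors are kept; none of these details are specified in the text.

```python
import random


def crossbreed(g1, g2):
    """Eq. (2.18): the offspring is the element-wise mean of its parents."""
    return [[0.5 * (a + b) for a, b in zip(row1, row2)]
            for row1, row2 in zip(g1, g2)]


def roulette_select(population, fitness, survivors, rng=random):
    """Pick survivors with probability proportional to 1 / fitness.

    The reciprocal turns the minimization objective into selection weights.
    Sampling with replacement is an assumption of this sketch.
    """
    weights = [1.0 / fitness(gene) for gene in population]
    return rng.choices(population, weights=weights, k=survivors)


def next_generation(population, fitness, survivors, rng=random):
    """Keep the selected survivors and refill the pool with their offspring."""
    pool = roulette_select(population, fitness, survivors, rng)
    while len(pool) < len(population):
        parent1, parent2 = rng.sample(pool[:survivors], 2)
        pool.append(crossbreed(parent1, parent2))
    return pool
```

Because constraints (2.16) and (2.17) are linear, the mean of two feasible genes is itself feasible, so offspring produced by `crossbreed` need no repair step.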