• No results found

Computing at the Edge: Improving the Performance of Big Data and Mobile Apps with In-Network Computation

N/A
N/A
Protected

Academic year: 2021

Share "Computing at the Edge: Improving the Performance of Big Data and Mobile Apps with In-Network Computation"

Copied!
72
0
0

Loading.... (view fulltext now)

Full text

(1)

Peter R. Pietzuch

[email protected]

Computing at the Edge:

Improving the Performance of Big Data and

Mobile Apps with In-Network Computation

Peter Pietzuch

(joint work with Andreas Pamboris*, Lukas Rupprecht, Luo Mai, Abdul Alim, Paolo Costa*, Matteo Migliavacca, Alexander L. Wolf, Miguel Baguena Albaladejo,

Pietro Manzoni)

Large-Scale Distributed Systems Group

Department of Computing

(2)

Network Devices Have Changed…

2

1973: First portable mobile phone call in NYC

1946: First mobile phone call from car

(3)

Today’s Mobile Devices Are About Apps

Mobile applications require

low latency

access to services

Mobile networks: often slow and unpredictable

– Highly variable network conditions

Consider online gaming use case

– Different clients perceive updates to shared game state with lag

Multiplayer games

(4)

Mobile Users Have High Expectations

Mobile devices have

limited resources

(CPU, memory)

compared to laptops/desktops/servers

Consider limited device memory: eg 512MB (iPhone 4S)

– Application left with 213MB (rest for OS)

– Single 8-megapixel photo: over 30MB of bitmap data

– Image editing app cannot have more than 7 photos resident in memory

4 Real-time augmented reality

Speech recognition

(5)

Mobile Networks Are Dumb Pipes

Mobile Network

Call Of Duty

(6)

Data Centres Have Come A Long Way…

6

1997: Google’s first data centre at Stanford

Today: Google’s data centre in Oregon

(7)

Big Data Analytics is Network-Bound

Our digital interactions are stored, searched, analysed

– eg email, online shoppping, texts, tweets, pictures, Facebook updates

Explosion in data volume

Requires massively-parallel processing in many servers

Smart fridges Smartphones

(8)

Data Centre Networking Provides Dumb Pipes

8

Data Centre Network

(9)

Idea: Hosting Services within the Network

Network

Call Of Duty

End users

Backend services

Call Of Duty

1. NetAgg:

Higher

bandwidth

for distributed applications

2. EdgeCloud:

Lower

latency

to Internet backend services

Talk outline:

(10)

1. Increasing Bandwidth

10

Network

Call Of Duty

End users

Backend services

1. NetAgg:

Higher bandwidth for distributed applictions

Avoid bandwidth bottleneck

2. EdgeCloud: Lower latency to Internet backend services

3. CloudSplit: Access to more memory resources

(11)

Data Centre Networks Are Expensive

Cisco 3560-48TD: $11,995 Juniper Ex8216: $750,000 Cisco 6509E: $500,000

10 Gbps

20 Gbps

1 Gbps

Top of the rack switch

Aggregation

switch

Core

switch

(12)

Problem: Bandwidth Over-Subscription

1 Gbps

10 Gbps

20 Gbps

40 Gbps

20 Gbps

160 Gbps

40 Gbps

1:100 oversubscription

rate not uncommon, with

reported cases of 1:240

1:2

oversubscription

oversubscription

1:4

12

Performance of distributed applications limited by network

(13)

Network-as-a-Service: In-Network Processing

[HotICE’12]

Network switches augmented with

processing capabilities

– Multiple options: software-only middleboxes, programmable FPGAs

Applications deploy

processing functions

on each device

(14)

NetAgg: On-Path Data Aggregation

[CoNEXT’14] 14 ToR W S S S S AS W W W ToR W M W W ToR W W W W ToR W W W W AS

AS = L2 Aggr Switch

S = L2 Switch

ToR = Top-of-Rack Switch

W = Worker Server

M = Master Server

Input Data Flow

Outgoing Data Flow

A Single Layer 2 Domain

Bandwidth

bottleneck

ToR W S S S S AS W W W ToR W M W W ToR W W W W ToR W W W W AS

AS = L2 Aggr Switch

S = L2 Switch

ToR = Top-of-Rack Switch

W = Worker Server

M = Master Server

Input Data Flow

Aggr Data Flow

On-path

Aggregation

A Single Layer 2 Domain

Aggregate application data early

along network paths

to reduce traffic

(15)

Example: MapReduce Jobs

Chunk

0

Chunk

1

Chunk

2

Input file

Map Task

Map Task

Map Task

Reduce Task

Reduce Task

Reduce Task

Intermediate results Final results

Mainstream “Big Data” analytics framework

– E.g. PageRank (web search indexing)

– Open source version (Hadoop) by Yahoo/Apache

(16)

Shuffle Phase is Network Bottleneck

Split

0

Split

1

Split

2

Input file

Map Task

Map Task

Map Task

Reduce Task

Reduce Task

Reduce Task

Intermediate results Final results

Shuffle phase

challenging for data centre networks

– All-to-all traffic pattern with O(N2) flows – Bottleneck due to limted edge bandwidth,

over-subscription and TCP incast

Parent Server

(17)

Data Reduction in MapReduce

Final results typically much smaller than intermediate results

(e.g. WordCount)

– Most Facebook jobs final size is 5.4% of intermediate size

– In most Yahoo jobs, final size is 8.2%

Split

0

Split

1

Split

2

Input file

Map Task

Map Task

Map Task

Reduce Task

Reduce Task

Reduce Task

(18)

On-Path Aggregation Trees

Aggregation trees perform multi-step aggregation

Reduce Reduce Reduce Reduce Reduce Reduce Reduce Reduce Reduce Reduce Reduce Reduce

Reduce Reduce

(19)

In-Network Aggregation Increases Throughput

Pr ocessing thr oug hpu t (Gbp s) Number of clients

Bottlenecked by 1 Gbps

edge bandwidth

Plain Solr

NetAgg + Solr

(20)

2. Reducing Latency

20

Network

Call Of Duty

End users

Backend services

1. NetAgg: High bandwidth for distributed applications

1. EdgeCloud:

Lower latency to Internet backend services

Avoids high Internet delays

3. CloudSplit: Access to more memory resources

(21)

Edge Service Hosting in Mobile Networks

Edge services can reduce latency to clients

1. Challenge: Mobile clients roam over time

– Need to adapt placement decisions over time

2. Challenge: Need to migrate

service state

efficiently

(22)

Scenario: First Person Shooter (FPS) Game

22

Game clients

180ms 80ms 40ms 5ms 5ms 5ms

1

2

3

4

5

6

Possible game server

locations

(23)

Scenario: First Person Shooter (FPS) Game

180ms 80ms 40ms 5ms 5ms 5ms

1

2

3

=5ms

4

5

6

Idle servers

Active game server

=10ms

(24)

24 Client 1 Moving… Client 2 Looking at Client 1 10ms to Server 4 Client 3 Looking at Client 1 300ms to Server 4

(25)

E

DGE

C

LOUD

: Automatic Edge Service Migration

180ms 80ms 40ms 5ms 5ms 5ms

4

5

6

Game clients:

-

Monitor network connectivity to servers

-

Send updates to active server:

<server, latency>

2

3

1

(26)

26 180ms 80ms 40ms 5ms 5ms 5ms

1

2

3

Active server:

-

Checks for “better” server

-

Minimise maximum latency

4

5

6

Client 1 Client 2 Client 3

<4, 5ms> <4, 10ms> <4, 300ms> <5, 15ms> <5, 0ms> <5,

210ms> <6,

125ms> <6, 130ms> <6, 180ms>

(27)

180ms 80ms 40ms 5ms 5ms 5ms

1

2

3

4

5

6

Client 1 Client 2 Client 3

<4, 5ms> <4, 10ms> <4, 300ms> <5, 15ms> <5, 0ms> <5, 210ms> <6, 125ms> <6, 130ms> <6, 180ms>

 Server 6 is most fair choice

(28)

28 180ms 80ms 40ms 5ms 5ms 5ms

1

2

3

5

Create snapshot of game service

4

6

(29)

180ms 80ms 40ms 5ms 5ms 5ms

1

2

3

5

Send snapshot to Server 6 via Client 1

Restart game service on Server 6

4

6

(30)

30 Client 1 Moving… Client 2 Looking at Client 1 130ms to Server 6 Client 3 Looking at Client 1 180ms to Server 6

(31)

180ms 80ms 40ms 5ms 5ms 5ms

1

2

3

5

To minimise “dead time” during migration

-

Clients and servers cache previous snapshots received

4

6

cache cache cache cache cache cache
(32)

32 180ms 80ms 40ms 5ms 5ms 5ms

1

2

3

5

4

6

When game service migrates back to Server 4…

-

Only diffs from cached snapshot sent

-

cache

=

cache

cache

(33)

180ms 80ms 40ms 5ms 5ms 5ms

1

2

3

5

4

6

Less data

faster transmission

faster migration

+

cache

=

(34)

3. Extending Client Resources

34

Network

Call Of Duty

End users

Backend services

1. NetAgg: Higher bandwidth for distributed applications

1. EdgeCloud: Lower latency to Internet backend services

2. CloudSplit: Access to more memory resources

Offloads apps running on resource-constrained mobile devices

Code offloading
(35)

Code Offloading for Mobile Apps

Most existing offloading systems speed up execution of apps

– eg MAUI, COMET, ThinkAir, CloneCloud

– Offload compute-intensive app functions (eg game AI, image processing, …) – Exploits faster/more parallel CPU of remote server

Challenge: Increase

memory

available to application?

– Application memory footprint > available device memory – Offloading systems today migrate application state

(36)

State Migration

Vs State Partitioning

36

<offload request> +

Application state

transmit portion of state

Current state-of-the-art Maui ThinkAir COMET CloneCloud

(37)

State Migration

Vs State Partitioning

Application state

(38)

State Migration

Vs State Partitioning

38

Application state

<response> +

(39)

State Migration

Vs

State Partitioning

Internet

Local state

Remote state

(40)

State Migration

Vs

State Partitioning

40

<offload request>

(41)

State Migration

Vs

State Partitioning

<response>

(42)

State Migration State Partitioning

42

Pros

Simplifies handling of network

failures:

- Repeat failed call locally

Cons

Limited by device memory

Higher offloading overhead

Repeatedly offloaded calls require sending same state multiple times

Solved!

Need to recover

remote state after

(43)

Local state snapshot Lk-1

Local state

Remote state

Before offloading:

Snapshot-based Failure Recovery

(44)

44

Remote state snapshot Rk-1

Local state

Remote state

Before returning:

(45)

Local state

background task

Remote state

In the background:

<Rk>

Transmits most recent remote

(46)

Snapshot-based Failure Recovery

46

Local state

Remote state

After network failure during k

th

offloaded call:

-

Use

<R

k-1

>

and <L

k

> to restore state

-

Repeat failed k

th

offloaded call locally

<Rk-1>

Case A:

R

k-1

received before failure

(47)

snapshot-based fault-tolerance mechanism!

Local state

Remote state

Case B

:

R

k-1

not

received before failure

After network failure (during k

th

offloaded call):

Snapshot-based Failure Recovery

<Lk> <Lk-1>

<Rk-2>

-

Use

<R

k-2

>

and <L

k-1

> to restore state

(48)

Conclusions

Modern applications have new network requirements

– Interactive mobile apps need extremely low latency – Data centre applications require high bandwidth – Clients remain resource constrained

We need to change the network model:

Push application-specific computation into the network

– Exposes network locations for cloud service hosting – Blurs boundary between applications and networks – Three examples: NetAgg, EdgeCloud, CloudSplit

Lots of open research challenges

– Programming models, network management, security, …

Thank You! Any Question?

Peter Pietzuch

(49)
(50)

Cooling and Power Distribution

(51)

Developer’s View of the Network

POSIX network socket interface:

Strict separation between applications and network

No control over network

int

send(

int

sockfd

,

const void *

msg

,

int

len

,

int

flags

);

int

recv(

int

sockfd

,

void *

buf

,

(52)

Why Large DCs? Because Performance Matters

Online services

(e.g. Facebook, Google Search, Amazon)

– e.g. Facebook: 400,000,000 requests/s, 200 TB of data in RAM – Expected response time < 100 ms

• Amazon: every 10 ms of latency cost them 1% in sales • Google: extra 500 ms reduced traffic by 20%

Offline batch processing

(e.g. MapReduce, graph processing, ML)

– High throughput (e.g. Facebook 60-90 TB/day)

Internal services

to support above applications

– Google (GFS, BigTable, MapReduce), Yahoo (Hadoop, HDFS), Amazon (Dynamo, Astrolabe), Microsoft (Dryad), Facebook (Cassandra, Hbase)

(53)

Scalability through Partition/Aggregation

Many large-scale

services use

partition/aggregation

pattern

– e.g. Google Search 1. Request sent to

multiple workers 2. Workers process

request in parallel 3. Responses from

workers are combined

Every request on

Amazon might involve

10,000+ servers

Network critically determines application

(54)

Mismatch: Abstraction & Reality

Switches

Router

One logical hop is mapped to multiple physical hops

Two disjoint

logical paths share some physical links

(55)

Mismatch between App & Network Abstraction

Abstraction

Reality

Applications use logical

topologies to communicate

Dynamo MapReduce Tree Search Databus
(56)

Addressing Over-subscription

Solutions

– Keep traffic local to racks

– Reverse engineering of IP addresses to extract rack location

– Use multi-path topologies (fat trees) to increase bandwidth in core network [SIGCOMM’08, SIGCOMM’09]

(57)

Problem #2: TCP Incast

Distributed file system

Files are striped across

multiple servers

Client issues several synchronised reads

for given query

Servers respond (approx.) at same time

Switches drop entire windows of packets, causing TCP collapse

CONGESTIO N

Solutions

– Add random delays to each reply – Use small replies (<2.5 KB)

(58)

Problem #3: Path Collision

Network allocates paths independently

Applications cannot modify way packets are routed

(59)

Bridging the Application/Network Gap

Applications perspective

Network Perspective

Network only sees packets

No insights about application

behaviour

Has to infer application patterns

The network is a black box

for applications (and vice versa)

Applications only see IP addresses

− Hard to infer locality & congestion

No control over packet routing

Need to reverse-engineer

network

?

?

Why slow?

(60)

Single

administrative domain

Homogenous

HW and network

− x86 and Ethernet

Topology

known

− and can be customised

Trusted

components

− e.g. using virtualisation

Data Centre vs. Internet Networking

Internet

Data Centres

DC networks affected by Internet design…

…but data centres are not mini-Internets

Strict

layer

isolation

Multiple

administrative domains

Heterogeneous

hw and network

Topology

not

known

Malicious

software

(61)

Letting Applications Adapt the Network

Many proposals for

software-based routers and switches

– e.g. RouteBricks , ServerSwitch, PacketShader, SideCar, NetMap,…

Replace traditional (

application-agnostic

) network services

– e.g. IPv4 forwarding, deep packet inspection, firewalls

Why don’t use them to implement

application-specific

services?

– E.g. aggregate packets in distributed query or content-based routing

Goal:

Abstractions & mechanisms to enable applications to efficiently,

easily and safely

process data within the network

(62)

NetAgg Platform: Design

Agg nodes

connected to

switches via high capacity links

– High-performance aggregation in software

– Exploit associativity of aggregation computation for parallel execution

on multicore CPUs

Shim layers

at edge nodes

intercept network data

– Transparent aggregation for applications

– Requires few changes to existing applications Worker node Worker node Master node Shim layers 1 2 W W Agg node 1 AS Agg node 2

...

... S requests 62
(63)

NetAgg Aggregation I

Worker node Worker node Master node Shim layers 1 2 W W Agg node 1 AS Agg node 2

...

... S requests

1. Data processing request

received by

master node

2. Partial requests sent to all

worker nodes

for parallel

processing

(64)

NetAgg Aggregation II

3. Partial results redirected to

agg nodes

by

shim layers

on

worker nodes

Worker node Worker node 3 Agg node 1 AS Agg node 2 S W W ...

...

3 partial results 64
(65)

NetAgg Aggregation III

4.

Agg nodes

form

aggregation

tree

along network paths,

exchanging partially aggregated

results

5. Final aggregated result

returned to

master node

Master node Agg node 1 AS Agg node 2 S

...

4 5 aggregated results
(66)

Experimental Evaluation

Questions:

– What are the benefits for NetAgg applications? – What is the impact for non-NetAgg applications? – What is the Agg Node processing rate achieved?

Set-up: Packet-level simulator (OMNeT++)

– 1,024-server fat-tree topology (1 Gbps edge links) – Over-subscription of 1:4

– 10,000 simulated TCP flows

– Consider Flow Completion Time (FCT)

(67)

Individual Flow Completion Time

Includes 50%

non-NetAgg

flows

CDF

Flow Completion Time (seconds)

NetAgg

Edge-based

aggregation

(rack, chain,

binary tree)

(68)

Total Flow Completion Time

Faster Agg node

Worse

Better

Fl ow Co mpleti on

Time

R elativ e to R ack -lev el aggr egatio n

Agg node processing rate R (Gbps)

Even low aggregation rate enough

to achieve significant benefits

R=1 Gbps already

reduces time by 83%

91% time

reduction

(69)

E

DGE

C

LOUD

Features

Edge service migration occurs via clients

– Supports clients in multi-homed environments:

e.g. switching from LTE network to Wi-Fi LAN environment – Does not require edge locations to interact directly

No service modification required

– Edge services treated as “black boxes” running in containers

Asynchronous transfer of service state

– Service state transferred while service is running

(70)

Game events from Client 1

– Propagated to Clients 2 and 3 with some lag

• Client 2 – Server 4

10ms latency

• Client 3 – Server 4

300ms latency

Client 2 has an advantage over Client 3

– Can react to different events faster

– May lead to game inconsistencies

• Client 3 shoots at Client 1 thinking the latter is in its range

• Client 1 only appears to be in Client 3’s shooting range

»

due to increased network delays

Problems due to increased network latency

(71)

Game events from Client 1

– Received by Clients 2 and 3 in a synchronous fashion

• Client 2 – Server 6

130ms latency

• Client 3 – Server 6

180ms latency

Clients 2 and 3 playing on equal terms

(72)

Snapshot-based Fault-Tolerance Mechanism

72

Device periodically stores snapshot of remote state

– Used to reconstruct application state after failure

– Execution rolls back to last consistent snapshot received

User-level virtual memory scheme after failure

– Remote state may grow larger than available memory – Snapshots stored in flash memory

References

Related documents

Here we will discuss the importance of mobile applications, challenging for mobile apps testing, differences between mobile and desktop apps testing, types of mobile apps,

Network infrastructure for Big Data apps should be bandwidth capable, adaptable, and cost-efficient to handle the impact of ingesting large volumes of data and delivering it to

(1) This turns ON when the maximum and minimum values stored in buffer memory addresses 30 to 45 (Un\G30 to Un\G45) are reset by setting the maximum value/minimum value reset

split antigen A23(9) A23(9) Serology defined Serology defined broad antigen broad antigen A9 A9 Serologic mismatch Serologic mismatch A2 A2 Potential allele Potential allele

This study investigated the effect of the vaccine strain for tuberculosis, Bacille Calmette-Guerin (BCG) on neural tissue using the model of organotypic

The network needs application delivery optimization that can participate in the automated orchestration of virtual infrastructure to improve and guarantee performance

The scam was &#34;nearthed in 2ovember +99 which forced the then Chief &amp;inister of  &amp;aharashtra7 Ashok Chavan7 to resign. The allottees incl&#34;ded Dev'ani

We investigate the performance of variational approximations in the context of the mixed logit model, which is arguably one of the most used models for discrete choice data.. A