Networking Architectures for Big-Data Applications

(1)

Raj Jain

Washington University in Saint Louis Saint Louis, MO 63130

[email protected]

Eighth Annual Microsoft Research Networking Summit Woodinville, WA, June 18-19, 2012

These slides and audio/video recordings of this talk are at:

(2)

Overview

1. Big Data: Why Now?
2. Database Solutions and Networking Bottlenecks
3. Intra-Cloud Solutions

(3)

Big Data: Why Now?

“Big Data” first appeared as a problem in October 1997

Big research funding in Q2 2012 ⇒ Big academic interest in all fields

Database, Analytics, Storage, Networking

Ref: http://whatsthebigdata.com/2012/06/06/a-very-short-history-of-big-data/

(4)

Samples of Recent News about Big Data

http://cloudcomputing.sys-con.com/node/2286075, May 30, 2012
http://www.accel.com/bigdata
http://betabeat.com/2012/02/big-data-funding-gets-bigger-ia-ventures-raises-105-m-double-its-previous-fund/

NSF $80M, DoD $250M, DOE $25M
http://gigaom.com/cloud/obamas-big-data-plans-lots-of-cash-and-lots-of-open-data/

(5)

Magnitude of Data

• 2.5 Exa (= 10^18) bytes of information created per day = 30k × US Library of Congress
• 9.57 Zetta (= 10^21) bytes processed by servers in 2008
• One Zetta byte of traffic on the Internet
• Solutions: Database, Analytics, Storage, Networking, …

Ref: http://en.wikipedia.org/wiki/Big_data
http://www.cisco.com/en/US/solutions/collateral/ns341/ns525/ns537/ns705/ns827/white_paper_c11-481360_ns827_Networking_Solutions_White_Paper.html
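To give the 2.5 Exa bytes per day figure a networking feel, the short calculation below converts it into a sustained rate. The 10 Gb/s link speed is an assumption chosen only for illustration; it does not come from the slides.

```python
# Convert the "2.5 Exa bytes per day" figure into a sustained data rate.
EXA = 10**18
SECONDS_PER_DAY = 24 * 60 * 60

bytes_per_day = 2.5 * EXA
bytes_per_second = bytes_per_day / SECONDS_PER_DAY
bits_per_second = bytes_per_second * 8

# Assumption for illustration only: a 10 Gb/s link running at full rate.
LINK_BPS = 10 * 10**9
links_needed = bits_per_second / LINK_BPS

print(f"{bytes_per_second / 1e12:.1f} TB/s sustained")
print(f"{bits_per_second / 1e12:.1f} Tb/s sustained")
print(f"about {links_needed:,.0f} fully loaded 10 Gb/s links")
```

The point is only the order of magnitude: data created at this rate cannot all be funneled through a few links, which is why placement of data and computation matters.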

(6)

Database Solutions

1. Share Nothing Architecture
2. Map-Reduce
3. Apache Hadoop Distributed File System

[Figures: (a) share-nothing architecture, each processor with its own data (Data1/Processor1 … Datan/Processor2); (b) Map-Reduce flow: Input, Map, Shuffle, Reduce, Output; (c) Hadoop cluster: Client, Name Node, Sec. NN, Job Tracker, and racks of DN+TT nodes connected through switches]

DN = Data Node, TT = Task Tracker
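The Map, Shuffle, and Reduce stages named in the figure can be sketched in a few lines. This is a minimal single-process illustration using word counting as the job; the function names are hypothetical and it is not Hadoop's API. In a real cluster the shuffle step is where intermediate data crosses the network between nodes.

```python
from collections import defaultdict


def map_phase(record):
    """Map: emit a (word, 1) pair for every word in one input record."""
    for word in record.split():
        yield word.lower(), 1


def shuffle(pairs):
    """Shuffle: group all values by key (network traffic in a real cluster)."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine all values seen for one key."""
    return key, sum(values)


def map_reduce(records):
    mapped = [pair for record in records for pair in map_phase(record)]
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


counts = map_reduce(["big data big networks", "data moves over networks"])
print(counts)
```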

(7)

Networking Bottlenecks

1. Network Attached Storage (NAS):
• Opposite of Share-nothing architecture
• Networking bottleneck ⇒ Directly attached storage
2. IP Subnets:
• Routing much slower than switching
• Moving VMs between subnets ⇒ Addressing issues ⇒ Keep communication in a subnet
3. L2 Fabric: Multi-switch latency ⇒ Single-rack traffic
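The "keep communication in a subnet" advice can be checked mechanically: two hosts can be switched at L2 only if they share a subnet, otherwise their traffic has to be routed. A minimal sketch with Python's standard `ipaddress` module follows; the addresses and the /24 prefix length are assumptions for illustration.

```python
import ipaddress


def same_subnet(host_a, host_b, prefix_len=24):
    """Return True if both hosts fall inside host_a's subnet."""
    # strict=False lets us pass a host address instead of a network address.
    network = ipaddress.ip_network(f"{host_a}/{prefix_len}", strict=False)
    return ipaddress.ip_address(host_b) in network


def needs_routing(host_a, host_b, prefix_len=24):
    """Traffic between different subnets must cross a router."""
    return not same_subnet(host_a, host_b, prefix_len)


print(needs_routing("10.0.1.5", "10.0.1.200"))  # same /24
print(needs_routing("10.0.1.5", "10.0.2.7"))    # different /24
```

A scheduler could use a check like this to prefer task placements whose traffic never leaves the subnet.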

(8)

Intra-Cloud Solutions

1. Flatter L2 Topologies
2. Virtualization: Ethernet over IP (VXLAN, TRILL). Multiple IP domains look like one L2 domain. Does not solve the latency issue
3. Optimal Placement: E.g., Camdoop uses reducers in the core
4. Location Based Naming: As in Tashi from CMU
5. Programmable Networks: Virtual topology similar to a single rack even though the systems are physically in different racks

[Figure labels: Subnet, Subnet, Subnet]
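One way to picture location-based naming and placement is to give each node a hierarchical name and estimate network distance from the shared prefix. The sketch below is a hypothetical illustration: the `/datacenter/rack/host` naming scheme and the hop-count metric are assumptions, not the scheme used by Tashi or Camdoop.

```python
def parts(name):
    """Split a hypothetical location name such as '/dc1/rack1/host1'."""
    return name.strip("/").split("/")


def distance(a, b):
    """Hops up to the closest common ancestor plus hops back down."""
    pa, pb = parts(a), parts(b)
    common = 0
    for x, y in zip(pa, pb):
        if x != y:
            break
        common += 1
    return (len(pa) - common) + (len(pb) - common)


def closest(source, candidates):
    """Pick the candidate node with the smallest distance to the source."""
    return min(candidates, key=lambda node: distance(source, node))


nodes = ["/dc1/rack2/host7", "/dc1/rack1/host3", "/dc2/rack1/host1"]
print(closest("/dc1/rack1/host1", nodes))
```

With names like these, a same-rack pair scores 2, a cross-rack pair 4, and a cross-datacenter pair 6, so a placement policy can simply minimize the score.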

(9)

Really Big Data: Multi-Cloud Issue

Example:
• Siri (Somewhat Intelligent Response Interpreter)
Needs to consult global databases
Where should the name tracker, data tracker, task tracker reside?

(10)

Our Solution: OpenADN

Open Application Delivery Networking Platform.
Platform = Clients, Servers, Switches, and Middle-boxes
Servers = Name nodes, Task trackers, Data nodes, Clients
Allows global Internet to appear as one datacenter
Message-specific sequence of middle boxes and servers

[Figure: Clients, Access ISPs, Internet, OpenADN switches, Routers, OpenADN middle-boxes, Servers A1, B1, Servers A2]

(11)

OpenADN Features

• Message-specific sequence of middle-boxes and Servers: Name nodes, Task trackers, Data nodes, Clients
• Middle Boxes: Load balancing, Fault tolerance, Intrusion Detection, …
Control plane and data plane MBs
• Server mobility, User Mobility
• ISPs can offer proper routing and switching services without visibility into data
CSPs (Cloud service providers) have visibility into data
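The idea of a message-specific sequence of middle-boxes and servers can be pictured as a policy lookup keyed on the message class. The sketch below is purely illustrative: the `Message` type, the policy table, and the hop names are hypothetical and are not the OpenADN interface.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    app: str    # application the message belongs to
    kind: str   # message class within that application


# Hypothetical policy: each message class gets its own chain of
# middle-boxes, ending at the server type that should handle it.
POLICY = {
    ("hadoop", "metadata"): ["intrusion_detection", "name_node"],
    ("hadoop", "block_read"): ["intrusion_detection", "load_balancer", "data_node"],
    ("hadoop", "task"): ["load_balancer", "task_tracker"],
}

# Assumed fallback for unclassified messages.
DEFAULT_CHAIN = ["intrusion_detection", "name_node"]


def chain_for(message):
    """Return the ordered list of hops for this message class."""
    return POLICY.get((message.app, message.kind), DEFAULT_CHAIN)


for msg in (Message("hadoop", "metadata"), Message("hadoop", "block_read")):
    print(msg.kind, "->", " -> ".join(chain_for(msg)))
```

Because the lookup depends only on the message class, the network operator can forward along the chain without inspecting the payload, which matches the split between ISP and CSP visibility described above.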

(12)

Key Features of OpenADN

1. Edge devices only. Core network can be current TCP/IP based or OpenFlow/SDN based
2. Coexistence (Backward compatibility): Old on New. New on Old
3. Incremental Deployment
4. Economic Incentive for first adopters
5. Resource owners (ISPs) keep complete control over their resources

(13)

Summary

1. Big data = Big Funding ⇒ Big Interest
2. Need networking architectures that complement Shared Nothing Architecture, Map Reduce, Hadoop File Systems
3. NAS needs to be redesigned. Flat topologies are helpful. Virtualization needs to address latency issues
4. Programmable networks are a potential solution
5. Solving multi-cloud is important for really big global data sets.

(14)

Thank You!

Comments?

Questions?
