Hadoop Optimizations for BigData Analytics

Weikuan Yu

(2)

Outline

Background

Network Levitated Merge

JVM-Bypass Shuffling

(3)

Emerging Demand for BigData Analytics

•  Big demand from many organizations in various domains

–  Scalable computing power without worrying about system maintenance.

–  Ubiquitously accessible computing and storage resources.

–  Low cost, highly reliable, trusted computing infrastructure.

(4)

MapReduce

•  A simple data processing model to process big data

•  Designed for commodity off-the-shelf hardware components.

•  Strong merits for big data analytics

–  Scalability: increase throughput by increasing # of nodes

–  Fault-tolerance (quick and low cost recovery of the failures of tasks)

•  Hadoop, an open-source implementation of MapReduce:

–  Widely deployed by many big data companies: AOL, Baidu, EBay,
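As an illustration of the processing model described on this slide, here is a minimal single-process sketch of a word count job. It is not Hadoop code: the function names (`map_phase`, `shuffle`, `reduce_phase`, `run_job`) are hypothetical, and the shuffle is an in-memory grouping rather than a network transfer.

```python
from collections import defaultdict
from itertools import chain


def map_phase(record):
    # Emit one <word, 1> pair for every word in the input record.
    return [(word, 1) for word in record.split()]


def shuffle(pairs):
    # Group values by key; in a real cluster this step moves
    # intermediate data from MapTasks to ReduceTasks.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    # Combine all values that share a key into one result.
    return key, sum(values)


def run_job(records):
    intermediate = chain.from_iterable(map_phase(r) for r in records)
    groups = shuffle(intermediate)
    return dict(reduce_phase(k, v) for k, v in sorted(groups.items()))
```

Calling `run_job(["big data", "big demand"])` counts "big" twice and the other two words once each.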

(5)

High-Level Overview of Hadoop

[Figure: Hadoop overview diagram. Components shown: Applications, Job Submission, JobTracker, Task Tracker/Runner, MapTasks, ReduceTasks, Shuffle, Intermediate Data, HDFS]

•  HDFS and the MapReduce Framework

•  Data processing with MapTasks and ReduceTasks

•  Three main steps of data movement.

•  Intermediate data shuffling in MapReduce is time-consuming

(7)

Motivation for Network-Levitated Merge

[Figure: timeline of the map, shuffle, merge, and reduce phases; labels: Start, First MOF, Serialization]

(8)

Repetitive Merges and Disk Access

[Figure: repetitive merge sequence; labels: 1: merge, 2: insert, 3: merge, 4: to merge soon, more]

Hadoop data spilling controlled through parameters

–  To limit the number of outstanding files
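To see why limiting the number of outstanding files leads to repeated disk access, consider the following simplified simulation. It is a sketch under stated assumptions, not Hadoop's actual spill logic: `max_files` is a hypothetical stand-in for the spill parameters, and the policy (merge everything on disk into one file whenever the limit is exceeded) is deliberately simpler than the real one.

```python
def simulate_repetitive_merges(segment_sizes, max_files):
    """Return (files_on_disk, bytes_rewritten) under a simplified spill policy."""
    on_disk = []
    bytes_rewritten = 0
    for size in segment_sizes:
        # A newly fetched segment is spilled to disk as its own file.
        on_disk.append(size)
        # Assumed policy: once too many files are outstanding,
        # merge all of them into a single file.
        if len(on_disk) > max_files:
            merged = sum(on_disk)
            # Every byte already on disk is read and written again.
            bytes_rewritten += merged
            on_disk = [merged]
    return on_disk, bytes_rewritten
```

With six segments of 10 units each and `max_files=2`, the simulation rewrites 80 units to merge 60 units of data, because earlier segments are merged more than once.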

(9)

Hadoop Acceleration (Hadoop-A)

Pipelined shuffle, merge and reduce

Network-levitated data merge

[Figure: Hadoop-A architecture. Java (Hadoop) side: JobTracker, TaskTrackers, MapTask, ReduceTask. C++ (acceleration) side: MOFSupplier with Data Engine and RDMA Server; NetMerger with Fetch Manager, Merge Manager, Merging Thread, Merged Data, and RDMA Client; connected over RDMA interconnects]

(10)

Pipelined Data Shuffle, Merge and Reduce

[Figure: timeline of the map, shuffle, merge, and reduce phases with pipelining; labels: start, First MOF, header, PQ setup]

(11)

Network-Levitated Merge Algorithm

[Figure: network-levitated merge over segments S1, S2, S3 of sorted <key, value> pairs, showing the merged data and the merge point; panels: (a) Fetching Header, (b) Priority Queue Setup]
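The two steps named in the figure, fetching headers and setting up a priority queue, can be sketched as follows. This is an in-process illustration only: `RemoteSegment` and `levitated_merge` are hypothetical names, each `fetch_next` call stands in for a network fetch, and no claim is made about how Hadoop-A batches or transports data.

```python
import heapq


class RemoteSegment:
    """Hypothetical stand-in for a sorted map output segment on a remote node."""

    def __init__(self, records):
        self._records = list(records)
        self._pos = 0
        self.fetches = 0  # counts simulated network fetches

    def fetch_next(self):
        # Return the next <key, value> pair, or None when the segment is exhausted.
        if self._pos >= len(self._records):
            return None
        self.fetches += 1
        record = self._records[self._pos]
        self._pos += 1
        return record


def levitated_merge(segments):
    heap = []
    # (a) Fetch only the header (first pair) of every segment.
    for idx, segment in enumerate(segments):
        head = segment.fetch_next()
        if head is not None:
            heap.append((head[0], idx, head[1]))
    # (b) Build the priority queue keyed on the first key of each segment.
    heapq.heapify(heap)
    # Merge: emit the smallest key, then pull the next pair from the
    # same segment on demand, so nothing is spilled to local disk.
    while heap:
        key, idx, value = heapq.heappop(heap)
        yield key, value
        nxt = segments[idx].fetch_next()
        if nxt is not None:
            heapq.heappush(heap, (nxt[0], idx, nxt[1]))
```

Because each segment contributes at most one entry to the heap at a time, the segment index breaks ties between equal keys and values are never compared.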

(12)

Job Progression with Network-levitated Merge

a) Map Progress of TeraSort b) Reduce Progress of TeraSort

•  a) Hadoop-A reduces the execution time by more than 47%

•  b) Both MapTasks and ReduceTasks are improved

[Figure: TeraSort progress (%) versus time (sec) in two panels, (a) Map and (b) Reduce, each comparing Hadoop-A, Hadoop on IPoIB, and Hadoop on GigE]

(13)

Breakdown of ReduceTask Execution Time (sec)

•  Significantly reduced the execution time of ReduceTasks

•  Most of the reduction came from shorter shuffle/merge time

–  An improvement of 2.5 times

•  Also improved the time to reduce data

–  An improvement of 15%

Category        PQ-Setup   Shuffle/Merge      Reduce or Merge/Reduce
Hadoop-GigE     -          1150.01 (65.0%)    599.97 (35.0%)
Hadoop-IPoIB    -          1148.26 (65.9%)    597.09 (34.1%)

(15)

WBDB, Oct 2012, S-15

JVM-Dependent Intermediate Data Shuffling

[Figure: JVM-dependent shuffle path. Components shown: HDFS, JobTracker, TaskTrackers, MapTask (Map, MOF1, MOF2), HttpServlet, ReduceTask (MOFCopiers, Staging, Sort/Merge, Reduce)]

Heavily relies on Java

(16)

JVM-Bypass Shuffling (JBS)

[Figure: JBS software stack. Java side: Data Analytics Applications, MapTask, TaskTracker, ReduceTask, HTTP Servlet, HTTP GET, Sockets, TCP/IP. C side (JVM-Bypass): MOFSupplier and NetMerger over RDMA Verbs and TCP/IP]

•  JBS removes JVM from the critical path of intermediate data shuffling

(17)

Benefits of JBS: 1/10 Gigabit Ethernets

•  JBS is effective for intermediate data of different sizes

–  Using Terasort benchmark, size of intermediate data = size of input data

•  JBS reduces the execution time by 20.9% on average on 1GigE and 19.3% on average on 10GigE

[Figure: Terasort job execution time (sec) versus input data size (16 to 256 GB) in two panels: Hadoop on 1GigE vs. JBS on 1GigE, and Hadoop on 10GigE vs. JBS on 10GigE]

(18)

Benefits of JBS: InfiniBand Cluster

•  JBS on IPoIB outperforms Hadoop on IPoIB and Hadoop on SDP by 14.1% and 14.8%, respectively.

•  Hadoop performs similarly when using IPoIB or SDP.

[Figure: Terasort job execution time (sec) versus input data size (16 to 256 GB), comparing Hadoop on IPoIB, Hadoop on SDP, and JBS on IPoIB]

(20)

Hadoop Fair Scheduler

•  Scheduler assigns tasks to the TaskTrackers

•  Tasks occupy slots until completion or failure

[Figure: timeline of jobs J-1, J-2, and J-3 occupying map slots (Slot-M1 to Slot-M5) and reduce slots (Slot-R1 to Slot-R3) over time; labels: Job Arrival, shuffle, reduce]

(21)

Fair Completion Scheduler

•  Prioritize ReduceTasks based on the shortest remaining map phase

•  When remaining map phases are equal, prioritize ReduceTasks of jobs with least remaining reduce data

•  Track the slowdown of preempted ReduceTasks
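The first two rules above amount to a two-level sort key, which can be sketched as below. The `JobState` fields and the `prioritize` function are hypothetical names chosen for illustration, and the slowdown tracking for preempted ReduceTasks is not modeled.

```python
from dataclasses import dataclass


@dataclass
class JobState:
    name: str
    remaining_map_time: float    # assumed estimate of the remaining map phase
    remaining_reduce_bytes: int  # assumed estimate of reduce data still to process


def prioritize(jobs):
    # Primary key: shortest remaining map phase.
    # Tie-breaker: least remaining reduce data.
    return sorted(
        jobs,
        key=lambda job: (job.remaining_map_time, job.remaining_reduce_bytes),
    )
```

For example, given J-1 with 30 seconds of map work left and J-2 and J-3 with 10 seconds each, J-2 and J-3 are ranked ahead of J-1, and whichever of the two has less reduce data left goes first.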

(22)

Average ReduceTask Waiting Time

[Figure: average ReduceTask waiting time (sec, axis ticks from 1 to 10000) for job groups 1 to 10, comparing FCS and HFS; data labels as extracted: 19,5 12,4 22 21 32.2 27.3 1.22 4.51 0.52 0.8]

•  ReduceTasks in small jobs are significantly sped up

(23)

Conclusions

•  Examined the design and architecture of the Hadoop MapReduce framework and revealed critical issues faced by the existing implementation

•  Designed and implemented Hadoop-A as an extensible acceleration framework which addresses all these issues

•  Provided JVM-Bypass Shuffling to avoid JVM overhead, packaged as a portable library that can run on both TCP/IP and RDMA protocols.

•  Designed and implemented Fast Completion Scheduler for
