Evaluating MapReduce and Hadoop
for Science
Lavanya Ramakrishnan [email protected]
Lawrence Berkeley National Lab
Computation and Data are critical parts of the scientific process Experiment Theory Computation Data (Fourth Paradigm)
Three Pillars of Science
Advance Light Source Data Rates
2009 65 TB/yr 2011 312 TB/yr 2013 1900 TB/yr
Internet BigData led to the MapReduce and Hadoop Evolution
3
Map
A central component of the MapReduce model is its file system
HDFS GPFS and Lustre
Typical Replication 3 1
Storage Location Compute Node Servers Access Model Custom (except with
Fuse)
POSIX
Stripe Size 64 MB 1 MB Concurrent Writes No Yes
Scales with # of Compute Nodes # of Servers Scale of Largest
Systems
O(10k) Nodes O(100) Servers
Evaluating the Hype from Reality 5 MapReduce Clusters HPC NoSQL Cloud Hadoop on HPC MongoDB +Hadoop Hadoop on VM
Streaming adds a performance overhead
6
Evaluating Hadoop for Science, IEEE Cloud 2012
High performance parallel file systems can be used with Hadoop for small to medium
concurrency 0 2 4 6 8 10 12 0 500 1000 1500 2000 2500 3000 T im e (m in u te s) Number of maps Teragen (1TB) HDFS GPFS Linear (HDFS) Expon. (HDFS) Linear (GPFS) Expon. (GPFS) 7 Better
We evaluate three data-intensive operations with different testbed configurations
Filter Merge Reorder
Data operations impacts the performance
differences across file systems: Wikipedia (2TB)
0 5 10 15 Pr o ce ss in g ti m e (1 00 0s ) WriteTime ProcessingTime ReadTime Better
0.0 0.5 1.0 1.5 2.0 2.5 3.0 0 200 400 600 800 Size (TB) Processing time (s) HDFS GPFS
Read-intensive applications benefit from HDFS
Scientific Ensembles have similarities with MapReduce structure
11
A large number of loosely coupled tasks, each with their own internal parallelism.
All patterns could be implemented in Hadoop but with varying levels of difficulty
low high
There are challenges when using Hadoop for scientific applications 13 High throughput workflows Scaling up from desktops
File system: non POSIX
Language: Java
Input and output formats:
mostly line-oriented text
Streaming mode: restrictive i/p and o/p model
Data locality: what happens when multiple inputs?
File permissions: jobs run as user hadoop
Tigres: Design templates for common patterns of parallelism "LightSrc" Domain templates Base Tigres templates Scale up Application "LightSrc-1" Application "LightSrc-2" Create and Debug Share Create and Debug
Implement templates as a library in an
existing language
Templates
• Sequence ( name, task_array, input_array )
– e.g., output [ ] = Sequence (“my seq”, task_array_12,
input_array_12)
• Parallel ( name, task_array, input_array )
– e.g., output[ ] = Parallel(“abc”, task_array_12,
input_array_12)
• Split ( split_task, split_input_values, task_array, task_array_in )
– e.g., Split( task_x1, input_value_1, spl_t_arr, spl_i_arr)
• Merge ( task_array, input_array, merge_task, merge_input_values)
Evaluating the Hype from Reality 16 MapReduce Clusters HPC NoSQL Cloud Hadoop on HPC MongoDB +Hadoop Hadoop on VM
Reorder and Merge: Writes to Mongo
can be expensive
MongoDB HDFS 4.6 Million Input Records
Processing time (s) 0 200 400 600 800
*Sharded MongoDB vs HDFS on a 8 node Hadoop cluster (R=W)
Read Time Processing Time Write Time
MongoDB HDFS
4.6 Million Input Records
Processing time (s)
0
200
400
600
800 *Sharded MongoDB vs HDFS on a 8 node Hadoop cluster R<W
Read Time Processing Time Write Time Reorder Merge Better
4.6 9.3 18.6 37.2 Number of Input Records (Million)
Processing Time(min)
50
100
150
Hadoop−MongoDB MapReduce (2 workers)
MongoDB MapReduce
Filter: Hadoop MapReduce provides a way
to scale up analysis on MongoDB
Data analysis with Hadoop and MongoDB: Offload the MapReduce writes to HDFS
Better Sharding helps Reading from MongoDB Writing to MongoDB Move data to HDFS
Evaluating the Hype from Reality 20 MapReduce Clusters HPC NoSQL Cloud Hadoop on HPC MongoDB +Hadoop Hadoop on VM
Teragen and Terasort take longer on virtual machines 21 0 100 200 300 400 500 600 100 GB 200 GB 300 GB 400 GB 500 GB
Execution time (= sec)
Teragen performance Physical Virtual 0 500 1000 1500 2000 2500 3000 100 GB 200 GB 300 GB 400 GB 500 GB
Execution time (= sec)
Terasort performance
Physical Virtual
Reorder on virtual machines is faster (still investigating) 22 0 500 1000 1500 2000 34 GB 74 GB 111 GB
Execution time (= sec)
Wikibench reorder performance
Physical Virtual
Physical and virtual have different power profiles but correlate with maps and reduces
23 5 6 7 8 0 200 400 600 800 0 10 20 30 40 50 60 70 80 90 100 Power (= kW) Left percentage (= %) Time (= sec)
Wikibench reorder power consumption - Physical
37 GB Map Reduce 5 6 7 8 0 200 400 600 800 0 10 20 30 40 50 60 70 80 90 100 Power (= kW) Left percentage (= %) Time (= sec)
Wikibench reorder power consumption - Virtual
37 GB Map Reduce
Configuring Hadoop on Virtual Machines can benefit from different configurations
0 1000 2000 3000 4000 5000 6000 7000 8000 9000 30D 30C 30D 80C 30D 130C 80D 30C 80D 80C 130D 30C T im e (s ec o n d s) Different Configurations Filter Reorder Merge Better
Reorder (virtual) needs more compute nodes than data nodes
0.02 0.04 0.06 0.08 0.1 0.12 0.14 0.16 0.18 0.2 30-30 80-30 130-30 30-80 80-80 30-130 collocation Performance/power Wikibench on VMs, reorder 37GB 74GB 111GB 25 Better
Filter (virtual) can benefit from more data nodes 0 0.2 0.4 0.6 0.8 1 1.2 1.4 30-30 80-30 130-30 30-80 80-80 30-130 collocation Performance/power Wikibench on VMs, filter 37GB 74GB 111GB 26 Better
FRIEDA: Storage and Data Management on VMs 27
http://frieda.lbl.govSummary
•
MapReduce and Hadoop ecosystem are
powerful paradigms for science
– But may not be out of box solutions
•
It is possible to run Hadoop in
non-traditional configurations to enable use
in existing environments
Questions?
•
Email:
[email protected]
•
Collaborators
– Shane Canon, Elif Dede, Zacharia Fadika,
Madhu Govindaraju, Daniel Gunter, Eugen Feller, Christine Morin