Big Data Challenges in Simulation-based Science

(1)

Big Data Challenges in Simulation-based

Science

Manish Parashar*

Rutgers Discovery Informatics Institute (RDI

2

₎

Department of Computer Science

*Hoang Bui, Tong Jin, Qian Sun, Fan Zhang, ADIOS Team, and

other ex-students and collaborators….

(2)

Outline

• Data Grand Challenges

–

 

Data challenges of simulation-based science

• Rethinking the simulations -> insights pipeline

• The DataSpaces Project

(3)

RDI

2

Astronomy: LSST _{Physics: LHC}

Oceanography: OOI

Sociology: The Web

Biology: Sequencing

Economics: POS terminals

Neuroscience: EEG, fMRI

Data-driven Discovery in Science

Nearly every field of discovery is transitioning from “data

poor” to “data rich”

(4)

Scientific Discovery through Simulations

• Scientific simulations running on high-end computing systems generate

huge amounts of data!

–  If a single core produces 2MB/minute on average, one of these machines could

generate simulation data between ~170TB per hour -> ~700PB per day ->

~1.4EB per year

• Successful scientific discovery depends on a comprehensive understanding

of this enormous simulation data

How we enable the computation scientists to

e

ﬃ

ciently manage and explore extreme scale data:

(5)

Scientific Discovery through Simulations

• Complex, heterogeneous

components

• Large data volumes and data rates

• Data re-distribution (MxNxP), data

transformations

• Dynamic data exchange patterns

• Strict performance/overhead

constraints

• Complex work

ﬂ

ows integrating

coupled models, data management/

processing, analytics

• Tight / loose coupling, data

driven, ensembles

• Advanced numerical methods (E.g.,

Adaptive Mesh Re

ﬁ

nement)

• Integrated (online) analytics,

uncertainty quanti

ﬁ

cation, …

(6)

Traditional

Simulation -> Insight

Pipelines Break Down

• Traditional

simulation ->

insight

pipeline

–  Run large-scale simulation

workflows on large supercomputers –  Dump data to parallel disk systems

–  Export data to archives

–  Move data to users’ sites – usually

selected subsets

–  Perform data manipulations and

analysis on mid-size clusters –  Collect experimental /

observational data –  Move to analysis sites

–  Perform comparison of

experimental/observational to validate simulation data

Figure. Traditional data analysis pipeline

Analysis/Visualization Cluster Simulation Machines Storage Servers

…

Simulatio n Raw Data

(7)

Challenges Faced by Traditional HPC Data Pipelines

• Data analysis challenge

•  Can current data mining, manipulation

and visualization algorithms still work effectively on extreme scale machine?

The

costs of data movement

are increasing and dominating!

Figure. Traditional data analysis pipeline

• I/O and storage challenge

•  Increasing performance gap: disks are outpaced

by computing speed

• Data movement challenge

•  Lots of data movement between simulation and analysis machines, between

coupled multi-physics simulation components -> longer latencies

•  Improving data locality is critical: do work where the data resides!

• Energy challenge

•  Future extreme systems are designed to have low-power chips – however,

(8)

• The energy cost of moving data is a

significant concern

From K. Yelick, “Software and Algorithms for Exascale: Ten Ways to Waste an Exascale Computer”

The Cost of Data Movement

Energy _ move _ data= bitrate * length

2

cross_section_area_of_wire

performance

gap

• Moving data between node

memory and persistent

storage is slow!

(9)

Rethinking the Data Management Pipeline – Hybrid Staging +

In-Situ & In-Transit Execution

Issues/Challenges

• Programming abstractions/systems

• Mapping and scheduling

• Control and data flow

• Autonomic runtime

• Link with the external workflow

Systems

(10)

Design space of possible workflow architectures

•  Loca%on of the compute resources

–  Same cores as the simulaHon (in situ)

–  Some (dedicated) cores on the same nodes

–  Some dedicated nodes on the same machine

–  Dedicated nodes on an external resource

•  Data access, placement, and persistence

– Direct access to simulaHon data structures

– Shared memory access via hand-‐oﬀ / copy

– Shared memory access via non-‐volaHle near

node storage (NVRAM)

– Data transfer to dedicated nodes or external

resources

•  Synchroniza%on and scheduling

–  Execute synchronously with simulaHon

every nth_{simulaHon Hme step}

–  Execute asynchronously

Processing data on remote nodes

Using distinct cores on same node Sharing cores with the simulation

DRA M DRA M DR A M Simulation Node DR A M Staging Node Network Communication Analysis Tasks Simulation Visualization DRAM NVRAM SSD Hard Disk CPUs DRAM NVRAM SSD Hard Disk CPUs Network Node 1 Node 2 Node N ... Staging option 1 Staging option 2 Staging option 3

(11)

DataSpaces: A Scalable Shared Space Abstraction for Hybrid

Data Staging [JCC12]

App1 put()/pub()

Shared Space Model

Simula�on get()/sub() put() App3 get() App2 get() m d d d d d: Data object m: Meta-data object m m m d l 

Shared-space programming

abstraction over hybrid staging

l 

Simple API for coordination,

interaction and messaging

l 

Provides a global-view

programming abstraction

l 

Distributed, associative,

in-deep-memory object store

l 

Online data indexing, flexible

querying

l 

Exposed as a persistent service

l 

Autonomic cross-layer runtime

management

l 

High-throughput/low-latency

asynchronous data transport

0

10 20 30 40 50 60 70 80 90 100 110 120 512/32 1024/64 2048/1284096/2568192/51216K/1K 32K/2K 64K/4K 2GB 4GB 8GB 16GB 32GB 64GB 128GB 256GB Throughput(GB/s)

Aggregate data redistribution throughput

D ata Sp ac es s ca la b ility o n O R N L T ita n

(12)

DataSpaces Abstraction: Indexing + DHT

[HPDC10]

•  Dynamically constructed structured

overlay using hybrid staging cores

•  Index constructed online using SFC

mappings and applications attributes

•  E.g., application domain, data,

field values of interest, etc.

•  DHT used to maintain meta-data

information

•  E.g., geometric descriptors for

the shared data, FastBit indices, etc.

•  Data objects load-balanced

(13)

Multphysics Code Coupling at Extreme Scales [CCGrid10]

PGAS Extensions for Code Coupling [CCPE13]

DataSpaces: Enabling Coupled Scientific

Workflows at Extreme Scales

Data-centric Mappings for In-Situ Workflows [IPDPS12]

Simulation

in-situ analytics

DART

...

Data Pool of computation bucket in the staging area

Simulation in-situ analytics DART Simulation in-situ analytics DART Simulation in-situ analytics DART DataSpaces "dat_a-read y" ev_ent "buck et-read y" even t Assig ned ta sk & data d escri ptors DART in-transit analytics DART in-transit analytics DART in-transit analytics DART in-transit analytics

compute nodes running simulation and in-situ analytics

Data Interaction and coordnation

Resoure scheduler

Dynamic Code Deployment In-Staging [IPDPS11]

kernel_min { for i = 1, n for j = 1, m for k = 1, p if (min > A(i, j, k)) min = A(i, j, k) } data_kernels.c 0x69 0x20 0x61 0x6d 0x20 0x63 0x6f 0x6f 0x6c 0x0 data_kernels.o _{Applications.o} Applications executable apr un apr un Compute nodes Link gcc -c Staging nodes rexec() return() for i = 0, ni-1 do for j = 0, nj-1 do for k = 0, nk-1 do val = input:get_val(i,j,k) if min > val then min = val end end end data_kernels.lua

Runtime execution system (Rexec)

(14)

Integrating In-situ and In-transit Analytics [SC’12]

• S3D: First-principles direct

numerical simulation

• Simulation resolves features on

the order of 10 simulation time

steps

• Currently on the order of every

400

th

_{time step can be written}

to disk

• Temporal fidelity is

compromised when analysis is

done as a post-process

Recent data sets generated by S3D, developed at the CombusHon Research Facility, Sandia NaHonal Laboratories

(15)

Integrating In-situ and In-transit Analytics – Design

23

Topology

StaHsHcs

Identify features of

interest

Volume Rendering

*J. C. Bennett et al., “Combining In-Situ and In-Transit Processing to Enable Extreme-Scale Scientific Analysis”,

SC’12, Salt Lake City, Utah, November, 2012.

(16)

In-situ/In-transit Data Analytics

• Online data analytics for large-scale simulation workflows

–

 

Combine both in-situ and in-transit placement to r

educe impact on

simulation, reduce network data movement, reduce total execution time

–

 

Adaptive runtime management:

Optimize the performance of

simulation-analysis workflow through adaptive data and analysis placement, data-centric task mapping

primary resources shared data structures secondary resources Simulation In-situ processing In-transit Task Executors asynchr onous in-s itu to in-trans it data tr ansfers Workflow Manager data ready

Pull task & data descriptors

Figure. Overview of the in-situ/in-transit

(17)

Simulation case study with S3D: Timing

results for 4896 cores and analysis every

simulation time step

16.85& 2.72& 0.73& 1.7& 1.69& 2.06& 119.81& 5.07&

0& 2& 4& 6& 8& 10& 12& 14& 16&

simula3on& hybrid&topology& hybrid&visualiza3on& in@situ&visualiza3on& hybrid&sta3s3cs& in@situ&sta3s3cs& seconds&

(18)

Simulation case study with S3D: Timing

results for 4896 cores and analysis every

simulation time step

16.85& 2.72& 0.73& 1.7& 1.69& 2.06& 119.81& 5.07&

0& 2& 4& 6& 8& 10& 12& 14& 16&

simula3on& hybrid&topology& hybrid&visualiza3on& in@situ&visualiza3on& hybrid&sta3s3cs& in@situ&sta3s3cs& seconds&

S3D&

in@situ&

data&movement&

in@transit&&

10.0% 10.1% 4.3%

.04%

(19)

Simulation case study with S3D: Timing

results for 4896 cores and analysis every 10

th

simulation time step

168.5& 119.81&

0& 20& 40& 60& 80& 100& 120& 140& 160&

simula1on& hybrid&topology& hybrid&visualiza1on& in>situ&visualiza1on& hybrid&sta1s1cs& in>situ&sta1s1cs& seconds&

S3D&

in>situ&

data&movement&

in>transit&&

1.0% 1.0% .43%

.004%

(20)

Simulation case study with S3D: Timing

results for 4896 cores and analysis every 100

th

simulation time step

1685% 0% 200% 400% 600% 800% 1000% 1200% 1400% 1600% simula/on% hybrid%topology% hybrid%visualiza/on% in<situ%visualiza/on% hybrid%sta/s/cs% in<situ%sta/s/cs% seconds%

S3D%

in<situ%

data%movement%

in<transit%%

.1% .1% .04%

.004% .16%

(21)

In-Situ Feature Extraction and Tracking using

Decentralized Online Clustering (DISC’12)

DOC workers executed in-‐situ on simula%on machines

SimulaHon Compute Nodes

One compute node

Processor core runs simulaHon Processor core runs DOC worker

Beneﬁts of runtime feature extraction and tracking

(1) Scientists can follow the events of interest (or data of interest) (2) Scientists can do real-time monitoring of the running simulations

(22)

Pixplot

8 cores

Pixmon

1 core

(login node)

ParaView Server

4 cores

In-situ viz. and

monitoring with

staging (SC’12)

Pixie3D

1024 cores

DataSpaces

record.bp

pixie3d.bp

(23)

Scalable Online Data Indexing and Querying

(BDAC’14, CCPE’15)

• Enable query-driven data analytics to extract insights from the

large volume of scientific simulation data

N 3 Index-‐1 Index-‐4 Index-‐3 Index-‐2 Query1 Query2 Query3 Query4 Coord-‐based Coord-‐based Value-‐based Value-‐based Hilbert SFC Z -‐ o r d e r SFC Bitmap Index Tree Index N 1 N 2 N 4 Hash: H2 Hash: H4 Hash: H3 Hash: H1

Figure. Conceptual view of the

programmable online indexing & querying using multiple index

Figure. Value-based online index and query framework using FastBit

(24)

Fig. Conceptual framework for realizing multi-tiered staging for coupled data-intensive simulation workflows.

Multi-tiered Data Staging with Autonomic

Data Placement Adaptation – Overview (IPDPS’15)

• A multi-tiered data staging approach that uses both DRAM and

SSD to support data-intensive simulation workflows with large

data coupling requirements.

• Efficient application-aware data placement mechanism that

(25)

get()/sub() put()/_pub() ADIOS/ Dataspaces ADIOS/ Dataspaces App App TCP/IP RDMA, Gridftp Control Node Data Tx/ Rx Node Dataspaces WA

Wide-area Staging (Demo at SC’14)

• Wide-area shared staging abstraction achieved by

interconnecting multiple staging instances

–  Leverages SDN for provisioning; Workflow/Grid technologies

• Data placement optimized based on demand, availability,

(26)

Combus%on: S3D uses chemical model to simulate combusHon process in order to understand the igniHon characterisHc of bio-‐fuels for automoHve usage.

PublicaHon: SC 2012.

Fusion: EPSi simulaHon workﬂow provides insight into edge plasma physics in magneHc fusion devices by simulaHng movement of millions parHcles.

PublicaHon: CCGrid 2010, Cluster 2014

Subsurface modeling: IPARS uses mulH-‐physics, mulH-‐model formulaHons and coupled mulH-‐block grids to support oil reservoir simulaHon of realisHc, high-‐ resoluHon reservoir studies with a million or more grids.

PublicaHon: CCGrid 2011

Computa%onal mechanics: FEM coupled with AMR is oben used to solve heat transfer or ﬂuid dynamic problems. AMR only performs reﬁnements on

important region of the mesh. (work in progress)

Material Science: QMCPack uHlizes quantum Monte Carlo (QMC) studies in heterogeneous catalysis of transiHon metal nanoparHcles, phase transiHons, properHes of materials under pressure, and strongly correlated materials. (work in progress)

Material Science: SNS workﬂow enables integraHon of data, analysis, and modeling tools for materials science experiments to accelerate scienHﬁc

(27)

Summary & Conclusions

• Complex applications running on high-end systems

generate extreme amounts of data that must be managed

and analyzed to get insights

–

 

Data costs (performance, latency, energy) are quickly dominating

–

 

Traditional data management/analytics pipelines are breaking

down

• Hybrid data staging, in-situ workflow execution, etc.

can

address this challenges

–

 

Users to efficiently intertwine applications, libraries, middleware

for complex analytics

• Many challenges; Programming, mapping and scheduling,

control and data flow, autonomic runtime management

….

(28)

Thank You!

Manish Parashar

Email: [email protected]

WWW: parashar.rutgers.edu

WWW: dataspaces.org

!

EPSi

(29)