• No results found

Addressing Crosscutting Compute/Data Challenges

N/A
N/A
Protected

Academic year: 2021

Share "Addressing Crosscutting Compute/Data Challenges"

Copied!
13
0
0

Loading.... (view fulltext now)

Full text

(1)

Addressing Crosscutting Compute/Data

Challenges

Manish Parashar

[email protected]

© Manish Parashar, Rutgers University

Rutgers  Discovery  

Informatics  Institute  (RDI

2

)    

New  Jersey

s  Center  for  Advanced  

Computation

 





2

CDSE - Many crosscutting challenges

Computing

–  Multicore; large and increasing core counts, deep memory

hierarchies (TH-2: 54.9 PF, 3.12 M Cores, 1.4 PB, 25 MW)

–  New prgm. model, concerns (fault tolerance, energy, etc)

–  New models & technologies: Clouds, grids, hybrid

manycore, accelerators, deep storage hierarchies, …

Data

–  Generating more data than in all of human history: preserve,

mine, share?

Software

–  Complex applications on coupled compute-data-networked

environments, tools needed

–  Modern apps: 106+ lines, many groups contribute, take

decades to develop, very long lifetimes • 

People

–  Multidisciplinary expertise essential!

(2)

Complex Coupled Application Workflows

Tight coupling

–  Coupled simulations/models run concurrently and exchange data frequently – E.g., Coupled fusion simulations, multi-block subsurface modeling

simulations

Loose coupling

–  Coupling and data exchange are less frequent, and often asynchronous and possibly opportunistic

– E.g., Combustion simulations – DNS + LES coupling, material science simulations

Data-flow (data-driven) coupling

–  Workflows expressed as dataflow based DAG where data produced by a simulation flows through one or more data processing pipelines – E.g., End-to-end simulations workflows with data analytics

Ensemble workflows

•  Multiple fan-out stages composed of large numbers of loosely coupled or independent simulations

•  E.g., Uncertainty quantification, history matching, data assimilation





(3)





Scientific Simulations: A Big Data Challenge

Scientific simulations running on high-end computing systems generate

huge amounts of data!

–  If a single core produces 2MB/minute on average, one of these machines could generate simulation data between ~170TB per hour -> ~700PB per day ->

~1.4EBper year

Successful scientific discovery depends on a comprehensive understanding

of this enormous simulation data

How we enable the computation scientists to

e

ciently manage and explore extreme scale data:

nd the needles in haystack” ??





Challenges Faced by Traditional HPC Data Pipelines

The

costs of data movement

are increasing and dominating!

Figure. Traditional data analysis pipeline

Data analysis challenge

•  Can current data mining, manipulation and visualization algorithms still work effectively on extreme scale machine?

I/O challenge

•  Increasing performance gap: disks are outpaced by computing speed

Data movement challenge

•  Lots of data movement between simulation and analysis machines, between coupled mutli-physics simulation components -> longer latencies

•  Improving data locality is critical: do work where the data resides!

Energy challenge

•  Future extreme systems are designed to have low-power chips – however, much greater power consumption will be due to memory and data movement!

(4)

Coupling Cycles of Innovation

Interoperability/Standards?

7





Organization Organization Organization

Grid Middleware Infrastructure

Virtual Organization

Applications

Programming System

(Programming Model + Abstract Machine)

Virtualization

(Create and manage VOs and VMs)

Abstraction

(Support abstract behaviors/assumptions)

Con ce ptual Im pl ementati o n

Virtual Machine Virtual Machine Virtual Machine

Organization Organization Organization

Organization

Organization OrganizationOrganization OrganizationOrganization

Grid Middleware Infrastructure

Virtual Organization

Applications

Programming System

(Programming Model + Abstract Machine)

Virtualization

(Create and manage VOs and VMs)

Abstraction

(Support abstract behaviors/assumptions)

Virtualization

(Create and manage VOs and VMs)

Abstraction

(Support abstract behaviors/assumptions)

Con ce ptual Im pl ementati o n

Virtual Machine Virtual Machine Virtual Machine

Addressing Complexity

–  Programming System

•  programming model, languages/

abstraction – syntax + semantics •  abstract machine – execution

context and assumptions •  infrastructure, middleware &

runtime –  Hide v/s expose?

•  virtual homogeneity – ease of programming

•  robustness ?

•  cross-layer adaptation & optimization

– the inverted stack …

Abstractions?

Conceptual and Implementation Models for the Grid, M. Parashar and J.C. Browne, Proceedings of the IEEE, Special Issue on Grid Computing, IEEE Press, Vol. 19, No. 3, March 2005.!

(5)





Software Engineering and CDS&E

•  Many sources of complexity

–  complexity from the problem (complex physics) –  complexity of the end-to-end application workflow

–  complexity from coordination & coupling across codes and research teams –  complexity from the codes and how they are developed and implemented –  complexity from the underlying disruptive infrastructure

•  Challenge: Manage complexity while maintaining performance/scalability

•  Coupling cycles of innovation

•  Software Engineering in the Large v/s Software Engineering in the Small ?

–  Software management of large systems v/s smaller but highly complex systems –  Well defined v/s rapidly evolving requirements/execution environments –  Primary complexity from size v/s application requirements

–  Relative importance of performance –  …

•  Design Principles

–  Separation of Concerns –  Abstractions

•  Interoperability and standardization





Complexity suggests a SOA approach

Service Oriented Architecture (SOA): Software as a composition of

“services”

–  Used to deal with system/application complexity, rapidly changing requirements, evolving target platforms, and diverse teams

–  Service: “… a well-defined, self-contained, and independently developed software element that does not depend on the context or state of other services.”

–  Abstraction & separation

•  Computations from compositions and coordination •  Interface from implementations

–  Existing and proven concept - widely accepted/used by the enterprise computing community

•  For example, extreme-scale data analysis services routinely run at Yahoo & Google, for near real-time web data analysis, for rapid deployment of new web services, etc.

(6)

Examples of some relevant services

•  I/O services: Efficient and adaptable I/O support

–  I/O services should be componentized so that it allows codes to switch between and tune different methods easily.

–  Example: a user may easily switch between file-based and

memory-based I/O or switch formats without changing any code, all the while maintaining the same data model when switching I/O components.

•  Code-coupling services: Code-coupling services includes codes that can run on the same or different machines, codes that are tightly coupled (memory-to-memory), as well as codes that are loosely coupled (exchange of data through files).

•  Automatic online (in-transit) data processing services: Efficient data movement services for distributing and archiving data

–  Example: generating summary statistics, graphs, images, movies, etc.

•  Monitoring Services: dynamic real-time monitoring of large-scale simulations (during execution)

•  Automated provenance collections services: Data, system workflow, application, etc.





ADIOS:  Adaptable  I/O  System  

Provides  portable,  fast,  scalable,  

easy-­‐to-­‐use,  metadata  rich  output  

Simple  API    

Layered  so;ware  architecture:  

– Allows  plug-­‐ins  for  different  I/O  

implementaCons  

– Abstracts  the  API  from  the  method  used  

for  I/O  

Beyond  I/O  –  coupling,  

coordinaCon,  in-­‐situ/in-­‐transit,  

etc.  

Open  source:  

– hGp://www.nccs.gov/user-­‐support/

center-­‐projects/adios/  

Research  methods  from  many  

groups:  

–  Georgia Tech, Rutgers, NCSU, Emory, Auburn, Sandia, LBL, PPPL   Interface)to)apps)for)descrip/on)of)data)(ADIOS,)etc.)) Buffering) Feedback) )Schedule) Mul/Bresolu/on)

methods) Data)Compression)methods) (FastBit))methods)Data)Indexing) Data)Management)Services)

Workflow))Engine) Run/me)engine) Data)movement) Provenance) Plugins)to)the)hybrid)staging)area) Visualiza/on)Plugins) Analysis)Plugins) Parallel)and)Distributed)File)System) IDX) HDF5)

AdiosBbp) pnetcdf) “raw”)data) Image)data) Viz.)Client) 0 5 10 15 20 25 30 35 S3D SEC Fine/Turbo G B/ s

Simulations with and without ADIOS I/O performance of the Combustion S3D code (96K

cores), the SCEC PCML3D (30K cores), and the Fine/Turbo (4K cores) codes.

MPI-IO OR PHDF5 ADIOS

(7)





Application Code Contact

Astrophysics+ Chimera+ T.+Mezzacappa+

Combus6on+ S3D+ J.+Chen+ AMR+ Chombo+ B.+Van+Straalen+

CFD+ Fine/Turbo+ M.+Gon6er+ FusionCEdge+ XGC1+ C.+S.+Chang+ FusionCEdge+ XGC0+ S.+H.+Ku+ Geoscience+ AWPCODC+ Y.+Cui+ Materials+ LAMMPS+ A.+Frachioni+ Weather+ GRAPES+ W.+Xue+ Nuclear+ HFODD+ H.+Nam+ Rela6vity+ Maya+ P.+Laguna+ Sub+surface+ M.+Wheeler+ Materials+ M.+Parashar+ Fusion+ GTCCP+ S.+Ethier+

Application Code Contact

Fusion+ GTS+ W.+Wang+ Fusion+ M3DCC1+ S.+Jardin+ Fusion+ M3DCK+ G.+Y.+Fu+ Fusion+ GTC+ Z.+Lin+ Fusion+ GEM+ S.+Parker+ Quantum+ QLG2Q+ M.+Soe+

Image+Analysis+ T.+Kurc+

AMR+ Boxlib+ J.+Bell+ Rela6vity+ Cactus+ G.+Allen+ Weather+ NASA+ T.+Clune+ Materials+ QMC+ J.+Kim+

Climate+ J.+Fu+

Climate+ CAM+ K.+Evans+ Application Code Contact

Astrophysics+ Chimera+ T.+Mezzacappa+

Combus6on+ S3D+ J.+Chen+ AMR+ Chombo+ B.+Van+Straalen+

CFD+ Fine/Turbo+ M.+Gon6er+ FusionCEdge+ XGC1+ C.+S.+Chang+ FusionCEdge+ XGC0+ S.+H.+Ku+ Geoscience+ AWPCODC+ Y.+Cui+ Materials+ LAMMPS+ A.+Frachioni+ Weather+ GRAPES+ W.+Xue+ Nuclear+ HFODD+ H.+Nam+ Rela6vity+ Maya+ P.+Laguna+ Sub+surface+ M.+Wheeler+ Materials+ M.+Parashar+ Fusion+ GTCCP+ S.+Ethier+

Application Code Contact

Fusion+ GTS+ W.+Wang+ Fusion+ M3DCC1+ S.+Jardin+ Fusion+ M3DCK+ G.+Y.+Fu+ Fusion+ GTC+ Z.+Lin+ Fusion+ GEM+ S.+Parker+ Quantum+ QLG2Q+ M.+Soe+

Image+Analysis+ T.+Kurc+

AMR+ Boxlib+ J.+Bell+ Rela6vity+ Cactus+ G.+Allen+ Weather+ NASA+ T.+Clune+ Materials+ QMC+ J.+Kim+

Climate+ J.+Fu+

Climate+ CAM+ K.+Evans+

ADIOS Application

Partners

Over 158 Publications from 2006 - 2013





The ADIOS Approach

The overarching design philosophy of our framework is

based on the

Service-Oriented Architecture

Used to deal with system/application complexity, rapidly

changing requirements, evolving target platforms, and

diverse teams

Applications constructed by assembling services based

on a universal view of their functionality using a

well-defined API

Service implementations can be changed easily

Integrated simulation can be assembled using these

(8)

ADIOS SOA: I/O Componentization

•  End users should be able to select the most efficient I/O method for their code, with minimal effort in terms of code alternations/updates.

•  Performance-driven choices should not prevent data from being stored in the desired file format, since this is crucial for later data analysis.

•  Have efficient ways of identifying and selecting certain data for analysis, to help end users cope with the flood of data being produced.

•  Make it easy to introduce new research I/O methods, without changing your code. –  allow the “best” CDS&E researchers to contribute in application science projects. •  Make it easy to allow I/O to do more than just I/O à code coupling, in situ

visualization.





Exploit multi-levels in-memory

Hybrid Data Staging

to:

–  Decrease the gap between CPU and IO speeds

–  Dynamically deploy and execute data analytical or pre-processing operations either in-situ or in-transit

–  Improve IO write performance

Rethinking the Data Management Pipeline – Hybrid Staging

+ In-Situ & In-Transit Execution

(9)





In-situ/In-transit Data Management & Analytics

l 

Virtual shared-space programming abstraction

l 

Simple API for coordination, interaction and messaging

l 

Distributed, associative, in-memory object store

l 

Online data indexing, flexible querying

l 

Adaptive cross-layer runtime management

l 

Hybrid in-situ/in-transit execution

l 

Efficient, high-throughput/low-latency asynchronous data transport





Overview of Research Work @ RU

Programming abstractions / system

–  DataSpaces/XpressSpaces: Interaction, coordination and messaging abstraction for coupled scientific workflow [HPDC10, CCGrid10,11, HiPC12, CCPE13]

–  ActiveSpaces: Dynamic code deployment for in-staging data processing [IPDPS11]

Runtime mechanisms

–  Data-centric Task Mapping: Reduce data movement and increase intra-node data sharing [IPDPS12, DISC12]

–  In-situ & In-transit Data Analytics: Simulation-time analysis of large volume data by combining in-situ and in-transit execution [SC’12]

–  Cross-layer Adaptation: adaptive cross-layer approach for dynamic data management in large scale simulation-analysis workflow

High-throughput/low-latency asynchronous data transport

–  DART: Network independent transport library for high speed asynchronous data extraction and transfer [HPDC08]

(10)

DataSpaces: A Shared Space Abstraction for the

Hybrid Staging Area [HPDC’10]

Semantically-specialized

virtual

shared space abstraction in the

staging area

–  Shared (distributed) data objects –  Simple put/get/query API –  Supports coupled app. workflows –  Provide a global-view programming

abstraction consistent with the PGAS model (UPC, GA)

DataSpaces (HPDC10)

–  Constructed on-the-fly on hybrid staging nodes

•  Indexes data for quick access and retrieval

•  Provides asynchronous coordination and interaction support

–  Complements existing interaction/ coordination mechanisms

Code coupling using DataSpaces

–  Transparent data redistribution –  Complex geometry-based queries –  In-space (online) data

transformation and manipulations

           

Figure. Graphical view of the shared space model (derived from Tuple space) for coordination/ interaction





The Replica Exchange Algorithm

A powerful sampling algorithm that preserves canonical distributions

–  Allows for efficient crossing of high energy barriers that separate thermodynamically stable states

–  Can significantly reduce sampling times as compared to formulations based on constant temperatures

Algorithm overview

–  Several copies, or replicas, of the system are simulated in parallel at different temperatures using “walkers”

–  Walkers occasionally swap temperatures to allow them to bypass enthalpic barriers by moving to a high temperature

•  Exchanges governed by a probability condition to ensure detailed balance

Replica exchange can significantly impact the fields of structural

biology and drug design

–  structure based drug design associated with protein misfolding, for example, structure based drug design and binding affinity optimization –  molecular basis of human diseases associated with protein misfolding

(11)





Parallel Replica Exchange

General formulation requires dynamic and complex

coordination and communication patterns between

the walkers

Pair-wise and asynchronous

Dependent on the current state (temperature, energy) of

replicas, which are changing dynamically

Implementation based on commonly used parallel

programming frameworks is challenging

Message passing frameworks, e.g. MPI, require matching

sends and receives to be explicitly defined for each

interaction

Implementations use restrictive formulations using

synchronous exchanges

•  Exchanges between neighboring temperatures only





Asynchronous Replica Exchange using

DataSpaces

DataSpaces provides a semantically specialized virtual

shared space abstraction to support scalable

asynchronous replica exchange for molecular dynamics

applications

Enables general non-nearest neighboring temperature exchanges

Exchanges are decoupled and asynchronously and dynamically

determined

Communications are decentralized and peer-to-peer, and occur in

parallel

Operation

Walkers post temperature range of interest to the

shared space using post

Walkers discover partners and negotiate exchange

(12)

Example: Salsa Operation





Summary

•  Unprecedented opportunities for dramatic insights through

computation!

•  Challenge: Manage complexity while maintaining performance/scalability.

–  complexity from the problem (complex physics)

–  complexity from the codes and how they are developed and implemented –  complexity of underlying infrastructure (disruptive hardware trends) –  complexity from coordination across codes and research teams

•  Overarching philosophy

–  Abstraction & Separation through Service Orientation •  Independence in development, deployment, execution

•  Computations from compositions and coordination; Interface from implementations –  Existing and proven concepts - widely accepted/used by the enterprise

computing community

•  ADIOS

–  Reducing barriers from a scientists perspective •  Ease-of-use, simple code integration and maintainability –  Minimizing performance impact

(13)





Thank You!

Manish Parashar, Ph.D.

Prof., Dept. of Electrical & Computer Engr. Rutgers Discovery Informatics Institute (RDI2) Cloud & Autonomic Computing Center (CAC) Rutgers, The State University of New Jersey Email: [email protected]

WWW: rdi2.rutgers.edu

!

EPSi

Edge Physics Simulation

References

Related documents

As the GSE points out, the curriculum should serve the objectives of the Division, by connecting the philosophy of liberal arts and specialized education, realized through

But although The Unfortunate Traveller does contain travestied elements of the genre of apocalypse (most notably in connection with Esdras of Granada),

According to the international experience, federal authorities can carry out six groups of functions for support of mechanisms of development of innovative

Whether grown as freestanding trees or wall- trained fans, established figs should be lightly pruned twice a year: once in spring to thin out old or damaged wood and to maintain

Creating an entire game in Roblox does take a lot of efforts - you will need to implement in-game purchases, which players can choose to purchase using Robux, and you will then

It reflects the acquisition less disposal of financial assets by the general government sector in the form of Currency and deposits (F.2), Debt securities (F.3), Loans granted

Players can create characters and participate in any adventure allowed as a part of the D&D Adventurers League.. As they adventure, players track their characters’

Review of previous studies indicates they have been conflicting results and this study sought to determine the relationship of organizational structure and internal