Application Requirements for Petaflop Systems


(1)

Application Requirements

Petaflop Computing

Geoffrey Fox

Computer Science, Informatics, Physics

Indiana University, Bloomington IN 47404

(2)

Petaflop Studies

• Recent Livermore Meeting on Processor in Memory Systems

• http://www.epcc.ed.ac.uk/direct/newsletter4/petaflops.html 1999

• http://www.cacr.caltech.edu/pflops2/ 1999

• Several earlier special sessions and workshops

• Feb. '94: Pasadena Workshop on Enabling Technologies for Petaflops Computing Systems

• March '95: Petaflops Workshop at Frontiers'95

• Aug. '95: Bodega Bay Workshop on Applications

• PETA online: http://cesdis.gsfc.nasa.gov/petaflops/peta.html

• Jan. '96: NSF Call for 100 TF "Point Designs"

• April '96: Oxnard Petaflops Architecture Workshop (PAWS) on Architectures

(3)

Crude Classification

Classic Petaflop MPP

– Latency 1 to 10 microseconds

– Single (petaflop) machine

– Tightly coupled problems

Classic Petaflop Grid

– Network Latency 10-100 or greater milliseconds

– Computer Latency <1 millisecond (routing time)

– e.g. 100 networked 10 teraflop machines

– Only works for loosely coupled modules

(4)

Styles in “Problem Architectures” I

Classic Engineering and Scientific Simulation: FEM, Particle Dynamics, Moments, Monte Carlo

– CFD, Cosmology, QCD, Chemistry, …

– $\mathrm{Work} \propto \mathrm{Memory}^{4/3}$

– Needs classic low latency MPP

Classic Loosely Coupled Grid: Ocean-Atmosphere, Wing-Engine-Fuselage-Electromagnetics-Acoustics

– Few-way functional parallelism

– ASCI

– “Generate Data, Analyse Data, Visualize” is a “3-way” Grid
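
The scaling of work as the 4/3 power of memory, noted above for classic simulations, can be motivated with a short sketch. It assumes (this is not stated on the slide) a time-dependent simulation on a 3D mesh with n points per side, so memory grows as n^3, and a stability condition that ties the number of time steps to n, so work grows as n^4. The function names are illustrative only.

```python
import math


def memory_points(n: int) -> int:
    """Grid points (a proxy for memory) in an n x n x n mesh."""
    return n ** 3


def work_units(n: int) -> int:
    """Point updates: n**3 points per step, assuming about n time steps."""
    return n ** 3 * n


def fitted_exponent(n_small: int, n_large: int) -> float:
    """Estimate p in work ~ memory**p from two mesh sizes."""
    work_growth = math.log(work_units(n_large) / work_units(n_small))
    memory_growth = math.log(memory_points(n_large) / memory_points(n_small))
    return work_growth / memory_growth


if __name__ == "__main__":
    # Under the assumptions above the exponent comes out close to 4/3.
    print(fitted_exponent(100, 1000))
```

Under a different assumption about how the step count depends on resolution, the exponent would change, so the 4/3 figure is best read as a rule of thumb for this class of problem.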

(5)

Classic MPP Software Issues

Large scale parallel successes mainly using MPI

MPI is low level and the initial effort is “hard”, but it is

– Portable

– Package as libraries like PETSc

– Scalable to very large machines

Good to have higher level interfaces

– DoE Common Component Architecture (CCA) “packaging modules” will work at coarse grain size

– Can build HPF/Fortran90 parallel arrays (I extend this with HPJava) but hard to support general complex data structures

– We should restart parallel computing research

Note: the Grid is set up (tomorrow) as a set of Web services; this is totally message based (as is CCA)

– Run time compilation to inline a SOAP message to an MPI message to a Java method call

(6)

Styles in “Problem Architectures” II

• Data Assimilation: Combination of sophisticated (parallel) algorithm and real-time fit to data

– Environment: Climate, Weather, Ocean

– Target-tracking

– Growing number of applications (in earth science)

– Classic low latency MPP with good I/O

• Of growing importance due to “Moore’s law applied to sensors” and large investment in new instruments by NASA, NSF, …

(7)

Styles in “Problem Architectures” III

• Data Deluge Grid: Massive distributed data analyzed in “embarrassingly parallel” fashion

– Virtual Observatory

– Medical Image Databases (e.g. Mammography)

– Genomics (distributed gene analysis)

– Particle Physics Accelerator (100 PB 2010)

• Classic Distributed Data Grid

• Corresponds to fields X-Informatics (X=Bio, Laboratory, Chemistry …)

• See http://www.grid2002.org

• Underlies e-Science initiative in UK

• Industrial applications include health, equipment monitoring (Rolls Royce generates gigabytes of data per engine flight), transactional databases

(8)

Styles in “Problem Architectures” IV

• Complex Systems: Simulations of sets of often “non-fundamental” entities with phenomenological or idealized interactions. Often multi-scale and “systems of systems” and can be “real Grids”; data-intensive simulation

– Critical or National Infrastructure Simulations (power grid)

– Biocomplexity (molecules, proteins, cells, organisms)

– Geocomplexity (grains, faults, fault systems, plates)

– Semantic Web (simulated) and Neural Networks

• Exhibit phase transitions, emergent network structure (small worlds)

• Data used in equations of motion as well as “initial conditions” (data assimilation)

• Several fields (e.g. biocomplexity) are immature and not currently using major MPP time

(9)

Styles in “Problem Architectures” V

Although problems are hierarchical and multi-scale, it is not obvious that they can use a Grid (putting each subsystem on a different Grid node), as the ratio of Grid latency to MPP latency is typically $10^{4}$ or more and most algorithms can’t accommodate this

– X-Informatics is the data (information) aspect of field X; X-complexity integrates mathematics, simulation and data

Military simulations (using HLA/RTI from DMSO) are of this style

– Entities in complex system could be vehicles, forces

– Or packets in a network simulation
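
The ratio of Grid latency to MPP latency quoted above can be checked against the figures from the Crude Classification slide (1 to 10 microseconds for an MPP, 10 to 100 milliseconds for a Grid network). The sketch below does that arithmetic and adds a deliberately simple efficiency model. The model is an assumption for illustration, not something from the slides: one non-overlapped message exchange per compute step.

```python
# Latency ranges taken from the Crude Classification slide, in seconds.
MPP_LATENCY_S = (1e-6, 10e-6)     # 1 to 10 microseconds
GRID_LATENCY_S = (10e-3, 100e-3)  # 10 to 100 milliseconds


def latency_ratio_range():
    """Smallest and largest Grid/MPP latency ratios implied by the ranges."""
    low = GRID_LATENCY_S[0] / MPP_LATENCY_S[1]
    high = GRID_LATENCY_S[1] / MPP_LATENCY_S[0]
    return low, high


def efficiency(compute_s: float, latency_s: float) -> float:
    """Fraction of each step spent computing (assumed model: one
    message exchange per step, with no overlap of compute and wait)."""
    return compute_s / (compute_s + latency_s)


if __name__ == "__main__":
    print(latency_ratio_range())             # roughly 1e3 to 1e5
    print(efficiency(1e-3, MPP_LATENCY_S[1]))   # 1 ms of work per step on an MPP
    print(efficiency(1e-3, GRID_LATENCY_S[1]))  # the same step across a Grid
```

The quoted value of $10^{4}$ sits in the middle of the computed range. With a millisecond of work per step, the assumed model keeps an MPP almost fully busy while a Grid spends nearly all of its time waiting, which is the sense in which most tightly coupled algorithms cannot accommodate the ratio.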

(10)

Societal Scale Applications

Environment: Climate, Weather, Earthquakes

Health: Epidemics

Critical Infrastructure:

– Electrical Power

– Water, Gas, Internet (all the real Grids)

– Wild Fire (weather + fire)

– Transportation: TRANSIMS from Los Alamos

All parallelize well due to geometric structure

Military: HLA/RTI (DMSO)

HLA/RTI usually uses event driven simulations, but the future could be “classic time-stepped simulations”, as these appear to work in many cases IF you define them at fine enough grain size

(11)
(12)

Data Intensive Requirements

Grid like: accelerator, satellite, sensor from distributed resources

Particle Physics

– all parts of process essentially independent

– $10^{12}$ events giving $10^{16}$ bytes of data per year

– Happy with tens of thousands of PCs at ALL stages of analysis

– Size reduction as one proceeds through different stages

– Need to select “interesting data” at each stage

Data Assimilation: start with Grid like gathering of data (similar in size to particle physics) and reduce size by a factor of 1000

– Note particle physics doesn’t reduce data size but maintains embarrassingly parallel structure

– Size reduction probably determined by computer realism as much as by algorithms
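
A few lines of arithmetic make the data volumes above concrete. The event count, yearly byte count and reduction factor are taken from the slide. Spreading the volume evenly over a 365-day year, and using the particle physics volume as the starting size for data assimilation ("similar in size"), are simplifying assumptions.

```python
EVENTS_PER_YEAR = 10**12
BYTES_PER_YEAR = 10**16
REDUCTION_FACTOR = 1000              # data assimilation size reduction
SECONDS_PER_YEAR = 365 * 24 * 3600   # assumes steady, year-round data taking


def bytes_per_event() -> int:
    """Average event size implied by the yearly totals."""
    return BYTES_PER_YEAR // EVENTS_PER_YEAR


def average_rate_bytes_per_s() -> float:
    """Sustained data rate if the yearly volume were spread evenly."""
    return BYTES_PER_YEAR / SECONDS_PER_YEAR


def reduced_bytes_per_year() -> int:
    """Volume left after the factor-of-1000 reduction."""
    return BYTES_PER_YEAR // REDUCTION_FACTOR


if __name__ == "__main__":
    print(bytes_per_event())            # 10 kB per event
    print(average_rate_bytes_per_s())   # a few hundred MB/s
    print(reduced_bytes_per_year())     # 10 TB per year
```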

(13)
(14)
(15)

Particle Physics Web Services

• Accelerator Data as a Web service (WS)

• Data Analysis WS

• Experiment Management WS

• Visualization WS

• Calibration WS

• PWA WS

• Detector Model WS

• Monte Carlo WS

• Physics Model WS

• ML Fit WS

A Service is just a “computer process” running on a (geographically distributed) machine with a “message-based” I/O model

It has input and output ports – data is from users, raw data sources or other services
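
The idea of a service as a process with message-based input and output ports can be sketched in a few lines. This is a minimal in-process illustration, not a Web service implementation: real services would exchange messages over a network (for example as SOAP), and the class names, port names and handlers here are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Message:
    port: str       # name of the port the message arrives on or leaves from
    payload: dict   # message body


@dataclass
class Service:
    name: str
    handler: Callable[[dict], dict]   # turns an input payload into an output payload
    output_port: str = "out"
    subscribers: List["Service"] = field(default_factory=list)

    def connect(self, downstream: "Service") -> None:
        """Wire this service's output port to another service's input."""
        self.subscribers.append(downstream)

    def receive(self, message: Message) -> List[Message]:
        """Handle one input message and forward the result downstream.
        Returns the messages that leave the end of the chain."""
        result = Message(self.output_port, self.handler(message.payload))
        if not self.subscribers:
            return [result]
        delivered: List[Message] = []
        for service in self.subscribers:
            delivered.extend(service.receive(result))
        return delivered


if __name__ == "__main__":
    # Hypothetical handlers standing in for two of the services listed above.
    calibration = Service("Calibration WS", lambda p: {**p, "calibrated": True})
    analysis = Service("Data Analysis WS", lambda p: {**p, "n_hits": len(p["hits"])})
    calibration.connect(analysis)
    print(calibration.receive(Message("in", {"hits": [1, 2, 3]})))
```

Because each service only sees messages, the input could equally come from a user, a raw data source or another service, which is the point made on the slide.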

(16)

Particle Physics

[Figure labels: “$10^{4}$ wide”, “Petaflop MPP for Data”, “$10^{4}$ wide”, “Teraflop Analysis Portal”]

(17)
(18)
(19)
(20)
(21)
(22)

USArray (US Seismic Array): a continental scale seismic array to provide a coherent 3-D image of the lithosphere and deeper Earth

SAFOD (San Andreas Fault Observatory at Depth): a borehole observatory across the San Andreas Fault to directly measure the physical conditions under which earthquakes occur

PBO (Plate Boundary Observatory): a fixed array of strainmeters and GPS receivers to measure real-time deformation on a plate boundary scale

InSAR (Interferometric Synthetic Aperture Radar): images of tectonically active regions providing spatially continuous strain measurements over wide geographic areas.

(23)

Structural Representation

Structural Geology Field Investigations

• Seismic Imaging (USArray)

• Gravity and Electromagnetic Surveying

Kinematic (Deformational) Representation

Geologic Structures

• Geochronology

• Geodesy (PBO and InSAR)

• Earthquake Seismology (ANSS)

Behavioral (Material Properties) Representation

Subsurface Sampling (SAFOD)

Seismic Wave Propagation

Structures + Deformation + Material properties

(24)

a Facility

Data for Science and Education

Funding and Management

– NSF Major Research Equipment Account

– Internal NSF process

– Interagency collaboration

– Cooperative Agreement funding

– Community-based management

– MRE - $172 M / 5 years

Product - Data

– Science-appropriate

– Community-driven

– Hazards and resources emphasis

an NSF Science Program

Fundamental Advances in Geoscience

Funding and Management

– Science driven & research based

– Peer reviewed

– Individual investigator

– Collaborative / Multi-institutional

– Operations - $71 M / 10 years

– Science - $13 M / year

Product - Scientific Results

– Multi-disciplinary trend

– Cross-directorate encouragement

– Fundamental research and applications

(25)
(26)

San Andreas Fault Observatory at Depth (SAFOD)

(27)

PBO – A Two-Tiered Deployment of Geodetic Instrumentation

A backbone of ~100 sparsely distributed continuous GPS receivers to provide a synoptic view of the entire North American plate boundary deformation zone.

Clusters of GPS receivers and

(28)

[Figure labels: Topography (1 km); Stress Change; PBO; Site-specific Irregular Scalar Measurements; Constellations for Plate Boundary-Scale Vector Measurements; Ice Sheets; Volcanoes; Long Valley, CA; Northridge, CA]

(29)

Computational Pathway for Seismic Hazard Analysis

Full fault system dynamics simulation

• FSM = Fault System Model

• RDM = Rupture Dynamics Model

• AWM = Anelastic Wave Model

• SRM = Site Response Model

[Diagram labels: FSM; RDM; AWM; SRM; Ground Motions; Intensity Measures; Earthquake Forecast Model; Unified Structural Representation (Faults, Fault zone structure, Velocity structure); Earthquake Forecast; Paleoseismicity]

(30)

Seismic Hazard

(31)
(32)
(33)
(34)

Societal Scale Applications Issues

• Need to overlay with Decision Support, as problems are often optimization problems supporting tactical or strategic decisions

• Verification and Validation, as dynamics are often not fundamental

• Related to ASCI Dream: physics based stewardship

• Some of the new areas like Biocomplexity and Geocomplexity are quite primitive and have not even moved to today’s parallel machines

(35)

Interesting Optimization Applications

• Military Logistics Problems such as Manpower Planning for Distributed Repair/Maintenance Systems

• Multi-Tiered, Multi-Modal Transportation Systems

• Gasoline Supply Chain Model

• Multi-level Distribution Systems

• Supply Chain Manufacturing Coordination Problems

• Retail Assortment Planning Problems

• Integrated Supply Chain and Retail Promotion Planning

• Large-scale Production Scheduling Problems

• Airline Planning Problems

(36)

[Diagram labels: Generic Routines; Simulated Annealing; Genetic Algorithms; Other Algorithms; Mathematical Programming Models; LP; IP; NLP; Parameter Estimation; Output Analysis; Grid Infrastructure; HPC Resources; Decision Analysis Object Space; Multi-Purpose Tools; data structures; distributed application scripting; Process Model]

Decision Application Object Framework

• Support Policy Optimization and Simulation of Complex Systems

– Whose Time Evolution Can Be Modeled Through a Set of Agents Independently Engaging in Evolution and Planning Phases, Each of Which Is Efficiently Parallelizable,

• In Mathematically Sound Ways

• That Also Support Computational Scaling

(37)

Intrinsic Computational Difficulties

• Large-scale Simulations of Complex Systems are Typically Modeled in Terms of Networks of Interacting Agents With Incoherent, Asynchronous Interactions

• They Lack the Global Time Synchronization That Provides the Natural Parallelism Exploited by Data Parallel Applications Such As Fluid Dynamics or Structural Mechanics

• Currently, the Interactions Between Agents are Modeled by Event-driven Methods that cannot be Parallelized Effectively

• But increased performance (using machines like the Teragrid) needs massive parallelism

(38)

Los Alamos SDS Approach

• Networks of particles and (partial differential equation) grid points interact “instantaneously” and simulations reduce to iterating calculate/communicate phases: “calculate at given time or iteration number next positions/values” (massively parallel) and then update

• Complex systems are made of agents evolving with irregular time steps (cars stopping at traffic lights; crashing; sitting in garage while driver sleeps ..)

• This lack of global time synchronization stops natural parallelism in old approaches

• SDS combines iterative local planning with massively parallel update
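
The combination of local planning with a synchronized update can be illustrated with a toy single-lane traffic model. This is a sketch of the general plan-then-update pattern only and should not be read as the Los Alamos SDS algorithm, whose details are not given in the slides. The `Car` class, the waiting counter and the movement rule are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class Car:
    position: int
    waiting: int  # steps left before the car may move (e.g. at a red light)


def plan(car: Car, occupied: FrozenSet[int]) -> Car:
    """Local planning: uses only this car and a read-only snapshot."""
    if car.waiting > 0:
        return Car(car.position, car.waiting - 1)
    target = car.position + 1
    if target in occupied:
        return car  # blocked by the car ahead, stay put this step
    return Car(target, 0)


def step(cars: List[Car]) -> List[Car]:
    """One global step: snapshot the shared state, then plan every car."""
    occupied = frozenset(c.position for c in cars)  # communicate phase
    # Each plan() call is independent of the others, so this loop could be
    # replaced by a parallel map without changing the result.
    return [plan(c, occupied) for c in cars]        # calculate phase


def simulate(cars: List[Car], steps: int) -> List[Car]:
    for _ in range(steps):
        cars = step(cars)
    return cars


if __name__ == "__main__":
    print(simulate([Car(0, 0), Car(1, 2), Car(5, 0)], steps=3))
```

Agents that are stopped or idle still take part in every step but simply plan "no change", which is how irregular behaviour is folded into a time-stepped loop in this toy version.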
