Networks and Grids in Solid Earth Science

(1)

Geoffrey Fox

Andrea Donnellan

May 3, 2004

Network and Grid Computing

(2)

Solid Earth Science Questions

From NASA’s Solid Earth Science Working Group Report, Living on a Restless Planet, Nov. 2002

• What is the nature of deformation at plate boundaries and what are the implications for earthquake hazards?

• How do tectonics and climate interact to shape the Earth’s surface and create natural hazards?

• What are the interactions among ice masses, oceans, and the solid Earth and their implications for sea level change?

• How do magmatic systems evolve and under what conditions do volcanoes erupt?

• What are the dynamics of the mantle and crust and how does the Earth’s surface respond?

(3)

The Solid Earth is Complex, Nonlinear, and Self-Organizing

Relevant questions that computational technologies can help answer:

• How can the study of strongly correlated solid earth systems be enabled by space-based data sets?

• What can numerical simulations reveal about the physical processes that characterize these systems?

• How do interactions in these systems lead to space-time correlations and patterns?

• What are the important feedback loops that mode-lock the system behavior?

• How do processes on a multiplicity of different scales interact to produce the emergent structures that are observed?

• Do the strong correlations allow the capability to forecast the system behavior in any sense?

(4)

Characteristics of Computing for Solid Earth Science

• Widely distributed heterogeneous datasets

• Multiplicity of time and spatial scales

• Decomposable problems requiring interoperability for full models

• Distributed models and expertise

(5)

Objectives

• IT approaches: Integrate multiple scales into computer simulations.

• Web services: Simplified access to data, simulation codes, and flow

(6)

What are Grids Good for?

• They are “Internet Scale Distributed Computing” and support the linking of globally distributed entities in the e-Science concept

– Computers

– Data from repositories and sensors

– People

• Early Grids focused on metacomputing (linking computers together) but recently e-Science has highlighted integration of data and building communities

• Grid technology naturally builds Problem Solving Environments

(7)

Some Relevant Grid/Framework Projects

• QuakeSim and Solid Earth Research Virtual Observatory SERVOGrid (JPL …)

• GEON: Cyberinfrastructure for the Geosciences (San Diego, Missouri, USGS ..)

• CME: Community Modeling Environment from SCEC

• CIG: Computational Infrastructure for Geodynamics

• Geoframework.org Caltech/VPAC

• ESMF: Earth System Modeling Framework (NASA)

• NERCGrid: Natural Environment Research Council UK e-Science

(8)

Earth Science Computing

(9)

(10)

(11)

Large Scale Parallel Computers

Metacomputing Grid

Analysis and Visualization

NO Capability: Spread a single large Problem over multiple supercomputers

YES Capacity: Seamless access to multiple computers

(12)

[Figure: SERVOGrid architecture. Labels: Repositories and Federated Databases; Sensors and Streaming Data; Field Trip Data; Data Filter Services; Research Simulations; Analysis and Visualization Portal; Discovery Services; Customization Services (From Research to Education); Research Grid; Education Grid; Computer Farm]

(13)

More General Material on Grids

• Grids today are built in terms of Web Services – a technology designed to support Enterprise Software and e-Business

– Provides wonderful support tools

– Provides a new software engineering model supporting interoperability

• Grids do not compete with parallel computing

– They let MPI run untouched so your parallel codes run as fast as they used to do

• Grids do “control/management/metadata management” where higher latency (around 10 milliseconds, a thousand times worse than MPI) is acceptable

(14)

(15)

Grids provide

• “Service Oriented Architecture” supporting distributed programs in scalable fashion with clean software engineering

• “Multi-tier” architecture supporting seamless access with brokers mediating access to diverse computers and data sources

• “Workflow” integrating different distributed services in a single application

• Event services to notify computers and people of issues (earthquake struck, job completed)

• Easy support of parameter searches and other pleasingly parallel applications with many related non-communicating jobs

• Security (Web Services), Database access (OGSA-DAI), Collaboration (Access Grid, GlobalMMCS)
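As a minimal sketch of the parameter-search bullet above, the following Python runs many independent, non-communicating jobs through a worker pool. The function `run_model` and its friction parameter are hypothetical stand-ins for a real simulation code, and a local thread pool stands in for Grid job submission.

```python
from concurrent.futures import ThreadPoolExecutor


def run_model(friction):
    """Hypothetical stand-in for one independent simulation run."""
    # Toy misfit with its minimum at friction = 0.6 (illustrative only).
    return (friction - 0.6) ** 2


def parameter_sweep(values, max_workers=4):
    """Run one job per parameter value; the jobs never talk to each other."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() preserves input order, so results line up with values.
        misfits = list(pool.map(run_model, values))
    return dict(zip(values, misfits))


values = [0.2, 0.4, 0.6, 0.8]
results = parameter_sweep(values)
best = min(results, key=results.get)
print("best parameter:", best)
```

Because no job depends on another, the same pattern scales from one machine to many simply by swapping the pool for a remote job submission service.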

(16)

Web Services

• Web services are the fundamental pieces of distributed Service Oriented Architectures.

• We should define lots of useful services that are remotely available

– Archival data access services supporting queries, real time sensor access, and mesh generation all seem to be popular choices.

• Web services have two important parts:

– Distributed services

– Client applications

• These two pieces are decoupled: one can build clients to remote services without caring about the programming language implementation of the remote service.
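The decoupling described above can be illustrated with a small, hedged sketch. It uses XML-RPC from the Python standard library purely as a stand-in for the SOAP/WSDL Web services the slides refer to; the `get_fault` operation and the fault catalog are hypothetical. The client knows only a URL and an operation name, not how the service is implemented.

```python
import threading
from xmlrpc.client import ServerProxy
from xmlrpc.server import SimpleXMLRPCServer

# Hypothetical catalog standing in for an archival fault database.
FAULTS = {"fault-a": {"strike_deg": 120.0, "dip_deg": 90.0}}


def get_fault(name):
    """Service side: return the stored parameters for a named fault."""
    return FAULTS[name]


def start_service():
    """Start the service on a background thread and return the server object."""
    # Port 0 asks the operating system for any free port.
    server = SimpleXMLRPCServer(("127.0.0.1", 0), logRequests=False)
    server.register_function(get_fault, "get_fault")
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def query_fault(url, name):
    """Client side: depends only on the URL and the operation name."""
    with ServerProxy(url) as proxy:
        return proxy.get_fault(name)


server = start_service()
host, port = server.server_address
result = query_fault(f"http://{host}:{port}/", "fault-a")
server.shutdown()
server.server_close()
print(result)
```

The service could be reimplemented in another language without touching `query_fault`, as long as the operation name and message format stay the same.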

(17)

Web Services, Continued

• Clients can be built in any number of styles

– We build portlet clients: ubiquitous, can combine

– One can build fancier GUI client applications.

– You can even embed Web service client stubs (library routines) in your application code, so that your code can make direct calls to remote data sources, etc.

• Regardless of the client one builds, the services are the same in all cases:

– my portal and your application code may each use the same service to talk to the same database.

• So we need to concentrate on services and let clients bloom as they may:

– Client applications (portals, GUIs, etc.) will have a much shorter lifecycle than service interface definitions, if we do our job correctly.

– Client applications that are locked into particular services, use proprietary data formats and wire protocols, etc.,

(18)

Data Deluged Science

• During the HPCC Initiative 1990-2000, we worried about data in the form of parallel I/O or MPI-IO, but we didn’t consider it as an enabler of new algorithms and new ways of computing

• Data assimilation was not central to HPCC

• DoE ASCI (Stockpile Stewardship) set up because didn’t want/have test data!

• Now particle physics will get 100 petabytes from CERN LHC

– Nuclear physics (Jefferson Lab) in same situation

– Use continuously ~30,000 CPU’s simultaneously 24X7

• Weather, climate, solid earth (EarthScope)

• Bioinformatics curated databases

(19)

[Figure: Data Deluged Science. Labels: Data; Information; Ideas; Simulation; Model; Assimilation; Reasoning; Datamining; Computational Science; Informatics]

(20)

[Figure: Distributed filters massage data for simulation. Labels: HPC Simulation; Data Filter (repeated); Other Gri…]

(21)

Some Questions for Data Deluged Science

• A new trade-off: How to split funds between sensors and simulation engines

• No systematic study of how best to represent data deluged sciences without known equations at resolution of interest

• Data assimilation very relevant

• Relationship to “just” interpolating data and then extrapolating a little

• Role of Uncertainty Analysis – everything (equations, model, data) is uncertain!

• Relationship of data mining and simulation

• Growing interest in Data curation and provenance

(22)

Recommendations of NASA’s Computational Technologies Workshop (May 2002)

• Create a Solid Earth Research Virtual Observatory (SERVO)

– Numerous distributed heterogeneous real-time datasets

– Seamless access to large distributed volumes of data

– Data handling and archiving part of framework

– Tools for visualization, datamining, pattern recognition, and data fusion

• Develop a Solid Earth Science Problem Solving Environment (PSE)

– Addresses the NASA specific challenges of multiscale modeling

– Model and algorithm development and testing, visualization, and data assimilation

– Scalable to workstations or supercomputers depending on size of problem

– Numerical libraries existing within a compatible framework

• Improve the Computational Environment

– PetaFLOP computers with Terabytes of RAM

– Distributed and cluster computers for decomposable problems

(23)

SERVOGrid Requirements

• Seamless Access to Data repositories and large scale computers

• Integration of multiple data sources including sensors, databases, file systems with analysis system

– Including filtered OGSA-DAI (Grid database access)

• Rich meta-data generation and access with SERVOGrid specific Schema extending openGIS (Geography as a Web service) standards and using Semantic Grid

• Portals with component model for user interfaces and web control of all capabilities

(24)

Solid Earth Research Virtual Observatory (SERVO)

[Figure: tiered SERVO architecture. Labels: Observations; Distributed Heterogeneous Real-Time Datasets; Downlink; Archive; Data cache (~TBytes/day); 1 PB per year data rate in 2010; 100 TeraFLOPs sustained; Goddard, Ames, JPL; Tier2 Centers; Institutes; Workstations, other portals; Tier 0 +1, Tier 1, Tier 2, Tier 3, Tier 4]

Fully functional problem solving environment

• Plug and play composing of parallel programs from algorithmic modules

• On-demand downloads of 100 GB in 5 minutes

• 10^6 volume elements rendering in real-time

• Program-to-program communication in milliseconds

• Approximately 100 model codes

(25)

Virtual Observatory Project

[Timeline figure, 2003 to 2010: capability versus time]

• Architecture & technology approach

• Decomposition into services with requirements

• Prototype cooperative federated database service integrating 5 datasets of 10 TB each

• Prototype data analysis service

• Prototype modeling service capable of integrating 5 modules

• Prototype 1920x1080 pixels at 120 frames per second visualization service

• Scaled to 100 sites

• Solid earth research virtual observatory (SERVO)

• On-demand downloads of 100 GB files from 40 TB datasets within 5 minutes.

• Uniform access to 1000 archive sites with volumes from 1 TB to 1 PB
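The on-demand download target above (100 GB within 5 minutes) implies a sustained network rate that is easy to work out. The short calculation below is a sketch that assumes decimal units (1 GB = 8 gigabits) and ignores protocol overhead; the function name is illustrative.

```python
def required_bandwidth_gbps(size_gb, seconds):
    """Sustained rate in gigabits per second, assuming 1 GB = 8 Gb."""
    return size_gb * 8 / seconds


# 100 GB delivered within 5 minutes (300 seconds).
rate = required_bandwidth_gbps(size_gb=100, seconds=5 * 60)
print(f"{rate:.2f} Gb/s")  # prints "2.67 Gb/s"
```

Under these assumptions the target needs roughly 2.7 Gb/s sustained end to end.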

(26)

Problem Solving Environment Project

[Timeline figure, 2003 to 2010: capability versus time]

• Isolated platform dependent code fragments

• Prototype PSE front end (portal) integrating 10 local and remote services

• Extend PSE to include

– 20 users collaboratory with shared windows

– Seamless access to high-performance computers linking remote processes over Gb data channels.

• Integrated visualization service with volumetric rendering

• Fully functional PSE used to develop models for building blocks for simulations.

• Program-to-program communication in milliseconds using staging, streaming, and advanced cache replication

• Integrated with SERVO

• Plug and play composing of parallel programs from algorithmic modules

• Plug and play composing of sequential programs from algorithmic modules

(27)

Computational Environment

[Timeline figure, 2003 to 2010: capability versus time]

• 100’s GigaFLOPs, 40 GB RAM

• 1 Gb/s network bandwidth

• ~100 model codes with parallel scaled efficiency of 50

• ~10^4 PetaFLOPs throughput per subfield per year

• ~100 TeraFLOPs sustained capability per model

• ~10^6 volume elements rendering in real time

• Access to mixture of platforms, from low cost clusters (20-100) to supercomputers with massive memory and thousands of processors

NASA CT Workshop, May 2002

(28)

Solid Earth Research Virtual Observatory (iSERVO)

Web-services (portal) based Problem Solving Environment (PSE)

Couples data with simulation, pattern recognition software, and visualization software

Enables investigators to seamlessly merge multiple data sets and models, and create new queries.

Data

• Space-based observational data

• Ground-based sensor data (GPS, seismicity)

• Simulation data

• Published/historical fault measurements

Analysis Software

• Earthquake fault

• Lithospheric modeling

(29)

Philosophy

• Store simulated and observed data

• Archive simulation data with original simulation code and analysis tools

• Access heterogeneous distributed data through cooperative federated databases

• Couple distributed data sources, applications, and hardware resources through an XML-based Web Services framework.

• Users access the services (and thus distributed resources) through Web browser-based Problem Solving Environment clients.

(30)

SERVOGrid Basics

• Under development in collaboration with researchers at JPL, UC-Davis, USC, and Brown University.

• Geoscientists develop simulation codes, analysis and visualization tools.

• We need a way to bind distributed codes, tools, and data sets.

• We need a way to deliver it to a larger audience

(31)

SERVOGrid Application Descriptions

• Codes range from simple “rough estimate” codes to parallel, high performance applications.

– Disloc: handles multiple arbitrarily dipping dislocations (faults) in an elastic half-space.

– Simplex: inverts surface geodetic displacements for fault parameters using simulated annealing downhill residual minimization.

– GeoFEST: Three-dimensional viscoelastic finite element model for calculating nodal displacements and tractions. Allows for realistic fault geometry and characteristics, material properties, and body forces.

– Virtual California: Program to simulate interactions between vertical strike-slip faults using an elastic layer over a viscoelastic half-space

– RDAHMM: Time series analysis program based on Hidden Markov Modeling. Produces feature vectors and probabilities for transitioning from one class to another.

– PARK: Boundary element program to calculate fault slip velocity history based on fault frictional properties; a model for unstable slip on a single earthquake fault.
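To make the idea of inverting surface displacements by simulated annealing concrete, here is a toy sketch. It is not the Simplex code or its forward model: it assumes a single unknown (fault slip), a fixed locking depth, noise-free synthetic data, and a simple arctangent displacement profile for a vertical strike-slip fault. All names and numbers are illustrative.

```python
import math
import random


def forward(slip, xs, depth):
    """Toy forward model: fault-parallel surface displacement at offsets xs
    for slip below a locking depth (arctangent profile)."""
    return [slip / math.pi * math.atan(x / depth) for x in xs]


def residual(slip, xs, depth, observed):
    """Sum of squared differences between modeled and observed displacements."""
    return sum((m - o) ** 2 for m, o in zip(forward(slip, xs, depth), observed))


def anneal_slip(xs, depth, observed, steps=4000, seed=0):
    """Simulated annealing over one parameter (slip); returns the best value seen."""
    rng = random.Random(seed)  # fixed seed keeps the sketch repeatable
    current = 0.0
    current_cost = residual(current, xs, depth, observed)
    best, best_cost = current, current_cost
    temperature = 1.0
    for _ in range(steps):
        candidate = current + rng.gauss(0.0, 0.1)
        cost = residual(candidate, xs, depth, observed)
        delta = cost - current_cost
        # Always accept improvements; accept worse moves with Boltzmann probability.
        if delta < 0 or rng.random() < math.exp(-delta / temperature):
            current, current_cost = candidate, cost
            if cost < best_cost:
                best, best_cost = candidate, cost
        temperature *= 0.997  # geometric cooling schedule
    return best


xs = [float(x) for x in range(-50, 51, 10)]  # station offsets from the fault, km
depth = 10.0                                  # assumed locking depth, km
true_slip = 2.0                               # synthetic "truth" used to make data
observed = forward(true_slip, xs, depth)
estimate = anneal_slip(xs, depth, observed)
print("estimated slip:", round(estimate, 2))
```

A real inversion would search over several fault parameters at once and contend with noisy data, which is where the ability of annealing to escape local minima matters.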

(32)

iSERVO Web Services

• Job Submission: supports remote batch and shell invocations

– Used to execute simulation codes (VC suite, GeoFEST, etc.), mesh generation (Akira/Apollo) and visualization packages (RIVA, GMT).

• File management:

– Uploading, downloading, backend crossloading (i.e. move files between remote servers)

– Remote copies, renames, etc.

• Job monitoring

• Apache Ant-based remote service orchestration

– For coupling related sequences of remote actions, such as RIVA movie generation.

• Database services: support SQL queries

• Data services: support interactions with XML-based fault and surface observation data.

– For simulation generated faults (i.e. from Simplex)

– XML data model being adopted for common formats with translation services to “legacy” formats.
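The orchestration bullet above couples related sequences of remote actions. The sketch below shows the underlying idea in Python rather than in an Ant build file: each step runs only after the steps it depends on. The step names and the local lambda functions are hypothetical stand-ins for calls to the job submission, file management, and visualization services.

```python
from graphlib import TopologicalSorter


def run_workflow(steps, dependencies):
    """Run each step after the steps it depends on; return order and outputs."""
    order = []
    context = {}
    for name in TopologicalSorter(dependencies).static_order():
        # Each step reads earlier outputs from context and stores its own.
        context[name] = steps[name](context)
        order.append(name)
    return order, context


# Hypothetical local stand-ins for remote service calls.
steps = {
    "make_mesh": lambda ctx: "mesh-file",
    "run_simulation": lambda ctx: f"output-from-{ctx['make_mesh']}",
    "render_movie": lambda ctx: f"movie-of-{ctx['run_simulation']}",
}

# Maps each step to the set of steps that must finish first.
dependencies = {
    "make_mesh": set(),
    "run_simulation": {"make_mesh"},
    "render_movie": {"run_simulation"},
}

order, context = run_workflow(steps, dependencies)
print(order)
```

Declaring dependencies separately from the steps themselves mirrors how a build-style tool sequences targets, and lets the same step definitions be reused in different workflows.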