Grid-based Information Architecture for
iSERVO International Solid Earth Research
Virtual Organization
Western Pacific Geophysics Meeting (WPGM) Beijing Convention Center
July 26 2006 Geoffrey Fox
Computer Science, Informatics, Physics Pervasive Technology Laboratories Indiana University Bloomington IN 47401
http://grids.ucs.indiana.edu/ptliupages/presentations/
APEC Cooperation for Earthquake Simulation
n ACES is a seven-year-long collaboration among scientists interested in earthquake and tsunami prediction
• iSERVO is the infrastructure to support the work of ACES
• SERVOGrid is a (completed) US Grid that is a prototype of iSERVO
• http://www.quakes.uq.edu.au/ACES/
Participating Institutions
n CSIRO Australia
n Monash University Australia
n University of Western Australia, Perth, Australia
n University of Queensland Australia
n University of Western Ontario Canada
n University of British Columbia Canada
n China National Grid
n Chinese Academy of Sciences
n China Earthquake Administration
n China Earthquake Network Center
n Brown University
n Boston University
n Jet Propulsion Laboratory
n Cal State Fullerton
n San Diego State University
n UC Davis
n UC Irvine
n UC San Diego
n University of Southern California
n University of Minnesota
n Florida State University
n US Geological Survey
n Pacific Tsunami Warning Center (PTWC) Hawaii
n National Central University, Taiwan (Taiwan Chelungpu-fault Drilling Project)
n University of Tokyo
n Tokyo Institute of Technology (Titech)
n Sophia University
n National Research Institute for Earth Science and Disaster Prevention (NIED) Japan
Role of Information Technology
and Grids in ACES
Numerical simulations of physical, biological and social systems
Engineering design
Economic analysis and planning
Sensor networks and sensor webs
High performance computing
Data mining and pattern analysis
Distance collaboration
Distance learning
Public outreach and education
Emergency response communication and planning
Geographic Information Systems
Grids and Cyberinfrastructure
n Grids are the technology, based on Web services, that implements Cyberinfrastructure, i.e. supports eScience or science as a team sport
• Internet-scale managed services that link computers, data repositories, sensors, instruments and people
n There is a portal and services in SERVOGrid for
• Applications such as GeoFEST, RDAHMM, Pattern Informatics, Virtual California (VC), Simplex, mesh generating programs …
• Job management and monitoring web services for running the above codes
• File management web services for moving files between various machines
• Geographical Information System services
• Quaketables earthquake-specific database
• Sensors as well as databases
• Context (dynamic metadata) and UDDI system long-term metadata services
Grid of Grids: Research Grid and Education Grid
[Figure: SERVOGrid (research) linked to an Education Grid through Customization Services ("From Research to Education"). Labeled components: Database; Repositories and Federated Databases; Field Trip Data; Streaming Data; Sensors; Data Filter Services; Discovery Services; Research Simulations; Analysis and Visualization Portal; Computer Farm; Research; Education. Component Grids: GIS Grid, Sensor Grid, Database Grid.]
SERVOGrid has a portal
Semantically Rich Services with a Semantically
Rich Distributed Operating Environment
[Figure: Grids of Grids Architecture. Databases, sensor services (SS), filter services (FS), other services (OS), metadata (MD) and a portal are linked by SOAP message streams. Successive filters refine Raw Data into Data, Information, Knowledge, Wisdom and Decisions. Another Grid or another service is the same as an outward-facing application.]
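The filter idea in the architecture above, where each service consumes one message stream and emits a more refined one, can be sketched in a few lines. This is only an illustration: the message fields (`station`, `magnitude`), the filter names and the threshold are assumptions made up for the example, and real SERVOGrid filters exchange SOAP messages rather than Python dictionaries.

```python
from typing import Callable, Iterable, Iterator

# A "filter" is any function that turns one message stream into another.
Message = dict
Filter = Callable[[Iterable[Message]], Iterable[Message]]


def drop_incomplete(stream: Iterable[Message]) -> Iterator[Message]:
    """Raw data -> data: discard readings that carry no magnitude."""
    return (m for m in stream if m.get("magnitude") is not None)


def flag_large_events(stream: Iterable[Message], threshold: float = 5.0) -> Iterator[Message]:
    """Data -> information: annotate each event with a 'large' flag."""
    for m in stream:
        yield {**m, "large": m["magnitude"] >= threshold}


def compose(*filters: Filter) -> Filter:
    """Chain filters so the output stream of one feeds the next."""
    def pipeline(stream: Iterable[Message]) -> Iterable[Message]:
        for f in filters:
            stream = f(stream)
        return stream
    return pipeline


# Hypothetical raw sensor messages.
raw = [
    {"station": "A", "magnitude": 4.2},
    {"station": "B", "magnitude": None},
    {"station": "C", "magnitude": 6.1},
]

pipeline = compose(drop_incomplete, flag_large_events)
information = list(pipeline(raw))
```

Because every stage has the same stream-in, stream-out shape, a stage can be replaced or inserted without touching its neighbours, which is the property the Grid of Grids picture relies on.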
Linking Grids and Services
n Linkage of Services and Grids requires that messages sent by one
Grid/Service can be understood by another
n Inside SERVOGrid all messages use
• Web service system standards we like (UDDI, WS-Context, WSDL,
SOAP) and
• GML as extended by WFS so that data sources and simulations all
use same syntax
n All other Web service based Grids use their favorite Web service
system standards but these differ from Grid to Grid
• Further there is no agreement on application specific standards –
not all Earth Science Grids use OGC standards
• OGC standards include some capabilities overlapping general Web
Services
• Use of WSDL and SOAP is agreed although there are versioning
issues
n So there is essentially no service-level interoperability between
Grids; rather, interoperation is at diverse levels with shared technology
• SQL for databases, PBS for Job scheduling, Condor for job
Grids in Babylon
n Presumptuous Tower of Babel (from the web)
• In the Bible, a city (now thought to be Babylon) in Shinar where God
confounded a presumptuous attempt to build a tower into heaven by confusing the language of its builders into many mutually
incomprehensible languages.
n For Grids, everybody likes to do their own thing, and Grids are complex multi-level entities with no obvious points of interoperation
• So one does not need divine intervention to create multiple Grid specifications
• But data in China, tsunami sensors in the Indian Ocean and simulations in the USA etc. will not be linked for better warning and forecasting unless the national efforts can interoperate
n Two interoperation strategies:
• Make all Grids use the same specifications (divine harmony)
• Build translation services (filters!) using say OGF standards as a common
target language (more practical)
n Don’t need computers (jobs) to be interoperable (although this would be good) as each country does its own computing
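The second strategy, translation services that target a common language, can be sketched as follows. Everything here is hypothetical: the two Grid message shapes, the field names and the "common" form are invented for illustration and do not represent OGF, OGC or any real Grid schema.

```python
from typing import Callable, Dict

Message = dict

# Hypothetical translators INTO the common form, one per source Grid.
def grid_a_to_common(msg: Message) -> Message:
    return {"lat": msg["latitude"], "lon": msg["longitude"], "mag": msg["mag"]}

# Hypothetical translators OUT OF the common form, one per target Grid.
def common_to_grid_b(msg: Message) -> Message:
    return {"position": [msg["lon"], msg["lat"]], "magnitude": msg["mag"]}

TO_COMMON: Dict[str, Callable[[Message], Message]] = {"grid_a": grid_a_to_common}
FROM_COMMON: Dict[str, Callable[[Message], Message]] = {"grid_b": common_to_grid_b}


def translate(msg: Message, source: str, target: str) -> Message:
    """Translate via the common target language instead of pairwise."""
    common = TO_COMMON[source](msg)
    return FROM_COMMON[target](common)


event = {"latitude": 35.0, "longitude": 139.0, "mag": 6.1}
translated = translate(event, "grid_a", "grid_b")
```

With a common target language, N Grids need at most 2N translators (one in, one out per Grid) rather than N(N-1) pairwise ones, which is why this route is the more practical of the two.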
Interoperability Summary
n Need to define common infrastructure and domain specific
standards
• Build Interoperable Infrastructure gatewayed to existing legacy applications and Grids
n Generic Middleware
• Grid software including workflow
• Portals/Problem Solving environments incl. visualization
• We need to ensure that we can make security, job submission, portal, data access (sharing) mechanisms in different economies interoperate
n Geographic Information Systems GIS
• Use services as defined by Open Geospatial Consortium (Web Map and Feature Services) http://www.crisisgrid.net/
n Earthquake/Tsunami Science Specific
• Satellites, sensors (GPS, Seismic)
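As an illustration of the OGC services mentioned above, the sketch below builds a Web Feature Service `GetFeature` request using the key-value-pair encoding of WFS 1.1.0. The endpoint `https://example.org/wfs` and the feature type `quake:faults` are placeholders, not real SERVOGrid or crisisgrid.net resources, and the bounding box values are passed through as given; the axis order a particular server expects is not addressed here.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def wfs_getfeature_url(endpoint, type_name, bbox, max_features=100):
    """Build a WFS 1.1.0 GetFeature URL (key-value-pair encoding)."""
    params = {
        "service": "WFS",
        "version": "1.1.0",
        "request": "GetFeature",
        "typeName": type_name,
        "bbox": ",".join(str(v) for v in bbox),
        "maxFeatures": str(max_features),
    }
    return endpoint + "?" + urlencode(params)


# Placeholder endpoint and feature type, chosen only for the example.
url = wfs_getfeature_url(
    "https://example.org/wfs",
    "quake:faults",
    (-125.0, 32.0, -114.0, 42.0),
)

# Parse the query string back to confirm the parameters round-trip.
query = parse_qs(urlsplit(url).query)
```

A server that implements the standard would answer such a request with GML features, which is what lets data sources and simulations share one syntax.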
Country and/or Economies | Data (shared as part of collaboration) | Earthquake Forecast/Model | Wave Motion
Australia | Seismic data, fault database, GPS | Finley, LSM, PANDAS | prototype
Canada | Polaris, Radarsat | Pattern Informatics |
P.R. China | Seismic, GPS | CAS LURR |
Japan | GPS, Seismic, Daichi (InSAR) | JST-CREST, GeoFEM |
Chinese Taipei | FORMOSAT-3/COSMIC (F/C) | |
U.S.A. | QuakeTables, Seismic, InSAR, PBO (GPS) | Pattern Informatics, ALLCAL, GeoFEST, PARK, VirtualCalifornia | TeraShake
International | IMS | |
Infrastructure: China National Grid, Naregi, Earth Simulator, Vlab, SCECGrid, GEON, SERVOGrid, PRAGMA
Access Institutions: Pacific Rim Universities (APRU)
National Earthquake Grids of Relevance
n APAC – GT2, GT4, gLite
n ACcESS – Some link to SERVOGrid
n China National Grid – GOS, GT3, GT4
n ChinaGrid – CGSP built on GT4
n CNGI – China’s Next Generation Internet has a significant earthquake data component
n Naregi – Uses GT4 and Unicore with many enhancements
n Japanese Earthquake Simulation Grid – unclear
n K*Grid Korea – Enhanced SRB, GT2 to GT4
n TIGER Taiwan Integrated Grid for Education and Research – unclear technology and unclear earthquake relevance
n SERVOGrid – Uses WS-I+ simple Web Services
TeraGrid: Integrating NSF Cyberinfrastructure
TeraGrid is a facility that integrates computational, information, and analysis resources at the San Diego Supercomputer Center, the Texas Advanced Computing Center, the University of Chicago / Argonne National Laboratory, the National Center for Supercomputing Applications, Purdue University, Indiana University, Oak Ridge National Laboratory, the Pittsburgh Supercomputing Center, and the National Center for Atmospheric Research.
Today 100 teraflops; tomorrow a petaflop; Indiana has 20 teraflops today.
[Map: TeraGrid sites SDSC, TACC, UC/ANL, NCSA, ORNL, PU, IU, PSC and NCAR; also labeled: Caltech, USC-ISI, Utah, Iowa, Cornell, Buffalo.]
APAC National Grid
[Figure: map of APAC National Grid partners: QPSF (JCU), ANU, VPAC, ac3, TPAC, CSIRO, IVEC, SAPAC, APAC National Facility.]
Core Grid Services
Network: GrangeNet / AARNet, APAC Private Network (AARNet)
Security: APAC CA, MyProxy, VOMRS
Portal Tools: GridSphere
Info Services: APAC Registry, INCA2?
Systems: Gateways, Partners’ systems
National “Grid Projects” in China
[Figure: national Grid projects in China and their sponsors. Sponsors: State Council, National Planning Commission, NSF, CAS, MoE, MoST. Projects: China e-Nation Strategy (2006-2020), Next-Generation Network Initiative, Science and Technology R&D Assets Foundation Platform, China National Grid, Edu. & Res. Grid, CAS e-Science, Semantic Grid, Net-based Res. Env., Virtual Comp. Env. Stages: Plan, Research, Develop, Production (Procure, Deploy, Operate, Manage).]
CNGrid (2006-2010)
• HPC Systems
– 100 Tflop/s by 2008, Pflop/s by 2010?
• Grid Software Suite: CNGrid GOS
– Merge with international efforts
– Emphasize production
• CNGrid Environment
– Nodes, Centers, Policies
• Applications
– Science
– Resource & Environment
– Manufacturing
– Services
Cyber Science Infrastructure toward Petascale Computing (planned 2006-2011)
Cyber-Science Infrastructure (CSI)
(IT Infra. for Academic Research and Education)
[Figure: the NII Collaborative Operation Center handles Operation/Maintenance (Middleware), Networking, Contents and (UPKI, CA) over the Networking Infrastructure (Super-SINET). The NAREGI Site does Research and Dev. of Middleware and CA (β ver., V1.0, V2.0) and delivers it to the Univ./National Supercomputing VO, Domain Specific VOs (e.g. ITBL), Project-Oriented VOs and the Peta-scale System VO Core Site, with feedback, customization, R&D collaboration and operational collaboration flowing back. Related efforts: Nano Proof of Concept and Eval. (IMS), Joint Project (Bio) with Osaka-U, Joint Project with AIST, Industrial Projects. International Collaboration: EGEE, UNIGRIDS, TeraGrid, GGF etc. Note: names of VOs are tentative.]
Japanese Earthquake Simulation Grid
[Figure: Integrated Observation-Simulation Data Grid. Nodes connected by Super SINET (10Gbps): PC Cluster at ERI (64xOpteron, paraAVS); PC Cluster at EPS (64xOpteron, paraAVS); Data-Server at GSI (8xOpteron, 20TB); Data-Server at NIED (48xG5, 15TB); Earth Simulator. Observation networks: GONET, Hi-net, K-NET.]
JST-CREST Integrated Predictive Simulation System
[Figure: a Database for Model Construction (Plate Motion, GIS Urban Information) and Data Analysis (Crustal Movement, Seismic Activity) feed a Platform for Integrated Simulation (Data Processing, Visualization, Linear Solvers) that produces Simulation Output. Earthquake Generation: Tectonic Loading, Earthquake Rupture. Strong Motion and Tsunami Generation: Wave Propagation, Tsunami Generation, Artificial Structure Oscillation. PC clusters are used for small-intermediate problems and the Earth Simulator for large-scale problems.]
Current PTWC Network of Seismic Stations
The NCES/WS-*/GS-* Features/Service Areas I
Feature/Service Area | WS-* | GS-* | NCES (DoD) | Comments
A: Broad Principles
FS1: Use SOA: Service Oriented Arch. | WS1 | - | - | Core Service Architecture, Build Grids on Web Services. Industry best practice
FS2: Grid of Grids | - | - | - | Strategy for legacy subsystems: modular architecture
B: Core Services (Mainly Service Infrastructure and W3C/OASIS focus)
FS3: Service Internet, Messaging | WS2 | - | NCES3 | Core Infrastructure including reliability, publish-subscribe messaging cf. FS13C
FS4: Notification | WS3 | - | NCES3 | JMS, MQSeries, WS-Eventing, Notification
FS5: Workflow | WS4 | - | NCES5 | Grid Programming
FS6: Security | WS5 | GS7 | NCES2 | Grid-Shib, Permis, Liberty Alliance ...
FS7: Discovery | WS6 | - | NCES4 | UDDI and extensions
FS8: System Metadata & State | WS7 | - | - | Globus MDS, Semantic Grid, WS-Context
FS9: Management | WS8 | GS6 | NCES1 | CIM
FS10: Policy | WS9 | - | - | ECS
FS11: Portals and Users | WS10 | - | NCES7 | Portlets JSR168, NCES Capability Interfaces
The NCES/WS-*/GS-* Features/Service Areas II
Feature/Service Area | WS-* | GS-* | NCES | Comments
B: Core Services (Mainly Higher level and OGF focus)
FS12: Computing | - | GS3 | - | Job Management major Grid focus
FS13A: Data as Repositories: Files and Databases | - | GS4 | NCES8 | Distributed Files, OGSA-DAI. Managed Data is FS14B
FS13B: Data as Sensors and Instruments | - | - | - | OGC SensorML
FS13C: Data Transport | WS 2,3 | GS4 | NCES3, 8 | GridFTP or WS Interface to non SOAP transport
FS14A: Information as Monitoring | - | GS4 | - | Major Grid effort for job status etc.
FS14B: Information, Knowledge, Wisdom part of D(ata)IKW | - | GS4 | NCES8 | VOSpace for IVOA, JBI for DoD, WFS for OGC. Federation at this layer major research area. NCOW Data Strategy
FS15: Applications and User Services | - | GS2 | NCES9 | Standalone Services, Proxies for jobs
FS16: Resources and Infrastructure | - | GS5 | - | Ad-hoc networks; Network Monitoring
FS17: Collaboration and Virtual Organizations | - | GS7 | NCES6 | XGSP, Shared Web Service ports
FS18: Scheduling and matching of Services and Resources | - | GS3 | - | Current work only addresses scheduling “batch jobs”. Need networks and services