Computing as a Peer to Peer Grid Service
PTLIU Laboratory for Community Grid
Geoffrey Fox
Computer Science, Informatics, Physics Indiana University
Some Technology Trends
• Increasing performance of Internet backbone and last mile (access)
• Hand-held devices and wireless: pervasive access
• Peer-to-peer technologies enable new ways of collaborating and blur the distinction between clients and servers
• Client-server multi-tier architectures
• XML Schema and tools: all data defined as objects
• Separation of client, system and persistent storage models for information
• Development of an (application) service model to capture common (maybe centralized) capabilities
• Semantic Web, Grid or … “Next Generation Web”
Small Devices Increasing in Importance
• There is growing interest in wireless portable displays in the confluence of cell phone and personal digital assistant markets
• By 2005, 60 million internet-ready cell phones sold each year
• 65% of all broadband Internet accesses via non-desktop appliances
Technology Trends and Principles
• All performance and capability measures of infrastructure continue to improve
• Gilder’s law says that network bandwidth increases 3 times faster than CPU performance (Moore’s Law)
• The Telecosm eclipses the Microcosm ….
George Gilder, Telecosm: How Infinite Bandwidth Will Revolutionize Our World (September 2000, Free Press; ISBN: 0684809303; Amazon sales rank #146 on Jan 15 2001, #3883 on July 29 2001)
What is a Grid Service?
• The Grid is a distributed system allowing communities to access heterogeneous resources seamlessly from heterogeneous clients
– Resources are web pages, instruments, object repositories, simulation codes running on supercomputers ….
• A Service is a generic application or capability respecting standards (general web and application-specific), allowing multiple providers to compete on a given service
[Figure: back-end capability (resource), middle-tier broker, and portal; the portal is a customizable user interface]
The Grid is essentially the future Web
IBM just announced they were investing around $1 billion in the Grid
Some General Grid Services
• Business is developing the “web service” concept to support areas like e-commerce, where one composes atomic services like
– Security
– Payment
– Catalog
– Goods supply
[Figure: atomic e-commerce services. Labels: Security, Catalog, Payment, Credit Card, Warehouse, Shipping]
Each of these services could allow multiple choices of provider in a given session
WSDL is the new standard for web services
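The idea of composing atomic services, each with a choice of provider per session, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: every provider name, field name and price below is hypothetical, and plain function calls stand in for real web service invocations.

```python
from typing import Callable, Dict

# Every service, provider and field name below is hypothetical.
Provider = Callable[[dict], dict]


def password_login(order: dict) -> dict:
    # Security service: mark the session as authenticated if a user is named.
    return {**order, "authenticated": bool(order.get("user"))}


def store_catalog(order: dict) -> dict:
    # Catalog service: look up a unit price and compute the order total.
    prices = {"book": 20.0, "pen": 2.0}
    return {**order, "total": prices[order["item"]] * order["quantity"]}


def credit_card(order: dict) -> dict:
    return {**order, "paid_via": "credit card"}


def bank_transfer(order: dict) -> dict:
    return {**order, "paid_via": "bank transfer"}


def road_shipping(order: dict) -> dict:
    return {**order, "shipped_by": "road"}


# Each atomic service may offer several interchangeable providers.
REGISTRY: Dict[str, Dict[str, Provider]] = {
    "security": {"password": password_login},
    "catalog": {"store": store_catalog},
    "payment": {"card": credit_card, "bank": bank_transfer},
    "goods supply": {"road": road_shipping},
}


def run_session(order: dict, choices: Dict[str, str]) -> dict:
    """Apply one chosen provider per atomic service, in a fixed order."""
    state = dict(order)
    for service in ("security", "catalog", "payment", "goods supply"):
        provider = REGISTRY[service][choices[service]]
        state = provider(state)
    return state


order = {"user": "alice", "item": "book", "quantity": 2}
choices = {"security": "password", "catalog": "store",
           "payment": "card", "goods supply": "road"}
receipt = run_session(order, choices)
print(receipt["total"], receipt["paid_via"], receipt["shipped_by"])
```

Switching `choices["payment"]` to `"bank"` changes only the payment provider; the rest of the session is unaffected, which is the point of keeping the services atomic.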
Grid Services support Communities
• Grid Communities (PTLIU, NSF, Earth Science, High School Classes) are groups of communicating individuals sharing resources implemented as Grid Services
• Access Grid from Argonne/NCSA is the best audio/video conferencing technology
• Peer-to-peer networking describes a set of technologies supporting community building, with an emphasis on less structured groups than the classic “users of a supercomputer”
• Peer-to-peer Grids combine the technologies and support “small worlds”: optimized networks with short links between each community member
• The Collaborative Grid Service Framework allows one to build community-oriented, not individually oriented, Grid Services
Architecture of Grid: Commodity
Science
• Commerce, Entertainment, Healthcare, Science, Computing, Education …. will be Grid Services
[Figure: layered Grid architecture. Labels: Science Portals & Workbenches; Twenty-First Century University and Laboratory; Computational Services; Performance; Networking, Devices and Systems; Grid Services (resource independent); Grid Fabric (resource dependent); Research Services & Technology; Research Grid; Computational Grid; Community Portals; Next Generation Consumer Web; Education Services; Business Services; Commerce Grid; Education Grid]
Examples of Grid or Web Services
• There are generic Grid system services: security, collaboration, persistent storage, universal access
• An Application Service is a capability used either by another service or by a user
– It has input and output ports; data is from sensors or other services
• Consider NASA Space Operations (CSOC) as a Grid Service
– Spacecraft management (with a web front end)
– Each tracking station is a service
– Image processing is a pipeline of filters, which can be grouped into different services
– Data storage is an important system service
– Big services are built hierarchically from “basic” services
• Portals are the user (web browser) interfaces to Grid services
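The statement that image processing is a pipeline of filters that can be grouped into services, with big services built hierarchically from basic ones, can be sketched as follows. This is a minimal illustration, not the CSOC design: the filter names and the list-of-floats "image" are assumptions made only for the example.

```python
from functools import reduce
from typing import Callable, List

# A filter takes data on its input port and returns data on its output port.
Filter = Callable[[List[float]], List[float]]


def make_service(*stages: Filter) -> Filter:
    """Group filters (or smaller services) into one larger service."""
    def service(data: List[float]) -> List[float]:
        # Feed the output of each stage into the next one.
        return reduce(lambda d, stage: stage(d), stages, data)
    return service


# Hypothetical basic filters operating on a flat list of pixel values.
def remove_offset(pixels: List[float]) -> List[float]:
    base = min(pixels)
    return [p - base for p in pixels]


def normalize(pixels: List[float]) -> List[float]:
    peak = max(pixels) or 1.0  # avoid dividing by zero for a blank image
    return [p / peak for p in pixels]


def threshold(pixels: List[float]) -> List[float]:
    return [1.0 if p >= 0.5 else 0.0 for p in pixels]


# Two basic filters grouped into a "calibrate" service ...
calibrate = make_service(remove_offset, normalize)
# ... which is itself one stage of a bigger service.
image_processing = make_service(calibrate, threshold)

print(image_processing([10.0, 12.0, 20.0]))
```

Because a grouped service has the same shape as a single filter, regrouping the pipeline into different services does not change any caller.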
Integration of Grid Services
[Figure: Object Grid Programming Environment over classic HPCC resources. Labels: Grid Gateway supporting seamless interface; agent-based choice of compute engine; multidisciplinary control; Proxy; Database; Matrix Solver; MPP; Parallel D…; Sensor; Control; Origin 200; IBM S…; NetSolve Linear Alg Server]
The Application Service Model
• As the bandwidth of communication between services increases, one can support smaller services
• Some fields such as Education do not have stringent latency/bandwidth requirements on inter-service communication
– Computing services must often have high performance communication
• A service “is a component” and is a replacement for a library in cases where performance allows
• Services are a sustainable model of software development: each service has a documented capability with standards-compliant interfaces
– XML defines interfaces at several levels
– WSDL at Grid level and XSIL or equivalent for scientific data format
• A service can be written in Perl, Python, Java Servlet, Enterprise JavaBean, CORBA (C++ or Fortran) Object …
• Communication protocol can be RMI (Java), IIOP (CORBA) or SOAP (HTTP, XML) ……
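To make the "service as a replacement for a library" point concrete, here is a hedged Python sketch in which the same capability is reached either by a direct library call or through a text-message interface. JSON is used here only as a stand-in for the XML/SOAP encoding named above, and the message layout, class name and `describe` method are hypothetical choices for the example, not part of any Grid standard.

```python
import json
from typing import Callable, Dict, List


def dot(a: List[float], b: List[float]) -> float:
    """Library version: a direct in-process call."""
    return sum(x * y for x, y in zip(a, b))


class Service:
    """Wraps callables behind a text-message interface.

    JSON stands in for XML/SOAP; the {"op": ..., "args": ...} layout is
    an assumption made for this sketch.
    """

    def __init__(self, operations: Dict[str, Callable]):
        self._ops = dict(operations)

    def describe(self) -> str:
        # Stand-in for a documented interface description such as WSDL.
        return json.dumps({"operations": sorted(self._ops)})

    def handle(self, request: str) -> str:
        msg = json.loads(request)
        op = self._ops.get(msg["op"])
        if op is None:
            return json.dumps({"error": "unknown operation"})
        return json.dumps({"result": op(*msg["args"])})


linear_algebra = Service({"dot": dot})

request = json.dumps({"op": "dot", "args": [[1, 2, 3], [4, 5, 6]]})
reply = json.loads(linear_algebra.handle(request))
print(reply["result"], dot([1, 2, 3], [4, 5, 6]))
```

The service call returns the same value as the library call but pays for encoding and decoding on every request, which is why the slide restricts the substitution to cases where performance allows.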
Classic Grid Architecture
[Figure: clients (users and devices) reach resources (databases, Netsolve, Neos) through middle-tier brokers and service providers (security, portal, composition)]
Typically separate clients, servers and resources
Peer to Peer Network
[Figure: six peers, each labeled User, Resource, Service, Routing]
Peers are Jacks of all Trades linked to “all” peers in community
Typically integrated clients, servers and resources
Peer to Peer Grid
[Figure labels: Services, GMS, Routing]
Peers on the Edge of the Internet
Servers at the center of the world
P2P Grid with Peers and Servers
HPCC Background
• The 1990 HPCC 10-year initiative was largely aimed at enabling large-scale simulations for a broad range of computational science and engineering problems
• It was in many ways a success, and we have methods and machines that can (begin to) tackle most 3D simulations
– ASCI simulations particularly impressive
– DoE still putting substantial resources into basic software and algorithms, from adaptive meshes to PDE solver libraries
• Machines are still increasing in performance exponentially and should achieve petaflops in the next 7-10 years
• Each computing community needs to harness these capabilities in customized fashion
– ASCI (DoE), Earth Simulator (Japan), Teragrid (NSF) …..
Some HPCC Difficulties
• An intellectual failure: we never produced a better programming model than message passing
– HPCC code is hard work
– “High point” of ASCI software is “Grid FTP”
• An institutional problem: we do not have a way to produce complex sustainable software for a niche (1%) market like HPCC
– POOMA support just disappeared one day: DoE is funding efforts for its critical missions, not to support general communities
– One must adopt commodity standards and produce “small” sustainable modules
– Note that distributed memory is becoming dominant again with complex hybrid clustered SMP architectures; it is not clear that it is “wise” to exploit the advantages of shared memory architectures
Personal HPCC Advice
• KISS: Keep it Simple and Sustainable
• Use MPI, and OpenMP if needed for performance on shared memory nodes
• Adaptive Meshes
• Load Balancing
• PDE Solvers including fast multipoles
• Particle dynamics
(The areas above are well understood: use broad community expertise to get high performance parallel simulations)
• Other areas such as datamining, visualization and data assimilation are quite advanced but still involve significant research
Use of Object Technologies
• The commercial success claimed for object and component technology has not been clearly matched in HPCC
– Object technologies do not naturally support either high performance or parallelism
– C++ can be high performance but CORBA and Java are not
– There is no agreed HPCC component architecture to produce more modern libraries (DoE has a very large CCA, Common Component Architecture, effort which should be followed)
• Fortran will continue to decline in importance and interest; the community should prefer not to use it
– Its use will not attract the best students
Application Structure
• Modern applications are typically multi-scale and multi-disciplinary
– i.e. a given simulation is made of multiple components with different time/length scales and/or multiple authors from possibly multiple fields
• I am not aware of a systematic “Computational renormalization group”, a methodology that links different scales together
• However, composition of modules is an area where technology of growing sophistication is becoming available
– Needed commercially to integrate corporate functions
– CCA tackles the challenging “small grain size”; Gateway is an example of clearly successful large grain size integration
Object Size & Distributed/Parallel Simulations
• All interesting systems consist of linked entities
– Particles, grid points, people or groups thereof
• Linkage translates into message passing
– Cars on a freeway
– Phone calls
– Forces between particles
• Amount of communication tends to be proportional to the surface area of an entity, whereas simulation time is proportional to its volume
• So communication/computation is surface/volume and decreases in importance as entity size increases
• In parallel computing, communication is synchronized; distributed computing has “self-contained objects” (whole programs) which can be scheduled asynchronously
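The surface/volume argument can be checked with a small calculation. The sketch below assumes a cubic entity of side n grid points, one unit of communication per surface point and one unit of computation per interior point; these unit costs are assumptions for illustration, and real codes have different constants. Under them the ratio is 6n²/n³ = 6/n.

```python
def comm_to_compute_ratio(side: int) -> float:
    """Communication/computation for a cube with `side` points per edge.

    Assumes cost proportional to surface area (communication) and to
    volume (computation), with unit constants.
    """
    surface = 6 * side ** 2   # six faces of side x side points
    volume = side ** 3        # all points in the cube
    return surface / volume   # equals 6 / side


for side in (10, 100, 1000):
    print(f"side={side:5d}  communication/computation={comm_to_compute_ratio(side):.4f}")
```

Each tenfold increase in entity size cuts the relative communication cost tenfold, which is why large self-contained objects tolerate slow, asynchronous links while small ones need tightly coupled communication.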
Community HPCC and Grid Strategy I
• Decide what services are well enough understood and useful enough to be encapsulated as application services
– Parallel FEM Solvers
– Visualization
– Parallel Particle Dynamics
– Access to Sensor Data
• Make services as small as possible: smaller is simpler and more sustainable but has higher communication needs
• Establish teams to design and build services
• Use a framework offering needed Grid System services
• Build an electronic community for each field with collaboration tools, resources and worldwide networking linking community members
Community HPCC and Grid Strategy II
• Some capabilities, such as a fast multipole or adaptive grids package, should be built as classic libraries or templates
• Other services, such as datamining or support of multi-scale simulations, need research using a toolkit approach if one can design a general structure
• Need “hosts” for major services: access and storage of sensor data
• Need funds to build and sustain “infrastructure” and research services
• Use electronic community tools to enhance collaboration
Sensor Grid Service
[Figure: Distributed Sensor Service with in ports and out ports, giving people/computers universal sensor access]
Peer to Peer Grid Community
APAN Network linkin