Component-Based Portals for
Grid Computing
Marlon Pierce
OGCE
Consortium
NSF NMI Project for Reusable Portal
Components: Who We Are
•
University of Chicago
–
Gregor von Laszewski
•
Indiana University
–
Marlon Pierce, Dennis Gannon, Geoffrey Fox, and
Beth Plale
•
University of Michigan
–
Charles Severance, Joseph Hardin
•
NCSA/UIUC
–
Jay Alameda, Joe Futrelle
•
Texas Advanced Computing Center
What Is Grid Computing?
•
Grid Computing provides an overlay infrastructure that
can be used to bind computing and data resources
from multiple organizations into “virtual
organizations”.
–
Security, information services, resource access protocols, file
transfer, etc.
•
Open Grid Services Architecture recasts Grid
capabilities as Web Services
–
WSDL descriptive conventions, advanced features for
transient services, etc.
–
Service hosting environments manage service lifecycles,
interactions with requestor agents.
•
But what about the clients?
Towards A Common Grid
Client Hosting Environment
Grid portal background and
What Is a Computing Portal?
•
Browser based user interface for accessing grid and
other services
–
“Live” dynamic pages for accessing grid services
–
Use(d) Java/Perl/Python CoG Kits
–
Manage credentials, launch jobs, manage files, etc.
–
Hide Grid complexities
–
Can run from anywhere
–
Unlike user desktop clients, connections go through the portal
server, which overcomes firewall/NAT issues
•
Combine “Science Grid” with traditional web
capabilities
–
Get web pages for news feeds
–
Post and share documents
–
And other more traditional web page features
Let 10000 Flowers Bloom
•
Many portal projects have been launched since
late ’90s.
–
HotPage from SDSC, NCSA efforts, DOD, DOE
Portals, NASA IPG
–
2002 Special Issue of Concurrency and
Computation
•
Portals continue to be an important component of many
large projects
–
NEESGrid, DOE SciDAC projects, NASA, NSF, many
international efforts
•
Global Grid Forum’s Grid Computing
Environments Research Group
Three-Tiered Architecture
[Figure labels: Portal User Interface; Portal Client Stubs; Grid and Web Protocols; Grid Resource Broker Service; Information and Data Services; Database Service; HPC or Compute Cluster; Grid Information Services, SRB; Database (JDBC, local, or remote connection)]
Problem with Portals
•
GCE revealed two things
–
Everyone was doing the same thing
• Not quite, but the overlap was significant
• Everyone builds secure logins, remote file manipulation, command execution, access to info servers.
• Everyone would at least like support for multiple user roles (administrators, users) and customization
–
No one could share components with other groups
• No well defined way of sharing UI components or making services interoperate.
• No well defined interfaces to portal services.
•
A research opportunity!
–
Two levels of integration: user interfaces and services
•
Our challenges
–
Stop reinventing things and provide ways for groups to reuse
components.
–
Provide a portal marketplace for competing (advanced) services.
A Solution based on components
•
A software component is an object defined by
–
A precise public interface
–
A semantics that includes a set of “standard”
behaviors.
•
A Software component architecture is:
–
A set of rules for component behavior, and
–
A framework in which components can be easily
installed and interoperate.
•
The component architecture of choice for the
Portal community is the one based on portlets
–
Java components that generate content, make local
and remote connections to services.
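The component idea above (a precise public interface plus a framework that installs components and lets them interoperate) can be sketched in a few lines. This is only an illustration in Python, not the Jetspeed or JSR 168 API; the names `Portlet`, `PortalContainer`, `NewsPortlet`, and `JobPortlet` are hypothetical.

```python
from abc import ABC, abstractmethod


class Portlet(ABC):
    """Hypothetical component contract: the precise public interface."""

    @abstractmethod
    def title(self):
        """Return the label shown in the window's title bar."""

    @abstractmethod
    def render(self, user):
        """Return the content fragment for one user."""


class PortalContainer:
    """Hypothetical framework: installs components and aggregates their output."""

    def __init__(self):
        self._portlets = []

    def install(self, portlet):
        # The container depends only on the Portlet interface,
        # so components from different groups can be mixed freely.
        self._portlets.append(portlet)

    def render_page(self, user):
        return "\n".join(
            f"[{p.title()}] {p.render(user)}" for p in self._portlets
        )


class NewsPortlet(Portlet):
    def title(self):
        return "News"

    def render(self, user):
        return f"No new items for {user}"


class JobPortlet(Portlet):
    def title(self):
        return "Jobs"

    def render(self, user):
        return f"{user} has 0 running jobs"


container = PortalContainer()
container.install(NewsPortlet())
container.install(JobPortlet())
page = container.render_page("beth")
```

The point of the design is that `PortalContainer` never needs to know what a component does, only that it honors the shared interface.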
A Portlet Approach to Grid Services
•
A Portlet is a portal server component that provides
basic services rendered in a user-configurable window
in a portal pane.
[Figure labels: Portal Server; MyProxy Server; Metadata Directory Service(s); Directory & Index Services; Application Factory Service]
The Grid Portal
•
Provides Portlets for
–
Management of user proxy
certificates
–
Remote file management via
GridFTP
–
News/Message systems
• for collaborations
–
Grid Event/Logging service
–
Access to OGSA services
–
Access to directory services
–
Specialized Application
Factory access
• Distributed applications
• Workflow
–
Access to Metadata Index
tools
Portlet Component and Container
Technologies
•
Jakarta Jetspeed
–
Open source Java portlet project
–
Jetspeed is both a framework and reference
implementation
–
Defines portlets, portal service APIs (login,
authorization, customization, etc.)
•
CHEF from University of Michigan
–
Uses Jetspeed as a framework
•
Reimplements many of the core classes
–
Basis for UM CourseTools
–
NEESGrid portal
Background
•
CHEF is organized around groups of users
•
Portals in CHEF are group based (a group
can consist of only one person!)
•
A user sees the Portals for each group of
which that person is a member
•
The Portal is a collection of Portal pages
Portal Engine: Jetspeed, Velocity, CHEF
• Teamlets
– Written in Java
– Responsible for GUI
– Operate in the context of a session
– Rely on services for any persistent or “cross-user” information
• Services
– Persistent
– System-wide
– Multiple implementations of services
– Configurable as to what implementation provides what service
• Servlets
– Access services outside of the portal engine: AccessServlet and
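The split between session-scoped teamlets and persistent, system-wide services with swappable implementations can be sketched as a small registry. This is a Python illustration under stated assumptions, not CHEF code; `ServiceRegistry`, `InMemoryChatService`, `NullChatService`, and `chat_teamlet` are hypothetical names.

```python
class InMemoryChatService:
    """One hypothetical implementation of a 'chat' service."""

    def __init__(self):
        self._messages = []

    def post(self, user, text):
        self._messages.append((user, text))

    def history(self):
        return list(self._messages)


class NullChatService:
    """Alternative implementation that discards everything."""

    def post(self, user, text):
        pass

    def history(self):
        return []


class ServiceRegistry:
    """Maps a service name to whichever implementation the config selects."""

    def __init__(self, config, implementations):
        self._config = config                    # service name -> implementation name
        self._implementations = implementations  # implementation name -> class
        self._instances = {}

    def lookup(self, service_name):
        # One system-wide instance per service, created on first use.
        if service_name not in self._instances:
            impl_name = self._config[service_name]
            self._instances[service_name] = self._implementations[impl_name]()
        return self._instances[service_name]


def chat_teamlet(registry, user, text):
    """Session-scoped UI code: keeps no cross-user state of its own."""
    chat = registry.lookup("chat")
    chat.post(user, text)
    return [f"{u}: {t}" for u, t in chat.history()]


registry = ServiceRegistry(
    config={"chat": "memory"},
    implementations={"memory": InMemoryChatService, "null": NullChatService},
)
chat_teamlet(registry, "beth", "job 12 finished")
view = chat_teamlet(registry, "marlon", "thanks")
```

Changing `config` to `{"chat": "null"}` swaps the implementation without touching the teamlet code.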
What is a Teamlet?
•
A teamlet is a portal-like presentation of
information and possible user actions
•
It can be placed in multiple places within a
portal or across multiple portals; each
placement is independent
•
Each placement is configurable
Design Process - Elements
•
The design of a teamlet consists of three
elements
–
A service (the Java class or classes that implement
the interface to a source/store of information)
–
An action (the Tool in CHEF; one or more Java
classes that present information to the user and
respond to user actions)
Java CoG Kit
•
Provides interfaces to elementary Grid
functionality
–
Copy a file from here to there
–
Execute a remote job on the Grid
–
Authenticate to the Grid
•
Provides interfaces to more advanced Grid
functionality such as simple job queues and
task graphs
•
Provides a convenient API level interface that
What does the user see?
[Figure labels, top to bottom: Portlet; Java CoG Kit High-Level interface; Java CoG Kit Low-Level interface; GT2, GT3, GT4, Condor, SSH]
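The layering of a high-level interface over interchangeable back ends (GT2, GT3, GT4, Condor, SSH) can be sketched as a provider pattern. This Python sketch is not the Java CoG Kit API; `Provider`, `FakeGT2Provider`, `FakeSSHProvider`, and `HighLevelClient` are hypothetical, and the providers only return strings instead of contacting anything.

```python
class Provider:
    """Hypothetical low-level interface: one subclass per back end."""

    name = "abstract"

    def submit(self, host, executable):
        raise NotImplementedError


class FakeGT2Provider(Provider):
    name = "gt2"

    def submit(self, host, executable):
        # A real provider would speak the back end's protocol here.
        return f"gt2://{host}/{executable}"


class FakeSSHProvider(Provider):
    name = "ssh"

    def submit(self, host, executable):
        return f"ssh://{host}/{executable}"


class HighLevelClient:
    """What a portlet would call: the same method for every back end."""

    def __init__(self, providers):
        self._providers = {p.name: p for p in providers}

    def run(self, provider_name, host, executable):
        if provider_name not in self._providers:
            raise ValueError(f"no provider registered for {provider_name!r}")
        return self._providers[provider_name].submit(host, executable)


client = HighLevelClient([FakeGT2Provider(), FakeSSHProvider()])
handles = [
    client.run(name, "cluster.example.org", "hostname")
    for name in ("gt2", "ssh")
]
```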
Portal Capabilities
Capability                        Description
GridContext Portlets              Access context services for managing metadata
Anabas                            Access to Anabas shared display applets
GPIR Portlets                     View and interact with HPC status, job, and other information
GRAM Job Submission               Run simple executables on remote hosts
Chat                              Live chat services and interfaces
Discussion                        Persistent topic-based discussion for groups
Schedule                          Interactive individual and group calendars
Document managers                 WebDAV-based document system for group file sharing
GridFTP                           Upload, download, crossload remote files
MDS/LDAP Browsers                 Basic Globus MDS browsing and navigating
Grid Proxy Certificate Manager    Get MyProxy certs after logging in
Grid Portlet Examples
•
Next, we give an overview of several portal
capabilities.
•
Jetspeed/CHEF acts as a clearing house
for portal capabilities
–
User interface components can be added in
well defined ways.
–
First level of integration
Example Capability: Portals for Users
•
The MyProxy Manager
–
The user contacts the portal
server and asks it to do
“grid” things on behalf of the
user.
–
To make this possible the
server needs a “Proxy
Certificate”
• The user has previously stored a proxy certificate on a secure MyProxy server, protected by a temporary password.
• The user gives the portal server the password, and the portal server contacts the MyProxy server and loads the proxy.
• The portal server will hold the proxy for the user for a “short amount of time” in the
[Figure: User “Beth” tells the Portal Server “1. Load my Proxy Certificate!”; the Portal Server contacts the MyProxy Server]
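The MyProxy Manager flow described above (store a proxy in advance, hand the portal a temporary password, let the portal hold the proxy for a limited time) can be sketched as follows. This is a Python illustration only: `FakeMyProxyServer` and `PortalServer` are hypothetical, nothing here implements the real MyProxy protocol, and the two-hour hold time is an arbitrary assumption.

```python
import time


class FakeMyProxyServer:
    """Stand-in for a MyProxy server; not the real protocol."""

    def __init__(self):
        self._stored = {}

    def store(self, user, password, proxy):
        self._stored[user] = (password, proxy)

    def retrieve(self, user, password):
        saved_password, proxy = self._stored[user]
        if password != saved_password:
            raise PermissionError("wrong MyProxy password")
        return proxy


class PortalServer:
    """Holds each user's proxy for a limited time after it is loaded."""

    def __init__(self, myproxy, hold_seconds, clock=time.monotonic):
        self._myproxy = myproxy
        self._hold_seconds = hold_seconds
        self._clock = clock          # injectable so expiry can be demonstrated
        self._session = {}           # user -> (proxy, expiry time)

    def load_proxy(self, user, password):
        proxy = self._myproxy.retrieve(user, password)
        self._session[user] = (proxy, self._clock() + self._hold_seconds)

    def proxy_for(self, user):
        proxy, expires = self._session.get(user, (None, 0.0))
        if proxy is None or self._clock() >= expires:
            self._session.pop(user, None)   # forget expired proxies
            return None
        return proxy


now = [0.0]                                  # fake clock for the demonstration
myproxy = FakeMyProxyServer()
myproxy.store("beth", "temp-pass", "PROXY-CERT-FOR-BETH")
portal = PortalServer(myproxy, hold_seconds=7200, clock=lambda: now[0])
portal.load_proxy("beth", "temp-pass")
held = portal.proxy_for("beth")
now[0] = 7200.0                              # two hours later
expired = portal.proxy_for("beth")
```

Other portlets, such as file transfer, would call something like `proxy_for` to authenticate on the user's behalf while the proxy is still held.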
Example Capability: File Management
•
GridFTP portlet: allows the
user to manage remote
file spaces
–
Uses stored proxy for
authentication
–
Upload and download files
–
Third party file transfer
•
Request that GridFTP
server A send a file to
GridFTP server B
•
Does not involve traffic
through portal server
Example Capability: Grid Context
Service
•
Users want to be able to use the portal to keep
track of lots of things
–
Application and experiment records
•
File metadata, execution parameters, workflow scripts
–
“Favorite” services
•
Useful directory services, indexes, links to important
resources
–
Notes and annotations
XDirectory: A Grid Context Service
•
XDirectory is itself a Grid Service that is
accessed by the portal.
–
An index over a relational database
–
Each node is either a “directory node” or a leaf.
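The idea of a directory tree indexed over a relational database, where each node is either a directory node or a leaf, can be sketched with an in-memory SQLite table. The schema and the helper names `add_node` and `list_children` are assumptions made for illustration; they are not XDirectory's actual schema or interface.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE node (
           id INTEGER PRIMARY KEY,
           parent_id INTEGER REFERENCES node(id),
           name TEXT NOT NULL,
           is_directory INTEGER NOT NULL,
           content TEXT)"""
)


def add_node(name, parent_id=None, content=None):
    """Insert a directory node (no content) or a leaf (with content)."""
    cur = conn.execute(
        "INSERT INTO node (parent_id, name, is_directory, content) "
        "VALUES (?, ?, ?, ?)",
        (parent_id, name, 1 if content is None else 0, content),
    )
    return cur.lastrowid


def list_children(parent_id):
    """Return (name, is_directory) pairs for one directory node."""
    rows = conn.execute(
        "SELECT name, is_directory FROM node WHERE parent_id = ? ORDER BY name",
        (parent_id,),
    )
    return [(name, bool(is_dir)) for name, is_dir in rows]


root = add_node("beth")
experiments = add_node("experiments", parent_id=root)
add_node("run-001.xml", parent_id=experiments, content="<params/>")
add_node("favorites", parent_id=root)
listing = list_children(root)
```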
Portlet Interfaces to Grid Context
Services
•
A Remote Service
Directory Interface
–
Holds references and
metadata about application
services.
•
User selects interface to
application service from
the directory browser.
•
Examples: (near
completion)
–
Select a link to a DAGMan
document and invoke the
Condor service on the script.
–
Same for GridAnt/Ogre or
BPEL workflow script.
Example Capability: Topic Based
Messaging Systems
•
Indiana University has implemented an
XML metadata system based on
messages.
•
Newsgroups
–
Topic based posting and administration
•
Citation/reference browsers
–
Topic based, export/import bibtex
User Privileges for Group Channels
•
Users request access to specific
topics/channels.
–
Granted by administrator for that topic
•
Can request
–
Read/write by browser
–
Read/write by email (newsgroups)
–
Receive/don’t receive attachments.
•
Topic admin can edit these requests.
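The request-and-approve model for topic access can be sketched as a small data structure. This is a Python illustration, not the Indiana University system's API; `AccessRequest`, `Topic`, and the `"none"`/`"read"`/`"write"` values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class AccessRequest:
    user: str
    browser: str = "none"       # "none", "read", or "write"
    email: str = "none"         # same values, for newsgroup-style email access
    attachments: bool = False   # receive / don't receive attachments
    approved: bool = False


class Topic:
    def __init__(self, name, admin):
        self.name = name
        self.admin = admin
        self.requests = {}

    def request_access(self, request):
        self.requests[request.user] = request

    def approve(self, acting_user, user, **changes):
        if acting_user != self.admin:
            raise PermissionError("only the topic admin may approve")
        request = self.requests[user]
        for field_name, value in changes.items():
            setattr(request, field_name, value)   # the admin may edit the request
        request.approved = True

    def can_post_from_browser(self, user):
        request = self.requests.get(user)
        return bool(request and request.approved and request.browser == "write")


topic = Topic("earthquake-sim", admin="marlon")
topic.request_access(
    AccessRequest("beth", browser="write", email="write", attachments=True)
)
before = topic.can_post_from_browser("beth")
topic.approve("marlon", "beth", email="read")
after = topic.can_post_from_browser("beth")
```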
GPIR Data
•
Load - aggregated CPU
•
Downtime data for a
machine
–
Jobs: aggregated queue
•
MOTD
•
Nodes: job usage for
each machine node
•
NWS: based on VO and
Click model
•
Grid Monitoring
–
Based on TACC GMS
System
–
Custom providers
–
Plans to include MDS3.0
and INCA data are underway
•
Expanding to include:
–
queuing system
–
application profiles
–
performance data
–
Application profiles
–
Doc links
•
Model allows generic
inclusion of any XML
data from any
recognized source
–
Need schema
GPIR Components
•
Web Services Ingestor
–
Web Services Ingestor and clients
–
XML Schemas - can be changed
•
Data Repository
–
Local Cache
–
Archival --> PostgreSQL
•
Web Service Query
–
retrieve data – XML Queries
–
Retrieving current snapshot and archived data
•
Clients
–
GridPort services
–
Portal/Web Interface (Portlets, servlets, JSP)
–
Command line
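The ingest, store, and query path listed above can be sketched in a few lines: XML documents come in, the latest values are cached, and every record is also kept for archival queries. The `<gpir>`/`<record>` layout and the `Repository` class are invented for this sketch; they are not the GPIR schemas or web service interfaces.

```python
import xml.etree.ElementTree as ET


class Repository:
    """Keeps a current snapshot plus an append-only archive."""

    def __init__(self):
        self._current = {}   # (resource, kind) -> latest value
        self._archive = []   # every ingested record, oldest first

    def ingest(self, xml_text):
        root = ET.fromstring(xml_text)
        for record in root.iter("record"):
            key = (record.get("resource"), record.get("kind"))
            value = (record.text or "").strip()
            self._current[key] = value
            self._archive.append((key, value))

    def snapshot(self, resource, kind):
        return self._current.get((resource, kind))

    def history(self, resource, kind):
        return [v for k, v in self._archive if k == (resource, kind)]


repo = Repository()
repo.ingest('<gpir><record resource="clusterA" kind="load">0.42</record></gpir>')
repo.ingest(
    '<gpir><record resource="clusterA" kind="load">0.87</record>'
    '<record resource="clusterA" kind="motd">Maintenance Friday</record></gpir>'
)
latest_load = repo.snapshot("clusterA", "load")
load_history = repo.history("clusterA", "load")
```

In the design on the slide the archive is PostgreSQL; a Python list stands in for it here.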
Major Theme: Grid Application
Support
•
The current portal’s job submission
capabilities are vanilla
–
Type desired machine, executable, output
file
–
Generates RSL, runs command
•
Actual job management requires more
–
Integration of information, scheduling
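The "vanilla" path (take an executable and an output file, generate RSL) can be sketched as a string builder. The output follows the GT2-style `&(attribute=value)` form; the function name `build_rsl` is hypothetical, and the sketch refuses values containing quote characters rather than guessing at escaping rules. The target machine is not part of the RSL string, since it would be given to the submission call separately.

```python
def build_rsl(executable, stdout, arguments=(), count=1):
    """Build a simple RSL conjunction from form fields (illustrative only)."""

    def quoted(value):
        if '"' in value:
            raise ValueError("quote characters are not handled in this sketch")
        return f'"{value}"'

    relations = [f"(executable={quoted(executable)})"]
    if arguments:
        joined = " ".join(quoted(a) for a in arguments)
        relations.append(f"(arguments={joined})")
    relations.append(f"(count={int(count)})")
    relations.append(f"(stdout={quoted(stdout)})")
    return "&" + "".join(relations)


rsl = build_rsl("/bin/hostname", "hostname.out")
rsl_with_args = build_rsl("/bin/echo", "echo.out", arguments=["a", "b"], count=2)
```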
Capability: Job Sequencer Portlets
User uses Portal to generate XML description of sequence.

..." xsi:schemaLocation="http://grids.tacc.utexas.edu/schemas/sequencer/jobSequence C:\DOCUME~1\Maytal\Desktop\Maytal\Work\GP-IR\GP-IRX~1\motd.xsd">
  <Status>New</Status>
  <Step>
    <Status>Unscheduled</Status>
    <Type>CSFJob</Type>
    <Parameter name="jobFactoryServiceHandle">http://129.116.218.36:15080/ogsa/services/metascheduler/JobFactoryService</Parameter>
    <Parameter name="queue">normal</Parameter>
    <Parameter name="executable">pam</Parameter>
    <Parameter name="arguments">-g 1 mpichp4_wrapper /home/monitor/mpi_jobs/mpimd_5</Parameter>
    <Parameter name="directory">/home/monitor/mpi_jobs</Parameter>
    <Parameter name="count">4</Parameter>
    <Parameter name="stdIn">/dev/null</Parameter>
    <Parameter name="stdOut">/home/monitor/mpi_jobs/tomislavSequencerJobOut</Parameter>
    <Parameter name="stdErr">/home/monitor/mpi_jobs/tomislavSequencerJobErr</Parameter>
  </Step>
  <Step>
    <Status></Status>
    <Type>GridFTP</Type>
    <Parameter name="fromHost">[Previous]</Parameter>
    <Parameter name="toHost">blanco.tacc.utexas.edu:2811</Parameter>
    <Parameter name="fromFileFullName">/home/monitor/mpi_jobs/tomislavSequencerJobOut</Parameter>
    <Parameter name="toFileFullName">/home/monitor/mpi_jobs/tomislavSequencerJobOutCopied</...
    ...">/home/monitor/mpi_jobs/tomislavSequencerJobErr</Parameter>
    <Parameter name="toFileFullName">/home/monitor/mpi_jobs/tomislavSequencerJobErrCopied</Parameter>
  </Step>
</JobSequence>

Currently, sequence steps can consist of File Transfers and Job
Submissions to the CSF meta scheduler
GPIR
The XML is then decomposed and persisted to GPIR where the status information of each step in the sequence
and of the sequence as a whole can be
stored
Sequencer
GridPort returns a Sequence ID to the Portal immediately and
then begins executing the Sequence to completion or to error. Status information can be obtained at any time
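The sequencer behaviour described above (decompose the XML into steps, run them in order, track per-step and overall status, stop on error) can be sketched as follows. The XML here is a shortened, well-formed example modeled on the slide with made-up file paths; `parse_steps`, `run_sequence`, and the `"Done"`/`"Error"` status values are assumptions. The sketch runs synchronously, whereas the slide describes GridPort returning a Sequence ID immediately and executing in the background.

```python
import xml.etree.ElementTree as ET

SEQUENCE_XML = """
<JobSequence>
  <Status>New</Status>
  <Step>
    <Status>Unscheduled</Status>
    <Type>CSFJob</Type>
    <Parameter name="executable">hostname</Parameter>
    <Parameter name="stdOut">/tmp/example.out</Parameter>
  </Step>
  <Step>
    <Status></Status>
    <Type>GridFTP</Type>
    <Parameter name="fromFileFullName">/tmp/example.out</Parameter>
    <Parameter name="toFileFullName">/tmp/example.copied</Parameter>
  </Step>
</JobSequence>
"""


def parse_steps(xml_text):
    """Decompose the sequence document into a list of step records."""
    root = ET.fromstring(xml_text.strip())
    steps = []
    for step in root.findall("Step"):
        params = {p.get("name"): (p.text or "") for p in step.findall("Parameter")}
        steps.append(
            {"type": step.findtext("Type"), "params": params, "status": "Unscheduled"}
        )
    return steps


def run_sequence(steps, handlers):
    """Run steps in order; stop at the first failure."""
    for step in steps:
        try:
            handlers[step["type"]](step["params"])
            step["status"] = "Done"
        except Exception:
            step["status"] = "Error"
            return "Error"
    return "Done"


log = []
handlers = {
    # Stand-ins for real job submission and file transfer.
    "CSFJob": lambda p: log.append("run " + p["executable"]),
    "GridFTP": lambda p: log.append(
        "copy " + p["fromFileFullName"] + " -> " + p["toFileFullName"]
    ),
}
steps = parse_steps(SEQUENCE_XML)
overall = run_sequence(steps, handlers)
```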
Capability: Community
Scheduling Framework Portlets
CSF Use Case
•
Researcher submits job through User Portal
•
User Portal uses GridPort to
–
authenticate user
–
optionally make advanced reservation to visualization system
–
submit job to CSF
•
CSF selects compute cluster with best fit and forwards job
•
GridPort sends results to visualization system
[Figure labels: User Workstation; User Portal; GridPort; CSF; Visualization System; Bandera; Blanco]
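The "selects compute cluster with best fit" step in the use case can be illustrated with a toy selection rule. The slide does not say how CSF defines best fit, so the rule below (the cluster with the fewest spare CPUs that still satisfies the request) is purely an assumption, as are the function name `select_cluster` and the cluster data.

```python
def select_cluster(clusters, cpus_needed):
    """Pick the cluster whose free CPUs exceed the request by the least."""
    candidates = [c for c in clusters if c["free_cpus"] >= cpus_needed]
    if not candidates:
        return None   # nothing can run the job right now
    best = min(candidates, key=lambda c: c["free_cpus"] - cpus_needed)
    return best["name"]


clusters = [
    {"name": "cluster-a", "free_cpus": 64},
    {"name": "cluster-b", "free_cpus": 8},
    {"name": "cluster-c", "free_cpus": 2},
]
chosen = select_cluster(clusters, 4)
too_big = select_cluster(clusters, 100)
```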
O.G.R.E.—A Job Management
Engine
•
See Thursday Demo
•
O.G.R.E. = Open Grid Computing Environments Runtime Engine
•
What Ant lacked, but we needed:
• Broader conditional execution
– Ant: based on write-once String properties.
• A general “loop” structure for Task execution.
• Data communication between Tasks (and with their containers).
• Specialized tasks
– File reading and writing
– Local and remote file management (GridFTP)
– Web service related tasks
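The three features named above (conditional execution on mutable state, a general loop, and data passed between tasks) can be sketched with a tiny interpreter. This is a Python illustration of the ideas only; it is not O.G.R.E.'s task syntax, and `run_tasks` and the tuple-based task format are hypothetical.

```python
def run_tasks(tasks, context):
    """Run a list of tasks against a shared, mutable context dictionary."""
    for task in tasks:
        kind = task[0]
        if kind == "set":
            # Data communication: a task writes a value other tasks can read.
            _, key, compute = task
            context[key] = compute(context)
        elif kind == "if":
            _, predicate, body = task
            if predicate(context):
                run_tasks(body, context)
        elif kind == "while":
            _, predicate, body = task
            while predicate(context):
                run_tasks(body, context)
        else:
            raise ValueError(f"unknown task kind: {kind}")
    return context


workflow = [
    ("set", "attempt", lambda c: 0),
    ("set", "converged", lambda c: False),
    ("while", lambda c: not c["converged"], [
        ("set", "attempt", lambda c: c["attempt"] + 1),
        ("set", "converged", lambda c: c["attempt"] >= 3),
    ]),
    ("if", lambda c: c["converged"], [
        ("set", "report", lambda c: f"converged after {c['attempt']} attempts"),
    ]),
]
result = run_tasks(workflow, {})
```

Unlike write-once properties, the `attempt` value here is overwritten on every pass through the loop, which is what makes the loop condition meaningful.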
Data and Metadata Management
•
When the job is through…
•
Simulations, experiments generate both data
and metadata
–
Metadata includes code input parameters, host
machines, data formats, owners of data, generators
of data,…
•
NEESGrid metadata system will be integrated
into the portal release.
•
Another example of integrated Grid services
Metadata Repository Capabilities
•
Data store
–
Files
–
Logical naming
–
Format translation
•
Metadata store
–
Structured (RDF-like
schemas)
–
Random-access (tuple
store)
–
Version control
•
Archiving
–
Mass store
–
“nar” archive format
•
Security
–
Single signon
–
Secure reliable file transfer
with GridFTP
–
Authorization via CAS
•
Grid service
interfaces
–
NFMS: NEESgrid File
Management Service
–
NMDS: NEESgrid Metadata
Service
–
Repo. service (Façade)
Repository architecture
[Figure labels: user; Portal (Repo browser, File xfer servlet); HTTPS; Repo service (Façade); NFMS; NMDS; repository; File system; File I/O; JDBC; GridFTP; GSDL; Java API]
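The metadata store properties listed earlier (random access as a tuple store, plus version control) can be sketched with a dictionary of value histories. This is an illustration in Python, not the NMDS interface; `TupleStore` and the sample file name are hypothetical.

```python
class TupleStore:
    """Stores (subject, attribute, value) tuples and keeps every version."""

    def __init__(self):
        self._versions = {}   # (subject, attribute) -> list of values, oldest first

    def put(self, subject, attribute, value):
        history = self._versions.setdefault((subject, attribute), [])
        history.append(value)
        return len(history)   # 1-based version number of the new value

    def get(self, subject, attribute, version=None):
        history = self._versions[(subject, attribute)]
        return history[-1] if version is None else history[version - 1]

    def describe(self, subject):
        """Latest value of every attribute recorded for one subject."""
        return {a: h[-1] for (s, a), h in self._versions.items() if s == subject}


store = TupleStore()
store.put("shake-table-run-7.dat", "format", "csv")
store.put("shake-table-run-7.dat", "owner", "beth")
second_version = store.put("shake-table-run-7.dat", "format", "hdf5")
summary = store.describe("shake-table-run-7.dat")
```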
Architectural Upgrades
“A Bag of Portlets…”
•
Portlet/container systems provide a
simple level of user interface integration.
–
A clearing house for pluggable components
of all sorts
•
User interfaces are actually to a diverse
set of backend services.
–
A mixture of UIs to Web services, grid
services, communication/collaboration
services,….
•
We are a portlet marketplace…
OGCE Initial Architecture
[Figure labels: Portal; Local Portlets; Teamlets; Proxy Portlets; Jetspeed Internal Services; CHEF Services; Service API; Java CoG API; Java CoG Kit; Grid Services; Grid Protocols (GRAM, MDS-LDAP, MyProxy); Remote Interfaces; CoG Stubs; HTTP; SOAP; Grid Services; Other Services]
Integration Points and Service
Abstractions
•
Internal portal service abstractions
–
Service layer abstractions to define how to
interact with in-memory proxy certificates.
•
Authorization
–
Internal and external roles need to be
integrated.
•
Events
–
Share events between services
–
Job submissions should automatically
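Sharing events between services is commonly done with a publish/subscribe bus, sketched below in Python. The slide does not specify a mechanism, so `EventBus`, the `"job.submitted"` event name, and the two subscribers are assumptions chosen only to illustrate the integration point.

```python
from collections import defaultdict


class EventBus:
    """Minimal in-process publish/subscribe hub."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, event_type, callback):
        self._subscribers[event_type].append(callback)

    def publish(self, event_type, payload):
        # Every interested service is notified; the publisher knows none of them.
        for callback in self._subscribers[event_type]:
            callback(payload)


bus = EventBus()
log_entries = []
context_records = []

# Two hypothetical services react to the same event independently.
bus.subscribe(
    "job.submitted",
    lambda e: log_entries.append(f"{e['user']} submitted {e['executable']}"),
)
bus.subscribe("job.submitted", lambda e: context_records.append(e["executable"]))

bus.publish("job.submitted", {"user": "beth", "executable": "hostname"})
```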
TeraGrid Integrated Architecture
Diagram demonstrates how existing software projects (such as GridPort) can
be adapted to support the NMI Portals software system
[Figure labels: Portal; Portlets and Teamlets; Jetspeed Internal Services; Local Portal Services; Service API; Grid Service Stubs; Java CoG Kit; Grid Services; Remote Content Services; Remote Content Servers; HTTP]
Portlet Standards
•
Current portal uses Jetspeed portlet API
•
Other portlet systems available
–
Websphere->GridSphere
•
Portlet standard: JSR 168
–
A common API for all next generation portlet
systems.
–
Compliant portlet components may be shared
between systems.
•
Open Source Implementation (Pluto) is
available
–
We will be adopting this; it will be part of our SC2004
release
OGCE Portals in Action
New Starts: TeraGrid Portal
•
Access to TeraGrid Services
–
Version 0: Collecting Initial Services
•
Public Information about
Resources
•
Private Information for the
developers.
–
Version 1: A User centered portal
(Q2 2004)
•
Hotpage/Gridport style access
to user accounts, credentials,
job submission & management.
–
Version 2: Portals for Science
Collaborations (Q3 2004)
OGCE Collaboratory
We Can’t Do It All
•
Hopefully, we have convinced you to not
rebuild portals from scratch.
–
Time to use pluggable components in consistent
frameworks.
•
Our award is not just to release our own
software.
–
We want to foster the portal community
–
Contributed third party components will be sought.
•
Initial contributions will be from similar
projects