Building Problem Solving Environments with Application Web Service Toolkits
Choonhan Youn and Marlon Pierce
Presentation Outline
• Introduction
– What is the Computational web portal?
– Gateway: computing web portal
– Limitations of traditional approach
• Web Service-Based Computing Portal Architecture
• Core Web services for Computing Portals
– Job submission
– File Manipulation
– Context Management
– Script Generation
– Job monitoring
• Application Web services
• Web service negotiation
Computational Web Portals
• Computational Web Portals provide seamless access to HPC resources
– You can log in anywhere through any general web browser.
• Portals simplify the use of HPCs for novice users.
– Basics: batch script generation, job submission and monitoring, file service and ……
– Computational grid services: Globus, Condor
• Portals can simplify the use of unfamiliar codes.
– GEM code: disloc, simplex
• Provide a work management environment for all users.
– You can see what you did last week.
• Other PSE Web portals
– NASA Information Power Grid LaunchPad
– NPACI Hotpage
Gateway project
• Gateway is a computational web portal project funded through:
– DoD HPC MO PET Portal: Kerberos security in computational web portal
– GEM science: Support codes developed by earthquake modeling consortium
– Alliance: Contribute to NCSA portal
– SciDAC (Scientific Discovery through Advanced Computing): DOE project to build portal services for Plasma physics
• Our goal is to provide building block components that can be used to build specific portals.
• We also develop browser-based interfaces for basic services and specific science codes.
• Developed to support typical, if simple, high performance computing services
– Batch script generation, job submission and monitoring, file management and transfer.
Problems with Traditional Portal Architecture
• Portals access heterogeneous back ends and grids through a particular middle tier.
• Most portal projects are not interoperable.
– Middle tier software is incompatible.
– Wide range of protocols.
• Why do we need portal interoperability?
– Portal developers don’t have to reinvent every single important service (lesson from GGF GCE).
– Users will have access to more services than any one project can provide.
– Users will be able to pick the best available implementation of a service.
[Figure: two portal stacks side by side, each with its own Web browser, services, and back end resources, with a question mark between them.]
Web Service-Based Computing Portal Architecture
JS: Job Submission
JM: Job Monitoring
FT: File Transfer
CM: Context Manager
SG: Script Generation
AWS: Application Web Service
HIS: Host Independent Service
HSS: Host Specific Service
[Figure: architecture diagram. Labels: Web Browser (HTTP); User Interface Server / Portal Server on a middle-tier Web server (CM, SG, AWS, HIS); Service Repository with Client and Repository Client (Publish, SOAP); Web Services Providers on middle-tier Web servers hosting HSS: a Simulation Component (JS, JM, FT) in front of HPC back end resources and a Data Component (FT, JS, JM) in front of a database. Server-to-server links are SOAP.]
Core Web services – 1
• Given WSDL and SOAP, what can you build?
• Host-Specific Services (HSS)
– Instances of these services are bound to particular hosts.
– Job Submission
– File Transfer
– Job & Host Monitoring
• Host-Independent Services (HIS)
– Informational services that are not tied to specific service points
– The service provided does not depend on the location.
– Context Management
– Script Generation
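To make "given WSDL and SOAP" a little more concrete, the sketch below builds an RPC-style SOAP 1.1 request for a host-specific service using only the Python standard library. The envelope namespace is the standard SOAP 1.1 one; the service namespace `urn:example:jobsubmit`, the operation name `submitJob`, and its parameters are hypothetical and are not taken from the Gateway WSDL.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
SERVICE_NS = "urn:example:jobsubmit"  # hypothetical service namespace


def build_rpc_request(operation, params):
    """Return a SOAP 1.1 RPC-style request envelope as a string."""
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    # In the RPC style, the first child of Body is named after the operation.
    call = ET.SubElement(body, f"{{{SERVICE_NS}}}{operation}")
    for name, value in params.items():
        ET.SubElement(call, name).text = str(value)
    return ET.tostring(envelope, encoding="unicode")


# Hypothetical call: submit a script to a host-specific job submission service.
request_xml = build_rpc_request("submitJob", {"host": "solar", "script": "run.pbs"})
```

In practice a toolkit generates this envelope from the WSDL, so portal code would call a stub rather than build XML by hand.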
Core Web services – 2
• Job Submission
– Allows users to execute scientific applications.
– Executes operating system calls directly, or interacts with Grid services through, for example, the CoG client API to Globus.
– We use Java Runtime processes to run external (non-Java) commands, for example, PBS qsub.
• File Manipulation
– Users upload and download files between their desktops and various backend destinations.
– Allows users to transparently move, rename, and copy files on remote back ends and crossload between different backend sites.
– The file upload and download services illustrate the use of SOAP messages with attachments in the RPC messaging style.
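The job submission service above runs external commands such as PBS qsub from Java Runtime processes. The following is an analogous sketch in Python using `subprocess`, not the Gateway implementation. The function names and the injectable `runner` parameter are assumptions made for illustration; the only external behavior relied on is that `qsub` prints the job identifier on standard output.

```python
import subprocess


def run_external(command):
    """Run an external command and return (exit_code, stdout, stderr)."""
    result = subprocess.run(command, capture_output=True, text=True)
    return result.returncode, result.stdout.strip(), result.stderr.strip()


def submit_pbs_job(script_path, runner=run_external):
    """Submit a batch script with PBS qsub and return the job ID it prints.

    `runner` is a hypothetical hook so the call can be exercised
    on a machine that has no PBS installation.
    """
    code, out, err = runner(["qsub", script_path])
    if code != 0:
        raise RuntimeError(f"qsub failed with exit code {code}: {err}")
    return out
```

Passing the command as a list avoids shell interpretation of the script path, which matters when the path comes from a web form.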
Core Web services – 3
• Context Management (CM)
– Archives interactions with the computational portal and stores all of the metadata associated with user sessions.
– Provides the simplest possible data model.
• CM provides an easy interface to an arbitrarily deep and complex tree-shaped data structure.
• Context data nodes are defined by a recursive schema; each node holds optional, unbounded name/value pairs and child nodes.
– We use CM to store locations of job scripts, miscellaneous file URIs, the user’s application instance XML files, etc.
– CM metadata is stored on file systems, XML-native databases, ….
• Actual data may be anywhere.
– The service interface manipulates contexts and context data:
• Add one or more contexts.
• Search and store the context data with XPath queries.
• Remove the specified context.
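The context data model above (a recursive tree of nodes with name/value pairs and child nodes, searched with XPath) can be sketched with the Python standard library. The element names `context` and `property` and the helper functions are hypothetical, not the Gateway schema, and `xml.etree.ElementTree` supports only a subset of XPath.

```python
import xml.etree.ElementTree as ET


def add_context(parent, name, **properties):
    """Add a child context node holding optional name/value pairs."""
    node = ET.SubElement(parent, "context", {"name": name})
    for key, value in properties.items():
        ET.SubElement(node, "property", {"name": key, "value": str(value)})
    return node


def find_contexts(root, name):
    """Search the whole tree for contexts with the given name."""
    return root.findall(f".//context[@name='{name}']")


def remove_context(root, name):
    """Remove every descendant context with the given name; return the count."""
    removed = 0
    for parent in list(root.iter("context")):
        for child in parent.findall("context"):
            if child.get("name") == name:
                parent.remove(child)
                removed += 1
    return removed


# A user context containing a session, which contains one job.
root = ET.Element("context", {"name": "user"})
session = add_context(root, "session1")
add_context(session, "job1", script="file:///home/user/job1.pbs")
```

Only the metadata lives in the tree; the `script` value is a URI pointing at data that may be stored anywhere.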
Context Manager Architecture
[Figure: Context Manager architecture. Labels: Client; SOAP/HTTP; Axis Servlet; Context Manager; Shared WSDL Interface; Internal Communication; FS and XMLDB storage.]
Core Web services – 4
• Script Generation
– For users who are unfamiliar with HPC systems.
– The user’s choices made through the portal are stored as the user’s application instance XML document.
– Generates the job script, which has two parts: a queue script for a particular queuing system (such as PBS, LSF, or LoadLeveler) and a user script that runs the application code.
• Job monitoring
– Built using polling.
– Monitors the execution of a job running in a queuing system.
– Returns an array of a generated WSDL complex type.
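The two-part job script described under Script Generation can be sketched as follows. This is a minimal illustration for PBS only; the function names, the registry of queue-script generators, and the assumption that the application reads standard input and writes standard output are all hypothetical. The `#PBS` directives and `$PBS_O_WORKDIR` are standard PBS.

```python
def pbs_queue_script(job_name, nodes, walltime):
    """Queue script part: directives for the PBS queuing system."""
    return "\n".join([
        "#!/bin/sh",
        f"#PBS -N {job_name}",
        f"#PBS -l nodes={nodes}",
        f"#PBS -l walltime={walltime}",
    ])


def user_script(executable, input_file, output_file):
    """User script part: the commands that run the application code."""
    return "\n".join([
        "cd $PBS_O_WORKDIR",
        f"{executable} < {input_file} > {output_file}",
    ])


# Hypothetical registry; LSF or LoadLeveler generators would be added here.
QUEUE_SCRIPT_GENERATORS = {"PBS": pbs_queue_script}


def generate_job_script(queue_system, job_name, nodes, walltime,
                        executable, input_file, output_file):
    """Concatenate the queue script and the user script into one job script."""
    queue_part = QUEUE_SCRIPT_GENERATORS[queue_system](job_name, nodes, walltime)
    user_part = user_script(executable, input_file, output_file)
    return queue_part + "\n" + user_part + "\n"
```

Keeping the two parts separate means the user script can stay the same when the same code is moved to a host with a different queuing system.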
[Figure: listing user files on the selected host, Solar. File operations include upload, download, copy, rename, and crossload.]
[Figure: job monitoring service.]
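Since job monitoring is built on polling, a client-side loop might look like the sketch below. The status names, the `get_status` callable standing in for the remote service call, and the injectable `sleep` are assumptions for illustration only.

```python
import time

# Hypothetical status names; a real queuing system reports its own codes.
TERMINAL_STATES = {"COMPLETED", "FAILED"}


def poll_job(get_status, job_id, interval=5.0, max_polls=100, sleep=time.sleep):
    """Poll get_status(job_id) until a terminal state or max_polls is reached.

    Returns the list of observed states, oldest first.
    """
    history = []
    for _ in range(max_polls):
        status = get_status(job_id)
        history.append(status)
        if status in TERMINAL_STATES:
            break
        sleep(interval)
    return history
```

The `max_polls` bound keeps a portal page from waiting forever on a job that is stuck in the queue.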
Application Web Services (AWS)
• Application: specifically, some code developed by the scientific community.
– Examples: finite element codes, grid generation codes, and so on.
• AWS are designed to make scientific applications (e.g., earthquake modeling codes) into Grid resources.
• An actual application is wrapped by a Java program.
• We need a meaningful metadata model for applications.
– Describe application-specific requirements.
– Describe bindings of applications to host environments and to Web services in a general way that is independent of the particular portal.
• Scientific applications are built from several core Web services.
AWS Lifecycle
• Applications can exist in four states:
– Abstract state: describes optional choices and configurations that are available.
– Ready state: specific choices are made.
– Submitted: application is running.
– Completed: application is finished, but we …
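The four lifecycle states can be modeled as a small state machine. The sketch below assumes a strictly linear progression (abstract, ready, submitted, completed), which is a simplification made here and not something the slides state; the names `AppState`, `ALLOWED`, and `advance` are hypothetical.

```python
from enum import Enum


class AppState(Enum):
    ABSTRACT = "abstract"    # options and configurations are described
    READY = "ready"          # specific choices have been made
    SUBMITTED = "submitted"  # application is running
    COMPLETED = "completed"  # application has finished


# Assumed transitions: each state may only move to the next one.
ALLOWED = {
    AppState.ABSTRACT: {AppState.READY},
    AppState.READY: {AppState.SUBMITTED},
    AppState.SUBMITTED: {AppState.COMPLETED},
    AppState.COMPLETED: set(),
}


def advance(current, target):
    """Return target if the transition is allowed, else raise ValueError."""
    if target not in ALLOWED[current]:
        raise ValueError(f"cannot go from {current.value} to {target.value}")
    return target
```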
AWS Schema Structure
• Two sets of XML schemas:
– Application Descriptors:
• describe the abstract state.
• describe application options.
• are used by the application developer to deploy his/her service into the portal.
– Application Instance Descriptors:
• describe particular instance states (ready, running, archived).
• describe particular user choices and archive them for later browsing and resubmission.
• Schema sets are arranged hierarchically.
– Applications contain hosts.
– Schemas are designed to be pluggable.
AWS XML Descriptors
• Application description schema
– A “basic information” element that contains information such as application name, version, option flags.
– An “internal communication” element that contains child elements for describing input, output, and error fields for the code.
– An “execution environment” element that contains a list of core services needed to execute the application.
– An optional, generic parameter to hold arbitrary information about the application.
• Host description schema
– Contains information about the resource such as DNS name and IP address
– All of the information needed to invoke the parent application on that resource such as location of the executable, location of the workspace or scratch directory, and so on.
• Queue description schema
[Figure: sample generated user view of the application code Simplex.]
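To illustrate how the application and host descriptions nest, the sketch below builds a small descriptor document. Every element name (`application`, `basicInformation`, `internalCommunication`, `executionEnvironment`, `coreService`, `host`, `dnsName`, `executable`) and every sample value is hypothetical; the actual AWS schemas are not reproduced in these slides.

```python
import xml.etree.ElementTree as ET


def build_descriptor(name, version, core_services, host_dns, executable):
    """Build a hypothetical application descriptor with one nested host."""
    app = ET.Element("application")

    # "Basic information": application name, version, and similar fields.
    basic = ET.SubElement(app, "basicInformation")
    ET.SubElement(basic, "name").text = name
    ET.SubElement(basic, "version").text = version

    # "Internal communication": input, output, and error fields for the code.
    comm = ET.SubElement(app, "internalCommunication")
    for field in ("input", "output", "error"):
        ET.SubElement(comm, field)

    # "Execution environment": core services needed to run the application.
    env = ET.SubElement(app, "executionEnvironment")
    for service in core_services:
        ET.SubElement(env, "coreService").text = service

    # Host description: applications contain hosts.
    host = ET.SubElement(app, "host")
    ET.SubElement(host, "dnsName").text = host_dns
    ET.SubElement(host, "executable").text = executable
    return app


descriptor = build_descriptor(
    "simplex", "1.0", ["JobSubmission", "FileTransfer"],
    "solar.example.org", "/usr/local/bin/simplex",
)
descriptor_xml = ET.tostring(descriptor, encoding="unicode")
```

A portal could generate an input form like the one in the figure by walking such a descriptor and rendering one field per option.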
Portal Stack
• Core services provide the basic connection to back end “Grid” services.
• Application services combine core services and application metadata.
• User interface portlets are built for each service.
• Portals aggregate portlet components into portals.
[Figure: portal stack layers: Core Web Services; Application Web Services and Workflow; User Interfaces; Aggregate Portals. A vertical label beside the stack reads "Message Security, Inform…".]
Portlets for User Interface Components
• Web services define XML interfaces for accessing services.
• User interface components (such as JSPs) combine service stubs into useful objects for human interaction.
• So we actually have two points of interoperability:
– At the WSDL interface
– At the user interface
• Portlets combine HTML (and other) user interfaces into aggregate portal interfaces.
Reliability of Distributed Services
• Distributed service systems have some important reliability problems.
– Information must be up to date.
• The system must adjust when servers become available or unavailable.
• Service metadata should match the actual capabilities of the system.
– Messages should reach the services.
• We are automating the update of application service metadata through publish/subscribe mechanisms.
– Servers contain embedded publisher/subscriber clients.
– Information aggregators publish requests for information to JMS-style brokers.
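The publish/subscribe pattern above can be sketched in a few lines. This is an in-process toy, not JMS: the `Broker`, `ServerClient`, and `Aggregator` classes and the topic names `metadata.request` and `metadata.reply` are all hypothetical. It shows an aggregator publishing a request and each server's embedded client replying with its current service metadata.

```python
from collections import defaultdict


class Broker:
    """Minimal topic-based broker: delivers each message to all subscribers."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, callback):
        self._subscribers[topic].append(callback)

    def publish(self, topic, message):
        for callback in list(self._subscribers[topic]):
            callback(message)


class ServerClient:
    """Embedded client on a server: answers metadata requests."""

    def __init__(self, broker, host, services):
        self.broker, self.host, self.services = broker, host, services
        broker.subscribe("metadata.request", self._on_request)

    def _on_request(self, request):
        reply = {"host": self.host, "services": self.services}
        self.broker.publish("metadata.reply", reply)


class Aggregator:
    """Information aggregator: rebuilds its registry from the replies."""

    def __init__(self, broker):
        self.broker = broker
        self.registry = {}
        broker.subscribe("metadata.reply", self._on_reply)

    def refresh(self):
        self.registry.clear()
        self.broker.publish("metadata.request", {})

    def _on_reply(self, reply):
        self.registry[reply["host"]] = reply["services"]
```

Because the registry is rebuilt from live replies, a server that is down simply does not appear after the next refresh, which is the "information must be up to date" property the slide asks for.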
Bridging Between Client-Server and Messaging Services
Conclusions
• Traditional portals have “stovepipes” with interoperability problems.
• By designing and implementing several core portal services and Application Web Services around Web services, we gain interoperability and reusability.
• The emphasis is on developing reusable services that can form the basis for multiple PSEs.
• The portal developer can construct specific implementations and composites of primitive service components and can also provide services that may be shared among different portals.
• Application-specific services and data models can be used to encapsulate entire applications independently of the portal implementation.
• User interfaces to application services become distributed portlets.
• Everything is distributed.
– Core Web Services -> Application Web Services -> User Interface Portlets -> Portals
– Uses HTTP, SOAP, WSDL, ….
• It all has to be secured.
– A flexible, message-based security system that can be bound to multiple mechanisms and multiple message formats.
– The general approach is to use assertions: SAML, WS-Security.