GTLAB: Grid Tag Libraries
Supporting Workflows within
Science Gateways
Mehmet A. Nacar, Marlon E. Pierce,
Geoffrey C. Fox
Now to Portals
Grid-style portal as used in Earthquake Grid
Introduction to GTLAB
• Grid Tag Libraries and Beans (GTLAB)
• Encapsulates clients to common Grid services as XML tag libraries and server side Java beans.
– Embedded by portlet developers in their portlet pages to invoke common tasks
– Specification of the composite action you want to occur when a user hits the submit button.
– Allows portal developers to concentrate on the user interface components.
• Tags can be arranged in directed acyclic graphs (dependency chains).
– These represent simple workflows.
• Based on extensions to Java Server Faces (JSF) and the Java CoG Kit.
Motivation
• OGCE Grid portlets typically wrap each single Grid capability in a separate portlet
– GridFTP-->GridFTP Portlet
• Gateway portlets encapsulate sophisticated but specialized functionality.
– Such as submitting multiple linked jobs
– Each functionality requires a separate portlet
• We need a middle way to “automatically” build complex functionality from component parts
– Traditional portlets fall short here: there is no agreed way to combine component portlets into an integrated user capability
– We need a component model for portlets: reusable portlet parts
• JSF is our starting point
– Removes dependencies on the Servlet API.
– Backend software is just beans, so can be reused more easily outside of web and portlet applications.
• JSF also provides an extensible framework (tag libraries)
• Apache JSF portlet bridge allows you to convert standalone JSF applications into portlets
Architecture Overview
• Grid portals are clients to backend codes through Web/Grid services.
• Grid tags are part of the user interface tier and are embedded into the portlet container.
• Grid tags use local services in Apache Tomcat to manage sessions and handlers.
• Grid beans implement a layer on top of Grid client APIs such as the Java CoG Kit.
• This implies that the “portal” is quite sophisticated, as it must support integration.
• Note that the CoG Kit acts here as a broker to hide the complexities of evolving services (e.g. different versions of Globus).
• This “violates the service model” in that core software is centralized in the portal application.
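The layering described above can be sketched in plain Java. This is a minimal illustration, not GTLAB's actual code: `JobClient`, `JobBean`, and the stub "GT2"/"GT4" clients are hypothetical names that stand in for a Grid client API such as the Java CoG Kit.

```java
import java.util.HashMap;
import java.util.Map;

class GridBeanSketch {

    /** Hypothetical stand-in for a Grid client API such as the Java CoG Kit. */
    interface JobClient {
        String submit(String hostname, String executable);
    }

    /** Hypothetical Grid bean: callers set properties, then call submit(). */
    static class JobBean {
        private final Map<String, JobClient> clients = new HashMap<>();
        private String hostname;
        private String provider;
        private String executable;

        JobBean() {
            // Stub clients for illustration; real ones would contact the Grid service.
            clients.put("GT2", (host, exe) -> "GT2 stub: " + exe + " on " + host);
            clients.put("GT4", (host, exe) -> "GT4 stub: " + exe + " on " + host);
        }

        public void setHostname(String hostname) { this.hostname = hostname; }
        public void setProvider(String provider) { this.provider = provider; }
        public void setExecutable(String executable) { this.executable = executable; }

        /** Selects the client for the configured provider, so callers never see version details. */
        public String submit() {
            JobClient client = clients.get(provider);
            if (client == null) {
                throw new IllegalArgumentException("Unknown provider: " + provider);
            }
            return client.submit(hostname, executable);
        }
    }

    public static void main(String[] args) {
        JobBean bean = new JobBean();
        bean.setHostname("cobalt.ncsa.teragrid.org");
        bean.setProvider("GT4");
        bean.setExecutable("/bin/ls");
        System.out.println(bean.submit());
    }
}
```

Because the bean has no dependency on servlet or portlet classes, the same object could in principle be driven from a command-line tool as well as from a tag.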
GTLAB Example
• Grid tags are associated with Grid services via Grid beans
• Grid beans wrap the Java CoG Kit (version 4)
• We show an example JSF page section below.
<html><body>
<f:view>
  <!-- Other user interface tags go here -->
  <f:form>
    <o:submit id="test" action="next_page">
      <o:myproxy id="pr" hostname="gf1.ucs.indiana.edu" port="7512" lifetime="2" username="mnacar" password="***" />
      <o:jobsubmit id="task" hostname="cobalt.ncsa.teragrid.org"
          provider="GT4" executable="/bin/ls" stdout="tmp/result" stderr="tmp/error" />
    </o:submit>
  </f:form>
</f:view>
</body></html>
GTLAB features
• GTLAB provides common components for building portlets from tags and reusable parts.
• The goal of GTLAB is to simplify Grid portlet development
– Enable rapid development from reusable components
• GTLAB exposes Grid operations as XML-based tags within the JSF framework.
• Grid tag libraries are built using JSF custom component development techniques
• Grid tags are interfaces to backing Grid beans
– End users pass values to Grid beans by using tag attributes.
• Each backing Grid bean offers the same capability as a whole portlet application does in the Grid portlet approach.
– The GridFTP tag generates the GridFTP portlet
• But tags allow the service backend component model to map onto an intermediate UI component model: each tag set is a component
– Component tags can easily be composed into a rich UI, which portlets cannot do
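One JavaBeans-style way to pass tag attributes to a backing bean is to call a setter named after each attribute. The sketch below assumes that convention and uses reflection. `ProxyBean` and `apply` are hypothetical names, and GTLAB's own JSF component code may work differently.

```java
import java.lang.reflect.Method;
import java.util.LinkedHashMap;
import java.util.Map;

class TagAttributeSketch {

    /** Hypothetical bean whose setters are named after tag attributes. */
    public static class ProxyBean {
        private String hostname;
        private String port;

        public void setHostname(String hostname) { this.hostname = hostname; }
        public void setPort(String port) { this.port = port; }
        public String describe() { return hostname + ":" + port; }
    }

    /** Calls set<Attribute>(String) on the bean for every attribute in the map. */
    static void apply(Object bean, Map<String, String> attributes) {
        for (Map.Entry<String, String> e : attributes.entrySet()) {
            String name = e.getKey();
            String setter = "set" + Character.toUpperCase(name.charAt(0)) + name.substring(1);
            try {
                Method m = bean.getClass().getMethod(setter, String.class);
                m.invoke(bean, e.getValue());
            } catch (ReflectiveOperationException ex) {
                throw new IllegalArgumentException("No usable setter for attribute: " + name, ex);
            }
        }
    }

    public static void main(String[] args) {
        // Attribute values as they might appear on a <o:myproxy/> tag.
        Map<String, String> attrs = new LinkedHashMap<>();
        attrs.put("hostname", "gf1.ucs.indiana.edu");
        attrs.put("port", "7512");

        ProxyBean bean = new ProxyBean();
        apply(bean, attrs);
        System.out.println(bean.describe());
    }
}
```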
Grid Tags             Associated Grid Beans   Features
<submit/>             ComponentBuilderBean    Creating components, job handlers, submitting jobs. This is visually rendered as HTML
<handler/>            MonitorBean             Handling monitoring page actions
<multitask/>          MultitaskBean           Constructing simple workflow
<dependency/>         MultitaskBean           Defining dependencies among sub jobs
<myproxy/>            MyproxyBean             Retrieving myproxy credential
<fileoperation/>      FileOperationBean       Providing Gridftp operations
<jobsubmission/>      JobSubmitBean           Providing GRAM job submissions
<filetransfer/>       FileTransferBean        Providing Gridftp file transfer
(Other JSF UI Tags)   ResourceBean            Describes common properties among all tags and beans. Passing values given by standard visual
VLAB: The Virtual Laboratory for Earth and Planetary Materials
• Primarily a traditional job submission, monitoring, and management portal.
– Originally tried to use first generation of OGCE where all capabilities are portlets
• Collaborative Grid services and portals support computational material science.
• VLAB Challenges:
– Generic GRAM job submission portlet does not fit with VLab requirements
• Quantum Espresso requires parameter entries and checking
– Grid Portlets must be easy to develop using component libraries.
• Transfer data files in and out of the desired remote host.
• Run one or more executables.
• Keep track of job progress
• Store all of the information as a “job archive” for reproducibility.
VLab portlet pages
• VLab portlets include job preparation, job submission, and job monitoring
Case Study: VLAB Portal
Encoding DAGs in Portlets
• Multitask provides a simple Directed Acyclic Graph (DAG)
• This example demonstrates a composite Grid job using multi-staged multitask
• GTLAB handles the lifecycle of the DAG within JSF
<o:submit id="test" action="next_page">
  <o:multitask id="mytask" taskname="test" persistent="true">
    <o:myproxy id="pr" hostname="gf1.ucs.indiana.edu" port="7512" lifetime="2" username="nacar" password="***" />
    <o:fileoperation id="taskA" command="mkdir" hostname="cobalt.ncsa.teragrid.org" path="/home/manacar/tmp/" />
    <o:filetransfer id="taskB"
        from="gridftp://gf1.ucs.indiana.edu:2811/home/manacar/input_file"
        to="gridftp://cobalt.ncsa.teragrid.org:2811/home/manacar/input_file" />
    <o:jobsubmit id="taskC" hostname="cobalt.ncsa.teragrid.org" provider="GT4"
        executable="/bin/execute"
        stdin="tmp/input_file" stdout="tmp/result" stderr="tmp/error" />
    <o:filetransfer id="taskD"
        from="gridftp://cobalt.ncsa.teragrid.org:2811/home/manacar/tmp/result"
        to="gridftp://gf1.ucs.indiana.edu:2811/home/manacar/result" />
    <!-- Dependency chain: taskA, then taskB, then taskC, then taskD -->
    <o:dependency id="dep1" task="taskB" dependsOn="taskA" />
    <o:dependency id="dep2" task="taskC" dependsOn="taskB" />
    <o:dependency id="dep3" task="taskD" dependsOn="taskC" />
  </o:multitask>
</o:submit>
DAG Example JSF Page
• Grid tags are embedded into JSF view pages and decorated with standard JSF form, input, output and button components
– Grid components are non-visual.
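The dependency tags in the DAG example (taskB depends on taskA, and so on) imply an execution order. How GTLAB schedules tasks internally is not shown in these slides. The sketch below simply applies a standard topological sort (Kahn's algorithm) to the same task ids, and `executionOrder` is a hypothetical helper name.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class DagOrderSketch {

    /**
     * Returns the tasks in an order where every task follows the tasks it depends on.
     * Each dependency is a pair {task, dependsOn}.
     */
    static List<String> executionOrder(List<String> tasks, List<String[]> dependencies) {
        Map<String, Integer> pending = new LinkedHashMap<>();          // unmet dependency count
        Map<String, List<String>> dependents = new LinkedHashMap<>();  // tasks waiting on a task
        for (String t : tasks) {
            pending.put(t, 0);
            dependents.put(t, new ArrayList<>());
        }
        for (String[] d : dependencies) {
            String task = d[0];
            String dependsOn = d[1];
            if (!pending.containsKey(task) || !pending.containsKey(dependsOn)) {
                throw new IllegalArgumentException("Unknown task in dependency: " + task + " on " + dependsOn);
            }
            pending.merge(task, 1, Integer::sum);
            dependents.get(dependsOn).add(task);
        }
        Deque<String> ready = new ArrayDeque<>();
        for (String t : tasks) {
            if (pending.get(t) == 0) {
                ready.add(t);
            }
        }
        List<String> order = new ArrayList<>();
        while (!ready.isEmpty()) {
            String t = ready.remove();
            order.add(t);
            for (String next : dependents.get(t)) {
                if (pending.merge(next, -1, Integer::sum) == 0) {
                    ready.add(next);
                }
            }
        }
        if (order.size() != tasks.size()) {
            throw new IllegalStateException("Dependencies contain a cycle");
        }
        return order;
    }

    public static void main(String[] args) {
        // Task ids and dependencies taken from the multitask example; declaration order is deliberately shuffled.
        List<String> tasks = Arrays.asList("taskD", "taskC", "taskB", "taskA");
        List<String[]> deps = Arrays.asList(
                new String[] {"taskB", "taskA"},
                new String[] {"taskC", "taskB"},
                new String[] {"taskD", "taskC"});
        System.out.println(executionOrder(tasks, deps)); // prints [taskA, taskB, taskC, taskD]
    }
}
```

The cycle check matters because a set of dependency tags is only a valid DAG if every task can eventually become ready.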
GTLAB DAG extensions: Condor DAGMan
• The GTLAB architecture is extensible
– Besides Globus Grid services, the Grid community also uses other common Grid services such as Condor
• We extend GTLAB to support Condor capabilities
– Condor DAGMan is a tool for complex application workflows on Condor
– Birdbath is a Web services provider of Condor capabilities
– GTLAB tags for DAGMan use the Birdbath service to create client stubs
• Grid tags integrate DAGMan with the following tags:
– <o:condorDagman/> and <o:condorSubmit/>
• GTLAB executes and monitors DAGs through Condor
• Composing DAGMan workflows is out of scope.
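For orientation only, this is roughly what a DAGMan input file for a simple chain looks like: one JOB line per node and one PARENT ... CHILD line per dependency. The sketch is not part of GTLAB, which executes and monitors existing DAGs rather than composing them. The task ids and the `<task>.sub` submit file names are hypothetical.

```java
class DagmanFileSketch {

    /** Builds DAGMan input text from task ids and {task, dependsOn} pairs. */
    static String toDagFile(String[] tasks, String[][] dependencies) {
        StringBuilder sb = new StringBuilder();
        for (String t : tasks) {
            // Submit description file name is a hypothetical convention for this sketch.
            sb.append("JOB ").append(t).append(' ').append(t).append(".sub\n");
        }
        for (String[] d : dependencies) {
            // d[0] depends on d[1], so d[1] is the parent node.
            sb.append("PARENT ").append(d[1]).append(" CHILD ").append(d[0]).append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String[] tasks = {"taskA", "taskB", "taskC"};
        String[][] deps = {{"taskB", "taskA"}, {"taskC", "taskB"}};
        System.out.print(toDagFile(tasks, deps));
    }
}
```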
Taverna workflows
• DAGs are well suited to piping Grid invocations together (dataflow)
• But DAGs cannot support full-fledged workflow capabilities such as conditional branches and loops.
– We studied Taverna as a test case.
– We can investigate Kepler and BPEL implementations as future extensions
• Workflow systems have different features:
– Composing
– Enacting
– Monitoring
• GTLAB supports Taverna enactment and monitoring.
• GTLAB imports well-studied built-in workflows collected by the community
– Bioinformatics workflows and their metadata are available
• Workflow composition is out of scope of this work
– There is much ongoing research in this area
Taverna use case
• A user interacts with a workflow portlet (left of picture) to utilize the Taverna enactor.
• The user provides parameters by submitting a web form that starts the sequence of events.
Advantages of GTLAB
• GTLAB makes science portals simpler to develop
– Rapid development
– Easy deployment
• Grid tags provide a rich selection of attributes to initialize Grid beans.
• Composite tasks can contain an unlimited number of subtasks
• GTLAB gives developers the flexibility to use their own Grid beans library or to add more Grid beans to the existing ones.
– Following the method name convention of GTLAB
• Grid beans can also be imported into any presentation logic
– GCEShell is a command-line tool
• Total overhead = Tsubmit - Trequest = 156 msec (this includes JSF overhead)
• The average overhead of GTLAB itself is about a few milliseconds
• GTLAB does not add significant delay to request processing.
              GTLAB Processing   JSF Processing   Handler storing   Submitting
Time (msec)   2                  153              1                 410
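Read against the timing figures, the 156 msec total appears to be the sum of the three steps that precede submission. Counting both GTLAB processing and handler storing as GTLAB's own share is an assumption made here for illustration.

```latex
T_{\text{overhead}} = T_{\text{submit}} - T_{\text{request}}
  = \underbrace{2}_{\text{GTLAB processing}}
  + \underbrace{153}_{\text{JSF processing}}
  + \underbrace{1}_{\text{handler storing}}
  = 156\ \text{msec}

\frac{T_{\text{GTLAB}}}{T_{\text{overhead}}} = \frac{2 + 1}{156} \approx 1.9\%
```

Under that reading, nearly all of the measured overhead comes from JSF processing rather than from GTLAB.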
GTLAB Related Work
• GridSphere’s Grid Portlets 1.3
– Grid Portlets 1.3 provide API and User Interface (UI) tags to build Grid portlets
– An effort called Vine (Portlet Vine) refactors Grid portlets and decouples the portlets from GridSphere
• Karajan
– XML based workflow language and engine for Grid computing
– Built on Java CoG Kit
– Requires additional effort to aggregate it into Grid portals
• MyGrid Portlet Interface (MPI)
– Based on Taverna workflows (Scufl)