Part II: e-Science and its Technologies
7.9 Other Grid Services
Here we discuss some services that are useful across several types of Grid and so were not covered in the previous two sections on Information and Compute/File Grids.
7.9.1 Grid Shell
The concept of a Grid Shell can be discussed at two levels. As remarked in section 7.8.5, a generalization of UNIX Shell commands underlies both the Globus and Legion systems, supporting job and file related capabilities. However, the concept of a set of “atomic” commands embodied in the Shell can be generalized to a set of important Grid services of wide-ranging applicability. Bill Joy exploited this in the peer-to-peer system JXTA [JXTA], whose Shell supports collaborative functions. Thus, as well as referring to the Grid-enhanced UNIX commands, we can use the shell terminology to describe a suite of Grid services. Further, in analogy to UNIX, one can develop a general Grid Service interface equivalent to sh in that it can invoke “any” other Service. This latter idea has so far been explored only superficially, at the Grid Computing Environments working group at GGF and at Indiana University with a prototype implementation [Fox03D] [Nacar03A].
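The idea of a shell-like interface that can invoke “any” other service can be illustrated with a minimal sketch. This is not the interface of JXTA, Globus, Legion or the Indiana prototype; the class name `GridShell`, its methods, and the two registered services are hypothetical and chosen only for illustration.

```python
import shlex


class GridShell:
    """Toy dispatcher: maps a command name to a registered service handler."""

    def __init__(self):
        self._services = {}

    def register(self, name, handler):
        # A handler is any callable that accepts string arguments.
        self._services[name] = handler

    def run(self, command_line):
        # Split the line the way a UNIX shell would, honouring quotes.
        tokens = shlex.split(command_line)
        if not tokens:
            return None
        name, args = tokens[0], tokens[1:]
        if name not in self._services:
            raise KeyError(f"unknown service: {name}")
        return self._services[name](*args)


# Hypothetical services standing in for real Grid services.
shell = GridShell()
shell.register("echo", lambda *words: " ".join(words))
shell.register("copy", lambda src, dst: f"copied {src} -> {dst}")

echo_result = shell.run("echo hello grid")
copy_result = shell.run('copy "my file.dat" remote:/data')
```

In a real deployment each handler would presumably wrap a remote service invocation rather than a local function; the sketch only shows the dispatch pattern.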
7.9.2 Accounting and Grid Economies
Accounting and the economics of Grid computing are very important areas. Accounting itself is closely linked to security issues, as suggested by the AAA (Authentication, Authorization and Accounting) category used in the Grid Forum and elsewhere. More exciting, but not necessarily more important, is the idea that the Grid suggests new approaches to charging and accounting for the use of resources. This idea has been extensively explored by the peer-to-peer community [Oram01A] with novel concepts of bartering, “the tragedy of the commons” (what happens if a resource is totally free) and digital cash. Such ideas are typically presented as computational economies or markets. Pioneers in this field include groups at Santa Barbara [Wolski03A] and at universities in Melbourne, Australia [GridBus] [EconomyGrid] [Buyya02A]. A new effort spanning both AAA and computational economies has just been started as part of the UK e-Science program (see appendix sections A.2.9.1.1 and A.2.9.1.6) [UKeSMarket].
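As a rough illustration of what a computational market might involve, the sketch below picks the cheapest resource offer that fits a user's budget. It is a deliberately simplified assumption, not the algorithm of any of the cited systems; the function name, the offer fields and the prices are all hypothetical.

```python
def cheapest_affordable(offers, cpu_hours, budget):
    """Return (total_cost, resource) for the cheapest offer within budget, or None."""
    affordable = [
        (offer["price_per_cpu_hour"] * cpu_hours, offer["resource"])
        for offer in offers
        if offer["price_per_cpu_hour"] * cpu_hours <= budget
    ]
    # Tuples compare by cost first, so min() selects the lowest total cost.
    return min(affordable) if affordable else None


# Hypothetical offers posted by resource owners (prices in arbitrary units).
offers = [
    {"resource": "cluster-a", "price_per_cpu_hour": 3},
    {"resource": "cluster-b", "price_per_cpu_hour": 2},
    {"resource": "cluster-c", "price_per_cpu_hour": 5},
]

choice = cheapest_affordable(offers, cpu_hours=10, budget=25)
no_choice = cheapest_affordable(offers, cpu_hours=10, budget=10)
```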
Logically distinct but operationally related are Grid technologies to support “parametric modeling” or the control of multiple independent but related jobs. In a typical case one explores a parameter space by launching several jobs with a single replicated code but distinct input parameters. Nimrod-G [Nimrod] and APST [Casanova03A] are well known tools in this area. Desktop Grids and Condor are related technologies.
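The parameter-space exploration described above can be sketched as follows: one replicated executable, with one job generated per combination of input parameters. This is only an illustration under assumed names; it does not use the plan or task syntax of Nimrod-G or APST, and the executable path, parameter names and job fields are hypothetical.

```python
from itertools import product


def expand_sweep(executable, parameters):
    """Return one job description per point in the parameter space."""
    names = sorted(parameters)  # fixed order so job numbering is reproducible
    jobs = []
    for index, values in enumerate(product(*(parameters[n] for n in names))):
        jobs.append({
            "job_id": f"sweep-{index:04d}",
            "executable": executable,
            "arguments": dict(zip(names, values)),
        })
    return jobs


# Hypothetical sweep: 3 temperatures x 2 pressures = 6 independent jobs.
jobs = expand_sweep(
    "./simulate",
    {"temperature": [280, 300, 320], "pressure": [1.0, 2.0]},
)
```

Each resulting job description is independent of the others, which is what makes this class of workload a natural fit for Desktop Grids and Condor-style scheduling.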
7.9.3 Fabric Management
An important area is fabric management, which means systematically controlling and monitoring the system hardware and software of Grid resources. The sensitivity of application software (such as that of the European DataGrid) to particular versions of Linux and other operating systems and tools emphasizes the importance of this area. LCFG [LCFG] was originally developed by the School of Informatics at the University of Edinburgh to manage the computers in their department. However, it has been successfully extended to larger scale applications involving the core software installations on both clusters and Grids. Curiously, there appears to be no such software in common use in the US, but there are two efforts to extend LCFG in Europe. In the near term, work package 4 of the European DataGrid has developed with Edinburgh an enhanced next generation LCFG(ng) [EDGWP4LCFG]. On a longer term view and with attention to research issues, Edinburgh has teamed with the Hewlett Packard laboratory in Bristol in the UK e-Science GridWeaver project [GridWeaver]. This builds on LCFG and the HP SmartFrog system supporting service specification and deployment [SmartFrog] for utility computing; see appendix section A.2.9.2.1.
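The version sensitivity noted above suggests one simple monitoring task for a fabric manager: comparing what is installed on each node against a declared profile. The sketch below assumes such a profile exists as a plain mapping; it is not the configuration format or API of LCFG or SmartFrog, and the node names, package names and version strings are illustrative only.

```python
def find_drift(desired, installed):
    """Return {package: (wanted, actual)} for every mismatch on one node."""
    drift = {}
    for package, wanted in desired.items():
        actual = installed.get(package)  # None means the package is missing
        if actual != wanted:
            drift[package] = (wanted, actual)
    return drift


# Hypothetical declared profile and per-node inventory.
desired = {"kernel": "2.4.20", "openssl": "0.9.7"}
nodes = {
    "node01": {"kernel": "2.4.20", "openssl": "0.9.7"},
    "node02": {"kernel": "2.4.18"},
}

report = {name: find_drift(desired, inventory) for name, inventory in nodes.items()}
```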
7.9.4 Visualization, Datamining and Computational Steering
It is clear that data analysis, and in particular visualization, are important Grid services. Although scientific visualization is critical and has been pursued intensely, there is no obvious consensus architecture, and correspondingly it is hard to design general Grid services to support this field. The well-known Utah group that developed SCIRun is packaging its visualization system as a Grid service as part of the NCSA Alliance portal activity [NCSAGAMS]. A UK e-Science workshop [UKeSSDMIV] reviewed both visualization and the related datamining issues. GADS [GADS] is addressing some of these issues in environmental science.
One reason for real-time visualization and analysis is to support computational steering: adapting in real time the execution of a program based on its initial results. Such a capability is naturally thought of as a service (related a little to debugging) where the user is presented with a portal displaying the current results and offering the ability to change the parameters defining the execution of a remote job. The work of Parashar’s group at Rutgers pioneered this in the DISCOVER project [Mann03A], which had a Grid service structure implemented in CORBA. The UK e-Science RealityGrid project (appendix section A.2.9.1.3 and [RealityGrid]) is a major new effort aimed at controlling material science simulations via Grid-enabled computational steering.
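The steering loop described above can be sketched in a few lines: the running job reports intermediate results and, between steps, applies any parameter changes the user has sent. This is a single-process illustration under assumed names, not the design of DISCOVER or RealityGrid; the `rate` parameter, the inbox queue and the `report` callback are hypothetical stand-ins for a remote job and a portal.

```python
import queue


def run_steered(steps, params, steering_inbox, report):
    """Advance a toy simulation, applying steering updates between steps."""
    state = 0.0
    for step in range(steps):
        # Drain every parameter update that arrived since the last step.
        while True:
            try:
                update = steering_inbox.get_nowait()
            except queue.Empty:
                break
            params.update(update)
        state += params["rate"]  # one step of the "simulation"
        report(step, state)      # expose the current result to the user
    return state


inbox = queue.Queue()
history = []


def report(step, state):
    history.append((step, state))
    # Stand-in for a user who inspects the step-1 result and slows the run.
    if step == 1:
        inbox.put({"rate": 0.5})


final_state = run_steered(4, {"rate": 1.0}, inbox, report)
```

In this sketch the update sent after step 1 takes effect from step 2 onward, so the job changes behaviour without being restarted.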
7.9.5 Collaboration
We have already discussed collaboration in terms of notification (section 7.5) and information aggregation (section 7.6.3) services. These provide the core technology to support the object sharing that is at the heart of collaboration. Collaborative systems support three types of capability:
a) Audio-video conferencing
b) Common tools like whiteboards, text chat and instant messenger
c) More general shared applications such as PowerPoint or perhaps a scientific visualization.
Further, there is a control system that sets the details of a collaboration, such as which clients and applications are involved. In a Grid or web service architecture, one can implement this elegantly [Fox03C] with a web service for the control system and each shared capability constructed as an individual or replicated web service, where one shares either the input ports (replicated) or the output ports (single instance with shared view).
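The two sharing models just described can be contrasted in a small sketch. In the shared-input model every client has its own replica and input events are broadcast to all replicas; in the shared-output model a single instance processes the events and its rendered view is distributed. The classes below are hypothetical and run in one process; they are not the architecture of [Fox03C], only an illustration of the distinction.

```python
class Whiteboard:
    """Toy shared application: inputs are strokes, output is a rendered view."""

    def __init__(self):
        self.strokes = []

    def handle(self, event):
        self.strokes.append(event)

    def view(self):
        return " | ".join(self.strokes)


class SharedInputSession:
    """Replicated model: one instance per client, input events are broadcast."""

    def __init__(self, clients):
        self.replicas = {client: Whiteboard() for client in clients}

    def send(self, event):
        for replica in self.replicas.values():
            replica.handle(event)

    def view_for(self, client):
        return self.replicas[client].view()


class SharedOutputSession:
    """Single-instance model: one application, its output view is distributed."""

    def __init__(self, clients):
        self.instance = Whiteboard()
        self.views = {client: "" for client in clients}

    def send(self, event):
        self.instance.handle(event)
        rendered = self.instance.view()
        for client in self.views:
            self.views[client] = rendered

    def view_for(self, client):
        return self.views[client]


clients = ["alice", "bob"]
replicated = SharedInputSession(clients)
single = SharedOutputSession(clients)
for session in (replicated, single):
    session.send("line")
    session.send("circle")
```

Both sessions leave every client with the same view; they differ in where the application state lives and in what has to travel over the network (small input events versus rendered output).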
Best practice for collaboration in the Grid arena is the Access Grid, which is being repackaged using Globus and Grid service technologies as part of the SWOF (Scientific Workspaces of the Future) project [SWOF]. This focuses on high-end videoconferencing (category a) above) and can be linked to commodity technology through VRVS from Caltech [VRVS], which is popular especially in the particle physics community. CHEF [CHEFportal] provides excellent implementations of some of the tools in category b) and is being developed as part of NeesGrid [NeesGrid-A]. CoAKTinG [CoAKTinG] is a UK project examining enhancements to Access Grid based collaboration. Indiana University has developed a Web service approach [Uyar03A] to cover all aspects of collaboration. The XGSP (XML General Session Protocol) system links the Access Grid, SIP and H.323 approaches with a common controller; using NaradaBrokering [NaradaBrokering] as a distributed messaging broker, it can scale to very many simultaneous users. This can integrate with Polycom [Polycom] commercial H.323 video conferencing and WebEx [WebEx] style collaborative applications. However, this system is still only a research prototype.
7.9.6 Packaging
The Grid tools need to be packaged to allow robust and convenient deployment. One can use technologies like RPM [RPM], familiar from the Linux community, but specialized systems have also been developed for Grid software. The NSF middleware initiative uses Gridconfig to manage the configuration of its software components [Gridconfig] and GPT (Grid Packaging Tool) [GPT] for packaging. Pacman [Pacman-A] from the high energy physics community is used by the VDT [Pacman-B].
7.9.7 Other Technologies
There are of course many other tools needed and available to support Grid applications, and these can be thought of as parts of the emerging Grid Shell introduced in section 7.9.1. Two examples developed by the UK particle physics GridPP project are SlashGrid [SlashGrid], which is a framework for Grid-aware file systems (see section 7.8.6), and GridSite [GridSite]. GridSite is a Web site content management system analogous to the Apache Slide system [Slide], which supports the IETF WebDAV distributed authoring and versioning standard [WebDAV]. Discussion of these and other technologies can be found in the appendix, sections A.2.9.1.5, A.2.9.2.3 and A.2.9.2.4.