1 Understanding the Cloud Computing Landscape
8.2 Computing Environments
8.2.1 Institutional Grid
Institutional Grid Computing is designed to address large-scale computational problems using a network of shared resources. The major motivation is to use aggregated resources that can include computing, storage, and network, and are provided by multiple geographically distributed institutions. These grids are mainly focused on integrating existing resources with their hardware, operating systems (OSs), local resource management, and security infrastructure in order to form a virtual organization (VO) [2]. For example, in the OSG [8] or EGEE [5], when a project joins a VO, it contributes some of its own resources to the overall collaboration while being able to take advantage of the other resources shared within the organization. However, the resource provider maintains control over their own resources and may decide how and when to share them with others.
This system works on the principle that not all the resources are needed at the same time, and when a project does not need its own resources, these unutilized computing cycles are made available to others in the broader collaboration/VO.
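The sharing principle can be illustrated with a minimal sketch. It assumes a toy model in which each provider reports how many cores it uses itself and decides whether to offer the rest. The `Site` class, the `borrow` function, the site names, and the core counts are all hypothetical and do not correspond to any real grid middleware.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Site:
    """A resource provider in a hypothetical virtual organization (VO)."""
    name: str
    total_cores: int
    cores_in_use: int = 0     # cores currently busy at this site
    share_idle: bool = True   # the provider decides whether to share

    def shareable_cores(self) -> int:
        # The provider keeps control: nothing is offered unless it opts in.
        idle = self.total_cores - self.cores_in_use
        return idle if self.share_idle else 0


def borrow(sites: list[Site], requester: str, cores_needed: int) -> dict[str, int]:
    """Greedily lend idle cores from the other sites to the requester.

    Returns a mapping of site name to cores lent. The total may fall
    short of the request if the VO has too little idle capacity.
    """
    allocation: dict[str, int] = {}
    remaining = cores_needed
    for site in sites:
        if remaining == 0:
            break
        if site.name == requester:
            continue  # a project does not borrow from itself
        granted = min(site.shareable_cores(), remaining)
        if granted > 0:
            allocation[site.name] = granted
            site.cores_in_use += granted
            remaining -= granted
    return allocation


# Illustrative numbers only.
sites = [
    Site("physics", total_cores=100, cores_in_use=100),
    Site("chemistry", total_cores=64, cores_in_use=16),
    Site("biology", total_cores=32, cores_in_use=0, share_idle=False),
    Site("astronomy", total_cores=48, cores_in_use=8),
]
allocation = borrow(sites, requester="physics", cores_needed=60)
print(allocation)
```

In this sketch the fully loaded physics project obtains cores from chemistry and astronomy, while biology, which has opted out of sharing, lends nothing even though it is idle. Real grids apply far richer policies than this greedy loop.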
Other examples of Institutional Grids are the TeraGrid [7] and DEISA [17], which provide a large-scale computational platform for a number of different sciences.
Instead of funding individual clusters or high-performance servers for individual projects, grids pool together financial resources to deliver high-performance computing (HPC) to a broad range of applications. As an example, Figure 8.1 shows the variety of scientific applications running on EGEE today. Initiatives such as the TeraGrid and DEISA are building a cooperative HPC ecosystem, and research projects can apply for allocations of compute cycles that allow them to execute jobs on particular HPC centers or across a number of these centers.
Most grid deployments adopt a layered architecture [2] for the infrastructure. Figure 8.2 presents one possible high-level view of these layers. The hardware layer reflects the physical components of the infrastructure: the characteristics of the processor or cluster, its architecture, and all the physical machine attributes that fully describe the platform. The network layer covers the connectivity of the distributed resources orchestrated in the platform and provides information about the latency and bandwidth of the links, the network protocol, etc. The data storage layer describes the storage and file system space available and the protocols used to access data across the different resources. Finally, the software layer covers a broad spectrum of logical components, from the middleware that manages the infrastructure to the scientific applications running on the platform.
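One way to picture these layers is as a structured description of each resource. The sketch below is an assumption made for illustration: the class names, field names, site names, and all values are hypothetical, and real grid information systems use their own schemas.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class HardwareLayer:
    architecture: str          # e.g. processor architecture
    cores: int
    memory_gb: int


@dataclass
class NetworkLayer:
    protocol: str
    latency_ms: float
    bandwidth_gbps: float


@dataclass
class StorageLayer:
    capacity_tb: float
    access_protocol: str       # how data is reached across resources


@dataclass
class SoftwareLayer:
    middleware: str            # manages the infrastructure
    applications: List[str] = field(default_factory=list)


@dataclass
class GridResource:
    """One resource described by the four layers discussed in the text."""
    name: str
    hardware: HardwareLayer
    network: NetworkLayer
    storage: StorageLayer
    software: SoftwareLayer


def meets_requirements(resource: GridResource, min_cores: int,
                       min_storage_tb: float, middleware: str) -> bool:
    """Check a (hypothetical) user requirement against the layer data."""
    return (resource.hardware.cores >= min_cores
            and resource.storage.capacity_tb >= min_storage_tb
            and resource.software.middleware == middleware)


# Illustrative values only; they describe no real site.
resources = [
    GridResource("site-a",
                 HardwareLayer("x86_64", cores=512, memory_gb=2048),
                 NetworkLayer("TCP/IP", latency_ms=0.5, bandwidth_gbps=10.0),
                 StorageLayer(capacity_tb=200.0, access_protocol="GridFTP"),
                 SoftwareLayer("gLite", ["simulation"])),
    GridResource("site-b",
                 HardwareLayer("x86_64", cores=64, memory_gb=256),
                 NetworkLayer("TCP/IP", latency_ms=20.0, bandwidth_gbps=1.0),
                 StorageLayer(capacity_tb=20.0, access_protocol="GridFTP"),
                 SoftwareLayer("Globus Toolkit")),
]
eligible = [r.name for r in resources
            if meets_requirements(r, min_cores=128, min_storage_tb=50.0,
                                  middleware="gLite")]
print(eligible)
```

Keeping the layers as separate records mirrors the separation in the figure: a requirement on cores, storage, or middleware can be checked against one layer without touching the others.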
The Institutional Grid provides several benefits as follows:
◾ Support for the scientific community and resource providers. Because Institutional Grids are designed to serve multiple projects and are hosted at several sites, there is a community of experienced users and providers who can offer help and advice. For example, there are the EGEE User Forums [19], and many training events are organized [20]. The OSG maintains a Grid Operation Center, and the TeraGrid maintains a help desk and other means of providing support and outreach [7].
◾ There is almost no restriction on the type of application that can be run on the infrastructure. For the most part, batch execution through queue systems is supported, but there are also solutions that support interactive applications [21–23].
Figure 8.2 Grid layers and services. (Labels shown: Platform, Software, Data storage, Network, Hardware.)
Figure 8.1 Number of users per application domain. (Domains shown: High-energy physics, Infrastructure, Multidisciplinary VOs, Computational chemistry, Others.) (From EGEE: CIC Operations Portal, http://cic.gridops.org/index.php?section=home&page=volist. With permission.)
◾ Resources are dedicated when available. Once a job submitted to the queue management system is released from the queue, it executes on dedicated resources.
◾ Diversity and large scale are also two strengths of Institutional Grids. Many sites participate in the grid, so there is potentially a large number of resources that can meet user requirements in terms of characteristics and availability.
◾ Institutional Grids are collaboration-oriented and provide a secure model to share data [2]. The model has been built on the idea that giving access to shared space and distributed computing resources helps researchers from different teams conduct joint scientific projects [24].
Despite all the benefits listed above, Institutional Grids also suffer from some drawbacks as follows:
◾ The Institutional Grid is a shared environment, in the sense that resources are made available to many users belonging to various collaborations. Thus, users compete for resources. When a user’s job is submitted to the system, it is placed in a batch queue where it is prioritized based on the site policies. The start time of the job will depend on the load and the scheduling policy of the system [25].
◾ The environment and the middleware are somewhat rigid and constrained. Institutional Grids are designed to serve scientists from many domains, including archeology, astronomy, earth sciences, finance, and life sciences [5]. As a result, grids provide a generic software execution environment and tools. This leaves users to interface their applications to the existing middleware, which can be difficult and usually requires a significant amount of learning. To help alleviate this problem, high-level tools are being developed to assist users, among them workflow management systems [26–28] and application-level interfaces [29,30]. Institutional Grids also spend a significant amount of effort on user outreach and education, helping new users take advantage of the distributed resources. Finally, scientific communities often come together to provide community-based infrastructure such as science portals [31], which make it easier for a large number of users to run common applications [32].
◾ Variability, evolution, and changeability of the grid middleware and the computing environment. Grid software has been evolving over time to match the needs of the community and the understanding of the computational platform. For example, the Globus Toolkit, which is widely used on today’s infrastructure, has been released in various versions over the years (sometimes in incompatible ways), at times providing custom interfaces (GT 2.0) and at times relying on a standard that was not supported in the long term (OGSI and the GT 3.0 release). The same is true of EGEE and its middleware gLite [33]. Users are thus left struggling to adapt their applications to the new middleware as the older software releases are no longer supported.
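To make the first drawback in the list concrete, the following minimal sketch shows how a job's start time in a shared batch queue can depend on both the load and the site policy. It assumes a deliberately simple, hypothetical policy (a fixed priority per user category, then submission order) and a fixed number of execution slots. The names `VO_PRIORITY`, `QueuedJob`, and `schedule` are invented for this example. Production schedulers use more elaborate policies.

```python
import heapq
from dataclasses import dataclass, field

# Hypothetical site policy: lower number means higher priority.
VO_PRIORITY = {"local": 0, "partner": 1, "guest": 2}


@dataclass(order=True)
class QueuedJob:
    priority: int                       # derived from the site policy
    submit_order: int                   # tie-breaker: first come, first served
    name: str = field(compare=False)
    runtime: int = field(compare=False)


def schedule(jobs, slots):
    """Compute start times for jobs that are all queued at time 0.

    jobs  : list of (name, user_category, runtime) in submission order
    slots : number of jobs the site can run at once
    Returns a dict mapping job name to its start time.
    """
    queue = [QueuedJob(VO_PRIORITY[category], order, name, runtime)
             for order, (name, category, runtime) in enumerate(jobs)]
    heapq.heapify(queue)
    free_at = [0] * slots               # min-heap: when each slot becomes free
    start_times = {}
    while queue:
        job = heapq.heappop(queue)      # highest-priority waiting job
        start = heapq.heappop(free_at)  # earliest available slot
        start_times[job.name] = start
        heapq.heappush(free_at, start + job.runtime)
    return start_times


# Illustrative workload: the guest job is submitted first.
jobs = [
    ("guest-sim", "guest", 5),
    ("local-fit", "local", 10),
    ("partner-scan", "partner", 3),
    ("local-plot", "local", 2),
]
start_times = schedule(jobs, slots=2)
print(start_times)
```

Although the guest job is submitted first, the policy places it behind the local and partner jobs, so it starts only when a slot is released. Changing the number of slots or the priorities in `VO_PRIORITY` changes every start time, which is the dependence on load and scheduling policy described in that drawback.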