2.5 Scientific Computing Using Grids

2.5.1 WSRF Compliant Grids

There is a clear distinction between Web services and Grid services and the roles they play as distributed computing technologies. Stockinger [172] notes that both Web and Grid services were designed for wide area distributed computing. Typically, these services facilitate access to computation power and storage resources by advertising functionality using the same mechanisms. The most important differences between the two do not lie in the way they are advertised, discovered and addressed but in their purpose.

The main purpose of a Web Service is to permit communication over the network between clients and service providers using standards that guarantee communication interoperability. The goals of Grid Services go beyond those of Web services because they aim to offer mechanisms that allow interconnection of generically named Grid resources in one computational platform. For Grids, any computational resource, from processing units to printers and sensors, may be abstracted based on its functionality and attributes. Grid services provide generic mechanisms to allow integration of Grid resources in wide area distributed computing architectures.

The current standard for describing Grid services has its foundations in the Web Services Resource Framework (WSRF) [143]. The WSRF standard describes a set of mechanisms for easy integration and management of resources in distributed environments that are built on the Web services standards. The first initiative to augment Web services was proposed by the Open Grid Services Architecture (OGSA) [93], but that initiative was only an intermediary step towards WSRF compliant Grid Services. With WSRF, Grid services have integrated the best features of both the Web services and the OGSA services worlds: on the one hand interoperability, and on the other hand a mechanism that allows state to persist at the service level. As noted in [172], a Grid Service is in fact an augmented Web Service that implements mechanisms for storing state information persistently, beyond the lifetime of a single request, rather than transiently.

The newer REST [88] architectural style for Web Services requires each request to carry all the information the server needs to handle it; therefore no state information should be kept at the server level. The benefits of this approach [88] do not carry over to Grid architectures because their aim is different. For Grids, statefulness is an important feature that prevents unnecessary network communication and enables data sharing, even between multiple clients. The WS-Resource standard, part of WSRF, specifies a core set of XML languages to be used for describing resources and their properties and defines a set of standard management protocols that should be used in conjunction with resources.

Each Grid service has to describe the resources that are made available to external users as XML documents, and each resource has to be uniquely identifiable. To access a resource, a client obtains the identifier of the resource from a factory service and, in subsequent calls, uses the identifier to specify the resource to which the call should be applied.

The extensions that Grid Services define on top of Web Services go beyond the syntactic level because they enhance the capabilities of Web Services with consistent mechanisms that clients can rely upon across all Grid services. The WSRF standard is therefore a collection of specifications related to the management of WS-Resources that are guaranteed to provide the same functionality across all Grid service providers. These mechanisms not only add capabilities that could be useful for the Web Services world but also modify the architectural model of the applications that are based on Web Services.

The first important change is introduced by the WS-Addressing [188] and WS-Resource [143] specifications. An instance of a regular Web Service is stateless and is not designed to remember prior events. In order to create stateful services, state information stored at service level must be managed by the service and made available for future invocations. Regular Web Services may be designed to implement such behaviour, but without a standard this clutters the interface of the Web Service and complicates the invocation process because the client must send the needed information explicitly.

Figure 2.3: Structure of a Grid Service (diagram labels: Service; WS-Resource and Resource Properties, three of each; End Point Reference 1, 2 and 3)

The WS-Addressing specification defines a two-level invocation mechanism that allows automatic attachment of a session identifier within the header of the message. The SOAP message is therefore submitted to a URL that identifies the service, and the header information is used at the service level to identify the session.

Two invocations sent to the same service will differ in execution based on the state of the targeted resource.
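The two-level invocation described above can be illustrated with a minimal Python sketch. It is not real SOAP or WS-Addressing: the `Message` and `CounterService` classes, the `ResourceKey` header name and the example URL are all hypothetical and chosen only to show that the URL selects the service while a header entry selects the state the call operates on.

```python
from dataclasses import dataclass


@dataclass
class Message:
    """Simplified stand-in for a SOAP message (hypothetical, not real SOAP)."""
    to: str          # first level: URL that identifies the service
    headers: dict    # second level: header entries, including the resource key
    operation: str   # operation requested in the message body


class CounterService:
    """Hypothetical stateful service that keeps one counter per resource."""

    def __init__(self, url, initial_resources):
        self.url = url
        self.resources = dict(initial_resources)  # resource key -> counter value

    def handle(self, msg):
        # Level 1: the message must be addressed to this service.
        if msg.to != self.url:
            raise ValueError("message addressed to another service")
        # Level 2: the header selects the targeted resource.
        key = msg.headers["ResourceKey"]
        if key not in self.resources:
            raise KeyError(f"unknown resource: {key}")
        if msg.operation == "increment":
            self.resources[key] += 1
            return self.resources[key]
        raise ValueError(f"unsupported operation: {msg.operation}")


URL = "http://example.org/counter"  # assumed address, for illustration only
service = CounterService(URL, {"a": 0, "b": 10})
first = service.handle(Message(URL, {"ResourceKey": "a"}, "increment"))
second = service.handle(Message(URL, {"ResourceKey": "b"}, "increment"))
```

Both messages go to the same URL and request the same operation, yet they return different values because they target resources in different states.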

Closely related to this mechanism is the specification that describes the structure of the persistent information that is stored as resources. Generically called a resource, this concept can be used to describe any informational attributes of the entities that the service interface offers access to. While simple Web Services expose a set of operations that an external user may invoke, Grid services are closer to the OOP paradigm. The attributes of an object are mapped to WS-ResourceProperties, and the whole structure of the resource is advertised within the WSDL document describing the Grid Service. The WS-ResourceProperties specification also describes the mechanisms that should be implemented to allow seamless access to the content of the resource. Standard operations for setting and getting the properties of a resource may be part of the default interface of the Grid service.
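The mapping between object attributes and resource properties can be sketched as follows. This is only an illustration under stated assumptions: `JobResource`, its property names and the `JobProperties` element are hypothetical, and the generic `get_property` and `set_property` methods merely mimic the idea of standard getter and setter operations rather than reproduce the WSRF message formats.

```python
import xml.etree.ElementTree as ET


class JobResource:
    """Hypothetical resource whose attributes are exposed as named properties."""

    def __init__(self, owner, status):
        # The declared properties play the role of the advertised resource structure.
        self._properties = {"Owner": owner, "Status": status}

    def get_property(self, name):
        """Generic getter: works for any declared property."""
        return self._properties[name]

    def set_property(self, name, value):
        """Generic setter: rejects properties that were never declared."""
        if name not in self._properties:
            raise KeyError(f"undeclared property: {name}")
        self._properties[name] = value

    def to_xml(self):
        """Render the current properties as a small XML document."""
        root = ET.Element("JobProperties")
        for name, value in self._properties.items():
            ET.SubElement(root, name).text = str(value)
        return ET.tostring(root, encoding="unicode")


job = JobResource(owner="alice", status="queued")
job.set_property("Status", "running")
document = job.to_xml()
```

Because the getter and setter are generic, a client needs no operation specific to each attribute; knowing the advertised property names is enough.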

A common implementation pattern for Grid Services is the Factory Pattern, in which two services are used in tandem. The factory service is a stateless service that the client calls at the first invocation. The role of the factory service is to initialize a new resource object that is kept in memory and to send back to the client an End-Point Reference (EPR) that contains the information necessary to interact further with the newly created resource. The EPR contains the URL of the service and a unique identifier of the resource.
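A minimal in-memory sketch of this pattern is given below. The class names (`EndpointReference`, `InstanceService`, `FactoryService`), the list-valued resource state and the example URL are assumptions made for illustration; a real deployment would exchange SOAP messages rather than Python objects.

```python
import uuid
from dataclasses import dataclass


@dataclass(frozen=True)
class EndpointReference:
    """Hypothetical EPR: service URL plus a unique resource identifier."""
    service_url: str
    resource_id: str


class InstanceService:
    """Hypothetical service that operates on resources selected by an EPR."""

    def __init__(self, url):
        self.url = url
        self._resources = {}  # resource_id -> in-memory state (a list here)

    def add_resource(self, resource_id, state):
        self._resources[resource_id] = state

    def append(self, epr, item):
        self._resources[epr.resource_id].append(item)

    def read(self, epr):
        return list(self._resources[epr.resource_id])


class FactoryService:
    """Stateless factory: creates a resource and hands back its EPR."""

    def __init__(self, instance_service):
        self._instance = instance_service

    def create_resource(self):
        resource_id = uuid.uuid4().hex  # unique identifier of the new resource
        self._instance.add_resource(resource_id, [])
        return EndpointReference(self._instance.url, resource_id)


instance = InstanceService("http://example.org/instance")  # assumed URL
factory = FactoryService(instance)
epr_a = factory.create_resource()   # first call goes to the factory
epr_b = factory.create_resource()
instance.append(epr_a, "step-1")    # later calls use the EPR
```

The factory itself keeps no per-client state; everything a client needs for later calls travels in the EPR it receives.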

The resource created and associated with the Grid Service is intended to outlast a single call. As a consequence, the Grid Service must implement lifetime management features that control the lifetime of a WS-Resource and determine under which circumstances the memory allocated for the resource must be freed and the resource destroyed. The WS-ResourceLifetime [144] specification describes mechanisms that allow seamless management of the resource lifecycle, which may be extremely complex. Resources are created and modified in response to user actions, and their lifetime spans multiple user calls according to the purpose of the application. Unless kept alive by subsequent calls, a resource can expire based on the initial setting specified at its creation. An explicit destruction request is not required for the resource to be destroyed after expiration.
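The interplay between scheduled expiration, keep-alive calls and explicit destruction can be sketched as below. This is a simplified model under stated assumptions: `LifetimeManager` and its method names are hypothetical, every access renews the lifetime by a fixed `ttl`, and the current time is passed in explicitly so the behaviour is deterministic.

```python
class LifetimeManager:
    """Hypothetical holder of resources with a scheduled termination time."""

    def __init__(self, ttl):
        self.ttl = ttl
        self._resources = {}  # resource id -> (state, termination time)

    def create(self, rid, state, now):
        # The initial lifetime is fixed when the resource is created.
        self._resources[rid] = (state, now + self.ttl)

    def access(self, rid, now):
        """Use a resource; a successful call also keeps it alive."""
        self.sweep(now)
        state, _ = self._resources[rid]  # KeyError if expired or unknown
        self._resources[rid] = (state, now + self.ttl)
        return state

    def destroy(self, rid):
        """Immediate destruction on explicit request."""
        del self._resources[rid]

    def sweep(self, now):
        """Scheduled destruction: drop every resource whose time has passed."""
        expired = [rid for rid, (_, end) in self._resources.items() if end <= now]
        for rid in expired:
            del self._resources[rid]
        return expired


manager = LifetimeManager(ttl=10)
manager.create("a", {"job": 1}, now=0)
manager.create("b", {"job": 2}, now=0)
manager.access("a", now=5)          # renews "a" until time 15
expired_at_12 = manager.sweep(now=12)
```

In this run only the resource that was never touched again expires at time 12; no client had to ask for its destruction.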

The interaction with a service should be standardized as much as possible to make sure that the aim of complete interoperability is achieved. The invocation of a Grid Service may raise exceptions that describe the problems that prevented a successful execution. The WS-BaseFaults [141] specification provides standard error types that the application may use to inform clients that errors have occurred during the execution of the service.
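The benefit of a common fault structure can be illustrated with a small sketch. The `ServiceFault` class, its fields (timestamp, error code, originator, description, cause) and the error codes are chosen for illustration only and are not taken from the WS-BaseFaults schema; the point is that every service reports errors in the same shape, so clients can process them uniformly.

```python
from datetime import datetime, timezone


class ServiceFault(Exception):
    """Hypothetical uniform fault raised by every service in this sketch."""

    def __init__(self, description, error_code, originator, cause=None):
        super().__init__(description)
        self.timestamp = datetime.now(timezone.utc)  # when the fault occurred
        self.description = description
        self.error_code = error_code
        self.originator = originator  # address of the service that failed
        self.cause = cause            # optional nested fault

    def chain(self):
        """Return the error codes from this fault down to the root cause."""
        codes, fault = [], self
        while fault is not None:
            codes.append(fault.error_code)
            fault = fault.cause
        return codes


def read_input(path):
    # Hypothetical low-level failure inside a storage service.
    raise ServiceFault("input not found: " + path, "STORAGE_MISSING",
                       "http://example.org/storage")


def run_job(path):
    try:
        read_input(path)
    except ServiceFault as low_level:
        # Wrap the original fault so the client can still see the root cause.
        raise ServiceFault("job could not start", "JOB_FAILED",
                           "http://example.org/jobs", cause=low_level)
```

A client catching `ServiceFault` can inspect the same fields regardless of which service produced the error, and can follow `cause` to reach the original problem.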

The number of Grid Services that are exposed by a Grid node and the number of resources that are instantiated as a result of client calls may be high. Useful mechanisms that allow grouping multiple services together for easier management are specified by the WS-ServiceGroup [145] specification. Services can be added to and removed from a group, and a group can be searched for members that match a search condition. Besides describing the simple mechanisms to manage services, the WS-ServiceGroup specification describes a set of guidelines on effective service grouping and management.
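The group operations mentioned above (adding, removing and searching members) can be sketched as follows. The `ServiceGroup` class, the use of plain strings as member references and the content dictionaries are assumptions for illustration, not the structures defined by the specification.

```python
class ServiceGroup:
    """Hypothetical group of services, each described by a content dictionary."""

    def __init__(self):
        self._entries = {}  # member reference (string) -> descriptive content

    def add(self, reference, content):
        self._entries[reference] = dict(content)

    def remove(self, reference):
        del self._entries[reference]

    def search(self, condition):
        """Return the references of all members whose content meets the condition."""
        return [ref for ref, content in self._entries.items() if condition(content)]


group = ServiceGroup()
group.add("http://example.org/solver-1", {"kind": "solver", "cpus": 8})
group.add("http://example.org/solver-2", {"kind": "solver", "cpus": 64})
group.add("http://example.org/store-1", {"kind": "storage", "cpus": 2})
large_solvers = group.search(lambda c: c["kind"] == "solver" and c["cpus"] >= 16)
```

Expressing the search condition as a predicate over the member content keeps the group itself independent of what its members describe.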

The Grid paradigm foresees the creation of complicated computational infrastructures based on the resources and associated services that participate in a Virtual Organization (VO). The aim of such infrastructures is to provide a suitable environment for solving large scale problems. The tasks to be executed to solve large scale problems may require a long time to complete. Therefore, Grid systems should support asynchronous calls to enable non-blocking computation flows. The WSRF describes such mechanisms as part of the WS-Notification [142] specification.

WS-Notification defines two types of services: notification producers and notification consumers. Consumers register themselves with one or more producers to be notified in case a specified type of event occurs. Typically, the consumer registers to be notified for changes that occur in a certain WS-Resource. Any update in the internal state of the resource can therefore be advertised to interested consumers. This behaviour is especially useful with long running tasks, such as the ones that often occur in scientific problems.
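The producer and consumer roles can be sketched with a simple in-process example. The `NotificationProducer` class, the topic names and the callback signature are hypothetical; in an actual Grid the notification would be a message delivered to a remote consumer service rather than a function call.

```python
class NotificationProducer:
    """Hypothetical producer that notifies consumers when a property changes."""

    def __init__(self):
        self._state = {}        # property name -> current value
        self._subscribers = {}  # property name -> list of consumer callbacks

    def subscribe(self, topic, consumer):
        """Register a consumer callback for changes of one property."""
        self._subscribers.setdefault(topic, []).append(consumer)

    def update(self, topic, value):
        """Change the state and advertise the update to interested consumers."""
        self._state[topic] = value
        for consumer in self._subscribers.get(topic, []):
            consumer(topic, value)


received = []  # what the consumer has been told so far


def progress_consumer(topic, value):
    received.append((topic, value))


producer = NotificationProducer()
producer.subscribe("Progress", progress_consumer)
producer.update("Progress", 25)       # delivered to the consumer
producer.update("Temperature", 300)   # no subscriber, nothing delivered
producer.update("Progress", 100)
```

The consumer never polls the producer; it is called only when the property it registered for changes, which is what makes the approach suitable for long running tasks.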

One important aspect in distributed architectures is the need for a client to know the address of the remote service. The address can be known in advance, or the client can be expected to discover the services that provide the required functionality. The standard discovery mechanism used by plain Web Services relies on UDDI [22] registries. The level of cohesion between Grid Services is higher than that of Web Services because they are part of a certain VO. Another significant difference is the additional information stored in the associated resources, which can be meaningful for the discovery process. As a result, the discovery mechanisms implemented by UDDI registries were replaced by a set of hierarchical index services. These services are compliant with the WSRF specification, and standard enquiry calls can be formulated to retrieve information.
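A hierarchy of index services can be pictured with the following sketch, in which a query issued at an upper-level index is also forwarded to the indexes registered below it. The `IndexService` class, the VO and site names and the property dictionaries are hypothetical; the sketch shows only the hierarchical lookup idea, not an actual index service interface.

```python
class IndexService:
    """Hypothetical index that stores local entries and aggregates child indexes."""

    def __init__(self, name):
        self.name = name
        self._entries = []   # (service reference, resource properties)
        self._children = []  # lower-level indexes, e.g. one per site

    def register(self, reference, properties):
        self._entries.append((reference, dict(properties)))

    def add_child(self, child):
        self._children.append(child)

    def query(self, condition):
        """Return matching references from this index and all indexes below it."""
        found = [ref for ref, props in self._entries if condition(props)]
        for child in self._children:
            found.extend(child.query(condition))
        return found


vo_index = IndexService("vo")        # assumed top-level index of the VO
site_a = IndexService("site-a")
site_b = IndexService("site-b")
vo_index.add_child(site_a)
vo_index.add_child(site_b)
site_a.register("http://a.example.org/solver", {"kind": "solver", "free_cpus": 4})
site_b.register("http://b.example.org/solver", {"kind": "solver", "free_cpus": 32})
site_b.register("http://b.example.org/store", {"kind": "storage", "free_cpus": 0})
candidates = vo_index.query(lambda p: p["kind"] == "solver" and p["free_cpus"] >= 8)
```

Because the query condition ranges over resource properties, discovery can take the current state of a resource into account, which a plain registry of service descriptions would not capture.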