Suitable Distributed Technologies for Symbolic Computing

The computational infrastructure that we intend to use for large-scale symbolic computation problems is heterogeneous and highly dynamic. The computational resources required by large symbolic computation problems can be obtained by bringing together geographically scattered resources provided by research institutions and universities willing to share their computing power. Although such institutions are willing to share their resources, their already established computational domains and enforced rules cannot be easily changed. Any system that wants to use such resources must be versatile enough to cope with the particularities of individual computational nodes and their respective computational domains. Communication technologies, system-wide rules and software tools have to be carefully selected to ensure compatibility with the computational infrastructure on which the system is built.

The early distributed systems for symbolic computation had to rely on the technologies available at the time, such as RMI [9] or CORBA [183]. Their primary goal was to provide small to medium scale systems that would usually use hardware resources provided by the local computational domain. One such example is the framework described in [159]. Even though they aimed to create systems that rely on distributed computational resources, both MathWeb [94] and JavaMath [170] have the disadvantage of using technologies that are not viable for systems that span multiple computational domains. The first impediment is that these technologies are not open enough to allow clients and service providers to choose the platforms and programming languages they prefer for building clients and services. RMI is even more restrictive than CORBA in this respect, because it assumes Java on both ends of the connection.

Another important limitation of systems built using CORBA and RMI is that they often require domain firewalls to implement more permissive security policies. Since security threats are a major concern in current systems, administrative rules often prevent these systems from functioning correctly. The limitations of RMI and CORBA motivated researchers and system developers to find more versatile solutions for implementing distributed systems. As a result, Web Services were created and widely adopted as a compromise between interoperability and security on the one hand and system efficiency on the other. The architectural style underlying Web Services is the routine-subroutine style, and therefore mappings between service operations and functions provided by CASs are easy to achieve.

Grid technologies, which were initially designed to use TCP/IP socket connections for communication, have also evolved to adopt Web Services. As identified in [147], the use of Grid services for building distributed infrastructures for symbolic computation may be beneficial in several respects. Among them, WSRF frameworks could be used to implement Multilateral Simple Conversation patterns, for which the WS-Resource mechanisms provide automatic state support [83]. With WSRF a service becomes stateful: a returning client is automatically recognized, and its session data can be retrieved from the associated WS-Resource. Additionally, automatic resource management may be used to free resources, a functionality similar to the one provided by the Java garbage collector.
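
The stateful behaviour described above can be illustrated with a minimal in-memory sketch. It is only an analogy and does not use any WSRF toolkit: the class name `ResourceHome`, its methods and the lifetime parameter are hypothetical, chosen to show a resource key that identifies a returning client, session state attached to that key, and lifetime-based cleanup.

```python
import time
import uuid


class ResourceHome:
    """In-memory stand-in for a store of stateful resources (hypothetical)."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._resources = {}  # resource key -> (expiry time, session state)

    def create(self, lifetime_seconds):
        """Create a resource and return the key that the client keeps."""
        key = str(uuid.uuid4())
        self._resources[key] = (self._clock() + lifetime_seconds, {})
        return key

    def lookup(self, key):
        """Return the session state of a returning client, if still alive."""
        expiry, state = self._resources[key]  # KeyError for unknown keys
        if self._clock() >= expiry:
            del self._resources[key]
            raise KeyError(key)
        return state

    def sweep(self):
        """Destroy every expired resource and return how many were freed."""
        now = self._clock()
        expired = [k for k, (expiry, _) in self._resources.items() if expiry <= now]
        for k in expired:
            del self._resources[k]
        return len(expired)


if __name__ == "__main__":
    home = ResourceHome()
    key = home.create(lifetime_seconds=60)
    home.lookup(key)["last_result"] = "x^2 + 1"  # first call stores state
    print(home.lookup(key)["last_result"])       # a later call finds it again
```

The periodic `sweep` plays the role that the text compares to garbage collection: resources whose lifetime has elapsed are released without any action from the client.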

While we consider these features to be helpful, we believe that several other features are even more important for symbolic computing. Grid services have native support for security, which eliminates the burden of enforcing security and designing appropriate security policies over disparate computational domains. Another important benefit is that Grid services provide data management capabilities. Dedicated interfaces and protocols provide secure, reliable and easy to use solutions for moving large sets of data from one computational node to another. Through these services they ease the process of integrating disparate computational resources into a coherent whole. The advantages that Grid services provide for scientific computations in general, and their direct support for the requirements discussed in Section 3.2, qualify Grid technologies for use in symbolic computations.

The CAS Server components were therefore designed to use the capabilities that Grid services have to offer. The execution, data management and discovery services that the CAS Server interface has to provide were implemented using WSRF-compliant Grid services. The CAS Server uses specific features of WSRF only where they are required; elsewhere the solution was kept as general as possible, because the rapid evolution of technologies may require CAS Servers to accommodate new standards and technologies.

We found the WSRF mechanisms to be particularly useful for describing the symbolic capabilities that the CAS Server provides to its clients through its interface. Information about the CASs that the CAS Server encapsulates and the functions that are available for remote invocation is organized as a WS-Resource. The native indexing capabilities of Grids can therefore be used to discover these details. While our discovery process does not rely on this natively provided functionality, these capabilities may be useful for compatibility with other systems.

Using Grid or Web services to expose the functionality of CASs also has some minor impediments. One example is the lack of support for exposing more than one operation with the same name but different argument lists. This limitation comes from the WSDL 2.0 standard [1], which explicitly forbids operations with the same name within the same service definition. This is not the case with regular CASs, which may provide functions that have the same name but differ in the type and number of their parameters. Therefore a one-to-one correspondence between a CAS function and an operation on the interface of the CAS Server is not possible. Even if such restrictions did not exist, it would still be inconvenient to have services exposing thousands of operations, as a one-to-one correspondence would force us to provide. The experience gained by constructing the Computer Algebra to Grid Services (CAGS) tool [60] has led us to the conclusion that the better approach is to use a single operation through which task requests are submitted.
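
The single-operation approach can be sketched as follows. This is an illustrative assumption and not the CAS Server's actual message format: the JSON layout, the names `make_task_request` and `submit_task`, and the `gcd` overloads are all hypothetical. The point is that one operation name on the interface is enough, because the function name and argument list travel inside the request and the overload is selected on the server side.

```python
import json
import math


def make_task_request(function, args):
    """Encode one CAS call as a single request payload (hypothetical format)."""
    return json.dumps({"function": function, "args": list(args)})


# Hypothetical CAS-side functions that share a name but differ in arity.
OVERLOADS = {
    ("gcd", 2): lambda a, b: math.gcd(a, b),
    ("gcd", 3): lambda a, b, c: math.gcd(math.gcd(a, b), c),
}


def submit_task(payload):
    """The single service operation: pick the overload by name and arity."""
    request = json.loads(payload)
    key = (request["function"], len(request["args"]))
    if key not in OVERLOADS:
        raise LookupError(f"no function {key[0]} taking {key[1]} arguments")
    return OVERLOADS[key](*request["args"])


if __name__ == "__main__":
    # Both calls go through the same operation on the interface.
    print(submit_task(make_task_request("gcd", [12, 18])))
    print(submit_task(make_task_request("gcd", [12, 18, 8])))
```

Here arity alone distinguishes the overloads; a CAS that also overloads on argument types would need a richer key, which this sketch does not attempt.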

This design has the advantage of providing a static and standard set of operations that the client may use in a dynamic way. If new functions are implemented at the CAS level and the administrator exposes them as new operations accessible to remote clients, the interface of the service does not need to change. It is only necessary that the function is registered in the internal Local Registry of the CAS Server; registration is the only deployment step required. It is not necessary to recompile or restart the Grid service, as is needed in the case of GENSS services, which require a new Java operation to be implemented for every new CAS function exposed.
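
A minimal sketch of this deployment model is given below, under the assumption that the Local Registry can be modelled as a mapping from exposed function names to handlers. The classes `LocalRegistry` and `CasService` and their methods are hypothetical illustrations, not the actual CAS Server code. The sketch shows that exposing a new function changes only the registry, while the running service object and its single operation stay untouched.

```python
class LocalRegistry:
    """Hypothetical stand-in for the CAS Server's Local Registry."""

    def __init__(self):
        self._exposed = {}  # function name -> callable that runs it in the CAS

    def register(self, name, handler):
        """Expose a CAS function to remote clients."""
        self._exposed[name] = handler

    def invoke(self, name, args):
        """Run an exposed function; refuse anything that is not registered."""
        if name not in self._exposed:
            raise PermissionError(f"{name} is not exposed to remote clients")
        return self._exposed[name](*args)


class CasService:
    """Service with a fixed interface: one operation, whatever is registered."""

    def __init__(self, registry):
        self._registry = registry

    def submit_task(self, function, args):
        return self._registry.invoke(function, args)


if __name__ == "__main__":
    registry = LocalRegistry()
    service = CasService(registry)  # created once, never rebuilt below

    registry.register("power", pow)  # administrator exposes a new function
    print(service.submit_task("power", [2, 10]))
```

A request for a function that has not been registered is rejected by the registry, so the administrator's registration step is also what controls which functions remote clients may call.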
