On-demand Services Composition and Infrastructure Management


Jun Peng and Jie Wang

Department of Civil and Environmental Engineering, Stanford University, Stanford, CA 94305, USA

{junpeng, jiewang}@stanford.edu

Abstract. This paper presents several engineering applications that involve distributed software services. Due to the complexity of these applications, an efficient and flexible service composition framework is needed. A reference service composition framework is introduced to address two issues prevailing in current service composition: interface incompatibility and performance. The reference framework applies active mediation to improve the execution efficiency of applications employing composed services. As the number of services becomes large, the composed application imposes challenges on service infrastructure management. A reliability model is introduced for building an automatic infrastructure management paradigm.

1 Introduction

Service composition has been utilized in many fields to improve the flexibility of software development and to reuse existing software components. One such significant field is engineering, where services are composed to perform simulations, data integration, real-time monitoring and other on-demand applications. However, there are many issues associated with the currently prevailing service composition frameworks, and two noticeable ones are interface incompatibility and performance. Web services are typically heterogeneous and adhere to a variety of conventions for control and data. Even when standards are promulgated, such as SQL, the precise meaning and scope of the output will not necessarily match the expectations of another service. In a typical composed application, all results from one service have to be shipped back to the application site, handled there, and then shipped to the next service. We argue that this centralized data-flow approach is inefficient for integrating large-scale software services.

While great advances have been made to simplify the development of distributed and on-demand service infrastructures, managing these infrastructures remains a daunting task. Currently, most large-scale data centers that host service infrastructures experience severe problems in managing large N-tier, networked environments. Most of these problems arise from the complexity of the infrastructure itself rather than from building the applications. Maintainability of the infrastructure system for long-term efficient operation is largely missing.


2 Example Engineering Applications

2.1 Engineering Simulation

The first example is a web service framework that facilitates engineering simulations [11]. In this framework, users can remotely access the core simulation program through a web-based user interface or other application programs, such as MATLAB. Users can specify desirable features and methods that have been developed, tested, and contributed to the framework by other participants. A standard interface/wrapper is defined to help developers build or wrap engineering components as web services. Many components of the simulation program, such as linear solvers, design tools, and visualization tools, can be wrapped as web services and run on distributed computers to participate in an engineering simulation.

Engineering simulations normally require a great deal of computational effort. To improve the performance of engineering simulations, parallel and distributed computing environments can be employed. The simulations are performed on dedicated parallel computers, clusters of local workstations, or even distributed network workstations by utilizing Grid-enabled MPI [4]. Database systems exposed as web services are linked with the central server to provide persistent storage of simulation results [12].

2.2 Data Integration

The second example is a web services framework for integrating a variety of engineering applications, which involve a large volume of data communication. In this example, the web services are linked together through an integration framework to provide project scheduling [8]. Proprietary software applications, such as Microsoft Project, Excel, Primavera Project Planner, and 4D Viewer, are wrapped as web services that export their functionalities. Although the applications run on heterogeneous platforms and utilize different interfaces, they can be accessed homogeneously through standard web services interfaces. The prototype also incorporates a variety of devices ranging from PDAs and web browsers to desktop and server computers. Using the infrastructure, field personnel can conduct project management with the latest project information on the construction site, in a truly ubiquitous fashion [8]. In short, by using the web services model to develop the integration framework, engineering applications can collaborate regardless of location and platform.

2.3 Real-time Monitoring

The third example is a distributed wireless structural monitoring system [9]. The transfer of measurement data is carried out by wireless communications. In addition, computational power is integrated with each sensing node of the system. Giving each sensor the means to process its own data removes computational burden from the centralized server and brings the many benefits associated with parallel data processing. The wireless structural monitoring system is applied to perform real-time monitoring of a district or even a city, where a hierarchical system is used. Sensor units are deployed on numerous buildings. The sensor units within a building or a small region send their collected measurement data to a data server. All the data servers then send their data to a data processing center. In the monitoring system, infrastructure management is a key factor in the successful deployment of the system. An automatic mechanism is needed to detect malfunctioning sensor units, to adapt to varying circumstances, and to operate robustly under severe conditions.
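The hierarchy described above can be illustrated with a minimal sketch. All names here (`summarize_on_node`, `flag_suspect_units`, `building_report`) and the median-based plausibility check are assumptions made for illustration; they are not taken from the monitoring system in [9].

```python
from statistics import mean, median


def summarize_on_node(samples):
    """Runs on a sensing node: reduce raw samples to a compact summary."""
    if not samples:
        return {"count": 0, "mean": None, "peak": None}
    return {
        "count": len(samples),
        "mean": mean(samples),
        "peak": max(abs(s) for s in samples),
    }


def flag_suspect_units(summaries, tolerance):
    """Runs on a building's data server (hypothetical check): flag units that
    are silent or whose peak deviates from the building median by more than
    `tolerance`."""
    peaks = [s["peak"] for s in summaries.values() if s["count"] > 0]
    if not peaks:
        return sorted(summaries)
    reference = median(peaks)
    return sorted(
        unit
        for unit, s in summaries.items()
        if s["count"] == 0 or abs(s["peak"] - reference) > tolerance
    )


def building_report(raw_by_unit, tolerance=1.0):
    """What a data server would forward to the data processing center."""
    summaries = {u: summarize_on_node(x) for u, x in raw_by_unit.items()}
    return {
        "summaries": summaries,
        "suspect": flag_suspect_units(summaries, tolerance),
    }


raw = {
    "unit-a": [0.10, -0.20, 0.15],
    "unit-b": [0.12, -0.18, 0.11],
    "unit-c": [],                # silent unit
    "unit-d": [9.0, -8.5, 9.2],  # implausibly large readings
}
report = building_report(raw)
print(report["suspect"])  # ['unit-c', 'unit-d']
```

Only the summaries and the list of suspect units travel up the hierarchy, so the raw samples never leave the sensing nodes in this sketch.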

3 Software Service Composition

FICAS, an experimental Flow-based Infrastructure for Composing Autonomous Services, supports a service composition paradigm that integrates software using a loose parallelism [7]. Since there is an overhead for each remote invocation of a service, this framework focuses on the composition of large and distributed services.

3.1 Service Composition Framework

Fig. 4 illustrates the main components of the FICAS framework. The buildtime components are responsible for specifying the composed application and compiling application specifications into control sequences. For FICAS, we have defined CLAS (Compositional Language for Autonomous Services) to provide application programmers with the necessary abstractions to describe the behaviors of their composed applications. The CLAS language focuses on the functional composition of web services. The runtime environment of FICAS is responsible for executing the control sequences. The service directory keeps track of the available web services within the infrastructure.

[Figure: main components of the FICAS framework. Recoverable labels: FICAS Buildtime, CLAS Program, FICAS Controls, FICAS Runtime, Megaservice Controller, Autonomous Service Directory, Communication Network, Autonomous Service, Autonomous Service Wrapper, Software Application.]


Web services are formed by wrapping legacy or new software applications. A metamodel is defined to allow the construction of homogeneous web services in a heterogeneous computing environment. The metamodel defines a service core, which encapsulates the computational software and provides the data processing functionalities. While each component operates asynchronously, the service core ties the components into a coordinated entity.

3.2 Separation of Control and Data Flows

Traditionally, both control flows and data flows are centrally coordinated. There are performance and scalability issues associated with this model. When large amounts of data are exchanged among the services, the controlling node becomes a communication bottleneck. This is especially problematic in an Internet environment, where the communication links between the controlling node and the services are likely to suffer from limited bandwidth.

The issues associated with centralized data and control flows motivate us to distribute the data flows among the services. Instead of using the controlling node as the intermediate data relay, the composed application can inform the services to exchange data directly. A sample application of this model is shown in Fig. 2. The decision to retain a centralized control flow hinges upon ease of implementation and management. The distributed data-flow model utilizes the communication network among web services, and thus alleviates the communication load on the controlling node. Moreover, the computation is distributed efficiently to where the data resides, so that the data can be processed without incurring communication traffic.
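The effect on the controlling node can be made concrete with a small back-of-the-envelope sketch. It assumes a linear chain of services and ignores control messages and the initial input; the function name `controller_traffic` and the example sizes are hypothetical.

```python
def controller_traffic(result_sizes, distributed):
    """Data volume crossing the controlling node for a linear service chain.

    result_sizes[i] is the size of the result produced by service i
    (any consistent unit). Control messages are assumed negligible.
    """
    if not result_sizes:
        return 0
    if distributed:
        # Intermediate results travel service-to-service; only the final
        # result returns to the controlling node.
        return result_sizes[-1]
    # Centralized: every result comes back to the controlling node, and every
    # result except the last is sent out again to the next service.
    return sum(result_sizes) + sum(result_sizes[:-1])


sizes_mb = [500, 200, 1]  # hypothetical result sizes of three chained services
print(controller_traffic(sizes_mb, distributed=False))  # 1401
print(controller_traffic(sizes_mb, distributed=True))   # 1
```

Under these assumptions the controlling node handles only the final result in the distributed model, while in the centralized model it relays every intermediate result twice.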

[Figure: sample application of the distributed data-flow model. Recoverable labels: Composed Application, Service Wrapper, Software Service, Control flows, Data flows, Request, Result.]


3.3 Active Mediation

In the semantic web setting [1], where we expect a large collection of autonomous and diverse service providers, we cannot expect that each service will deliver results that are fully compatible with and useful to the further services that the composed application will need to invoke. Active mediation is introduced to provide client-specific functionalities so that services can be viewed as if they were intended for the specific needs of the client. Active mediation is enabled by the mobile class, which is an information-processing module that can be dynamically loaded and executed. The mobile class is similar to mobile agent technology [14] in that both utilize executable programs that can migrate during execution. However, mobile agents are self-governing, whereas the mobile class is an integral part of the service composition framework. The management and deployment of mobile classes are controlled by the composed application.

Mobile classes can be used to support data processing, such as relational data operations. A mobile class can be constructed for each relational operator. Mobile classes are also used in place of type brokers to handle data conversion. In a large-scale service composition framework, data exist in various types, and new types will continue to appear. Rather than forwarding data among the type brokers, the composed application loads the mobile classes on the services to provide the type mediation function. The type mediation supported by mobile classes eliminates intermediate data traffic. Services can produce a wide variety of data suitable for extraction and reporting [13]. Mobile classes are loaded onto the upstream service to mediate the output data for the downstream service.
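A minimal sketch of this idea follows. It is not the FICAS implementation: the `Service` class, the `load_mobile_class` method, and the two mediators (`Project`, a relational projection, and `FeetToMeters`, a type or unit conversion) are hypothetical names chosen for illustration.

```python
class Service:
    """Toy stand-in for a wrapped software service."""

    def __init__(self, name, compute):
        self.name = name
        self._compute = compute
        self._mediators = []

    def load_mobile_class(self, mediator):
        # The composed application, not the service, decides what is loaded.
        self._mediators.append(mediator)

    def invoke(self, request):
        result = self._compute(request)
        # Mediation happens at the upstream service, before any data leaves.
        for mediator in self._mediators:
            result = mediator(result)
        return result


class Project:
    """Mobile class for one relational operator: keep only named columns."""

    def __init__(self, columns):
        self.columns = columns

    def __call__(self, rows):
        return [{c: row[c] for c in self.columns} for row in rows]


class FeetToMeters:
    """Mobile class for type mediation: convert one column's unit."""

    def __init__(self, column):
        self.column = column

    def __call__(self, rows):
        return [
            {**row, self.column: round(row[self.column] * 0.3048, 3)}
            for row in rows
        ]


def survey(_request):
    return [
        {"id": 1, "height": 10.0, "notes": "north wall"},
        {"id": 2, "height": 20.0, "notes": "south wall"},
    ]


upstream = Service("survey", survey)
upstream.load_mobile_class(Project(["id", "height"]))
upstream.load_mobile_class(FeetToMeters("height"))
mediated = upstream.invoke(None)
print(mediated)
```

Because both mediators run where the data is produced, the downstream service receives rows that are already projected and converted, and no separate type broker sits on the data path.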

4 Automatic Infrastructure Management

The lack of management tools in distributed applications, coupled with the increasing dependence on the availability of these systems, is causing significant resources to be put into managing these infrastructures. There is also a pressing need to address Quality of Service (QoS) and Service Level Agreements (SLA) for infrastructure management. Consequently, new IT technologies aiming at automating the deployment and maintenance of IT infrastructure are emerging. One promising technology is autonomic computing, where the IT infrastructure and its components are self-configuring, self-healing, self-optimizing and self-protecting.

The IBM autonomic computing manifesto presents the challenge of developing and deploying systems and software that can run themselves [3], adapt to varying circumstances, and operate robustly under damaged conditions. This vision is an amalgam of analogies from biology [2, 5, 6, 10], where self-regulating control systems maintain a steady state. While research on these analogies is beginning to draw more attention, building software that is aware of its own behavior, its surrounding context, and the control and interaction among its components is extremely difficult [10].


4.1 Topology-based Modeling for IT Infrastructure

One of the most urgent needs for managing IT infrastructure is to know the infrastructure system itself. Based on this information, we can then discuss the possibility of a more automatic management paradigm. Once we establish a proper model to describe the system, we can use it for

− monitoring the real-time system,

− troubleshooting the system more effectively,

− estimating the reliability of the system, and

− calculating the QoS and SLA based on the reliability.

The starting point for self-managing a complex infrastructure is to determine the topology of the underlying applications. For this purpose, we define topology as the combination of the static description of the application's components and the relationships that exist among these components. All applications have two parts: (1) infrastructure, and (2) application entities. The model for IT infrastructure needs to include entities for these two parts and their relations.

More specifically, a model for infrastructure can be considered to consist of three tiers as follows: (1) Network Tier, which consists of devices such as switches, routers, load balancers, and firewalls; (2) Systems Tier, which provides the computing infrastructure and consists of computer servers and operating systems; (3) Applications Infrastructure Tier, which provides the software containers in which application components execute.
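As an illustration only, such a topology could be captured in a small data structure that records each component, its tier, and the components it depends on. The `Tier` and `Topology` names, the `depends_on` relationship, and the example components are assumptions; the paper does not prescribe a concrete representation.

```python
from dataclasses import dataclass, field
from enum import Enum


class Tier(Enum):
    NETWORK = "network"
    SYSTEMS = "systems"
    APP_INFRASTRUCTURE = "applications infrastructure"
    APPLICATION_ENTITY = "application entity"


@dataclass
class Topology:
    """Static description of components plus their relationships."""

    components: dict = field(default_factory=dict)  # name -> Tier
    depends_on: dict = field(default_factory=dict)  # name -> set of names

    def add(self, name, tier, depends_on=()):
        self.components[name] = tier
        self.depends_on[name] = set(depends_on)

    def by_tier(self, tier):
        return sorted(n for n, t in self.components.items() if t is tier)

    def transitive_dependencies(self, name):
        """Everything `name` relies on, directly or indirectly."""
        seen, stack = set(), [name]
        while stack:
            for dep in self.depends_on.get(stack.pop(), ()):
                if dep not in seen:
                    seen.add(dep)
                    stack.append(dep)
        return seen


topology = Topology()
topology.add("router-1", Tier.NETWORK)
topology.add("server-1", Tier.SYSTEMS, depends_on=["router-1"])
topology.add("container-1", Tier.APP_INFRASTRUCTURE, depends_on=["server-1"])
topology.add("scheduling-service", Tier.APPLICATION_ENTITY,
             depends_on=["container-1"])

print(sorted(topology.transitive_dependencies("scheduling-service")))
```

Walking the dependency relation from an application entity down through the tiers gives the set of infrastructure elements whose failure could affect it, which is the kind of information the later reliability estimate needs.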

4.2 Estimating Reliability

The reliability of an IT infrastructure can be estimated based on the underlying infrastructure model. As the first step, we should establish a reliability model for the IT infrastructure. The model reflects the following intuitive system features:

− In general, the reliability of a system decreases as the complexity of the system increases.

− If there is a single point of failure in the system, the system cannot be more stable than that component.

− Redundancy enhances reliability.

− However, redundant components may be prone to fail at the same time because the failure may have the same cause.
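These four features match the behavior of the textbook series and parallel reliability formulas, sketched below. This is an illustrative stand-in rather than the paper's reliability model: it assumes independent component failures, and it represents a shared cause of failure as one extra element in series with the redundant group. All function names are hypothetical.

```python
from math import prod


def series(reliabilities):
    """All components are required: the system works only if each one works."""
    return prod(reliabilities)


def parallel(reliabilities):
    """Redundant components: the system fails only if every one fails."""
    return 1.0 - prod(1.0 - r for r in reliabilities)


def parallel_with_common_cause(reliabilities, common_cause_reliability):
    """Redundant group that shares one cause of failure (assumed to act as a
    single element in series with the group, e.g. a shared power feed)."""
    return common_cause_reliability * parallel(reliabilities)


# More required components lower the reliability.
print(series([0.99, 0.99]), series([0.99, 0.99, 0.99]))
# A single point of failure caps the system at its own reliability.
print(series([0.90, 0.999]))
# Redundancy raises reliability above that of one component.
print(parallel([0.90, 0.90]))
# A shared cause limits what redundancy can achieve.
print(parallel_with_common_cause([0.90, 0.90], 0.95))
```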

The reliability estimation method consists of the following four steps:

− Mapping the system as a graph.

− Determining the critical graph for outage.

− Assessing the reliability of basic modules.
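The graph-related steps can be sketched as follows. The term "critical graph for outage" is not defined in detail here, so this sketch makes an assumption: it treats the critical elements as the nodes whose removal disconnects a client from the service it needs, that is, the single points of failure. The graph, node names, and function names are hypothetical.

```python
from collections import deque


def reachable(graph, source, target, removed=None):
    """Breadth-first search from source to target, ignoring `removed`."""
    if removed in (source, target):
        return False
    seen, queue = {source}, deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            return True
        for nxt in graph.get(node, ()):
            if nxt != removed and nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False


def single_points_of_failure(graph, source, target):
    """Intermediate nodes whose failure alone cuts source off from target."""
    nodes = set(graph) | {n for nbrs in graph.values() for n in nbrs}
    return sorted(
        n
        for n in nodes - {source, target}
        if not reachable(graph, source, target, removed=n)
    )


# Step 1 (assumed form): the system mapped as a directed graph.
system = {
    "client": ["load-balancer"],
    "load-balancer": ["web-1", "web-2"],
    "web-1": ["database"],
    "web-2": ["database"],
    "database": ["storage"],
}

# Step 2 (assumed interpretation): elements critical for an outage.
critical = single_points_of_failure(system, "client", "storage")
print(critical)  # ['database', 'load-balancer']
```

The redundant web servers do not appear in the result because either one alone keeps the path intact, while the load balancer and the database each sit on every path.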


4.3 Using Reliability Model for QoS and SLA

Due to the high reliability and availability requirements for on-demand distributed services, we need to employ an industrial standard for assuring the quality of such services. We propose using QoS (Quality of Service) and SLA (Service Level Agreements) to help on-demand service providers cope with the unprecedented demands on the management of their service infrastructures.

There are basically two approaches to delivering QoS. One is to create a business service that requires very high infrastructure reliability. For this approach, we calculate the current reliability of the infrastructure. If it cannot support the business requirements, a prediction is needed of an acceptable IT infrastructure that fulfills the requirements. The prediction is based on simulations of the reliability model for planned infrastructures. The second approach is to allocate the current IT infrastructure so as to prioritize the business requirements and maximize the business service offering. The allocation is based on calculating the best capability and usage of the infrastructure using the reliability model. Both approaches need the information provided by the reliability model of the IT infrastructure.

After we understand the available QoS for the infrastructure, we can establish SLAs for the IT service. SLA monitoring and enforcement are becoming increasingly important in an IT service environment. Using the predictions obtained from the reliability model and the estimated QoS, we can lay the foundation for supporting SLAs. With SLAs, distributed services may be subscribed to dynamically and on demand, based on requirements and priorities. As a result, for economic and practical reasons, we should use the model-based infrastructure management paradigm to automate both the service itself and the SLA management system, which measures and monitors the QoS parameters, checks the agreed-upon service levels, and reports violations to the authorized parties involved in the SLA management process.
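The measure, check, and report cycle can be sketched in a few lines. The parameter names (`availability`, `response_ms`), the threshold values, and the `check_sla` function are hypothetical and serve only to illustrate the idea.

```python
def check_sla(agreed, measured):
    """Compare measured QoS parameters with the agreed service levels.

    `agreed` maps a parameter name to ("min" | "max", threshold).
    Returns a list of (parameter, agreed threshold, measured value) tuples,
    one per violation; a missing measurement counts as a violation.
    """
    violations = []
    for name, (kind, threshold) in sorted(agreed.items()):
        value = measured.get(name)
        if value is None:
            violations.append((name, threshold, None))
        elif kind == "min" and value < threshold:
            violations.append((name, threshold, value))
        elif kind == "max" and value > threshold:
            violations.append((name, threshold, value))
    return violations


agreed_levels = {
    "availability": ("min", 0.999),  # e.g. derived from the reliability model
    "response_ms": ("max", 200),
}
measurements = {"availability": 0.9995, "response_ms": 350}

for parameter, agreed_value, measured_value in check_sla(agreed_levels,
                                                         measurements):
    # In a real system this would notify the authorized parties.
    print(f"SLA violation: {parameter} agreed {agreed_value}, "
          f"measured {measured_value}")
```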

5 Summary

In this paper, we discuss several distributed services in the engineering applications domain. A composition framework for such services is proposed to address efficiency and flexibility in integrating complex applications. More specifically, a reference service composition framework is introduced to address the two issues prevailing in current service composition: interface incompatibility and performance. The reference framework applies active mediation to improve the execution efficiency of applications employing composed services.

As the number of services becomes large, the composed application imposes challenges on service infrastructure management. To tackle this ever-increasing complexity in the deployment and management of the infrastructure for services, we introduce a topology-based infrastructure model and a reliability model for an automatic infrastructure management paradigm. Finally, we propose a procedure for establishing QoS and SLA for managing on-demand and distributed services.


References

1. Berners-Lee, T., Hendler, J. and Lassila, O.: The Semantic Web. Scientific American. 284(5) (2001) 34-43

2. Blair, G.S., Coulson, G., Blair, L., Duran-Limon, H., Grace, P., Moreira, R. and Parlavantzas, N.: Reflection, Self-Awareness and Self-Healing in OpenORB. Proceedings of the First ACM Workshop on Self-Healing Systems. Charleston, SC (2002) 9-14

3. Dashofy, E.M., van der Hoek, A. and Taylor, R.N.: Towards Architecture-based Self-Healing Systems. Proceedings of the First ACM Workshop on Self-Healing Systems. Charleston, SC (2002) 21-26

4. Foster, I., Geisler, J., Gropp, W., Karonis, N., Lusk, E., Thiruvathukal, G. and Tuecke, S.: Wide-Area Implementation of the Message Passing Interface. Parallel Computing. 24(12) (1998) 1735-1749

5. Garlan, D. and Schmerl, B.: Model-based Adaptation for Self-Healing Systems. Proceedings of the First ACM Workshop on Self-Healing Systems. Charleston, SC (2002) 27-32

6. George, S., Evans, D. and Davidson, L.: A Biologically Inspired Programming Model for Self-Healing Systems. Proceedings of the First ACM Workshop on Self-Healing Systems. Charleston, SC (2002) 102-104

7. Liu, D.: A Distributed Data Flow Model for Composing Software Services. Ph.D. Thesis. Department of Electrical Engineering, Stanford University, Stanford, CA (2003)

8. Liu, D., Cheng, J., Law, K.H., Wiederhold, G. and Sriram, R.D.: Engineering Information Service Infrastructure for Ubiquitous Computing. Journal of Computing in Civil Engineering. 17(4) (2003) 219-229

9. Lynch, J.P., Law, K.H., Kiremidjian, A.S., Carryer, E., Kenny, T.W., Partridge, A. and Sundararajan, A.: Validation of a Wireless Modular Monitoring System for Structures. Proceedings of Smart Structures and Materials, SPIE. San Diego, CA (2002)

10. Mikic-Rakic, M., Mehta, N. and Medvidovic, N.: Architectural Style Requirements for Self-Healing Systems. Proceedings of the First ACM Workshop on Self-Healing Systems. Charleston, SC (2002) 49-54

11. Peng, J. and Law, K.H.: A Prototype Software Framework for Internet-Enabled Collaborative Development of a Structural Analysis Program. Engineering with Computers. 18(1) (2002) 38-49

12. Peng, J., Liu, D. and Law, K.H.: An Engineering Data Access System for a Finite Element Program. Journal of Advances in Engineering Software. 34(3) (2003) 163-181

13. Sample, N., Beringer, D. and Wiederhold, G.: A Comprehensive Model for Arbitrary Result Extraction. Proceedings of ACM Symposium on Applied Computing. Madrid, Spain (2002)

14. White, J.E.: Mobile Agents. In: Bradshaw, J.M. (ed.): Software Agents. MIT Press (1997) 437-472