The first key step toward building high availability systems from open-architecture building blocks is to identify what those building blocks are. Each building block provides one part of the overall technology stack in a high availability computer system. Ideally, each building block is independent of the others, so that multiple, competing products can be marketed for each one, with interoperability between the building blocks enabled by the adoption of common interfaces for common services.
Since there is a well-developed model for open-architecture building blocks in traditional (i.e., not high availability) systems, it is natural to begin there. This model has clearly defined building blocks for hardware platforms, operating systems, middleware packages (e.g., communication protocol stacks, database systems), and application software. High availability systems are built out of this same set of technologies, with the added requirement that managed redundancy be a part of the overall system.
A major technological differentiator of high availability systems is the inclusion of configuration and fault management functions. As described earlier, this requires the detection, diagnosis, isolation, recovery, and repair of faults anywhere they could result in the failure of the high availability system. The approach taken by the HA Forum is to define this function as an incremental addition to the existing open-architecture model for computer systems.
Thus, a single, new building block has been defined, called management middleware. This is defined as a set of configuration and fault management capabilities that are independent of any particular hardware platform, operating system, or other technology building block. Operating system vendors, hardware platform vendors, or application software vendors could each provide this new capability. However, defining it as a standard, portable function with standard interfaces to the other building blocks that make up a high availability system maximizes the ability to mix and match technology building blocks throughout the system.
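As an illustration of what a standard interface between management middleware and the other building blocks might look like, the following is a minimal sketch in Python. The names `Manageable`, `health`, `recover`, and `ManagementMiddleware` are hypothetical and are not taken from any HA Forum specification; the point is only that the middleware can manage any block that exports the same interface, regardless of who supplies it.

```python
from typing import List, Protocol


class Manageable(Protocol):
    """Hypothetical management interface exported by every building block."""

    name: str

    def health(self) -> str:
        """Return "ok" or "failed"."""
        ...

    def recover(self) -> None:
        """Attempt to return the block to service."""
        ...


class DatabaseMiddleware:
    """Stand-in for an "other middleware" building block."""

    def __init__(self) -> None:
        self.name = "database"
        self.failed = False

    def health(self) -> str:
        return "failed" if self.failed else "ok"

    def recover(self) -> None:
        self.failed = False  # placeholder for, e.g., restarting the service


class OperatingSystem:
    """Stand-in for an operating system building block."""

    def __init__(self) -> None:
        self.name = "operating-system"
        self.failed = False

    def health(self) -> str:
        return "failed" if self.failed else "ok"

    def recover(self) -> None:
        self.failed = False  # placeholder for, e.g., failing over to a standby


class ManagementMiddleware:
    """Manages any registered block purely through the common interface."""

    def __init__(self) -> None:
        self._blocks: List[Manageable] = []

    def register(self, block: Manageable) -> None:
        self._blocks.append(block)

    def sweep(self) -> List[str]:
        """Check every block once; recover failed ones and return their names."""
        recovered = []
        for block in self._blocks:
            if block.health() != "ok":
                block.recover()
                recovered.append(block.name)
        return recovered
```

Because `ManagementMiddleware` depends only on the `Manageable` interface, either stand-in class could be replaced by a competing implementation without changing the middleware, which is the mix-and-match property described above.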
Figure 13 depicts the technology building blocks that should be available from multiple, competing sources if high availability systems are to be built from open-architecture technology. In this figure, each shaded block represents a piece of technology (building block) that is relevant to building high availability systems, and the wide gray arrows identify interfaces that are relevant to high availability capabilities. Ideally, each building block could have multiple, competing sources and be exchanged without impacting other building blocks. In reality, this ideal is not likely to be met perfectly, but the more similar the interfaces are between building blocks, the more open the system becomes.
The remainder of this section will describe each of the building blocks and interfaces depicted in Figure 13.
The hardware platform consists of the entire set of hardware, firmware, etc., normally provided by a hardware system vendor, ready to support an operating system like Windows or Linux. For the purposes of building a high availability system, of particular significance is the platform management infrastructure, which permits fault-management operations on the components in the hardware.
The small boxes within the hardware platform box indicate that a further degree of open architecture is desired within the hardware of a system. In addition to providing a platform that supports the right interfaces to operating system and other building blocks, open systems also provide a capability to integrate multiple, third-party peripheral cards, power supplies, and potentially other components. Thus, the interfaces between these components, as well as the interface between them and other vendor-specific features in the hardware platform, need standardization. Again, in particular for high availability systems, management interfaces are important considerations for standardization within the hardware platform.
Figure 13. Open Architecture Building Block in an HA System
[Figure: blocks labeled Application Software, Operating System, Management Middleware, “Other” Middleware (e.g., DBMS, Protocol Stack), and Hardware Platform, connected by interface arrows.]
The interfaces between the operating system and the application software and between the operating system and the other middleware are shown as narrow arrows (indicating that they are not significant to the high availability capabilities in the system). However, even with this restrictive view, the operating system will require certain capabilities to operate in a high availability system. These include exporting to management middleware an interface for management of the operating system itself, and the ability to deal with a changing (hot-swappable) hardware configuration as required to support the configuration management capabilities described in Section 5.0.
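One way to picture the hot-swap requirement is as an event interface: the operating system tracks which devices are present and notifies management middleware whenever the hardware configuration changes. The sketch below is a simplified illustration under that assumption. `HardwareInventory`, its methods, and the event names "inserted" and "removed" are hypothetical and do not correspond to any real operating system API.

```python
from typing import Callable, Dict, List

# A listener receives (event, slot), where event is "inserted" or "removed".
HotSwapListener = Callable[[str, str], None]


class HardwareInventory:
    """Hypothetical OS-side view of a hot-swappable hardware configuration."""

    def __init__(self) -> None:
        self._slots: Dict[str, str] = {}
        self._listeners: List[HotSwapListener] = []

    def subscribe(self, listener: HotSwapListener) -> None:
        """Register a callback, e.g. one owned by management middleware."""
        self._listeners.append(listener)

    def insert(self, slot: str, device: str) -> None:
        """Record a newly inserted device and notify all listeners."""
        self._slots[slot] = device
        self._notify("inserted", slot)

    def remove(self, slot: str) -> None:
        """Record a removal; raises KeyError if the slot is already empty."""
        del self._slots[slot]
        self._notify("removed", slot)

    def devices(self) -> Dict[str, str]:
        """Return a copy of the current slot-to-device mapping."""
        return dict(self._slots)

    def _notify(self, event: str, slot: str) -> None:
        for listener in self._listeners:
            listener(event, slot)
```

Management middleware would subscribe a callback once and then update its own configuration model as events arrive, rather than polling the operating system for changes.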
The management middleware contains the new system-level functionality that is unique to high availability systems: fault management and configuration management capabilities. It is not required that this functionality be separately developed and integrated into the system. However, defining it as a separate building block and standardizing the interfaces to the management middleware from all other building blocks makes it possible to achieve open-architecture high availability systems much more quickly. Note that the interface to the hardware is shown going through the operating system. Access to the hardware is handled by device drivers in the OS, but the semantics of the interface are defined by the underlying hardware.
The other middleware consists of software packages which add additional capabilities to an operating system such as database management or communication protocol processing.
Middleware often contains direct interfaces to the operating system, to hardware devices (accessed through the operating system via device drivers – indicated by the arrow passing through the operating system level to the hardware platform), and to application software. To be incorporated well into a high availability system, middleware packages should also export a management interface for fault and configuration management within the middleware.
The application software may or may not be aware of the high availability system infrastructure. To support the cases where it is, there must be an interface between the application software and the management middleware. This interface makes the application itself manageable, and may provide access to services unique to the high availability system such as checkpointing application state or heartbeating. Other than this interface, application software will generally interact with the rest of the building blocks (operating system and other middleware) no differently than in non-high availability systems.
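The two services named above, heartbeating and checkpointing of application state, can be sketched as follows. This is an illustrative sketch only: `AvailabilityAgent` and its methods are hypothetical names rather than an interface defined by the HA Forum, and the in-memory checkpoint store stands in for whatever replicated storage a real system would use.

```python
import copy
import time
from typing import Any, Callable, Dict


class AvailabilityAgent:
    """Hypothetical application-facing side of management middleware."""

    def __init__(self, timeout_s: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._timeout_s = timeout_s
        self._clock = clock  # injectable so that tests can control time
        self._last_beat: Dict[str, float] = {}
        self._checkpoints: Dict[str, Any] = {}

    def heartbeat(self, app: str) -> None:
        """Called periodically by the application to signal liveness."""
        self._last_beat[app] = self._clock()

    def is_alive(self, app: str) -> bool:
        """True if the application has sent a heartbeat within the timeout."""
        last = self._last_beat.get(app)
        return last is not None and self._clock() - last <= self._timeout_s

    def save_checkpoint(self, app: str, state: Any) -> None:
        """Store a private copy of the application's state."""
        self._checkpoints[app] = copy.deepcopy(state)

    def load_checkpoint(self, app: str) -> Any:
        """Return a copy of the last saved state, or None if there is none."""
        return copy.deepcopy(self._checkpoints.get(app))
```

In this sketch an application calls `heartbeat` on a timer and `save_checkpoint` at consistent points in its processing. If `is_alive` becomes false, a standby instance could call `load_checkpoint` to resume from the last saved state.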