Chapter 2: Automating Processing Architecture Maintenance
2.1 Overview
As mentioned in the introductory chapter, previous work has shown that the choice of architecture impacts the response times of the shared program. In general, choosing the architecture that provides the best response times is a difficult task. Therefore, Chung [21] developed an analytical model that can predict the architecture that gives the best response times. The model assumes two-user collaborations, a constant cost of processing each input command, zero cost of processing each output command, zero cost of transmitting inputs and outputs, constant think times before each command, and no type-ahead (that is, users do not enter a command until they see the output for the previous command). Chung showed both analytically and experimentally that (a) low network latency favors the centralized architecture and (b) asymmetric processing powers favor the centralized architecture. As these conditions can change dynamically, he developed a system that supports architecture changes at runtime. He also performed experiments showing that when a user with a powerful computer joins the collaboration, it is useful to dynamically centralize the shared program to the new user’s computer.
One issue with Chung’s system is that the users have to select the architecture to use at start time and decide when to switch architectures at runtime. A system that automatically maintains the architecture would be useful because it would relieve the users of performing these tasks. In this chapter, we present such a self-optimizing system for small-scale collaboration scenarios, where by small-scale, we mean two or three users.
In the process of creating the self-optimizing system, we extended Chung’s work in two ways. First, we present a three-user version of Chung’s two-user model that relaxes three of the six assumptions made by the original model. In particular, the new model does not assume a constant cost of processing each input command, zero cost of processing each output command, or a constant think time before each input command. The updated model still assumes negligible transmission costs and no type-ahead. Second, we present a system that can automatically gather the parameters of the three-user model and apply the model to decide which architecture should be used. By combining this new system with our own version of Chung’s system that can dynamically switch architectures at runtime, we create a self-optimizing system that automates the maintenance of the architecture. Our system therefore consists of two sub-systems. The first is our version of Chung’s system, which we call the sharing sub-system to denote the fact that it is responsible for sharing the application. The second is the new system we develop, which we call the optimization sub-system to denote the fact that it is responsible for improving response times.
Chapter Scope:
We analyze the impact of the collaboration architecture on response times. We consider small-scale collaboration scenarios involving two or three users in which the cost of transmitting commands is negligible and there is no type-ahead.
Each sub-system of the self-optimizing system raises several issues that must be addressed in order for the system to function correctly. A fundamental issue raised by the sharing sub-system is how it shares an application among the users. The system must intercept users’ input commands and the corresponding outputs and send them to the appropriate computers. Ideally, it should do so in a manner transparent to the application. Moreover, the system needs to be configurable so that the same application can be shared using both centralized and replicated architectures.
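To make the routing requirement concrete, the following is a minimal sketch of which network messages a single input command triggers under each architecture. It is illustrative only and is not the mechanism used by Chung’s framework: the function name `deliveries` and the `master_of` mapping (each user’s computer mapped to the master that serves it, with masters mapped to themselves) are assumptions introduced here, and the sender is assumed to contact every master directly.

```python
def deliveries(sender, master_of):
    """List the (kind, source, destination) messages for one input command.

    `master_of` is a hypothetical configuration table: it maps each user's
    computer to the master computer that serves it. A master maps to itself.
    """
    masters = sorted(set(master_of.values()))
    messages = []
    # Every master runs a program component, so each one must receive the input.
    for master in masters:
        if master != sender:
            messages.append(("input", sender, master))
    # Each master sends the resulting output to the slaves it serves.
    for user, master in sorted(master_of.items()):
        if user != master:
            messages.append(("output", master, user))
    return messages


# Centralized: A is the only master; B and C are slaves.
centralized = {"A": "A", "B": "A", "C": "A"}
# Replicated: every computer is a master.
replicated = {"A": "A", "B": "B", "C": "C"}

centralized_msgs = deliveries("B", centralized)
replicated_msgs = deliveries("B", replicated)
```

Under these assumptions, the same function covers both architectures, and only the configuration table changes, which is the kind of configurability the sharing sub-system needs.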
Chapter Goals:
We present our self-optimizing system that better meets response time requirements than existing systems by automating the maintenance of the architecture.
Another fundamental issue is how the sharing sub-system switches between replicated and centralized architectures dynamically at runtime. Whenever the system switches from a centralized to a replicated architecture, the system must bring the program components on the new master computers up to date; otherwise, these program components may be out of sync with the program component running on the computer that was the master in the centralized architecture. Moreover, if an architecture switch is not performed atomically with respect to user commands, then the application may be shared in a manner that is inconsistent with the notion of centralized and replicated architectures. To illustrate, suppose that the system switches from the replicated to the centralized architecture while an input command is en route to a computer that is a slave in the centralized architecture. That slave will then receive an input command from a remote user, which is inconsistent with the notion of a centralized architecture. Conversely, suppose that the system switches from the centralized to the replicated architecture while an output command is en route to a computer that was a slave in the centralized architecture. That computer, now a master, will receive an output command from a remote user, which is inconsistent with the notion of a replicated architecture.
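One way to picture the atomicity requirement is to stamp every command with the architecture "epoch" in which it was sent, so that a receiver can recognize a command that was in flight during a switch. The sketch below is an assumption made for illustration, not the mechanism used by Chung’s framework or by our system; the class `ArchitectureGuard` and its methods are hypothetical names.

```python
import threading


class ArchitectureGuard:
    """Tracks the current architecture and a counter bumped on every switch."""

    def __init__(self, architecture):
        self._lock = threading.Lock()
        self.architecture = architecture
        self.epoch = 0

    def stamp(self):
        # Called by a sender: returns the epoch to attach to an outgoing command.
        with self._lock:
            return self.epoch

    def switch(self, new_architecture):
        # The lock makes the change atomic with respect to stamp() and accept().
        with self._lock:
            self.architecture = new_architecture
            self.epoch += 1
            return self.epoch

    def accept(self, message_epoch):
        # Called by a receiver: a command stamped in an older epoch was in
        # flight during a switch and must not be handled as a normal command.
        with self._lock:
            return message_epoch == self.epoch


guard = ArchitectureGuard("replicated")
in_flight = guard.stamp()          # command sent under the replicated architecture
guard.switch("centralized")        # the switch happens while it is en route
stale_accepted = guard.accept(in_flight)
fresh_accepted = guard.accept(guard.stamp())
```

What a receiver does with a rejected command (discard it, or re-route it according to the new architecture) is a separate design decision that this sketch leaves open.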
A related issue is how the system accommodates late-comers. When a late-comer joins as a master, the system must, as above, bring the program components on the late-comer’s computer up to date; otherwise, these program components may not be synchronized with the program components on other computers. Similarly, when a late-comer joins as a slave, the system must bring the user-interface component on the late-comer’s computer up to date; otherwise, future outputs may not make sense.
Chung’s framework provides solutions to all of these issues. Since our sharing sub-system is a version of Chung’s framework, we describe Chung’s solution to each issue. Where our system takes a different approach, we state the difference between our system and Chung’s; where we state no difference, we have reused Chung’s solution.
The optimization sub-system raises a different set of issues. One issue is how the system is organized. Organizing the system in either a client-server or a peer-to-peer fashion on the users’ machines seems promising. However, the former can overload the machine on which the server is running and hence degrade the response times of the local user, while the latter requires the peer components to reach an agreement, which is a version of the distributed consensus problem.
Another important issue is how the optimization sub-system gathers values of the parameters in the analytical model, which it must do in order to apply the model. In particular, it must measure the input and output processing costs on each user’s computer and the network latencies among all of the users’ computers.
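As a rough illustration of what gathering these parameters involves, the sketch below times a single processing step and estimates a one-way latency from a round trip. It is not the measurement scheme of our optimization sub-system: the helper names `timed` and `one_way_latency` are hypothetical, and the latency estimate assumes that the network delay is the same in both directions.

```python
import time


def timed(process, command, clock=time.perf_counter):
    """Run `process(command)` and return (result, elapsed seconds).

    `clock` is injectable so the helper can be exercised deterministically.
    """
    start = clock()
    result = process(command)
    return result, clock() - start


def one_way_latency(ping_sent, pong_received, remote_processing=0.0):
    """Estimate one-way latency as half the round trip, minus remote work.

    Assumes symmetric network delay, which is a simplification.
    """
    return (pong_received - ping_sent - remote_processing) / 2.0


# Deterministic example with a fake clock that returns 1.0 and then 1.5.
ticks = iter([1.0, 1.5])
output, cost = timed(str.upper, "move pawn", clock=lambda: next(ticks))
latency = one_way_latency(ping_sent=10.0, pong_received=10.2)
```

In a running system, the same wrapper would be applied separately to input processing and output processing on each user’s computer, since the three-user model treats these as distinct costs.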
Once the system gathers the parameter values, it can use the analytical model to predict replicated and centralized architecture response times. The next issue is how it passes the predicted response times to the total order function and then invokes the sharing sub-system’s architecture-switching functionality to change to the architecture returned by the total order function.
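The control flow just described can be sketched as follows. This overview does not define the total order function, so the sketch substitutes a simple stand-in (the mean of the predicted response times) purely for illustration; the names `choose_architecture`, `maybe_switch`, and `mean`, the `switch` callback, and the architecture labels are all hypothetical.

```python
def mean(values):
    """Stand-in ordering key: the average predicted response time."""
    return sum(values) / len(values)


def choose_architecture(predictions, order_key=mean):
    """Return the architecture whose predicted response times rank best.

    `predictions` maps an architecture label to the response times predicted
    for each user. Ties are broken by label so the choice is deterministic.
    """
    return min(sorted(predictions), key=lambda name: order_key(predictions[name]))


def maybe_switch(current, predictions, switch, order_key=mean):
    """Invoke `switch(best)` only if the best architecture differs from `current`."""
    best = choose_architecture(predictions, order_key)
    if best != current:
        switch(best)
    return best


# Hypothetical predictions, in seconds, for three users.
predictions = {
    "centralized@A": [0.2, 0.4, 0.4],
    "replicated": [0.3, 0.3, 0.5],
}
requested = []
chosen = maybe_switch("replicated", predictions, requested.append)
```

Passing the ordering as a parameter keeps the decision logic independent of how response times for different users are weighed against each other.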
We address the issues raised by both the sharing and optimization sub-systems when we describe them below. The rest of this chapter is organized as follows. We first derive our three-user version of Chung’s analytical model. We then describe our version of Chung’s system that can dynamically switch architectures at runtime. Following this, we discuss the upgrades to the system necessary to turn it into our self-optimizing system. Then, we describe experiments conducted with the self-optimizing system. Finally, we present a discussion and a brief summary.