Control theory - Towards auto-scaling in the cloud: online resource allocation techniques

Control theory provides automation mechanisms for the management of complex information systems. Systems controlled by feedback loops can cope with disturbances, uncertainties, and unpredictable changes. Figure 2 presents a standard feedback loop. The system under control is called the target system. It exposes a number of metrics, marked as the measured output in the figure, and a set of control knobs (the control input). The task of the controller is to periodically adjust the control knobs to ensure that the measured output meets its desired value (the reference input) specified by the system designer. To provide a high level of control, the controller algorithm should consider the control error as well as external disturbances, which can affect the measured output of the system.

Figure 2: Standard feedback control loop. Picture from [160].

Control systems can be open or closed. An open-loop system does not use feedback to verify whether the output has reached the desired state, which means that it does not monitor the output of the process it controls. Therefore open-loop systems cannot correct control errors or compensate for disturbances. They are usually used to manage simple processes where feedback is not critical. Closed-loop systems use feedback: they observe the output and correct it if it deviates from the desired value. For a control system it is important not only to react to deviations of the output, but also to anticipate errors. Controllers that anticipate errors rather than react to them are called feed-forward controllers. The best quality of control is achieved when feedback and feed-forward controllers work together. We consider only closed-loop systems, because they offer a feedback mechanism that informs the auto-scaling system about the application state and the utilization of virtual resources. Based on the number of input and output parameters, controllers are classified into SISO (single input, single output) and MIMO (multiple inputs, multiple outputs) controllers. For example, to control the allocation of both CPU and RAM, the auto-scaling system requires a MIMO controller.

According to Hellerstein, Singhal, and Wang [66], the design of a closed-loop controller consists of three main steps. First, one should define the control objective. For example, the objective of an auto-scaling system is to control quality of service by adjusting resource allocation so that performance indicators, such as the 95th percentile of the response time, meet the SLO. In this case the reference input is specified, so the control system solves a regulation problem. Other control objectives are possible, such as the management of resource utilization [108], where the authors target 80% CPU utilization of a web server.
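As an illustration of the regulation objective, the sketch below computes the control error between an SLO and the 95th percentile of a window of response times. The nearest-rank percentile method, the function names, and the sign convention (error = reference minus measured output) are assumptions made for this example, not something prescribed by [66].

```python
import math


def percentile_nearest_rank(samples, p):
    """Return the p-th percentile (integer p, 0 < p <= 100), nearest-rank method."""
    ordered = sorted(samples)
    rank = math.ceil(p * len(ordered) / 100)  # 1-based rank into the sorted list
    return ordered[rank - 1]


def control_error(response_times, slo_seconds):
    """Error e = r - y, where y is the 95th percentile of the measured window."""
    return slo_seconds - percentile_nearest_rank(response_times, 95)
```

A negative error means the measured percentile exceeds the SLO, so the controller should allocate more resources.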

The second step is to describe the software system in terms of control-theoretic concepts. Figure 2 presents the key elements of a feedback control system for regulatory control. Assume we want to design an auto-scaling controller for a VM running a web server. The reference input is the desired response time of the web server. The target system is the VM that runs the web server. The measured output is the response time of the server. Virtual CPU speed is the control knob for regulating the response time. The relationship between input and output can be affected by external disturbances such as the request rate of the web server clients. The goal of the controller is to adapt the control input so that the output stays consistent with the reference input in the presence of external disturbances.
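The web server example can be sketched as a small simulation. Everything numeric here is an assumption for illustration: the `response_time` function is a made-up plant model, the gain and CPU limits are arbitrary, and the disturbance is a single step in the request rate. The sketch only shows how the reference input, measured output, control input, and disturbance fit together in one loop.

```python
def response_time(cpu_share, request_rate):
    """Hypothetical plant: latency (s) rises with load and falls with CPU."""
    return 0.05 + 0.002 * request_rate / cpu_share


def run_loop(reference=0.2, steps=80, gain=2.0):
    """Integral-style feedback loop that adjusts the CPU share each period."""
    cpu = 1.0  # control input (virtual CPU share)
    history = []
    for k in range(steps):
        rate = 100 if k < 30 else 200          # disturbance: load doubles at k = 30
        measured = response_time(cpu, rate)    # measured output
        error = reference - measured           # control error e = r - y
        # More CPU lowers latency, so a negative error must raise the CPU share.
        cpu = min(8.0, max(0.1, cpu - gain * error))
        history.append((k, rate, measured, cpu))
    return history


history = run_loop()
```

In this toy setting the loop raises the CPU share after the load step until the response time returns to the reference value.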

The third step is to obtain a model that describes the relationship between input and output. In control theory this step is referred to as system identification. The relationship can be derived from first principles [65]. Often the exact form of the relationship is not available. In this case, black-box approaches are used to construct generic models with the help of statistical techniques.
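A minimal black-box identification sketch follows, assuming the system can be approximated by a first-order linear model y[k+1] = a*y[k] + b*u[k]. The model structure, the noise-free synthetic data, and the function names are assumptions chosen for illustration. The parameters are estimated by ordinary least squares, solved here through the 2x2 normal equations.

```python
def simulate(a, b, u, y0=0.0):
    """Generate outputs of y[k+1] = a*y[k] + b*u[k] (noise-free toy data)."""
    y = [y0]
    for u_k in u:
        y.append(a * y[-1] + b * u_k)
    return y


def fit_first_order(u, y):
    """Least-squares estimate of (a, b) from input u and output y."""
    n = len(u)
    syy = sum(y[k] * y[k] for k in range(n))
    syu = sum(y[k] * u[k] for k in range(n))
    suu = sum(u[k] * u[k] for k in range(n))
    ty = sum(y[k + 1] * y[k] for k in range(n))
    tu = sum(y[k + 1] * u[k] for k in range(n))
    det = syy * suu - syu * syu
    if abs(det) < 1e-12:
        raise ValueError("input does not excite the system enough")
    a = (ty * suu - tu * syu) / det
    b = (syy * tu - syu * ty) / det
    return a, b


# Varying input signal, so that both parameters can be distinguished.
u = [float((7 * k) % 5) for k in range(40)]
y = simulate(0.8, 0.5, u)
a_hat, b_hat = fit_first_order(u, y)
```

With real measurements the data would be noisy and the estimates only approximate, and a higher-order or non-linear structure may be needed.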

Patikirikorala and Colman [112] provide a classification of well-established control schemes. They group the schemes into four categories: fixed gain controllers, adaptive controllers, model predictive controllers, and reconfiguring controllers.

Fixed gain controllers are the simplest type of controller. The tuning parameters of the controller are set during system identification experiments. One of the most widely used controllers is the Proportional Integral Derivative (PID) controller, which is described by the following equation:

$$u_k = K_p e_k + K_i \sum_{j=1}^{k} e_j + K_d (e_k - e_{k-1}) \qquad (1)$$

Here $u_k$ is the control input, for example the CPU power of a VM, and $e_k$ is the control error, calculated as the difference between the reference input $r$ and the measured output $y$. $K_p$, $K_i$, and $K_d$ are the proportional, integral, and derivative gain parameters. During the setup of the controller, the gain of each component (proportional, integral, and derivative) is tuned to achieve the desired control quality. The gains do not change at runtime, which is why the controller is called fixed gain. This type of controller is useful for systems where workload conditions do not change or change only within a nominal range. However, if the workload fluctuates strongly, the controlled system will experience performance degradation. In [96, 45, 95, 68] fixed gain controllers are applied to dynamic resource allocation. Lim et al. [96] build a proportional integral controller to allocate application server VMs based on CPU utilization. The workload was varied within a predefined operational range using a step function. In [95] the authors apply an integral controller to scale the storage tier (VMs running an HDFS cluster) horizontally. The controller performs reactive scaling when unpredicted load spikes occur. Heo et al. [68] build CPU and memory controllers to scale a web server VM. The controllers periodically adjust the resources in response to workload changes.
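Equation (1) translates almost directly into code. The sketch below is a minimal discrete PID controller. The class and method names are illustrative, the error is taken as reference minus measured output, and the error before the first sample is assumed to be zero. The gains are passed once at construction and never change, which is what makes this a fixed gain controller.

```python
class PID:
    """Discrete PID controller following Eq. (1); gains are fixed at construction."""

    def __init__(self, kp, ki, kd):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.error_sum = 0.0    # running sum of e_j for j = 1..k
        self.prev_error = 0.0   # e_{k-1}; assumed 0 before the first sample

    def update(self, reference, measured):
        """Return the control input u_k for the current sampling period."""
        error = reference - measured               # e_k = r - y_k
        self.error_sum += error
        u = (self.kp * error
             + self.ki * self.error_sum
             + self.kd * (error - self.prev_error))
        self.prev_error = error
        return u


pid = PID(kp=1.0, ki=0.5, kd=0.1)
u1 = pid.update(reference=1.0, measured=0.0)
u2 = pid.update(reference=1.0, measured=0.5)
```

In practice the output would also be clamped to the valid range of the control knob, and the integral term limited to avoid wind-up. Both are omitted here for brevity.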

Adaptive controllers address some limitations of fixed gain controllers. The controller is equipped with an online estimation technique such as the least squares method. With the help of this technique the controller can tune its own parameters to meet high-level objectives provided by the user. Padala et al. [109] propose an adaptive controller for provisioning multi-tier web applications. The controller adjusts the CPU and disk I/O resources of the VMs of each tier. The authors apply the recursive least squares method to periodically update the controller parameters. In [108] an adaptive controller is applied to keep the CPU utilization of a web application close to 80%. Ferguson et al. [51] use a MIMO adaptive controller to meet the job deadlines of a MapReduce application. The authors consider the case where the job deadline can be modified. To adapt to a changed deadline, the controller dynamically reassigns the number of MapReduce tasks running in parallel.
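The online estimation step can be illustrated with a scalar recursive least squares (RLS) estimator. This is a simplified sketch and not the estimator used in [109]: it assumes a one-parameter model y = b*u, a forgetting factor of 0.9, and synthetic noise-free data in which the true gain changes halfway through. The class name and parameters are hypothetical.

```python
class ScalarRLS:
    """Recursive least squares for the model y = b * u with a forgetting factor."""

    def __init__(self, forgetting=0.9, initial_covariance=1000.0):
        self.b = 0.0                  # current estimate of the gain
        self.p = initial_covariance   # uncertainty of the estimate
        self.lam = forgetting         # values below 1 discount old samples

    def update(self, u, y):
        """Fold one (input, output) sample into the estimate and return it."""
        gain = self.p * u / (self.lam + u * self.p * u)
        self.b += gain * (y - self.b * u)                 # correct by prediction error
        self.p = (self.p - gain * u * self.p) / self.lam
        return self.b


# Synthetic data: the true gain jumps from 2.0 to 3.0 after 50 samples.
true_gains = [2.0] * 50 + [3.0] * 50
inputs = [1.0 if k % 2 == 0 else 2.0 for k in range(100)]
rls = ScalarRLS()
estimates = [rls.update(u, b * u) for b, u in zip(true_gains, inputs)]
```

An adaptive controller would use the updated estimate in each period to recompute its gains, so that the control law follows changes in the system behavior.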

Model predictive controllers. The two previous types of controllers are reactive: they cannot anticipate the future behavior of the system. In contrast, a model predictive controller (MPC) can predict future behavior and perform optimization with respect to the predicted values. For auto-scaling systems it is important to have a proactive component in order to make better scaling decisions [17]. Roy, Dubey, and Gokhale [120] apply ARMA-based workload prediction and include the predicted workload in a control loop that adjusts the number of running VMs to maintain a user-defined response time. Nathuji, Kansal, and Ghaffarkhah [106] developed a MIMO controller that regulates resource allocation between multiple batch applications and provides performance according to different quality of service levels.
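The receding-horizon idea behind model predictive control can be sketched as follows. This is a toy example and does not reproduce the method of [120] or [106]: the AR(2) coefficients, the latency model, the per-VM capacity, and the cost weights are all hypothetical. The controller forecasts the request rate over a short horizon, searches exhaustively for the cheapest plan of VM counts, applies only the first move, and re-plans in the next period.

```python
from itertools import product

SLO = 0.2        # target response time in seconds (hypothetical)
CAPACITY = 50.0  # requests/s that one VM can serve (hypothetical)


def forecast(history, horizon, phi=(1.5, -0.5)):
    """AR(2) forecast; the coefficients are illustrative, not fitted."""
    rates = list(history)
    for _ in range(horizon):
        rates.append(max(0.0, phi[0] * rates[-1] + phi[1] * rates[-2]))
    return rates[len(history):]


def predicted_latency(vms, rate):
    """Toy queueing-style model: latency grows sharply as utilization nears 1."""
    rho = rate / (vms * CAPACITY)
    return 0.05 / (1.0 - rho) if rho < 1.0 else float("inf")


def plan_cost(plan, rates, current_vms):
    """Cost of a sequence of VM counts over the forecast horizon."""
    cost, previous = 0.0, current_vms
    for vms, rate in zip(plan, rates):
        cost += vms                                                     # price of running VMs
        cost += 1000.0 * max(0.0, predicted_latency(vms, rate) - SLO)   # SLO penalty
        cost += 0.5 * abs(vms - previous)                               # discourage churn
        previous = vms
    return cost


def mpc_step(history, current_vms, horizon=3, max_vms=6):
    """Return the number of VMs to run in the next period."""
    rates = forecast(history, horizon)
    plans = product(range(1, max_vms + 1), repeat=horizon)
    best = min(plans, key=lambda plan: plan_cost(plan, rates, current_vms))
    return best[0]  # apply only the first move, then re-plan next period
```

Exhaustive search is only feasible for tiny horizons and VM ranges. Practical MPC implementations replace it with a numerical optimizer.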

Reconfiguring controllers. Adaptive controllers dynamically adapt the parameters of the controller, but the control algorithm itself remains unchanged. A reconfiguring controller is a form of adaptive controller that can change the control algorithm at runtime. There has been an attempt [129] to apply such a controller to resource allocation. However, it lacks stability proofs.

One of the most complex parts of applying control-theoretic approaches is building the model of the relationship between input and output. Classical PID controllers assume a single linear model. However, most relationships in computing systems are non-linear. ARMA(X) (autoregressive moving average) models are able to capture the correlation between the current output of the system and recent control inputs. ARMA-based models can anticipate future output values and improve the quality of control. In [160, 109] ARMA models are used to manage the resource allocation of web applications. Kalyvianaki, Charalambous, and Hand [83] applied Kalman filters to control the CPU allocation of 3-tier web applications. They build a MIMO model that captures the correlations in resource usage between the tiers.

Figure 3: Queuing model from [93]. Figure 4: Queuing network from [53].

A number of works [91, 117, 151] employ fuzzy models. A fuzzy model consists of a set of rules that connect input variables with output variables. The model associates workload (input) with resource demand (output). With the help of fuzzy rules, the inputs and outputs of the controller are mapped to fuzzy sets. Essentially, the rules embed the knowledge of a human expert. Each fuzzy set is defined by a membership function that returns a value in the range from 0 to 1. Xu et al. [151] developed a fuzzy controller that learns the relationship between workload, application performance, and resource usage. The obtained model is then used to estimate the CPU required for the incoming workload. A fuzzy model is usually fixed at design time, so a workload with abrupt changes can lead to control overshooting. To address this issue, Rao et al. [117] designed a self-tuning fuzzy controller that can dynamically correct control overshooting. The authors adjust the virtual CPU cap of a database VM to reach the desired response time of a web application under a highly fluctuating workload.
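The mapping from workload to resource demand through fuzzy rules can be sketched as below. The rule base, the triangular membership functions, and the weighted-average defuzzification (in the style of a zero-order Sugeno model) are assumptions chosen for illustration. They are not the models used in [91, 117, 151].

```python
def triangle(x, left, peak, right):
    """Triangular membership function returning a degree in [0, 1]."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


# Hypothetical rule base: (membership parameters for workload in req/s, CPU cores).
RULES = [
    ((-500.0, 0.0, 500.0), 0.5),     # IF workload is low    THEN 0.5 cores
    ((0.0, 500.0, 1000.0), 2.0),     # IF workload is medium THEN 2 cores
    ((500.0, 1000.0, 1500.0), 4.0),  # IF workload is high   THEN 4 cores
]


def cpu_demand(request_rate):
    """Weighted average of the rule outputs, weighted by membership degree."""
    x = min(1000.0, max(0.0, request_rate))  # clamp to the modelled range
    weights = [triangle(x, *params) for params, _ in RULES]
    total = sum(weights)
    return sum(w * out for w, (_, out) in zip(weights, RULES)) / total
```

A request rate between two set peaks activates both neighbouring rules, so the estimated demand changes smoothly with the workload instead of jumping between fixed levels.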

In summary, control theory provides a feedback control mechanism that adapts to changes in workload and operating conditions. However, the quality of control depends greatly on the applied model. Many applications have a non-linear relationship between performance and resource consumption, so non-linear models are needed. Unfortunately, control theory does not provide a general methodology for obtaining the model. The model is usually obtained empirically, which requires extensive experiments and deep domain knowledge. Moreover, the accuracy of resource allocation depends on the type of controller chosen. A reactive controller is simple to implement, but it cannot anticipate future needs. Therefore the focus should be on applying model predictive controllers, which can provide better scaling decisions.
