

2.4 Machine Learning

2.4.2 Machine Learning for Decision Support System

Machine learning not only contributes enormously to areas such as feature and speech recognition, bioinformatics and robotics, but also performs well in resolving complex problems of cloud computing, such as scheduling, load balancing and resource scaling, by forecasting future needs. Machine learning also helps in deriving suitable decision models for complex application scenarios. Some relevant research efforts are discussed here to explore the scope of machine learning in this domain. Furthermore, machine learning based decision support methods are investigated in the context of multi-cloud brokerage solutions. In this regard, different machine learning algorithms are highlighted to indicate their effectiveness as prediction methods.

This work [72] is considered one of the initial efforts on dynamic resource scaling in the cloud. Three platform-agnostic algorithms are analysed under the defined objective. One of the three algorithms is developed by RightScale, while the other two predict system load using linear regression and autoregression of order 1. Regression analysis is a statistical process for estimating the relationship among the variables of a dataset (a scalar dependent variable y and one or more explanatory variables X). The relationship is linear if it is modelled using a linear predictor function. Autoregression is a stochastic process that estimates future values as a weighted sum of past values. Furthermore, the authors established a scoring metric based on availability and cost to measure the effectiveness and efficiency of these algorithms. Here, availability is measured by the number of dropped requests out of the total number of requests. The results demonstrate that linear regression is more susceptible to small fluctuations in the generated load. By contrast, autoregression is less sensitive to load fluctuations. In addition, autoregression is far more reactive than the RightScale algorithm.
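The two regression-based predictors described above can be illustrated with a minimal sketch. It is not the implementation evaluated in [72]: the function names, the ordinary least-squares fit and the availability formula (one minus the fraction of dropped requests) are assumptions made here for illustration only.

```python
def least_squares(xs, ys):
    """Ordinary least-squares fit of ys = intercept + slope * xs."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return y_mean - slope * x_mean, slope


def forecast_linear(load):
    """Linear regression on time: fit load[t] = a + b*t, then extrapolate one step."""
    a, b = least_squares(list(range(len(load))), load)
    return a + b * len(load)


def forecast_ar1(load):
    """Autoregression of order 1: fit load[t] = c + phi*load[t-1], then predict the next value."""
    c, phi = least_squares(load[:-1], load[1:])
    return c + phi * load[-1]


def availability(dropped, total):
    """Assumed availability score: fraction of requests that were not dropped."""
    return 1.0 - dropped / total


if __name__ == "__main__":
    load = [10, 12, 14, 16, 18]  # hypothetical requests per time unit
    print(forecast_linear(load), forecast_ar1(load), availability(5, 100))
```

On a perfectly linear load both predictors agree; their behaviour diverges once the load fluctuates, which is the effect the comparison in [72] is concerned with.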

Caron et al. [73] target resource scaling by identifying patterns in past short-term workload and matching them against current occurrences, in a manner similar to string matching. The Knuth-Morris-Pratt (KMP) algorithm is used to identify similarities between the past and the current data. Historical data about CPU utilization is used as the target pattern, and a fixed unit time of 100 seconds is used to split the data into chunks for matching. This process is very time-consuming, as the current pattern has to be compared with a large number of historical patterns until a match is found. Another limitation is the time unit used for capturing data traces, which is not practical in cloud computing environments where pricing schemes differ. For example, Amazon charges on an hourly basis, so any delay in the decision can increase the utilization cost.
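The string-matching analogy can be sketched as follows. This is only an illustration of standard KMP search applied to a utilization trace, not the procedure of [73]: the bucketing of CPU utilization into discrete symbols, the bucket width of 10 and all function names are assumptions introduced here.

```python
def build_failure_table(pattern):
    """For each prefix of the pattern, length of its longest proper prefix that is also a suffix."""
    fail = [0] * len(pattern)
    k = 0
    for i in range(1, len(pattern)):
        while k > 0 and pattern[i] != pattern[k]:
            k = fail[k - 1]
        if pattern[i] == pattern[k]:
            k += 1
        fail[i] = k
    return fail


def kmp_find_all(history, pattern):
    """Return every start index at which the pattern occurs in the history."""
    if not pattern:
        return []
    fail = build_failure_table(pattern)
    matches = []
    k = 0
    for i, symbol in enumerate(history):
        while k > 0 and symbol != pattern[k]:
            k = fail[k - 1]
        if symbol == pattern[k]:
            k += 1
        if k == len(pattern):
            matches.append(i - k + 1)
            k = fail[k - 1]  # continue so that overlapping matches are also found
    return matches


def bucketize(utilization, width=10):
    """Assumed discretization: map CPU utilization (percent) to integer buckets."""
    return [int(u // width) for u in utilization]


if __name__ == "__main__":
    history = bucketize([12, 35, 58, 81, 33, 14, 37, 55, 83, 40])  # hypothetical trace
    current = bucketize([31, 52, 88])                              # most recent chunks
    print(kmp_find_all(history, current))
```

Each match index points to a past occurrence of the current pattern; the values that followed it in the history can then serve as a forecast of the upcoming load.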

The objective of this work [74] is to provision resources ahead of time, before they are actually required. Future demand is predicted in advance using different machine learning methods. Three quality of service attributes are considered as input to the prediction methods: response time, throughput and CPU utilization. Linear Regression (LR), Support Vector Machine (SVM) and Neural Network (NN) are used as prediction methods. SVM is a supervised learning model, with associated learning algorithms, that analyses data for regression and classification. The NN approach is inspired by the way a biological brain solves problems, where a large number of neurons are gathered in a cluster and attached to a central point. An Artificial Neural Network (ANN) model is defined by the interconnection pattern between the neurons in its different layers. The learning process updates the weights of the interconnections based on the input, and an activation function converts each neuron's weighted input to its output activation.
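The weighted-input and activation step described above can be illustrated for a single neuron. This is a minimal sketch under stated assumptions, not the network used in [74]: the sigmoid activation, the squared-error loss, the learning rate and the normalized input values are all hypothetical choices made here.

```python
import math


def sigmoid(z):
    """Activation function: squashes the weighted input into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def neuron_output(weights, bias, inputs):
    """Weighted sum of the inputs plus bias, passed through the activation function."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return sigmoid(z)


def update_weights(weights, bias, inputs, target, learning_rate=0.5):
    """One gradient-descent step on the squared error 0.5 * (output - target)^2."""
    out = neuron_output(weights, bias, inputs)
    delta = (out - target) * out * (1.0 - out)  # derivative of the error w.r.t. the weighted input
    new_weights = [w - learning_rate * delta * x for w, x in zip(weights, inputs)]
    new_bias = bias - learning_rate * delta
    return new_weights, new_bias


if __name__ == "__main__":
    # Hypothetical normalized inputs: response time, throughput, CPU utilization.
    inputs = [0.2, 0.5, 0.7]
    weights, bias = [0.0, 0.0, 0.0], 0.0
    print(neuron_output(weights, bias, inputs))
    weights, bias = update_weights(weights, bias, inputs, target=1.0)
    print(neuron_output(weights, bias, inputs))
```

A full network stacks many such neurons in layers and propagates the error backwards through them; the single-neuron case only shows how one weight update moves the output towards the target.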

Applications relying on dynamic autoscaling techniques may not be capable of handling a sudden traffic surge resulting from special offers or events and hence suffer from low performance. This work [75] focuses on the limitations of the reactive dynamic auto-scaling approach and on the use of empirical data for adaptive resource provisioning. Similar to [74], this approach targets predictive resource provisioning for web servers using Neural Network (NN) and Linear Regression (LR) methods, but uses a benchmark on Amazon EC2 to collect data for training and testing. This work also uses the sliding window method to capture workload patterns for forecasting. The efficiency of the prediction model depends only on the workload patterns. This work also addresses the limitation of [73] by reducing the unit time for data logging from 100 seconds to 60 seconds, in order to react in line with the billing time of Amazon EC2.
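The sliding window method mentioned above turns a workload trace into training examples for a predictor. The sketch below shows one common way of doing this; the function name, the window size and the sample values are assumptions, and the exact windowing used in [75] is not reproduced here.

```python
def sliding_windows(series, window_size):
    """Pair each run of window_size consecutive observations with the value that follows it."""
    return [
        (series[i:i + window_size], series[i + window_size])
        for i in range(len(series) - window_size)
    ]


if __name__ == "__main__":
    # Hypothetical requests per 60-second interval.
    workload = [120, 135, 150, 170, 160]
    for features, target in sliding_windows(workload, window_size=3):
        print(features, "->", target)
```

Each pair can be fed to a regression or neural network model as (input, expected output), so that the model learns to forecast the next interval from the most recent ones.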

Jim [76] has developed a neural network based framework that learns from actual data to model the performance of a power plant. This work targets the application of machine learning to power optimization in Google data centers. The machine learning based power performance model predicts the power usage in Google data centers and results in improved energy efficiency.

This work [77] targets resource management decisions in cloud computing using machine learning. Support vector regression is used as a prediction method to estimate the response time when designing the resource allocation strategy. A Genetic Algorithm (GA) based resource dispatch mechanism is proposed for the relocation of resources. The resource dispatch mechanism aims at complementing the SLA between virtual machine operators and cloud service providers by utilizing the resources effectively and maintaining the desired performance at the cloud level.

Analysis: The above-mentioned state-of-the-art approaches apply different machine learning methods
