Received: 4 November 2019; Accepted: 31 December 2019; Published: 4 January 2020 Abstract: Evaporation is a very important process; it is one of the most critical factors in agricultural, hydrological, and meteorological studies. Due to the interactions of multiple climatic factors, evaporation is considered a complex and nonlinear phenomenon to model. Thus, machine learning methods have gained popularity in this realm. In the present study, four machine learning methods, Gaussian Process Regression (GPR), K-Nearest Neighbors (KNN), Random Forest (RF) and Support Vector Regression (SVR), were used to predict pan evaporation (PE). Meteorological data including PE, temperature (T), relative humidity (RH), wind speed (W), and sunny hours (S) were collected from 2011 through 2017. The accuracy of the studied methods was determined using the statistical indices of Root Mean Squared Error (RMSE), correlation coefficient (R) and Mean Absolute Error (MAE). Furthermore, Taylor charts were utilized for evaluating the accuracy of the mentioned models. The results of this study showed that at the Gonbad-e Kavus, Gorgan and Bandar Torkman stations, GPR with RMSE of 1.521 mm/day, 1.244 mm/day, and 1.254 mm/day, KNN with RMSE of 1.991 mm/day, 1.775 mm/day, and 1.577 mm/day, RF with RMSE of 1.614 mm/day, 1.337 mm/day, and 1.316 mm/day, and SVR with RMSE of 1.55 mm/day, 1.262 mm/day, and 1.275 mm/day had the most appropriate performances in estimating PE values. It was found that GPR for Gonbad-e Kavus Station with input parameters of T, W and S, and GPR for the Gorgan and Bandar Torkman stations with input parameters of T, RH, W and S, produced the most accurate predictions and were proposed for precise estimation of PE. The findings of the current study indicated that PE values may be accurately estimated with a few easily measured meteorological parameters.
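The three accuracy indices named above (RMSE, MAE, R) are simple to compute from observed and predicted series. A minimal sketch, using made-up daily pan-evaporation values rather than the study's data:

```python
import math

def rmse(obs, pred):
    # Root Mean Squared Error (mm/day when inputs are mm/day)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))

def mae(obs, pred):
    # Mean Absolute Error
    return sum(abs(o - p) for o, p in zip(obs, pred)) / len(obs)

def r(obs, pred):
    # Pearson correlation coefficient
    n = len(obs)
    mo, mp = sum(obs) / n, sum(pred) / n
    cov = sum((o - mo) * (p - mp) for o, p in zip(obs, pred))
    so = math.sqrt(sum((o - mo) ** 2 for o in obs))
    sp = math.sqrt(sum((p - mp) ** 2 for p in pred))
    return cov / (so * sp)

# hypothetical observed vs predicted PE values (mm/day), for illustration only
observed  = [3.1, 4.0, 5.2, 6.3, 4.8]
predicted = [2.9, 4.2, 5.0, 6.0, 5.1]
print(rmse(observed, predicted), mae(observed, predicted), r(observed, predicted))
```

The same three numbers are what a Taylor diagram summarizes graphically for several models at once.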


Their obtained results revealed the high capability of the implemented firefly algorithm in decreasing the prediction error of the standalone ANFIS model in all studied stations. Khosravi et al. [24] examined the potential of five data mining and four ANFIS models for predicting reference evapotranspiration at two stations in Iraq. They stated that for both studied stations, the ANFIS-GA generated the most accurate predictions. Salih et al. [25] investigated the capabilities of co-ANFIS for predicting evaporation from reservoirs using meteorological parameters. The findings of the mentioned study indicated the suitable accuracy of the co-ANFIS model in evaporation estimation. Recently, Feng et al. [26] examined the performance of two solar radiation-based models for the estimation of daily evaporation in different regions of China. They suggested that Stewart's model can be preferred when meteorological data on sunny hours and air temperature are available. Therefore, it is possible to estimate evaporation through intrinsically nonlinear models. Qasem et al. [27] examined the applicability of wavelet support vector regression and wavelet artificial neural networks for predicting PE at the Tabriz and Antalya stations. The obtained results indicated that artificial neural networks had better performance, and the wavelet transforms did not have significant effects in reducing the prediction errors at either studied station. Yaseen et al. [28] predicted PE values using four machine learning models at two stations in Iraq. They reported that the SVM showed the best performance compared to the other studied methods.


most critical factors in agricultural, hydrological, and meteorological studies. Due to the interactions of multiple climatic factors, evaporation is a complex and nonlinear phenomenon; therefore, data-based methods can be used to obtain precise estimations of it. In this regard, in the present study, Gaussian Process Regression (GPR), Nearest-Neighbor (IBK), Random Forest (RF) and Support Vector Regression (SVR) were used to estimate pan evaporation (PE) in the meteorological stations of Golestan Province, Iran. For this purpose, meteorological data including PE, temperature (T), relative humidity (RH), wind speed (W) and sunny hours (S) were collected from the Gonbad-e Kavus, Gorgan and Bandar Torkman stations from 2011 through 2017. The accuracy of the studied methods was determined using the statistical indices of Root Mean Squared Error (RMSE), correlation coefficient (R) and Mean Absolute Error (MAE). Furthermore, Taylor charts were utilized for evaluating the accuracy of the mentioned models. The outcome indicates that, in the optimum state at the Gonbad-e Kavus, Gorgan and Bandar Torkman stations respectively, Gaussian Process Regression (GPR) with error values of 1.521, 1.244, and 1.254, Nearest-Neighbor (IBK) with error values of 1.991, 1.775, and 1.577, Random Forest (RF) with error values of 1.614, 1.337, and 1.316, and Support Vector Regression (SVR) with error values of 1.55, 1.262, and 1.275 have the most appropriate performances in estimating


Ren and Zhang [18] used a Support Vector Machine combined with Minimum Enclosing Ball, a method they called MEB-SVM, to classify static gestures acquired from a video camera. After the image acquisition, image segmentation, contour selection, and classification, their work achieved a mean recognition rate of 92.89%. Liu et al. [19] used an SVM with Hu moments to classify hand postures acquired from a camera, automating the verification of hand integrality for the Chinese Driver Physical Examination System. An error rate of 3.5% was generated after tests were executed with data from 20 people. Chen and Tseng [20] developed a robotic visual system to recognize static gestures for finger guessing games. An SVM classifier was implemented and configured to be robust enough to work regardless of hand angles and skin colors. In their tests, their setup achieved a correct recognition rate of 95.0% for the paper, rock, and scissors game, using data from four people. Meng, Pears, and Bailey [21] presented a method to recognize human actions from video streams, using a linear SVM as classifier, trained with data acquired from Motion History Image (MHI) and Hierarchical Motion History Histogram (HMHH). Using examples of walking, jogging, running, boxing, hand clapping, and hand waving, recorded from 25 people in four different scenarios, the method achieved a maximum recognition rate of 93.1%.


The city of Hamadan, as an ecotourism and historic destination, can provide a platform for the development of sustainable tourism through proper planning and provision of infrastructure. This study aimed a) to identify factors affecting the satisfaction of tourists traveling to the city of Hamadan, in line with appropriate decision making for the development of tourism as well as increasing tourist satisfaction; and b) to compare the performance of two data mining techniques, random forest (RF) and K-nearest neighbors (KNN), in predicting tourist satisfaction.

The most important work in software development is estimation. Various software effort estimation models came into existence when people started following the standard project management process. Researchers often use KLOC, Story Points, Function Points and Use Case Points as measures of size. The several techniques used to calculate effort are broadly classified into algorithmic and non-algorithmic models [23]. Algorithmic models such as COCOMO, COCOMO-II, Putnam's, etc., cannot do early estimation because the attributes they use can only be calculated after project completion. So for early estimation, the best alternative is non-algorithmic models such as expert-based, learning-based, linguistic-based and optimization-based models. In this research we discuss machine learning techniques, which can perform early estimation, have the ability to handle non-linear functions, are adaptable to any environment, and allow the confidence in a decision to be calculated.

According to the results of the analysis, it was determined that technical indicators are very successful in predicting stock prices (Aldin et al., 2012). Taylor and Allen studied technical analysis in the foreign exchange market. Within this context, they conducted a survey of foreign exchange dealers in London. As a result of the survey analysis, it was determined that 90% of dealers use technical analysis in their work. Another result of this study is that dealers trust technical analysis results for short time periods; however, they prefer fundamental analysis in the long term (Taylor and Allen, 1992). Blume and others investigated the applicability of stock exchange volume for technical analysis. They concluded that volume is very informative for determining the value of a stock. Another conclusion of this study is that investors who use statistical information are more successful than others in their investments (Blume et al., 1994). Lam tried to integrate fundamental and technical analysis for financial performance prediction. Within this scope, financial data of 364 S&P companies for the years between 1985 and 1995 were used. In addition, the neural networks method was used in this study to achieve this objective. As a result of the analysis, it was determined that using fundamental and technical analysis together gives better results (Lam, 2004). Chavarnakul and Enke studied the performance of two technical indicators of the technical analysis approach. Within this scope, they used the generalized regression neural network method. Furthermore, S&P 500 index data were used in this study. In conclusion, it was found that stock trading using the neural network showed better performance than stock trading without neural network assistance (Chavarnakul and Enke, 2008).


Contrary to the application of KNNcatImpute to the GENICA data set, in which all SNPs/observations are considered in the search for the k nearest SNPs/observations, we here "just" use the SNPs without missing values to identify the k nearest neighbors of a SNP with missing values. Moreover, we restrict the search by considering the SNPs chromosome-wise, such that the missing genotypes of a particular SNP are imputed using only SNPs that come from the same chromosome as the considered SNP. The latter is not only time-saving, but also biologically meaningful, as only SNPs from the same chromosome are inherited together. In Table 2, the mean fractions of falsely imputed values are summarized for the different settings of KNNcatImpute. This table shows that while employing the corrected Pearson's contingency coefficient also works poorly in the application to the HapMap data, the other three distance measures perform almost equally well, where the scaled Manhattan distance exhibits slightly lower error rates than d_SMC, which in turn leads to slightly less falsely imputed
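The chromosome-wise scheme above can be sketched as follows. This is not KNNcatImpute's actual implementation: the 0/1/2 genotype coding, the toy data, and majority voting among neighbor genotypes are illustrative assumptions; only the simple-matching distance and the restriction to complete SNPs come from the text.

```python
from collections import Counter

def smc_distance(a, b):
    # simple-matching distance between two genotype vectors,
    # computed over positions where both values are observed
    pairs = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    matches = sum(x == y for x, y in pairs)
    return 1.0 - matches / len(pairs)

def knn_impute(target, complete_snps, k=3):
    # fill each missing genotype in `target` from the k complete SNPs
    # (same chromosome) that are closest under the simple-matching distance
    neighbors = sorted(complete_snps, key=lambda s: smc_distance(target, s))[:k]
    imputed = list(target)
    for i, g in enumerate(imputed):
        if g is None:
            votes = Counter(s[i] for s in neighbors)
            imputed[i] = votes.most_common(1)[0][0]
    return imputed

# toy genotypes coded 0/1/2; None marks a missing value
snp_with_gap = [0, 1, None, 2, 1]
chromosome = [
    [0, 1, 1, 2, 1],
    [0, 1, 1, 2, 0],
    [2, 0, 0, 1, 2],
]
print(knn_impute(snp_with_gap, chromosome, k=2))  # gap filled from the two closest SNPs
```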


We consider learning on graphs, guided by kernels that encode similarity between vertices. Our focus is on random walk kernels, the analogues of squared exponential kernels in Euclidean spaces. We show that on large, locally treelike graphs these have some counter-intuitive properties, specifically in the limit of large kernel lengthscales. We consider using these kernels as covariance functions of Gaussian processes. In this situation one typically scales the prior globally to normalise the average of the prior variance across vertices. We demonstrate that, in contrast to the Euclidean case, this generically leads to significant variation in the prior variance across vertices, which is undesirable from a probabilistic modelling point of view. We suggest the random walk kernel should be normalised locally, so that each vertex has the same prior variance, and analyse the consequences of this by studying learning curves for Gaussian process regression. Numerical calculations as well as novel theoretical predictions for the learning curves using belief propagation show that one obtains distinctly different probabilistic models depending on the choice of normalisation. Our method for predicting the learning curves using belief propagation is significantly more accurate than previous approximations and should become exact in the limit of large random graphs.
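The contrast between global and local normalisation can be illustrated numerically. The sketch below uses one common parameterisation of a random walk kernel, K = ((1-γ)I + γ D^(-1/2) A D^(-1/2))^p; the paper's exact kernel family and parameter values may differ, and the small graph is made up.

```python
import numpy as np

def random_walk_kernel(A, gamma=0.5, p=3):
    # K = ((1 - gamma) I + gamma * D^{-1/2} A D^{-1/2})^p
    # (one common parameterisation; the source's exact form may differ)
    d = A.sum(axis=1)
    Dinv = np.diag(1.0 / np.sqrt(d))
    S = Dinv @ A @ Dinv
    M = (1 - gamma) * np.eye(len(A)) + gamma * S
    return np.linalg.matrix_power(M, p)

def normalise_locally(K):
    # rescale so every vertex has unit prior variance: K_ii = 1 for all i
    s = np.sqrt(np.diag(K))
    return K / np.outer(s, s)

# small irregular graph: a path with one extra edge, degrees 1, 3, 2, 2
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 1],
              [0, 1, 0, 1],
              [0, 1, 1, 0]], dtype=float)
K = random_walk_kernel(A)
Kn = normalise_locally(K)
print(np.diag(K))   # prior variance varies across vertices
print(np.diag(Kn))  # all ones after local normalisation
```

Global scaling would multiply all of `np.diag(K)` by one constant, so the variation across vertices would remain; local normalisation removes it by construction.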


Principal Component Analysis (PCA). In order to examine the relationships among a set of p correlated variables, it may be useful to transform the original set of variables into a new set of uncorrelated variables called principal components. These new variables are linear combinations of the original variables and are derived in decreasing order of importance so that, for example, the first principal component accounts for the largest share of the variance of the original data. PCA originated in work by Karl Pearson around the turn of the century and was further developed in the 1930s by Harold Hotelling (Chatfield and Collins, 1980). The usual objective of the analysis is to see if the first few components account for most of the variation in the original data. In other words, if some of the original variables are highly correlated, they effectively carry the same information, and there may be near-linear constraints on the variables. This method simply finds components which are close to the original variables but arranged in decreasing order of variance (Liu et al. 2003). As a result, the information in the original variables is exhibited by the derived principal components without wasting aspects of the data's information (Konishi and Rao 2014). The PCA can be explained in the four stages below: A) Calculation of the KMO factor
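The transformation described above can be sketched via an eigendecomposition of the covariance matrix; the synthetic data (two highly correlated variables plus one noise variable) is an illustrative assumption.

```python
import numpy as np

def pca(X, n_components=2):
    # centre the data, then eigendecompose the covariance matrix
    Xc = X - X.mean(axis=0)
    cov = np.cov(Xc, rowvar=False)
    eigval, eigvec = np.linalg.eigh(cov)
    order = np.argsort(eigval)[::-1]          # decreasing order of variance
    eigval, eigvec = eigval[order], eigvec[:, order]
    scores = Xc @ eigvec[:, :n_components]    # the principal components
    explained = eigval / eigval.sum()         # fraction of total variance per component
    return scores, explained

rng = np.random.default_rng(0)
# two highly correlated variables plus one independent noise variable
x = rng.normal(size=100)
X = np.column_stack([x, x + 0.05 * rng.normal(size=100), rng.normal(size=100)])
scores, explained = pca(X, n_components=2)
print(explained)  # the first component carries most of the variance
```

Because two of the three variables are nearly identical, the first component alone accounts for most of the variation, which is exactly the "first few components" objective stated above.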


Abstract— Recognition of fruit images is a major challenge for computers. Most fruit recognition techniques combine different analysis methods such as color-based, shape-based, size-based and texture-based. Different fruit images may share the same color and shape values, so these features alone are not robust and effective for recognizing and identifying the images. We introduce a new fruit recognition technique that combines four feature analysis methods (shape, size, color, and texture) to increase the accuracy of recognition. The proposed method uses the nearest neighbor classification algorithm, which classifies and recognizes a fruit image from the nearest training fruit examples. The system takes a fruit image as input, and the recognition system then shows the fruit name. The proposed fruit recognition system analyzes, classifies and identifies fruits; it can sharply improve educational learning for small kids and can be used in grocery stores to automate labeling and price computation.
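Classification from the nearest training examples can be sketched as below. The feature values (hue, roundness, size, texture roughness) and fruit examples are hypothetical stand-ins for the paper's extracted image features.

```python
import math
from collections import Counter

def knn_classify(query, training, k=3):
    # vote among the k training fruits closest in feature space
    by_dist = sorted(training, key=lambda ex: math.dist(query, ex[0]))
    votes = Counter(label for _, label in by_dist[:k])
    return votes.most_common(1)[0][0]

# hypothetical features: [colour hue, roundness, size (cm), texture roughness]
training = [
    ([0.05, 0.90, 8.0, 0.2], "apple"),
    ([0.08, 0.90, 7.5, 0.2], "apple"),
    ([0.15, 0.60, 12.0, 0.5], "banana"),
    ([0.16, 0.50, 13.0, 0.5], "banana"),
    ([0.10, 0.95, 6.0, 0.8], "orange"),
    ([0.11, 0.90, 6.5, 0.8], "orange"),
]
print(knn_classify([0.06, 0.88, 7.8, 0.25], training, k=3))
```

Combining the four feature types into one vector is what lets the classifier separate fruits whose color or shape alone would be ambiguous.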

In particular, a large number of data-driven models have been created. For example, empirical models and machine learning algorithms have been extensively investigated. Stephens and Stewart (1963) developed an empirical model using radiation and air temperature. This model was found to perform best among 23 models in extremely arid areas (Al-Shalan and Salih 1987). Hanson (1989) presented an empirical equation using radiation and air temperature in the USA. Linacre (1977) proposed a simple model using temperature in Australia. Rotstayn et al. (2006) coupled the radiative component and the aerodynamic component to develop the PenPan model, which was later validated by Roderick et al. (2007) and Johnson and Sharma (2010) across Australia. Lim et al. (2016) modified the PenPan model to present the PenPan-V2 model, which was found to outperform the original PenPan model in Australia. Patel and Majmundar (2016) obtained empirical relations as functions of air temperature, relative humidity, wind velocity, and sunshine duration in India. Andreasen et al. (2017) developed multilinear regression models using various combinations of meteorological variables in the USA. The main benefit of empirical models is that the meteorological variables are routinely measured and easily available. However, they can only be applied to places with similar climatic conditions (Goyal et al. 2014). Moreover, empirical models cannot provide accurate estimations due to the complex process of evaporation (Shalamu 2011).
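A multilinear-regression evaporation model of the kind attributed above to Andreasen et al. can be sketched with a least-squares fit. The data here is synthetic, generated from an assumed linear rule, so it only illustrates the fitting procedure, not any published coefficients.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200
T  = rng.uniform(10, 35, n)    # air temperature (degC)
RH = rng.uniform(30, 90, n)    # relative humidity (%)
W  = rng.uniform(0, 8, n)      # wind speed (m/s)
S  = rng.uniform(2, 12, n)     # sunshine duration (h)
# synthetic pan evaporation from an assumed linear rule plus noise (mm/day)
E = 0.2 * T - 0.03 * RH + 0.3 * W + 0.15 * S + rng.normal(0, 0.3, n)

X = np.column_stack([np.ones(n), T, RH, W, S])   # design matrix with intercept
coef, *_ = np.linalg.lstsq(X, E, rcond=None)     # least-squares coefficients
pred = X @ coef
rmse = np.sqrt(np.mean((E - pred) ** 2))
print(coef, rmse)
```

The fitted coefficients recover the assumed rule, which is why such models work well where the climate resembles the calibration data, and poorly elsewhere, as the paragraph notes.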


In "Assessing reverse supply chain efficiency: the manufacturer's perspective," motivated by the significance of environmental sustainability and remanufacturing activities, M. Kumar et al. use the established fuzzy data envelopment analysis (FDEA) approach to study reverse supply chain management. They conduct their research from the manufacturer's point of view. In particular, they convert the proposed FDEA model into a linear programming problem, which is then formulated as an interval programming problem. They argue that their proposed model can produce robust results. They show that the ISO 14001 certification scheme only marginally improves the supply chain's environmental performance. Surprisingly, however, their findings show that companies that have implemented reverse manufacturing practices for a shorter period of time can practically outperform those which have implemented reverse supply chain practices for a longer period.

Calculation of the performance of a solar thermal system is highly complex when using an analytical modelling approach. An overview of the theoretical equations governing the thermal dynamics of solar thermal collectors can be found in Duffie and Beckman (2013). Often, computational models are required to capture the physical phenomena at the expense of a large amount of computational time and power. A combination of finite difference and electrical analogy models were used in (Notton et al., 2013; Motte et al., 2013) to calculate the outlet temperature of a building integrated solar thermal collector. The accuracy of the numerical model was validated against experimental data, allowing the authors to simulate future geometric and material design alterations to improve the efficiency of the solar collector. A numerical modelling approach was applied to a building integrated, aerogel covered, solar air collector in Dowson et al. (2012). From this, the authors were able to calculate outlet temperatures and collector efficiency from weather conditions. The model outputs were validated to within 5% of the measured values over a short measurement period. As a result, the authors could simulate much longer time periods to demonstrate the potential efficiency and financial payback of their proposed solution. A numerical modelling approach within the MATLAB environment applied to a v-groove solar collector was developed in Karim et al. (2014). The resulting model can predict the air temperature at any part of the


As an additional analysis, the performance of both models is also evaluated under extrapolating conditions. Extrapolation is a term used to describe the scenario where a model is forced to perform prediction in regions beyond the space of the original training data set. Due to the varying nature of processes in industry, empirical models tend to suffer from reduced robustness due to their incapability to maintain their original accuracy for data outside the original training range (Castillo 2003; Himmelblau 2008; Kordon 2004; Lennox et al. 2001; Nelles 2001). In process industries, extrapolation is completely unavoidable because plants often operate outside the range of the original identification data used to develop the model (Castillo 2003; Kordon 2004). Variations in processes are in fact a dominant and frequently encountered event. Many factors dictate such
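The loss of accuracy under extrapolation can be demonstrated with a toy empirical model; the sine "process", the polynomial model class, and the training range are all assumptions for illustration, not the models evaluated in the text.

```python
import numpy as np

rng = np.random.default_rng(2)

def true_process(x):
    # stand-in for the unknown plant behaviour
    return np.sin(x)

# identify the empirical model only on x in [0, 3]
x_train = rng.uniform(0, 3, 100)
y_train = true_process(x_train) + rng.normal(0, 0.05, 100)
coeffs = np.polyfit(x_train, y_train, deg=5)   # empirical polynomial model

def rmse_on(lo, hi):
    x = np.linspace(lo, hi, 200)
    return np.sqrt(np.mean((np.polyval(coeffs, x) - true_process(x)) ** 2))

print(rmse_on(0, 3))   # interpolation: error stays small
print(rmse_on(3, 5))   # extrapolation: error grows sharply beyond the data
```

Inside the identification range the fit is accurate; outside it, the same model degrades rapidly, which is the robustness problem the paragraph describes.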

In this paper, LS-WSVR with three different wavelet kernels is applied to forecasting fund volatility, and the in-sample and out-of-sample forecasting performance of these LS-WSVR models is compared with that of LS-SVR with Gaussian kernel functions according to the evaluation indices. The remainder of this paper is organized as follows. Section 2 presents the theory of the LS-WSVR algorithm. Empirical results on the SZSE fund index illustrating the effectiveness of the LS-WSVR are provided in Section 3. Conclusions are given in the final section.

or the Gaussian kernel, SVMs were able to obtain extremely good performance on this problem. This was particularly surprising since the input attributes x were just a 256-dimensional vector of the image pixel intensity values, and the system had no prior knowledge about vision, or even about which pixels are adjacent to which other ones. Another example that we briefly talked about in lecture was that if the objects x that we are trying to classify are strings (say, x is a list of amino acids, which strung together form a protein), then it seems hard to construct a reasonable, "small" set of features for most learning algorithms, especially if different strings have different lengths. However, consider letting φ(x) be a feature vector that counts the number of occurrences of each length-k substring in x. If we're considering strings of English letters, then there are 26^k such strings. Hence, φ(x)
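The feature map φ(x) described above, and the inner product between two such feature vectors, can be sketched without ever materialising the full 26^k-dimensional vector, by storing only the substrings that actually occur:

```python
from collections import Counter

def substring_features(x, k=2):
    # phi(x): count of every length-k substring occurring in x
    # (a sparse view of the 26^k-dimensional feature vector)
    return Counter(x[i:i + k] for i in range(len(x) - k + 1))

def string_kernel(a, b, k=2):
    # inner product <phi(a), phi(b)>, summing only over substrings seen in a
    fa, fb = substring_features(a, k), substring_features(b, k)
    return sum(fa[s] * fb[s] for s in fa)

print(string_kernel("banana", "ananas", k=2))  # shared bigrams "an" and "na"
```

This sparse evaluation is the essence of the kernel trick for strings: the kernel value is computed in time proportional to the string lengths, even though the explicit feature space is exponentially large in k.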


We used regression trees as basis functions. Boosting regression trees involves generating a sequence of trees, each grown on the residuals of the previous tree [5,9]. Prediction is accomplished by weighting the ensemble outputs of all the regression trees. We used stochastic gradient boosting, assuming the Gaussian distribution for minimizing squared-error loss, in the R package gbm [9]. We determined the main tuning parameter, the optimal number of iterations (or trees), using an out-of-bag estimate of the improvement in predictive performance. This evaluates the reduction in deviance based on observations not used in selecting the next regression tree. The minimum number of observations in the trees' terminal nodes was set to 1, the shrinkage factor applied to each tree in the expansion to 0.001, and the fraction of the training set observations randomly selected to propose the next tree in the expansion to 0.5. With these settings, boosting regression trees with at most 8-way interactions between SNPs required 3656 iterations for the training dataset, based on inspecting graphical plots of the out-of-bag change in squared error loss against the number of iterations [9].
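The "each tree grown on the residuals of the previous tree" idea can be sketched with depth-1 trees (stumps) in plain Python. This is a simplified stand-in for gbm, not its implementation: it omits subsampling, uses a much larger shrinkage (0.1) and far fewer iterations than the paper's 0.001/3656 so the toy run converges quickly, and the data is synthetic.

```python
import numpy as np

def fit_stump(x, y):
    # best single-split regression tree (stump) under squared-error loss
    best = None
    for t in np.unique(x):
        left, right = y[x <= t], y[x > t]
        if len(left) == 0 or len(right) == 0:
            continue
        loss = ((left - left.mean()) ** 2).sum() + ((right - right.mean()) ** 2).sum()
        if best is None or loss < best[0]:
            best = (loss, t, left.mean(), right.mean())
    _, t, lmean, rmean = best
    return lambda q: np.where(q <= t, lmean, rmean)

def boost(x, y, n_trees=200, shrinkage=0.1):
    # each stump is fitted to the residuals of the ensemble so far
    pred = np.full_like(y, y.mean())
    stumps = []
    for _ in range(n_trees):
        residual = y - pred
        stump = fit_stump(x, residual)
        pred = pred + shrinkage * stump(x)
        stumps.append(stump)
    return lambda q: y.mean() + shrinkage * sum(s(q) for s in stumps)

rng = np.random.default_rng(3)
x = rng.uniform(0, 1, 200)
y = np.sin(6 * x) + rng.normal(0, 0.1, 200)
model = boost(x, y)
fit_rmse = np.sqrt(np.mean((model(x) - y) ** 2))
print(fit_rmse)  # the shrunken ensemble fits far better than any single stump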

Many researchers have also investigated the applicability of time series analysis, first proposed by Box and Jenkins [15], to hydrology studies, such as rainfall [16], flow [17,18], wind speed [19,20], and radiation [21,22]. Kisi (2004) used Artificial Neural Networks (ANN) to predict monthly flow and compared the results with autoregressive models. He stated that ANN predictions, in general, are better than those found with AR(4) [23]. Yurekli and Ozturk (2003) determined alternative autoregressive moving average (ARMA) models using the graphs of the autocorrelation (ACF) and partial autocorrelation (PACF) functions. The plots of the ACF showed that ARMA(1,0) with a constant was the best model by considering the Schwarz Bayesian Criterion (SBC) and error estimates [24]. Torres et al. (2005) used ARMA and persistence models to predict the hourly average wind speed up to 10 h in advance. They showed that the use of ARMA models significantly improved wind speed forecasts compared to those obtained with persistence models [25]. Wu and Chau (2010) investigated ARMA, K-Nearest-Neighbors (KNN), ANN and Phase Space Reconstruction-based Artificial Neural Network (ANN-PSR) models to determine the optimal approach for predicting monthly streamflow time series. They compared these models by a one-month-ahead forecast. They determined that the KNN model gives the best performance among the four models, but only exhibits weak superiority to ARMA [26]. Alhassoun et al. (1997) generated annual and monthly evaporation sequences using the first-order Markov model for ten stations in Saudi Arabia. They evaluated the performance of the developed models using the methods of fragments, Thomas-Fiering and Two-Tier, and defined their suitability [27]. Knapp et al. (1984) generated a weekly evaporation time series using the mass transfer method for Milford Lake. They also developed a mathematical model for the time series. The model
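Identifying an ARMA model from the ACF, as Yurekli and Ozturk did, rests on computing the sample autocorrelation function. A minimal sketch on a simulated AR(1) series (the 0.8 coefficient and series length are illustrative assumptions):

```python
import numpy as np

def acf(x, max_lag=10):
    # sample autocorrelation function, used to identify ARMA orders
    x = np.asarray(x, dtype=float) - np.mean(x)
    denom = np.dot(x, x)
    return np.array([np.dot(x[:len(x) - h], x[h:]) / denom
                     for h in range(max_lag + 1)])

# simulate an AR(1) process: x_t = 0.8 * x_{t-1} + eps_t
rng = np.random.default_rng(4)
x = np.zeros(2000)
for t in range(1, 2000):
    x[t] = 0.8 * x[t - 1] + rng.normal()

r = acf(x, max_lag=5)
print(r)  # decays roughly geometrically, by a factor near 0.8 per lag
```

A geometrically decaying ACF with a cutoff in the PACF after lag 1 is the classical Box-Jenkins signature of an AR(1), i.e. ARMA(1,0), model.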

In this paper, we will prove that many SVMs based on Lipschitz continuous loss functions have a bounded Bouligand influence function. To formulate our results we will use Bouligand derivatives in the sense of Robinson (1991) as defined above. These directional derivatives were, to the best of our knowledge, not used in robust statistics so far, but are successfully applied in approximation theory for non-smooth functions. Section 2 covers our definition of the Bouligand influence function (BIF) and contains the main result, which gives the BIF for support vector machines based on a bounded kernel and a B-differentiable Lipschitz continuous convex loss function. In Section 3 it is shown that this result covers the loss functions L_ε, L_{τ-pin}, L_{c-Huber}, and L_log as special cases. Section 4
