Here the Euclidean distance analysis is performed on the PSD result obtained for each patient in the database. First, the dataset (PSD output) of one normal patient is treated as the test data. The PSD outputs of all the remaining patients, normal as well as epileptic, form the combined dataset. The test patient's data is compared with every patient in the combined dataset using the Euclidean distance method: the first row of the test patient's data is compared with each row of the first patient in the database, then the second row of the test patient's data is compared with each row of the first patient, and so on for the remaining rows. Thus, if the test patient's data has dimensions (200×16) and the first patient's data in the combined dataset has dimensions (256×16), the output of the Euclidean distance analysis is a matrix of dimension (200×256). The minimum Euclidean distance is then found per row, giving a matrix of dimension (200×1), and all the rows are summed to obtain a single minimum-distance score for the comparison between the test patient and the first patient in the database. The same minimum Euclidean distance score is computed between the test patient and the second patient in the database, and the process continues until the test patient's data has been compared with every patient in the combined dataset.
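The comparison described above can be sketched in plain Python (a minimal illustration; the function and variable names, and the toy matrix sizes, are mine, not from the paper):

```python
import math

def min_distance_score(test, patient):
    """Compare every row of `test` against every row of `patient`:
    form the full pairwise Euclidean distance matrix, take the
    minimum per test row, and sum those minima into one score."""
    score = 0.0
    for t_row in test:                      # e.g. 200 rows
        best = min(
            math.sqrt(sum((a - b) ** 2 for a, b in zip(t_row, p_row)))
            for p_row in patient            # e.g. 256 rows
        )
        score += best                       # sum of per-row minima
    return score

# Toy PSD matrices (2 rows x 3 channels instead of 200 x 16 / 256 x 16)
test_data = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
patient_1 = [[1.0, 2.0, 3.0], [7.0, 8.0, 9.0], [0.0, 0.0, 0.0]]
print(min_distance_score(test_data, patient_1))
```

Repeating this against every patient in the combined dataset and keeping the patient with the smallest score gives the nearest-match decision the text describes.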

The PIMA Indian Diabetes dataset was collected from the UCI repository, University of California. The training and testing datasets were prepared by normalizing the instances of the data. The analysis compares different distance metrics, namely the Manhattan, Euclidean, Chebyshev, and Minkowski distance methods, along with the existing distance method proposed by Cătălin Stoean, as the fitness function in a Genetic Algorithm. Among these distance metrics, the Minkowski distance achieves the highest accuracy. The accuracy of each distance method is presented in tabular form as well as in a column chart below.
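The four named metrics are all standard and easy to state as code (a sketch; the paper does not specify the Minkowski order p, so p=3 below is purely an illustrative default):

```python
def manhattan(x, y):
    # L1: sum of absolute coordinate differences
    return sum(abs(a - b) for a, b in zip(x, y))

def euclidean(x, y):
    # L2: square root of summed squared differences
    return sum((a - b) ** 2 for a, b in zip(x, y)) ** 0.5

def chebyshev(x, y):
    # L-infinity: largest single coordinate difference
    return max(abs(a - b) for a, b in zip(x, y))

def minkowski(x, y, p=3):
    # Lp: generalizes the above (p=1 Manhattan, p=2 Euclidean)
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1 / p)
```

Since Minkowski generalizes the other metrics, tuning p inside the GA fitness function effectively searches over this whole family, which may explain its edge in accuracy.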

of edit operations, typically insertion, deletion, or replacement of a single symbol, the minimum distance between two strings is the minimum number of edit operations required to transform one string into the other. In the context of parsing, the target string is a sentence of the context-free language into which the syntactically incorrect input string must be transformed. A perfect repair is therefore a sentence nearest to the actual input, in the sense that there is no sentence whose minimum distance from the input is smaller. Algorithms for global correction, which aim to construct such repairs, exist [5; 7; 8; 9; 10], but are typically based
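The minimum edit distance over insertions, deletions, and single-symbol replacements is the classic Levenshtein distance, computable by the Wagner–Fischer dynamic program (a standard textbook algorithm, shown here for reference, not code from the cited works):

```python
def levenshtein(s, t):
    """Minimum number of single-symbol insertions, deletions,
    and replacements needed to turn s into t (Wagner-Fischer DP,
    keeping only the previous row of the DP table)."""
    prev = list(range(len(t) + 1))          # distance from "" to t[:j]
    for i, cs in enumerate(s, 1):
        cur = [i]                           # distance from s[:i] to ""
        for j, ct in enumerate(t, 1):
            cur.append(min(prev[j] + 1,               # delete cs
                           cur[j - 1] + 1,            # insert ct
                           prev[j - 1] + (cs != ct))) # replace or match
        prev = cur
    return prev[-1]
```

Global correction generalizes this from a fixed target string to the nearest sentence of an entire context-free language, which is what makes those algorithms so much more expensive.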


Abstract—Hoaxes are non-malicious viruses. They live on deceiving human perception by conveying false claims as truth. Throughout history, hoaxes have been able to influence many people, to the extent of tarnishing victims' image and credibility. Moreover, wrong and misleading information has always been a distortion to human growth. Some hoaxes are crafted so that they can even obtain personal data, by convincing victims that those data are required for official purposes. Hoaxes differ from spam in that they masquerade behind the addresses of those related, directly or indirectly, to us. Most of the time they appear as forwarded messages, and sometimes they come from legitimate companies. This paper addresses this issue by developing a hoax detection system that incorporates a text-matching method using the Levenshtein distance measure. The proposed model is used to identify text-based hoax emails, and sensitivity and specificity are used to evaluate the accuracy of the system in identifying them.
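Sensitivity and specificity are computed from the confusion matrix of the detector's decisions; a minimal sketch of the standard definitions (the boolean-label encoding is my own choice for illustration):

```python
def sensitivity_specificity(actual, predicted):
    """actual/predicted: parallel lists of booleans, True = hoax.
    Sensitivity = TP / (TP + FN): fraction of real hoaxes caught.
    Specificity = TN / (TN + FP): fraction of genuine mail passed."""
    tp = sum(a and p for a, p in zip(actual, predicted))
    tn = sum(not a and not p for a, p in zip(actual, predicted))
    fp = sum(not a and p for a, p in zip(actual, predicted))
    fn = sum(a and not p for a, p in zip(actual, predicted))
    return tp / (tp + fn), tn / (tn + fp)
```

Reporting both numbers matters here because a detector that flags everything as a hoax has perfect sensitivity but zero specificity, and vice versa.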

Estimating the density of regeneration, or number of seedlings per unit area, on a given site is important to foresters for assessing existing regeneration, determining reforestation needs, and determining whether reforestation efforts have been successful. A variety of sampling methods have been developed for estimating regeneration density. The majority fall into two general categories: plot sampling and distance sampling (Payandeh and Ek, 1986). Plot sampling is the traditional approach: fixed-size plots are established within an area, the trees within each plot are counted, and the tree counts are converted to a density estimate. Distance sampling, on the other hand, involves measuring the distance(s) from a sample point or tree to other tree(s) within an area and then using these distance(s) to estimate density. Our objective was to compare the performance of a new distance-based method, known as the mean distance method, as an alternative to traditional plot sampling. Both methods were evaluated through computer simulation analysis on 405-square-meter (0.1 acre) blocks of a field-surveyed clumped distribution and a computer-generated random distribution at density levels of 100, 50, and 25%.

Figures 1-3 show the graphs for scaling and Tables 1-3 the corresponding tables. In Figures 1-3 the horizontal axis shows the amount of scaling while the vertical axis shows the percentage of keypoints matched. On the horizontal axis, 0 means no scaling, negative values show the percentage of decrease, and positive values the percentage of increase in image size. Figure 1 and Table 1 show the scaling results for the ORL face database, while Figures 2-3 and Tables 2-3 show the scaling results for the Indian face database. Figures 1-3 and Tables 1-3 show that the matching rate under scaling is enhanced by the cosine and correlation matching methods compared to the original SIFT matching method (the Euclidean distance method). Figures 4-6 show the graphs for illumination (brightness and contrast) change and Tables 4-6 the corresponding tables. In Figures 4-6 the horizontal axis shows the change in illumination, while the vertical axis shows the percentage of keypoints matched. For illumination we use a scale from -100 to 100: at a change of -100 the whole image becomes black, at a change of 100 the whole image becomes white, and a change of 0 means the image is unmodified. Figure 4 and Table 4 give the illumination results for the ORL face database, while Figures 5-6 and Tables 5-6 give the illumination results for the Indian face database. From the graphs and tables for illumination changes it is clear that the cosine and correlation matching methods give an enhanced matching rate under illumination change compared to the original SIFT. Figures 7-9 show the graphs for variation in rotation and Tables 7-9 the corresponding tables; here we consider rotation in the clockwise direction.
Figure 7 and Table 7 give the rotation results for the ORL face database, while Figures 8-9 and Tables 8-9 give the rotation results for the Indian face database. Here too, the cosine and correlation matching methods enhance the matching rate under rotation compared to SIFT's traditional matching method (the Euclidean distance matching method).
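The two alternative similarity measures are standard and closely related: correlation is simply the cosine similarity of mean-centered vectors. A minimal sketch for a pair of descriptor vectors (the function names are mine; the paper's exact matching thresholds are not reproduced here):

```python
import math

def cosine_sim(u, v):
    """Cosine of the angle between two descriptor vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def correlation(u, v):
    """Pearson correlation = cosine similarity after mean-centering."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    return cosine_sim([a - mu for a in u], [b - mv for b in v])
```

Unlike Euclidean distance, both measures are invariant to uniform rescaling of a descriptor (and correlation additionally to a constant offset), which is a plausible reason for their more robust matching under illumination change.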

They investigated the dependence of torsional properties on wall thickness. Using an approximated model and starting from Bredt's formulas (valid only for thin-walled closed sections), Serra (1996) obtained a formulation for the torsional problem of solid cross-sections. Wang (1998) introduced the method of eigenfunction expansion and matching to solve the torsion problem of arbitrarily shaped tubes described by curved and straight pieces. Najera and Herrera (2005) presented a method to approximate the torsional rigidity of the non-circular solid cross-sections encountered in mechanisms and machines.

Abstract. Accurate and fast distance calculation from a single optical image is useful for real-time 3D construction and acquisition. However, few current distance calculation methods are theoretically based on a single optical image, and the traditional single-image distance calculation method is limited by the assumption that the step edge in the image is strictly horizontal or vertical, which is difficult to satisfy in real applications, since the slope of a practical edge can take any value other than zero and infinity. In this paper, a distance calculation method is proposed that uses a single defocused image containing an oblique step edge and requires no special camera equipment or particular external conditions. First, coordinate-system rotation is introduced to simplify the sampling of neighboring points and eliminate the influence of the step edge's slope. The one-dimensional point-spread function of the original distance calculation method is then extended to two dimensions, taking the coordinate transform into account, and a comprehensive distance calculation based on an oblique step edge with arbitrary slope is derived. The relationship between the precision of the distance calculation and the slope of the step edge is analyzed and proved in theory. Finally, a series of simulations is conducted to validate the proposed method, and the experimental results demonstrate its applicability, validity, and high precision.

The trial decipherment approach is much slower than the frequency distribution methods, requiring roughly one hour of CPU time to classify each ciphertext. More complex decipherment algorithms are even slower, which precludes their application to this test set. Our re-implementations of the dynamic programming algorithm of Knight et al. (2006) and the integer programming solver of Ravi and Knight (2008) average 53 and 7000 seconds of CPU time, respectively, to solve a single 256-character cipher, compared to 2.6 seconds with our greedy-swap method. The dynamic programming algorithm improves decipherment accuracy over our method by only 4% on a benchmark set of 50 ciphers of 256 characters. We conclude that our greedy-swap algorithm strikes the right balance between the accuracy and speed required for the task of cipher language identification.
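The excerpt does not give the greedy-swap details, but the general shape of such an algorithm is hill climbing over substitution keys: repeatedly try swapping two key letters and keep any swap that improves a language-fit score. A generic sketch under that assumption (the frequency table and scoring function below are illustrative stand-ins, not the paper's model):

```python
import string

# Rough English letter frequencies in percent (illustrative values).
ENG = {'e': 12.7, 't': 9.1, 'a': 8.2, 'o': 7.5, 'i': 7.0, 'n': 6.7,
       's': 6.3, 'h': 6.1, 'r': 6.0, 'd': 4.3, 'l': 4.0, 'u': 2.8}

def score(text):
    """Higher when the letter distribution of `text` looks like English."""
    n = max(1, sum(c.isalpha() for c in text))
    return -sum(abs(100 * text.count(c) / n - ENG.get(c, 1.0))
                for c in string.ascii_lowercase)

def decipher(cipher, key):
    """Apply a 26-letter substitution key to lowercase ciphertext."""
    return cipher.translate(str.maketrans(string.ascii_lowercase, key))

def greedy_swap(cipher, key=string.ascii_lowercase):
    """Keep swapping pairs of key letters while any swap improves
    the score; stop at a local optimum."""
    key = list(key)
    best = score(decipher(cipher, ''.join(key)))
    improved = True
    while improved:
        improved = False
        for i in range(26):
            for j in range(i + 1, 26):
                key[i], key[j] = key[j], key[i]
                s = score(decipher(cipher, ''.join(key)))
                if s > best:
                    best, improved = s, True
                else:
                    key[i], key[j] = key[j], key[i]  # undo the swap
    return ''.join(key), best
```

Each pass costs at most 26·25/2 scorings, which is why this kind of local search runs in seconds where dynamic or integer programming takes minutes to hours.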


In the case of client-side message batching, we want to compare an arbitrary user input bit U[i] with the i-th bit of every header. Therefore, we place the same bit U[i] in each message slot of a single vector, so that the comparison can be performed in parallel. When we batch a vector consisting of identical elements, the mapping is to a constant polynomial whose constant term is that element. This means that no time is spent on batching at runtime, which is the advantage of the column-wise packing method. The operations on the user side are


Software categorization is defined as the activity of labeling software belonging to different domains. Generally, there are two ways of building automatic classifiers. In a knowledge-engineering approach, the knowledge of human experts is encoded as a set of rules, which are then used in the classification process. The disadvantage of this approach is that it requires a lot of effort to make human knowledge explicit, and for each new domain a separate formulation of the rules must be done manually. In a machine-learning approach, the classifier is built automatically, and classification for different domains can be learned using the same algorithm. The accuracy of any automatic classification system is highly dependent on the effort and care taken during the process [1]. Cluster analysis plays a big role in software categorization. It is used to separate data into groups where no prior information is available: it divides data into clusters that are meaningful, useful, or both, and the clusters should reflect the original structure of the data. Cluster analysis is a vital tool in decision making and an effective method for obtaining solutions. The units within a cluster are as similar as possible, while clusters are as different from each other as possible. The main reasons for performing a cluster analysis are data exploration, visualization, data reduction, and hypothesis generation. Partitioning or clustering techniques are used in many areas for a wide spectrum of problems. Cluster analysis

In order to evaluate the spatial complexity measures, 16 Java projects developed by the undergraduate students of the Department of Computer Science & Engineering, Pondicherry Engineering College, have been used. Two or three members were involved in developing each project. The length of the programs varies from 372 to 1465 non-blank, non-comment lines, and the number of classes per program varies from 10 to 17. The complexity measures Class Attribute Spatial Complexity (CASC), Class Method Spatial Complexity (CMSC), Object Definition Spatial Complexity (ODSC), and Object Member Usage Spatial Complexity (OMUSC) are computed by an automated tool developed in Java. The results are tabulated in Table 1.

In this paper, we use the idea of approximate analytical calculation of the luminosity distance by solving the corresponding differential equation with certain initial conditions, as proposed in [11]. Solving this equation in a spatially flat FLRW universe by means of the VIM, we obtain approximate analytical expressions for the luminosity distance in terms of redshift for different contents of cosmological models. We show that by using the VIM, the expression for d_L(z) can easily be obtained to arbitrary accuracy by imple-
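For reference, in a spatially flat FLRW universe the luminosity distance being approximated is given by the standard integral expression (a textbook relation, not a formula quoted from this excerpt):

```latex
d_L(z) = \frac{c\,(1+z)}{H_0}\int_0^z \frac{dz'}{E(z')},
\qquad
E(z) \equiv \frac{H(z)}{H_0},
```

where, for example, flat $\Lambda$CDM has $E(z) = \sqrt{\Omega_m (1+z)^3 + \Omega_\Lambda}$. Since the integral generally has no closed form, an iterative scheme such as the VIM supplies analytical approximations of $d_L(z)$ to the desired accuracy.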

[62] The StatSoft web page at http://www.statsoft.com/textbook/stcluan.html presents an overview of all the important clustering techniques, such as: joining (tree clustering) with hierarchical trees and distance measures (Euclidean distance, squared Euclidean distance, city-block (Manhattan) distance, Chebyshev distance, power distance, percent disagreement); amalgamation or linkage rules (single linkage / nearest neighbor, complete linkage / furthest neighbor, unweighted pair-group average, weighted pair-group average, unweighted pair-group centroid, weighted pair-group centroid / median, Ward's method); two-way joining; k-means clustering; and expectation maximization clustering.

In this paper, we have discussed the results obtained by several feature methods, such as color moment, SIFT, and GIST, along with the Manhattan distance for remote sensing image retrieval. In this approach, the Manhattan distance works faster than the SVM classifier. In future work, we will do more research on remote sensing image classification to speed up the search process and increase the search accuracy.


In Table 4, it can be observed that the fastest algorithm is the one proposed by Zhao et al. (0.6819 seconds/eye). It also has the highest precision (P = 0.9464) but the lowest recall (R = 0.5989). In contrast, Daugman's method is the slowest (2.6508 seconds/eye), and achieves a precision of P = 0.7482 and a recall of R = 0.8139. Segmentation error is very similar for both frameworks: E = 0.0442 for Daugman's and E = 0.0418 for Zhao and Kumar's. Finally, the proposed method has a quite fast execution time (0.9187 s/eye), the lowest segmentation error (E = 0.0102), and the highest recall (R = 0.9322), with a precision of P = 0.8972. In addition, Dice's index shows it to be the most accurate. Figure 8 shows some results of properly segmented irises by the presented framework, from the initial image to the final segmented iris. Figure 9 shows some improperly segmented irises. It can be observed that the contrast between the iris and the pupil in dark eyes is very low and sometimes makes the algorithm fail. Most segmentation errors occur due to pupil boundary leakage, so future versions of this algorithm should be improved in order to achieve greater accuracy.
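Precision, recall, and Dice's index are all derived from the same pixel-level confusion-matrix counts; a quick sketch of the standard definitions (the function name and the toy counts are mine):

```python
def precision_recall_dice(tp, fp, fn):
    """tp/fp/fn: true-positive, false-positive, and false-negative
    pixel counts from a segmentation confusion matrix.
    Dice = 2TP / (2TP + FP + FN), the harmonic mean of P and R."""
    p = tp / (tp + fp)                 # precision
    r = tp / (tp + fn)                 # recall
    dice = 2 * tp / (2 * tp + fp + fn)
    return p, r, dice
```

Because Dice is the harmonic mean of precision and recall, it rewards the balanced P/R profile of the proposed method over Zhao et al.'s high-precision, low-recall trade-off, consistent with the table's conclusion.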


can play the same role as r, and we can replace r with α in LAVIFDT-OPN. The r value depends on the data set, and there is no bound on it, whereas α is bounded within 0 and 1. Thus, we can replace the neighbourhood r with α in Equation (2). Through the known r values for some data sets, we can find the corresponding α values; if α changes much less than r does, then it is clearly more convenient to use α instead of r. There is no systematic method for the selection of the r value at the moment, but there are some well-known data sets with r values that could be identified through a comparative experimental study. So we can find the α values that are equivalent to those r values, and if the ideal α values do not change significantly with a significant change of the ideal r values, then we can set that α value as the optimised one.


In this paper we propose a new NN-based method which improves the performance of large-scale image annotation by greatly reducing the loss of semantic information. The difference between our method and Boiman's method [12] is that we still use the bag-of-visual-words model and, furthermore, introduce image semantic information for computing the distance between nearest-neighbor images. In our method, we first utilize image semantic information for distance metric learning (DML) [18, 19] and obtain a new distance measure which minimizes the semantic gap between different visual features. We then construct our NN-based classifier using this new distance measure. Experiments on the ImageCLEF2012 concept annotation dataset [2] confirm the effectiveness of our method. Furthermore, our method, as a non-parametric classifier, is able to handle a huge number of image categories and avoids overfitting parameters.

In an online dispatch system, there are two concepts of driver allocation. In the first concept, the pickup order is mandatory: when a driver receives a pickup order, he must execute it, and if he rejects it, he is penalized by the taxi company. Under this concept, the pickup order is usually allocated to the nearest available driver, the purpose being to minimize the passenger waiting time and the driver's pickup distance. In the second concept, the pickup order is not mandatory. The passenger's pickup request is broadcast to the drivers within a certain distance of the pickup location. A driver who receives the pickup request may accept or ignore it, and the request is then allocated to the first driver who accepts it. If no driver accepts the request, the request fails. Drivers ignore pickup requests for various reasons, such as the transport distance being too far, or the pickup or destination location being in a dangerous area, so that accepting the request may trigger conflict with local traditional motorcycle taxis.
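The first (mandatory-order) concept reduces to a nearest-available-driver query; a minimal sketch under the simplifying assumption of straight-line distances on plane coordinates (the data layout and function name are mine, not from the text):

```python
import math

def nearest_available_driver(pickup, drivers):
    """pickup: (x, y). drivers: dict id -> (x, y, available).
    Return the id of the closest available driver, or None."""
    best_id, best_d = None, math.inf
    for did, (x, y, available) in drivers.items():
        if not available:
            continue                      # skip busy drivers
        d = math.hypot(x - pickup[0], y - pickup[1])
        if d < best_d:
            best_id, best_d = did, d
    return best_id
```

A production system would of course use road-network travel time rather than Euclidean distance, but the allocation logic stays the same.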


The hazard-rate model has a very flat shoulder for the values of the shape parameter used in this study, which means that the detection probability, assumed to be one at zero distance, remains at one for some distance from the line before it starts to fall (Fig. 1 of Buckland et al., in press). Neither of the detection function models used for analysis (a half-normal or a uniform key, with cosine adjustments) shares this property. As a consequence, we anticipated modest upward bias in density estimates from this source. Method 3 is based on having perfect knowledge of detected groups, and, a priori, we expected this method to perform best. It gave consistent estimates of density with good precision and some positive bias (+8.6%), as anticipated (Table 1). We see also that method 1 (average bias +8.0%), based on analysing individuals, matches the performance of method 3. This is surprising, as method 3 uses additional information not available to method 1: the true number of animals in a detected group and the mean location of all animals in the group. Plumptre and Cox (2006) proposed the use of method 2, but its performance was disappointing, with bias tending to increase with increasing group size and group spread. Bias also differed between the two detection functions. Method 4 showed inconsistent biases. Biases were smaller when the true detection function was
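The "flat shoulder" contrast is easy to see numerically using the standard distance-sampling forms of the two detection functions (these are the textbook parameterizations from the distance-sampling literature, not formulas quoted from this excerpt; σ = 1 and shape b = 3 below are illustrative choices):

```python
import math

def hazard_rate(y, sigma, b):
    """g(y) = 1 - exp(-(y/sigma)^-b); g(0) = 1 by convention."""
    if y == 0:
        return 1.0
    return 1.0 - math.exp(-((y / sigma) ** (-b)))

def half_normal(y, sigma):
    """g(y) = exp(-y^2 / (2 sigma^2))."""
    return math.exp(-(y * y) / (2.0 * sigma * sigma))

# At half a sigma from the line, the hazard-rate model has barely
# begun to fall, while the half-normal has already dropped noticeably:
print(hazard_rate(0.5, 1.0, 3))   # ~0.9997
print(half_normal(0.5, 1.0))      # ~0.8825
```

Fitting a shoulder-less model to data generated from the flat-shouldered hazard-rate function overestimates how quickly detectability falls off, which is the source of the modest upward density bias anticipated in the text.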
