Intrusion alert prediction can be broadly categorized into network attack graph based prediction [8], [9], [10] and prediction based on sequence modeling techniques (such as Markov models, Hidden Markov Models, Bayesian networks, and dynamic programming) [11], [12], [13]. Sequence modeling techniques have been successfully applied in fields such as biology (DNA sequences) [14], speech pattern identification [15], and financial data forecasting [16]. The Hidden Markov Model (HMM) is one of the most widely used methods for modeling sequential data. It is a stochastic model that was introduced in the late 1960s by Baum and his colleagues [17], [18]. Owing to its rich mathematical structure, the hidden Markov model is widely applied in real-world applications such as speech recognition, handwriting recognition, gesture recognition, intrusion detection, and part-of-speech tagging.
A hidden Markov model (HMM) is a statistical Markov model in which the system being modeled is assumed to be a Markov process with unobserved (hidden) states. An HMM can be considered the simplest dynamic Bayesian network [17]. HMM is a popular technique in bioinformatics and character recognition. In 2009, Da Silva and Ferreira studied the application of HMM to process mining and successfully used HMM with sequence clustering for log tracing [18]. Many new algorithms have been built on HMM. Zaki et al. introduced VOGUE, a variable order hidden Markov model with state durations that combines pattern mining and data modeling. This algorithm has been tested on web usage mining, on intrusion detection, and as a spell checker [19].
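The definition above can be made concrete with a small generative sketch. The state names, observation symbols, and probabilities below are hypothetical and chosen only for illustration; they do not come from any of the cited works. The hidden state follows a Markov chain, and each observation depends only on the current hidden state.

```python
import random

# Hypothetical toy model: two hidden states, three observable symbols.
start_p = {"idle": 0.7, "busy": 0.3}
trans_p = {
    "idle": {"idle": 0.8, "busy": 0.2},
    "busy": {"idle": 0.4, "busy": 0.6},
}
emit_p = {
    "idle": {"low": 0.7, "mid": 0.2, "high": 0.1},
    "busy": {"low": 0.1, "mid": 0.3, "high": 0.6},
}

def sample_hmm(n, start_p, trans_p, emit_p, seed=0):
    """Draw n (hidden state, observation) pairs from the HMM."""
    rng = random.Random(seed)

    def draw(dist):
        # Sample one key of `dist` in proportion to its probability.
        keys = list(dist)
        return rng.choices(keys, weights=[dist[k] for k in keys])[0]

    states, observations = [], []
    state = draw(start_p)
    for _ in range(n):
        states.append(state)
        observations.append(draw(emit_p[state]))  # emission depends on the state only
        state = draw(trans_p[state])              # next state depends on the current one only
    return states, observations

hidden, observed = sample_hmm(5, start_p, trans_p, emit_p, seed=42)
print(hidden)
print(observed)
```

In an application only the second list would be visible; the first is the hidden sequence that inference algorithms try to recover.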
Abstract: Datasets of the same geographic space at different scales and temporalities are increasingly abundant, paving the way for new scientific research. These datasets require data integration, which implies linking homologous entities in a process called data matching that remains a challenging task, despite a quite substantial literature, because of data imperfections and heterogeneities. In this paper, we present an approach for matching spatial networks based on a hidden Markov model (HMM) that takes full benefit of the underlying topology of networks. The approach is assessed using four heterogeneous datasets (streets, roads, railway, and hydrographic networks), showing that the HMM algorithm is robust with regard to data heterogeneities and imperfections (geometric discrepancies and differences in level of detail) and adaptable to match any type of spatial network. It also has the advantage of requiring no mandatory parameters, as shown by a sensitivity exploration, except a distance threshold that filters potential matching candidates in order to speed up the process. Finally, a comparison with a commonly cited approach highlights good matching accuracy and completeness.
Sung-Bae Cho and Hyuk-Jang Park [7] present an optimal measure abstraction method based on a Hidden Markov Model to improve the models proposed by S. Forrest. They analyzed various attack patterns and enhanced the performance of HMM-based anomaly detection systems. The basic idea is to use privilege transition flows for modelling Hidden Markov Models so as to improve the intrusion detection rate while minimizing the false alarm rate. German Florez-Larrahondo [8] uses a novel incremental learning algorithm that allows Hidden Markov Models to learn online from complex computer applications and so generate efficient anomaly detection models. Almost all of this research on Hidden Markov Models concentrates on anomaly detection. On the other hand, Hidden Markov Models are also used to represent the likelihood of transitions between security states [9]. Zhang Song-hong [10] uses Hidden Markov Models to model multi-step complex network attacks, to recognize the attacker's intention, and to forecast the next possible attack using the Forward and Viterbi algorithms. No fully satisfactory method for modelling complicated network attacks effectively has been reported so far.
Networks hold a vast amount of social networking data that is necessary to understand the properties and behavior of social networks [1] [6] [7] [8]. Studies of human interactions have found networks to be important because interactions are dynamic, and this affects the properties exhibited as the network evolves [8]. The dynamics, complexity, and stochasticity of network interactions can be captured through a Hidden Markov Model (HMM). An analysis of HMM shows that the model is adaptable to a wide array of applications owing to its strong statistical foundation, its evaluation and training algorithms, and its ability to handle new data robustly and predict similar patterns efficiently [9]. A number of models of social network interactions have been studied based on different forms and structures [1] [5] [10] [11] [12] [13] [14].
information at multiple sites. The hidden Markov model is a doubly stochastic process in which the rainfall observation distribution depends on several unobserved discrete states (Rabiner and Juang, 1986). Hidden Markov models have become popular tools for modeling dependent random variables in such diverse areas as DNA recognition, speech processing, and rainfall modeling. For rainfall modeling, the hidden (unobserved) states of the HMM can be used to interpret the various patterns of circulation anomalies (Robertson et al., 2004; Robertson et al., 2005; Greene et al., 2008). The HMM can be extended to model non-stationary processes by incorporating time-varying atmospheric variables, which yields the non-homogeneous hidden Markov model (NHMM). This model exhibits unobserved weather states and serves as a link between the local rainfall process and large-scale atmospheric information.
ABSTRACT: Selecting an optimal web service from a list of functionally equivalent web services remains a challenging issue. Internet service problems such as poor service quality, low-performance servers, and high latency can lead to lost sales, lost customers, and user frustration. In this study we propose a QoS metrication method that is based on the Hidden Markov Model (HMM) and that suggests an optimal path for executing user requests efficiently. The response time technique measures and predicts the behaviour of Web services and is also used to rank services quantitatively rather than just qualitatively. Through experiments on real-world data, we establish the reliability and usefulness of the proposed method. The results show how the method can help users automatically select the most reliable Web service, taking into account several metrics, among them system predictability and response time variability.
states such as exons, introns, etc. The hidden Markov models usually have to be expanded to include additional requirements such as codon frame information and state duration probability information, and they must also follow certain acceptance rules. For example, the coding region must begin, at the start of the initial exon, with the canonical start codon ATG and end with a stop codon (TAA, TAG, or TGA); the introns generally follow the GT-AG rule. There are many popular gene finder programs in this area, such as GeneMark, GeneScan, and others. The statistical model employed by GeneMark.hmm is an HMM with duration, or hidden semi-Markov model. The state duration distributions are derived as approximations of the observed length distributions in the training set of sequences, and they are characterized by the minimum and maximum duration length allowed. For example, the minimum and maximum durations of introns and intergenic sequences are set to 20 and 10,000 nts. The hidden Markov model with binned duration algorithm presented in this dissertation could be used to improve GeneMark, making it more efficient while retaining the same performance.
Gesture recognition is another area where hidden Markov models are often applied [47] [32]. Consider an automatic system that recognizes continuous hand motion for the Arabic numerals from 0 to 9 [31]. In the segmentation and preprocessing stage, a Gaussian Mixture Model (GMM) is used for skin color detection, and the hands are then localized and tracked using blob analysis to generate their motion trajectories (gesture paths). Features including location, orientation, and velocity are extracted in the subsequent feature extraction stage. The final stage is the HMM classification. Depending on the complexity of the gesture, several states are generated for each isolated gesture by mapping each straight-line segment to an HMM state, and a corresponding hidden Markov model is built for each isolated gesture. The Baum-Welch algorithm is then applied to train the HMMs, and the Viterbi algorithm is used for recognition.
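The decoding step mentioned above can be illustrated with a minimal Viterbi sketch. It is not the cited system's implementation: the two stroke states, the orientation symbols, and all probabilities are hypothetical, and the model is written by hand rather than trained with Baum-Welch.

```python
import math

def viterbi(obs, states, start_p, trans_p, emit_p):
    """Return (log-probability, state path) of the most likely hidden path."""
    def log(p):
        return math.log(p) if p > 0 else float("-inf")

    # best[t][s]: best log-probability of any path that ends in state s at time t
    best = [{s: log(start_p[s]) + log(emit_p[s][obs[0]]) for s in states}]
    back = [{}]
    for t in range(1, len(obs)):
        best.append({})
        back.append({})
        for s in states:
            # Choose the predecessor that maximizes the path score into s.
            prev, score = max(
                ((r, best[t - 1][r] + log(trans_p[r][s])) for r in states),
                key=lambda pair: pair[1],
            )
            best[t][s] = score + log(emit_p[s][obs[t]])
            back[t][s] = prev
    # Trace the back-pointers from the best final state.
    last = max(states, key=lambda s: best[-1][s])
    path = [last]
    for t in range(len(obs) - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return best[-1][last], path

# Hypothetical gesture model: two stroke segments, quantized orientation codes.
states = ("stroke_a", "stroke_b")
start_p = {"stroke_a": 0.6, "stroke_b": 0.4}
trans_p = {
    "stroke_a": {"stroke_a": 0.7, "stroke_b": 0.3},
    "stroke_b": {"stroke_a": 0.4, "stroke_b": 0.6},
}
emit_p = {
    "stroke_a": {"up": 0.5, "right": 0.4, "down": 0.1},
    "stroke_b": {"up": 0.1, "right": 0.3, "down": 0.6},
}

log_prob, path = viterbi(["up", "right", "down"], states, start_p, trans_p, emit_p)
print(path, math.exp(log_prob))
```

For classification, one such model would be kept per gesture and the gesture whose model scores the observed trajectory highest would be selected. Log-probabilities are used so that long sequences do not underflow.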
ABSTRACT: ATM card fraud is causing billions of dollars in losses for the card payment industry. Today the ATM card is the most widely accepted payment mode for both online and regular purchases; hence the frauds associated with it are also growing. To find fraudulent transactions, we implement an Advanced Security Model for ATM payment using a Hidden Markov Model (HMM), which detects fraud from the customer's spending behaviour. This Security Model focuses primarily on the normal spending behaviour of a cardholder and on advanced security attributes such as the Location, Amount, Time, and Sequence of transactions. If the trained Security Model identifies any abnormal behaviour in an upcoming transaction, that transaction is blocked until the user enters the High Security Alert Password (HSAP). This paper provides an overview of frauds and begins with ATM card statistics and the definition of ATM card fraud. The main outcome of the paper is a method that finds fraudulent transactions and prevents fraud before it happens.
Hidden Markov models are a sophisticated and flexible statistical tool for the study of protein models. Using HMMs to analyze proteins is part of a new scientific field called bioinformatics, based on the relationship between computer science, statistics, and molecular biology. Hidden Markov models (HMMs) offer a more systematic approach to estimating model parameters. The HMM is a dynamic kind of statistical profile. Like an ordinary profile, it is built by analyzing the distribution of amino acids in a training set of related proteins. However, an HMM has a more complex topology than a profile. It can be visualized as a finite state machine. Finite state machines typically move through a series of states and produce some kind of output either when the machine has reached a particular state or when it is moving from state to state. A Markov model is a statistical model that goes through some kind of change step by step. It is characterized by the property that the change depends only on the current state. HMMs are hidden because only the symbols emitted by the system are observable, not the underlying walks between states [15]. HMMs are the Legos of computational sequence analysis. A Hidden Markov Model M is defined by
The research hotspot in the post-genomic era is the path from sequence to function. Building a genetic regulatory network (GRN) can help to understand the regulatory mechanisms between genes and the function of organisms. Probabilistic GRNs have received increasing attention recently. This paper discusses the Hidden Markov Model (HMM) approach as a tool for building a GRN. Different genes with similar expression levels are considered as different states when training the HMM. The probable regulatory genes of target genes can be found through the resulting state transition matrix, and the determinate regulatory functions can be predicted using a nonlinear regression algorithm. Experiments on artificial and real-life datasets show the effectiveness of HMM in building GRNs.
In this paper, an efficient method for performing informed watermarking on colour images in the wavelet domain using a hidden Markov model is presented. Hidden Markov model training is used to find the exact embedding strength vector for embedding a message into the image. The HMM used is efficient for this purpose; the system performs better than the existing system and can be extended for the purpose of watermarking colour images. However, the system does not perform well when cropping attacks are applied to the watermarked image.
A signal model can provide the basis for the theoretical description of a signal processing system. Signal models are used to learn about a signal source when it is unavailable, and they are also used to realize many practical systems. In this paper, environmental noise signals are modelled, and the modelled signals can serve as the reference noise signal for a noise cancellation system when the type of noise is not known a priori. This work proposes an approach that models environmental noise using the Hidden Markov Model (HMM) and the Fuzzy Hidden Markov Model (FHMM), and then uses the modelled noise as the reference noise input for cancelling the encountered noise with the Fuzzy Recursive Least Square (FRLS) algorithm. The system is tested on various noises, such as horn noise from buses and cars and babble noise. The performance of both algorithms is compared. Experimental results show that the Fuzzy Recursive Least Square algorithm with reference noise from the fuzzy HMM based noise model provides 33% better performance than the Recursive Least Square algorithm with reference noise from the HMM based noise model.
Malware is software designed with the intent to damage a network or computer resources. Today, malware is proliferating rapidly, prompting researchers to develop novel techniques to protect computers and networks. The three major techniques used for malware detection are heuristic, signature-based, and behavior-based. Among these, the most prevalent is heuristic-based malware detection. The Hidden Markov Model is the most efficient technique for malware detection. In this paper, we present the Hidden Markov Model as a cutting-edge malware detection tool, together with a comprehensive review of different studies that employ HMM as a detection tool.
The Hidden Markov Model (HMM) is a powerful statistical tool for modeling generative sequences that can be characterized by an underlying process generating an observable sequence. It is one of the most basic and extensively used statistical tools for modeling discrete time series. In this paper, the series is modeled using transition probabilities and emission probabilities, and the algorithms that solve the problems related to the hidden Markov model are presented. Hidden Markov models pose three problems: evaluating the probability of an observed sequence, decoding the most likely state sequence, and estimating the parameters of the model. The solutions to these problems, the forward-backward, Viterbi, and Baum-Welch algorithms respectively, are discussed along with their use in computation. A new hidden Markov model is developed and its parameters are estimated, and the state space model is also discussed.
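As an illustration of the evaluation problem, the following is a minimal sketch of the forward recursion, which sums over all hidden paths in time linear in the sequence length. The states, symbols, and probabilities are hypothetical and are not taken from the paper.

```python
def forward(obs, states, start_p, trans_p, emit_p):
    """Return P(obs | model) using the forward recursion."""
    # alpha[s]: probability of the observations so far, ending in state s
    alpha = {s: start_p[s] * emit_p[s][obs[0]] for s in states}
    for symbol in obs[1:]:
        alpha = {
            s: emit_p[s][symbol] * sum(alpha[r] * trans_p[r][s] for r in states)
            for s in states
        }
    return sum(alpha.values())

# Hypothetical two-state model with three observable symbols.
states = ("s1", "s2")
start_p = {"s1": 0.6, "s2": 0.4}
trans_p = {
    "s1": {"s1": 0.7, "s2": 0.3},
    "s2": {"s1": 0.4, "s2": 0.6},
}
emit_p = {
    "s1": {"a": 0.5, "b": 0.4, "c": 0.1},
    "s2": {"a": 0.1, "b": 0.3, "c": 0.6},
}

likelihood = forward(["a", "b", "c"], states, start_p, trans_p, emit_p)
print(likelihood)
```

For long sequences the unscaled probabilities underflow, so practical implementations rescale alpha at each step or work in log space; this sketch omits that for brevity.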
A planned attack is normally carried out over a long time frame with persistent and stealthy actions that avoid violating IDS rules, such as two password guesses repeated regularly over a long period, which triggers no alert and will not be discovered by the IDS. To determine whether a machine is under attack, the proposed approach extracts and analyzes the logs related to the observed machine to identify whether an attack sequence exists. This study adopts the hidden Markov model (HMM) to model the sequence of anomalous behaviors. As mentioned above, different attack strategies may leave traces in different logs. An attack plan often lasts for a long time, so the detection should infer from and correlate various logs over a period of time. A successful attack consists of at least three stages: (1) reconnaissance: gathering information from a target machine, for example by scanning or password guessing; (2) intrusion: intruding into or exploiting the target through the vulnerability found; (3) attacking: using the compromised machine to attack others.
With the development of the economy, estimation has gradually received more attention. Economic performance is essential to a company, which is why data analysis is in high demand. Since Dongfeng Motor Corporation is one of the leading companies in the Chinese vehicle market, estimating Dongfeng's data could be very meaningful. There are many methods for estimating economic performance; in this thesis we mainly focus on the Hidden Markov Model (HMM).
Abstract: Due to rapid advancement in electronic commerce technology, the use of credit cards has dramatically increased. As the credit card becomes the most popular mode of payment for both online and regular purchases, cases of fraud associated with it are also rising. In this paper, we model the sequence of operations in credit card transaction processing using a Hidden Markov Model (HMM) and show how it can be used for the detection of frauds. An HMM is initially trained with the normal behavior of a cardholder. If an incoming credit card transaction is not accepted by the trained HMM with sufficiently high probability, it is considered to be fraudulent. At the same time, we try to ensure that genuine transactions are not rejected. We present detailed experimental results to show the effectiveness of our approach and compare it with other techniques available in the literature.
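One plausible reading of the acceptance test described above is sketched below. It is an assumption for illustration, not necessarily the paper's procedure: the likelihood of the recent transaction window is compared with the likelihood of the window shifted to include the new transaction, and the transaction is flagged when the relative drop exceeds a threshold. The spending profiles, symbols, probabilities, and the 0.5 threshold are all hypothetical, and the model is hand-set rather than trained on cardholder data.

```python
# Hypothetical cardholder model: hidden spending profile, observed amount band.
STATES = ("routine", "splurge")
START_P = {"routine": 0.8, "splurge": 0.2}
TRANS_P = {
    "routine": {"routine": 0.9, "splurge": 0.1},
    "splurge": {"routine": 0.6, "splurge": 0.4},
}
EMIT_P = {
    "routine": {"low": 0.7, "medium": 0.25, "high": 0.05},
    "splurge": {"low": 0.1, "medium": 0.4, "high": 0.5},
}

def sequence_prob(obs):
    """Forward algorithm: probability of the observed amount bands."""
    alpha = {s: START_P[s] * EMIT_P[s][obs[0]] for s in STATES}
    for symbol in obs[1:]:
        alpha = {
            s: EMIT_P[s][symbol] * sum(alpha[r] * TRANS_P[r][s] for r in STATES)
            for s in STATES
        }
    return sum(alpha.values())

def is_suspicious(history, new_symbol, threshold=0.5):
    """Flag the new transaction if it sharply lowers the window likelihood."""
    old = sequence_prob(history)
    # Slide the window: drop the oldest transaction, append the new one.
    new = sequence_prob(history[1:] + [new_symbol])
    return (old - new) / old >= threshold

history = ["low", "low", "medium", "low"]
print(is_suspicious(history, "high"))
print(is_suspicious(history, "low"))
```

The threshold trades detection rate against false alarms, which corresponds to the concern above that genuine transactions should not be rejected.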
Nowadays customers prefer credit cards as the most widely accepted payment mode because they make online shopping and bill payment convenient. At the same time, the risk of fraudulent credit card transactions is a major problem that should be avoided. Many data mining techniques are available to mitigate these risks effectively. Existing research modelled the sequence of operations in credit card transaction processing using a Hidden Markov Model (HMM) and showed how it can be used for the detection of frauds. To provide better accuracy and to reduce computational complexity in fraud detection, the proposed work presents a semi Hidden Markov model (SHMM) anomaly detection algorithm, which computes the distance between the processes monitored by the credit card detection system and the perfectly normal processes. In addition, we implement a second fraud detection method whose key idea is to factorize the marginal log-likelihood using a variational distribution over latent variables. An asymptotic approximation, the factorized information criterion (FIC), is obtained by applying the Laplace method to each of the factorized components. Our experimental results demonstrate that we can significantly reduce loss due to fraud through distributed data mining of fraud models.
Large-scale data containing multiple important rare clusters, even at moderately high dimensions, pose challenges for existing clustering methods. To address this issue, we propose a new mixture model called Hidden Markov Model on Variable Blocks (HMM-VB) and a new mode search algorithm called Modal Baum-Welch (MBW) for mode-association clustering. HMM-VB leverages prior information about chain-like dependence among groups of variables to achieve the effect of dimension reduction. In case such a dependence structure is unknown or assumed merely for the sake of parsimonious modeling, we develop a recursive search algorithm based on BIC to optimize the formation of ordered variable blocks. The MBW algorithm ensures the feasibility of clustering via mode association, achieving linear complexity in terms of the number of variable blocks despite the exponentially growing number of possible state sequences in HMM-VB. In addition, we provide theoretical investigations about the identifiability of HMM-VB as well as the consistency of our approach to search for the block partition of variables in a special case. Experiments on simulated and real data show that our proposed method outperforms other widely used methods. Keywords: Gaussian mixture model, hidden Markov model, modal Baum-Welch algorithm, modal clustering