parallel Architectures
Although results of these implementations are dependent on the particular problem and on the given data representation, the general steps can be executed on a CNN architecture in similar types of problems. In this way one can benefit from the immense multi-parallel performance of the CNN chip that has been demonstrated in numerous other cases.
I hope that with this implementation we could exhibit a new, more abstract class of algorithms: topographic development of different metaheuristics [46] that can be mapped easily and with high performance on the different versions of the CNN-UM architecture.
These algorithms could be used in mobile applications where parameter optimization is needed within strict time constraints and with a low energy consumption considering
30
reading global memory incurs an additional 400 to 600 clock cycles
31
this is the power consumption of the processor only,not counting the surrounding environment
the advantages of the xenonv3 chip.
The number of elements simulated and tested on the virtual machine are enough to solve complex optimization problems in practice. The virtual machine and the Xenon ar- chitecture are both equivalents of the CNN-UM architecture, however the implementation of nonlinear templates32 and using neighborhood radii larger than one is not possible on all the CNN architectures.
The topographic thinking and implementation of parallel, localized algorithm is getting more and more important and popular. Within this new algorithm development the cellular architectures are playing a key role, as it can be observed how the number of processors are increasing on a chip in every year. I am happy, that I could present here my version of the cellular genetic algorithm that can be mapped perfectly into a cellular machine, keeping the convergence of the algorithm as it was evidenced by three different practical test cases.
As it can be seen from the results larger neighborhoods induce a higher level of implicit migration and with proper parameter settings the exploitation/exploration ratio can be tuned to reach faster convergence speed.
32it is proven, that all nonlinear templates can be implemented as a series of linear templates 49
Chapter 4
State Estimation of Hidden
Markov Models
As we have seen in the previous chapters, the many-core and cellular architectures are offering brand new possibilities and also new challenges in algorithm development. In the previous chapters I have shown how the local selection can be used in case of static problems, when the optimal state (the solution) is constant in time and how it can be implemented and mapped on many-core architectures. In this chapter I will introduce the cellular particle filter (Pfilter, PF) to estimate the hidden state of Hidden Markov Models (HMM) and how this algorithm can be mapped to modern cellular architectures.
In this chapter I will introduce Hidden Markov Models and methods that can be used for state estimation. This entire chapter and the description of the theory were based on the lecture notes of Ramon van Handel used at Princeton University [47]. I will also refer to [48] and [49] for the relevant mathematical theory.
The applications of this kind of processes is extremely versatile, however they can be divided into different subgroups:
• processes where the original state1can not be observed directly, only through distor-
tion and noise. E.g: the general channel model: In this case our aim is to approximate
X based on the values of Y .
• The other case is, when we would like to estimate and predict Y 2, however the
state transition of Y can not be seen or derived directly. Sometimes it is useful to introduce a hidden value X in the background, which has a simple state transition and also effects/determine the values of Y E.G: A stock process (Y ) can have difficult dynamics, and usually we are interested in stock prices, however to introduce the manufacturing process (X) will simplify the state transition and determine 3 also the stock price.
1X
t that we would like to process or estimate
2
the current and previous values can be observed
3
with an additional noise
In this dissertation I will examine the first case.
4.1
Filtering, Smoothing, Prediction
We can divide the main tasks and usual problems with HMMs into three different subgroups, however in all cases our aim is to approximate the conditional probability of the hidden state based on the observations:
P (X0X1...Xn∈ S|Y0, Y1. . . Yk) (4.1)
Our aim is always to maximize this value: identify the conditional distribution of the hidden states. The three different problems depend on the values of n and k.4 The problems can be divided into three cases which are the following:
• filtering k = n • smoothing k < n • prediction k > n
In this dissertation I will introduce filtering (k = n) and I will work with this problem, however all the other problems will raise similar questions and calculations.
To solve the problem we can derive a filtering equation (described in Appendix D). It is proven that this equation gives the optimal solution of the problem the hidden trajectory with the lowest mean square error based on the observation. Based on Appendix D the filtering of HMMs can be done in theory and the equations are clear to maximize the con- ditional expectation. Although there are different problem classes, with special dynamics, where this calculation can be done relatively easily, in case of the general problem the calculation of the integral in (10.9) is computationally expensive. Tn practice it can be difficult to calculate the values in (10.9) and to obtain the results one has to calculate a computationally expensive integration in equation (10.9).
In some worse cases this integration can not be calculated at all and we can only apply different approximations.
4.1.1 The General Case
In the general case, covering many practical problems the state space can be infinite, continuous and the state transition and observation are arbitrary and non-linear. In this case we run into formidable computational difficulties when we wish to apply this technique in practice to calculate the filtering recursion.
4
Usually we calculate the conditional distribution because this is the best approximation in the L2 space. To minimize the error according to other metric we should calculate the conditional distribution according to some other arbitrary function.
The problem in this case is not only, that with certain problems the calculation of the integral is extremely time consuming, but with the increase of the dimension of the problem the computational need will increase exponentially. We will have to face the curse of dimensionality (first described by Richard E. Bellman [50]) .In lower dimensions my so- lution could work but the resources required by the algorithm are increasing exponentially with the increase of the dimension.
The first solution that comes to our mind is to discretize the continuous state space to avoid the continuous state space5: this will lead to the usage of a finite grid to approximate
our integral, however this grid helps us to approximate the continuous state-space, but it will not help to avoid the curse of dimensionality and the exponentially increasing need of computation.
One can easily see, that a finite grid with uniform spatial resolution can not be approx- imate efficiently all the possible state spaces. Certain parts of the state space has to be approximated with high resolution and in some cases even a much lower resolution can be satisfactory. We can use Monte Carlo methods to avoid unnecessary computation, because it was used in many similar problems and with this method we can avoid the curse of dimensionality. With random sampling we will still see, that not all of the used samples, lets call them particles will be used. We will see that many of them will not approximate our functional (when the observations are not identical). We will not need these samples in our recursion anymore and we can throw them out from the cohort. One can intuitively see, that it would be good to substitute these particles by other samples which were proved to be useful according to the previous observations.
These methods are called particles filters with resampling.
In the following section I will describe these methods more detailedly.