5.14
Exercise - Noisy Nonlinear Systems
Implement the same system, but add noise to the measurement.
In [34]: #enter your code here
5.14.1
Solution
In [35]: # predict(), update(), and movement are assumed to be defined in earlier cells.
         import math
         import matplotlib.pyplot as plt
         import numpy.random as random

         sensor_variance = 30
         movement_variance = 2
         pos = (100, 500)

         zs, ps = [], []

         for i in range(100):
             pos = predict(pos[0], pos[1], movement, movement_variance)

             # noisy measurement of the sine wave
             Z = math.sin(i/3.)*2 + random.randn()*1.2
             zs.append(Z)

             pos = update(pos[0], pos[1], Z, sensor_variance)
             ps.append(pos[0])

         p1, = plt.plot(zs, c='r', linestyle='dashed', label='measurement')
         p2, = plt.plot(ps, c='#004080', label='filter')
         plt.legend(loc='best')
         plt.show()
5.14.2
Discussion
This is terrible! The output is not at all like a sine wave, except in the grossest way. With linear systems we could add extreme amounts of noise to our signal and still extract a very accurate result, but here even modest noise creates a very bad result.
Very shortly after practitioners began implementing Kalman filters they recognized their poor performance on nonlinear systems and began devising ways of dealing with it. Much of the remainder of this book is devoted to this problem and its various solutions.
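To put a number on "very bad", here is a minimal self-contained sketch that reruns the experiment and reports the RMS error between the filter output and the noise-free sine wave. It is not the chapter's own code: the bodies of `predict` and `update` are assumed forms (add the Gaussians, multiply the Gaussians), `movement = 0` is an assumption about the earlier system, and `run_filter`, `rms_error`, and the `skip` parameter are hypothetical helpers introduced only for this illustration.

```python
import math
import numpy as np

def predict(pos, variance, movement, movement_variance):
    # Assumed predict step: adding two Gaussians adds means and variances.
    return (pos + movement, variance + movement_variance)

def update(mean, variance, measurement, measurement_variance):
    # Assumed update step: product of two Gaussians.
    new_mean = ((variance * measurement + measurement_variance * mean)
                / (variance + measurement_variance))
    new_variance = 1.0 / (1.0 / variance + 1.0 / measurement_variance)
    return (new_mean, new_variance)

def run_filter(noise_std, sensor_variance=30.0, movement_variance=2.0,
               movement=0.0, steps=100, seed=0):
    # movement=0.0 is an assumption; the original value is set elsewhere.
    rng = np.random.default_rng(seed)
    pos = (100.0, 500.0)
    truth, estimates = [], []
    for i in range(steps):
        pos = predict(pos[0], pos[1], movement, movement_variance)
        true_value = math.sin(i / 3.0) * 2
        z = true_value + rng.normal() * noise_std  # noisy measurement
        pos = update(pos[0], pos[1], z, sensor_variance)
        truth.append(true_value)
        estimates.append(pos[0])
    return np.array(truth), np.array(estimates)

def rms_error(noise_std, skip=20):
    # Ignore the first `skip` steps so the poor initial guess does not dominate.
    truth, est = run_filter(noise_std)
    return float(np.sqrt(np.mean((est[skip:] - truth[skip:]) ** 2)))

print("RMS error without measurement noise:", rms_error(0.0))
print("RMS error with noise std 1.2:       ", rms_error(1.2))
```

Comparing the two printed values separates the error caused by the model mismatch alone from the error once measurement noise is added. Try other values of `sensor_variance` and `movement_variance` to see how the trade-off shifts.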
5.15
Summary
The information in this chapter takes some time to assimilate. To truly understand it you will probably have to work through this chapter several times. I encourage you to change the various constants and observe the results. Convince yourself that Gaussians are a good representation of a unimodal belief about something like the position of a dog in a hallway. Then convince yourself that multiplying Gaussians truly does compute a new belief from your prior belief and the new measurement. Finally, convince yourself that if you are measuring movement, adding the Gaussians correctly updates your belief. That is all the Kalman filter does. Even now I alternate between complacency and amazement at the results.
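The two operations named above are small enough to check by hand. The sketch below uses the standard closed forms for the product and the sum of two Gaussians; the function names `multiply` and `add` and all of the numbers are chosen here purely for illustration.

```python
def multiply(mean1, var1, mean2, var2):
    # Product of two Gaussians: the new belief after a measurement.
    mean = (var1 * mean2 + var2 * mean1) / (var1 + var2)
    variance = (var1 * var2) / (var1 + var2)
    return mean, variance

def add(mean1, var1, mean2, var2):
    # Sum of two Gaussians: the new belief after a movement.
    return mean1 + mean2, var1 + var2

prior = (10.0, 4.0)          # hypothetical belief: position 10, variance 4
measurement = (12.0, 4.0)    # hypothetical sensor reading: 12, variance 4

posterior = multiply(*prior, *measurement)
print(posterior)             # (11.0, 2.0)

movement = (1.0, 0.5)        # hypothetical motion: +1, variance 0.5
predicted = add(*posterior, *movement)
print(predicted)             # (12.0, 2.5)
```

Note how the measurement step shrinks the variance (from 4 to 2) while the movement step grows it (from 2 to 2.5). The filter is nothing more than these two steps in alternation.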
If you understand this, you will be able to understand multidimensional Kalman filters and the various extensions that have been made to them. If you do not fully understand this, I strongly suggest rereading this chapter. Try implementing the filter from scratch, just by looking at the equations and reading the text. Change the constants. Maybe try to implement a different tracking problem, like tracking stock prices. Experimentation will build your intuition and understanding of how these marvelous filters work.
Chapter 6
Multivariate Kalman Filters
6.1
Introduction
The techniques in the last chapter are very powerful, but they only work in one dimension. The Gaussians represent a mean and variance that are scalars, that is, real numbers. They provide no way to represent multidimensional data, such as the position of a dog in a field. You may retort that you could use two Kalman filters for that case, one tracking the x coordinate and the other tracking the y coordinate. That does work in some cases, but put that thought aside, because soon you will see some enormous benefits to implementing the multidimensional case.
In this chapter I am purposefully glossing over many aspects of the mathematics behind Kalman filters. If you are familiar with the topic you will read statements that you disagree with because they contain simplifications that do not necessarily hold in more general cases. If you are not familiar with the topic, expect some paragraphs to be somewhat ‘magical’ - it will not be clear how I derived a certain result. I prefer that you develop an intuition for how these filters work through several worked examples. If I started by presenting a rigorous mathematical formulation you would be left scratching your head about what all these terms mean and how you might apply them to your problem. In later chapters I will provide a more rigorous mathematical foundation, and at that time I will have to either correct approximations that I made in this chapter or provide additional information that I did not cover here.
To make this possible we will restrict ourselves to a subset of problems which we can describe with Newton’s equations of motion. In the literature these filters are called discretized continuous-time kinematic filters. In the next chapter we will develop the math required for solving any kind of dynamic system.
Newton's equations are simple to state. Given a constant velocity v we can compute distance exactly with:
$$x = vt + x_0$$

If we instead assume constant acceleration we get

$$x = \frac{1}{2}at^2 + v_0 t + x_0$$

And if we assume constant jerk we get

$$x = \frac{1}{6}jt^3 + \frac{1}{2}a_0 t^2 + v_0 t + x_0$$
As a reminder, we can generate these equations using basic calculus. Given a constant velocity v we can compute the distance traveled over time with the equation
$$\begin{aligned}
v &= \frac{dx}{dt} \\
dx &= v\,dt \\
\int_{x_0}^{x} dx &= \int_0^t v\,dt \\
x - x_0 &= vt - 0 \\
x &= vt + x_0
\end{aligned}$$
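The same integration can be checked numerically. The sketch below evaluates the closed-form equation of motion (the constant-jerk form, which reduces to the other two when the higher-order terms are zero) and compares it with a simple step-by-step Euler integration. The function names `position` and `integrate_position` and the sample values are hypothetical, chosen only for this illustration.

```python
def position(t, x0=0.0, v0=0.0, a0=0.0, j=0.0):
    # Closed form: x = x0 + v0*t + (1/2)*a0*t^2 + (1/6)*j*t^3
    return x0 + v0 * t + a0 * t**2 / 2 + j * t**3 / 6

def integrate_position(t_end, x0=0.0, v0=0.0, a0=0.0, j=0.0, steps=100_000):
    # Euler integration: accumulate dx = v dt, dv = a dt, da = j dt.
    dt = t_end / steps
    x, v, a = x0, v0, a0
    for _ in range(steps):
        x += v * dt
        v += a * dt
        a += j * dt
    return x

# Constant velocity: x = vt + x0
print(position(2.0, x0=1.0, v0=3.0))    # 7.0

# Constant jerk: the two results agree up to a small discretization error
exact = position(2.0, x0=1.0, v0=3.0, a0=0.5, j=0.2)
approx = integrate_position(2.0, x0=1.0, v0=3.0, a0=0.5, j=0.2)
print(exact, approx)
```

The Euler loop is deliberately naive. Its error shrinks as `steps` grows, which is one way to convince yourself that the closed forms really are the limit of the integration.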
6.2
Multivariate Normal Distributions
What might a multivariate normal distribution look like? In this context, multivariate just means multiple variables. Our goal is to be able to represent a normal distribution across multiple dimensions. Consider the two-dimensional case. Let's say we believe that x = 2 and y = 17. This might be the x and y coordinates for the position of our dog, or the temperature and wind speed at our weather station; it doesn't really matter. We can see that for N dimensions, we need N means, like so:
$$\mu = \begin{bmatrix}\mu_1 \\ \mu_2 \\ \vdots \\ \mu_n\end{bmatrix}$$
Therefore for this example we would have
$$\mu = \begin{bmatrix}2 \\ 17\end{bmatrix}$$
The next step is representing our variances. At first blush we might think we would also need N variances for N dimensions. We might want to say the variance for x is 10 and the variance for y is 4, like so.
$$\sigma^2 = \begin{bmatrix}10 \\ 4\end{bmatrix}$$
This is incorrect because it does not consider the more general case. For example, suppose we were tracking house prices vs total m² of the floor plan. These numbers are correlated. It is not an exact correlation, but in general houses in the same neighborhood are more expensive if they have a larger floor plan. We want a way to express not only what we think the variance is in the price and the m², but also the degree to which they are correlated. It turns out that we use the following matrix to denote covariances with multivariate normal distributions. You might guess, correctly, that covariance is short for correlated variances.

$$\Sigma = \begin{bmatrix}
\sigma_1^2 & \sigma_{12} & \cdots & \sigma_{1n} \\
\sigma_{21} & \sigma_2^2 & \cdots & \sigma_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
\sigma_{n1} & \sigma_{n2} & \cdots & \sigma_n^2
\end{bmatrix}$$
If you haven't seen this before it is probably a bit confusing at the moment. Rather than explain the math right now, we will take our usual tactic of building our intuition first with various physical models. At this point, note that the diagonal contains the variance for each state variable, and that each off-diagonal element (a covariance) represents how much the ith (row) and jth (column) state variables are linearly correlated with each other. In other words, it is a measure of how much they change together. Uncorrelated variables have a covariance of 0. So, for example, if the variance for x is 10, the variance for y is 4, and there is no linear correlation between x and y, then we would say
$$\Sigma = \begin{bmatrix}10 & 0 \\ 0 & 4\end{bmatrix}$$
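One way to build intuition for these matrices is to draw samples and measure them. The sketch below uses NumPy to sample from a distribution with the mean and covariance above, then estimates both from the samples. It repeats the exercise with a hypothetical off-diagonal value of 5, which is not from the text and is chosen only to show what a correlated case looks like.

```python
import numpy as np

mu = np.array([2.0, 17.0])
cov = np.array([[10.0, 0.0],
                [0.0, 4.0]])

rng = np.random.default_rng(1)
samples = rng.multivariate_normal(mu, cov, size=50_000)

# Estimate the mean vector and covariance matrix from the samples.
sample_mean = samples.mean(axis=0)
sample_cov = np.cov(samples, rowvar=False)  # each column is one variable
print(sample_mean)
print(sample_cov)

# Hypothetical correlated case: same variances, covariance of 5.
correlated_cov = np.array([[10.0, 5.0],
                           [5.0, 4.0]])
correlated = rng.multivariate_normal(mu, correlated_cov, size=50_000)
# Correlation coefficient = covariance / (std_x * std_y) = 5 / sqrt(10 * 4)
print(np.corrcoef(correlated, rowvar=False)[0, 1])
```

The estimated diagonal should come out close to 10 and 4, with off-diagonal terms near 0 in the first case. In the second case the correlation coefficient should be close to 5/√40 ≈ 0.79, which says that x and y now tend to rise and fall together.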