This is the second part of the tutorial series on particle filters. In this tutorial part, we derive the particle filter algorithm from scratch. Here are all the tutorials in this series (click on the link to access the tutorial webpage):
PART 1: In Part 1, we explain the big picture of particle filters and formulate the estimation problem.
PART 2: You are currently reading Part 2. In Part 2, we derive the particle filter algorithm from scratch.
PART 3: In Part 3, we explain how to implement the particle filter algorithm in Python from scratch.
In this tutorial part, we first briefly review the big picture of particle filters. Then, we review the main results from probability and statistics that a student needs to know in order to understand the derivation of particle filters. After that, we derive the particle filter algorithm, and finally, we concisely summarize it. Before reading this tutorial part, it is suggested to go over the first tutorial part.
The YouTube tutorial accompanying this webpage is given below.
Big Picture of Particle Filters – Approximation of Posterior Probability Density Function of State Estimate
As explained in the first tutorial part, for presentation clarity, and in order not to blur the main ideas of particle filters with too many control-theory details, we develop a particle filter for the linear state-space model given by the following two equations:
(1)   x_{k} = A x_{k-1} + w_{k-1}
      y_{k} = C x_{k} + v_{k}

where x_{k} is the state vector at the discrete time instant k, y_{k} is the observed output vector, A and C are the system matrices, w_{k-1} is the process disturbance (noise) vector, and v_{k} is the measurement noise vector.
In the first tutorial part, we explained that the main goal of particle filters is to recursively approximate the state posterior probability density function, denoted by

(2)   p(x_{k} | y_{0:k})

where y_{0:k} denotes the set of observed outputs from the initial time instant 0 up to the current time instant k:

(3)   y_{0:k} = { y_{0}, y_{1}, ..., y_{k} }
In the particle filter and robotics literature, the posterior density is also called the filtering density (filtering distribution) or the marginal density (marginal distribution). Some researchers, especially in the robotics literature, also call this density the belief.
At the time instant k, the particle filter approximates the posterior density by a set of N weighted particles:

(4)   { (x_{k}^{(i)}, w_{k}^{(i)}) ,  i = 1, 2, ..., N }

A particle with the index of i consists of the state sample x_{k}^{(i)} and the corresponding weight w_{k}^{(i)}. By using this set of particles, the posterior density is approximated as

(5)   p(x_{k} | y_{0:k}) ≈ Σ_{i=1}^{N} w_{k}^{(i)} δ(x_{k} - x_{k}^{(i)})

where δ(·) is the Dirac delta function, and the weights are normalized such that Σ_{i=1}^{N} w_{k}^{(i)} = 1.
The particle filter design problem is to design an update rule (update functions) for the particles that will update the particles from the time step k-1 to the time step k:

(6)   { (x_{k-1}^{(i)}, w_{k-1}^{(i)}) ,  i = 1, 2, ..., N }  →  { (x_{k}^{(i)}, w_{k}^{(i)}) ,  i = 1, 2, ..., N }

where the update rule should incorporate the output y_{k} observed at the time step k.
Once we have the approximation of the posterior density (5), we can approximate the mean and variance of the state estimate, as well as other statistics, such as higher-order moments or confidence intervals.
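To make the weighted-particle approximation concrete, here is a minimal Python sketch of how the mean and covariance of the state estimate can be computed from a set of weighted particles. The particle states and weights below are randomly generated stand-ins, not the output of an actual filter.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical set of N particles approximating p(x_k | y_{0:k}):
# 2-dimensional state samples x_k^(i) and normalized weights w_k^(i).
N = 1000
states = rng.normal(size=(N, 2))   # stand-in particle states
weights = rng.random(N)
weights /= weights.sum()           # weights must sum to 1

# Weighted mean of the posterior approximation
mean = weights @ states

# Weighted covariance of the posterior approximation
centered = states - mean
cov = (weights[:, None] * centered).T @ centered
```

The same weighted sums generalize to any statistic of the form (5): replace `states` by any function of the particle states.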
Basic Results from Statistics and Probability Necessary to Know For Deriving the Particle Filter
To develop the particle filter algorithm, it is important to know the basic rules of probability and Bayes' theorem. Here, we briefly review the main results from probability theory that you need to know in order to understand the derivation of the particle filter algorithm. Let x and y be two random vectors. Their joint probability density function p(x, y) can be factored as

(7)   p(x, y) = p(x | y) p(y)

or

(8)   p(x | y) = p(x, y) / p(y)

On the other hand, we also have

(9)   p(x, y) = p(y | x) p(x)

By combining (8) and (9), we have

(10)   p(x | y) p(y) = p(y | x) p(x)

From the last equation, we obtain

(11)   p(x | y) = p(y | x) p(x) / p(y)
The last equation is the mathematical formulation of Bayes' theorem (Bayes' rule). Although it looks simple, this is one of the most important and fundamental results in statistics and probability.
In the context of Bayesian estimation, we want to infer statistics or an estimate of x on the basis of the observation y. In this context, Bayes' rule (11) is written as

(12)   p(x | y) = p(y | x) p(x) / p(y)

where p(x) is called the prior density, p(y | x) is called the likelihood, p(x | y) is called the posterior density, and p(y) is a normalization constant that does not depend on x. Consequently, Bayes' rule is often written in the proportional form

(13)   p(x | y) ∝ p(y | x) p(x)

or in the normalized form

(14)   p(x | y) = p(y | x) p(x) / ∫ p(y | x) p(x) dx
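As a quick numerical illustration of Bayes' rule, the following sketch applies (11) to a random variable with three discrete states; the prior and likelihood values below are made up purely for the example.

```python
import numpy as np

# Discrete Bayes' rule: p(x|y) = p(y|x) p(x) / p(y),
# where p(y) = sum over x of p(y|x) p(x) is the normalization constant.
prior = np.array([0.5, 0.3, 0.2])        # p(x) over three hypothetical states
likelihood = np.array([0.9, 0.5, 0.1])   # p(y|x) for an observed y

evidence = likelihood @ prior            # p(y), the normalization constant
posterior = likelihood * prior / evidence
```

Note how the posterior concentrates on the first state, whose likelihood is largest, while still being influenced by the prior.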
Derivation of Particle Filter Algorithm
After reviewing the basics of Bayes' rule, let us now go back to the derivation of the particle filter. We start from the probability density function that takes into account all the states, starting from the initial time instant 0 up to the current time instant k:

(15)   p(x_{0:k} | y_{0:k})

where x_{0:k} denotes the state sequence

(16)   x_{0:k} = { x_{0}, x_{1}, ..., x_{k} }

This density takes into account the complete state sequence x_{0:k}, and it is called the smoothed posterior density. We are interested in computing expectations of the form

(17)   E[ g(x_{0:k}) | y_{0:k} ] = ∫ g(x_{0:k}) p(x_{0:k} | y_{0:k}) dx_{0:k}

where g(·) is a (vector) function of the state sequence x_{0:k}.
This equation is the expectation of the (vector) function of the state sequence with respect to the smoothed posterior density.
The equation (17) is the most important equation for us. The main purpose of the estimation process is to compute an approximation of the smoothed posterior density that will enable us to compute this integral. Namely, if we knew the smoothed posterior density, then we would be able, at least in theory (keep in mind that solving an expectation integral can sometimes be a very difficult task), to approximate any statistics of the smoothed posterior state estimate. For example, if we want to compute the mean of the estimated state sequence, we simply select g(x_{0:k}) = x_{0:k} in (17).
However, we do not know the smoothed posterior density p(x_{0:k} | y_{0:k}), and consequently, we cannot evaluate the expectation integral (17) directly.
Here comes the first remedy for this problem (we will need several remedies). The remedy is to use the importance sampling method. In our previous tutorial, given here, we explained the basics of the importance sampling method. The main idea of importance sampling is to select a new density or a function, denoted by q(x_{0:k} | y_{0:k}) and called the importance (proposal) density, from which it is easy to draw samples. The expectation (17) is then approximated as

(18)   E[ g(x_{0:k}) | y_{0:k} ] ≈ Σ_{i=1}^{N} w_{k}^{(i)} g(x_{0:k}^{(i)})

where the state-sequence samples are drawn from the proposal density,

(19)   x_{0:k}^{(i)} ~ q(x_{0:k} | y_{0:k}) ,  i = 1, 2, ..., N

the general form of the weight is

(20)   w_{k}^{(i)} ∝ p(x_{0:k}^{(i)} | y_{0:k}) / q(x_{0:k}^{(i)} | y_{0:k})

and the weights are normalized such that they sum up to 1.
What did we achieve with this strategy? First of all, notice that we did not use the unknown smoothed posterior density to generate the state-sequence samples x_{0:k}^{(i)}; instead, the samples are generated from the proposal density, which we are free to select.
However, the unknown smoothed posterior density is still used to compute the weights w_{k}^{(i)} in (20). Consequently, we need a way of computing the weights that does not require an explicit expression for the smoothed posterior density.
To summarize: the main idea of the particle filter derivation is to derive a recursive equation that expresses the weight w_{k}^{(i)} at the time step k in terms of the weight w_{k-1}^{(i)} computed at the previous time step k-1.
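The importance-sampling idea itself can be illustrated with a toy example in which the target density is known, so that the estimate can be checked. This is only a sketch: the Gaussian target and proposal below are arbitrary choices made for the illustration, not part of the filter derivation.

```python
import numpy as np

rng = np.random.default_rng(1)

def gauss_pdf(x, mean, var):
    """Univariate Gaussian probability density function."""
    return np.exp(-0.5 * (x - mean) ** 2 / var) / np.sqrt(2 * np.pi * var)

# Target density p(x): N(1, 1) -- stands in for the (here known) posterior.
# Proposal density q(x): N(0, 4) -- a density we can easily sample from.
M = 200_000
samples = rng.normal(0.0, 2.0, size=M)   # draws from q

# Self-normalized importance weights, w^(i) proportional to p/q as in (20).
weights = gauss_pdf(samples, 1.0, 1.0) / gauss_pdf(samples, 0.0, 4.0)
weights /= weights.sum()

mean_est = weights @ samples             # approximates E_p[x] = 1
```

Even though no sample was drawn from the target density, the weighted average recovers its mean, which is exactly the mechanism the particle filter exploits.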
Let us now start with the derivation of the recursive relation for w_{k}^{(i)}. First, we write the smoothed posterior density by explicitly separating the last observed output:

(21)   p(x_{0:k} | y_{0:k}) = p(x_{0:k} | y_{k}, y_{0:k-1})

Then, by using Bayes' rule, we have

(22)   p(x_{0:k} | y_{k}, y_{0:k-1}) = p(y_{k} | x_{0:k}, y_{0:k-1}) p(x_{0:k} | y_{0:k-1}) / p(y_{k} | y_{0:k-1})

where

(23)   p(y_{k} | x_{0:k}, y_{0:k-1}) = p(y_{k} | x_{k})

This is a direct consequence of the form of the output equation of the state-space model (1). The distribution of the output only depends on the state and the measurement noise at the time step k; once x_{k} is given, the past states and past outputs carry no additional information about y_{k}.
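If we additionally assume Gaussian measurement noise v_{k} ~ N(0, R) (an assumption about the noise statistics, not required by the derivation itself), the measurement density p(y_{k} | x_{k}) can be evaluated in closed form as N(y_{k}; C x_{k}, R). A sketch for a hypothetical scalar-output case:

```python
import numpy as np

# Assuming y_k = C x_k + v_k with Gaussian noise v_k ~ N(0, R),
# the measurement density is p(y_k | x_k) = N(y_k; C x_k, R).
C = np.array([[1.0, 0.0]])   # hypothetical 1x2 output matrix
R = np.array([[0.25]])       # hypothetical measurement-noise covariance

def measurement_density(y, x):
    """Evaluate p(y_k | x_k) for the scalar-output case."""
    mean = (C @ x)[0]
    var = R[0, 0]
    return float(np.exp(-0.5 * (y - mean) ** 2 / var) / np.sqrt(2 * np.pi * var))

x = np.array([0.5, -0.2])
p_close = measurement_density(0.5, x)   # y_k close to C x_k: large density
p_far = measurement_density(2.0, x)     # y_k far from C x_k: small density
```

This density is exactly what the weight update of the bootstrap filter will evaluate at each particle later in the derivation.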
By substituting (23) in (22), we obtain

(24)   p(x_{0:k} | y_{0:k}) = p(y_{k} | x_{k}) p(x_{0:k} | y_{0:k-1}) / p(y_{k} | y_{0:k-1})
Next, since the denominator p(y_{k} | y_{0:k-1}) does not depend on the state sequence, it acts as a normalization constant, and we can write

(25)   p(x_{0:k} | y_{0:k}) ∝ p(y_{k} | x_{k}) p(x_{0:k} | y_{0:k-1})
Next, we can use (8) to write the density p(x_{0:k} | y_{0:k-1}) as follows:

(26)   p(x_{0:k} | y_{0:k-1}) = p(x_{k} | x_{0:k-1}, y_{0:k-1}) p(x_{0:k-1} | y_{0:k-1})
Next, since the state-space model is Markov, the probability density p(x_{k} | x_{0:k-1}, y_{0:k-1}) simplifies to

(27)   p(x_{k} | x_{0:k-1}, y_{0:k-1}) = p(x_{k} | x_{k-1})

This is because of the structure of the state equation of (1). The probability density function of the state at the time instant k only depends on the state at the time instant k-1 and on the statistics of the process noise; the older states and the past outputs carry no additional information about x_{k}.
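Under the analogous assumption of Gaussian process noise w_{k} ~ N(0, Q) (again an assumption about the noise statistics), the state-transition density is p(x_{k} | x_{k-1}) = N(x_{k}; A x_{k-1}, Q), and sampling from it amounts to propagating the state through the state equation and adding a noise draw. A sketch with illustrative matrices:

```python
import numpy as np

rng = np.random.default_rng(2)

# Assuming x_k = A x_{k-1} + w_{k-1} with Gaussian noise w ~ N(0, Q),
# the state-transition density is p(x_k | x_{k-1}) = N(x_k; A x_{k-1}, Q).
A = np.array([[1.0, 0.1],
              [0.0, 1.0]])    # hypothetical 2x2 system matrix
Q = 0.01 * np.eye(2)          # hypothetical process-noise covariance

def sample_transition(x_prev):
    """Draw one sample from p(x_k | x_{k-1})."""
    return A @ x_prev + rng.multivariate_normal(np.zeros(2), Q)

x_next = sample_transition(np.array([1.0, 0.0]))
```

Drawing from this density is all that STEP 1 of the bootstrap filter will require.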
On the other hand, the density p(x_{0:k-1} | y_{0:k-1}) cannot be simplified further; it is precisely the smoothed posterior density computed at the previous time step:

(28)   p(x_{0:k-1} | y_{0:k-1})
By substituting (27) and (28) in (26), we obtain

(29)   p(x_{0:k} | y_{0:k-1}) = p(x_{k} | x_{k-1}) p(x_{0:k-1} | y_{0:k-1})
By substituting (29) in (25), we obtain

(30)   p(x_{0:k} | y_{0:k}) ∝ p(y_{k} | x_{k}) p(x_{k} | x_{k-1}) p(x_{0:k-1} | y_{0:k-1})
The expression (30) is important since it establishes a recursive relation for the probability density p(x_{0:k} | y_{0:k}): the smoothed posterior density at the time step k is expressed in terms of the smoothed posterior density at the time step k-1.
Next, we need to derive a recursive equation for the smoothed proposal density (also known as the smoothed importance density). Again, we start from the general form of the proposal density that takes into account the complete state sequence:

(31)   q(x_{0:k} | y_{0:k})
This density can be written as

(32)   q(x_{0:k} | y_{0:k}) = q(x_{k}, x_{0:k-1} | y_{0:k})

Next, by using (8), we have

(33)   q(x_{0:k} | y_{0:k}) = q(x_{k} | x_{0:k-1}, y_{0:k}) q(x_{0:k-1} | y_{0:k})
Since the proposal density is not strictly related to our dynamical system, the Markov property does not hold by default. However, we can always make an assumption that is similar to the Markov property. Consequently, we assume

(34)   q(x_{k} | x_{0:k-1}, y_{0:k}) = q(x_{k} | x_{k-1}, y_{k})

Also, we assume that the conditional density q(x_{0:k-1} | y_{0:k}) does not depend on the output y_{k} observed at the time step k:

(35)   q(x_{0:k-1} | y_{0:k}) = q(x_{0:k-1} | y_{0:k-1})
By substituting (34) and (35) in (33), we obtain the final equation for the batch importance density:

(36)   q(x_{0:k} | y_{0:k}) = q(x_{k} | x_{k-1}, y_{k}) q(x_{0:k-1} | y_{0:k-1})
We see that the smoothed proposal density q(x_{0:k} | y_{0:k}) at the time step k is the product of the density q(x_{k} | x_{k-1}, y_{k}), from which we will draw the samples x_{k}^{(i)}, and the smoothed proposal density at the previous time step k-1.
Now, we can substitute the recursive relations given by (30) and (36) into the expression (20) for the importance weight of the smoothed posterior density. As a result, we obtain

(37)   w_{k}^{(i)} ∝ [ p(y_{k} | x_{k}^{(i)}) p(x_{k}^{(i)} | x_{k-1}^{(i)}) / q(x_{k}^{(i)} | x_{k-1}^{(i)}, y_{k}) ] · [ p(x_{0:k-1}^{(i)} | y_{0:k-1}) / q(x_{0:k-1}^{(i)} | y_{0:k-1}) ]
The last term in the last equation is actually the value of the weight of the smoothed posterior density at the time step k-1:

(38)   w_{k-1}^{(i)} ∝ p(x_{0:k-1}^{(i)} | y_{0:k-1}) / q(x_{0:k-1}^{(i)} | y_{0:k-1})
Consequently, we can write (37) as follows:

(39)   w_{k}^{(i)} ∝ w_{k-1}^{(i)} p(y_{k} | x_{k}^{(i)}) p(x_{k}^{(i)} | x_{k-1}^{(i)}) / q(x_{k}^{(i)} | x_{k-1}^{(i)}, y_{k})
Here, it is important to observe the following:
We can compute the weight w_{k}^{(i)} recursively. Namely, at the time step k, by using the normalized weight w_{k-1}^{(i)} computed at the previous time step, we first compute the non-normalized weight

(40)   w̃_{k}^{(i)} = w_{k-1}^{(i)} p(y_{k} | x_{k}^{(i)}) p(x_{k}^{(i)} | x_{k-1}^{(i)}) / q(x_{k}^{(i)} | x_{k-1}^{(i)}, y_{k})

where w̃_{k}^{(i)} denotes the non-normalized weight, and we normalize these computed quantities to obtain

(41)   w_{k}^{(i)} = w̃_{k}^{(i)} / Σ_{j=1}^{N} w̃_{k}^{(j)}
Now that we know how the weights can be computed in a recursive manner, we need to address the following problem.
We can observe that in order to compute the weight (40), we need to know the density (function) q(x_{k} | x_{k-1}, y_{k}).
The idea is to select this function such that
- We can easily draw samples from the distribution corresponding to this density.
- The expression for the weights (40) simplifies.
One of the simplest possible particle filters, called the bootstrap particle filter or the Sequential Importance Resampling (SIR) filter, selects this function as the state-transition density:

(42)   q(x_{k} | x_{k-1}, y_{k}) = p(x_{k} | x_{k-1})
By substituting (42) in (40), we obtain

(43)   w̃_{k}^{(i)} = w_{k-1}^{(i)} p(y_{k} | x_{k}^{(i)}) p(x_{k}^{(i)} | x_{k-1}^{(i)}) / p(x_{k}^{(i)} | x_{k-1}^{(i)})
This implies that for the sample x_{k}^{(i)}, after the state-transition densities cancel, the non-normalized weight is computed as

(44)   w̃_{k}^{(i)} = w_{k-1}^{(i)} p(y_{k} | x_{k}^{(i)})
That is, the weight is computed by using the previous value of the weight and the value of the measurement density p(y_{k} | x_{k}) evaluated at the sample x_{k}^{(i)}.
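The bootstrap weight update, including the normalization (41), can be sketched in Python as follows, assuming a scalar output and Gaussian measurement noise with variance r; the particle values below are random stand-ins.

```python
import numpy as np

def update_weights(weights_prev, particles, y, C, r):
    """Bootstrap weight update: w_k^(i) proportional to w_{k-1}^(i) p(y_k | x_k^(i)).

    particles: (N, n) array of state samples x_k^(i)
    weights_prev: (N,) normalized weights from the previous time step
    C: length-n output vector (scalar output), r: measurement-noise variance
    """
    y_pred = particles @ C                 # predicted outputs C x_k^(i)
    likelihood = np.exp(-0.5 * (y - y_pred) ** 2 / r) / np.sqrt(2 * np.pi * r)
    w = weights_prev * likelihood          # intermediate (non-normalized) weights
    return w / w.sum()                     # normalized weights

rng = np.random.default_rng(3)
particles = rng.normal(size=(500, 2))      # stand-in particle states
w_prev = np.full(500, 1.0 / 500)
w = update_weights(w_prev, particles, y=0.3, C=np.array([1.0, 0.0]), r=0.25)
```

Particles whose predicted output is close to the observed y receive large weights; particles far from it are down-weighted.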
Now, we are ready to formulate the particle filter algorithm.
Particle Filter Algorithm
First, we state the algorithm, and then, we provide detailed comments about every step of the algorithm.
STATEMENT OF THE SIR (BOOTSTRAP) PARTICLE FILTER ALGORITHM: For the initial set of particles

(45)   { (x_{0}^{(i)}, w_{0}^{(i)}) ,  i = 1, 2, ..., N }

perform the following steps for k = 1, 2, 3, ...
- STEP 1: Using the state samples x_{k-1}^{(i)}, i = 1, 2, ..., N, computed at the previous time step k-1, at the time step k generate the state samples x_{k}^{(i)}, i = 1, 2, ..., N, from the state-transition probability with the density p(x_{k} | x_{k-1}^{(i)}). That is,

(46)   x_{k}^{(i)} ~ p(x_{k} | x_{k-1}^{(i)}) ,  i = 1, 2, ..., N
- STEP 2: By using the observed output y_{k} and the weight w_{k-1}^{(i)} computed at the previous time step k-1, calculate the intermediate weight update, denoted by w̃_{k}^{(i)}, by using the density function of the measurement probability:

(47)   w̃_{k}^{(i)} = w_{k-1}^{(i)} p(y_{k} | x_{k}^{(i)}) ,  i = 1, 2, ..., N

After all the intermediate weights w̃_{k}^{(i)}, i = 1, 2, ..., N, are computed, compute the normalized weights as follows:

(48)   w_{k}^{(i)} = w̃_{k}^{(i)} / Σ_{j=1}^{N} w̃_{k}^{(j)}
After STEP 1 and STEP 2, we have computed the set of particles for the time step k:

(49)   { (x_{k}^{(i)}, w_{k}^{(i)}) ,  i = 1, 2, ..., N }
- STEP 3 (RESAMPLING): This is a resampling step that is performed only if the condition given below is satisfied. Calculate the constant N_{eff}, known as the effective sample size:

(50)   N_{eff} = 1 / Σ_{i=1}^{N} ( w_{k}^{(i)} )^{2}

If N_{eff} is below a user-defined threshold, then generate a new set of particles

(51)   { (x_{k}^{(i),NEW}, w_{k}^{(i),NEW} = 1/N) ,  i = 1, 2, ..., N }  from  { (x_{k}^{(i)}, w_{k}^{(i)}) ,  i = 1, 2, ..., N }_{OLD}
by using a resampling method (see the comments below). In (51), the “OLD” set of particles is the set of particles computed after STEP 1 and STEP 2, that is, the set given in (49).
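The three steps above can be sketched end to end. The following is a minimal illustrative Python implementation for a linear Gaussian model, not a definitive one: the matrices, noise covariances, uniform initialization range, and resampling threshold N/3 are all arbitrary choices made for the example, and plain multinomial resampling is used in STEP 3.

```python
import numpy as np

rng = np.random.default_rng(4)

# Illustrative model x_k = A x_{k-1} + w_{k-1}, y_k = C x_k + v_k
A = np.array([[0.9, 0.1],
              [0.0, 0.8]])
C = np.array([1.0, 0.0])    # scalar output
Q = 0.01 * np.eye(2)        # process-noise covariance (assumed Gaussian noise)
r = 0.04                    # measurement-noise variance (assumed Gaussian noise)
N = 500                     # number of particles

def sir_filter(measurements):
    # Initialization: spread particles over the expected initial-state range,
    # with equal weights 1/N.
    particles = rng.uniform(-1.0, 1.0, size=(N, 2))
    weights = np.full(N, 1.0 / N)
    estimates = []
    for y in measurements:
        # STEP 1: propagate each particle through p(x_k | x_{k-1}).
        particles = particles @ A.T + rng.multivariate_normal(np.zeros(2), Q, size=N)
        # STEP 2: weight update with the measurement density p(y_k | x_k)
        # (constant factor omitted, since the weights are normalized anyway).
        likelihood = np.exp(-0.5 * (y - particles @ C) ** 2 / r)
        weights = weights * likelihood
        weights /= weights.sum()
        # Posterior-mean estimate from the weighted particles.
        estimates.append(weights @ particles)
        # STEP 3: resample when the effective sample size is too small.
        n_eff = 1.0 / np.sum(weights ** 2)
        if n_eff < N / 3:
            idx = rng.choice(N, size=N, p=weights)   # multinomial resampling
            particles = particles[idx]
            weights = np.full(N, 1.0 / N)
    return np.array(estimates)

# Simulate the model to produce test measurements.
x = np.array([0.5, -0.5])
ys = []
for _ in range(50):
    x = A @ x + rng.multivariate_normal(np.zeros(2), Q)
    ys.append(C @ x + rng.normal(0.0, np.sqrt(r)))
est = sir_filter(ys)
```

A production implementation would typically vectorize more carefully, guard against all weights underflowing to zero, and use a lower-variance resampling scheme.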
Algorithm Explanation: We initialize the algorithm with an initial set of particles. This set can be obtained by generating initial state samples from some distribution that covers the expected range of the initial state and by assigning the corresponding weights. For example, we can generate the initial state samples from a uniform distribution and assign equal weights, 1/N, to all of them.
In step 1 of the algorithm, for the given value of the state sample x_{k-1}^{(i)}, we generate the state sample x_{k}^{(i)} at the time step k by drawing from the state-transition density p(x_{k} | x_{k-1}^{(i)}).
In step 2 of the algorithm, we first compute the intermediate (non-normalized) weights w̃_{k}^{(i)} by using (47), and then we compute the normalized weights by using (48).
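A common concrete choice for the resampling operation in STEP 3 is systematic resampling. The following sketch shows one way it could be implemented; the degenerate weight vector is constructed by hand purely to illustrate the effect.

```python
import numpy as np

def systematic_resample(weights, rng):
    """Draw N new particle indices with probabilities given by the weights.

    Systematic resampling places N evenly spaced points, offset by a single
    uniform random draw, on the cumulative sum of the weights: particles with
    large weights are duplicated, particles with negligible weights are dropped.
    """
    N = len(weights)
    positions = (rng.random() + np.arange(N)) / N
    cumulative = np.cumsum(weights)
    cumulative[-1] = 1.0   # guard against floating-point rounding
    return np.searchsorted(cumulative, positions)

rng = np.random.default_rng(5)
# Degenerate weight set: one particle carries almost all of the weight.
weights = np.full(100, 0.001)
weights[0] = 1.0 - weights[1:].sum()
idx = systematic_resample(weights, rng)
# The dominant particle is duplicated in roughly 90 of the 100 new slots.
```

Because it uses a single random offset instead of N independent draws, systematic resampling has lower variance than plain multinomial resampling.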
Step 3 of the algorithm is performed only if the condition stated above is satisfied. If the condition is satisfied, we need to resample our particles. Here is the reason for resampling. It often happens that, after several iterations of the particle filter algorithm, only one or a few particles have non-negligible weights. This means that most of the particle weights have very small values; effectively, these weights are equal to zero. This is the so-called degeneracy problem in particle filtering. Since the time propagation of every particle consumes computational resources, we waste computational resources by propagating particles with very small weights. One of the most effective approaches to deal with the degeneracy problem is to resample the particles. This means that we generate a new set of