In this post, we derive the equations describing the recursive least squares method. The motivation for creating this tutorial comes from the fact that these equations are used in the derivation of the Kalman filter equations. Apart from this, the recursive least squares method is the basis of many other algorithms, such as adaptive control algorithms. Consequently, it is of paramount importance to properly understand the recursive least-squares method. The derivation of the Kalman filter equations on the basis of the recursive least-squares equations is arguably much simpler and easier to understand than derivations based on other methods or approaches.
The Python implementation of the derived least-squares method is given here.
The YouTube videos accompanying this post are given below.
Motivational Example
To motivate the need for the recursive least squares method, we consider the following example. We assume that a vehicle is moving from a starting position, and that its position as a function of time is described by the model
(1)
We assume that we are able to measure the position at the discrete-time instants
(2)
where
(3)
and our vector of unknown parameters that we want to estimate is
(4)
The goal of the recursive least squares method is to estimate the vector of unknown parameters from the position measurements, and to update this estimate every time a new measurement arrives.
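The exact equations and numerical values of this example are not essential for the derivation that follows. Purely as an illustration (this is an assumed setup, not necessarily the one used in the original example), one can think of a vehicle moving with a constant acceleration: its position $z(t)$ depends on the unknown initial position $z_{0}$, initial velocity $\dot{z}_{0}$, and acceleration $\ddot{z}$, which are stacked in the vector of unknowns, and every noisy position measurement is linear in these unknowns:

$z(t)=z_{0}+\dot{z}_{0}t+\frac{1}{2}\ddot{z}t^{2},\qquad \mathbf{x}=\begin{bmatrix} z_{0} \\ \dot{z}_{0} \\ \ddot{z} \end{bmatrix},\qquad y_{i}=\begin{bmatrix} 1 & t_{i} & \frac{1}{2}t_{i}^{2}\end{bmatrix}\mathbf{x}+v_{i}$

where $y_{i}$ is the position measurement taken at the time instant $t_{i}$ and $v_{i}$ is the measurement noise. Every new measurement adds one more equation of this form, which is exactly the situation that the recursive least squares method is designed for.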
Issues with the ordinary least squares method
In our previous post, which can be found here, we derived the solution of the least squares problem. Here, for completeness of this tutorial, we briefly review the ordinary least squares method.
Consider the following system of linear equations
(5) $y_{i}=h_{i1}x_{1}+h_{i2}x_{2}+\ldots+h_{in}x_{n}+v_{i},\;\;\; i=1,2,\ldots,s$
where $y_{i}$ is the $i$-th measurement, $v_{i}$ is the measurement noise corrupting the $i$-th measurement, $h_{i1},h_{i2},\ldots,h_{in}$ are known coefficients, and
(6) $\mathbf{x}=\begin{bmatrix} x_{1} & x_{2} & \ldots & x_{n} \end{bmatrix}^{T}$
where $\mathbf{x}$ is the $n$-dimensional vector of unknowns that we want to estimate.
The equations in (5) can be compactly written as follows
(7) $\mathbf{y}=H\mathbf{x}+\mathbf{v}$
where
(8) $\mathbf{y}=\begin{bmatrix} y_{1} \\ y_{2} \\ \vdots \\ y_{s} \end{bmatrix},\;\;\; H=\begin{bmatrix} h_{11} & h_{12} & \ldots & h_{1n} \\ h_{21} & h_{22} & \ldots & h_{2n} \\ \vdots & \vdots & & \vdots \\ h_{s1} & h_{s2} & \ldots & h_{sn} \end{bmatrix},\;\;\; \mathbf{v}=\begin{bmatrix} v_{1} \\ v_{2} \\ \vdots \\ v_{s} \end{bmatrix}$
Usually, in the least squares estimation problems, the number of measurements $s$ is much larger than the number of unknowns $n$, and the vector $\mathbf{x}$ is estimated by solving the following optimization problem
(9) $\min_{\mathbf{x}} \left\|\mathbf{y}-H\mathbf{x} \right\|_{2}^{2}$
There are a number of computationally efficient approaches for solving (9). Here, for brevity and simplicity, we use a solution that is not the most efficient one to compute. This solution is given by the following equation
(10) $\hat{\mathbf{x}}=\left(H^{T}H \right)^{-1}H^{T}\mathbf{y}$
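To make the batch solution (10) concrete, here is a minimal Python sketch. The test problem, the variable names, and the noise level are illustrative assumptions and are not taken from the linked implementation:

```python
import numpy as np

# Assumed illustrative problem: s = 100 noisy measurements, n = 3 unknowns.
rng = np.random.default_rng(0)
s, n = 100, 3
H = rng.normal(size=(s, n))                 # measurement matrix
x_true = np.array([1.0, -2.0, 0.5])         # the unknown vector we want to estimate
y = H @ x_true + 0.1 * rng.normal(size=s)   # noisy measurements

# Ordinary least squares solution, equation (10): x_hat = (H^T H)^{-1} H^T y
x_hat = np.linalg.inv(H.T @ H) @ H.T @ y

# A numerically preferable way of solving (9) that gives the same estimate.
x_lstsq, *_ = np.linalg.lstsq(H, y, rcond=None)

print(x_hat)
print(x_lstsq)
```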
There are at least two issues with the formulation and solution of the least squares problem:
- The least-squares problem formulation and its solution assume that all measurements are available at a certain time instant. However, this often might not be the case in practice. In addition, we would like to update the solution as new measurements arrive, in order to have an adaptive estimation approach.
- When the number of measurements is extremely large, the solution of the least squares problem is difficult to compute. That is, the computational and memory complexities of the solution are very high.
These facts motivate the development of the recursive least squares method that is presented in the sequel.
Recursive Least Squares Method – Complete Derivation From Scratch
Let us assume that at the discrete-time instant $k$, we obtain a vector of measurements $\mathbf{y}_{k}$ that is related to the constant vector $\mathbf{x}$ that we want to estimate through the measurement equation
(11) $\mathbf{y}_{k}=H_{k}\mathbf{x}+\mathbf{v}_{k}$
where $H_{k}$ is the measurement matrix, $\mathbf{v}_{k}$ is the measurement noise vector, and the dimensions of these quantities are
(12) $\mathbf{y}_{k}\in \mathbb{R}^{m},\;\; H_{k}\in \mathbb{R}^{m\times n},\;\; \mathbf{x}\in \mathbb{R}^{n},\;\; \mathbf{v}_{k}\in \mathbb{R}^{m}$
and we assume that the mean of the measurement noise is zero and that its covariance matrix is known, that is,
(13) $E\left[\mathbf{v}_{k}\right]=0,\;\; E\left[\mathbf{v}_{k}\mathbf{v}_{k}^{T}\right]=R_{k}$
The recursive least-squares method that we derive in this section has the following form:
(14) $\hat{\mathbf{x}}_{k}=\hat{\mathbf{x}}_{k-1}+K_{k}\left(\mathbf{y}_{k}-H_{k}\hat{\mathbf{x}}_{k-1}\right)$
where $\hat{\mathbf{x}}_{k}$ is the estimate of $\mathbf{x}$ computed at the discrete-time instant $k$, $\hat{\mathbf{x}}_{k-1}$ is the estimate computed at the previous time instant $k-1$, and $K_{k}$ is the gain matrix that needs to be determined.
The equation (14) updates the estimate of $\mathbf{x}$ on the basis of the previous estimate $\hat{\mathbf{x}}_{k-1}$ and the newly obtained measurement $\mathbf{y}_{k}$.
Our goal is to derive the formulas for computing the gain matrix $K_{k}$ and for propagating the estimation error covariance matrix.
First, we need to introduce new notation. We assume that the vector $\mathbf{x}$ is partitioned as follows
(15) $\mathbf{x}=\begin{bmatrix} x_{1} & x_{2} & \ldots & x_{n} \end{bmatrix}^{T}$
This partitioning directly corresponds to the partitioning of the vector $\hat{\mathbf{x}}_{k}$
(16) $\hat{\mathbf{x}}_{k}=\begin{bmatrix} \hat{x}_{1,k} & \hat{x}_{2,k} & \ldots & \hat{x}_{n,k} \end{bmatrix}^{T}$
Next, we introduce the estimation error vector
(17) $\boldsymbol{\varepsilon}_{k}=\mathbf{x}-\hat{\mathbf{x}}_{k}$
That is, the entries of the estimation error vector are
(18) $\boldsymbol{\varepsilon}_{k}=\begin{bmatrix} \varepsilon_{1,k} \\ \varepsilon_{2,k} \\ \vdots \\ \varepsilon_{n,k} \end{bmatrix}=\begin{bmatrix} x_{1}-\hat{x}_{1,k} \\ x_{2}-\hat{x}_{2,k} \\ \vdots \\ x_{n}-\hat{x}_{n,k} \end{bmatrix}$
The gain matrix $K_{k}$ is determined by minimizing the sum of the expected squared estimation errors, that is, by minimizing the following cost function
(19) $J_{k}=E\left[\varepsilon_{1,k}^{2}+\varepsilon_{2,k}^{2}+\ldots+\varepsilon_{n,k}^{2}\right]$
Next, we show that this cost function can be represented as follows:
(20) $J_{k}=\text{tr}\left(P_{k}\right)$
where $P_{k}$ is the estimation error covariance matrix defined by
(21) $P_{k}=E\left[\boldsymbol{\varepsilon}_{k}\boldsymbol{\varepsilon}_{k}^{T}\right]$
That is, the gain matrix $K_{k}$ is determined by minimizing the trace of the estimation error covariance matrix $P_{k}$.
To show this, let us first expand the following expression
(22) $\boldsymbol{\varepsilon}_{k}\boldsymbol{\varepsilon}_{k}^{T}=\begin{bmatrix} \varepsilon_{1,k}^{2} & \varepsilon_{1,k}\varepsilon_{2,k} & \ldots & \varepsilon_{1,k}\varepsilon_{n,k} \\ \varepsilon_{2,k}\varepsilon_{1,k} & \varepsilon_{2,k}^{2} & \ldots & \varepsilon_{2,k}\varepsilon_{n,k} \\ \vdots & \vdots & \ddots & \vdots \\ \varepsilon_{n,k}\varepsilon_{1,k} & \varepsilon_{n,k}\varepsilon_{2,k} & \ldots & \varepsilon_{n,k}^{2} \end{bmatrix}$
From the last expression, we have
(23) $P_{k}=E\left[\boldsymbol{\varepsilon}_{k}\boldsymbol{\varepsilon}_{k}^{T}\right]=\begin{bmatrix} E\left[\varepsilon_{1,k}^{2}\right] & \ldots & E\left[\varepsilon_{1,k}\varepsilon_{n,k}\right] \\ \vdots & \ddots & \vdots \\ E\left[\varepsilon_{n,k}\varepsilon_{1,k}\right] & \ldots & E\left[\varepsilon_{n,k}^{2}\right] \end{bmatrix}$
Next, by taking the matrix trace of the last expression, we obtain
(24) $\text{tr}\left(P_{k}\right)=E\left[\varepsilon_{1,k}^{2}\right]+E\left[\varepsilon_{2,k}^{2}\right]+\ldots+E\left[\varepsilon_{n,k}^{2}\right]=E\left[\varepsilon_{1,k}^{2}+\varepsilon_{2,k}^{2}+\ldots+\varepsilon_{n,k}^{2}\right]$
and this is exactly the expression for the cost function $J_{k}$ defined in (19). This proves the representation (20).
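The representation (20) can also be verified numerically. The following short Python sketch (the dimension, the covariance matrix, and the number of samples are illustrative assumptions) estimates the expected sum of squared errors by Monte Carlo sampling and compares it with the trace of the error covariance matrix:

```python
import numpy as np

rng = np.random.default_rng(1)
n, num_samples = 4, 200_000

# Assumed error covariance matrix for the illustration (any symmetric positive definite matrix works).
A = rng.normal(size=(n, n))
P = A @ A.T + n * np.eye(n)

# Draw zero-mean error vectors with covariance P.
eps = rng.multivariate_normal(np.zeros(n), P, size=num_samples)

cost = np.mean(np.sum(eps**2, axis=1))  # Monte Carlo estimate of J = E[eps_1^2 + ... + eps_n^2]
trace_P = np.trace(P)                   # tr(P), the right-hand side of (20)

print(cost, trace_P)                    # the two values should be close
```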
Next, we derive the expression for the time propagation of the estimation error covariance matrix. By substituting the estimate propagation equation (14) and the measurement equation (11) in the expression for the estimation error, we obtain
(25) $\boldsymbol{\varepsilon}_{k}=\mathbf{x}-\hat{\mathbf{x}}_{k}=\mathbf{x}-\hat{\mathbf{x}}_{k-1}-K_{k}\left(H_{k}\mathbf{x}+\mathbf{v}_{k}-H_{k}\hat{\mathbf{x}}_{k-1}\right)=\left(I-K_{k}H_{k}\right)\boldsymbol{\varepsilon}_{k-1}-K_{k}\mathbf{v}_{k}$
Next, we have
(26) $\boldsymbol{\varepsilon}_{k}\boldsymbol{\varepsilon}_{k}^{T}=\left(I-K_{k}H_{k}\right)\boldsymbol{\varepsilon}_{k-1}\boldsymbol{\varepsilon}_{k-1}^{T}\left(I-K_{k}H_{k}\right)^{T}-\left(I-K_{k}H_{k}\right)\boldsymbol{\varepsilon}_{k-1}\mathbf{v}_{k}^{T}K_{k}^{T}-K_{k}\mathbf{v}_{k}\boldsymbol{\varepsilon}_{k-1}^{T}\left(I-K_{k}H_{k}\right)^{T}+K_{k}\mathbf{v}_{k}\mathbf{v}_{k}^{T}K_{k}^{T}$
Applying the expectation operator to (26), we obtain
(27) $P_{k}=\left(I-K_{k}H_{k}\right)E\left[\boldsymbol{\varepsilon}_{k-1}\boldsymbol{\varepsilon}_{k-1}^{T}\right]\left(I-K_{k}H_{k}\right)^{T}-\left(I-K_{k}H_{k}\right)E\left[\boldsymbol{\varepsilon}_{k-1}\mathbf{v}_{k}^{T}\right]K_{k}^{T}-K_{k}E\left[\mathbf{v}_{k}\boldsymbol{\varepsilon}_{k-1}^{T}\right]\left(I-K_{k}H_{k}\right)^{T}+K_{k}E\left[\mathbf{v}_{k}\mathbf{v}_{k}^{T}\right]K_{k}^{T}$
We have
(28) $E\left[\boldsymbol{\varepsilon}_{k-1}\boldsymbol{\varepsilon}_{k-1}^{T}\right]=P_{k-1}$
Then, since the estimation error $\boldsymbol{\varepsilon}_{k-1}$ and the measurement noise $\mathbf{v}_{k}$ are statistically independent, and since the noise is zero mean, we have
(29) $E\left[\boldsymbol{\varepsilon}_{k-1}\mathbf{v}_{k}^{T}\right]=E\left[\boldsymbol{\varepsilon}_{k-1}\right]E\left[\mathbf{v}_{k}^{T}\right]=0$
(30) $E\left[\mathbf{v}_{k}\boldsymbol{\varepsilon}_{k-1}^{T}\right]=E\left[\mathbf{v}_{k}\right]E\left[\boldsymbol{\varepsilon}_{k-1}^{T}\right]=0$
(31) $E\left[\mathbf{v}_{k}\mathbf{v}_{k}^{T}\right]=R_{k}$
By substituting (28), (29), (30), and (31) in (27), we obtain an important expression for the propagation of the estimation error covariance matrix
(32) $P_{k}=\left(I-K_{k}H_{k}\right)P_{k-1}\left(I-K_{k}H_{k}\right)^{T}+K_{k}R_{k}K_{k}^{T}$
Let us expand the last expression
(33) $P_{k}=P_{k-1}-K_{k}H_{k}P_{k-1}-P_{k-1}H_{k}^{T}K_{k}^{T}+K_{k}H_{k}P_{k-1}H_{k}^{T}K_{k}^{T}+K_{k}R_{k}K_{k}^{T}$
By substituting (33) in (20), we obtain
(34) $J_{k}=\text{tr}\left(P_{k-1}\right)-\text{tr}\left(K_{k}H_{k}P_{k-1}\right)-\text{tr}\left(P_{k-1}H_{k}^{T}K_{k}^{T}\right)+\text{tr}\left(K_{k}H_{k}P_{k-1}H_{k}^{T}K_{k}^{T}\right)+\text{tr}\left(K_{k}R_{k}K_{k}^{T}\right)$
We find the gain matrix $K_{k}$ by minimizing the cost function (34), that is, by solving
(35) $\frac{\partial J_{k}}{\partial K_{k}}=0$
To compute the derivatives of matrix traces, we use the formulas given in The Matrix Cookbook (Section 2.5). We use the following formulas
(36) $\frac{\partial}{\partial X}\text{tr}\left(XA\right)=A^{T}$
(37) $\frac{\partial}{\partial X}\text{tr}\left(AX^{T}\right)=A$
(38) $\frac{\partial}{\partial X}\text{tr}\left(XBX^{T}\right)=XB^{T}+XB$
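As a quick sanity check of the trace-derivative formulas (36)-(38) (this is only a numerical illustration and not part of the derivation; the matrix dimensions are arbitrary assumptions), we can compare them with finite-difference gradients in Python:

```python
import numpy as np

rng = np.random.default_rng(2)
m, p = 3, 4
X = rng.normal(size=(m, p))
A = rng.normal(size=(p, m))   # makes X @ A square so that its trace is defined
B = rng.normal(size=(p, p))
C = rng.normal(size=(m, p))   # same shape as X, used for formula (37)

def numerical_gradient(f, X, h=1e-6):
    """Central finite-difference gradient of the scalar function f with respect to the matrix X."""
    G = np.zeros_like(X)
    for i in range(X.shape[0]):
        for j in range(X.shape[1]):
            E = np.zeros_like(X)
            E[i, j] = h
            G[i, j] = (f(X + E) - f(X - E)) / (2 * h)
    return G

# (36): d tr(X A) / dX = A^T
print(np.allclose(numerical_gradient(lambda X: np.trace(X @ A), X), A.T))
# (37): d tr(C X^T) / dX = C
print(np.allclose(numerical_gradient(lambda X: np.trace(C @ X.T), X), C))
# (38): d tr(X B X^T) / dX = X B^T + X B
print(np.allclose(numerical_gradient(lambda X: np.trace(X @ B @ X.T), X), X @ B.T + X @ B))
```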
We find the derivatives of all terms in (34). Since $P_{k-1}$ does not depend on $K_{k}$, we have
(39) $\frac{\partial}{\partial K_{k}}\text{tr}\left(P_{k-1}\right)=0$
Next, by using (36), and by using the fact that the covariance matrix $P_{k-1}$ is symmetric, we have
(40) $\frac{\partial}{\partial K_{k}}\text{tr}\left(K_{k}H_{k}P_{k-1}\right)=\left(H_{k}P_{k-1}\right)^{T}=P_{k-1}H_{k}^{T}$
By using (37), we have
(41) $\frac{\partial}{\partial K_{k}}\text{tr}\left(P_{k-1}H_{k}^{T}K_{k}^{T}\right)=P_{k-1}H_{k}^{T}$
By using (38), and by using the fact that the matrices $H_{k}P_{k-1}H_{k}^{T}$ and $R_{k}$ are symmetric, we have
(42) $\frac{\partial}{\partial K_{k}}\text{tr}\left(K_{k}H_{k}P_{k-1}H_{k}^{T}K_{k}^{T}\right)=2K_{k}H_{k}P_{k-1}H_{k}^{T}$
(43) $\frac{\partial}{\partial K_{k}}\text{tr}\left(K_{k}R_{k}K_{k}^{T}\right)=2K_{k}R_{k}$
By substituting (39), (40), (41), (42), and (43) in (35), we obtain
(44) $\frac{\partial J_{k}}{\partial K_{k}}=-P_{k-1}H_{k}^{T}-P_{k-1}H_{k}^{T}+2K_{k}H_{k}P_{k-1}H_{k}^{T}+2K_{k}R_{k}=0$
From the last equation, we have
(45) $K_{k}\left(H_{k}P_{k-1}H_{k}^{T}+R_{k}\right)=P_{k-1}H_{k}^{T}$
By solving the last equation, we obtain the final expression for the gain matrix
(46) $K_{k}=P_{k-1}H_{k}^{T}\left(H_{k}P_{k-1}H_{k}^{T}+R_{k}\right)^{-1}$
Before we summarize the recursive least-squares method, we derive an alternative form of the estimation error covariance matrix propagation equation. First, starting from (45), we can write the gain matrix equation (46) as follows
(47) $K_{k}R_{k}=P_{k-1}H_{k}^{T}-K_{k}H_{k}P_{k-1}H_{k}^{T}$
Here, we should observe that the matrix $P_{k-1}$ is symmetric, and consequently
(48) $\left(K_{k}H_{k}P_{k-1}\right)^{T}=P_{k-1}H_{k}^{T}K_{k}^{T}$
We substitute (47) in (33), and as the result, we obtain
(49) $P_{k}=P_{k-1}-K_{k}H_{k}P_{k-1}-P_{k-1}H_{k}^{T}K_{k}^{T}+K_{k}H_{k}P_{k-1}H_{k}^{T}K_{k}^{T}+P_{k-1}H_{k}^{T}K_{k}^{T}-K_{k}H_{k}P_{k-1}H_{k}^{T}K_{k}^{T}$
The third and the fifth terms cancel each other, as well as the fourth and the sixth terms, and consequently, we obtain
(50) $P_{k}=P_{k-1}-K_{k}H_{k}P_{k-1}$
That is, an alternative expression for the propagation of the estimation error covariance matrix is given by
(51) $P_{k}=\left(I-K_{k}H_{k}\right)P_{k-1}$
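It should be kept in mind that the form (51) is equivalent to (32) only when the optimal gain (46) is used. The following Python sketch (the matrices are randomly generated, illustrative assumptions) verifies this equivalence numerically:

```python
import numpy as np

rng = np.random.default_rng(3)
n, m = 3, 2

# Assumed illustrative matrices: symmetric positive definite P_{k-1} and R_k, and a random H_k.
A = rng.normal(size=(n, n))
P_prev = A @ A.T + np.eye(n)
B = rng.normal(size=(m, m))
R = B @ B.T + np.eye(m)
H = rng.normal(size=(m, n))

# Optimal gain matrix, equation (46).
K = P_prev @ H.T @ np.linalg.inv(H @ P_prev @ H.T + R)

I = np.eye(n)
P_full = (I - K @ H) @ P_prev @ (I - K @ H).T + K @ R @ K.T   # equation (32)
P_short = (I - K @ H) @ P_prev                                 # equation (51)

print(np.allclose(P_full, P_short))   # True, since K is the optimal gain
```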
Consequently, the recursive least squares method consists of the following three equations
- Gain matrix update
(52) $K_{k}=P_{k-1}H_{k}^{T}\left(H_{k}P_{k-1}H_{k}^{T}+R_{k}\right)^{-1}$
- Estimate update
(53) $\hat{\mathbf{x}}_{k}=\hat{\mathbf{x}}_{k-1}+K_{k}\left(\mathbf{y}_{k}-H_{k}\hat{\mathbf{x}}_{k-1}\right)$
- Propagation of the estimation error covariance matrix by using this equation
(54) $P_{k}=\left(I-K_{k}H_{k}\right)P_{k-1}\left(I-K_{k}H_{k}\right)^{T}+K_{k}R_{k}K_{k}^{T}$
or this equation
(55) $P_{k}=\left(I-K_{k}H_{k}\right)P_{k-1}$
If no a priori knowledge of the estimate is available, then this method is initialized with the estimation error covariance matrix equal to $P_{0}=c\cdot I$, where $c$ is a large number, and with a random value of the initial estimate $\hat{\mathbf{x}}_{0}$.
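To illustrate how the three equations are used in a loop, here is a minimal Python sketch. The simulated measurement model, the noise level, and the initialization constant are illustrative assumptions and do not correspond to the linked implementation:

```python
import numpy as np

rng = np.random.default_rng(4)

# Assumed illustrative problem: estimate initial position, initial velocity, and acceleration
# of a vehicle from noisy scalar position measurements (see the motivational example).
x_true = np.array([3.0, -1.0, 0.5])       # true unknown vector
sigma = 0.1                               # measurement noise standard deviation
R = np.array([[sigma**2]])                # measurement noise covariance R_k

# Initialization with no a priori knowledge: large covariance, arbitrary initial estimate.
c = 1e6
P = c * np.eye(3)                         # P_0 = c * I
x_hat = np.zeros(3)                       # initial estimate

for t in np.arange(0.0, 10.0, 0.1):
    H = np.array([[1.0, t, 0.5 * t**2]])              # measurement matrix H_k (1 x 3)
    y = H @ x_true + sigma * rng.normal(size=1)        # noisy measurement y_k

    K = P @ H.T @ np.linalg.inv(H @ P @ H.T + R)       # (52) gain matrix update
    x_hat = x_hat + K @ (y - H @ x_hat)                # (53) estimate update
    P = (np.eye(3) - K @ H) @ P                        # (55) covariance propagation

print(x_hat)   # the estimate should be close to x_true
```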
The Python implementation of the derived least-squares method is given here.