January 1, 2025

Correct and Clear Explanation of Linearization of Dynamical Systems


In this tutorial, we provide a clear and correct explanation of the linearization of dynamical systems. Many online tutorials explain the linearization process incorrectly or unclearly, and that is the motivation for this tutorial. The YouTube tutorial accompanying this post is given below.

Motivational example

We consider a simple gravity pendulum shown in the figure below.

A ball (shown in red in the figure) of mass m is attached to the pivot point Q by a rod of length l. We assume that the mass of the rod is much smaller than the mass of the ball, so we can neglect it. A force F acts on the ball and is always perpendicular to the rod. In the figure above, g is the gravitational acceleration. The free-body diagram is shown in the figure below.

In the figure above, N is the normal reaction force exerted by the rod on the ball, mg is the gravitational force, F is the control force, n is the normal unit vector in the direction of the rod, and \tau is the tangent unit vector perpendicular to the rod (tangent to the circle describing the trajectory of the ball). Note that n and \tau define two perpendicular axes. From Newton’s law, we obtain

(1)   \begin{align*}m\vec{a}=\vec{N}+\vec{F}+m\vec{g}\end{align*}

where \vec{a} is the acceleration vector. By taking the dot product of this equation with the unit vector \vec{\tau}, we obtain the projection of this equation onto the \tau axis (tangent axis):

(2)   \begin{align*}ma_{\tau} = F- mg\sin(\theta)\end{align*}

On the other hand, the tangential acceleration is given by

(3)   \begin{align*}a_{\tau}=l\ddot{\theta} \end{align*}

where \ddot{\theta} is the second time derivative of \theta. The variable \ddot{\theta} is called the angular acceleration. By substituting this expression in the previous equation, we obtain

(4)   \begin{align*}ml\ddot{\theta} =F- mg\sin(\theta)\end{align*}

From this equation, we obtain

(5)   \begin{align*}\ddot{\theta}+\frac{g}{l}\sin(\theta) =\frac{1}{ml}F\end{align*}

For simplicity, we assume that the control force F is equal to

(6)   \begin{align*}F=u^{2}\end{align*}

where u is the control input. Consequently, the final model of the system has the following form

(7)   \begin{align*}\ddot{\theta}+\frac{g}{l}\sin(\theta) =\frac{1}{ml}u^2\end{align*}

Obviously, this system is nonlinear since

  1. It nonlinearly depends on the dependent variable \theta.
  2. It nonlinearly depends on the input u.

Let us write the ordinary differential equation (7) in the state-space form. First, we introduce the state-space variables

(8)   \begin{align*}x_{1}=\theta \\x_{2}=\dot{\theta}\end{align*}

By differentiating the last two equations, we obtain

(9)   \begin{align*}\dot{x}_{1}& =\dot{\theta}=x_{2} \\\dot{x}_{2}& =\ddot{\theta}=-\frac{g}{l}\sin(\theta)+\frac{1}{ml}u^2=-\frac{g}{l}\sin(x_{1})+\frac{1}{ml}u^2\end{align*}

Consequently, the state-space model has the following form

(10)   \begin{align*}\begin{bmatrix} \dot{x}_{1} \\ \dot{x}_{2}  \end{bmatrix} =\begin{bmatrix} x_{2} \\ -\frac{g}{l}\sin(x_{1})+\frac{1}{ml}u^2  \end{bmatrix}\end{align*}

Usually, we compactly write this state-space model as follows

(11)   \begin{align*}\dot{\mathbf{x}}=\mathbf{f}(\mathbf{x},\mathbf{u})\end{align*}

where \mathbf{x} is the state vector, \mathbf{u} is the control input vector, and \mathbf{f}(\cdot) is a nonlinear vector function of the state vector and input vector. In the general case, these quantities are defined as follows

(12)   \begin{align*}\mathbf{x}=\begin{bmatrix}x_{1}\\x_{2}\\ \vdots \\ x_{n}  \end{bmatrix},\;\; \mathbf{u}=\begin{bmatrix}u_{1}\\u_{2}\\ \vdots \\ u_{m}  \end{bmatrix} ,\;\; \mathbf{f}=\begin{bmatrix} f_{1}(\mathbf{x},\mathbf{u}) \\ f_{2}(\mathbf{x},\mathbf{u}) \\ \vdots \\ f_{n}(\mathbf{x},\mathbf{u}) \end{bmatrix},\;\; \end{align*}

In our case, we have

(13)   \begin{align*}& \mathbf{x}=\begin{bmatrix}x_{1}\\x_{2}  \end{bmatrix}, \;\; \mathbf{u}=u, \\& f_{1}(\mathbf{x},u) = x_{2},\;\; f_{2}(\mathbf{x},u)=-\frac{g}{l}\sin(x_{1})+\frac{1}{ml}u^2.  \end{align*}
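As a minimal sketch, the right-hand side of the state-space model (10) can be written as a small Python function. The function name `pendulum_f` and the default values of m, l, and g are illustrative assumptions and are not part of the derivation above.

```python
import math

def pendulum_f(x, u, m=1.0, l=1.0, g=9.81):
    """Right-hand side f(x, u) of the pendulum state-space model.

    x = [x1, x2] = [theta, theta_dot] and u is the scalar control input.
    The default values of m, l, and g are illustrative assumptions.
    """
    x1, x2 = x
    dx1 = x2
    dx2 = -(g / l) * math.sin(x1) + u**2 / (m * l)
    return [dx1, dx2]

# State derivative for theta = 30 degrees, zero angular velocity, and u = 0.5
dx = pendulum_f([math.pi / 6, 0.0], 0.5)
print(dx)
```

The function returns the vector of state derivatives, so it can be passed directly to a numerical ODE integrator.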

Later in this tutorial, we will get back to our nonlinear model. Next, we explain the linearization process.

Linearization Procedure

Consider the figure shown below.

The quantities in this figure are

  • \mathbf{x} is the state vector of the nonlinear system
  • \mathbf{x}^{*} is the state around which we linearize the system
  • \Delta \mathbf{x} is defined by

    (14)   \begin{align*}\Delta \mathbf{x} = \mathbf{x} - \mathbf{x}^{*}\end{align*}

The vector \Delta \mathbf{x} collects the new variables and is the state vector of the linearized system. However, since the dynamics can also depend nonlinearly on the input, we need to linearize the system with respect to the input as well. Consequently, we introduce

(15)   \begin{align*}\Delta \mathbf{u} = \mathbf{u} - \mathbf{u}^{*}\end{align*}

where \mathbf{u}^{*} is the vector of control inputs around which we linearize the dynamics, and \Delta \mathbf{u} is the control vector in new variables. The vector \Delta \mathbf{u} is the control input vector of the linearized system.

When linearizing the dynamics, we are free to choose the vector \mathbf{x}^{*}. Typical choices are:

  1. The equilibrium point of the system. That is, the equilibrium point \mathbf{x}^{*} is defined as follows

    (16)   \begin{align*}& \dot{\mathbf{x}}^{*}=\mathbf{f}(\mathbf{x}^{*},\mathbf{u}=0)=0\\& \dot{\mathbf{x}}^{*}=\mathbf{f}(\mathbf{x}^{*})=0 \\& \mathbf{f}(\mathbf{x}^{*})=0\end{align*}


    Note that the equilibrium points are computed for \mathbf{u}=0, that is, by assuming that the control input does not affect the system dynamics.
  2. The steady state of the system. Let us assume that there is a constant input vector \mathbf{u}^{*} that produces the steady-state \mathbf{x}^{*}. The vectors \mathbf{x}^{*} and \mathbf{u}^{*} satisfy the following equation

    (17)   \begin{align*}\mathbf{f}(\mathbf{x}^{*},\mathbf{u}^{*})=0\end{align*}


    since both \mathbf{x}^{*} and \mathbf{u}^{*} are constants, and therefore \dot{\mathbf{x}}^{*}=0.
  3. The nominal trajectory. Instead of selecting the linearization state vector as a steady-state vector or an equilibrium point, the state vector can be selected as a point on a state trajectory. In this case, we have

    (18)   \begin{align*}\mathbf{x}^{*}=\mathbf{x}^{*}(t),\;\; \mathbf{u}^{*}=\mathbf{u}^{*}(t)\end{align*}



    For a known \mathbf{u}^{*}(t), the state vector \mathbf{x}^{*} satisfies the following equation

    (19)   \begin{align*}\dot{\mathbf{x}}^{*}=\mathbf{f}(\mathbf{x}^{*},\mathbf{u}^{*})\end{align*}



    The solution \mathbf{x}^{*}=\mathbf{x}^{*}(t) is the nominal state trajectory around which the dynamics is linearized. This type of linearization is shown below.

Besides these selections, we can also approximate the dynamics around other states and inputs.
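As an illustration of the steady-state choice, consider the pendulum model (10). Setting x_{2}^{*}=0 in the steady-state condition gives \frac{g}{l}\sin(\theta^{*})=\frac{1}{ml}(u^{*})^{2}, that is, u^{*}=\sqrt{mg\sin(\theta^{*})}, which requires \sin(\theta^{*})\ge 0 because F=u^{2} cannot be negative. The sketch below computes u^{*} for a chosen angle and checks that the state derivative vanishes. The function names and the parameter values are illustrative assumptions.

```python
import math

def pendulum_f(x, u, m=1.0, l=1.0, g=9.81):
    """Right-hand side of the pendulum model; m, l, g are assumed values."""
    return [x[1], -(g / l) * math.sin(x[0]) + u**2 / (m * l)]

def steady_state_input(theta_star, m=1.0, g=9.81):
    """Constant input u* that holds the pendulum at the angle theta*."""
    s = m * g * math.sin(theta_star)
    if s < 0:
        # F = u^2 >= 0 can only push in one direction
        raise ValueError("no real u* exists for sin(theta*) < 0")
    return math.sqrt(s)

theta_star = math.pi / 6                     # hold the pendulum at 30 degrees
u_star = steady_state_input(theta_star)
residual = pendulum_f([theta_star, 0.0], u_star)
print(u_star, residual)                      # residual should be close to [0, 0]
```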

The general idea of Linearization

First, let us recall the linearization procedure of nonlinear algebraic functions. Consider the following scalar function f(x) of a scalar argument. This function is illustrated in the figure below.

Let us assume that we want to approximate the function f(x) around the point x^{*}. We use the first-order Taylor expansion to approximate the function:

(20)   \begin{align*}f(x)\approx f(x^{*})+\frac{df}{dx}\Bigg|_{x^{*}}\underbrace{(x-x^{*})}_{\Delta x}\end{align*}

The right-hand side of the last equation defines the tangent line to f at the point (x^{*},f(x^{*})). This line has the following mathematical form

(21)   \begin{align*}y=f(x^{*})+\frac{df}{dx}\Bigg|_{x^{*}}\underbrace{(x-x^{*})}_{\Delta x}\end{align*}

Let us consider the following example

(22)   \begin{align*}f(x)=x^2\end{align*}

Let us approximate this function at the point (1,1). We have

(23)   \begin{align*}f(x)\approx f(1)+\frac{df}{dx}\Bigg|_{x^{*}=1}(x-1)=1+2(x-1)=2x-1\end{align*}
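The quality of this approximation can be checked numerically. The short sketch below compares f(x)=x^2 with its tangent line 2x-1 at a few points; the approximation error equals (x-1)^2, so it grows as x moves away from x^{*}=1. The function names and sample points are chosen only for illustration.

```python
def f(x):
    return x**2

def tangent_line(x, x_star=1.0):
    """First-order Taylor approximation of f around x_star."""
    slope = 2.0 * x_star                  # df/dx evaluated at x_star
    return f(x_star) + slope * (x - x_star)

for x in (1.0, 1.1, 1.5, 2.0):
    error = f(x) - tangent_line(x)        # equals (x - 1)^2 for x_star = 1
    print(f"x = {x:.1f}  f = {f(x):.3f}  tangent = {tangent_line(x):.3f}  error = {error:.3f}")
```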

The linearization of nonlinear state-space models is similar in spirit to the linearization of scalar nonlinear functions. In the sequel, we explain the linearization procedure of state-space models.

We approximate the nonlinear function \mathbf{f}(\mathbf{x},\mathbf{u}) around the point (\mathbf{x}^{*},\mathbf{u}^{*}) by using the Taylor expansion

(24)   \begin{align*}\dot{\mathbf{x}}& =\mathbf{f}(\mathbf{x},\mathbf{u}) \approx \mathbf{f}(\mathbf{x}^{*},\mathbf{u}^{*})+\frac{\partial \mathbf{f} }{\partial \mathbf{x}}\Bigg|_{\mathbf{x}^{*},\mathbf{u}^{*}}\cdot \Delta \mathbf{x}+\frac{\partial \mathbf{f} }{\partial \mathbf{u}}\Bigg|_{\mathbf{x}^{*},\mathbf{u}^{*}}\cdot \Delta \mathbf{u}\end{align*}

where

(25)   \begin{align*}\Delta \mathbf{x}= \mathbf{x} - \mathbf{x}^{*} ,  \;\;\; \Delta \mathbf{u}= \mathbf{u} - \mathbf{u}^{*},\end{align*}

and where

(26)   \begin{align*}\frac{\partial \mathbf{f} }{\partial \mathbf{x}}=\begin{bmatrix} \frac{\partial f_{1}}{\partial x_{1}}  & \frac{\partial f_{1}}{\partial x_{2}} & \ldots & \frac{\partial f_{1}}{\partial x_{n}} \\  \frac{\partial f_{2}}{\partial x_{1}}  & \frac{\partial f_{2}}{\partial x_{2}} & \ldots & \frac{\partial f_{2}}{\partial x_{n}} \\ \vdots & \vdots  &  & \vdots  \\ \frac{\partial f_{n}}{\partial x_{1}}  & \frac{\partial f_{n}}{\partial x_{2}} & \ldots & \frac{\partial f_{n}}{\partial x_{n}} \end{bmatrix},\;\; \frac{\partial \mathbf{f} }{\partial \mathbf{u}}=\begin{bmatrix}  \frac{\partial f_{1}}{\partial u_{1}}  & \frac{\partial f_{1}}{\partial u_{2}} & \ldots & \frac{\partial f_{1}}{\partial u_{m}} \\  \frac{\partial f_{2}}{\partial u_{1}}  & \frac{\partial f_{2}}{\partial u_{2}} & \ldots & \frac{\partial f_{2}}{\partial u_{m}} \\ \vdots & \vdots  &  & \vdots  \\ \frac{\partial f_{n}}{\partial u_{1}}  & \frac{\partial f_{n}}{\partial u_{2}} & \ldots & \frac{\partial f_{n}}{\partial u_{m}} \end{bmatrix},\end{align*}

The vertical lines in (24) mean that the matrices are evaluated at the point (\mathbf{x}^{*},\mathbf{u}^{*}). These matrices of partial derivatives with respect to the state and input are called the Jacobian matrices. From (24), we obtain

(27)   \begin{align*}\dot{\mathbf{x}} - \mathbf{f}(\mathbf{x}^{*},\mathbf{u}^{*}) \approx \frac{\partial \mathbf{f} }{\partial \mathbf{x}}\Bigg|_{\mathbf{x}^{*},\mathbf{u}^{*}}\cdot \Delta \mathbf{x}+\frac{\partial \mathbf{f} }{\partial \mathbf{u}}\Bigg|_{\mathbf{x}^{*},\mathbf{u}^{*}}\cdot \Delta \mathbf{u}\end{align*}

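On the other hand, from (25) and the fact that \dot{\mathbf{x}}^{*}=\mathbf{f}(\mathbf{x}^{*},\mathbf{u}^{*}) (see (19); for an equilibrium point or a steady state, both sides are zero), we obtain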

(28)   \begin{align*}\frac{d}{dt}\Delta \mathbf{x}=\Delta \dot{\mathbf{x}} = \dot{\mathbf{x}}- \dot{\mathbf{x}}^{*} =\dot{\mathbf{x}} - \mathbf{f}(\mathbf{x}^{*},\mathbf{u}^{*}) \end{align*}

Consequently, from (27) and (28), we obtain

(29)   \begin{align*}\Delta \dot{\mathbf{x}}  \approx \frac{\partial \mathbf{f} }{\partial \mathbf{x}}\Bigg|_{\mathbf{x}^{*},\mathbf{u}^{*}}\cdot \Delta \mathbf{x}+\frac{\partial \mathbf{f} }{\partial \mathbf{u}}\Bigg|_{\mathbf{x}^{*},\mathbf{u}^{*}}\cdot \Delta \mathbf{u}\end{align*}

By replacing the approximation with equality, we obtain

(30)   \begin{align*}\Delta \dot{\mathbf{x}} = \frac{\partial \mathbf{f} }{\partial \mathbf{x}}\Bigg|_{\mathbf{x}^{*},\mathbf{u}^{*}}\cdot \Delta \mathbf{x}+\frac{\partial \mathbf{f} }{\partial \mathbf{u}}\Bigg|_{\mathbf{x}^{*},\mathbf{u}^{*}}\cdot \Delta \mathbf{u}\end{align*}

Let us introduce a new notation

(31)   \begin{align*}A=\frac{\partial \mathbf{f} }{\partial \mathbf{x}}\Bigg|_{\mathbf{x}^{*},\mathbf{u}^{*}},\;\; B=\frac{\partial \mathbf{f} }{\partial \mathbf{u}}\Bigg|_{\mathbf{x}^{*},\mathbf{u}^{*}},\;\;   \mathbf{z}=\Delta \mathbf{x},\;\;   \mathbf{w}=\Delta \mathbf{u}\end{align*}

From (30) and (31), we obtain the linearized model

(32)   \begin{align*}\dot{\mathbf{z}}=A\mathbf{z}+B\mathbf{w}\end{align*}

where

  • The system matrices A and B are defined as follows

(33)   \begin{align*}A=\frac{\partial \mathbf{f} }{\partial \mathbf{x}}\Bigg|_{\mathbf{x}^{*},\mathbf{u}^{*}}, \;\; B=\frac{\partial \mathbf{f} }{\partial \mathbf{u}}\Bigg|_{\mathbf{x}^{*},\mathbf{u}^{*}}\end{align*}

  • The linearized state vector and linearized input vector are defined by

(34)   \begin{align*}\mathbf{z}=\Delta \mathbf{x}=\mathbf{x} - \mathbf{x}^{*} ,  \;\;\; \mathbf{w}=\Delta \mathbf{u}= \mathbf{u} - \mathbf{u}^{*}\end{align*}

Keep in mind that the linearization produces a reliable approximation of the nonlinear system only for relatively small values of \mathbf{z} and \mathbf{w}, that is, close to the point (\mathbf{x}^{*},\mathbf{u}^{*}).
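When the partial derivatives are tedious to compute by hand, the matrices A and B can be approximated numerically. The following is a minimal sketch that uses central finite differences, which is a swapped-in numerical technique and not the analytic procedure derived above. The function `jacobians`, the step size `eps`, and the test system `example_f` are hypothetical and serve only as an illustration.

```python
def jacobians(f, x_star, u_star, eps=1e-6):
    """Central-difference approximations of A = df/dx and B = df/du at (x*, u*).

    f takes two lists (state, input) and returns a list of state derivatives.
    """
    n, m = len(x_star), len(u_star)
    A = [[0.0] * n for _ in range(n)]
    B = [[0.0] * m for _ in range(n)]
    for j in range(n):                      # perturb one state at a time
        xp, xm = list(x_star), list(x_star)
        xp[j] += eps
        xm[j] -= eps
        fp, fm = f(xp, u_star), f(xm, u_star)
        for i in range(n):
            A[i][j] = (fp[i] - fm[i]) / (2 * eps)
    for j in range(m):                      # perturb one input at a time
        up, um = list(u_star), list(u_star)
        up[j] += eps
        um[j] -= eps
        fp, fm = f(x_star, up), f(x_star, um)
        for i in range(n):
            B[i][j] = (fp[i] - fm[i]) / (2 * eps)
    return A, B

# Hypothetical test system (not from the tutorial):
# f1 = x1*x2 + u,  f2 = x1^2 + 3*u^2
def example_f(x, u):
    return [x[0] * x[1] + u[0], x[0] ** 2 + 3.0 * u[0] ** 2]

A, B = jacobians(example_f, [1.0, 2.0], [0.5])
print(A)   # analytic value: [[x2, x1], [2*x1, 0]] = [[2, 1], [2, 0]]
print(B)   # analytic value: [[1], [6*u]] = [[1], [3]]
```

The numerical result is only an approximation, so it is best used to cross-check Jacobians that were derived analytically.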

Linearization of Nonlinear Pendulum Equations

The nonlinear state-space model is given by the following equation

(35)   \begin{align*}\begin{bmatrix} \dot{x}_{1} \\ \dot{x}_{2}  \end{bmatrix} =\begin{bmatrix} x_{2} \\ -\frac{g}{l}\sin(x_{1})+\frac{1}{ml}u^2  \end{bmatrix}\end{align*}

From this equation, we obtain

(36)   \begin{align*}f_{1}&=x_{2} \\ f_{2}&=-\frac{g}{l}\sin(x_{1})+\frac{1}{ml}u^2  \end{align*}

The Jacobian matrix with respect to the state is defined by

(37)   \begin{align*}\frac{\partial \mathbf{f}}{\partial \mathbf{x} } =\begin{bmatrix} \frac{\partial f_{1}}{\partial x_{1}} & \frac{\partial f_{1}}{\partial x_{2}} \\ \frac{\partial f_{2}}{\partial x_{1}} & \frac{\partial f_{2}}{\partial x_{2}} \end{bmatrix}=\begin{bmatrix} 0 & 1\\ -\frac{g}{l}\cos(x_{1}) & 0 \end{bmatrix}\end{align*}

The Jacobian matrix with respect to the control input is defined by

(38)   \begin{align*}\frac{\partial \mathbf{f}}{\partial \mathbf{u} } =\begin{bmatrix} \frac{\partial f_{1}}{\partial u} \\  \frac{\partial f_{2}}{\partial u}\end{bmatrix} =\begin{bmatrix} 0 \\ \frac{2}{ml} u\end{bmatrix}\end{align*}
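The Jacobian matrices (37) and (38) can be sanity-checked by comparing them with central finite differences of the model at an arbitrary point. This is only a sketch: the parameter values, the test point, and the helper names are assumptions made for illustration.

```python
import math

M, L, G = 1.0, 1.0, 9.81      # illustrative parameter values (assumptions)

def f(x, u):
    """Pendulum state-space model with scalar input u."""
    return [x[1], -(G / L) * math.sin(x[0]) + u**2 / (M * L)]

def jac_x(x, u):
    """Analytic Jacobian with respect to the state, equation (37)."""
    return [[0.0, 1.0], [-(G / L) * math.cos(x[0]), 0.0]]

def jac_u(x, u):
    """Analytic Jacobian with respect to the input, equation (38)."""
    return [0.0, 2.0 * u / (M * L)]

def central_diff(g, p, eps=1e-6):
    """Central-difference derivative of a list-valued function g at the scalar p."""
    gp, gm = g(p + eps), g(p - eps)
    return [(a - b) / (2 * eps) for a, b in zip(gp, gm)]

x, u = [0.7, -0.3], 1.2       # arbitrary test point
col1 = central_diff(lambda s: f([s, x[1]], u), x[0])   # derivatives w.r.t. x1
col2 = central_diff(lambda s: f([x[0], s], u), x[1])   # derivatives w.r.t. x2
colu = central_diff(lambda s: f(x, s), u)              # derivatives w.r.t. u

A_num = [[col1[0], col2[0]], [col1[1], col2[1]]]
A_ana, B_ana = jac_x(x, u), jac_u(x, u)
max_err_A = max(abs(A_num[i][j] - A_ana[i][j]) for i in range(2) for j in range(2))
max_err_B = max(abs(colu[i] - B_ana[i]) for i in range(2))
print(max_err_A, max_err_B)   # both should be close to zero
```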

We approximate the nonlinear system at the state and input

(39)   \begin{align*}\mathbf{x}^{*}=\begin{bmatrix} 0 \\ 0 \end{bmatrix},\;\; u^{*}=1,\end{align*}

For this selection of the state and input, we obtain

(40)   \begin{align*}A=\frac{\partial \mathbf{f}}{\partial \mathbf{x} }\Bigg|_{\mathbf{x}^{*}}=\begin{bmatrix} 0 & 1\\ -\frac{g}{l} & 0 \end{bmatrix}\\B=\frac{\partial \mathbf{f}}{\partial \mathbf{u} }\Bigg|_{u^{*}}=\begin{bmatrix} 0 \\ \frac{2}{ml} \end{bmatrix}\end{align*}

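Note that at this point \mathbf{f}(\mathbf{x}^{*},u^{*})=\begin{bmatrix} 0 & \frac{1}{ml} \end{bmatrix}^{T}\neq 0, so (\mathbf{x}^{*},u^{*}) is not a steady state, and step (28) does not hold exactly for a constant \mathbf{x}^{*}; in that case the constant term \mathbf{f}(\mathbf{x}^{*},u^{*}) would also appear on the right-hand side. Keeping only the Jacobian terms, as in (32), the final linearized model is given by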

(41)   \begin{align*}\dot{\mathbf{z}}=\begin{bmatrix} 0 & 1\\ -\frac{g}{l} & 0 \end{bmatrix}\mathbf{z}+\begin{bmatrix} 0 \\ \frac{2}{ml} \end{bmatrix}w\end{align*}
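To illustrate how the accuracy of a linearized pendulum model depends on the size of \mathbf{z}, the sketch below simulates the nonlinear model and a linearized model side by side and reports the largest angle mismatch. To keep the comparison simple, it assumes a steady-state operating point (\theta^{*}=\pi/6, with u^{*} computed from \mathbf{f}(\mathbf{x}^{*},u^{*})=0) instead of the point (39). The operating point, the parameter values, the fixed-step Runge-Kutta integrator, and all function names are assumptions made for illustration.

```python
import math

M, L, G = 1.0, 1.0, 9.81      # illustrative parameter values (assumptions)

def f_nonlinear(x, u):
    return [x[1], -(G / L) * math.sin(x[0]) + u**2 / (M * L)]

def rk4_step(f, x, u, h):
    """One classical fourth-order Runge-Kutta step with a constant input u."""
    k1 = f(x, u)
    k2 = f([xi + 0.5 * h * ki for xi, ki in zip(x, k1)], u)
    k3 = f([xi + 0.5 * h * ki for xi, ki in zip(x, k2)], u)
    k4 = f([xi + h * ki for xi, ki in zip(x, k3)], u)
    return [xi + h / 6.0 * (a + 2 * b + 2 * c + d)
            for xi, a, b, c, d in zip(x, k1, k2, k3, k4)]

def max_angle_error(z0, w=0.0, theta_star=math.pi / 6, t_end=2.0, h=1e-3):
    """Largest |theta_nonlinear - (theta* + z1)| over the interval [0, t_end]."""
    u_star = math.sqrt(M * G * math.sin(theta_star))  # from f(x*, u*) = 0
    a21 = -(G / L) * math.cos(theta_star)             # entry of A, from (37)
    b2 = 2.0 * u_star / (M * L)                       # entry of B, from (38)

    def f_linear(z, w_in):
        return [z[1], a21 * z[0] + b2 * w_in]

    x = [theta_star + z0, 0.0]    # nonlinear state
    z = [z0, 0.0]                 # linearized state (deviation from x*)
    worst = 0.0
    for _ in range(int(round(t_end / h))):
        x = rk4_step(f_nonlinear, x, u_star + w, h)
        z = rk4_step(f_linear, z, w, h)
        worst = max(worst, abs(x[0] - (theta_star + z[0])))
    return worst

errors = {z0: max_angle_error(z0) for z0 in (0.01, 0.1, 0.5)}
for z0, err in errors.items():
    print(f"initial deviation {z0:4.2f} rad -> max angle error {err:.2e} rad")
```

The mismatch grows with the initial deviation, which is consistent with the linearized model being a local approximation.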