
5 Diagonalization

5.1 Eigenvalues and Eigenvectors

Definition. A square matrix $D = (d_{ij})$ is called a diagonal matrix if

\begin{equation*} d_{ij} = 0 \quad \text{whenever } i \ne j. \end{equation*}

Equivalently, all entries off the main diagonal are zero.

Diagonalizable

A linear operator $T$ on a finite-dimensional vector space $V$ is called diagonalizable if there is an ordered basis $\beta$ for $V$ such that $[T]_\beta$ is a diagonal matrix. A square matrix $A$ is called diagonalizable if $L_A$ is diagonalizable.

Eigenvector and Eigenvalue

Let $T$ be a linear operator on a vector space $V$. A nonzero vector $v \in V$ is called an eigenvector of $T$ if there exists a scalar $\lambda$ such that $T(v) = \lambda v$. The scalar $\lambda$ is called the eigenvalue corresponding to the eigenvector $v$.

Let $A$ be in $M_{n \times n}(F)$. A nonzero vector $v \in F^n$ is called an eigenvector of $A$ if $v$ is an eigenvector of $L_A$; that is, if $Av = \lambda v$ for some scalar $\lambda$. The scalar $\lambda$ is called the eigenvalue of $A$ corresponding to the eigenvector $v$.

Theorem 5.1. Let $A \in M_{n \times n}(F)$. Then a scalar $\lambda$ is an eigenvalue of $A$ if and only if

\begin{equation*} \det(A - \lambda I_n) = 0. \end{equation*}

Characteristic Polynomial

Let $A \in M_{n \times n}(F)$. The polynomial $f(t) = \det(A - t I_n)$ is called the characteristic polynomial of $A$.

  • It is easily shown that similar matrices have the same characteristic polynomial.

  • Let $T$ be a linear operator on an $n$-dimensional vector space $V$ with ordered basis $\beta$. We define the characteristic polynomial $f(t)$ of $T$ to be the characteristic polynomial of $A = [T]_\beta$. That is,

\begin{equation*} f(t) = \det(A - t I_n). \end{equation*}

  • We often denote the characteristic polynomial of an operator $T$ by $f(t) = \det(T - tI)$.
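
For example, consider $A = \begin{pmatrix} 1 & 2 \\ 3 & 2 \end{pmatrix}$ over $\mathbb{R}$. Its characteristic polynomial is

\begin{equation*} f(t) = \det(A - t I_2) = \det\begin{pmatrix} 1-t & 2 \\ 3 & 2-t \end{pmatrix} = (1-t)(2-t) - 6 = t^2 - 3t - 4 = (t-4)(t+1), \end{equation*}

so by Theorem 5.1 the eigenvalues of $A$ are $4$ and $-1$.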

Theorem 5.2. Let $A \in M_{n \times n}(F)$.

(a) The characteristic polynomial of $A$ is a polynomial of degree $n$ with leading coefficient $(-1)^n$.

(b) $A$ has at most $n$ distinct eigenvalues.

Theorem 5.3. Let $T$ be a linear operator on a vector space $V$, and let $\lambda$ be an eigenvalue of $T$. A vector $v \in V$ is an eigenvector of $T$ corresponding to $\lambda$ if and only if $v \ne 0$ and $v \in N(T - \lambda I)$.
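
Continuing the example above with $A = \begin{pmatrix} 1 & 2 \\ 3 & 2 \end{pmatrix}$ and $\lambda = 4$: the eigenvectors corresponding to $\lambda = 4$ are the nonzero vectors of

\begin{equation*} N(A - 4I_2) = N\begin{pmatrix} -3 & 2 \\ 3 & -2 \end{pmatrix} = \operatorname{span}\left\{ \begin{pmatrix} 2 \\ 3 \end{pmatrix} \right\}, \end{equation*}

and indeed $A \begin{pmatrix} 2 \\ 3 \end{pmatrix} = \begin{pmatrix} 8 \\ 12 \end{pmatrix} = 4 \begin{pmatrix} 2 \\ 3 \end{pmatrix}$.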

5.2 Diagonalizability

Theorem 5.4. Let $T$ be a linear operator on a vector space $V$, and let $\lambda_1, \lambda_2, \ldots, \lambda_k$ be distinct eigenvalues of $T$. If $v_1, v_2, \ldots, v_k$ are eigenvectors of $T$ such that $\lambda_i$ corresponds to $v_i$ $(1 \le i \le k)$, then $\{v_1, v_2, \ldots, v_k\}$ is linearly independent.

Corollary 1. Let $T$ be a linear operator on an $n$-dimensional vector space $V$. If $T$ has $n$ distinct eigenvalues, then $T$ is diagonalizable.

Definition. A polynomial $f(t)$ in $P(F)$ splits over $F$ if there are scalars $c, a_1, \ldots, a_n$ (not necessarily distinct) in $F$ such that

\begin{equation*} f(t) = c(t - a_1)(t - a_2)\cdots(t - a_n). \end{equation*}

  • If $f(t)$ is the characteristic polynomial of a linear operator or a matrix over a field $F$, then the statement that $f(t)$ splits is understood to mean that it splits over $F$.

Theorem 5.5. The characteristic polynomial of any diagonalizable linear operator splits.

Definition. Let $\lambda$ be an eigenvalue of a linear operator or matrix with characteristic polynomial $f(t)$. The (algebraic) multiplicity of $\lambda$ is the largest positive integer $k$ for which

\begin{equation*} (t - \lambda)^k \end{equation*}

is a factor of $f(t)$.

Eigenspace

Let $T$ be a linear operator on a vector space $V$, and let $\lambda$ be an eigenvalue of $T$. Define

\begin{equation*} E_\lambda = \{ x \in V : T(x) = \lambda x \} = N(T - \lambda I_V). \end{equation*}

The set $E_\lambda$ is called the eigenspace of $T$ corresponding to the eigenvalue $\lambda$. Analogously, we define the eigenspace of a square matrix $A$ to be the eigenspace of $L_A$.

Theorem 5.6. Let $T$ be a linear operator on a finite-dimensional vector space $V$, and let $\lambda$ be an eigenvalue of $T$ having multiplicity $m$. Then

\begin{equation*} 1 \le \dim(E_\lambda) \le m. \end{equation*}
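
The upper bound need not be attained. For $A = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}$, the characteristic polynomial is $(t-1)^2$, so $\lambda = 1$ has multiplicity $m = 2$; but $A - I_2 = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}$ has rank $1$, so $\dim(E_1) = 1 < 2$. By Theorem 5.8 below, this matrix is not diagonalizable.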

Theorem 5.7. Let $T$ be a linear operator on a vector space $V$, and let $\lambda_1, \lambda_2, \ldots, \lambda_k$ be distinct eigenvalues of $T$. For each $i = 1, 2, \ldots, k$, let $S_i$ be a finite linearly independent subset of the eigenspace $E_{\lambda_i}$. Then

\begin{equation*} S = S_1 \cup S_2 \cup \cdots \cup S_k \end{equation*}

is a linearly independent subset of $V$.

Theorem 5.8. Let $T$ be a linear operator on a finite-dimensional vector space $V$ such that the characteristic polynomial of $T$ splits. Let $\lambda_1, \lambda_2, \ldots, \lambda_k$ be the distinct eigenvalues of $T$. Then:

(a) $T$ is diagonalizable if and only if the multiplicity of $\lambda_i$ is equal to $\dim(E_{\lambda_i})$ for all $i$.

(b) If $T$ is diagonalizable and $\beta_i$ is an ordered basis for $E_{\lambda_i}$ for each $i$, then

\begin{equation*} \beta = \beta_1 \cup \beta_2 \cup \cdots \cup \beta_k \end{equation*}

is an ordered basis for $V$ consisting of eigenvectors of $T$.
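
The test in part (a) can be carried out numerically. Below is a minimal sketch, assuming a matrix over $\mathbb{R}$ or $\mathbb{C}$; the function name and the tolerance used to group nearly equal eigenvalues are illustrative choices, not part of the theorem.

```python
import numpy as np

def is_diagonalizable(A, tol=1e-8):
    """Test Theorem 5.8(a) numerically: A is diagonalizable iff, for each
    distinct eigenvalue, the algebraic multiplicity equals dim(E_lambda)."""
    A = np.asarray(A, dtype=complex)
    n = A.shape[0]
    eigvals = np.linalg.eigvals(A)
    # Group numerically close eigenvalues to estimate multiplicities.
    groups = []
    for lam in eigvals:
        for g in groups:
            if abs(lam - g[0]) < tol:
                g[1] += 1
                break
        else:
            groups.append([lam, 1])
    for lam, mult in groups:
        # dim(E_lambda) = n - rank(A - lambda I), by the rank-nullity theorem.
        geo = n - np.linalg.matrix_rank(A - lam * np.eye(n), tol=tol)
        if geo != mult:
            return False
    return True

print(is_diagonalizable(np.array([[1, 2], [3, 2]])))  # True: eigenvalues 4, -1
print(is_diagonalizable(np.array([[1, 1], [0, 1]])))  # False: dim(E_1) = 1 < 2
```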

5.3 Applications

Fast Matrix Exponentiation

If a matrix $A$ is diagonalizable, there exists an invertible matrix $P$ and a diagonal matrix $D$ such that

\begin{equation*} A = P D P^{-1}. \end{equation*}

Then, for any positive integer $n$,

\begin{equation*} A^n = P D^n P^{-1}, \end{equation*}

where

\begin{equation*} D^n = \begin{pmatrix} \lambda_1^n & & 0 \\ & \ddots & \\ 0 & & \lambda_k^n \end{pmatrix} \end{equation*}

if $D = \operatorname{diag}(\lambda_1, \ldots, \lambda_k)$. Thus computing $A^n$ reduces to raising the scalars $\lambda_i$ to the $n$-th power.
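
A short numpy sketch of this idea (the helper name is illustrative; it assumes $A$ is diagonalizable, so the eigenvector matrix returned by np.linalg.eig is invertible):

```python
import numpy as np

def matrix_power_eig(A, n):
    """Compute A^n via the eigendecomposition A = P D P^{-1}."""
    eigvals, P = np.linalg.eig(A)   # columns of P are eigenvectors of A
    Dn = np.diag(eigvals ** n)      # D^n: raise each eigenvalue to the n-th power
    return P @ Dn @ np.linalg.inv(P)

A = np.array([[1.0, 2.0], [3.0, 2.0]])
print(np.allclose(matrix_power_eig(A, 5), np.linalg.matrix_power(A, 5)))  # True
```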

Solving Linear Differential Equations

A system of linear differential equations with constant coefficients can be written in matrix form as

\begin{equation*} x' = A x, \end{equation*}

where $x(t)$ is the vector of unknown functions and $A$ is the coefficient matrix.

The main idea is to diagonalize $A$. If

\begin{equation*} A = Q D Q^{-1}, \end{equation*}

then substituting into the system gives

\begin{equation*} x' = Q D Q^{-1} x. \end{equation*}

Define the new variable

\begin{equation*} y(t) = Q^{-1} x(t), \end{equation*}

which transforms the system into

\begin{equation*} y' = D y. \end{equation*}

Since $D$ is diagonal, this gives $n$ independent scalar differential equations $y_i' = \lambda_i y_i$, each with solution $y_i(t) = y_i(0)\, e^{\lambda_i t}$. The solution to the original system is obtained by transforming back:

\begin{equation*} x(t) = Q\, y(t). \end{equation*}
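
A minimal numpy sketch of this procedure, assuming $A$ is diagonalizable (the function name is illustrative):

```python
import numpy as np

def solve_linear_ode(A, x0, t):
    """Solve x' = A x with x(0) = x0 by diagonalizing A = Q D Q^{-1}."""
    eigvals, Q = np.linalg.eig(A)   # columns of Q are eigenvectors of A
    y0 = np.linalg.solve(Q, x0)     # y(0) = Q^{-1} x(0)
    y_t = np.exp(eigvals * t) * y0  # decoupled solutions y_i(t) = y_i(0) e^{lambda_i t}
    return Q @ y_t                  # transform back: x(t) = Q y(t)

A = np.array([[0.0, 1.0], [1.0, 0.0]])  # eigenvalues 1 and -1
x0 = np.array([1.0, 0.0])
print(solve_linear_ode(A, x0, 1.0))     # [cosh(1), sinh(1)] ~ [1.5431, 1.1752]
```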

5.4 Direct Sums

Sum and Direct Sum

Let $W_1, W_2, \ldots, W_k$ be subspaces of a vector space $V$. We define the sum of these subspaces to be the set

\begin{equation*} \{\, v_1 + v_2 + \cdots + v_k : v_i \in W_i \text{ for } 1 \le i \le k \,\}, \end{equation*}

which we denote by $W_1 + W_2 + \cdots + W_k$ or $\sum_{i=1}^k W_i$.

  • The sum of subspaces of a vector space is also a subspace.

Let $W_1, W_2, \ldots, W_k$ be subspaces of a vector space $V$. We call $V$ the direct sum of the subspaces $W_1, W_2, \ldots, W_k$, and write $V = W_1 \oplus W_2 \oplus \cdots \oplus W_k$, if

\begin{equation*} V = \sum_{i=1}^k W_i \end{equation*}

and

\begin{equation*} W_j \cap \sum_{i \ne j} W_i = \{0\} \quad \text{for each } j \ (1 \le j \le k). \end{equation*}
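
Note that the second condition is stronger than requiring $W_i \cap W_j = \{0\}$ for all $i \ne j$. For example, in $\mathbb{R}^2$ take $W_1 = \operatorname{span}\{(1,0)\}$, $W_2 = \operatorname{span}\{(0,1)\}$, and $W_3 = \operatorname{span}\{(1,1)\}$: the pairwise intersections are all $\{0\}$, yet $W_3 \cap (W_1 + W_2) = W_3 \ne \{0\}$, so $\mathbb{R}^2$ is not the direct sum of $W_1, W_2, W_3$.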

Theorem 5.9. Let $W_1, W_2, \ldots, W_k$ be subspaces of a finite-dimensional vector space $V$. The following conditions are equivalent.

(a) $V = W_1 \oplus W_2 \oplus \cdots \oplus W_k$.

(b) $V = \sum_{i=1}^k W_i$ and, for any vectors $v_1, v_2, \ldots, v_k$ such that $v_i \in W_i$ $(1 \le i \le k)$, if $v_1 + v_2 + \cdots + v_k = 0$ then $v_i = 0$ for all $i$.

(c) Each vector $v \in V$ can be uniquely written as

\begin{equation*} v = v_1 + v_2 + \cdots + v_k, \end{equation*}

where $v_i \in W_i$.

(d) If $\gamma_i$ is an ordered basis for $W_i$ $(1 \le i \le k)$, then $\gamma_1 \cup \gamma_2 \cup \cdots \cup \gamma_k$ is an ordered basis for $V$.

(e) For each $i = 1, 2, \ldots, k$, there exists an ordered basis $\gamma_i$ for $W_i$ such that

\begin{equation*} \gamma_1 \cup \gamma_2 \cup \cdots \cup \gamma_k \end{equation*}

is an ordered basis for $V$.

Theorem 5.10. A linear operator $T$ on a finite-dimensional vector space $V$ is diagonalizable if and only if $V$ is the direct sum of the eigenspaces of $T$.

5.5 Invariant Subspaces and the Cayley–Hamilton Theorem

Invariant Subspaces

Let $T$ be a linear operator on a vector space $V$. A subspace $W$ of $V$ is called a $T$-invariant subspace of $V$ if $T(W) \subseteq W$, that is, if $T(v) \in W$ for all $v \in W$.

Definition. Let $T$ be a linear operator on a vector space $V$, and let $x$ be a nonzero vector in $V$. The subspace

\begin{equation*} W = \operatorname{span}(\{x,\, T(x),\, T^2(x),\, \ldots\}) \end{equation*}

is called the $T$-cyclic subspace of $V$ generated by $x$. It is a simple matter to show that $W$ is $T$-invariant.

  • In fact, $W$ is the "smallest" $T$-invariant subspace of $V$ containing $x$. That is, any $T$-invariant subspace of $V$ containing $x$ must also contain $W$.
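
For instance, let $T$ be the linear operator on $\mathbb{R}^3$ defined by $T(a, b, c) = (-b + c,\, a + c,\, 3c)$, and let $W$ be the $T$-cyclic subspace generated by $e_1 = (1, 0, 0)$. Since

\begin{equation*} T(e_1) = (0, 1, 0) = e_2 \quad \text{and} \quad T^2(e_1) = T(e_2) = (-1, 0, 0) = -e_1, \end{equation*}

it follows that $W = \operatorname{span}(\{e_1,\, T(e_1),\, T^2(e_1),\, \ldots\}) = \operatorname{span}(\{e_1, e_2\})$.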

Theorem 5.11. Let $T$ be a linear operator on a finite-dimensional vector space $V$, and let $W$ be a $T$-invariant subspace of $V$. Then the characteristic polynomial of $T_W$ divides the characteristic polynomial of $T$.

Theorem 5.12. Let $T$ be a linear operator on a finite-dimensional vector space $V$, and let $W$ denote the $T$-cyclic subspace of $V$ generated by a nonzero vector $v \in V$. Let $k = \dim(W)$. Then:

(a) $\{v,\, T(v),\, T^2(v),\, \ldots,\, T^{k-1}(v)\}$ is a basis for $W$.

(b) If

\begin{equation*} a_0 v + a_1 T(v) + \cdots + a_{k-1} T^{\,k-1}(v) + T^{\,k}(v) = 0, \end{equation*}

then the characteristic polynomial of $T_W$ is

\begin{equation*} f(t) = (-1)^k \bigl(a_0 + a_1 t + \cdots + a_{k-1} t^{k-1} + t^k \bigr). \end{equation*}
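
In the example above, $k = \dim(W) = 2$ and $1 \cdot e_1 + 0 \cdot T(e_1) + T^2(e_1) = 0$, so $a_0 = 1$ and $a_1 = 0$; by part (b), the characteristic polynomial of $T_W$ is $f(t) = (-1)^2 (1 + 0 \cdot t + t^2) = t^2 + 1$.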

The Cayley–Hamilton Theorem

Theorem 5.13 (Cayley–Hamilton). Let $T$ be a linear operator on a finite-dimensional vector space $V$, and let $f(t)$ be the characteristic polynomial of $T$. Then $f(T) = T_0$, the zero transformation. That is, $T$ "satisfies" its characteristic equation.

Corollary 1. Let $A$ be an $n \times n$ matrix, and let $f(t)$ be the characteristic polynomial of $A$. Then $f(A) = O$, the $n \times n$ zero matrix.
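
A quick numerical check of the corollary for the earlier $2 \times 2$ example, using the fact that for a $2 \times 2$ matrix $f(t) = \det(A - tI_2) = t^2 - \operatorname{tr}(A)\, t + \det(A)$:

```python
import numpy as np

A = np.array([[1.0, 2.0], [3.0, 2.0]])
# For a 2x2 matrix, f(t) = det(A - t I) = t^2 - tr(A) t + det(A).
f_A = A @ A - np.trace(A) * A + np.linalg.det(A) * np.eye(2)
print(np.allclose(f_A, np.zeros((2, 2))))  # True: A satisfies f(A) = O
```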

Theorem 5.14. Let $A$ be an $n \times n$ matrix and $f(x)$ a polynomial such that $f(A) = O$ (the zero matrix). Then every eigenvalue $\lambda$ of $A$ must be a root of the scalar equation $f(x) = 0$.

Definition. Let $B_1 \in M_{m \times m}(F)$, and let $B_2 \in M_{n \times n}(F)$. We define the direct sum of $B_1$ and $B_2$, denoted $B_1 \oplus B_2$, as the $(m+n) \times (m+n)$ matrix $A$ such that

\begin{equation*} A_{ij} = \begin{cases} (B_1)_{ij} & \text{for } 1 \le i, j \le m, \\[6pt] (B_2)_{(i-m),(j-m)} & \text{for } m+1 \le i, j \le n+m, \\[6pt] 0 & \text{otherwise}. \end{cases} \end{equation*}

If $B_1, B_2, \ldots, B_k$ are square matrices with entries from $F$, then we define the direct sum of $B_1, B_2, \ldots, B_k$ recursively by

\begin{equation*} B_1 \oplus B_2 \oplus \cdots \oplus B_k = (B_1 \oplus B_2 \oplus \cdots \oplus B_{k-1}) \oplus B_k. \end{equation*}

If $A = B_1 \oplus B_2 \oplus \cdots \oplus B_k$, then we often write

\begin{equation*} A = \begin{pmatrix} B_1 & O & \cdots & O \\ O & B_2 & \cdots & O \\ \vdots & \vdots & \ddots & \vdots \\ O & O & \cdots & B_k \end{pmatrix}. \end{equation*}
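
In code, this construction is exactly a block-diagonal matrix; for example, with scipy:

```python
import numpy as np
from scipy.linalg import block_diag

B1 = np.array([[1, 2], [3, 4]])
B2 = np.array([[5]])
A = block_diag(B1, B2)  # the direct sum B1 (+) B2
print(A)
# [[1 2 0]
#  [3 4 0]
#  [0 0 5]]
```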

Theorem 5.15. Let $T$ be a linear operator on a finite-dimensional vector space $V$, and suppose that

\begin{equation*} V = W_1 \oplus W_2 \oplus \cdots \oplus W_k, \end{equation*}

where $W_i$ is a $T$-invariant subspace of $V$ for each $i$ $(1 \le i \le k)$. Suppose that $f_i(t)$ is the characteristic polynomial of $T_{W_i}$ $(1 \le i \le k)$. Then

\begin{equation*} f_1(t) \cdot f_2(t) \cdots f_k(t) \end{equation*}

is the characteristic polynomial of $T$.
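
For a block-diagonal matrix, each block acts on a $T$-invariant subspace, so the theorem can be verified numerically. The sketch below uses np.poly, which returns the coefficients of the monic polynomial $\det(tI - M)$; since the monic convention is used on both sides, the product of the blocks' characteristic polynomials matches that of their direct sum.

```python
import numpy as np
from scipy.linalg import block_diag

B1 = np.array([[1.0, 2.0], [3.0, 4.0]])
B2 = np.array([[5.0]])
A = block_diag(B1, B2)

# Characteristic polynomial of the direct sum vs. product of the blocks'.
lhs = np.poly(A)
rhs = np.polymul(np.poly(B1), np.poly(B2))
print(np.allclose(lhs, rhs))  # True
```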