
5 Diagonalization

5.1 Eigenvalues and Eigenvectors

Definition. A square matrix $D = (d_{ij})$ is called a diagonal matrix if

\begin{equation*} d_{ij} = 0 \quad \text{whenever } i \ne j. \end{equation*}

Equivalently, all entries off the main diagonal are zero.

Diagonalizable

A linear operator $T$ on a finite-dimensional vector space $V$ is called diagonalizable if there is an ordered basis $\beta$ for $V$ such that $[T]_\beta$ is a diagonal matrix. A square matrix $A$ is called diagonalizable if $L_A$ is diagonalizable.

Eigenvector and Eigenvalue

Let $T$ be a linear operator on a vector space $V$. A nonzero vector $v \in V$ is called an eigenvector of $T$ if there exists a scalar $\lambda$ such that $T(v) = \lambda v$. The scalar $\lambda$ is called the eigenvalue corresponding to the eigenvector $v$.

Let $A$ be in $M_{n \times n}(F)$. A nonzero vector $v \in F^n$ is called an eigenvector of $A$ if $v$ is an eigenvector of $L_A$; that is, if $Av = \lambda v$ for some scalar $\lambda$. The scalar $\lambda$ is called the eigenvalue of $A$ corresponding to the eigenvector $v$.

Theorem 5.1. Let $A \in M_{n \times n}(F)$. Then a scalar $\lambda$ is an eigenvalue of $A$ if and only if

\begin{equation*} \det(A - \lambda I_n) = 0. \end{equation*}

Characteristic Polynomial

Let $A \in M_{n \times n}(F)$. The polynomial $f(t) = \det(A - t I_n)$ is called the characteristic polynomial of $A$.

  • It is easily shown that similar matrices have the same characteristic polynomial.

  • Let $T$ be a linear operator on an $n$-dimensional vector space $V$ with ordered basis $\beta$. We define the characteristic polynomial $f(t)$ of $T$ to be the characteristic polynomial of $A = [T]_\beta$. That is,

\begin{equation*} f(t) = \det(A - t I_n). \end{equation*}

  • We often denote the characteristic polynomial of an operator $T$ by $f(t) = \det(T - tI)$.
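
For example, consider $A = \begin{pmatrix} 1 & 2 \\ 3 & 2 \end{pmatrix}$ over $\mathbb{R}$. Its characteristic polynomial is

\begin{equation*} f(t) = \det(A - t I_2) = \det\begin{pmatrix} 1-t & 2 \\ 3 & 2-t \end{pmatrix} = (1-t)(2-t) - 6 = t^2 - 3t - 4 = (t-4)(t+1), \end{equation*}

so by Theorem 5.1 the eigenvalues of $A$ are $4$ and $-1$.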

Theorem 5.2. Let $A \in M_{n \times n}(F)$.

(a) The characteristic polynomial of $A$ is a polynomial of degree $n$ with leading coefficient $(-1)^n$.

(b) $A$ has at most $n$ distinct eigenvalues.

Theorem 5.3. Let $T$ be a linear operator on a vector space $V$, and let $\lambda$ be an eigenvalue of $T$. A vector $v \in V$ is an eigenvector of $T$ corresponding to $\lambda$ if and only if $v \ne 0$ and $v \in N(T - \lambda I)$.
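
Continuing the example above with $A = \begin{pmatrix} 1 & 2 \\ 3 & 2 \end{pmatrix}$ and $\lambda = 4$: the eigenvectors corresponding to $\lambda = 4$ are the nonzero vectors of

\begin{equation*} N(A - 4I_2) = N\begin{pmatrix} -3 & 2 \\ 3 & -2 \end{pmatrix} = \operatorname{span}\left\{ \begin{pmatrix} 2 \\ 3 \end{pmatrix} \right\}, \end{equation*}

and indeed $A \begin{pmatrix} 2 \\ 3 \end{pmatrix} = \begin{pmatrix} 8 \\ 12 \end{pmatrix} = 4 \begin{pmatrix} 2 \\ 3 \end{pmatrix}$.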

5.2 Diagonalizability

Theorem 5.4. Let $T$ be a linear operator on a vector space $V$, and let $\lambda_1, \lambda_2, \ldots, \lambda_k$ be distinct eigenvalues of $T$. If $v_1, v_2, \ldots, v_k$ are eigenvectors of $T$ such that $\lambda_i$ corresponds to $v_i$ $(1 \le i \le k)$, then $\{v_1, v_2, \ldots, v_k\}$ is linearly independent.

Corollary 1. Let $T$ be a linear operator on an $n$-dimensional vector space $V$. If $T$ has $n$ distinct eigenvalues, then $T$ is diagonalizable.

Definition. A polynomial $f(t)$ in $P(F)$ splits over $F$ if there are scalars $c, a_1, \ldots, a_n$ (not necessarily distinct) in $F$ such that

\begin{equation*} f(t) = c(t - a_1)(t - a_2)\cdots(t - a_n). \end{equation*}

  • If $f(t)$ is the characteristic polynomial of a linear operator or a matrix over a field $F$, then the statement that $f(t)$ splits is understood to mean that it splits over $F$.

Theorem 5.5. The characteristic polynomial of any diagonalizable linear operator splits.

Definition. Let $\lambda$ be an eigenvalue of a linear operator or matrix with characteristic polynomial $f(t)$. The (algebraic) multiplicity of $\lambda$ is the largest positive integer $k$ for which

\begin{equation*} (t - \lambda)^k \end{equation*}

is a factor of $f(t)$.

Eigenspace

Let $T$ be a linear operator on a vector space $V$, and let $\lambda$ be an eigenvalue of $T$. Define

\begin{equation*} E_\lambda = \{ x \in V : T(x) = \lambda x \} = N(T - \lambda I_V). \end{equation*}

The set $E_\lambda$ is called the eigenspace of $T$ corresponding to the eigenvalue $\lambda$. Analogously, we define the eigenspace of a square matrix $A$ to be the eigenspace of $L_A$.

Theorem 5.6. Let $T$ be a linear operator on a finite-dimensional vector space $V$, and let $\lambda$ be an eigenvalue of $T$ having multiplicity $m$. Then

\begin{equation*} 1 \le \dim(E_\lambda) \le m. \end{equation*}
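
The upper bound need not be attained. For $A = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}$, the characteristic polynomial is $(t-1)^2$, so $\lambda = 1$ has multiplicity $m = 2$; but $A - I_2 = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}$ has rank $1$, so $\dim(E_1) = 1 < 2$. By Theorem 5.8 below, this matrix is not diagonalizable.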

Theorem 5.7. Let $T$ be a linear operator on a vector space $V$, and let $\lambda_1, \lambda_2, \ldots, \lambda_k$ be distinct eigenvalues of $T$. For each $i = 1, 2, \ldots, k$, let $S_i$ be a finite linearly independent subset of the eigenspace $E_{\lambda_i}$. Then

\begin{equation*} S = S_1 \cup S_2 \cup \cdots \cup S_k \end{equation*}

is a linearly independent subset of $V$.

Theorem 5.8. Let $T$ be a linear operator on a finite-dimensional vector space $V$ such that the characteristic polynomial of $T$ splits. Let $\lambda_1, \lambda_2, \ldots, \lambda_k$ be the distinct eigenvalues of $T$. Then:

(a) $T$ is diagonalizable if and only if the multiplicity of $\lambda_i$ is equal to $\dim(E_{\lambda_i})$ for all $i$.

(b) If $T$ is diagonalizable and $\beta_i$ is an ordered basis for $E_{\lambda_i}$ for each $i$, then

\begin{equation*} \beta = \beta_1 \cup \beta_2 \cup \cdots \cup \beta_k \end{equation*}

is an ordered basis for $V$ consisting of eigenvectors of $T$.
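
The test in part (a) can be carried out numerically. Below is a minimal sketch, assuming a matrix over $\mathbb{R}$ or $\mathbb{C}$; the function name and the tolerance used to group nearly equal eigenvalues are illustrative choices, not part of the theorem.

```python
import numpy as np

def is_diagonalizable(A, tol=1e-8):
    """Test Theorem 5.8(a) numerically: A is diagonalizable iff, for each
    distinct eigenvalue, the algebraic multiplicity equals dim(E_lambda)."""
    A = np.asarray(A, dtype=complex)
    n = A.shape[0]
    eigvals = np.linalg.eigvals(A)
    # Group numerically close eigenvalues to estimate multiplicities.
    groups = []
    for lam in eigvals:
        for g in groups:
            if abs(lam - g[0]) < tol:
                g[1] += 1
                break
        else:
            groups.append([lam, 1])
    for lam, mult in groups:
        # dim(E_lambda) = n - rank(A - lambda I), by the rank-nullity theorem.
        geo = n - np.linalg.matrix_rank(A - lam * np.eye(n), tol=tol)
        if geo != mult:
            return False
    return True

print(is_diagonalizable(np.array([[1, 2], [3, 2]])))  # True: eigenvalues 4, -1
print(is_diagonalizable(np.array([[1, 1], [0, 1]])))  # False: dim(E_1) = 1 < 2
```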

5.3 Applications

Fast Matrix Exponentiation

If a matrix $A$ is diagonalizable, there exists an invertible matrix $P$ and a diagonal matrix $D$ such that

\begin{equation*} A = P D P^{-1}. \end{equation*}

Then, for any positive integer $n$,

\begin{equation*} A^n = P D^n P^{-1}, \end{equation*}

where

\begin{equation*} D^n = \begin{pmatrix} \lambda_1^n & & 0 \\ & \ddots & \\ 0 & & \lambda_k^n \end{pmatrix} \end{equation*}

if $D = \operatorname{diag}(\lambda_1, \ldots, \lambda_k)$. Thus computing $A^n$ reduces to raising the scalars $\lambda_i$ to the $n$-th power.
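
A short numpy sketch of this idea (the helper name is illustrative; it assumes $A$ is diagonalizable, so the eigenvector matrix returned by np.linalg.eig is invertible):

```python
import numpy as np

def matrix_power_eig(A, n):
    """Compute A^n via the eigendecomposition A = P D P^{-1}."""
    eigvals, P = np.linalg.eig(A)   # columns of P are eigenvectors of A
    Dn = np.diag(eigvals ** n)      # D^n: raise each eigenvalue to the n-th power
    return P @ Dn @ np.linalg.inv(P)

A = np.array([[1.0, 2.0], [3.0, 2.0]])
print(np.allclose(matrix_power_eig(A, 5), np.linalg.matrix_power(A, 5)))  # True
```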

Solving Linear Differential Equations

A system of linear differential equations with constant coefficients can be written in matrix form as

\begin{equation*} x' = A x, \end{equation*}

where $x(t)$ is the vector of unknown functions and $A$ is the coefficient matrix.

The main idea is to diagonalize $A$. If

\begin{equation*} A = Q D Q^{-1}, \end{equation*}

then substituting into the system gives

\begin{equation*} x' = Q D Q^{-1} x. \end{equation*}

Define the new variable

\begin{equation*} y(t) = Q^{-1} x(t), \end{equation*}

which transforms the system into

\begin{equation*} y' = D y. \end{equation*}

Since $D$ is diagonal, this gives $n$ independent scalar differential equations $y_i' = \lambda_i y_i$, each with solution $y_i(t) = y_i(0)\, e^{\lambda_i t}$. The solution to the original system is obtained by transforming back:

\begin{equation*} x(t) = Q\, y(t). \end{equation*}
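
A minimal numpy sketch of this procedure, assuming $A$ is diagonalizable (the function name is illustrative):

```python
import numpy as np

def solve_linear_ode(A, x0, t):
    """Solve x' = A x with x(0) = x0 by diagonalizing A = Q D Q^{-1}."""
    eigvals, Q = np.linalg.eig(A)   # columns of Q are eigenvectors of A
    y0 = np.linalg.solve(Q, x0)     # y(0) = Q^{-1} x(0)
    y_t = np.exp(eigvals * t) * y0  # decoupled solutions y_i(t) = y_i(0) e^{lambda_i t}
    return Q @ y_t                  # transform back: x(t) = Q y(t)

A = np.array([[0.0, 1.0], [1.0, 0.0]])  # eigenvalues 1 and -1
x0 = np.array([1.0, 0.0])
print(solve_linear_ode(A, x0, 1.0))     # [cosh(1), sinh(1)] ~ [1.5431, 1.1752]
```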

5.4 Direct Sums

Sum and Direct Sum

Let $W_1, W_2, \ldots, W_k$ be subspaces of a vector space $V$. We define the sum of these subspaces to be the set

\begin{equation*} \{\, v_1 + v_2 + \cdots + v_k : v_i \in W_i \text{ for } 1 \le i \le k \,\}, \end{equation*}

which we denote by $W_1 + W_2 + \cdots + W_k$ or $\sum_{i=1}^k W_i$.

  • The sum of subspaces of a vector space is also a subspace.

Let $W_1, W_2, \ldots, W_k$ be subspaces of a vector space $V$. We call $V$ the direct sum of the subspaces $W_1, W_2, \ldots, W_k$, and write $V = W_1 \oplus W_2 \oplus \cdots \oplus W_k$, if

\begin{equation*} V = \sum_{i=1}^k W_i \end{equation*}

and

\begin{equation*} W_j \cap \sum_{i \ne j} W_i = \{0\} \quad \text{for each } j \ (1 \le j \le k). \end{equation*}
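
Note that the second condition is stronger than requiring $W_i \cap W_j = \{0\}$ for all $i \ne j$. For example, in $\mathbb{R}^2$ take $W_1 = \operatorname{span}\{(1,0)\}$, $W_2 = \operatorname{span}\{(0,1)\}$, and $W_3 = \operatorname{span}\{(1,1)\}$: the pairwise intersections are all $\{0\}$, yet $W_3 \cap (W_1 + W_2) = W_3 \ne \{0\}$, so $\mathbb{R}^2$ is not the direct sum of $W_1, W_2, W_3$.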

Theorem 5.9. Let $W_1, W_2, \ldots, W_k$ be subspaces of a finite-dimensional vector space $V$. The following conditions are equivalent.

(a) $V = W_1 \oplus W_2 \oplus \cdots \oplus W_k$.

(b) $V = \sum_{i=1}^k W_i$ and, for any vectors $v_1, v_2, \ldots, v_k$ such that $v_i \in W_i$ $(1 \le i \le k)$, if $v_1 + v_2 + \cdots + v_k = 0$ then $v_i = 0$ for all $i$.

(c) Each vector $v \in V$ can be uniquely written as

\begin{equation*} v = v_1 + v_2 + \cdots + v_k, \end{equation*}

where $v_i \in W_i$.

(d) If $\gamma_i$ is an ordered basis for $W_i$ $(1 \le i \le k)$, then $\gamma_1 \cup \gamma_2 \cup \cdots \cup \gamma_k$ is an ordered basis for $V$.

(e) For each $i = 1, 2, \ldots, k$, there exists an ordered basis $\gamma_i$ for $W_i$ such that

\begin{equation*} \gamma_1 \cup \gamma_2 \cup \cdots \cup \gamma_k \end{equation*}

is an ordered basis for $V$.

Theorem 5.10. A linear operator $T$ on a finite-dimensional vector space $V$ is diagonalizable if and only if $V$ is the direct sum of the eigenspaces of $T$.

5.5 Invariant Subspaces and the Cayley–Hamilton Theorem

Invariant Subspaces

Let $T$ be a linear operator on a vector space $V$. A subspace $W$ of $V$ is called a $T$-invariant subspace of $V$ if $T(W) \subseteq W$, that is, if $T(v) \in W$ for all $v \in W$.

Definition. Let $T$ be a linear operator on a vector space $V$, and let $x$ be a nonzero vector in $V$. The subspace

\begin{equation*} W = \operatorname{span}(\{x,\, T(x),\, T^2(x),\, \ldots\}) \end{equation*}

is called the $T$-cyclic subspace of $V$ generated by $x$. It is a simple matter to show that $W$ is $T$-invariant.

  • In fact, $W$ is the "smallest" $T$-invariant subspace of $V$ containing $x$. That is, any $T$-invariant subspace of $V$ containing $x$ must also contain $W$.
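
For instance, let $T$ be the linear operator on $\mathbb{R}^3$ defined by $T(a, b, c) = (-b + c,\, a + c,\, 3c)$, and let $W$ be the $T$-cyclic subspace generated by $e_1 = (1, 0, 0)$. Since

\begin{equation*} T(e_1) = (0, 1, 0) = e_2 \quad \text{and} \quad T^2(e_1) = T(e_2) = (-1, 0, 0) = -e_1, \end{equation*}

it follows that $W = \operatorname{span}(\{e_1,\, T(e_1),\, T^2(e_1),\, \ldots\}) = \operatorname{span}(\{e_1, e_2\})$.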

Theorem 5.11. Let $T$ be a linear operator on a finite-dimensional vector space $V$, and let $W$ be a $T$-invariant subspace of $V$. Then the characteristic polynomial of $T_W$ divides the characteristic polynomial of $T$.

Theorem 5.12. Let $T$ be a linear operator on a finite-dimensional vector space $V$, and let $W$ denote the $T$-cyclic subspace of $V$ generated by a nonzero vector $v \in V$. Let $k = \dim(W)$. Then:

(a) $\{v,\, T(v),\, T^2(v),\, \ldots,\, T^{k-1}(v)\}$ is a basis for $W$.

(b) If

\begin{equation*} a_0 v + a_1 T(v) + \cdots + a_{k-1} T^{\,k-1}(v) + T^{\,k}(v) = 0, \end{equation*}

then the characteristic polynomial of $T_W$ is

\begin{equation*} f(t) = (-1)^k \bigl(a_0 + a_1 t + \cdots + a_{k-1} t^{k-1} + t^k \bigr). \end{equation*}
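
In the example above, $k = \dim(W) = 2$ and $1 \cdot e_1 + 0 \cdot T(e_1) + T^2(e_1) = 0$, so $a_0 = 1$ and $a_1 = 0$; by part (b), the characteristic polynomial of $T_W$ is $f(t) = (-1)^2 (1 + 0 \cdot t + t^2) = t^2 + 1$.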

The Cayley–Hamilton Theorem

Theorem 5.13 (Cayley–Hamilton). Let $T$ be a linear operator on a finite-dimensional vector space $V$, and let $f(t)$ be the characteristic polynomial of $T$. Then $f(T) = T_0$, the zero transformation. That is, $T$ "satisfies" its characteristic equation.

Corollary 1. Let $A$ be an $n \times n$ matrix, and let $f(t)$ be the characteristic polynomial of $A$. Then $f(A) = O$, the $n \times n$ zero matrix.
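
A quick numerical check of the corollary for the earlier $2 \times 2$ example, using the fact that for a $2 \times 2$ matrix $f(t) = \det(A - tI_2) = t^2 - \operatorname{tr}(A)\, t + \det(A)$:

```python
import numpy as np

A = np.array([[1.0, 2.0], [3.0, 2.0]])
# For a 2x2 matrix, f(t) = det(A - t I) = t^2 - tr(A) t + det(A).
f_A = A @ A - np.trace(A) * A + np.linalg.det(A) * np.eye(2)
print(np.allclose(f_A, np.zeros((2, 2))))  # True: A satisfies f(A) = O
```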

Theorem 5.14. Let $A$ be an $n \times n$ matrix and $f(x)$ a polynomial such that $f(A) = O$ (the zero matrix). Then every eigenvalue $\lambda$ of $A$ must be a root of the scalar equation $f(x) = 0$.

Definition. Let $B_1 \in M_{m \times m}(F)$, and let $B_2 \in M_{n \times n}(F)$. We define the direct sum of $B_1$ and $B_2$, denoted $B_1 \oplus B_2$, as the $(m+n) \times (m+n)$ matrix $A$ such that

\begin{equation*} A_{ij} = \begin{cases} (B_1)_{ij} & \text{for } 1 \le i, j \le m, \\[6pt] (B_2)_{(i-m),(j-m)} & \text{for } m+1 \le i, j \le n+m, \\[6pt] 0 & \text{otherwise}. \end{cases} \end{equation*}

If $B_1, B_2, \ldots, B_k$ are square matrices with entries from $F$, then we define the direct sum of $B_1, B_2, \ldots, B_k$ recursively by

\begin{equation*} B_1 \oplus B_2 \oplus \cdots \oplus B_k = (B_1 \oplus B_2 \oplus \cdots \oplus B_{k-1}) \oplus B_k. \end{equation*}

If $A = B_1 \oplus B_2 \oplus \cdots \oplus B_k$, then we often write

\begin{equation*} A = \begin{pmatrix} B_1 & O & \cdots & O \\ O & B_2 & \cdots & O \\ \vdots & \vdots & \ddots & \vdots \\ O & O & \cdots & B_k \end{pmatrix}. \end{equation*}
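
In code, this construction is exactly a block-diagonal matrix; for example, with scipy:

```python
import numpy as np
from scipy.linalg import block_diag

B1 = np.array([[1, 2], [3, 4]])
B2 = np.array([[5]])
A = block_diag(B1, B2)  # the direct sum B1 (+) B2
print(A)
# [[1 2 0]
#  [3 4 0]
#  [0 0 5]]
```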

Theorem 5.15. Let $T$ be a linear operator on a finite-dimensional vector space $V$, and suppose that

\begin{equation*} V = W_1 \oplus W_2 \oplus \cdots \oplus W_k, \end{equation*}

where $W_i$ is a $T$-invariant subspace of $V$ for each $i$ $(1 \le i \le k)$. Suppose that $f_i(t)$ is the characteristic polynomial of $T_{W_i}$ $(1 \le i \le k)$. Then

\begin{equation*} f_1(t) \cdot f_2(t) \cdots f_k(t) \end{equation*}

is the characteristic polynomial of $T$.
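
For a block-diagonal matrix, each block acts on a $T$-invariant subspace, so the theorem can be verified numerically. The sketch below uses np.poly, which returns the coefficients of the monic polynomial $\det(tI - M)$; since the monic convention is used on both sides, the product of the blocks' characteristic polynomials matches that of their direct sum.

```python
import numpy as np
from scipy.linalg import block_diag

B1 = np.array([[1.0, 2.0], [3.0, 4.0]])
B2 = np.array([[5.0]])
A = block_diag(B1, B2)

# Characteristic polynomial of the direct sum vs. product of the blocks'.
lhs = np.poly(A)
rhs = np.polymul(np.poly(B1), np.poly(B2))
print(np.allclose(lhs, rhs))  # True
```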