Let V be a vector space over F. An inner product on V is a function that assigns, to every ordered pair of vectors x and y in V, a scalar in F, denoted ⟨x,y⟩, such that for all x,y,z∈V and all c∈F, the following hold:
(a) ⟨x+z,y⟩ = ⟨x,y⟩ + ⟨z,y⟩.
(b) ⟨cx,y⟩ = c⟨x,y⟩.
(c) ⟨x,y⟩ = \overline{⟨y,x⟩}, where the bar denotes complex conjugation.
(d) ⟨x,x⟩ > 0 if x ≠ 0.
We assume that all vector spaces are over the field F, where F denotes either
R or C.
Note that (c) reduces to ⟨x,y⟩=⟨y,x⟩ if F=R.
Theorem 6.1. Let V be an inner product space. Then for x,y,z∈V and c∈F, the following statements are true.
(a)⟨x,y+z⟩=⟨x,y⟩+⟨x,z⟩.
(b) ⟨x,cy⟩ = c̄⟨x,y⟩.
(c)⟨x,0⟩=⟨0,x⟩=0.
(d) ⟨x,x⟩ = 0 if and only if x = 0.
(e)If ⟨x,y⟩=⟨x,z⟩ for all x∈V, then y=z.
Definition. For x=(a1,a2,…,an) and y=(b1,b2,…,bn) in Fn, define
⟨x,y⟩ = ∑_{i=1}^{n} aib̄i.
The inner product is called the standard inner product on Fn.
When F=R the conjugations are not needed, and in early courses this
standard inner product is usually called the dot product and is denoted by
x⋅y instead of ⟨x,y⟩.
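A quick numerical sketch of the standard inner product in Python with NumPy (the vectors are arbitrary examples); note that NumPy's vdot conjugates its first argument, so ⟨x,y⟩ as defined above corresponds to vdot(y, x):

```python
import numpy as np

# Standard inner product on C^n: <x, y> = sum_i a_i * conj(b_i).
x = np.array([1 + 2j, 3 - 1j])
y = np.array([2 - 1j, 1j])

inner = np.sum(x * np.conj(y))           # direct translation of the definition
assert np.isclose(inner, np.vdot(y, x))  # np.vdot conjugates its FIRST argument
print(inner)
```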
Let V be an inner product space. For x∈V, we define the norm or length
of x by
∥x∥ = √⟨x,x⟩.
Theorem 6.2. Let V be an inner product space over F. Then for all x,y∈V and c∈F, the following statements are true.
(a)∥cx∥=∣c∣⋅∥x∥.
(b) ∥x∥ = 0 if and only if x = 0. In any case, ∥x∥ ≥ 0.
(c)(Cauchy–Schwarz Inequality)
∣⟨x,y⟩∣≤∥x∥⋅∥y∥.
(d)(Triangle Inequality)
∥x+y∥≤∥x∥+∥y∥.
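Both inequalities are easy to spot-check numerically; a small NumPy sketch with arbitrary random vectors:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal(5) + 1j * rng.standard_normal(5)
y = rng.standard_normal(5) + 1j * rng.standard_normal(5)

inner = np.sum(x * np.conj(y))                    # <x, y>
nx, ny = np.linalg.norm(x), np.linalg.norm(y)     # ||x||, ||y||

assert abs(inner) <= nx * ny + 1e-12              # Cauchy-Schwarz
assert np.linalg.norm(x + y) <= nx + ny + 1e-12   # triangle inequality
```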
Definition. Let A∈Mm×n(F). We define the conjugate transpose or adjoint of A to be the n×m matrix A∗ such that
(A∗)ij = Āji for all i, j.
Definition. Let V=Mn×n(F), and define
⟨A,B⟩=tr(B∗A)
for A,B∈V.
The inner product on Mn×n(F) in this example is called the Frobenius inner product.
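For concreteness, a NumPy sketch of the Frobenius inner product; it agrees with the entrywise sum of Aij times the conjugate of Bij, and it induces the Frobenius norm:

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
B = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))

frob = np.trace(B.conj().T @ A)      # <A, B> = tr(B* A)
entrywise = np.sum(A * np.conj(B))   # same value, computed entry by entry
assert np.isclose(frob, entrywise)

# <A, A> = ||A||_F^2 (the Frobenius norm squared)
assert np.isclose(np.trace(A.conj().T @ A).real, np.linalg.norm(A, 'fro') ** 2)
```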
Definition. A vector space V over F endowed with a specific inner product is called
an inner product space. If F=C, we call V a complex inner product space, whereas if F=R, we call V a real inner product space.
Let S be a nonempty subset of an inner product space V. We define S⊥ (read “S perp”) to be the set of all vectors in V that are orthogonal to every vector in S; that is,
S⊥={x∈V:⟨x,y⟩=0 for all y∈S}.
The set S⊥ is called the orthogonal complement of S.
It is easily seen that S⊥ is a subspace of V for any subset S of V.
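Numerically, S⊥ for a finite set S in Fⁿ is the null space of the matrix whose rows are the conjugated vectors of S; a sketch using SciPy's null_space (the particular S is an arbitrary example):

```python
import numpy as np
from scipy.linalg import null_space

# S = {s1, s2} in R^4; <x, s> = 0 for every s in S means conj(S_mat) @ x = 0.
S_mat = np.array([[1.0, 0.0, 1.0, 0.0],
                  [0.0, 1.0, 0.0, 1.0]])
perp = null_space(np.conj(S_mat))   # columns: an orthonormal basis of S-perp
assert np.allclose(np.conj(S_mat) @ perp, 0)
print(perp.shape)                   # (4, 2): dim(S-perp) = 4 - rank(S_mat)
```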
Theorem 6.4. Let V be a nonzero finite-dimensional inner product space. Then V has an orthonormal basis β. Furthermore, if β={v1,v2,…,vn} and x∈V, then
x = ∑_{i=1}^{n} ⟨x,vi⟩vi.
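A NumPy sketch of this expansion, using an orthonormal basis of C⁴ obtained from a QR factorization (an arbitrary choice):

```python
import numpy as np

rng = np.random.default_rng(2)
# Orthonormal basis of C^4: the columns of a unitary Q from a QR factorization.
Q, _ = np.linalg.qr(rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4)))
basis = [Q[:, i] for i in range(4)]

x = rng.standard_normal(4) + 1j * rng.standard_normal(4)
# x = sum_i <x, v_i> v_i, with <x, v_i> = sum_k x_k * conj(v_i[k])
recon = sum(np.sum(x * np.conj(v)) * v for v in basis)
assert np.allclose(recon, x)
```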
Corollary 1. Let V be a finite-dimensional inner product space with an orthonormal basis β={v1,v2,…,vn}. Let T be a linear operator on V, and let A=[T]β. Then for any i and j,
Aij=⟨T(vj),vi⟩.
Theorem 6.5. Let W be a finite-dimensional subspace of an inner product space V, and let y∈V. Then there exist unique vectors u∈W and z∈W⊥ such that y=u+z. Furthermore, if {v1,v2,…,vk} is an orthonormal basis for W, then
u = ∑_{i=1}^{k} ⟨y,vi⟩vi.
Corollary 1. The vector u is the unique vector in W
that is “closest” to y; that is, for any x∈W,
∥y−x∥≥∥y−u∥,
and this inequality is an equality if and only if x=u.
The vector u in the corollary is called the orthogonal projection of y on W.
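A NumPy sketch of the orthogonal projection of y on W, with W given by the orthonormal columns of a matrix Q (an arbitrary two-dimensional subspace of R⁵):

```python
import numpy as np

rng = np.random.default_rng(3)
Q, _ = np.linalg.qr(rng.standard_normal((5, 2)))  # orthonormal basis of W (columns)
y = rng.standard_normal(5)

u = Q @ (Q.T @ y)                   # u = sum_i <y, v_i> v_i, the projection on W
z = y - u                           # the component in W-perp
assert np.allclose(Q.T @ z, 0)      # z is orthogonal to every vector of W

# u is the point of W closest to y: any other x in W is at least as far away.
x = Q @ rng.standard_normal(2)
assert np.linalg.norm(y - x) >= np.linalg.norm(y - u) - 1e-12
```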
Theorem 6.6. Suppose that S={v1,v2,…,vk} is an orthonormal set in an n-dimensional inner product space V. Then
(a)S can be extended to an orthonormal basis {v1,v2,…,vk,vk+1,…,vn} for V.
(b)If W=span(S), then S1={vk+1,vk+2,…,vn} is an orthonormal basis for W⊥ .
Theorem 6.7. Let V be a finite-dimensional inner product space over F, and let
g: V→F be a linear transformation. Then there exists a unique vector y∈V such that g(x) = ⟨x,y⟩ for all x∈V.
Even when a system of linear equations Ax=b is consistent, there may be no unique solution. In such cases, it may be desirable to find a solution of minimal norm.
A solution s to Ax=b is called a minimal solution if
∥s∥≤∥u∥
for all other solutions u.
Theorem. Let A∈Mm×n(F) and b∈Fm. Suppose that Ax=b is consistent. Then the following statements are true.
(a)There exists exactly one minimal solution s of Ax=b, and
s∈R(LA∗).
(b) The vector s is the only solution to Ax=b that lies in R(LA∗); that is, if u satisfies (AA∗)u = b, then s = A∗u.
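A NumPy sketch for a small underdetermined but consistent system; the pseudoinverse gives the minimal solution, and it matches s = A∗u with (AA∗)u = b (the second computation assumes A has full row rank):

```python
import numpy as np

# A consistent system with more unknowns than equations (no unique solution).
A = np.array([[1.0, 2.0, 1.0],
              [0.0, 1.0, 1.0]])
b = np.array([3.0, 1.0])

s = np.linalg.pinv(A) @ b              # minimal-norm solution
u = np.linalg.solve(A @ A.T, b)        # (A A*) u = b  (A has full row rank here)
assert np.allclose(s, A.T @ u)         # s = A* u, so s lies in R(L_A*)
assert np.allclose(A @ s, b)           # s really solves the system
```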
Lemma. Let T be a linear operator on a finite-dimensional inner product space V. If T has an eigenvector, then so does T∗.
Theorem 6.10 (Schur). Let T be a linear operator on a finite-dimensional inner product space V. Suppose that the characteristic polynomial of T splits. Then there exists an orthonormal basis β for V such that the matrix [T]β is upper triangular.
Let V be an inner product space, and let T be a linear operator on V. We say that T is normal if TT∗=T∗T. An n×n real or complex matrix A is normal if AA∗=A∗A.
Theorem 6.11. Let V be an inner product space, and let T be a normal operator on V. Then the following statements are true.
(a)∥T(x)∥=∥T∗(x)∥ for all x∈V.
(b)T−cI is normal for every c∈F.
(c) If x is an eigenvector of T, then x is also an eigenvector of T∗. In fact, if T(x)=λx, then T∗(x) = λ̄x.
(d)If λ1 and λ2 are distinct eigenvalues of T with corresponding eigenvectors x1 and x2, then x1 and x2 are orthogonal.
Theorem 6.12. Let T be a linear operator on a finite-dimensional complex inner product space V. Then T is normal if and only if there exists an orthonormal basis for V consisting of eigenvectors of T.
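A NumPy sketch: build a normal matrix as A = UDU∗ with U unitary and D diagonal, then check that its eigenvectors form an orthonormal basis. The check relies on the eigenvalues being distinct, so that np.linalg.eig returns (numerically) orthogonal unit eigenvectors:

```python
import numpy as np

rng = np.random.default_rng(4)
U, _ = np.linalg.qr(rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4)))
D = np.diag(np.array([1 + 1j, 2.0, -1j, 3 - 2j]))   # distinct eigenvalues
A = U @ D @ U.conj().T                               # a normal matrix

assert np.allclose(A @ A.conj().T, A.conj().T @ A)   # A A* = A* A

w, V = np.linalg.eig(A)
# For a normal matrix with distinct eigenvalues the eigenvectors are orthogonal,
# so V is (numerically) unitary and V* A V is diagonal.
assert np.allclose(V.conj().T @ V, np.eye(4), atol=1e-8)
assert np.allclose(V.conj().T @ A @ V, np.diag(w), atol=1e-8)
```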
Let T be a linear operator on an inner product space V. We say that T is self-adjoint (Hermitian) if T=T∗. An n×n real or complex matrix A is self-adjoint (Hermitian) if A=A∗.
Theorem 6.13. Let T be a self-adjoint operator on a finite-dimensional inner product space V. Then
(a)Every eigenvalue of T is real.
(b)Suppose that V is a real inner product space. Then the characteristic polynomial of T splits.
Theorem 6.14. Let T be a linear operator on a finite-dimensional real inner product space V. Then T is self-adjoint if and only if there exists an orthonormal basis β for V consisting of eigenvectors of T.
6.5 Unitary and Orthogonal Operators and Their Matrices
Let T be a linear operator on a finite-dimensional inner product space V over F. If ∥T(x)∥=∥x∥ for all x∈V, we call T a unitary operator if F=C and an orthogonal operator if F=R.
Lemma. Let U be a self-adjoint operator on a finite-dimensional inner product space V. If ⟨x,U(x)⟩=0 for all x∈V, then U=T0 (the zero operator).
Theorem 6.15. Let T be a linear operator on a finite-dimensional inner product space V. Then the following statements are equivalent.
(a)TT∗=T∗T=I.
(b)⟨T(x),T(y)⟩=⟨x,y⟩ for all x,y∈V.
(c)If β is an orthonormal basis for V, then T(β) is an orthonormal basis for V.
(d)There exists an orthonormal basis β for V such that T(β) is an orthonormal basis for V.
(e)∥T(x)∥=∥x∥ for all x∈V.
Theorem 6.16. Let T be a linear operator on a finite-dimensional real inner product space V. Then V has an orthonormal basis of eigenvectors of T with corresponding eigenvalues of absolute value 1 if and only if T is both self-adjoint and orthogonal.
Theorem 6.17. Let T be a linear operator on a finite-dimensional complex inner product space V. Then V has an orthonormal basis of eigenvectors of T with corresponding eigenvalues of absolute value 1 if and only if T is unitary.
Definition. Let L be a one-dimensional subspace of R2. We may view L as a line in the plane through the origin. A linear operator T on R2 is called a reflection of R2 about L if
T(x) = x for all x∈L
and
T(x) = −x for all x∈L⊥.
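In coordinates, the reflection about the line L spanned by a unit vector u has matrix 2uuᵗ − I; a small NumPy check:

```python
import numpy as np

theta = 0.3
u = np.array([np.cos(theta), np.sin(theta)])   # unit vector spanning L
R = 2 * np.outer(u, u) - np.eye(2)             # reflection about L

v_perp = np.array([-u[1], u[0]])               # spans L-perp
assert np.allclose(R @ u, u)                   # T(x) = x  for x in L
assert np.allclose(R @ v_perp, -v_perp)        # T(x) = -x for x in L-perp
assert np.allclose(R.T @ R, np.eye(2))         # a reflection is orthogonal
```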
Definition. A square matrix A is called an orthogonal matrix if
AtA=AAt=I,
and unitary if
A∗A=AA∗=I.
Definition. A and B are unitarily equivalent [orthogonally equivalent] if and only if there exists a unitary [orthogonal] matrix P such that A=P∗BP.
Theorem 6.18. Let A be a complex n×n matrix. Then A is normal if and only if A is unitarily equivalent to a diagonal matrix.
Theorem 6.19. Let A be a real n×n matrix. Then A is symmetric (self-adjoint) if and only if A is orthogonally equivalent to a real diagonal matrix.
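A NumPy sketch of Theorem 6.19: np.linalg.eigh returns an orthogonal matrix of eigenvectors for a real symmetric matrix, exhibiting the orthogonal equivalence to a diagonal matrix:

```python
import numpy as np

rng = np.random.default_rng(5)
M = rng.standard_normal((4, 4))
A = (M + M.T) / 2                              # a real symmetric matrix

w, P = np.linalg.eigh(A)                       # real eigenvalues, orthonormal eigenvectors
assert np.allclose(P.T @ P, np.eye(4))         # P is orthogonal
assert np.allclose(P.T @ A @ P, np.diag(w))    # P^t A P is diagonal
```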
6.6 Orthogonal Projections and the Spectral Theorem
If V=W1⊕W2, then a linear operator T on V is the projection on W1 along W2 if, whenever
x=x1+x2,
with x1∈W1 and x2∈W2, we have
T(x)=x1.
Let V be an inner product space, and let T:V→V be a projection. We say that T is an orthogonal projection if
R(T)⊥=N(T)
and
N(T)⊥=R(T).
Theorem 6.20. Let V be an inner product space, and let T be a linear operator on V. Then T is an orthogonal projection if and only if T has an adjoint T∗ and
T2=T=T∗.
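In matrix form, the orthogonal projection onto the column space of a matrix Q with orthonormal columns is P = QQ∗; a NumPy check of the characterization P² = P = P∗:

```python
import numpy as np

rng = np.random.default_rng(6)
Q, _ = np.linalg.qr(rng.standard_normal((5, 2)) + 1j * rng.standard_normal((5, 2)))
P = Q @ Q.conj().T                  # orthogonal projection onto the column space of Q

assert np.allclose(P @ P, P)        # P^2 = P
assert np.allclose(P, P.conj().T)   # P = P*
```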
Theorem 6.21 (The Spectral Theorem). Suppose that T is a linear operator on a finite-dimensional inner product space V over F with the distinct eigenvalues λ1,λ2,…,λk. Assume that T is normal if F=C and that T is self-adjoint if F=R. For each i(1≤i≤k), let Wi be the eigenspace of T corresponding to the eigenvalue λi, and let Ti be the orthogonal projection of V on Wi. Then the following statements are true.
(a)
V=W1⊕W2⊕⋯⊕Wk.
(b) If Wi′ denotes the direct sum of the subspaces Wj for j ≠ i, then
Wi⊥=Wi′.
(c)
TiTj = δijTi for 1 ≤ i, j ≤ k.
(d)
I=T1+T2+⋯+Tk.
(e)
T=λ1T1+λ2T2+⋯+λkTk.
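A NumPy sketch of the spectral decomposition for a real symmetric (hence self-adjoint) matrix; with simple eigenvalues each Ti is just vivit for the corresponding unit eigenvector vi:

```python
import numpy as np

rng = np.random.default_rng(7)
M = rng.standard_normal((4, 4))
A = (M + M.T) / 2                                   # self-adjoint, so the theorem applies

w, P = np.linalg.eigh(A)
projs = [np.outer(P[:, i], P[:, i]) for i in range(4)]   # orthogonal projections T_i

assert np.allclose(sum(projs), np.eye(4))                        # I = T_1 + ... + T_k
assert np.allclose(sum(w[i] * projs[i] for i in range(4)), A)    # A = sum_i lambda_i T_i
assert np.allclose(projs[0] @ projs[1], np.zeros((4, 4)))        # T_i T_j = 0 for i != j
```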
Corollary 1. If F=C, then T is normal if and only if
T∗=g(T)
for some polynomial g.
Corollary 2. If F=C, then T is unitary if and only if T is normal and
∣λ∣=1
for every eigenvalue λ of T.
Corollary 3. If F=C and T is normal, then T is self-adjoint if and only if every eigenvalue of T is real.
Corollary 4. Let T be as in the spectral theorem with spectral decomposition T = λ1T1 + λ2T2 + ⋯ + λkTk. Then each Tj is a polynomial in T.
Theorem 6.22 (Singular Value Theorem for Linear Transformations). Let V and W be finite-dimensional inner product spaces, and let T: V→W be a linear transformation of rank r. Then there exist orthonormal bases {v1,v2,…,vn} for V and {u1,u2,…,um} for W and positive scalars σ1 ≥ σ2 ≥ ⋯ ≥ σr such that T(vi) = σiui if 1 ≤ i ≤ r and T(vi) = 0 if i > r.
Conversely, suppose that the preceding conditions are satisfied. Then for 1≤i≤n, vi is an eigenvector of T∗T with corresponding eigenvalue σi2 if 1≤i≤r, and 0 if i>r. Therefore the scalars σ1,σ2,…,σr are uniquely determined by T.
Definition. The unique scalars σ1,σ2,…,σr are called the singular values of T. If r is less than both m and n (where n = dim(V) and m = dim(W)), then the term singular value is extended to include σr+1=⋯=σk=0, where k is the minimum of m and n.
Definition. Let A be an m×n matrix. We define the singular values of A to be the singular values of the linear transformation LA.
Theorem 6.23 (Singular Value Decomposition Theorem for Matrices). Let A be an m×n matrix of rank r with the positive singular values
σ1≥σ2≥⋯≥σr,
and let Σ be the m×n matrix defined by
Σij = σi if i = j ≤ r, and Σij = 0 otherwise.
Then there exists an m×m unitary matrix U and an n×n unitary matrix V such that
A=UΣV∗.
Definition. Let A be an m×n matrix of rank r with positive singular values σ1≥σ2≥⋯≥σr. A factorization A=UΣV∗, where U and V are unitary matrices and Σ is the m×n matrix defined as in Theorem 6.23, is called a singular value decomposition of A.
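A NumPy sketch of a singular value decomposition; np.linalg.svd returns U, the singular values in decreasing order, and V∗ directly:

```python
import numpy as np

rng = np.random.default_rng(8)
A = rng.standard_normal((4, 3))

U, s, Vh = np.linalg.svd(A)            # s: singular values, descending; Vh = V*
Sigma = np.zeros((4, 3))
Sigma[:3, :3] = np.diag(s)             # sigma_i on the diagonal of an m x n matrix

assert np.allclose(A, U @ Sigma @ Vh)  # A = U Sigma V*
assert np.allclose(U.T @ U, np.eye(4))
assert np.allclose(Vh @ Vh.T, np.eye(3))
```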
Let V be a vector space over a field F. A function H from the set V×V of ordered pairs of vectors to F is called a bilinear form on V if H is linear in each variable when the other variable is held fixed; that is, H is a bilinear form on V if
(a)H(ax1+x2,y)=aH(x1,y)+H(x2,y) for all x1,x2,y∈V and a∈F,
(b)H(x,ay1+y2)=aH(x,y1)+H(x,y2) for all x,y1,y2∈V and a∈F.
Definition. Let V be a vector space, let H1 and H2 be bilinear forms on V, and let a be a scalar. We define the sum H1+H2 and the scalar product aH1 by the equations
(H1+H2)(x,y)=H1(x,y)+H2(x,y)
and
(aH1)(x,y) = a(H1(x,y)), for all x,y∈V.
For any vector space V, the sum of two bilinear forms and the product of a scalar and a bilinear form on V are again bilinear forms on V. Furthermore, the set B(V) of all bilinear forms on V is a vector space with respect to these operations.
Definition. Let β={v1,v2,…,vn} be an ordered basis for an n-dimensional vector space V, and let H∈B(V). We can associate with H an n×n matrix A whose entry in row i and column j is defined by
Aij=H(vi,vj),for i,j=1,2,…,n.
The matrix A above is called the matrix representation of H with respect to the ordered basis β and is denoted by ψβ(H).
Theorem 6.24. Let F be a field, n a positive integer, and β be the standard ordered basis for Fn. Then for any H∈B(Fn), there exists a unique matrix A∈Mn×n(F), namely A=ψβ(H), such that
H(x,y) = xtAy for all x,y∈Fn.
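A small NumPy illustration of Theorem 6.24: for the standard basis, the matrix A with Aij = H(ei, ej) recovers H via H(x,y) = xtAy (the particular A is an arbitrary example):

```python
import numpy as np

A = np.array([[2.0, 1.0, 0.0],
              [0.0, 3.0, 1.0],
              [1.0, 0.0, 1.0]])

def H(x, y):                 # the bilinear form determined by A
    return x @ A @ y

e = np.eye(3)                # rows are the standard basis vectors
assert all(np.isclose(H(e[i], e[j]), A[i, j]) for i in range(3) for j in range(3))

# Bilinearity in the first argument, for example:
x1, x2, y = np.array([1.0, 2, 0]), np.array([0.0, 1, 1]), np.array([2.0, -1, 3])
assert np.isclose(H(5 * x1 + x2, y), 5 * H(x1, y) + H(x2, y))
```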
Definition. Let A,B∈Mn×n(F). Then B is said to be congruent to A if there exists an invertible matrix Q∈Mn×n(F) such that
B=QtAQ.
Theorem 6.25. Let V be a finite-dimensional vector space with ordered bases β={v1,v2,…,vn} and γ={w1,w2,…,wn}, and let Q be the change-of-coordinate matrix changing γ-coordinates into β-coordinates. Then, for any H∈B(V), we have ψγ(H) = Qtψβ(H)Q.
A bilinear form H on a vector space V is symmetric if H(x,y)=H(y,x) for all x,y∈V. As the name suggests, symmetric bilinear forms correspond to symmetric matrices.
Theorem 6.26. Let H be a bilinear form on a finite-dimensional vector space V, and let β be an ordered basis for V. Then H is symmetric if and only if ψβ(H) is symmetric.
Definition. A bilinear form H on a finite-dimensional vector space V is called diagonalizable if there is an ordered basis β for V such that ψβ(H) is a diagonal matrix.
Lemma. Let H be a nonzero symmetric bilinear form on a vector space V over a field F not of characteristic two. Then there is a vector x in V such that H(x,x) ≠ 0.
Theorem 6.27. Let V be a finite-dimensional vector space over a field F not of characteristic two. Then every symmetric bilinear form on V is diagonalizable.
Let V be a vector space over F. A function K:V→F is called a quadratic form if there exists a symmetric bilinear form H∈B(V) such that
K(x) = H(x,x) for all x∈V.
Definition. Given the variables t1,t2,…,tn that take values in a field F not of characteristic two and given (not necessarily distinct) scalars aij (1 ≤ i ≤ j ≤ n), define the polynomial
f(t1,t2,…,tn) = ∑_{i≤j} aij ti tj.
Any such polynomial is a quadratic form. In fact, if β is the standard ordered basis for Fn, then the symmetric bilinear form H corresponding to the quadratic form f has the matrix representation ψβ(H)=A, where
Aij = Aji = aii if i = j, and Aij = Aji = (1/2)aij if i < j.
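A NumPy sketch of this correspondence for the quadratic form f(t1,t2) = 2t1² + 3t1t2 + t2² (an arbitrary example): the associated symmetric matrix has the aii on the diagonal and aij/2 off the diagonal, and f(t) = ttAt:

```python
import numpy as np

# f(t1, t2) = 2 t1^2 + 3 t1 t2 + t2^2:  a11 = 2, a12 = 3, a22 = 1.
A = np.array([[2.0, 1.5],
              [1.5, 1.0]])              # diagonal: a_ii; off-diagonal: a_ij / 2

def f(t1, t2):
    return 2 * t1**2 + 3 * t1 * t2 + t2**2

t = np.array([1.3, -0.7])
assert np.isclose(f(*t), t @ A @ t)     # K(x) = H(x, x) = x^t A x
```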
Theorem 6.28. Let K be a quadratic form on a finite-dimensional real inner product space V. There exists an orthonormal basis β={v1,v2,…,vn} for V and scalars λ1,λ2,…,λn (not necessarily distinct) such that if x∈V and x = ∑_{i=1}^{n} sivi, then K(x) = ∑_{i=1}^{n} λisi².