$$ \newcommand{\qed}{\tag*{$\square$}} \newcommand{\span}{\operatorname{span}} \newcommand{\dim}{\operatorname{dim}} \newcommand{\rank}{\operatorname{rank}} \newcommand{\norm}[1]{\|#1\|} \newcommand{\grad}{\nabla} \newcommand{\prox}[1]{\operatorname{prox}_{#1}} \newcommand{\inner}[2]{\langle{#1}, {#2}\rangle} \newcommand{\mat}[1]{\mathcal{M}[#1]} \newcommand{\null}[1]{\operatorname{null} \left(#1\right)} \newcommand{\range}[1]{\operatorname{range} \left(#1\right)} \newcommand{\rowvec}[1]{\begin{bmatrix} #1 \end{bmatrix}^T} \newcommand{\Reals}{\mathbf{R}} \newcommand{\RR}{\mathbf{R}} \newcommand{\Complex}{\mathbf{C}} \newcommand{\Field}{\mathbf{F}} \newcommand{\Pb}{\operatorname{Pr}} \newcommand{\E}[1]{\operatorname{E}[#1]} \newcommand{\Var}[1]{\operatorname{Var}[#1]} \newcommand{\argmin}[2]{\underset{#1}{\operatorname{argmin}} {#2}} \newcommand{\optmin}[3]{ \begin{align*} & \underset{#1}{\text{minimize}} & & #2 \\ & \text{subject to} & & #3 \end{align*} } \newcommand{\optmax}[3]{ \begin{align*} & \underset{#1}{\text{maximize}} & & #2 \\ & \text{subject to} & & #3 \end{align*} } \newcommand{\optfind}[2]{ \begin{align*} & {\text{find}} & & #1 \\ & \text{subject to} & & #2 \end{align*} } $$
In this section, we study a special type of linear operator, the projection. Projections are ubiquitous in numerical linear algebra, and orthogonal projections in particular are useful tools for solving minimization problems.
A projection is an idempotent linear operator, i.e., a linear map $\pi : V \to V$ satisfying $\pi^2 = \pi$.
Facts. Let $\pi : V \to V$ be a projection. Then

1. $v \in \range{\pi}$ if and only if $\pi v = v$;
2. $I - \pi$ is also a projection;
3. $\range{I - \pi} = \null{\pi}$;
4. $\null{I - \pi} = \range{\pi}$;
5. $\range{\pi} \cap \null{\pi} = \{0\}$;
6. $V = \range{\pi} \oplus \null{\pi}$.

Fact (6) shows that a projection $\pi$ separates $V$ into a direct sum decomposition:

$$V = \range{\pi} \oplus \null{\pi}.$$
If $R$ and $N$ are any subspaces of $V$ such that $V = R \oplus N$, then there exists a projection $\pi$ such that $\range{\pi} = R$ and $\null{\pi} = N$. To see this, let $\pi : V \to V$ be the linear map such that $\pi r = r$ for $r \in R$ and $\pi n = 0$ for $n \in N$ (this determines $\pi$, since every vector in $V$ decomposes uniquely over $R$ and $N$). Then for $v \in V$, with $v = r + n$, $r \in R$, $n \in N$,

$$\pi^2 v = \pi(\pi(r + n)) = \pi r = r = \pi v,$$

so $\pi$ is indeed a projection, with $\range{\pi} = R$ and $\null{\pi} = N$.
Because $V = \range{\pi} \oplus \null{\pi}$, every $v \in V$ can be represented uniquely as $v = u + w$, with $u \in \range{\pi}$, $w \in \null{\pi}$. In fact, $u = \pi v$ and $w = (I - \pi)v$, since $\pi v = \pi u + \pi w = u$ and $(I - \pi)v = v - \pi v = w$. For this reason, $I - \pi$ is called the complementary projection corresponding to $\pi$.
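As a small numerical illustration, here is a sketch in NumPy (the particular matrix below is my own choice for the example): an oblique, i.e., non-orthogonal, projection on $\Reals^2$ is idempotent, its complement $I - \pi$ is also a projection, and together they split any vector into its two components.

import numpy as np

# An oblique (non-orthogonal) projection on R^2: it maps onto the x-axis
# along the direction (1, -1). The matrix is chosen only for illustration.
P = np.array([[1., 1.], [0., 0.]])
I = np.eye(2)

print(np.allclose(P @ P, P))                  # True: P is idempotent
print(np.allclose((I - P) @ (I - P), I - P))  # True: I - P is also a projection

v = np.array([2., 3.])
u, w = P @ v, (I - P) @ v                     # v = u + w, u in range(P), w in null(P)
print(u, w, np.allclose(u + w, v))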
Eigenvalues. To every projection $\pi : V \to V$, there corresponds a basis of $V$ consisting of eigenvectors of $\pi$; furthermore, the eigenvalues corresponding to these eigenvectors are $1$ and $0$. To see this, consider a basis $r_1, \ldots, r_k, n_1, \ldots, n_m$ of $V$ such that $r_1, \ldots, r_k$ is a basis of $\range{\pi}$ and $n_1, \ldots, n_m$ is a basis of $\null{\pi}$ (note that such a basis of $V$ is well-defined, since $V = \range{\pi} \oplus \null{\pi}$, which means that $0$ is the only vector belonging to both $\range{\pi}$ and $\null{\pi}$). The vectors $r_1, \ldots, r_k$ are evidently in the eigenspace of $\lambda = 1$, and $n_1, \ldots, n_m$ are in the eigenspace of $\lambda = 0$. Since $r_1, \ldots, r_k, n_1, \ldots, n_m$ is a list of $\dim V$ linearly independent eigenvectors of $\pi$, the eigenvalues of $\pi$ are precisely $1$ and $0$ (provided both $\range{\pi}$ and $\null{\pi}$ are nonzero).
From this characterization of the spectrum, it is clear that the trace of any projection (i.e., the sum of its eigenvalues) is equal to the dimension of the subspace into which it projects.
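For a quick numerical check of the spectrum and the trace, here is a sketch (the projection below is an orthogonal one built from a random orthonormal basis, but any projection matrix would do):

import numpy as np

# Orthogonal projection onto a random 2-dimensional subspace of R^3.
Q1 = np.linalg.qr(np.random.randn(3, 2))[0]   # orthonormal basis of the subspace
P = Q1 @ Q1.T

print(np.round(np.linalg.eigvalsh(P), 6))     # eigenvalues: 0 and 1 (here: [0, 1, 1])
print(np.round(np.trace(P), 6))               # trace = dim(range(P)) = 2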
An orthogonal projection is a projection $\pi$ satisfying $\null{\pi} = \range{\pi}^\perp$. If we write $U = \range{\pi}$, then it is customary to say that such an operator is an orthogonal projection onto $U$. For clarity, we sometimes denote the orthogonal projection onto $U$ as $\pi_U$. Because $V = U \oplus U^\perp$, the orthogonal projection onto $U$ is unique: it decomposes every $v \in V$ as $v = \pi_U v + (I - \pi_U)v$, where the first term is in $U$ and the second one is in $U^\perp$, since $\range{I - \pi_U} = \null{\pi_U} = U^\perp$.
Note that orthogonal projections satisfy $\inner{\pi v}{v - \pi v} = 0$ for all $v \in V$. Several other properties of orthogonal projections are listed below.
Orthogonal projections are self-adjoint. We can characterize orthogonal projections algebraically: a projection $\pi$ onto a subspace $U$ is orthogonal if and only if $\pi$ is self-adjoint, i.e., $\pi = \pi^*$.
To see this, first assume that $\pi$ is an orthogonal projection. Let $v$ and $w$ be two vectors in $V$, and write $v = v_1 + v_2$, $w = w_1 + w_2$, $v_1, w_1 \in U$, $v_2, w_2 \in U^\perp$. Then $\inner{\pi v}{w} = \inner{v_1}{w_1 + w_2} = \inner{v_1}{w_1}$. Similarly, $\inner{v}{\pi w} = \inner{v_1 + v_2}{w_1} = \inner{v_1}{w_1}$. This shows that $\pi = \pi^*$.
For the other direction, assume that $\pi$ is a self-adjoint projection onto $U$. Then, for all $v, w \in V$, $\inner{(I - \pi)v}{\pi w} = \inner{\pi(I - \pi)v}{w} = \inner{(\pi - \pi^2)v}{w} = 0$. That is, $\null{\pi} = \range{I - \pi}$ is orthogonal to $\range{\pi} = U$, so $\pi$ is an orthogonal projection.
Orthogonal projections are positive semidefinite. Let $\pi$ be an orthogonal projection onto some subspace. Since $\pi$ is a projection, for all $v \in V$, $\inner{\pi v}{v} = \inner{\pi^2 v}{v}$. Since $\pi$ is an orthogonal projection, $\pi = \pi^*$, so $\inner{\pi^2 v}{v} = \inner{\pi v}{\pi v} = \norm{\pi v}^2 \geq 0$. Hence $\pi$ is positive semidefinite.
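Both properties are easy to verify numerically. The sketch below uses a random subspace and checks the identity $\inner{\pi v}{v} = \norm{\pi v}^2$ from the argument above directly:

import numpy as np

# Orthogonal projection onto a random 2-dimensional subspace of R^4.
Q1 = np.linalg.qr(np.random.randn(4, 2))[0]
P = Q1 @ Q1.T

print(np.allclose(P, P.T))                                 # True: self-adjoint
v = np.random.randn(4)
print(np.allclose(v @ P @ v, np.linalg.norm(P @ v) ** 2))  # True: <Pv, v> = ||Pv||^2 >= 0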
Eigenvalue decomposition. Because any orthogonal projection $\pi_U$ onto a subspace $U$ is self-adjoint, it has an orthonormal basis of eigenvectors. Let $q_1, \ldots, q_n$ be an orthonormal basis of eigenvectors for $\pi_U$, with $q_1, \ldots, q_k$ a basis for $U$. With respect to the basis $q_1, \ldots, q_n$, the matrix of $\pi_U$ is

$$\mat{\pi_U} = \begin{bmatrix} I_k & 0 \\ 0 & 0 \end{bmatrix},$$

where $I_k$ is the $k \times k$ identity block. Hence, if

$$Q = \begin{bmatrix} q_1 & q_2 & \cdots & q_n \end{bmatrix},$$

then the matrix of $\pi_U$ with respect to the standard basis is

$$Q \begin{bmatrix} I_k & 0 \\ 0 & 0 \end{bmatrix} Q^T = Q_1 Q_1^T,$$

where $Q_1$ is obtained by dropping the last $n - k$ columns of $Q$.
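The following sketch runs this in reverse: it eigendecomposes an orthogonal projection with NumPy's eigh (the threshold 0.5 used below to pick out the eigenvalue-one eigenvectors is an arbitrary choice of mine) and recovers an orthonormal basis of $U$.

import numpy as np

# Orthogonal projection onto a random 2-dimensional subspace of R^4.
Q1 = np.linalg.qr(np.random.randn(4, 2))[0]
P = Q1 @ Q1.T

w, V = np.linalg.eigh(P)                 # eigenvalues (ascending) and orthonormal eigenvectors
basis = V[:, w > 0.5]                    # eigenvectors with eigenvalue 1 span U
print(np.allclose(basis @ basis.T, P))   # True: same projection, different basis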
Nonexpansive property. An orthogonal projection $\pi_U$ onto a subspace $U$ is nonexpansive: for all $v \in V$,

$$\norm{\pi_U v} \leq \norm{v}.$$

This is readily shown by writing $v = u + w$, $u \in U$, $w \in U^\perp$, and noting that $\norm{\pi_U v}^2 = \norm{u}^2 \leq \norm{u}^2 + \norm{w}^2 = \norm{v}^2$, where the last equality is justified by the Pythagorean theorem. Note that this also implies that $\norm{\pi_U v_1 - \pi_U v_2} = \norm{\pi_U(v_1 - v_2)} \leq \norm{v_1 - v_2}$ for all $v_1$ and $v_2$ in $V$.
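A quick numerical check of nonexpansiveness, sketched with a random subspace and random vectors:

import numpy as np

Q1 = np.linalg.qr(np.random.randn(5, 3))[0]   # orthonormal basis of a 3-dimensional subspace U
P = Q1 @ Q1.T                                 # orthogonal projection onto U

v, w = np.random.randn(5), np.random.randn(5)
print(np.linalg.norm(P @ v) <= np.linalg.norm(v))              # True: ||pi_U v|| <= ||v||
print(np.linalg.norm(P @ v - P @ w) <= np.linalg.norm(v - w))  # True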
Orthogonal projection via the QR factorization. If $q_1, \ldots, q_k$ is an orthonormal basis for a subspace $U$ of $\Reals^n$, then for each $v \in \Reals^n$,

$$\pi_U v = \sum_{i=1}^{k} \inner{q_i}{v} q_i.$$

From the above equation, it is clear that the matrix of $\pi_U$ is given by $Q_1 Q_1^T$, where $Q_1 = \begin{bmatrix} q_1 & \cdots & q_k \end{bmatrix}$. An important special case is the rank-one orthogonal projection onto the span of a unit vector $q$: the matrix of $\pi_{\span(q)}$ is $qq^T$.
The complementary projection for an orthogonal projection $\pi_U$ is given by $I - \pi_U$, which is an orthogonal projection onto $U^\perp$ since $\range{I - \pi_U} = \null{\pi_U} = U^\perp$. For example, the orthogonal projection onto $\span(q)^\perp$ is given by $I - qq^T$.
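Here is a sketch of the rank-one case, using a random vector normalized to unit length: the projection onto $\span(q)$ and its complement.

import numpy as np

q = np.random.randn(4)
q /= np.linalg.norm(q)           # unit vector, so the projector is q q^T

P = np.outer(q, q)               # orthogonal projection onto span(q)
C = np.eye(4) - P                # orthogonal projection onto span(q)^perp

v = np.random.randn(4)
print(np.allclose(P @ v, (q @ v) * q))   # True: P v = <q, v> q
print(np.allclose(q @ (C @ v), 0))       # True: C v is orthogonal to q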
To compute the orthogonal projection onto the span of a list of linearly independent vectors $a_1, \ldots, a_k \in \Reals^n$, first perform modified Gram-Schmidt to obtain a list of orthonormal vectors $q_1, \ldots, q_k$, and then use the above fact. Equivalently, arrange the $a_i$ as the columns of a matrix $A \in \Reals^{n \times k}$, and write

$$A = \begin{bmatrix} Q_1 & Q_2 \end{bmatrix} \begin{bmatrix} R_1 \\ 0 \end{bmatrix}$$

via the full QR factorization. Then $Q_1 Q_1^T$ is the orthogonal projection onto $\range{A}$, and $Q_2 Q_2^T$ is the orthogonal projection onto $\range{A}^\perp$. Because orthogonal projections are unique, $Q_2 Q_2^T = I - Q_1 Q_1^T$, i.e., the complementary projection projects onto $\range{A}^\perp$. You can play around with these facts in code:
import numpy as np

# A 4 x 3 matrix with linearly independent columns, so that the columns
# span a 3-dimensional subspace of R^4.
A = np.array([[1., 0., 2.], [0., 1., 1.], [1., 1., 0.], [2., 0., 1.]])
Q1 = np.linalg.qr(A)[0]   # reduced QR: orthonormal basis for range(A)
pi = Q1 @ Q1.T            # orthogonal projection onto range(A)
cmpl = np.eye(4) - pi     # complementary projection, onto range(A)^perp
v = np.random.randn(4)
print(A.T @ (cmpl @ v))   # cmpl @ v lies in null(A^T), so this is ~0
prints a vector of three entries that are zero up to floating-point roundoff (on the order of $10^{-16}$), confirming that cmpl @ v is orthogonal to the columns of A.
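Since the text appeals to modified Gram-Schmidt, here is a minimal sketch of it (the function name mgs is mine), checked against the QR-based projector; the orthonormal bases may differ, but the projection matrix is unique.

import numpy as np

def mgs(A):
    # Modified Gram-Schmidt: returns a matrix whose columns are an
    # orthonormal basis for range(A). Assumes linearly independent columns.
    A = np.array(A, dtype=float)
    n, k = A.shape
    Q = np.zeros((n, k))
    for i in range(k):
        Q[:, i] = A[:, i] / np.linalg.norm(A[:, i])
        for j in range(i + 1, k):
            A[:, j] -= (Q[:, i] @ A[:, j]) * Q[:, i]
    return Q

A = np.random.randn(4, 3)
Q = mgs(A)
Q1 = np.linalg.qr(A)[0]
print(np.allclose(Q @ Q.T, Q1 @ Q1.T))   # True: same orthogonal projection onto range(A)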
Orthogonal projection with an arbitrary basis. We can express the orthogonal projection onto the range of a matrix $A \in \Reals^{m \times n}$, $m \geq n$, $\rank(A) = n$, in terms of $A$ itself. Let $\hat{b}$ be the projection of $b \in \Reals^m$ onto $\range{A}$. Then the vector $b - \hat{b}$ must be in $\range{A}^\perp = \null{A^T}$, that is, $A^T(b - \hat{b}) = 0$. Since $\hat{b} \in \range{A}$, there exists an $x \in \Reals^n$ such that $\hat{b} = Ax$, so we can write the previous equation as $A^T(b - Ax) = 0$, or $A^TAx = A^Tb$. This lets us conclude that the projection of $b$ onto $\range{A}$ is

$$\hat{b} = Ax = A(A^TA)^{-1}A^Tb.$$

Note that if we substitute $Q_1R_1$ for $A$ (where $Q_1$ and $R_1$ are the block matrices from $A$'s full QR factorization), we obtain the familiar expression $Q_1Q_1^Tb$ for the projection of $b$ onto $\range{A}$.
The matrix $(A^TA)^{-1}A^T$ is called the (left) pseudo-inverse of $A$; it is denoted $A^\dagger$. $A^\dagger$ exists if and only if $A$ has at least as many rows as columns and is full-rank (because the Gram matrix $A^TA$ is invertible if and only if $A$ has linearly independent columns). If $b \in \range{A}$, then $x = A^\dagger b$ is a solution to the linear equation $Ax = b$, since $AA^\dagger b$ is the projection of $b$ onto $\range{A}$, which in this case is $b$ itself.
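A sketch of the pseudo-inverse in code (the matrix is random, hence full column rank with probability one; in practice one would call np.linalg.pinv or a least squares solver rather than forming the inverse explicitly):

import numpy as np

A = np.random.randn(5, 3)                        # tall, full column rank (almost surely)
A_dagger = np.linalg.inv(A.T @ A) @ A.T          # left pseudo-inverse (A^T A)^{-1} A^T

print(np.allclose(A_dagger, np.linalg.pinv(A)))  # True: matches the Moore-Penrose pseudo-inverse
print(np.allclose(A_dagger @ A, np.eye(3)))      # True: a left inverse of A

b = A @ np.random.randn(3)                       # a vector in range(A)
print(np.allclose(A @ (A_dagger @ b), b))        # True: x = A^+ b solves A x = b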
Minimum distance to a subspace. Let $U$ be a subspace of $V$, $v \in V$. Then for all $u \in U$,

$$\norm{v - \pi_U v} \leq \norm{v - u},$$

i.e., $\pi_U v$ is the vector in $U$ that is closest to $v$ (with respect to the induced norm). To see why, first note that

$$\norm{v - u}^2 = \norm{(v - \pi_U v) + (\pi_U v - u)}^2.$$

Because $v - \pi_U v \in U^\perp$ and $\pi_U v - u \in U$, the righthand side is equal to $\norm{v - \pi_U v}^2 + \norm{\pi_U v - u}^2 \geq \norm{v - \pi_U v}^2$, by the Pythagorean theorem. This completes the proof.
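The following sketch samples a few points of a random subspace $U$ and confirms that none of them is closer to $v$ than $\pi_U v$:

import numpy as np

Q1 = np.linalg.qr(np.random.randn(5, 2))[0]   # orthonormal basis of a subspace U
P = Q1 @ Q1.T                                 # orthogonal projection onto U

v = np.random.randn(5)
best = np.linalg.norm(v - P @ v)
others = [np.linalg.norm(v - Q1 @ np.random.randn(2)) for _ in range(1000)]
print(best <= min(others))                    # True: pi_U v beats every sampled point of U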
This fact is the key to solving least squares problems, in which we are given a matrix $A \in \Reals^{m \times n}$ and a vector $b \in \Reals^m$, and the goal is to find a vector $x \in \Reals^n$ that minimizes $\norm{Ax - b}$. We know that any $x$ that minimizes this expression must be such that $Ax$ is the orthogonal projection of $b$ onto the range of $A$. If $m \geq n$ and $\rank(A) = n$, then the solution to this problem is unique, and it is given by $x = A^\dagger b = (A^TA)^{-1}A^Tb$ (since, as proved in the previous section, $AA^\dagger b$ is the projection of $b$ onto $\range{A}$).
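Finally, a sketch of a least squares solve, comparing the normal-equations solution $(A^TA)^{-1}A^Tb$ with NumPy's least squares solver and checking that the residual is orthogonal to $\range{A}$:

import numpy as np

A = np.random.randn(6, 3)                       # tall, full column rank (almost surely)
b = np.random.randn(6)

x_normal = np.linalg.inv(A.T @ A) @ A.T @ b     # x = A^+ b via the normal equations
x_lstsq = np.linalg.lstsq(A, b, rcond=None)[0]  # NumPy's least squares solver

print(np.allclose(x_normal, x_lstsq))           # True
print(np.allclose(A.T @ (b - A @ x_normal), 0)) # True: residual is orthogonal to range(A)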