Lecture Notes

16. Gram-Schmidt and the QR Factorization.

Part of the Series on Linear Algebra.

By Akshay Agrawal. Last updated Nov. 3, 2018.


In this section, we will present a procedure for converting any list of linearly independent vectors into a list of orthonormal vectors that preserves spans. Applying this procedure to the columns of a matrix $A$ and encoding the correspondence between the two lists of vectors as the product of two matrices yields the QR factorization, one of the most important factorizations in applied linear algebra.

The Gram-Schmidt Procedure

Let $v_1, v_2, \ldots, v_n$ be a list of linearly independent vectors. Define $q_1 := v_1 / \|v_1\|$ , and for $k = 2, \ldots, n$ , let

$\begin{align*} \tilde{q_k} &:= v_k - \inner{v_k}{q_1}q_1 - \inner{v_k}{q_2}q_2 - \cdots - \inner{v_k}{q_{k-1}}q_{k-1} \\ q_k &:= \frac{\tilde{q_k}}{\|\tilde{q_k}\|}. \end{align*}$

Then $q_1, q_2, \ldots, q_n$ is orthonormal, and $\span (q_1, q_2, \ldots, q_n) = \span (v_1, v_2, \ldots, v_n)$ .
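
As a small worked example (the vectors are chosen here for illustration and are not from the original notes), take $v_1 = (3, 4)$ and $v_2 = (1, 0)$ in $\RR^2$ . Then $\|v_1\| = 5$ , $\inner{v_2}{q_1} = 3/5$ , and

$\begin{align*} q_1 &= \frac{v_1}{\|v_1\|} = \tfrac{1}{5}(3, 4), \\ \tilde{q_2} &= v_2 - \inner{v_2}{q_1}q_1 = (1, 0) - \tfrac{3}{25}(3, 4) = \tfrac{1}{25}(16, -12), \\ q_2 &= \frac{\tilde{q_2}}{\|\tilde{q_2}\|} = \tfrac{1}{5}(4, -3), \end{align*}$

since $\|\tilde{q_2}\| = 4/5$ . One can check directly that $\inner{q_1}{q_2} = (12 - 12)/25 = 0$ and $\|q_1\| = \|q_2\| = 1$ .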

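When Gram-Schmidt is applied to a list of linearly dependent vectors, the procedure breaks down: at the first $k$ for which $v_k$ lies in $\span (v_1, v_2, \ldots, v_{k-1})$ , we get $\tilde{q_k} = 0$ , which cannot be normalized. Gram-Schmidt can therefore be used to determine whether a list of vectors is linearly dependent.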
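
The procedure and the dependence check can be sketched in a few lines of plain Python. This is a minimal illustration, not a numerically robust implementation: the function names `dot` and `gram_schmidt` and the relative tolerance `tol` are assumptions introduced here, since in floating point $\tilde{q_k}$ is rarely exactly zero.

```python
import math

def dot(u, v):
    """Standard inner product of two vectors given as lists."""
    return sum(a * b for a, b in zip(u, v))

def gram_schmidt(vectors, tol=1e-12):
    """Return orthonormal q_1, ..., q_n with the same span as the input.

    Raises ValueError if some v_k is (numerically) in the span of the
    earlier vectors. `tol` is an assumed relative threshold.
    """
    qs = []
    for k, v in enumerate(vectors, start=1):
        # Coefficients <v_k, q_i> against every previously computed q_i.
        coeffs = [dot(v, q) for q in qs]
        # q~_k = v_k - sum_i <v_k, q_i> q_i
        q_tilde = list(v)
        for c, q in zip(coeffs, qs):
            q_tilde = [x - c * y for x, y in zip(q_tilde, q)]
        norm = math.sqrt(dot(q_tilde, q_tilde))
        if norm <= tol * math.sqrt(dot(v, v)):
            raise ValueError(f"v_{k} is in the span of the earlier vectors")
        qs.append([x / norm for x in q_tilde])
    return qs
```

Calling `gram_schmidt([[3.0, 4.0], [1.0, 0.0]])` returns two orthonormal vectors, while a dependent list such as `[[1.0, 1.0, 0.0], [2.0, 2.0, 0.0]]` raises `ValueError`.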

The QR Factorization

The Reduced QR Factorization

Let $A \in \RR^{m \times n}$ be full rank, $m \geq n$ , and let $a_1, a_2, \ldots, a_n$ denote the columns of $A$ . Running Gram-Schmidt on the columns of $A$ yields a list of vectors $q_1, q_2, \ldots, q_n$ satisfying

$\begin{align*} a_1 &= r_{11} q_1 \\ a_2 &= r_{12} q_1 + r_{22} q_2 \\ &\vdots \\ a_n &= r_{1n} q_1 + r_{2n} q_2 + \ldots + r_{nn} q_n, \\ \end{align*}$

where $r_{11} = \|a_1\|$ , $r_{ii} = \|\tilde{q_i}\|$ for $i \geq 2$ , and $r_{ij} = \inner{a_j}{q_i}$ for $i < j$ . This relation can be expressed succinctly as $A = \hat Q \hat R$ , where $\hat Q \in \RR^{m \times n}$ has columns $q_1, q_2, \ldots, q_n$ and $\hat R \in \RR^{n \times n}$ is an upper triangular matrix with entries $r_{ij}$ (so $r_{ij} = 0$ for $i > j$ ). This decomposition is called the reduced QR factorization of $A$ .
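
To make the bookkeeping concrete, here is a minimal plain-Python sketch that records the coefficients $r_{ij}$ while running Gram-Schmidt on the columns of $A$ . The function name `reduced_qr` and the column-list representation of matrices are assumptions made for this illustration, and the sketch assumes the columns are linearly independent (it does not guard against a zero $\tilde{q_j}$ ).

```python
import math

def reduced_qr(columns):
    """Reduced QR via classical Gram-Schmidt.

    columns: the n linearly independent columns a_1..a_n of A, each a list
             of length m.
    Returns (q_cols, R): q_cols is the list of columns q_1..q_n of Q-hat,
    and R is the n x n upper triangular R-hat stored as a list of rows.
    """
    n = len(columns)
    q_cols = []
    R = [[0.0] * n for _ in range(n)]
    for j, a in enumerate(columns):
        q_tilde = list(a)
        for i, q in enumerate(q_cols):
            # r_ij = <a_j, q_i> for i < j
            R[i][j] = sum(x * y for x, y in zip(a, q))
            q_tilde = [x - R[i][j] * y for x, y in zip(q_tilde, q)]
        # r_jj = ||q~_j|| (for j = 1 this is ||a_1||)
        R[j][j] = math.sqrt(sum(x * x for x in q_tilde))
        q_cols.append([x / R[j][j] for x in q_tilde])
    return q_cols, R
```

Each column can then be recovered as $a_j = \sum_{i \leq j} r_{ij} q_i$ , which is the column-by-column reading of $A = \hat Q \hat R$ .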

With a slight modification to the Gram-Schmidt procedure, we can obtain a QR factorization for rank-deficient matrices $A \in \RR^{m \times n}$ , $m \geq n$ , $\rank A = r < n$ . Run Gram-Schmidt on the columns of $A$ as before. However, whenever $\tilde{q_k} = 0$ (i.e., whenever $a_k$ is in the span of the previously formed $q_i$ ), do not form $q_k$ and instead continue to $a_{k+1}$ . In this fashion, we obtain $\hat Q \in \RR^{m \times r}$ with orthonormal columns and an upper triangular $\hat R \in \RR^{r \times n}$ such that $A = \hat Q \hat R$ (the $i$ -th column of $\hat R$ is taken to be the coordinates of $a_i$ with respect to the columns of $\hat Q$ ).
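
The modified procedure can be sketched as follows in plain Python. The function name `rank_deficient_qr` and the relative tolerance `tol` used to decide when $\tilde{q_k}$ counts as zero are assumptions for this illustration; in floating point an exact zero test is rarely appropriate.

```python
import math

def rank_deficient_qr(columns, tol=1e-10):
    """Gram-Schmidt that skips columns already in the span of earlier ones.

    columns: the n columns of A (lists of length m), possibly dependent.
    Returns (q_cols, R): the r orthonormal columns of Q-hat and the
    r x n matrix R-hat (list of rows), where r is the numerical rank.
    """
    q_cols = []
    coords = []  # coords[j]: coordinates of a_j w.r.t. the q's known so far
    for a in columns:
        c = [sum(x * y for x, y in zip(a, q)) for q in q_cols]
        q_tilde = list(a)
        for ci, q in zip(c, q_cols):
            q_tilde = [x - ci * y for x, y in zip(q_tilde, q)]
        norm = math.sqrt(sum(x * x for x in q_tilde))
        scale = math.sqrt(sum(x * x for x in a))
        if norm > tol * scale:
            # a is not in the span of the current q's: form a new q.
            q_cols.append([x / norm for x in q_tilde])
            c.append(norm)
        coords.append(c)
    r = len(q_cols)
    # Column j of R-hat is coords[j], padded with zeros to length r.
    R = [[coords[j][i] if i < len(coords[j]) else 0.0
          for j in range(len(columns))] for i in range(r)]
    return q_cols, R
```

For the columns `[1, 0, 0]`, `[2, 0, 0]`, `[1, 1, 0]`, the second column is skipped, so the sketch produces two orthonormal vectors and a $2 \times 3$ matrix $\hat R$ with rows `[1, 2, 1]` and `[0, 0, 1]`.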

The Full QR Factorization

Let $A \in \RR^{m \times n}$ , $m \geq n$ , $\rank A = r \leq n$ . The full QR factorization of $A$ is formed as follows: construct $\hat Q \in \RR^{m \times r}$ , $\hat R \in \RR^{r \times n}$ as described in the previous section, and relabel them as $Q_1 = \hat Q$ , $R_1 = \hat R$ . Let $q_1, q_2, \ldots, q_r$ be the columns of $Q_1$ . Extend this list to an orthonormal basis of $\RR^m$ to obtain $q_1, q_2, \ldots, q_r, q_{r+1}, \ldots, q_m$ , and let $Q_2$ be the matrix whose columns are $q_{r+1}, q_{r+2}, \ldots, q_m$ . The full QR factorization of $A$ is written as

$A = \begin{bmatrix} Q_1 & Q_2 \end{bmatrix} \begin{bmatrix}R_1 \\ 0 \end{bmatrix}= QR,$

where $Q \in \RR^{m \times m}$ is the block matrix of $Q_1, Q_2$ , and $R \in \RR^{m \times n}$ the block matrix of $R_1, 0$ .
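
One way to carry out the extension step (an assumption of this sketch; the notes do not prescribe a method) is to continue Gram-Schmidt on the standard basis vectors $e_1, \ldots, e_m$ , skipping any that already lie in the span of the vectors collected so far. The function names, the column-list representation, and the tolerance `tol` below are all illustrative, and the sketch assumes $r \geq 1$ .

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def extend_to_orthonormal_basis(q1_cols, m, tol=1e-10):
    """Return q_{r+1}, ..., q_m so that q1_cols plus the result is an
    orthonormal basis of R^m."""
    basis = [list(q) for q in q1_cols]
    for i in range(m):
        if len(basis) == m:
            break
        e = [1.0 if k == i else 0.0 for k in range(m)]
        # Remove the components of e_i along every vector found so far.
        coeffs = [dot(e, q) for q in basis]
        q_tilde = e[:]
        for c, q in zip(coeffs, basis):
            q_tilde = [x - c * y for x, y in zip(q_tilde, q)]
        norm = math.sqrt(dot(q_tilde, q_tilde))
        if norm > tol:  # e_i contributes a new direction
            basis.append([x / norm for x in q_tilde])
    return basis[len(q1_cols):]

def full_qr_from_reduced(q1_cols, R1, m):
    """Assemble Q = [Q_1 Q_2] (as a list of columns) and R = [R_1; 0]
    (as a list of rows) from a reduced factorization."""
    q2_cols = extend_to_orthonormal_basis(q1_cols, m)
    n = len(R1[0])
    R = [row[:] for row in R1] + [[0.0] * n for _ in range(m - len(R1))]
    return [list(q) for q in q1_cols] + q2_cols, R
```

The zero block in $R$ means the columns of $Q_2$ never contribute to the product $QR$ , so any orthonormal completion gives a valid full factorization.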

Here are some important properties of the QR factorization:

$\range{Q_1} = \range{A}$ .
$\range{Q_2} = \range{A}^\perp = \null{A^T}$ .
$Q$ is an orthogonal matrix.
If $A$ is invertible, $A^{-1} = R^{-1}Q^T$ .
If $A \in \RR^{m \times n}$ , $m \geq n$ , $\rank A = n$ , then $A^\dagger = (A^TA)^{-1}A^T = R_1^{-1}Q_1^T$ and the projection onto $\range{A}$ is $AA^\dagger = Q_1Q_1^T$ . (See the notes on least squares for a definition of $A^\dagger$ ).
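
The last property follows in a few lines from $A = Q_1R_1$ . The derivation below uses only that $Q_1^TQ_1 = I$ (the columns of $Q_1$ are orthonormal) and that $R_1 \in \RR^{n \times n}$ is invertible when $\rank A = n$ :

$\begin{align*} A^\dagger = (A^TA)^{-1}A^T &= (R_1^TQ_1^TQ_1R_1)^{-1}R_1^TQ_1^T \\ &= (R_1^TR_1)^{-1}R_1^TQ_1^T \\ &= R_1^{-1}R_1^{-T}R_1^TQ_1^T = R_1^{-1}Q_1^T, \\ AA^\dagger &= Q_1R_1R_1^{-1}Q_1^T = Q_1Q_1^T. \end{align*}$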

Note. The QR factorization is typically defined only for tall-and-skinny matrices, i.e., matrices with at least as many rows as columns ( $m \geq n$ ). For a matrix $A \in \RR^{m \times n}, m < n$ , attempting to do a QR factorization would yield

$A = \begin{bmatrix} Q_1 & Q_2 \end{bmatrix} \begin{bmatrix}R_1 & 0 \\ 0 & 0 \end{bmatrix}.$

The rightmost matrix (which has $m$ rows and $n$ columns) is not upper triangular.

References

Sheldon Axler. Linear Algebra Done Right.
Stephen Boyd and Sanjay Lall. EE 263 Course Notes.
Lloyd Trefethen and David Bau. Numerical Linear Algebra.