Lecture Notes

4. Linear Maps and Matrices

Part of the Series on Linear Algebra.

By Akshay Agrawal. Last updated Sept. 8, 2017.

The concepts we introduced previously (vector spaces, subspaces, and bases), when welded together, produce the core of linear algebra: linear maps and matrices. A linear map is a type of function between vector spaces, and a matrix is a concrete encoding of a linear map.

In this section, we define linear maps, matrices, and some vocabulary used to describe them. In the next section, we provide examples that supplement these definitions with intuition.

4.1 Linear Maps

A linear map from a vector space $V$ to a vector space $W$ is a function $T : V \rightarrow W$ ¹ that is both additive, meaning that $T(u + v) = Tu + Tv$ for all $u \in V$ and $v \in V$ , and homogeneous, meaning that $T(\lambda v) = \lambda Tv$ for all $\lambda \in \Field$ and all $v \in V$ . Linear maps are sometimes called linear transformations or linear functions².

Examples

Identity. The map that leaves vectors unchanged is called the identity map; it is denoted by $I$ . That is, $Iv = v$ for every vector $v$ .

Zero. The map that sends all vectors to $0$ is called the zero map; it is denoted by $0$ . It should be clear from context when $0$ stands for a map and when it stands for a vector.

Linear Functionals. The map that sends vectors in $\Reals^3$ to the sum of their components is a linear map. This map is an example of a linear functional, i.e., a linear map from a vector space to its underlying field.

Other examples. Matrix multiplication, differentiation, and integration over an interval are all linear maps. In fact, as we shall see in 4.2, every linear map between finite-dimensional vector spaces can be encoded as a matrix, and every application of such a map to a vector can be encoded as a matrix multiplication.
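
The sum-of-components functional from the examples above can be spot-checked numerically. The following is only a minimal sketch: the function names and the sample vectors are made up for illustration, and testing a few inputs does not prove linearity, it merely illustrates the two defining properties.

```python
def component_sum(v):
    """Linear functional on R^3: send a vector to the sum of its components."""
    return v[0] + v[1] + v[2]


def add(u, v):
    """Componentwise vector addition."""
    return tuple(a + b for a, b in zip(u, v))


def scale(lam, v):
    """Multiply every component of v by the scalar lam."""
    return tuple(lam * a for a in v)


# Hypothetical sample inputs, chosen arbitrarily.
u, v, lam = (1, 2, 3), (4, -5, 6), 7

# Additivity: T(u + v) == Tu + Tv
additive = component_sum(add(u, v)) == component_sum(u) + component_sum(v)

# Homogeneity: T(lam * v) == lam * Tv
homogeneous = component_sum(scale(lam, v)) == lam * component_sum(v)

print(additive, homogeneous)
```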

Operations on linear maps

Addition. $T + S$ is defined such that $(T + S)(v) = Tv + Sv$ .

Scalar Multiplication. $\lambda T$ is defined such that $(\lambda T) v = \lambda (Tv)$ , where $\lambda$ is a scalar.

Products. The product of two linear maps denotes composition: $ST = S \circ T$ , so $ST(v) = S(T(v))$ ; this product only makes sense if the range of $T$ is a subset of the domain of $S$ .

Equipped with the addition and scalar multiplication above, the set of all linear maps from one vector space, $V,$ to another, $W,$ is itself a vector space. This vector space is denoted by $\mathcal{L}(V, W)$ .
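
The three operations above can be sketched as functions that take linear maps and return new ones. This is an illustrative sketch only: maps are modeled as plain Python functions on tuples, and `swap` and `stretch` are hypothetical example maps on $\Reals^2$ that do not appear elsewhere in these notes.

```python
def add_maps(T, S):
    """Return T + S, defined by (T + S)(v) = Tv + Sv."""
    return lambda v: tuple(a + b for a, b in zip(T(v), S(v)))


def scale_map(lam, T):
    """Return lam * T, defined by (lam * T)(v) = lam * (Tv)."""
    return lambda v: tuple(lam * a for a in T(v))


def compose(S, T):
    """Return the product ST, defined by (ST)(v) = S(T(v))."""
    return lambda v: S(T(v))


def swap(v):
    """Hypothetical linear map on R^2 that exchanges the two coordinates."""
    x, y = v
    return (y, x)


def stretch(v):
    """Hypothetical linear map on R^2 that doubles the first coordinate."""
    x, y = v
    return (2 * x, y)


v = (3, 5)
sum_result = add_maps(swap, stretch)(v)      # (5, 3) + (6, 5) = (11, 8)
scaled_result = scale_map(4, swap)(v)        # 4 * (5, 3) = (20, 12)
product_result = compose(stretch, swap)(v)   # stretch((5, 3)) = (10, 3)
reversed_result = compose(swap, stretch)(v)  # swap((6, 5)) = (5, 6)
print(sum_result, scaled_result, product_result, reversed_result)
```

In this particular example the two products disagree on $v$ , which shows that the order of the factors in a product of linear maps can matter.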

You should verify that products of linear maps are associative and that they satisfy the natural distributive properties with respect to addition.

Linear maps are completely determined by their actions upon bases

Let $v_1, v_2, \ldots, v_n$ be a basis for $V$ , and let $w_1, w_2, \ldots, w_n$ be arbitrary vectors in $W$ . Then there exists a linear map $T : V \rightarrow W$ such that $Tv_i = w_i$ for each $i$ , and moreover this map is unique. The implication of uniqueness here is extremely important: a linear map is completely characterized by the values it assigns to basis vectors. This result should not be terribly surprising because (1) each $v \in V$ can be written uniquely as a linear combination of basis vectors and (2) linear maps are additive and homogeneous.
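
To make this concrete, here is a minimal sketch that builds a linear map from nothing but the images of basis vectors. It assumes $V = \Reals^n$ with the standard basis, so that the components of an input tuple are its coordinates; the helper name `linear_map_from_images` and the sample images are hypothetical.

```python
def linear_map_from_images(images):
    """Return the map T on R^n with T(e_i) = images[i] for each standard
    basis vector e_i.

    Additivity and homogeneity force
    T(c_1 e_1 + ... + c_n e_n) = c_1 w_1 + ... + c_n w_n,
    where w_i = images[i].
    """
    m = len(images[0])  # dimension of the target space

    def T(v):
        result = [0] * m
        for c, w in zip(v, images):
            for k in range(m):
                result[k] += c * w[k]
        return tuple(result)

    return T


# Hypothetical images w_1, w_2 in R^3 for the basis vectors e_1, e_2 of R^2.
images = [(1, 0, 1), (2, 3, 0)]
T = linear_map_from_images(images)

print(T((1, 0)))   # image of e_1
print(T((0, 1)))   # image of e_2
print(T((2, -1)))  # 2 * w_1 - 1 * w_2
```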

4.2 Matrices

The important fact that linear maps are completely determined by their actions upon bases gives rise to a compact representation of linear maps: the matrix.

A matrix is a two-dimensional array of numbers. An $m$ -rows-by- $n$ -columns (abbreviated $m\times n$ ) matrix $A$ is represented with a block of numbers:

$A = \begin{bmatrix} A_{1,1} & \ldots & A_{1,n} \\ \vdots & \ddots & \vdots \\ A_{m,1} & \ldots & A_{m,n} \\ \end{bmatrix},$

where $A_{i,j}$ is the entry in the $i$ -th row and the $j$ -th column of $A$ .

The matrix of a vector $v \in V$ with respect to a basis $v_1, v_2, \ldots, v_n$ is an $n \times 1$ array of numbers

$\mat{v} = \begin{bmatrix} c_1 \\ c_2 \\ \vdots \\ c_n \end{bmatrix},$

where the $c_i$ are the (unique) scalars that satisfy

$v = c_1 v_1 + c_2 v_2 + \cdots + c_n v_n.$

The scalars $c_1, c_2, \ldots, c_n$ are called the coordinates of $v$ with respect to the basis $v_1, v_2, \ldots, v_n$ . By convention, if the basis is not otherwise stated, it is assumed to be the standard basis (see the section on bases).
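
As a small worked example (the basis and vector here are chosen purely for illustration), take $V = \Reals^2$ with the basis $v_1 = (1, 1)$ , $v_2 = (1, -1)$ , and consider the vector $v = (3, 1)$ . The coordinates must satisfy $c_1 + c_2 = 3$ and $c_1 - c_2 = 1$ , which gives $c_1 = 2$ and $c_2 = 1$ . Indeed,

$2 \cdot (1, 1) + 1 \cdot (1, -1) = (3, 1),$

so the matrix of $v$ with respect to this basis is $\begin{bmatrix} 2 \\ 1 \end{bmatrix}$ , whereas with respect to the standard basis it is $\begin{bmatrix} 3 \\ 1 \end{bmatrix}$ . The same vector has different coordinates in different bases.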

The matrix is the canonical data structure for representing linear maps

Let $v_1, v_2, \ldots, v_n$ be a basis for a vector space $V$ , let $T$ be any linear transformation from $V$ to another vector space $W$ , and let $v = c_1v_1 + c_2v_2 + \cdots + c_nv_n$ be an arbitrary vector in $V$ . To evaluate $Tv$ , all one needs to know is how $T$ acts upon the basis vectors because $Tv = \sum_{i=1}^{n} c_i Tv_i$ . This motivates the definition of the matrix of a linear map.

Let $T$ be a linear map from $V$ to $W$ , $v_1, v_2, \ldots, v_n$ a basis for $V$ and $w_1, w_2, \ldots, w_m$ a basis for $W$ . The matrix of $T$ with respect to these bases is an $m\times n$ matrix $\mat{T}$ whose $i$ -th column records the coordinates of $T v_i$ in the selected basis for $W$ :

$T v_i = \mat{T}_{1,i}w_1 + \mat{T}_{2,i}w_2 + \cdots + \mat{T}_{m,i}w_m.$

If the bases for $V$ and $W$ are the standard bases, then the $i$ -th column of the matrix of a linear map $T$ is precisely equal to $Tv_i$ . More generally, the $i$ -th column of the matrix for $T$ records where the $i$ -th basis vector for $V$ lands in $W$ after passing through $T$ .
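
The column-by-column construction can be sketched as follows. This is an illustration under stated assumptions: both bases are the standard bases, a matrix is stored as a list of rows, and `matrix_of` and the sample map `T` are hypothetical names introduced here.

```python
def matrix_of(T, n):
    """Return the matrix of T : R^n -> R^m with respect to the standard
    bases, stored as a list of m rows. Column i holds T(e_i)."""
    columns = []
    for i in range(n):
        # e_i is the i-th standard basis vector of R^n.
        e_i = tuple(1 if k == i else 0 for k in range(n))
        columns.append(T(e_i))
    m = len(columns[0])
    # Transpose the list of columns into a list of rows.
    return [[columns[j][i] for j in range(n)] for i in range(m)]


def T(v):
    """Hypothetical linear map from R^2 to R^3."""
    x, y = v
    return (x + 2 * y, 3 * y, x)


A = matrix_of(T, 2)
print(A)  # columns are T(e_1) = (1, 0, 1) and T(e_2) = (2, 3, 0)
```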

Matrix-vector multiplication corresponds to the application of a linear map

Let $T : V \rightarrow W$ be a linear map, and let $v$ be a vector in $V$ whose coordinates are $c_1, c_2, \ldots, c_n$ (with respect to a basis $v_1, v_2, \ldots, v_n$ ). The matrix-vector product $\mat{T}\mat{v}$ yields a vector in $W$ such that

$\begin{align*} \mat{T}\mat{v} &\equiv c_1 \mat{T}_{:, 1} + \cdots + c_n \mat{T}_{:, n} \\ &= c_1 Tv_1 + c_2 Tv_2 + \cdots + c_n Tv_n \\ &= T(c_1 v_1 + c_2 v_2 + \cdots + c_n v_n) \\ &= T(v) \end{align*},$

where $\mat{T}_{:, i}$ denotes the $i$ -th column of $\mat{T}$ . In other words, $\mat{T}\mat{v}$ is a linear combination of the columns of $\mat{T}$ in which the coefficient of the $i$ -th column is the $i$ -th coordinate of $v$ .

This definition is equivalent to a definition you may have previously encountered that casts matrix-vector multiplication as an operation on the rows of $\mat{T}$ : per the row-wise definition, the $i$ -th coordinate of $\mat{T}\mat{v}$ is the dot product of the $i$ -th row of $\mat{T}$ with $\mat{v}$ . Check for yourself that these two definitions are in fact equivalent.
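
Both definitions can be written out and compared on a sample input. The sketch below stores a matrix as a list of rows; the function names and the sample matrix are illustrative assumptions, and agreement on one example is of course not a proof of equivalence.

```python
def matvec_by_columns(A, c):
    """Column-wise definition: add up c[j] times the j-th column of A."""
    m, n = len(A), len(c)
    result = [0] * m
    for j in range(n):
        for i in range(m):
            result[i] += c[j] * A[i][j]
    return result


def matvec_by_rows(A, c):
    """Row-wise definition: entry i is the dot product of row i of A with c."""
    return [sum(a * x for a, x in zip(row, c)) for row in A]


# Hypothetical 3 x 2 matrix and coordinate vector.
A = [[1, 2],
     [0, 3],
     [1, 0]]
c = [2, -1]

by_columns = matvec_by_columns(A, c)  # 2 * (1, 0, 1) - 1 * (2, 3, 0)
by_rows = matvec_by_rows(A, c)
print(by_columns, by_rows)
```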

Matrix multiplication corresponds to the composition of linear maps

Matrix-vector multiplication is an instance of matrix multiplication, in which one matrix is multiplied by another. Matrix multiplication is defined such that it corresponds to the composition of linear maps. Let $T: V \rightarrow W$ and $S: W \rightarrow X$ be linear maps. The definition of matrix multiplication is such that the matrix of the composition $ST$ is precisely equal to the matrix of $S$ times the matrix of $T$ .

Let $T: V \rightarrow W$ and $S: W \rightarrow X$ be linear maps, and let $\dim V = n$ , $\dim W = m$ , and $\dim X = p$ (so $\mat{T}$ is $m \times n$ and $\mat{S}$ is $p \times m$ ). The matrix product $\mat{S}\mat{T}$ is the $p \times n$ matrix whose $i$ -th column equals the product of $\mat{S}$ with the $i$ -th column of $\mat{T}$ ; that is,

$(\mat{S}\mat{T})_{:, i} = \mat{S}(\mat{T}_{:, i}).$

It is instructive to understand why $\mat{ST} = \mat{S}\mat{T}$ . Letting $v_1, v_2, \ldots, v_n$ be our basis for $V$ as usual, the composition $ST$ sends $v_i$ to $(ST)(v_i) = S(T(v_i))$ . By the definition of the matrix of a linear map, this implies that the $i$ -th column of $\mat{ST}$ is precisely equal to $\mat{S}\mat{Tv_i} = \mat{S}(\mat{T}_{:, i})$ , thus furnishing the motivation for our definition of the matrix product above.
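
The column-wise definition of the matrix product, and its agreement with composition, can be sketched numerically. Matrices are stored as lists of rows; the helper names and the sample matrices are hypothetical, and the check covers a single input vector only.

```python
def matvec(A, c):
    """Matrix-vector product; entry i is the dot product of row i with c."""
    return [sum(a * x for a, x in zip(row, c)) for row in A]


def column(A, i):
    """Return the i-th column of A as a list."""
    return [row[i] for row in A]


def matmul(S, T):
    """Matrix product ST: column i of ST is S times column i of T."""
    n = len(T[0])  # number of columns of T
    p = len(S)     # number of rows of S
    cols = [matvec(S, column(T, i)) for i in range(n)]
    # Transpose the list of columns into a list of rows.
    return [[cols[j][i] for j in range(n)] for i in range(p)]


# Hypothetical matrices: T maps R^2 to R^3, and S maps R^3 to R^2.
T = [[1, 2],
     [0, 3],
     [1, 0]]
S = [[1, 0, -1],
     [2, 1, 0]]

ST = matmul(S, T)
v = [2, -1]

composed = matvec(S, matvec(T, v))  # apply T, then S
product = matvec(ST, v)             # apply the single matrix ST
print(ST, composed, product)
```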

Summary

Matrices are used to compactly represent vectors and linear maps. The matrix of a vector encodes the coordinates of the vector with respect to a basis, and the $i$ -th column of the matrix of a linear map $T : V \rightarrow W$ contains the coordinates, with respect to a basis of $W$ , of $T$ applied to the $i$ -th input basis vector. Matrix multiplication is defined to correspond to the composition of linear maps, whereas matrix-vector multiplication corresponds to passing a vector through a linear map.

4.3 Exercises

Prove that every linear map fixes the origin: $T(0) = 0$ for every linear map $T$ .
Convince yourself that the column-wise definition of matrix-vector multiplication is equivalent to the row-wise definition: that is, verify that $Av = \sum_{j=1}^{n} c_j A_{:, j}$ implies $(Av)_i = \sum_{j=1}^{n}A_{i,j}c_j$ , where the $c_j$ are the coordinates of $v$ .

4.4 References

Linear Algebra Done Right, by Sheldon Axler.

4.5 Footnotes

[1]: The notation $T: V \rightarrow W$ means that $T$ is a function that takes as input a vector in $V$ and returns as output a vector in $W$ .

[2]: A function of the form $f(x) = ax + b$ is linear only if $b = 0$ ; otherwise, it is more properly called an affine function. Affine functions are nonetheless sometimes colloquially referred to as linear.