$$ \newcommand{\qed}{\tag*{$\square$}} \newcommand{\span}{\operatorname{span}} \newcommand{\dim}{\operatorname{dim}} \newcommand{\rank}{\operatorname{rank}} \newcommand{\norm}[1]{\|#1\|} \newcommand{\grad}{\nabla} \newcommand{\prox}[1]{\operatorname{prox}_{#1}} \newcommand{\inner}[2]{\langle{#1}, {#2}\rangle} \newcommand{\mat}[1]{\mathcal{M}[#1]} \newcommand{\null}[1]{\operatorname{null} \left(#1\right)} \newcommand{\range}[1]{\operatorname{range} \left(#1\right)} \newcommand{\rowvec}[1]{\begin{bmatrix} #1 \end{bmatrix}^T} \newcommand{\Reals}{\mathbf{R}} \newcommand{\RR}{\mathbf{R}} \newcommand{\Complex}{\mathbf{C}} \newcommand{\Field}{\mathbf{F}} \newcommand{\Pb}{\operatorname{Pr}} \newcommand{\E}[1]{\operatorname{E}[#1]} \newcommand{\Var}[1]{\operatorname{Var}[#1]} \newcommand{\argmin}[2]{\underset{#1}{\operatorname{argmin}} {#2}} \newcommand{\optmin}[3]{ \begin{align*} & \underset{#1}{\text{minimize}} & & #2 \\ & \text{subject to} & & #3 \end{align*} } \newcommand{\optmax}[3]{ \begin{align*} & \underset{#1}{\text{maximize}} & & #2 \\ & \text{subject to} & & #3 \end{align*} } \newcommand{\optfind}[2]{ \begin{align*} & {\text{find}} & & #1 \\ & \text{subject to} & & #2 \end{align*} } $$
The concepts we introduced previously — vector spaces, subspaces, and bases — when welded together produce the core of linear algebra: linear maps and matrices; a linear map is a type of function between vector spaces, and matrices are concrete encodings of linear maps.
In this section, we define linear maps, matrices, and some vocabulary used to describe them. In the next section, we provide examples that supplement these definitions with intuition.
Definition 4.1 A linear map from a vector space $V$ to a vector space $W$ is a function $T : V \to W$[1] that is both additive, meaning that $T(u + v) = Tu + Tv$ for all $u, v \in V$, and homogeneous, meaning that $T(\lambda v) = \lambda (Tv)$ for all scalars $\lambda$ and all $v \in V$. Linear maps are sometimes called linear transformations or linear functions[2].
Identity. The map that leaves vectors unchanged is called the identity map; it is denoted by $I$. That is, $Iv = v$ for every vector $v$.
Zero. The map that sends all vectors to $0$ is called the zero map; it is denoted by $0$. It should be clear from context when $0$ stands for a map and when it stands for a vector.
Linear Functionals. The map that sends vectors in $\RR^n$ to the sum of their components, $x \mapsto x_1 + x_2 + \cdots + x_n$, is a linear map. This map is an example of a linear functional, i.e., a linear map from a vector space to its underlying field (this example is spot-checked numerically below).
Other examples. Matrix multiplication, differentiation, and integration over an interval are all linear maps. In fact, as we shall see in 4.2, every linear map can be encoded as a matrix, and every application of a linear map to a vector can be encoded as a matrix multiplication.
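The defining properties from Definition 4.1 are easy to spot-check numerically. The following is a minimal sketch (in Python with NumPy; it is not part of the original notes) that tests additivity and homogeneity of the sum-of-components functional on a pair of randomly chosen vectors. Such checks only illustrate the definition; they are not a proof of linearity.

```python
import numpy as np

# Spot-check that f(x) = x_1 + ... + x_n (a linear functional on R^n)
# is additive and homogeneous for a couple of random vectors.
def f(x):
    return np.sum(x)

rng = np.random.default_rng(0)
u, v = rng.standard_normal(4), rng.standard_normal(4)
lam = 2.5

assert np.isclose(f(u + v), f(u) + f(v))   # additivity
assert np.isclose(f(lam * u), lam * f(u))  # homogeneity
```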
Addition. The sum $S + T$ of two linear maps $S, T : V \to W$ is defined such that $(S + T)v = Sv + Tv$ for all $v \in V$.
Scalar Multiplication. The scalar multiple $\lambda T$ is defined such that $(\lambda T)v = \lambda (Tv)$, where $\lambda$ is a scalar.
Products. The product $ST$ of two linear maps denotes composition: $ST = S \circ T$, so $(ST)(v) = S(Tv)$; this product only makes sense if the range of $T$ is a subset of the domain of $S$.
Equipped with the addition and scalar multiplication above, the set of all linear maps from one vector space, $V$, to another, $W$, is itself a vector space. This vector space is denoted by $\mathcal{L}(V, W)$.
You should verify that products of linear maps are associative and satisfy the natural distributive properties.
Let $v_1, \ldots, v_n$ be a basis for $V$, and let $w_1, \ldots, w_n$ be arbitrary vectors in $W$. Then there exists a linear map $T : V \to W$ such that $Tv_i = w_i$ for each $i$, and moreover this map is unique. The implication of uniqueness here is extremely important: a linear map is completely characterized by the values it assigns to basis vectors. This result should not be terribly surprising because (1) each $v \in V$ can be written uniquely as a linear combination of the basis vectors and (2) linear maps are additive and homogeneous.
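To make this concrete, here is a small illustrative sketch (Python/NumPy; the particular vectors `w1` and `w2` are my own choice, not from the text). A map $T : \RR^2 \to \RR^3$ is specified only by where it sends the standard basis vectors and is then extended to all of $\RR^2$ by linearity; the extension is additive and homogeneous by construction, which the assertions spot-check.

```python
import numpy as np

# Prescribe the values of T on the standard basis of R^2 ...
w1 = np.array([1.0, 0.0, 2.0])   # T(e1)
w2 = np.array([0.0, 3.0, 1.0])   # T(e2)

# ... and extend by linearity: x = x_1 e1 + x_2 e2, so T(x) = x_1 w1 + x_2 w2.
def T(x):
    return x[0] * w1 + x[1] * w2

u, v, lam = np.array([1.0, -2.0]), np.array([0.5, 4.0]), 3.0
assert np.allclose(T(u + v), T(u) + T(v))   # additive
assert np.allclose(T(lam * u), lam * T(u))  # homogeneous
print(T(np.array([1.0, 1.0])))              # [1. 3. 3.] = w1 + w2
```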
The important fact that linear maps are completely determined by their actions upon bases gives rise to a compact representation of linear maps: the matrix.
Definition 4.2 A matrix is a two-dimensional array of numbers. An $m$-rows-by-$n$-columns (abbreviated $m \times n$) matrix $A$ is represented with a block of numbers:

$$
A = \begin{bmatrix}
A_{11} & A_{12} & \cdots & A_{1n} \\
A_{21} & A_{22} & \cdots & A_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
A_{m1} & A_{m2} & \cdots & A_{mn}
\end{bmatrix},
$$

where $A_{ij}$ is the entry in the $i$-th row and the $j$-th column of $A$.
Definition 4.3 The matrix of a vector $v$ with respect to a basis $v_1, \ldots, v_n$ is an $n \times 1$ array of numbers

$$
\mat{v} = \begin{bmatrix} c_1 \\ c_2 \\ \vdots \\ c_n \end{bmatrix},
$$

where the $c_i$ are the (unique) scalars that satisfy

$$
v = c_1 v_1 + c_2 v_2 + \cdots + c_n v_n.
$$

The scalars $c_1, \ldots, c_n$ are called the coordinates of $v$ with respect to the basis $v_1, \ldots, v_n$. By convention, if the basis is not otherwise stated, it is assumed to be the standard basis (see the section on bases).
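As an illustrative example (not part of the original text), the coordinates of a vector with respect to a non-standard basis can be computed by stacking the basis vectors as the columns of a matrix $B$ and solving $Bc = v$; the basis and vector below are arbitrary choices.

```python
import numpy as np

# Basis b1 = (1, 1), b2 = (1, -1) of R^2, stacked as the columns of B.
B = np.array([[1.0, 1.0],
              [1.0, -1.0]])
v = np.array([3.0, 1.0])

# The coordinates c of v satisfy c_1 b1 + c_2 b2 = v, i.e., B c = v.
c = np.linalg.solve(B, v)
print(c)                      # [2. 1.], so v = 2*b1 + 1*b2
assert np.allclose(B @ c, v)
```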
Let $v_1, \ldots, v_n$ be a basis for a vector space $V$, let $T$ be any linear transformation from $V$ to another vector space $W$, and let $v = c_1 v_1 + c_2 v_2 + \cdots + c_n v_n$ be an arbitrary vector in $V$. To evaluate $Tv$, all one needs to know is how $T$ acts upon the basis vectors, because $Tv = c_1 Tv_1 + c_2 Tv_2 + \cdots + c_n Tv_n$. This motivates the definition of the matrix of a linear map.
Definition 4.4 Let $T$ be a linear map from $V$ to $W$, $v_1, \ldots, v_n$ a basis for $V$, and $w_1, \ldots, w_m$ a basis for $W$. The matrix of a linear map, denoted $\mat{T}$, is an $m \times n$ matrix whose $i$-th column records the coordinates of $Tv_i$ in the selected basis for $W$:

$$
\mat{T} = \begin{bmatrix} \mat{Tv_1} & \mat{Tv_2} & \cdots & \mat{Tv_n} \end{bmatrix}.
$$
If the bases for $V$ and $W$ are the standard bases, then the $i$-th column of the matrix of a linear map is precisely equal to $Te_i$, where $e_i$ is the $i$-th standard basis vector. More generally, the $i$-th column of the matrix for $T$ records where the $i$-th basis vector for $V$ lands in $W$ after passing through $T$.
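For a concrete illustration (my own, building on the differentiation example mentioned earlier), the sketch below assembles the matrix of the differentiation map $D$ from polynomials of degree at most three to polynomials of degree at most two, with respect to the bases $\{1, x, x^2, x^3\}$ and $\{1, x, x^2\}$. Each column records the coordinates of $D$ applied to one input basis vector.

```python
import numpy as np

# D(1) = 0, D(x) = 1, D(x^2) = 2x, D(x^3) = 3x^2; each list below holds the
# coordinates of one of these outputs with respect to the basis {1, x, x^2}.
columns = [
    [0, 0, 0],   # D(1)
    [1, 0, 0],   # D(x)
    [0, 2, 0],   # D(x^2)
    [0, 0, 3],   # D(x^3)
]
M_D = np.column_stack(columns)
print(M_D)
# [[0 1 0 0]
#  [0 0 2 0]
#  [0 0 0 3]]
```

Applying $D$ to a polynomial then reduces to multiplying this matrix by the polynomial's coefficient vector, which is exactly the matrix-vector product defined next.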
Definition 4.5 Let $T : V \to W$ be a linear map, and let $v$ be a vector in $V$ whose coordinates are $c_1, \ldots, c_n$ (with respect to a basis $v_1, \ldots, v_n$). The matrix-vector product $\mat{T}\mat{v}$ yields a vector in $W$ such that

$$
\mat{T}\mat{v} = c_1 \mat{T}_1 + c_2 \mat{T}_2 + \cdots + c_n \mat{T}_n,
$$

where $\mat{T}_i$ denotes the $i$-th column of $\mat{T}$. In other words, $\mat{T}\mat{v}$ is a linear combination of the columns of $\mat{T}$ in which the coefficient of the $i$-th column is the $i$-th coordinate of $v$.
This definition is equivalent to a definition you may have previously encountered that casts matrix-vector multiplication as an operation on the rows of $\mat{T}$: per the row-wise definition, the $i$-th coordinate of $\mat{T}\mat{v}$ is the dot product of the $i$-th row of $\mat{T}$ with $\mat{v}$. Check for yourself that these two definitions are in fact equivalent.
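The sketch below (an arbitrary $2 \times 3$ matrix and vector of my own choosing, illustrative only) computes the product both ways and confirms that each agrees with NumPy's built-in matrix-vector product.

```python
import numpy as np

A = np.array([[1.0, 2.0, 0.0],
              [3.0, -1.0, 4.0]])   # a 2 x 3 matrix
x = np.array([2.0, 1.0, -1.0])     # coordinates of a vector in R^3

# Column view: a linear combination of the columns of A.
column_view = sum(x[i] * A[:, i] for i in range(A.shape[1]))

# Row view: the j-th coordinate is the dot product of the j-th row with x.
row_view = np.array([A[j, :] @ x for j in range(A.shape[0])])

assert np.allclose(column_view, A @ x)
assert np.allclose(row_view, A @ x)
print(A @ x)                       # [4. 1.]
```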
Matrix-vector multiplication is an instance of matrix multiplication, in which one matrix is multiplied by another. Matrix multiplication is defined such that it corresponds to the composition of linear maps. Let $T : U \to V$ and $S : V \to W$ be linear maps. The definition of matrix multiplication is such that the matrix of the composition $ST$ is precisely equal to the matrix of $S$ times the matrix of $T$, i.e., $\mat{ST} = \mat{S}\mat{T}$.
Definition 4.6 Let $T : U \to V$ and $S : V \to W$ be linear maps, and let $A = \mat{S}$, $B = \mat{T}$, and $p = \dim U$, $n = \dim V$, $m = \dim W$ (so $A$ is $m \times n$ and $B$ is $n \times p$). The matrix product $AB$ is the $m \times p$ matrix whose $i$-th column equals the product of $A$ with the $i$-th column of $B$; that is,

$$
AB = \begin{bmatrix} AB_1 & AB_2 & \cdots & AB_p \end{bmatrix},
$$

where $B_i$ denotes the $i$-th column of $B$.
It is instructive to understand why $\mat{ST} = \mat{S}\mat{T}$. Letting $u_1, \ldots, u_p$ be our basis for $U$ as usual, the composition $ST$ sends $u_i$ to $S(Tu_i)$. This implies by Definition 4.4 that the $i$-th column of $\mat{ST}$ is precisely equal to $\mat{S(Tu_i)} = \mat{S}\mat{Tu_i}$, the product of $\mat{S}$ with the $i$-th column of $\mat{T}$, thus furnishing the motivation for our definition of the matrix product above.
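This correspondence is easy to observe numerically. In the sketch below (the matrices are arbitrary examples of my own), multiplying a vector by the product of the two matrices gives the same result as applying the maps one after the other, and the $i$-th column of the product equals the first matrix times the $i$-th column of the second, as in Definition 4.6.

```python
import numpy as np

M_T = np.array([[1.0, 2.0],
                [0.0, 1.0],
                [3.0, -1.0]])      # matrix of T : R^2 -> R^3
M_S = np.array([[2.0, 0.0, 1.0],
                [1.0, 1.0, 0.0]])  # matrix of S : R^3 -> R^2

v = np.array([1.0, 4.0])

# Multiplying by M_S M_T is the same as applying T, then S.
assert np.allclose((M_S @ M_T) @ v, M_S @ (M_T @ v))

# The i-th column of the product is M_S times the i-th column of M_T.
for i in range(M_T.shape[1]):
    assert np.allclose((M_S @ M_T)[:, i], M_S @ M_T[:, i])

print(M_S @ M_T)   # the 2 x 2 matrix of the composition S T
```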
Matrices are used to compactly represent vectors and linear maps. The matrix of a vector encodes the coordinates of the vector with respect to a basis, and the $i$-th column of the matrix of a linear map $T$ contains the coordinates, with respect to a basis of $W$, of $T$ applied to the $i$-th input basis vector. Matrix multiplication is defined to correspond to the composition of linear maps, whereas matrix-vector multiplication corresponds to passing a vector through a linear map.
Linear Algebra Done Right, by Sheldon Axler.
[1]: The notation $T : V \to W$ means that $T$ is a function that takes as input a vector in $V$ and returns as output a vector in $W$.
[2]: A function of the form $f(x) = mx + b$ is only linear if $b = 0$; otherwise, it is more properly called an affine function. Affine functions are nonetheless sometimes colloquially referred to as linear.