$$ \newcommand{\qed}{\tag*{$\square$}} \newcommand{\span}{\operatorname{span}} \newcommand{\dim}{\operatorname{dim}} \newcommand{\rank}{\operatorname{rank}} \newcommand{\norm}[1]{\|#1\|} \newcommand{\grad}{\nabla} \newcommand{\prox}[1]{\operatorname{prox}_{#1}} \newcommand{\inner}[2]{\langle{#1}, {#2}\rangle} \newcommand{\mat}[1]{\mathcal{M}[#1]} \newcommand{\null}[1]{\operatorname{null} \left(#1\right)} \newcommand{\range}[1]{\operatorname{range} \left(#1\right)} \newcommand{\rowvec}[1]{\begin{bmatrix} #1 \end{bmatrix}^T} \newcommand{\Reals}{\mathbf{R}} \newcommand{\RR}{\mathbf{R}} \newcommand{\Complex}{\mathbf{C}} \newcommand{\Field}{\mathbf{F}} \newcommand{\Pb}{\operatorname{Pr}} \newcommand{\E}[1]{\operatorname{E}[#1]} \newcommand{\Var}[1]{\operatorname{Var}[#1]} \newcommand{\argmin}[2]{\underset{#1}{\operatorname{argmin}} {#2}} \newcommand{\optmin}[3]{ \begin{align*} & \underset{#1}{\text{minimize}} & & #2 \\ & \text{subject to} & & #3 \end{align*} } \newcommand{\optmax}[3]{ \begin{align*} & \underset{#1}{\text{maximize}} & & #2 \\ & \text{subject to} & & #3 \end{align*} } \newcommand{\optfind}[2]{ \begin{align*} & {\text{find}} & & #1 \\ & \text{subject to} & & #2 \end{align*} } $$
The rate of convergence of an iterative optimization algorithm quantifies how quickly the algorithm converges to a fixed point. Let $x_0, x_1, x_2, \ldots$ be the iterates produced by the iteration $x_{k+1} = f(x_k)$, with initial iterate $x_0$. Let $x^\star$ denote the fixed point obtained by iterating $f$. Let $e_0, e_1, e_2, \ldots$ be the errors of the iteration, that is, $e_k = \norm{x_k - x^\star}$.
Suppose there exists a constant $C > 0$ and an exponent $q \geq 1$ such that, for sufficiently large $k$,

$$e_{k+1} \leq C e_k^q.$$
If $q = 1$ and $C < 1$, the iteration is said to converge linearly to its fixed point; if $q = 2$, it converges quadratically. Additionally, if $q = 1$ and $C = 1$ (with the errors still decreasing to zero), the iteration is said to converge sublinearly, and if $1 < q < 2$, it converges superlinearly.
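A claimed rate can be checked numerically: taking logarithms in $e_{k+1} \approx C e_k^q$ gives the estimate $q \approx \log(e_{k+2}/e_{k+1}) / \log(e_{k+1}/e_k)$. A minimal sketch in Python, using a synthetic error sequence (the constants $C = 0.5$, $q = 2$ are illustrative choices, not from the text):

```python
import math

# Synthetic errors generated by e_{k+1} = C * e_k^q with C = 0.5, q = 2
# (illustrative values). Real errors would come from an actual algorithm.
C, q = 0.5, 2.0
errors = [0.1]
for _ in range(6):
    errors.append(C * errors[-1] ** q)

# Since log e_{k+2} - log e_{k+1} = q (log e_{k+1} - log e_k), the ratio
# of successive log-differences recovers the exponent q.
estimates = [
    math.log(errors[k + 2] / errors[k + 1]) / math.log(errors[k + 1] / errors[k])
    for k in range(len(errors) - 2)
]
print(estimates)  # each entry is close to q = 2
```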
It is instructive to unroll the recursion in the above inequality. If $q = 1$ and $C < 1$, then we have

$$e_k \leq C e_{k-1} \leq C^2 e_{k-2} \leq \cdots \leq C^k e_0.$$
For this reason, linear convergence is sometimes referred to as geometric convergence. Note also that for $q = 1$, a log-linear plot of error versus iterations decays linearly, and obtaining an accuracy on the order of $\epsilon$ requires $O(\log(1/\epsilon))$ iterations.
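As an illustrative example (not from the text), the fixed-point iteration $x_{k+1} = \cos(x_k)$ converges linearly to the unique fixed point $x^\star \approx 0.739$ of $\cos$: near $x^\star$, $e_{k+1} \approx |\cos'(x^\star)|\, e_k$, so $q = 1$ and $C = \sin(x^\star) \approx 0.674$. A short numerical sketch:

```python
import math

# Fixed-point iteration x_{k+1} = cos(x_k). Near the fixed point x*,
# e_{k+1} ≈ |sin(x*)| e_k: linear convergence with C = sin(x*) ≈ 0.674.
x_star = 0.7390851332151607  # fixed point of cos, to double precision
x = 1.0
errors = [abs(x - x_star)]
for _ in range(20):
    x = math.cos(x)
    errors.append(abs(x - x_star))

# Successive error ratios e_{k+1} / e_k approach the constant C.
ratios = [e_next / e for e, e_next in zip(errors, errors[1:])]
print(ratios[-1])  # ≈ 0.674
```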
If $q = 2$, then

$$e_k \leq C e_{k-1}^2 \leq C (C e_{k-2}^2)^2 = C^3 e_{k-2}^4 \leq \cdots \leq \frac{1}{C}(C e_0)^{2^k}.$$
Even though this convergence is termed quadratic, a log-linear plot of quadratically convergent errors versus iterations actually decays as an exponential (provided $C e_0 < 1$, so that the errors tend to zero). Obtaining an $\epsilon$-suboptimal iterate in a quadratically convergent regime requires $O(\log \log (1/\epsilon))$ iterations.
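For instance (an illustrative example, not from the text), Newton's method applied to $x^2 - 2 = 0$, i.e., the iteration $x_{k+1} = (x_k + 2/x_k)/2$, converges quadratically to $\sqrt{2}$: one can show $e_{k+1} = e_k^2 / (2 x_k)$, so $C \approx 1/(2\sqrt{2})$ near the root, and the number of correct digits roughly doubles each iteration:

```python
import math

# Newton's method for x^2 - 2 = 0: x_{k+1} = (x_k + 2/x_k) / 2.
# Since e_{k+1} = e_k^2 / (2 x_k), convergence is quadratic with
# C ≈ 1 / (2 * sqrt(2)) near the root.
x_star = math.sqrt(2.0)
x = 2.0
errors = [abs(x - x_star)]
for _ in range(4):
    x = 0.5 * (x + 2.0 / x)
    errors.append(abs(x - x_star))

# The ratios e_{k+1} / e_k^2 approach C = 1 / (2 * sqrt(2)) ≈ 0.354.
ratios = [e_next / e**2 for e, e_next in zip(errors, errors[1:])]
print(ratios[-1])  # ≈ 0.354
```

Note how quickly the errors shrink: after just four iterations the error is on the order of $10^{-12}$, consistent with the doubly exponential decay $e_k \leq \frac{1}{C}(C e_0)^{2^k}$.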