Iterative Methods

The conjugate gradient algorithm can be interpreted as solving the following optimization problem: minimize $\phi(x)$ defined by

\[ \phi(x) = \tfrac{1}{2}\, x^T A x - x^T b \]

where $b \in R^n$ and $A \in R^{n\times n}$ is symmetric and positive definite.
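
Since $A$ is symmetric, the gradient of this quadratic is

\[ \nabla \phi(x) = A x - b, \]

so the unique minimizer of $\phi$ (which exists because $A$ is positive definite) is precisely the solution of the linear system $A x = b$.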

At each iteration, $\phi(x)$ is minimized along an $A$-conjugate direction, and this construction makes each residual orthogonal to the current Krylov subspace:

\[ r_i \;\bot\; {\mathcal K}_i(A;\, r_0), \qquad r_i = A x_i - b \]

where ${\mathcal K}_i$ is the Krylov subspace:

\[ {\mathcal K}_i(A;\, r) = \mathrm{span}\{ r,\, Ar,\, A^2 r,\, \ldots,\, A^{i-1} r \} \]
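
To make this property concrete, the following is a minimal sketch of the textbook conjugate gradient recurrence in Python, checking numerically that the residual after $i$ steps is orthogonal to ${\mathcal K}_i(A;\, r_0)$. The function name `cg`, the random test matrix, and the iteration count are illustrative choices rather than any particular library interface (the code also uses the $b - Ax$ sign convention for the residual, which does not affect the orthogonality).

```python
import numpy as np

def cg(A, b, x0, iters):
    """Plain conjugate gradient iteration; returns the residual at each step."""
    x = x0.copy()
    r = b - A @ x                     # residual (sign convention does not affect orthogonality)
    p = r.copy()
    residuals = [r.copy()]
    for _ in range(iters):
        Ap = A @ p
        alpha = (r @ r) / (p @ Ap)    # step length along the A-conjugate direction p
        x = x + alpha * p
        r_new = r - alpha * Ap
        beta = (r_new @ r_new) / (r @ r)
        p = r_new + beta * p          # next A-conjugate direction
        r = r_new
        residuals.append(r.copy())
    return residuals

rng = np.random.default_rng(1)
n = 8
M = rng.standard_normal((n, n))
A = M @ M.T + n * np.eye(n)           # symmetric positive definite by construction
b = rng.standard_normal(n)

i = 4
res = cg(A, b, np.zeros(n), iters=i)
# r_i should be orthogonal to span{r_0, A r_0, ..., A^{i-1} r_0}
K = np.column_stack([np.linalg.matrix_power(A, k) @ res[0] for k in range(i)])
print(np.max(np.abs(K.T @ res[i])))   # small, up to rounding error
```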

Minimum residual algorithms instead minimize the Euclidean norm $\|Ax - b\|_2$ over ${\mathcal K}_i$: at each iteration, $x_i$ is the vector in ${\mathcal K}_i$ whose residual has the smallest norm.
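
Below is a rough sketch of this idea, in the spirit of GMRES/MINRES but using a dense QR factorization and least-squares solve instead of the Arnoldi or Lanczos recurrences those methods actually rely on; the helper name `minres_like` and the test problem are made up for illustration.

```python
import numpy as np

def minres_like(A, b, i):
    """Minimize ||A x - b||_2 over x in K_i(A; b), assuming a zero initial guess."""
    # Krylov basis {b, A b, ..., A^{i-1} b}, orthonormalized for numerical stability
    K = np.column_stack([np.linalg.matrix_power(A, k) @ b for k in range(i)])
    Q, _ = np.linalg.qr(K)
    # Write x = Q y and pick the y minimizing ||A Q y - b||_2
    y, *_ = np.linalg.lstsq(A @ Q, b, rcond=None)
    return Q @ y

rng = np.random.default_rng(2)
n = 10
A = rng.standard_normal((n, n)) + n * np.eye(n)   # a general well-conditioned matrix
b = rng.standard_normal(n)

for i in (1, 3, 5, 10):
    x_i = minres_like(A, b, i)
    print(i, np.linalg.norm(A @ x_i - b))         # residual norm is non-increasing in i
```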

The biconjugate gradient algorithm belongs to the more general class of Petrov-Galerkin methods, in which orthogonality is enforced against a different $i$-dimensional subspace, while $x_i$ remains in ${\mathcal K}_i$:

\[ r_i \;\bot\; \{ w,\, A^T w,\, (A^T)^2 w,\, \ldots,\, (A^T)^{i-1} w \} \]
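
As a numerical check of this condition, the sketch below runs an unpreconditioned biconjugate gradient iteration with the shadow vector $w$ chosen as $r_0$ and verifies that $r_i$ is orthogonal to the shadow Krylov subspace built from $A^T$. The function name `bicg` and the random test problem are illustrative only.

```python
import numpy as np

def bicg(A, b, x0, iters):
    """Plain biconjugate gradient; returns the final iterate and its residual."""
    x = x0.copy()
    r = b - A @ x
    w = r.copy()                      # shadow residual for the dual system with A^T
    p, q = r.copy(), w.copy()         # primal and shadow search directions
    for _ in range(iters):
        alpha = (w @ r) / (q @ (A @ p))
        x = x + alpha * p
        r_new = r - alpha * (A @ p)
        w_new = w - alpha * (A.T @ q)
        beta = (w_new @ r_new) / (w @ r)
        p = r_new + beta * p
        q = w_new + beta * q
        r, w = r_new, w_new
    return x, r

rng = np.random.default_rng(3)
n = 9
A = rng.standard_normal((n, n)) + n * np.eye(n)   # general (nonsymmetric) matrix
b = rng.standard_normal(n)

i = 4
x_i, r_i = bicg(A, b, np.zeros(n), iters=i)
w0 = b.copy()                                     # with a zero initial guess, w = r_0 = b
W = np.column_stack([np.linalg.matrix_power(A.T, k) @ w0 for k in range(i)])
print(np.max(np.abs(W.T @ r_i)))                  # small: r_i is orthogonal to the shadow subspace
```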