Conditions of Optimality

Linear Programming

A standard linear program has the following formulation:

\[  \begin{array}{rl} \displaystyle \mathop \textrm{minimize}&  \mathbf{c}^{T} \mathbf{x} \\ \textrm{subject to}&  \mathbf{A} \mathbf{x} \ge \mathbf{b} \\ &  \mathbf{x} \ge 0 \end{array}  \]

where

  • $\mathbf{x} \in \mathbb {R}^{n}$ is the vector of decision variables

  • $\mathbf{A} \in \mathbb {R}^{m \times n}$ is the matrix of constraints

  • $\mathbf{c} \in \mathbb {R}^{n}$ is the vector of objective function coefficients

  • $\mathbf{b} \in \mathbb {R}^{m}$ is the vector of constraint right-hand sides (RHS)
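
For a concrete illustration, the following minimal sketch solves a small instance of this standard form by using the linprog function from SciPy (an illustrative choice of solver, not something the text prescribes). Because linprog states inequality constraints as $\mathbf{A}_{ub}\mathbf{x} \le \mathbf{b}_{ub}$, the $\ge$ constraints are negated:

```python
import numpy as np
from scipy.optimize import linprog

# A small instance of the standard form:
#   minimize  c^T x   subject to  A x >= b,  x >= 0
c = np.array([2.0, 3.0])
A = np.array([[1.0, 1.0],
              [1.0, 2.0]])
b = np.array([4.0, 6.0])

# linprog expects A_ub @ x <= b_ub, so the >= constraints are negated;
# the default variable bounds (0, None) already encode x >= 0.
res = linprog(c, A_ub=-A, b_ub=-b, method="highs")
print(res.x, res.fun)  # optimal x = [2. 2.], objective c^T x = 10.0
```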

This standard form is called the primal problem. The corresponding dual problem (see the section Dual Values) is

\[  \begin{array}{rl} \displaystyle \mathop \textrm{maximize}&  \mathbf{b}^{T} \mathbf{y} \\ \textrm{subject to}&  \mathbf{A}^{T} \mathbf{y} \le \mathbf{c} \\ &  \mathbf{y} \ge 0 \end{array}  \]

where $\mathbf{y} \in \mathbb {R}^{m}$ is the vector of dual variables.

The vectors $\mathbf{x}$ and $\mathbf{y}$ are optimal to the primal and dual problems, respectively, if and only if there exist primal slack variables $\mathbf{s} = \mathbf{A} \mathbf{x} - \mathbf{b}$ and dual slack variables $\mathbf{w} = \mathbf{c} - \mathbf{A}^{T} \mathbf{y}$ such that the following Karush-Kuhn-Tucker (KKT) conditions are satisfied:

$\displaystyle \begin{array}{rcccc} \mathbf{A} \mathbf{x} - \mathbf{s} &  = &  \mathbf{b}, &  \mathbf{x} \ge 0, &  \mathbf{s} \ge 0 \\ \mathbf{A}^{T} \mathbf{y} + \mathbf{w} &  = &  \mathbf{c}, &  \mathbf{y} \ge 0, &  \mathbf{w} \ge 0 \\ \mathbf{s}^{T} \mathbf{y} &  = &  0 \\ \mathbf{w}^{T} \mathbf{x} &  = &  0 \\ \end{array} $

The first line of equations defines primal feasibility, the second line of equations defines dual feasibility, and the last two equations are called the complementary slackness conditions.
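
Continuing the small instance from the preceding sketch, the following example solves the primal and dual problems with SciPy's linprog and verifies the KKT conditions numerically; at optimality the objective values agree (strong duality) and both complementary slackness products vanish:

```python
import numpy as np
from scipy.optimize import linprog

c = np.array([2.0, 3.0])
A = np.array([[1.0, 1.0],
              [1.0, 2.0]])
b = np.array([4.0, 6.0])

# Primal: minimize c^T x subject to A x >= b, x >= 0.
x = linprog(c, A_ub=-A, b_ub=-b, method="highs").x
# Dual: maximize b^T y subject to A^T y <= c, y >= 0
# (linprog minimizes, so the dual objective is negated).
y = linprog(-b, A_ub=A.T, b_ub=c, method="highs").x

s = A @ x - b     # primal slack variables, s >= 0
w = c - A.T @ y   # dual slack variables,   w >= 0
print("strong duality:", np.isclose(c @ x, b @ y))
print("s^T y = 0:", np.isclose(s @ y, 0.0), " w^T x = 0:", np.isclose(w @ x, 0.0))
```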

Nonlinear Programming

To facilitate discussion of optimality conditions in nonlinear programming, you write the general form of nonlinear optimization problems by grouping the equality constraints and inequality constraints. You also write all the general nonlinear inequality constraints and bound constraints in one form as $\ge $ inequality constraints. Thus, you have the following formulation:

\[  \begin{array}{lll} \displaystyle \mathop \textrm{minimize}_{x\in {\mathbb R}^ n} &  f(x) \\ \textrm{subject to}&  c_ i(x) = 0, &  i \in {\mathcal E} \\ &  c_ i(x) \ge 0, &  i \in {\mathcal I} \end{array}  \]

where $\mathcal E$ is the set of indices of the equality constraints, $\mathcal I$ is the set of indices of the inequality constraints, and $m=|{\mathcal E}|+|{\mathcal I}|$.

A point $x$ is feasible if it satisfies all the constraints $c_ i(x) = 0, i\in {\mathcal E}$ and $c_ i(x) \ge 0, i\in {\mathcal I}$. The feasible region ${\mathcal F}$ consists of all the feasible points. In unconstrained cases, the feasible region ${\mathcal F}$ is the entire ${\mathbb R}^ n$ space.
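
As a minimal sketch of this grouped form, the following example encodes an assumed toy problem (the objective and constraint functions are illustrative, not from the text) and tests points for feasibility:

```python
import numpy as np

# An illustrative problem (an assumption of this sketch):
#   minimize  f(x) = x1^2 + x2^2
#   subject to  c_1(x) = x1 + x2 - 1 = 0   (equality,   1 in E)
#               c_2(x) = x1          >= 0  (inequality, 2 in I)
c_eq   = [lambda x: x[0] + x[1] - 1.0]  # constraints indexed by E
c_ineq = [lambda x: x[0]]               # constraints indexed by I

def is_feasible(x, tol=1e-8):
    # A point is feasible if every equality holds (to a tolerance)
    # and every inequality constraint value is nonnegative.
    return (all(abs(c(x)) <= tol for c in c_eq)
            and all(c(x) >= -tol for c in c_ineq))

print(is_feasible(np.array([0.5, 0.5])))  # True:  on the line and x1 >= 0
print(is_feasible(np.array([2.0, 0.0])))  # False: violates x1 + x2 = 1
```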

A feasible point $x^*$ is a local solution of the problem if there exists a neighborhood ${\mathcal N}$ of $x^*$ such that

\[  f(x)\ge f(x^*) \quad \textrm{for all } x\in {\mathcal N}\cap {\mathcal F}  \]

Further, a feasible point $x^*$ is a strict local solution if strict inequality holds in the preceding case; that is,

\[  f(x) > f(x^*) \quad \textrm{for all } x\in {\mathcal N}\cap {\mathcal F}  \]

A feasible point $x^*$ is a global solution of the problem if no point in ${\mathcal F}$ has a smaller function value than $f(x^*)$; that is,

\[  f(x)\ge f(x^*) \quad \textrm{for all } x\in {\mathcal F}  \]

Unconstrained Optimization

The following conditions hold true for unconstrained optimization problems:

  • First-order necessary conditions:  If $x^*$ is a local solution and $f(x)$ is continuously differentiable in some neighborhood of $x^*$, then

    \[  \nabla \! f(x^*) = 0  \]
  • Second-order necessary conditions:  If $x^*$ is a local solution and $f(x)$ is twice continuously differentiable in some neighborhood of $x^*$, then $\nabla ^2\! f(x^*)$ is positive semidefinite.

  • Second-order sufficient conditions:  If $f(x)$ is twice continuously differentiable in some neighborhood of $x^*$, $\nabla \! f(x^*) = 0$, and $\nabla ^2\! f(x^*)$ is positive definite, then $x^*$ is a strict local solution.
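
The following sketch checks these conditions numerically at $(1,1)$, the known minimizer of the Rosenbrock test function, using the derivative helpers that SciPy provides for it. Because the gradient vanishes and the Hessian is positive definite there, the second-order sufficient conditions certify a strict local solution:

```python
import numpy as np
from scipy.optimize import rosen_der, rosen_hess

x_star = np.array([1.0, 1.0])  # known minimizer of the Rosenbrock function

grad = rosen_der(x_star)                       # first-order: gradient must vanish
eigs = np.linalg.eigvalsh(rosen_hess(x_star))  # second-order: Hessian spectrum

print("gradient vanishes:        ", np.allclose(grad, 0.0))
print("Hessian positive definite:", bool(np.all(eigs > 0.0)))
```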

Constrained Optimization

For constrained optimization problems, the Lagrangian function is defined as follows:

\[  L(x,\lambda ) = f(x) - \sum _{i\in {\mathcal E}\cup {\mathcal I}} \lambda _ i c_ i(x)  \]

where $\lambda _ i,i\in {\mathcal E}\cup {\mathcal I}$, are called Lagrange multipliers. $\nabla \! _ x L(x,\lambda )$ is used to denote the gradient of the Lagrangian function with respect to $x$, and $\nabla _{\! x}^2 L(x,\lambda )$ is used to denote the Hessian of the Lagrangian function with respect to $x$. The active set at a feasible point $x$ is defined as

\[  {\mathcal A}(x)={\mathcal E}\cup \{ i\in {\mathcal I}: c_ i(x)=0\}   \]

You also need the following definition before you can state the first-order and second-order necessary conditions:

  • Linear independence constraint qualification and regular point:  A point $x$ is said to satisfy the linear independence constraint qualification if the gradients of active constraints

    \[  \nabla \! c_ i(x),\quad i\in {\mathcal A}(x)  \]

    are linearly independent. Such a point $x$ is called a regular point.
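
For example, the following sketch (with affine constraints chosen purely for illustration) computes the active set at a point and tests the linear independence constraint qualification by checking whether the matrix of active constraint gradients has full row rank:

```python
import numpy as np

# Illustrative constraints (an assumption of this sketch):
#   c_1(x) = x1 + x2 - 2 = 0   (equality,   1 in E)
#   c_2(x) = x1          >= 0  (inequality, 2 in I)
#   c_3(x) = x2 - 1      >= 0  (inequality, 3 in I)
vals  = {1: lambda x: x[0] + x[1] - 2.0,
         2: lambda x: x[0],
         3: lambda x: x[1] - 1.0}
# The constraints are affine, so their gradients are constant.
grads = {1: lambda x: np.array([1.0, 1.0]),
         2: lambda x: np.array([1.0, 0.0]),
         3: lambda x: np.array([0.0, 1.0])}
E, I = {1}, {2, 3}

def active_set(x, tol=1e-8):
    # A(x) = E together with the inequalities that hold with equality at x.
    return E | {i for i in I if abs(vals[i](x)) <= tol}

x = np.array([1.0, 1.0])      # c_1 = 0, c_2 = 1 > 0, c_3 = 0
active = active_set(x)        # {1, 3}
G = np.array([grads[i](x) for i in sorted(active)])
print("active set:", active)
print("LICQ holds:", np.linalg.matrix_rank(G) == len(active))
```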

You now state the theorems that are essential in the analysis and design of algorithms for constrained optimization:

  • First-order necessary conditions:  Suppose that $x^*$ is a local minimum and also a regular point. If $f(x)$ and $c_ i(x),i \in {\mathcal E}\cup {\mathcal I}$, are continuously differentiable, there exist Lagrange multipliers $\lambda ^*\in {\mathbb R}^ m$ such that the following conditions hold:

    \[  \nabla \! _ x L(x^*,\lambda ^*) = \nabla \! f(x^*) - \sum _{i\in {\mathcal E}\cup {\mathcal I}}\lambda _ i^*\nabla \! c_ i(x^*) = 0  \]

    \[  \begin{array}{rccl} c_ i(x^*) &  = &  0, &  i\in {\mathcal E} \\ c_ i(x^*) &  \ge &  0, &  i\in {\mathcal I} \\ \lambda _ i^* &  \ge &  0, &  i\in {\mathcal I} \\ \lambda _ i^* c_ i(x^*) &  = &  0, &  i\in {\mathcal I} \end{array}  \]

    The preceding conditions are often known as the Karush-Kuhn-Tucker conditions, or KKT conditions for short.

  • Second-order necessary conditions:  Suppose that $x^*$ is a local minimum and also a regular point. Let $\lambda ^*$ be the Lagrange multipliers that satisfy the KKT conditions. If $f(x)$ and $c_ i(x),i\in {\mathcal E}\cup {\mathcal I}$, are twice continuously differentiable, the following conditions hold:

    \[  z^ T \nabla _{\! x}^2 L(x^*,\lambda ^*)z \ge 0  \]

    for all $z\in {\mathbb R}^ n$ that satisfy

    \[  \nabla \! c_ i(x^*)^ Tz = 0,\quad i\in {\mathcal A}(x^{*})  \]
  • Second-order sufficient conditions:  Suppose there exist a point $x^*$ and some Lagrange multipliers $\lambda ^*$ such that the KKT conditions are satisfied. If

    \[  z^ T \nabla _{\! x}^2 L(x^*,\lambda ^*)z > 0  \]

    for all $z\in {\mathbb R}^ n$ that satisfy

    \[  \nabla \! c_ i(x^*)^ Tz = 0,\quad i\in {\mathcal A}(x^*)  \]

    then $x^*$ is a strict local solution.

    Note that the set of all such $z$’s forms the null space of the matrix $\left[\nabla \! c_ i(x^*)^ T \right]_{i\in {\mathcal A}(x^{*})}$. Thus, you can search for strict local solutions by numerically checking the Hessian of the Lagrangian function projected onto the null space. For a rigorous treatment of the optimality conditions, see Fletcher (1987) and Nocedal and Wright (1999).
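
The following minimal sketch applies this check to an assumed equality-constrained toy problem: it verifies the KKT conditions at a candidate point and then tests positive definiteness of the Hessian of the Lagrangian projected onto the null space of the active constraint gradients:

```python
import numpy as np
from scipy.linalg import null_space

# Illustrative problem (an assumption of this sketch):
#   minimize  f(x) = x1^2 + x2^2   subject to  c(x) = x1 + x2 - 2 = 0
x_star = np.array([1.0, 1.0])  # candidate solution
lam    = 2.0                   # candidate Lagrange multiplier

grad_f = 2.0 * x_star              # gradient of f at x_star
grad_c = np.array([1.0, 1.0])      # gradient of the (affine) constraint
H_L    = 2.0 * np.eye(2)           # Hessian of L; the constraint is affine

# First-order (KKT): stationarity of the Lagrangian and feasibility.
print("stationary:", np.allclose(grad_f - lam * grad_c, 0.0))
print("feasible:  ", np.isclose(x_star.sum() - 2.0, 0.0))

# Second-order: project the Hessian of L onto the null space of the
# active constraint gradients and check positive definiteness there.
Z = null_space(grad_c.reshape(1, -1))  # basis for {z : grad_c^T z = 0}
H_proj = Z.T @ H_L @ Z
print("projected Hessian PD:", bool(np.all(np.linalg.eigvalsh(H_proj) > 0.0)))
```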