The Nonlinear Programming Solver

Constrained Optimization

A function that plays a pivotal role in establishing conditions that characterize a local minimum of an NLP problem is the Lagrangian function $\mathcal{L}(x,y,z)$, which is defined as

\[  \mathcal{L}(x,y,z) = f(x) - \sum _{i\in \mathcal{E}} y_{i} h_{i}(x) - \sum _{i\in \mathcal{I}} z_{i} g_{i}(x)  \]

Note that the Lagrangian function can be seen as a linear combination of the objective and constraint functions. The coefficients of the constraints, $y_{i}, i\in \mathcal{E}$, and $z_{i}, i\in \mathcal{I}$, are called the Lagrange multipliers or dual variables. At a feasible point $\hat{x}$, an inequality constraint is called active if it is satisfied as an equality—that is, $g_{i}(\hat{x}) = 0$. The set of active constraints at a feasible point $\hat{x}$ is then defined as the union of the index set of the equality constraints, $\mathcal{E}$, and the indices of those inequality constraints that are active at $\hat{x}$; that is,

\[  \mathcal{A}(\hat{x}) = \mathcal{E} \cup \{  i\in \mathcal{I}: g_{i}(\hat{x}) = 0 \}   \]

An important condition that is assumed to hold in the majority of optimization algorithms is the so-called linear independence constraint qualification (LICQ). The LICQ states that at any feasible point $\hat{x}$, the gradients of all the active constraints are linearly independent. The main purpose of the LICQ is to ensure that the set of constraints is well-defined in a way that there are no redundant constraints or in a way that there are no constraints defined such that their gradients are always equal to zero.

The First-Order Necessary Optimality Conditions

If $x^{*}$ is a local minimum of the NLP problem and the LICQ holds at $x^{*}$, then there are vectors of Lagrange multipliers $y^{*}$ and $z^{*}$, with components $y_{i}^{*}, i \in \mathcal{E}$, and $z_{i}^{*}, i \in \mathcal{I}$, respectively, such that the following conditions are satisfied:

\[  \begin{array}{rclr} \nabla _{x} \mathcal{L}(x^{*}, y^{*}, z^{*}) & =& 0 & \\ h_{i}(x^{*}) & =& 0, &  i \in \mathcal{E} \\ g_{i}(x^{*}) & \ge & 0, &  i \in \mathcal{I} \\ z_{i}^{*} & \ge & 0, &  i \in \mathcal{I} \\ z_{i}^{*} g_{i}(x^{*}) & =& 0, &  i \in \mathcal{I} \end{array}  \]

where $\nabla _{x} \mathcal{L}(x^{*}, y^{*}, z^{*})$ is the gradient of the Lagrangian function with respect to x, defined as

\[  \nabla _{x} \mathcal{L}(x^{*}, y^{*}, z^{*}) = \nabla f(x) - \sum _{i\in \mathcal{E}} y_{i} \nabla h_{i}(x) - \sum _{i\in \mathcal{I}} z_{i} \nabla g_{i}(x)  \]

The preceding conditions are often called the Karush-Kuhn-Tucker (KKT) conditions. The last group of equations ($z_{i} g_{i}(x) = 0, i \in I$) is called the complementarity condition. Its main aim is to try to force the Lagrange multipliers, $z_{i}^{*}$, of the inactive inequalities (that is, those inequalities with $g_{i}(x^{*}) > 0$) to zero.

The KKT conditions describe the way the first derivatives of the objective and constraints are related at a local minimum $x^{*}$. However, they are not enough to fully characterize a local minimum. The second-order optimality conditions attempt to fulfill this aim by examining the curvature of the Hessian matrix of the Lagrangian function at a point that satisfies the KKT conditions.

The Second-Order Necessary Optimality Condition

Let $x^{*}$ be a local minimum of the NLP problem, and let $y^{*}$ and $z^{*}$ be the corresponding Lagrange multipliers that satisfy the first-order optimality conditions. Then $d^\mr {T} \nabla _{x}^{2} \mathcal{L}(x^{*},y^{*},z^{*}) d \ge 0$ for all nonzero vectors d that satisfy the following conditions:

  1. $\nabla h_{i}^\mr {T} (x^{*})d = 0$, $\forall i\in \mathcal{E}$

  2. $\nabla g_{i}^\mr {T} (x^{*})d = 0$, $\forall i\in \mathcal{A}(x^{*}) \cap \mathcal{I}$, such that $z_{i}^{*} > 0$

  3. $\nabla g_{i}^\mr {T} (x^{*})d \ge 0$, $\forall i\in \mathcal{A}(x^{*}) \cap \mathcal{I}$, such that $z_{i}^{*} = 0$

The second-order necessary optimality condition states that, at a local minimum, the curvature of the Lagrangian function along the directions that satisfy the preceding conditions must be nonnegative.