The Nonlinear Programming Solver

Active-Set Method

Active-set methods differ from interior point methods in that no barrier term is used to keep the iterates interior with respect to the inequality constraints. Instead, the algorithm attempts to identify the true active set directly. For simplicity, this section uses the same initial slack formulation that appears in the description of the interior point method,

\[ \begin{array}{ll} \displaystyle \mathop {\textrm{minimize}}& f(x) \\ \textrm{subject to}& g_{i}(x) - s_{i} = 0, i \in \mathcal{I} \\ & s \ge 0 \end{array} \]

where $s = (s_{1}, \ldots , s_{q})^{T}$ is the vector of slack variables, which are required to be nonnegative. For example, an inequality constraint $g_{1}(x) \ge 0$ becomes the equality $g_{1}(x) - s_{1} = 0$ together with the bound $s_{1} \ge 0$, and the constraint is active exactly when $s_{1} = 0$. Begin by absorbing the equality constraints into a penalty function as before, but keep the slack bound constraints explicit:

\[ \begin{array}{ll} \displaystyle \mathop {\textrm{minimize}}& \mathcal{M}(x,s) = f(x) + \dfrac {1}{2\mu }\| g(x) -s\| _2^2 \\ \textrm{subject to}& s \ge 0 \end{array} \]

where $\mu $ is a positive penalty parameter; smaller values of $\mu $ penalize constraint violations more heavily. Given a solution pair $(x(\mu ), s(\mu ))$ of the preceding problem, you can define the active-set projection matrix $P$ as follows:

\[ P_{ij} = \left\{ \begin{array}{cl} 1 & \text {if } i=j \text { and } s_ i(\mu ) = 0 \\ 0 & \text {otherwise.} \end{array} \right. \]
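As a small illustration, the following Python sketch forms $P$ from the current slack vector; the numerical tolerance and the function name are assumptions made here for illustration, not part of the solver.

```python
import numpy as np

def active_set_projection(s, tol=1e-8):
    """Diagonal 0/1 matrix P that selects the active slack components.

    P[i, i] = 1 when s_i is (numerically) zero, so the condition
    P s = 0 fixes exactly the slacks that sit at their bound.
    """
    active = np.abs(s) <= tol            # components with s_i(mu) = 0
    return np.diag(active.astype(float))
```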

Then $(x(\mu ),s(\mu ))$ is also a solution of the equality-constrained subproblem:

\[ \begin{array}{ll} \displaystyle \mathop {\textrm{minimize}}& \mathcal{M}(x,s) = f(x) + \dfrac {1}{2\mu }\| g(x) -s\| _2^2 \\ \textrm{subject to}& Ps = 0. \end{array} \]

The minimizer of the preceding subproblem must be a stationary point of the Lagrangian function

\[ \mathcal{L}^{\mu }(x,s,z) = f(x) + \dfrac {1}{2\mu }\| g(x) -s\| _2^2 - z^{T} Ps \]

which gives the optimality equations

\[ \begin{array}{lclclc} \nabla _ x \mathcal{L}^{\mu }(x,s,z) & =& \nabla f(x) - J(x)^{T} y & =& 0 \\ \nabla _ s \mathcal{L}^{\mu }(x,s,z) & =& y - P^{T}z & =& 0 \\ \nabla _ z \mathcal{L}^{\mu }(x,s,z) & =& Ps & =& 0 \end{array} \]

where $y=-(g(x) -s)/\mu $. Using the second equation, you can simplify the preceding equations to get the following optimality conditions for the bound-constrained penalty subproblem:

\[ \begin{array}{lclclc} \nabla f(x) - J(x)^{T} P^{T}z & =& 0 \\ P(g(x) -s) + \mu z & =& 0 \\ Ps & =& 0 \end{array} \]
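To obtain the second condition, substitute the definition of $y$ into $y = P^{T}z$ and apply $P$ to both sides; because $P$ is a 0/1 diagonal projection, $PP^{T} = P$, and the multipliers of inactive bounds can be taken to be zero, so that $Pz = z$:

\[ -\frac{1}{\mu } P (g(x) - s) = P P^{T} z = z \quad \Longrightarrow \quad P(g(x) - s) + \mu z = 0. \]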

Using the third equation directly, you can reduce the system further to

\[ \begin{array}{lclclc} \nabla f(x) - J(x)^{T} P^{T}z & =& 0 \\ P g(x) + \mu z & =& 0 \end{array} \]

At iteration $k$, the primal-dual active-set algorithm approximately solves the preceding system by using Newton’s method. The Newton system is

\[ \left[\begin{array}{cc} H_{\mathcal{L}}(x^{k}, z^{k}) & -J_{\mathcal A}^{T} \\ J_{\mathcal A} & \mu I \end{array}\right] \left[\begin{array}{c} \Delta x^{k} \\ \Delta z^{k} \end{array}\right] = - \left[\begin{array}{c} \nabla _{x} f(x^{k}) - J_{\mathcal A}^{T} z^{k}\\ Pg(x^{k}) + \mu z^{k} \end{array}\right] \]

where $J_{\mathcal A} = PJ(x^{k})$ and $H_{\mathcal{L}}$ denotes the Hessian of the Lagrangian function $f(x) - z^{T} P g(x)$. The solution $(\Delta x^{k}, \Delta z^{k})$ of the Newton system provides a direction to move from the current iterate $(x^{k},s^{k},z^{k})$ to the next,

\[ (x^{k+1}, z^{k+1}) = (x^{k},z^{k}) + \alpha (\Delta x^{k}, \Delta z^{k}) \]

where $\alpha $ is the step length along the Newton direction. The corresponding slack variable update $s^{k+1}$ is defined as the solution of the following subproblem, which can be computed analytically:

\[ \begin{array}{ll} \displaystyle \mathop {\textrm{minimize}}& \mathcal{M}(x^{k+1},s) = f(x^{k+1}) + \dfrac {1}{2\mu }\| g(x^{k+1}) -s\| _2^2 \\ \textrm{subject to}& s \ge 0 \end{array} \]
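Because $\mathcal{M}(x^{k+1},s)$ is separable and quadratic in $s$, this subproblem decouples over the components of $s$, and its solution is the projection of the constraint values onto the nonnegative orthant:

\[ s^{k+1}_{i} = \max \{ 0, \;  g_{i}(x^{k+1}) \} , \quad i \in \mathcal{I}. \]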

The step length $\alpha $ is then determined in a manner similar to that of the preceding interior point approach. At each iteration, the definition of the active-set projection matrix $P$ is updated with respect to the new value of the constraint function $g(x^{k+1})$. For large-scale NLP, the computational bottleneck typically arises in solving the Newton system. Thus active-set methods can achieve substantial computational savings when the size of $J_{\mathcal A}$ is much smaller than that of $J(x)$; however, convergence can be slow if the algorithm must try many different active-set estimates before identifying the correct one. Further, the active-set algorithm is often the superior choice when only bound constraints are present. In practice, both the interior point and active-set approaches incorporate more sophisticated merit functions than those described in the preceding sections; however, their description is beyond the scope of this document. See Gill and Robinson (2010) for further reading.
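As a concrete summary of the steps above, the following Python sketch implements the basic loop under simplifying assumptions: a fixed penalty parameter $\mu $, a full Newton step ($\alpha = 1$) in place of a merit-function line search, a dense factorization of the Newton matrix, and a crude tolerance for deciding which slacks are active. The function names and signatures are hypothetical, not part of any solver interface.

```python
import numpy as np

def active_set_penalty_solve(grad_f, g, J, hess_L, x0,
                             mu=1e-2, tol=1e-6, max_iter=100):
    """Illustrative primal-dual active-set loop for the penalty subproblem.

    grad_f(x) -> (n,) gradient of f; g(x) -> (q,) constraint values;
    J(x) -> (q, n) constraint Jacobian; hess_L(x, y) -> (n, n)
    Hessian of the Lagrangian f(x) - y^T g(x).
    """
    x = np.asarray(x0, dtype=float)
    n, q = x.size, g(x).size
    z = np.zeros(q)
    for _ in range(max_iter):
        s = np.maximum(g(x), 0.0)                        # analytic slack update
        P = np.diag((np.abs(s) <= 1e-8).astype(float))   # active-set projection
        J_A = P @ J(x)                                   # J(x) with inactive rows zeroed
        r1 = grad_f(x) - J_A.T @ z                       # dual feasibility residual
        r2 = P @ g(x) + mu * z                           # perturbed complementarity
        if max(np.linalg.norm(r1), np.linalg.norm(r2)) < tol:
            break
        # Newton system for the reduced optimality conditions
        K = np.block([[hess_L(x, P @ z), -J_A.T],
                      [J_A, mu * np.eye(q)]])
        step = np.linalg.solve(K, -np.concatenate([r1, r2]))
        x, z = x + step[:n], z + step[n:]                # full step, alpha = 1
    return x, np.maximum(g(x), 0.0), z
```

In a practical implementation, the line search, the updates of $\mu $, and the safeguarding of an indefinite or ill-conditioned Newton matrix all require the more sophisticated machinery referenced above.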