Chapter Contents
Chapter Contents
Previous
Previous
Next
Next
The NETFLOW Procedure

Interior Point Algorithmic Details

After preprocessing, the Linear Program to be solved is

min {cT x}
subject to
A x = b
x \geq 0
This is the primal problem. The matrices of d, z, and Q of NPSC (defined in the section "Mathematical Description of NPSC" in Chapter 5, "The INTPOINT Procedure,") have been renamed c, x, and A, respectively, as these symbols are by convention used more, the problem to be solved is different from the original because of preprocessing, and there has been a change of primal variable to transform the LP into one whose variables have zero lower bounds. To simplify the algebra here, assume that variables have infinite upper bounds, and constraints are equalities. (Interior Point algorithms do efficiently handle finite upper bounds, and it is easy to introduce primal slack variables to change inequalities into equalities.) The problem has n variables. i is a variable number. k is an iteration number, and if used as a subscript or superscript it denotes "of iteration k".

There exists an equivalent problem, the dual problem, stated as
max {bT y}
subject to
AT y + s = c
s \geq 0
where
y are dual variables, and s are dual constraint slacks

What the Interior Point has to do is to solve the system of equations to satisfy the Karush-Kuhn-Tucker (KKT) conditions for optimality:
A x = b
AT y + s = c
X S e = 0
x \geq 0
s \geq 0
where
S = diag(s), (that is, Si,j = si if i=j, Si,j = 0 otherwise)
X = diag(x), and
e_i=1 \forall i

These are the conditions for feasibility, with the complementarity condition X S e = 0 added. cT x = bT y must occur at the optimum. Complementarity forces the optimal objectives of the primal and dual to be equal, cT xopt = bT yopt, as
0 = xTopt sopt = sTopt xopt = (c - AT yopt)T xopt = cT xopt - yTopt (A xopt) = cT xopt - bT yopt
therefore
0 = cT xopt - bT yopt

Before the optimum is reached, a solution (x, y, s) may not satisfy the KKT conditions:

The Interior Point algorithm works by using Newton's method to find a direction to move (\Delta x^k, \Delta y^k, \Delta s^k) from the current solution (xk, yk, sk) toward a better solution:
(x^{k+1}, y^{k+1}, s^{k+1})=(x^k, y^k, s^k) + \alpha (\Delta x^k, \Delta y^k, \Delta s^k)
\alpha is the step length and is assigned a value as large as possible and not so large that a xk+1i or sk+1i is "too close" to zero. The direction in which to move is found using
A \Delta x^k=infeas_c
A^T \Delta y^k + \Delta s^k=infeas_d
S^k \Delta x^k + X^k \Delta s^k=- X^k S^k e

To greatly improve performance, the third equation is changed to
S^k \Delta x^k + X^k \Delta s^k=- X^k S^k e + \sigma_k \mu_k e
where
\mu_k=(x^k)^T s^k / n, the average complementarity, and
0 \leq \sigma_k \leq 1
The effect now is to find a direction in which to move to reduce infeasibilities and to reduce the complementarity toward zero, but if any xki ski is too close to zero, it is "nudged out" to \mu,and any xki ski that is larger than \mu is "nudged into" \mu.A \sigma_k close to or equal to 0.0 biases a direction toward the optimum, and a value for \sigma_k close to or equal to 1.0 "centers" the direction toward a point where all pairwise products x^k_i s^k_i=\mu.Such points make up the Central Path in the interior. Although centering directions make little, if any, progress in reducing \mu and moving the solution closer to the optimum, substantial progress toward the optimum can usually be made in the next iteration.

The Central Path is crucial to why the Interior Point algorithm is so efficient. As \mu is decreased, this path "guides" the algorithm to the optimum through the interior of feasible space. Without centering, the algorithm would find a series of solutions near each other close to the boundary of feasible space. Step lengths along the direction would be small and many more iterations would probably be required to reach the optimum.

That in a nutshell is the Primal-Dual Interior Point algorithm. Varieties of the algorithm differ in the way \alpha and \sigma_kare chosen and the direction adjusted during each iteration. A wealth of information can be found in the texts by Roos, Terlaky, and Vial (1997), Wright (1996), and Ye (1996).

The calculation of the direction is the most time-consuming step of the Interior Point algorithm. Assume the kth iteration is being performed, so the subscript and superscript k can be dropped from the algebra:
A \Delta x=infeas_c
A^T \Delta y + \Delta s=infeas_d
S \Delta x + X \Delta s=- X S e + \sigma \mu e
Rearranging the second equation
\Delta s=infeas_d - A^T \Delta y
Rearranging the third equation
\Delta s=X^{-1}(- S \Delta x - X S e + \sigma \mu e)
\Delta s=- \Theta \Delta x - S e + X^{-1} \sigma \mu e
where
\Theta=S X^{-1}

Equating these two expressions for \Delta s and rearranging
-\Theta \Delta x - S e + X^{-1} \sigma \mu e=infeas_d - A^T \Delta y
-\Theta \Delta x=S e - X^{-1} \sigma \mu e + infeas_d - A^T \Delta y
\Delta x=\Theta^{-1}(-S e + X^{-1} \sigma \mu e - infeas_d + A^T \Delta y)
\Delta x=\rho + \Theta^{-1} A^T \Delta y
where
\rho=\Theta^{-1}(-S e + X^{-1} \sigma \mu e - infeas_d)
Substituting into the first direction equation
A \Delta x=infeas_c
A (\rho + \Theta^{-1} A^T \Delta y)=infeas_c
A \Theta^{-1} A^T \Delta y=infeas_c - A \rho
\Delta y=(A \Theta^{-1} A^T)^{-1}(infeas_c - A \rho)
\Theta, \rho, \Delta y, \Delta x, and \Delta s are calculated in that order. The hardest term is the factorization of the (A \Theta^{-1} A^T) matrix to determine \Delta y. Fortunately, although the values of (A \Theta^{-1} A^T) are different for each iteration, the locations of the nonzeros in this matrix remain fixed; the nonzero locations are the same as those in the matrix (A AT). This is because \Theta^{-1}=X S^{-1} is a diagonal matrix that has the effect of merely scaling the columns of (A AT).

The fact that the nonzeros in A \Theta^{-1} A^T have a constant pattern is exploited by all Interior Point algorithms and is a major reason for their excellent performance. Before iterations begin, A AT is examined and its rows and columns are symmetrically permutated so that during Cholesky factorization, the number of fillins created is smaller. A list of arithmetic operations to perform the factorization is saved in concise computer data structures (working with memory locations rather than actual numerical values). This is called symbolic factorization. During iterations, when memory has been initialized with numerical values, the operations list is performed sequentially. Determining how the factorization should be performed again and again is unnecessary.

The Primal-Dual Predictor-Corrector IntPoint Algorithm

The variant of the Interior Point algorithm implemented in PROC NETFLOW is a Primal-Dual Predictor-Corrector Interior Point algorithm. At first, Newton's method is used to find a direction (\Delta x^k_{aff}, \Delta y^k_{aff}, \Delta s^k_{aff}) to move, but calculated as if \mu is zero, that is, as a step with no centering, known as an affine step:
A \Delta x^k_{aff}=infeas_c
A^T \Delta y^k_{aff} + \Delta s^k_{aff}=infeas_d
S^k \Delta x^k_{aff} + X^k \Delta s^k_{aff}=- X^k S^k e
(x^k_{aff}, y^k_{aff}, s^k_{aff})=(x^k, y^k, s^k) + \alpha (\Delta x^k_{aff}, \Delta y^k_{aff}, \Delta s^k_{aff})
\alpha is the step length as before.

Complementarity xT s is calculated at (xkaff, ykaff, skaff) and compared with the complementarity at the starting point (xk, yk, sk), and the success of the affine step is gauged. If the affine step was successful in reducing the complementarity by a substantial amount, the need for centering is not great, and \sigma_k in the following linear system is assigned a value close to zero. If, however, the affine step was unsuccessful, centering would be beneficial, and \sigma_k in the following linear system is assigned a value closer to 1.0. The value of \sigma_k is therefore adaptively altered depending on the progress made toward the optimum.

A second linear system is solved to determine a centering vector (\Delta x^k_{c}, \Delta y^k_{c}, \Delta s^k_{c}) from (xkaff, ykaff, skaff):
A \Delta x^k_{c}=0
A^T \Delta y^k_{c} + \Delta s^k_{c}=0
S^k \Delta x^k_{c} + X^k \Delta s^k_{c}=- X^k_{aff} S^k_{aff} e + \sigma_k \mu_k e
then
(\Delta x^k, \Delta y^k, \Delta s^k)=(\Delta x^k_{aff}, \Delta y^k_{aff}, \Delta s^k_{aff}) + (\Delta x^k_{c}, \Delta y^k_{c}, \Delta s^k_{c})

(x^{k+1}, y^{k+1}, s^{k+1})=(x^k, y^k, s^k) + \alpha (\Delta x^k, \Delta y^k, \Delta s^k)
where, as before, \alpha is the step length assigned a value as large as possible but not so large that a xk+1i or sk+1i is "too close" to zero.

Although the Predictor-Corrector variant entails solving two linear systems instead of one, fewer iterations are usually required to reach the optimum. The additional overhead of calculating the second linear system is small, as the factorization of the (A \Theta^{-1} A^T) matrix has already been performed to solve the first linear system.

The variant of the IntPoint algorithm implemented in PROC NETFLOW is a Primal-Dual Predictor-Corrector IntPoint algorithm. At first, Newton's method is used to find a direction to move (\Delta x^k_{aff}, \Delta y^k_{aff}, \Delta s^k_{aff}),but calculated as if \mu is zero, that is, a step with no centering, known as an affine step:
A \Delta x^k_{aff}=-infeas_c
A^T \Delta y^k_{aff} + \Delta s^k_{aff}=-infeas_d
S^k \Delta x^k_{aff} + X^k \Delta s^k_{aff}=- X^k S^k e
(x^k_{aff}, y^k_{aff}, s^k_{aff})=(x^k, y^k, s^k) + \alpha (\Delta x^k_{aff}, \Delta y^k_{aff}, \Delta s^k_{aff})
\alpha is the step length as before.

Complementarity xT s is calculated at (xkaff, ykaff, skaff) and compared with the complementarity at the starting point (xk, yk, sk), and the success of the affine step is gauged. If the affine step was successful in reducing the complementarity by a substantial amount, the need for centering is not great, and the value of \sigma_k in the following linear system is assigned a value close to zero. If, however, the affine step was unsuccessful, centering would be beneficial, and the value of \sigma_k in the following linear system is assigned a value closer to 1.0. The value of \sigma_k is therefore adaptively altered depending on the progress made toward the optimum.

A second linear system is solved to determine a centering vector (\Delta x^k_{c}, \Delta y^k_{c}, \Delta s^k_{c}) from (xkaff, ykaff, skaff)
A \Delta x^k_{c}=0
A^T \Delta y^k_{c} + \Delta s^k_{c}=0
S^k \Delta x^k_{c} + X^k \Delta s^k_{c}=- X^k S^k e
S^k \Delta x^k + X^k \Delta s^k=- X^k_{aff} S^k_{aff} e + \sigma_k \mu_k e
then
(\Delta x^k, \Delta y^k, \Delta s^k)=(\Delta x^k_{aff}, \Delta y^k_{aff}, \Delta s^k_{aff}) + (\Delta x^k_{c}, \Delta y^k_{c}, \Delta s^k_{c})

(x^{k+1}, y^{k+1}, s^{k+1})=(x^k, y^k, s^k) + \alpha (\Delta x^k, \Delta y^k, \Delta s^k)
where, as before, \alpha is the step length assigned a value as large as possible but not so large that a xk+1i or sk+1i is "too close" to zero.

Although the Predictor-Corrector variant entails solving two linear systems instead of one, fewer iterations are usually required to reach the optimum. The additional overhead of calculating the second linear system is small, as the factorization of the (A \Theta^{-1} A^T) matrix has already been performed to solve the first linear system.

Stopping Criteria

There are several reasons why PROC NETFLOW stops Interior Point optimization. Optimization stops when: PROC NETFLOW may stop optimization when it detects that the rate at which the complementarity or duality gap is being reduced is too slow, that is, there are consecutive iterations when the complementarity or duality gap has stopped getting smaller and the infeasibilities, if nonzero, have also stalled. Sometimes, this indicates the problem is infeasible.

The reasons to stop optimization outlined in the previous paragraph will be termed the usual stopping conditions in the following explanation.

However, when solving some problems, especially if the problems are large, the usual stopping criteria are inappropriate. PROC NETFLOW might stop prematurely. If it were allowed to perform additional optimization, a better solution would be found. On other occasions, PROC NETFLOW might do too much work. A sufficiently good solution might be reached several iterations before PROC NETFLOW eventually stops.

You can see PROC NETFLOW's progress to the optimum by specifying PRINTLEVEL2=2. PROC NETFLOW will produce a table on the SAS log. A row of the table is generated during each iteration and consists of values of the affine step complementarity, the complementarity of the solution for the next iteration, the total bound infeasibility \sum_{i=1}^n infeas_{b i} (see the infeasb array in the "Interior Point: Upper Bounds" section), the total constraint infeasibility \sum_{i=1}^m infeas_{c i} (see the infeasc array in the "Interior Point Algorithmic Details" section), and the total dual infeasibility \sum_{i=1}^n infeas_{d i} (see the infeasd array in the "Interior Point Algorithmic Details" section). As optimization progresses, the values in all columns should converge to zero.

To tailor stopping criteria to your problem, you can use two sets of parameters: the STOP_x and the KEEPGOING_x parameters. The STOP_x parameters (STOP_C, STOP_DG, STOP_IB, STOP_IC, and STOP_ID) are used to test for some condition at the beginning of each iteration and if met, to stop immediately. The KEEPGOING_x parameters (KEEPGOING_C, KEEPGOING_DG, KEEPGOING_IB, KEEPGOING_IC, and KEEPGOING_ID) are used when PROC NETFLOW would ordinarily stop but does not if some conditions are not met.

For the sake of conciseness, a set of options might be referred to as the part of the option name they have in common followed by the suffix x. For example, STOP_C, STOP_DG, STOP_IB, STOP_IC, and STOP_ID will collectively be referred to as STOP_x.

At the beginning of each iteration, PROC NETFLOW will test whether complementarity is <= STOP_C (provided you specified a STOP_C parameter) and if it is, PROC NETFLOW will stop. If the duality gap is <= STOP_DG (provided you specified a STOP_DG parameter), PROC NETFLOW will stop immediately. This is true as well for the other STOP_x parameters that are related to infeasibilities, STOP_IB, STOP_IC, and STOP_ID.

For example, if you want PROC NETFLOW to stop optimizing for the usual stopping conditions, plus the additional condition, complementarity <= 100 or duality gap <= 0.001, then use
   proc netflow stop_c=100 stop_dg=0.001


If you want PROC NETFLOW to stop optimizing for the usual stopping conditions, plus the additional condition, complementarity <= 1000 and duality gap <= 0.001 and constraint infeasibility <= 0.0001, then use
   proc netflow 
      and_stop_c=1000 and_stop_dg=0.01 and_stop_ic=0.0001


Unlike the STOP_x parameters that cause PROC NETFLOW to stop when any one of them is satisfied, the corresponding AND_STOP_x parameters (AND_STOP_C, AND_STOP_DG, AND_STOP_IB, AND_STOP_IC, and AND_STOP_ID) cause PROC NETFLOW to stop only if all (more precisely, all that are specified) options are satisfied. For example, if PROC NETFLOW should stop when then use
   proc netflow 
      stop_c=100 stop_dg=0.001 
      and_stop_c=1000 and_stop_dg=0.01 and_stop_ic=0.0001


Just as the STOP_x parameters have AND_STOP_x partners, the KEEPGOING_x parameters have AND_KEEPGOING_x partners. The role of the KEEPGOING_x and AND_KEEPGOING_x parameters is to prevent optimization from stopping too early, even though a usual stopping criteria is met.

When PROC NETFLOW detects that it should stop for a usual stopping condition,

If all these tests to decide whether more optimization should be performed are false, optimization is stopped.

For example,
   proc netflow 
      stop_c=1000 
      and_stop_c=2000 and_stop_dg=0.01 
      and_stop_ib=1 and_stop_ic=1 and_stop_id=1 
      keepgoing_c=1500 
      and_keepgoing_c=2500 and_keepgoing_dg=0.05 
      and_keepgoing_ib=1 and_keepgoing_ic=1 and_keepgoing_id=1


At the beginning of each iteration, PROC NETFLOW will stop if When PROC NETFLOW determines it should stop because a usual stopping condition is met, it will stop only if

Interior Point: Upper Bounds

If the LP model had upper bounds (0 \leq x \leq u where u is the upper bound vector), then the primal and dual problems, the duality gap, and the KKT conditions would have to be expanded.

The primal Linear Program to be solved is
min {cT x}
subject to
A x = b
0 \leq x \leq u

0 \leq x \leq u is split into x \geq 0 and x \leq u. Let z be primal slack so that x + z = u, and associate dual variables w with these constraints. The Interior Point solves the system of equations to satisfy the Karush-Kuhn-Tucker (KKT) conditions for optimality:
A x = b
x + z = u
AT y + s - w = c
xT s = 0
zT w = 0
x,s,z,w \geq 0

These are the conditions for feasibility, with the complementarity conditions xT s = 0 and zT w = 0 added. cT x = bT y - uT w must occur at the optimum. Complementarity forces the optimal objectives of the primal and dual to be equal, cT xopt = bT yopt - uT wopt, as
0 = zTopt wopt = (u - xopt)T wopt = uT wopt - xTopt wopt
0 = xTopt sopt = sTopt xopt = (c - AT yopt + wopt)T xopt = cT xopt - yTopt (A xopt) + wopt)T xopt= cT xopt - bT yopt + uT wopt
0 = cT xopt - bT yopt + uT wopt

Before the optimum is reached, a solution (x, y, s, z, w) might not satisfy the KKT conditions:

The calculations of the Interior point algorithm can easily be derived in a fashion similar to calculations for when an LP has no upper bounds. See the paper by Lustig, Marsten, and Shanno (1992). An important point is that upper bounds can be handled by specializing the algorithm and not by generating the constraints x + z = u and adding these to the main primal constraints A x = b.

Chapter Contents
Chapter Contents
Previous
Previous
Next
Next
Top
Top

Copyright © 2000 by SAS Institute Inc., Cary, NC, USA. All rights reserved.