The OPTQP Procedure

Example 14.1 Linear Least Squares Problem

The linear least squares problem arises in the context of determining a solution to an overdetermined set of linear equations. In practice, these equations could arise in data fitting and estimation problems. An overdetermined system of linear equations can be defined as

\[ \mathbf{Ax} = \mathbf{b} \]

where $\mathbf{A} \in \mathbb {R}^{m \times n}$, $\mathbf{x}\in \mathbb {R}^ n$, $\mathbf{b} \in \mathbb {R}^ m$, and $m > n$. Since this system usually does not have a solution, you need to be satisfied with some sort of approximate solution. The most widely used approximation is the least squares solution, which minimizes $\|  \mathbf{Ax} - \mathbf{b}\| _2^2$.

This problem is called a least squares problem for the following reason. Let $\mathbf{A}$, $\mathbf{x}$, and $\mathbf{b}$ be defined as previously. Let $k_ i(x)$ be the kth component of the vector $\mathbf{Ax} - \mathbf{b}$:

\[ k_ i(x) = a_{i1}x_1 + a_{i2}x_2 + \cdots + a_{in}x_ n - b_ i,\; i = 1,2,\ldots , m \]

By definition of the Euclidean norm, the objective function can be expressed as follows:

\[ \| \mathbf{Ax} - \mathbf{b}\| _2^2 = \sum \limits _{i = 1}^ m k_ i(x)^2 \]

Therefore, the function you minimize is the sum of squares of m terms $k_ i(x)$; hence the term least squares. The following example is an illustration of the linear least squares problem; that is, each of the terms $k_ i$ is a linear function of x.function $\sum _{ij}a_{ij}x_ j$ plus a constant, $-b_ i$.

Consider the following least squares problem defined by

\[ \mathbf{A} = \left[\begin{array}{rc} 4 & 0 \\ -1 & 1 \\ 3 & 2 \end{array}\right],\quad \mathbf{b} = \left[\begin{array}{c} 1 \\ 0 \\ 1 \end{array}\right] \]

This translates to the following set of linear equations:

\[ 4x_1 = 1, \quad -x_1 + x_2 = 0, \quad 3x_1 + 2x_2 = 1 \]

The corresponding least squares problem is

\[ \mbox{minimize} \quad (4x_1 -1)^2 + (-x_1 + x_2)^2 + (3x_1 + 2x_2 -1)^2 \]

The preceding objective function can be expanded to

\[ \mbox{minimize} \quad 26x_1^2 + 5x_2^2 + 10 x_1x_2 - 14 x_1 -4x_2 + 2 \]

In addition, you impose the following constraint so that the equation $3x_1 + 2x_2 = 1$ is satisfied within a tolerance of 0.1:

\[ 0.9 \leq 3x_1 + 2x_2 \leq 1.1 \]

You can create the QPS-format input data set by using the following SAS statements:

data lsdata;
   input field1 $ field2 $ field3 $ field4 field5 $ field6 @;
   datalines;
NAME     .      LEASTSQ     .            .         .
ROWS     .      .           .            .         .
N        OBJ    .           .            .         .
G        EQ3    .           .            .         .
COLUMNS  .      .           .            .         .
.        X1     OBJ        -14           EQ3       3
.        X2     OBJ        -4            EQ3       2
RHS      .      .           .            .         .
.        RHS    OBJ        -2            EQ3       0.9
RANGES   .      .           .            .         .
.        RNG    EQ3         0.2          .         .
BOUNDS   .      .           .            .         .
FR       BND1   X1          .            .         .
FR       BND1   X2          .            .         .
QUADOBJ  .      .           .            .         .
.        X1     X1          52           .         .
.        X1     X2          10           .         .
.        X2     X2          10           .         .
ENDATA   .      .           .            .         .
;

The decision variables $x_1$ and $x_2$ are free, so they have bound type FR in the BOUNDS section of the QPS-format data set.

You can use the following SAS statements to solve the least squares problem:

proc optqp data=lsdata
   printlevel = 0
   primalout = lspout;
run;

The optimal solution is displayed in Output 14.1.1.

Output 14.1.1: Solution to the Least Squares Problem

Primal Solution

Obs Objective
Function ID
RHS ID Variable
Name
Variable
Type
Linear Objective
Coefficient
Lower Bound Upper Bound Variable Value Variable
Status
1 OBJ RHS X1 F -14 -1.7977E308 1.7977E308 0.23810 O
2 OBJ RHS X2 F -4 -1.7977E308 1.7977E308 0.16190 O



The iteration log is shown in Output 14.1.2.

Output 14.1.2: Iteration Log

NOTE: The problem LEASTSQ has 2 variables (2 free, 0 fixed).                    
NOTE: The problem has 1 constraints (0 LE, 0 EQ, 0 GE, 1 range).                
NOTE: The problem has 2 constraint coefficients.                                
NOTE: The objective function has 2 Hessian diagonal elements and 1 Hessian      
      elements above the diagonal.                                              
NOTE: The QP presolver value AUTOMATIC is applied.                              
NOTE: The QP presolver removed 0 variables and 0 constraints.                   
NOTE: The QP presolver removed 0 constraint coefficients.                       
NOTE: The presolved problem has 2 variables, 1 constraints, and 2 constraint    
      coefficients.                                                             
NOTE: The QP solver is called.                                                  
NOTE: The Interior Point algorithm is used.                                     
NOTE: The deterministic parallel mode is enabled.                               
NOTE: The Interior Point algorithm is using up to 4 threads.                    
                                        Primal       Bound        Dual          
      Iter  Complement Duality Gap      Infeas      Infeas      Infeas   Time   
         0  1.9181E-02  5.8936E-03  1.9637E-08  0.0000E+00  3.5390E-04      0   
         1  9.0486E-04  2.8311E-04  8.6896E-10  1.1565E-17  1.3055E-05      0   
         2  1.5370E-05  4.9441E-06  6.4151E-11  2.3130E-17  1.3055E-07      0   
         3  1.5357E-07  4.9397E-08  1.7428E-12  5.7824E-18  1.3056E-09      0   
NOTE: Optimal.                                                                  
NOTE: Objective = 0.0095238095.                                                 
NOTE: The Interior Point solve time is 0.02 seconds.                            
NOTE: The data set WORK.LSPOUT has 2 observations and 9 variables.