The OPTQP Procedure

Example 12.1 Linear Least Squares Problem

The linear least squares problem arises in the context of determining a solution to an overdetermined set of linear equations. In practice, these equations could arise in data fitting and estimation problems. An overdetermined system of linear equations can be defined as

\[  \mathbf{Ax} = \mathbf{b}  \]

where $\mathbf{A} \in \mathbb {R}^{m \times n}$, $\mathbf{x}\in \mathbb {R}^ n$, $\mathbf{b} \in \mathbb {R}^ m$, and $m > n$. Since this system usually does not have a solution, you need to be satisfied with some sort of approximate solution. The most widely used approximation is the least squares solution, which minimizes $\|  \mathbf{Ax} - \mathbf{b}\| _2^2$.

This problem is called a least squares problem for the following reason. Let $\mathbf{A}$, $\mathbf{x}$, and $\mathbf{b}$ be defined as previously. Let $k_ i(x)$ be the kth component of the vector $\mathbf{Ax} - \mathbf{b}$:

\[  k_ i(x) = a_{i1}x_1 + a_{i2}x_2 + \cdots + a_{in}x_ n - b_ i,\;  i = 1,2,\ldots , m  \]

By definition of the Euclidean norm, the objective function can be expressed as follows:

\[  \|  \mathbf{Ax} - \mathbf{b}\| _2^2 = \sum \limits _{i = 1}^ m k_ i(x)^2  \]

Therefore, the function you minimize is the sum of squares of m terms $k_ i(x)$; hence the term least squares. The following example is an illustration of the linear least squares problem; that is, each of the terms $k_ i$ is a linear function of x.function $\sum _{ij}a_{ij}x_ j$ plus a constant, $-b_ i$.

Consider the following least squares problem defined by

\[  \mathbf{A} = \left[\begin{array}{rc} 4 &  0 \\ -1 &  1 \\ 3 &  2 \end{array}\right],\quad \mathbf{b} = \left[\begin{array}{c} 1 \\ 0 \\ 1 \end{array}\right]  \]

This translates to the following set of linear equations:

\[  4x_1 = 1, \quad -x_1 + x_2 = 0, \quad 3x_1 + 2x_2 = 1  \]

The corresponding least squares problem is

\[  \mbox{minimize} \quad (4x_1 -1)^2 + (-x_1 + x_2)^2 + (3x_1 + 2x_2 -1)^2  \]

The preceding objective function can be expanded to

\[  \mbox{minimize} \quad 26x_1^2 + 5x_2^2 + 10 x_1x_2 - 14 x_1 -4x_2 + 2  \]

In addition, you impose the following constraint so that the equation $3x_1 + 2x_2 = 1$ is satisfied within a tolerance of $0.1$:

\[  0.9 \leq 3x_1 + 2x_2 \leq 1.1  \]

You can create the QPS-format input data set by using the following SAS statements:

data lsdata;
   input field1 $ field2 $ field3 $ field4 field5 $ field6 @;
   datalines;
NAME     .      LEASTSQ     .            .         .
ROWS     .      .           .            .         .
N        OBJ    .           .            .         .
G        EQ3    .           .            .         .
COLUMNS  .      .           .            .         .
.        X1     OBJ        -14           EQ3       3
.        X2     OBJ        -4            EQ3       2
RHS      .      .           .            .         .
.        RHS    OBJ        -2            EQ3       0.9
RANGES   .      .           .            .         .
.        RNG    EQ3         0.2          .         .
BOUNDS   .      .           .            .         .
FR       BND1   X1          .            .         .
FR       BND1   X2          .            .         .
QUADOBJ  .      .           .            .         .
.        X1     X1          52           .         .
.        X1     X2          10           .         .
.        X2     X2          10           .         .
ENDATA   .      .           .            .         .
;

The decision variables $x_1$ and $x_2$ are free, so they have bound type FR in the BOUNDS section of the QPS-format data set.

You can use the following SAS statements to solve the least squares problem:

proc optqp data=lsdata 
   printlevel = 0
   primalout = lspout; 
run; 

The optimal solution is displayed in Output 12.1.1.

Output 12.1.1: Solution to the Least Squares Problem

 
Primal Solution

Obs Objective
Function ID
RHS ID Variable
Name
Variable
Type
Linear Objective
Coefficient
Lower Bound Upper Bound Variable Value Variable
Status
1 OBJ RHS X1 F -14 -1.7977E308 1.7977E308 0.23810 O
2 OBJ RHS X2 F -4 -1.7977E308 1.7977E308 0.16190 O


The iteration log is shown in Output 12.1.2.

Output 12.1.2: Iteration Log

 

NOTE: The problem LEASTSQ has 2 variables (2 free, 0 fixed).                    
NOTE: The problem has 1 constraints (0 LE, 0 EQ, 0 GE, 1 range).                
NOTE: The problem has 2 constraint coefficients.                                
NOTE: The objective function has 2 Hessian diagonal elements and 1 Hessian      
      elements above the diagonal.                                              
NOTE: The QP presolver value AUTOMATIC is applied.                              
NOTE: The QP presolver removed 0 variables and 0 constraints.                   
NOTE: The QP presolver removed 0 constraint coefficients.                       
NOTE: The presolved problem has 2 variables, 1 constraints, and 2 constraint    
      coefficients.                                                             
NOTE: The QP solver is called.                                                  
NOTE: The Interior Point algorithm is used.                                     
NOTE: The deterministic parallel mode is enabled.                               
NOTE: The Interior Point algorithm is using up to 4 threads.                    
                                           Primal        Bound         Dual     
      Iter   Complement  Duality Gap       Infeas       Infeas       Infeas     
         0  4.35616e-02  7.22160e-03  1.96367e-08  1.17851e-01  1.32417e-03     
         1  5.86807e-03  1.94022e-03  1.97061e-10  1.17851e-03  1.32417e-05     
         2  1.99615e-04  6.67063e-05  5.20022e-12  2.46618e-05  2.77099e-07     
         3  2.07995e-06  6.95270e-07  1.71890e-13  2.47945e-07  2.78591e-09     
         4  0.00000e+00  7.42506e-17  0.00000e+00  0.00000e+00  3.94044e-17     
NOTE: Optimal.                                                                  
NOTE: Objective = 0.00952380952.                                                
NOTE: The Interior Point solve time is 0.00 seconds.                            
NOTE: The data set WORK.LSPOUT has 2 observations and 9 variables.