The linear least squares problem arises in the context of determining a solution to an overdetermined set of linear equations. In practice, these equations could arise in data fitting and estimation problems. An overdetermined system of linear equations can be defined as
![\[ \mathbf{Ax} = \mathbf{b} \]](images/ormpug_optqp0069.png)
where
,
,
, and
. Since this system usually does not have a solution, you need to be satisfied with some sort of approximate solution. The
most widely used approximation is the least squares solution, which minimizes
.
This problem is called a least squares problem for the following reason. Let
,
, and
be defined as previously. Let
be the kth component of the vector
:
![\[ k_ i(x) = a_{i1}x_1 + a_{i2}x_2 + \cdots + a_{in}x_ n - b_ i,\; i = 1,2,\ldots , m \]](images/ormpug_optqp0077.png)
By definition of the Euclidean norm, the objective function can be expressed as follows:
![\[ \| \mathbf{Ax} - \mathbf{b}\| _2^2 = \sum \limits _{i = 1}^ m k_ i(x)^2 \]](images/ormpug_optqp0078.png)
Therefore, the function you minimize is the sum of squares of m terms
; hence the term least squares. The following example is an illustration of the linear least squares problem; that is, each of the terms
is a linear function of x.function
plus a constant,
.
Consider the following least squares problem defined by
![\[ \mathbf{A} = \left[\begin{array}{rc} 4 & 0 \\ -1 & 1 \\ 3 & 2 \end{array}\right],\quad \mathbf{b} = \left[\begin{array}{c} 1 \\ 0 \\ 1 \end{array}\right] \]](images/ormpug_optqp0082.png)
This translates to the following set of linear equations:
![\[ 4x_1 = 1, \quad -x_1 + x_2 = 0, \quad 3x_1 + 2x_2 = 1 \]](images/ormpug_optqp0083.png)
The corresponding least squares problem is
![\[ \mbox{minimize} \quad (4x_1 -1)^2 + (-x_1 + x_2)^2 + (3x_1 + 2x_2 -1)^2 \]](images/ormpug_optqp0084.png)
The preceding objective function can be expanded to
![\[ \mbox{minimize} \quad 26x_1^2 + 5x_2^2 + 10 x_1x_2 - 14 x_1 -4x_2 + 2 \]](images/ormpug_optqp0085.png)
In addition, you impose the following constraint so that the equation
is satisfied within a tolerance of 0.1:
![\[ 0.9 \leq 3x_1 + 2x_2 \leq 1.1 \]](images/ormpug_optqp0087.png)
You can create the QPS-format input data set by using the following SAS statements:
data lsdata; input field1 $ field2 $ field3 $ field4 field5 $ field6 @; datalines; NAME . LEASTSQ . . . ROWS . . . . . N OBJ . . . . G EQ3 . . . . COLUMNS . . . . . . X1 OBJ -14 EQ3 3 . X2 OBJ -4 EQ3 2 RHS . . . . . . RHS OBJ -2 EQ3 0.9 RANGES . . . . . . RNG EQ3 0.2 . . BOUNDS . . . . . FR BND1 X1 . . . FR BND1 X2 . . . QUADOBJ . . . . . . X1 X1 52 . . . X1 X2 10 . . . X2 X2 10 . . ENDATA . . . . . ;
The decision variables
and
are free, so they have bound type FR in the BOUNDS section of the QPS-format data set.
You can use the following SAS statements to solve the least squares problem:
proc optqp data=lsdata printlevel = 0 primalout = lspout; run;
The optimal solution is displayed in Output 14.1.1.
Output 14.1.1: Solution to the Least Squares Problem
The iteration log is shown in Output 14.1.2.
Output 14.1.2: Iteration Log
| NOTE: The problem LEASTSQ has 2 variables (2 free, 0 fixed). |
| NOTE: The problem has 1 constraints (0 LE, 0 EQ, 0 GE, 1 range). |
| NOTE: The problem has 2 constraint coefficients. |
| NOTE: The objective function has 2 Hessian diagonal elements and 1 Hessian |
| elements above the diagonal. |
| NOTE: The QP presolver value AUTOMATIC is applied. |
| NOTE: The QP presolver removed 0 variables and 0 constraints. |
| NOTE: The QP presolver removed 0 constraint coefficients. |
| NOTE: The presolved problem has 2 variables, 1 constraints, and 2 constraint |
| coefficients. |
| NOTE: The QP solver is called. |
| NOTE: The Interior Point algorithm is used. |
| NOTE: The deterministic parallel mode is enabled. |
| NOTE: The Interior Point algorithm is using up to 4 threads. |
| Primal Bound Dual |
| Iter Complement Duality Gap Infeas Infeas Infeas Time |
| 0 1.9181E-02 5.8936E-03 1.9637E-08 0.0000E+00 3.5390E-04 0 |
| 1 9.0486E-04 2.8311E-04 8.6896E-10 1.1565E-17 1.3055E-05 0 |
| 2 1.5370E-05 4.9441E-06 6.4151E-11 2.3130E-17 1.3055E-07 0 |
| 3 1.5357E-07 4.9397E-08 1.7428E-12 5.7824E-18 1.3056E-09 0 |
| NOTE: Optimal. |
| NOTE: Objective = 0.0095238095. |
| NOTE: The Interior Point solve time is 0.02 seconds. |
| NOTE: The data set WORK.LSPOUT has 2 observations and 9 variables. |