The Nonlinear Programming Solver

Covariance Matrix

You must specify the COVEST=() option to compute an approximate covariance matrix for the parameter estimates under asymptotic theory for least squares, maximum likelihood, or Bayesian estimation, with or without corrections for degrees of freedom as specified in the VARDEF= option.
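For example, the following statements are a minimal sketch of requesting a covariance matrix from the NLP solver for a small least squares model. The model itself is hypothetical, and passing VARDEF= and COVSING= as COVEST= suboptions is an assumption based on the options discussed in this section; see the solver's syntax section for the exact list of suboptions.

proc optmodel;
   var x{1..2} init 0;
   /* the terms f_i(x) are declared as implicit variables */
   impvar f{i in 1..2} = x[i] - i;
   /* positively scaled sum of squared terms (s = 0.5) */
   min z = 0.5*sum{i in 1..2} f[i]^2;
   /* the suboptions shown here are assumed; adjust to the documented syntax */
   solve with nlp / covest=(vardef=df covsing=1.0e-8);
   print x;
quit;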

The standard form of this class of problems is one of the following:

  • least squares (LSQ): $ \min f(x) = s \sum _{i = 1 }^{m} f^2_{i}(x) $

  • minimum or maximum (MIN/MAX): $ \mr{opt} \,  f(x) = s \sum _{i = 1 }^{m} f_{i}(x) $

Two groups of six different forms of covariance matrices (and therefore approximate standard errors) can be computed, corresponding to the following two situations, where TERMS is an index set.

  • LSQ: The objective function consists solely of a positively scaled sum of squared terms, which means that least squares estimates are being computed:

    \[  \min f(x) = s \sum _{(i,j)\in \mbox{TERMS}} f^2_{ij}(x)  \]

    where $s > 0$.

  • MIN or MAX: The MIN or MAX declaration is specified, and the objective is not in least squares form. Together, these characteristics mean that maximum likelihood or Bayesian estimates are being computed:

    \[  \mr{opt} \,  f(x) = s \sum _{(i,j)\in \mbox{TERMS}} f_{ij}(x)  \]

    where $ \mr{opt}$ is either $ \min $ or $ \max $ and $s$ is arbitrary.

In the preceding equations, TERMS is used to denote an arbitrary index set. For example, if your problem is

\[  \min z = 0.5 \sum _{ i \in \mathcal{I}} \left( g_1[i]^2 + g_2[i]^2 \right)  \]

then TERMS = $ \{ (i, j): i \in \mathcal{I} \mbox{ and } j \in \{ 1, 2 \}  \}  $, where $\mathcal{I}$ is the index set of the input data.
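A sketch of that objective in PROC OPTMODEL (assuming a set I and implicit variables g1 and g2 indexed by it; the names are hypothetical) is

/* s = 0.5 scales the whole sum of squared terms */
min z = 0.5*sum{i in I} (g1[i]^2 + g2[i]^2);

The following rules apply when you specify your objective function: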

  • The terms $f_{ij}(x)$ are defined by using the IMPVAR declaration. The $i$ and $j$ values can be partitioned among observation and function indices as needed. Any number of indices can be used, including none for scalar (nonarray) implicit variables.

  • The objective consists of a scaled sum of terms (or squared terms for least squares). The scaling, shown as s in the preceding equations, consists of outer multiplication or division by constants of the unscaled sum of terms (or squared terms for least squares). The unary $+$ or $-$ operators can also be used for scaling.

  • Least squares objectives require the scaling to be positive ($s > 0$). The individual $f_{ij}$ values are scaled by $\sqrt {2s}$ by PROC OPTMODEL, as shown in the worked example after this list.

  • Objectives that are not least squares allow arbitrary scaling. The scale value is distributed to the $f_{ij}$ values.

  • The summation of terms (or squared terms for least squares) is constructed with the binary $+$, SUM, and IF-THEN-ELSE operators (where IF-THEN-ELSE must have a first operand that does not depend on variables). The operands can be terms or a summation of terms (or squared terms for least squares).

  • A squared term is specified as term^2 or term**2.

  • The terms can be either IMPVAR expressions or constant expressions (expressions that do not depend on variables). Each IMPVAR element can be referenced at most once in the objective.

  • The number of $f_{ij}$ terms is determined by counting the nonconstant terms.
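As a worked instance of the $\sqrt{2s}$ scaling, consider the objective z3 that is declared below. The outer scaling collapses to the single constant $s = 3/2$, so PROC OPTMODEL multiplies each term by $\sqrt{2s} = \sqrt{3}$:

\[  \min z_3 = 3 \left( \sum _{i \in \mbox{OBS}} (g[i]^2 + h[i]^2) \right) / 2 \quad \Rightarrow \quad s = \frac{3}{2}, \qquad \sqrt{2s} = \sqrt{3}  \]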

The following PROC OPTMODEL statements demonstrate these rules:

var x{VARS};
impvar g{OBS} = ...;
impvar h{OBS} = ...;

/* This objective is okay. */
min z1 = sum{i in OBS} (g[i] + h[i]);

/* This objective is okay. */
min z2 = 0.5*sum{i in OBS} (g[i]^2 + h[i]^2);

/* This objective is okay. It demonstrates multiple levels of scaling. */
min z3 = 3*(sum{i in OBS} (g[i]^2 + h[i]^2))/2;

/* This objective is okay. */
min z4 = (sum{i in OBS} (g[i]^2 + h[i]^2))/2;

Note that the following statements are not accepted:

/* This objective causes an error because individual scaling is not allowed. */
/* (division applies to inner term)                                          */
min z5 = sum{i in OBS} (g[i]^2 + h[i]^2)/2;

/* This objective causes an error because individual scaling is not allowed. */
min z6 = sum{i in OBS} 0.5*g[i]^2;

/* This objective causes an error because the element g[1] is repeated. */
min z7 = g[1] + sum{i in OBS} g[i];
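
Equivalent objectives that are accepted move the scaling outside the summation, as the rules above require (z5a duplicates z4; z6a is a sketch of the same correction applied to z6):

/* Accepted rewrites: the scaling applies to the whole sum. */
min z5a = (sum{i in OBS} (g[i]^2 + h[i]^2))/2;
min z6a = 0.5*sum{i in OBS} g[i]^2;

The repeated reference in z7 cannot be repaired by scaling; each IMPVAR element can appear at most once in the objective.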

The covariance matrix is always positive semidefinite. For MAX type problems, the covariance matrix is converted to MIN type by using negative Hessian, Jacobian, and function values in the computation. You can use the following options to check for a rank deficiency of the covariance matrix:

  • The ASINGULAR= and MSINGULAR= options enable you to set two singularity criteria for the inversion of the matrix $ A$ that is needed to compute the covariance matrix, when $ A$ is either the Hessian or one of the crossproduct Jacobian matrices. The singularity criterion that is used for the inversion is

    \[  |d_{j,j}| \le \max (\Argument{asing}, \Argument{msing} \times \max (|A_{1,1}|,\ldots ,|A_{n,n}|))  \]

    where $ d_{j,j}$ is the diagonal pivot of the matrix $ A$, and asing and msing are the specified values of the ASINGULAR= and MSINGULAR= options, respectively.

  • If the matrix $ A$ is found to be singular, the NLP solver computes a generalized inverse that satisfies Moore-Penrose conditions. The generalized inverse is computed using the computationally expensive but numerically stable eigenvalue decomposition, $ A = Z \Lambda Z^ T$, where $ Z$ is the orthogonal matrix of eigenvectors and $\Lambda $ is the diagonal matrix of eigenvalues, $ \Lambda = \mr{diag}(\lambda _1,\ldots ,\lambda _ n) $. The generalized inverse of $ A$ is set to

    \[  A^- = Z \Lambda ^- Z^ T  \]

    where the diagonal matrix $\Lambda ^- = \mr{diag}(\lambda ^-_1,\ldots ,\lambda ^-_ n)$ is defined as follows, where covsing is the specified value of the COVSING= option:

    \[  \lambda ^-_ i = \left\{  \begin{array}{ll} 1 / \lambda _ i &  \mbox{if $|\lambda _ i| > \Argument{covsing}$} \\ 0 &  \mbox{if $|\lambda _ i| \le \Argument{covsing}$} \end{array} \right.  \]

    If the COVSING= option is not specified, then the default is $ \max (\Argument{asing}, \Argument{msing} \times \max (|A_{1,1}|,\ldots ,|A_{n,n}|))$, where asing and msing are the specified values of the ASINGULAR= and MSINGULAR= options, respectively.
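As a small worked instance of this generalized inverse (with hypothetical values), consider a diagonal matrix $A$, for which $Z$ is the identity. With $\Argument{covsing} = 10^{-8}$, the eigenvalue $10^{-12}$ falls below the threshold, so its reciprocal is replaced by 0:

\[  A = \left( \begin{array}{cc} 4 & 0 \\ 0 & 10^{-12} \end{array} \right), \qquad A^- = \left( \begin{array}{cc} 1/4 & 0 \\ 0 & 0 \end{array} \right)  \]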

For problems of the MIN or LSQ type, the matrices that are used to compute the covariance matrix are

\[  G = \nabla ^2 f(x)  \]
\[  J(f) = (\nabla f_1,\ldots ,\nabla f_ m) = \left( \frac{\partial f_ i}{\partial x_ j} \right)  \]
\[  JJ(f) = J(f)^ T J(f)  \]
\[  V = J(f)^ T \textrm{diag}(f_ i^2) J(f)  \]
\[  W = J(f)^ T (\textrm{diag}(f_ i))^{-1} J(f)  \]

where $f_ i$ is defined in the standard form of the covariance matrix problem. Note that when some $f_ i$ are 0, $(\textrm{diag}(f_ i))^{-1}$ is computed as a generalized inverse.
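For example, for the hypothetical terms $f_1(x) = x_1 - 1$ and $f_2(x) = 2 x_2$,

\[  J(f) = \left( \begin{array}{cc} 1 & 0 \\ 0 & 2 \end{array} \right), \qquad JJ(f) = J(f)^ T J(f) = \left( \begin{array}{cc} 1 & 0 \\ 0 & 4 \end{array} \right)  \]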

For unconstrained minimization, the formulas for the six types of covariance matrices are given in Table 10.2. The value of $d$ in the table depends on the VARDEF= option and on the values of the NDF= and NTERMS= options, ndf and nterms, respectively, as follows:

\[  d = \left\{  \begin{array}{ll} \max (1,\Argument{nterms} - \Argument{ndf}) &  \mbox{for VARDEF=DF} \\ \Argument{nterms} &  \mbox{for VARDEF=N} \end{array} \right.  \]

The value of $\sigma ^2$ depends on the specification of the SIGSQ= option and on the value of $d$:

\[  \sigma ^2 = \left\{  \begin{array}{ll} \Argument{sq} \times \Argument{nterms}/ d &  \mbox{if SIGSQ=}\Argument{sq}\mbox{ is specified} \\ 2 f(x^*) / d &  \mbox{if SIGSQ= is not specified} \end{array} \right.  \]

where $ f(x^*)$ is the value of the objective function at the optimal solution $ x^*$.
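For example, with the hypothetical values nterms = 100 and ndf = 3 under VARDEF=DF, and with SIGSQ= not specified and $f(x^*) = 48.5$,

\[  d = \max (1, 100 - 3) = 97, \qquad \sigma ^2 = 2 \times 48.5 / 97 = 1  \]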

Because of its analytic definition, the covariance matrix is positive semidefinite at the solution in exact arithmetic. A warning message is issued if the numerical computation does not result in a positive semidefinite matrix. This can happen because of round-off error or because an inappropriate type of covariance matrix was selected for the problem.