The NLMIXED Procedure

Finite-Difference Approximations of Derivatives

The FD= and FDHESSIAN= options specify the use of finite-difference approximations of the derivatives. The FD= option specifies that all derivatives are approximated using function evaluations, and the FDHESSIAN= option specifies that second-order derivatives are approximated using gradient evaluations.

Computing derivatives by finite-difference approximations can be very time-consuming, especially for second-order derivatives based only on values of the objective function (FD= option). If analytical derivatives are difficult to obtain (for example, if a function is computed by an iterative process), you might consider one of the optimization techniques that use first-order derivatives only (QUANEW, DBLDOG, or CONGRA). In the expressions that follow, $\btheta$ denotes the parameter vector, $h_i$ denotes the step size for the ith parameter, and $\mb{e}_i$ is a vector of zeros with a 1 in the ith position.
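
The remaining subsections pair each formula with a short Python sketch. These sketches are illustrative only (they are not SAS code); the two-parameter objective f, its analytic gradient grad, and the point theta below are hypothetical stand-ins for the objective function and parameter vector used by the later sketches:

    import numpy as np

    # Hypothetical two-parameter objective (a Rosenbrock function) standing
    # in for the objective function f(theta).
    def f(theta):
        return (1.0 - theta[0])**2 + 100.0 * (theta[1] - theta[0]**2)**2

    # Its analytic gradient, used by the Hessian-from-gradients sketches.
    def grad(theta):
        return np.array([
            -2.0 * (1.0 - theta[0]) - 400.0 * theta[0] * (theta[1] - theta[0]**2),
            200.0 * (theta[1] - theta[0]**2),
        ])

    theta = np.array([0.5, 0.25])   # current parameter vector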

Forward-Difference Approximations

The forward-difference derivative approximations consume less computer time, but they are usually not as precise as approximations that use central-difference formulas. (A code sketch of all three formulas follows this list.)

  • For first-order derivatives, n additional function calls are required:

    \begin{align*}  g_i & = \frac{\partial f}{\partial \theta_i} \approx \frac{f(\btheta + h_i\mb{e}_i) - f(\btheta)}{h_i} \end{align*}
  • For second-order derivatives based on function calls only (Dennis and Schnabel, 1983, p. 80), $n+n^2/2$ additional function calls are required for a dense Hessian:

    \begin{align*}  \frac{\partial^2 f}{\partial \theta_i \partial \theta_j} & \approx \frac{f(\btheta + h_i\mb{e}_i + h_j\mb{e}_j) - f(\btheta + h_i\mb{e}_i) - f(\btheta + h_j\mb{e}_j) + f(\btheta)}{h_i h_j} \end{align*}
  • For second-order derivatives based on gradient calls (Dennis and Schnabel, 1983, p. 103), n additional gradient calls are required:

    \begin{align*}  \frac{\partial^2 f}{\partial \theta_i \partial \theta_j} & \approx \frac{g_i(\btheta + h_j\mb{e}_j) - g_i(\btheta)}{2h_j} + \frac{g_j(\btheta + h_i\mb{e}_i) - g_j(\btheta)}{2h_i} \end{align*}
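
Taken together, the three formulas above are straightforward to implement. The following Python sketch (illustrative only; the arguments f, grad, theta, and h stand for the objective function, its gradient, the parameter vector, and the step-size vector) shows one way they might be coded:

    import numpy as np

    def fd_gradient(f, theta, h):
        """Forward-difference gradient: n additional function calls."""
        n = len(theta)
        f0 = f(theta)
        g = np.zeros(n)
        for i in range(n):
            g[i] = (f(theta + h[i] * np.eye(n)[i]) - f0) / h[i]
        return g

    def fd_hessian_function_calls(f, theta, h):
        """Forward-difference Hessian from function values only."""
        n = len(theta)
        f0 = f(theta)
        fi = [f(theta + h[i] * np.eye(n)[i]) for i in range(n)]  # n calls, reused below
        H = np.zeros((n, n))
        for i in range(n):
            for j in range(i, n):
                step = h[i] * np.eye(n)[i] + h[j] * np.eye(n)[j]
                H[i, j] = (f(theta + step) - fi[i] - fi[j] + f0) / (h[i] * h[j])
                H[j, i] = H[i, j]  # fill the symmetric entry
        return H

    def fd_hessian_gradient_calls(grad, theta, h):
        """Forward-difference Hessian from n additional gradient calls."""
        n = len(theta)
        g0 = grad(theta)
        gj = [grad(theta + h[j] * np.eye(n)[j]) for j in range(n)]  # n calls
        H = np.zeros((n, n))
        for i in range(n):
            for j in range(n):
                H[i, j] = ((gj[j][i] - g0[i]) / (2.0 * h[j])
                           + (gj[i][j] - g0[j]) / (2.0 * h[i]))
        return H

Note that $f(\btheta)$ and the single-step values $f(\btheta + h_i\mb{e}_i)$ are computed once and reused across the double loop, which is what keeps the number of additional evaluations at the levels quoted above.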

Central-Difference Approximations

Central-difference approximations are usually more precise, but they consume more computer time than approximations that use forward-difference derivative formulas. (A code sketch of the three formulas follows this list.)

  • For first-order derivatives, 2n additional function calls are required:

    \begin{align*}  g_i & = \frac{\partial f}{\partial \theta_i} \approx \frac{f(\btheta + h_i\mb{e}_i) - f(\btheta - h_i\mb{e}_i)}{2h_i} \end{align*}
  • For second-order derivatives based on function calls only (Abramowitz and Stegun, 1972, p. 884), $2n+4n^2/2$ additional function calls are required:

    \begin{align*}  \frac{\partial^2 f}{\partial \theta^2_i} & \approx \frac{-f(\btheta + 2h_i\mb{e}_i) + 16f(\btheta + h_i\mb{e}_i) - 30f(\btheta) + 16f(\btheta - h_i\mb{e}_i) - f(\btheta - 2h_i\mb{e}_i)}{12h^2_i} \end{align*}
    \begin{align*}  \frac{\partial^2 f}{\partial \theta_i \partial \theta_j} & \approx \frac{f(\btheta + h_i\mb{e}_i + h_j\mb{e}_j) - f(\btheta + h_i\mb{e}_i - h_j\mb{e}_j) - f(\btheta - h_i\mb{e}_i + h_j\mb{e}_j) + f(\btheta - h_i\mb{e}_i - h_j\mb{e}_j)}{4h_ih_j} \end{align*}
  • For second-order derivatives based on gradient calls, 2n additional gradient calls are required:

    \begin{align*}  \frac{\partial^2 f}{\partial \theta_i \partial \theta_j} & \approx \frac{g_i(\btheta + h_j\mb{e}_j) - g_i(\btheta - h_j\mb{e}_j)}{4h_j} + \frac{g_j(\btheta + h_i\mb{e}_i) - g_j(\btheta - h_i\mb{e}_i)}{4h_i} \end{align*}
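
The central-difference formulas can be sketched the same way (again illustrative only; f, grad, theta, and h are as in the forward-difference sketch). The two-sided evaluations are what roughly double the function-call cost relative to the forward-difference versions:

    import numpy as np

    def cd_gradient(f, theta, h):
        """Central-difference gradient: 2n additional function calls."""
        n = len(theta)
        g = np.zeros(n)
        for i in range(n):
            step = h[i] * np.eye(n)[i]
            g[i] = (f(theta + step) - f(theta - step)) / (2.0 * h[i])
        return g

    def cd_hessian_function_calls(f, theta, h):
        """Central-difference Hessian from function values only."""
        n = len(theta)
        H = np.zeros((n, n))
        for i in range(n):
            ei = h[i] * np.eye(n)[i]
            # five-point formula for the diagonal elements
            H[i, i] = (-f(theta + 2.0 * ei) + 16.0 * f(theta + ei) - 30.0 * f(theta)
                       + 16.0 * f(theta - ei) - f(theta - 2.0 * ei)) / (12.0 * h[i]**2)
            for j in range(i + 1, n):
                ej = h[j] * np.eye(n)[j]
                # four-point formula for the off-diagonal elements
                H[i, j] = (f(theta + ei + ej) - f(theta + ei - ej)
                           - f(theta - ei + ej) + f(theta - ei - ej)) / (4.0 * h[i] * h[j])
                H[j, i] = H[i, j]
        return H

    def cd_hessian_gradient_calls(grad, theta, h):
        """Central-difference Hessian from 2n additional gradient calls."""
        n = len(theta)
        gp = [grad(theta + h[j] * np.eye(n)[j]) for j in range(n)]
        gm = [grad(theta - h[j] * np.eye(n)[j]) for j in range(n)]
        H = np.zeros((n, n))
        for i in range(n):
            for j in range(n):
                H[i, j] = ((gp[j][i] - gm[j][i]) / (4.0 * h[j])
                           + (gp[i][j] - gm[i][j]) / (4.0 * h[i]))
        return H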

You can use the FDIGITS= option to specify the number of accurate digits in the evaluation of the objective function. This specification is helpful in determining an appropriate interval size h to be used in the finite-difference formulas.

The step sizes $h_j$, $j=1,\ldots,n$, are defined as follows:

  • For the forward-difference approximation of first-order derivatives that use function calls and second-order derivatives that use gradient calls, $h_j = \sqrt{\eta} (1 + |\theta_j|)$.

  • For the forward-difference approximation of second-order derivatives that use only function calls and all central-difference formulas, $h_j = \sqrt[3]{\eta} (1 + |\theta_j|)$.

The value of $\eta$ is defined by the FDIGITS= option (a sketch of the step-size computation follows this list):

  • If you specify the number of accurate digits by using FDIGITS=r, $\eta$ is set to $10^{-r}$.

  • If you do not specify the FDIGITS= option, $\eta$ is set to the machine precision $\epsilon$.
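
These step-size rules are easy to express in code. The following Python sketch (a hypothetical helper, assuming IEEE double precision so that np.finfo(float).eps plays the role of the machine precision $\epsilon$) computes the vector of step sizes for either case:

    import numpy as np

    def step_sizes(theta, fdigits=None, use_cube_root=False):
        """Step sizes h_j = eta**(1/2 or 1/3) * (1 + |theta_j|).

        fdigits: number of accurate digits in the objective (the FDIGITS= value),
            or None if unspecified.
        use_cube_root: True for forward-difference second-order derivatives based
            only on function calls and for all central-difference formulas.
        """
        eta = 10.0**(-fdigits) if fdigits is not None else np.finfo(float).eps
        root = 1.0 / 3.0 if use_cube_root else 0.5
        return eta**root * (1.0 + np.abs(theta))

For example, step_sizes(theta) gives the square-root rule used for forward-difference gradients, while step_sizes(theta, use_cube_root=True) gives the cube-root rule.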