The NLIN Procedure

Notation for Nonlinear Regression Models

This section briefly introduces the basic notation for nonlinear regression models that applies in this chapter. Additional notation is introduced throughout as needed.

The $(n \times 1)$ vector of observed responses is denoted as $\mb{y}$ . This vector is the realization of an $(n \times 1)$ random vector $\mb{Y}$ . The NLIN procedure assumes that the variance matrix of this random vector is $\sigma ^2\mb{I}$ . In other words, the observations have equal variance (are homoscedastic) and are uncorrelated. By defining the special variable _WEIGHT_ in your NLIN programming statements, you can introduce heterogeneous variances. If a _WEIGHT_ variable is present, then $\mr{Var}[\mb{Y}] = \sigma ^2\mb{W}^{-1}$ , where $\mb{W}$ is a diagonal matrix containing the values of the _WEIGHT_ variable.
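As a small numeric illustration of the weighted case, the following Python sketch computes the diagonal of $\sigma ^2\mb{W}^{-1}$ for a handful of observations. The value of `sigma2` and the weights are made up for the example. The code is not NLIN syntax; it only shows that a larger weight implies a smaller variance for that observation.

```python
# Illustrative values only: sigma2 and the weights are assumptions.
sigma2 = 4.0
weights = [1.0, 2.0, 4.0]   # diagonal entries of W, one per observation

# Var[Y_i] = sigma2 / w_i, the i-th diagonal entry of sigma2 * W^{-1}
variances = [sigma2 / w for w in weights]
```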

The mean of the random vector is represented by a nonlinear model that depends on parameters $\beta _1,\cdots ,\beta _ p$ and regressor (independent) variables $z_1,\cdots ,z_ k$ :

$\mr{E}[Y_ i] = f\left(\beta _1,\beta _2,\cdots ,\beta _ p;z_{i1},\cdots ,z_{ik}\right)$

In contrast to linear models, the number of regressor variables (k) does not necessarily equal the number of parameters (p) in the mean function $f(\, )$ . For example, the model fitted in the next subsection contains a single regressor and two parameters.
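To make the distinction between k and p concrete, the following Python sketch evaluates a mean function with one regressor (k = 1) and two parameters (p = 2). The functional form $\beta _1(1 - \exp(-\beta _2 z))$ and all numeric values are hypothetical choices for illustration; they are not necessarily the model fitted in the next subsection.

```python
import math

def mean_function(beta, z):
    """Hypothetical mean function f(beta1, beta2; z) = beta1 * (1 - exp(-beta2 * z))."""
    beta1, beta2 = beta
    return beta1 * (1.0 - math.exp(-beta2 * z))

beta = (10.0, 0.5)                 # p = 2 parameters (illustrative values)
z_values = [0.0, 1.0, 2.0, 4.0]    # k = 1 regressor observed at n = 4 points

# E[Y_i] for each observation
means = [mean_function(beta, z) for z in z_values]
```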

To represent the mean of the vector of observations, boldface notation is used in an obvious extension of the previous equation:

$\mr{E}[\mb{Y}] = \mb{f}(\bbeta ;\mb{z}_1,\cdots ,\mb{z}_ k)$

The vector $\mb{z}_1$, for example, is an $(n \times 1)$ vector of the values for the first regressor variable. The explicit dependence of the mean function on $\bbeta$ or the $\mb{z}$ vectors is often omitted for brevity.

In summary, the stochastic structure of models fit with the NLIN procedure is mathematically captured by

$\begin{align*} \mb{Y} & = \mb{f}(\bbeta ;\mb{z}_1,\cdots ,\mb{z}_ k) + \bepsilon \\ \mr{E}[\bepsilon ] & = \mb{0} \\ \mr{Var}[\bepsilon ] & = \sigma ^2\mb{I} \end{align*}$

Note that the residual variance $\sigma ^2$ is typically also unknown. Since it is not estimated in the same fashion as the other p parameters, it is often not counted in the number of parameters of the nonlinear regression. An estimate of $\sigma ^2$ is obtained after the model fit by the method of moments based on the residual sum of squares.
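The moment estimate can be sketched in a few lines of Python. The sketch assumes the usual divisor $n - p$ (residual degrees of freedom), so that the estimate is the residual sum of squares divided by $n - p$. The function name `residual_variance` and the data are hypothetical.

```python
def residual_variance(y, fitted, p):
    """Estimate sigma^2 as SSE / (n - p), assuming the n - p divisor."""
    n = len(y)
    if n <= p:
        raise ValueError("need more observations than parameters")
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    return sse / (n - p)

# Made-up responses and fitted means from a model with p = 2 parameters
y = [1.0, 2.0, 4.0, 7.0]
fitted = [1.5, 2.5, 3.5, 6.5]
sigma2_hat = residual_variance(y, fitted, p=2)
```

Note that $\sigma ^2$ does not enter the count `p`, which is consistent with the convention described above.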

A matrix that plays an important role in fitting nonlinear regression models is the $(n \times p)$ matrix of the first partial derivatives of the mean function $\mb{f}$ with respect to the p model parameters. It is frequently denoted as

$\mb{X} = \frac{\partial \mb{f}\left(\bbeta ;\mb{z}_1,\cdots ,\mb{z}_ k\right)}{\partial \bbeta }$
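As an illustration, the following Python sketch builds this $(n \times p)$ matrix for a hypothetical mean function $\beta _1(1 - \exp(-\beta _2 z))$ with a single regressor, once from the analytic partial derivatives and once by central finite differences as a check. The model, the parameter values, and the function names are assumptions made for the example and do not describe how the NLIN procedure computes derivatives.

```python
import math

def f(beta, z):
    """Hypothetical mean function beta1 * (1 - exp(-beta2 * z))."""
    b1, b2 = beta
    return b1 * (1.0 - math.exp(-b2 * z))

def derivative_matrix(beta, z_values):
    """Analytic n x p matrix: row i holds df/dbeta1 and df/dbeta2 at z_i."""
    b1, b2 = beta
    return [[1.0 - math.exp(-b2 * z), b1 * z * math.exp(-b2 * z)]
            for z in z_values]

def numeric_derivative_matrix(beta, z_values, h=1e-6):
    """Central finite-difference approximation of the same matrix."""
    rows = []
    for z in z_values:
        row = []
        for j in range(len(beta)):
            up, down = list(beta), list(beta)
            up[j] += h
            down[j] -= h
            row.append((f(up, z) - f(down, z)) / (2.0 * h))
        rows.append(row)
    return rows

z_values = [0.5, 1.0, 2.0]
X = derivative_matrix((10.0, 0.5), z_values)
X_num = numeric_derivative_matrix((10.0, 0.5), z_values)
```

Both columns of `X` involve $\beta _2$, and the second also involves $\beta _1$, so the matrix changes whenever the parameter values change.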

The use of the symbol $\mb{X}$, common in linear statistical modeling, is no accident here. The first derivative matrix plays a similar role in nonlinear regression to that of the $\mb{X}$ matrix in a linear model. For example, the asymptotic variance of the nonlinear least-squares estimators is proportional to $(\mb{X}'\mb{X})^{-1}$, and projection-type matrices in nonlinear regressions are based on $\mb{X}(\mb{X}'\mb{X})^{-1}\mb{X}'$. Also, fitting a nonlinear regression model can be cast as an iterative process in which the nonlinear model is approximated by a series of linear models whose regressor matrix is the derivative matrix.

An important difference between linear and nonlinear models is that the derivatives in a linear model do not depend on any parameters (see previous subsection). In contrast, in a nonlinear model the derivative matrix $\partial \mb{f}(\bbeta )/\partial \bbeta$ is a function of at least one element of $\bbeta$. Because of this dependence, the parameters of a nonlinear model generally cannot be estimated in closed form. Instead, estimation is an iterative process that starts from user-supplied starting values and attempts to improve the parameter estimates at each step.
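The idea of repeatedly solving a linear least-squares problem in the derivative matrix can be sketched as a Gauss-Newton iteration. The Python code below is a minimal illustration under stated assumptions: the mean function $\beta _1(1 - \exp(-\beta _2 z))$, the synthetic noise-free data, the starting values, the simple step-halving safeguard, and the convergence tolerance are all hypothetical choices. It is not a description of the NLIN procedure's internal algorithm.

```python
import math

def f(beta, z):
    """Hypothetical mean function beta1 * (1 - exp(-beta2 * z))."""
    b1, b2 = beta
    return b1 * (1.0 - math.exp(-b2 * z))

def jacobian_row(beta, z):
    """One row of the derivative matrix X at regressor value z."""
    b1, b2 = beta
    e = math.exp(-b2 * z)
    return (1.0 - e, b1 * z * e)

def sse(beta, zs, ys):
    """Residual sum of squares at beta."""
    return sum((y - f(beta, z)) ** 2 for z, y in zip(zs, ys))

def gauss_newton(beta, zs, ys, tol=1e-10, max_iter=50):
    beta = tuple(beta)
    for _ in range(max_iter):
        rows = [jacobian_row(beta, z) for z in zs]
        resid = [y - f(beta, z) for z, y in zip(zs, ys)]
        # Normal equations (X'X) delta = X'r for the linearized model (p = 2)
        a11 = sum(r[0] * r[0] for r in rows)
        a12 = sum(r[0] * r[1] for r in rows)
        a22 = sum(r[1] * r[1] for r in rows)
        g1 = sum(r[0] * e for r, e in zip(rows, resid))
        g2 = sum(r[1] * e for r, e in zip(rows, resid))
        det = a11 * a22 - a12 * a12
        d1 = (a22 * g1 - a12 * g2) / det
        d2 = (a11 * g2 - a12 * g1) / det
        if max(abs(d1), abs(d2)) < tol:
            break  # proposed update is negligible: treat as converged
        # Step halving: shrink the step until the SSE decreases
        step, current = 1.0, sse(beta, zs, ys)
        while step > 1e-6:
            trial = (beta[0] + step * d1, beta[1] + step * d2)
            if sse(trial, zs, ys) < current:
                beta = trial
                break
            step /= 2.0
        else:
            break  # no improving step found
    return beta

# Synthetic, noise-free data generated from beta = (10, 0.5)
zs = [0.5, 1.0, 2.0, 4.0, 8.0]
ys = [f((10.0, 0.5), z) for z in zs]
estimate = gauss_newton((8.0, 0.3), zs, ys)   # user-supplied starting values
```

Each pass recomputes the derivative matrix at the current estimate, which is exactly the step a linear model never needs because its derivatives are free of the parameters.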