This section describes a number of problems that might occur in your analysis with PROC NLIN.
If you specify a grid of starting values that contains many points, the analysis might take excessive time since the procedure must go through the entire data set for each point on the grid.
The analysis might also take excessive time if your problem takes many iterations to converge, since each iteration requires as much time as a linear regression with predicted values and residuals calculated.
The matrix of partial derivatives can be singular, possibly indicating an overparameterized model. For example, if b0 starts at zero in the following model, the derivatives for b1 are all zero for the first iteration:
   parms b0=0 b1=.022;
   model pop=b0*exp(b1*(year-1790));
   der.b0=exp(b1*(year-1790));
   der.b1=(year-1790)*b0*exp(b1*(year-1790));
The first iteration changes a subset of the parameters; then the procedure can make progress in succeeding iterations. This singularity problem is local. The next example displays a global problem. The term b2 in the exponent is not identifiable since it trades roles with b0.
   parms b0=3.9 b1=.022 b2=0;
   model pop=b0*exp(b1*(year-1790)+b2);
   der.b0 = exp(b1*(year-1790)+b2);
   der.b1 = (year-1790)*b0*exp(b1*(year-1790)+b2);
   der.b2 = b0*exp(b1*(year-1790)+b2);
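One way to remove the global singularity is to drop b2, because b0 and exp(b2) enter the model only as a single multiplicative constant. The following is a minimal sketch, not a definitive analysis. The data set name uspop and its construction are assumptions made for illustration; the values are U.S. census populations in millions for 1790 through 1970. Starting b0 away from zero also avoids the local problem described earlier.

```sas
/* Assumed illustration data: U.S. census population (millions), */
/* one value per decade from 1790 to 1970.                       */
data uspop;
   input pop @@;
   retain year 1780;
   year = year + 10;
   datalines;
3.929 5.308 7.239 9.638 12.866 17.069 23.191 31.443 39.818 50.155
62.947 75.994 91.972 105.710 122.775 131.669 151.325 179.323 203.211
;
run;

/* b2 is dropped, so b0 alone carries the multiplicative constant. */
/* b0 starts at a nonzero value, so der.b1 is not identically zero. */
proc nlin data=uspop;
   parms b0=3.9 b1=.022;
   model pop=b0*exp(b1*(year-1790));
   der.b0=exp(b1*(year-1790));
   der.b1=(year-1790)*b0*exp(b1*(year-1790));
run;
```

If the constant exp(b2) is of interest in its own right, it can be recovered after the fit only by fixing either b0 or b2 in advance; the data cannot separate the two.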
The iterative method can lead to steps that do not improve the estimates even after a series of step halvings. If this happens, the procedure issues a message stating that it is unable to make further progress, and it then displays the following warning message:
PROC NLIN failed to converge
Then it displays the results. This often means that the procedure has not converged at all. If you provided your own derivatives, check them carefully and then examine the residual sum of squares surface. If PROC NLIN has not converged, try a different set of starting values, a different METHOD= specification, the G4 option, or a different model.
The iterative process might diverge, resulting in overflows in computations. It is also possible that parameters enter a space where arguments to such functions as LOG and SQRT become invalid. For example, consider the following model:
   parms b=0;
   model y = x / b;
Suppose that y contains only zeros, and suppose that the values for variable x are not zero. There is no least squares estimate for b since the SSE declines as b approaches infinity or minus infinity. To avoid the problem, the same model could be parameterized as y = a*x, where a = 1/b; the least squares estimate a = 0 is then finite.
If you have divergence problems, try reparameterizing the model, selecting different starting values, increasing the maximum allowed number of iterations (the MAXITER= option), specifying an alternative METHOD= option, or including a BOUNDS statement.
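The remedies listed above can be combined in a single PROC NLIN step. The following is a minimal sketch under stated assumptions: the data set name uspop, the choice of METHOD=NEWTON, the iteration limit of 200, and the particular bounds are all illustrative choices rather than recommendations.

```sas
/* Assumed illustration data: U.S. census population (millions), */
/* one value per decade from 1790 to 1970.                       */
data uspop;
   input pop @@;
   retain year 1780;
   year = year + 10;
   datalines;
3.929 5.308 7.239 9.638 12.866 17.069 23.191 31.443 39.818 50.155
62.947 75.994 91.972 105.710 122.775 131.669 151.325 179.323 203.211
;
run;

/* METHOD= selects an alternative iteration method,              */
/* MAXITER= raises the iteration limit, and BOUNDS keeps the     */
/* parameters in a region where the model stays well behaved.    */
proc nlin data=uspop method=newton maxiter=200;
   parms b0=3.9 b1=.022;
   bounds b0 > 0, 0 <= b1 <= 1;
   model pop=b0*exp(b1*(year-1790));
run;
```

Bounds are most useful when they encode something known about the problem, such as a growth rate that cannot be negative. Bounds chosen only to stop divergence can leave an estimate sitting on a boundary, which is worth checking in the output.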
The program might converge to a local rather than a global minimum. For example, consider the following model:
   parms a=1 b=-1;
   model y=(1-a*x)*(1-b*x);
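Once a solution is found, an equivalent solution with the same SSE can be obtained by swapping the values of a and b, so the minimizing parameter values are not unique and the solution reached depends on the starting values.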
The computational methods assume that the model is a continuous and smooth function of the parameters. If this is not true, the method does not work. For example, the following models do not work:
model y=a+int(b*x);
model y=a+b*x+4*(z>c);
PROC NLIN does not necessarily produce a good solution the first time. Much depends on specifying good initial values for the parameters. You can specify a grid of values in the PARMS statement to search for good starting values. While most practical models should give you no trouble, other models can require switching to a different iteration method or a different computational method for matrix inversion. Specifying the option METHOD=MARQUARDT sometimes works when the default method (Gauss-Newton) does not work.
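A grid search for starting values and the METHOD=MARQUARDT option can be sketched as follows. The data set name uspop and the grid ranges are assumptions chosen for illustration. The grid below has 5 values for b0 and 5 for b1, which gives 25 points, and the procedure passes through the data once for each point, so keep the grid coarse.

```sas
/* Assumed illustration data: U.S. census population (millions), */
/* one value per decade from 1790 to 1970.                       */
data uspop;
   input pop @@;
   retain year 1780;
   year = year + 10;
   datalines;
3.929 5.308 7.239 9.638 12.866 17.069 23.191 31.443 39.818 50.155
62.947 75.994 91.972 105.710 122.775 131.669 151.325 179.323 203.211
;
run;

/* Each parameter gets a "start TO stop BY step" list; the grid  */
/* is searched for good starting values before iterating.        */
proc nlin data=uspop method=marquardt;
   parms b0=1 to 9 by 2
         b1=.01 to .05 by .01;
   model pop=b0*exp(b1*(year-1790));
run;
```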