Numerical Solution Methods :: SAS/ETS(R) 12.1 User's Guide

Single-Equation Solution

For normalized form equation systems, the solution either can simultaneously satisfy all the equations or can be computed for each equation separately, by using the actual values of the solution variables in the current period to compute each predicted value. By default, PROC MODEL computes a simultaneous solution. The SINGLE option in the SOLVE statement selects single-equation solutions.

Single-equation simulations are often used to produce residuals (which estimate the random terms of the stochastic equations) rather than the predicted values themselves. If the input data and range are the same as those used for parameter estimation, a static single-equation simulation reproduces the residuals of the estimation.
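The difference between the two kinds of solution can be illustrated with a small sketch. The two-equation system, the "actual" values, and the residual sign (predicted minus actual) are assumptions made up for this illustration; they are not taken from PROC MODEL.

```python
# Hypothetical normalized-form system (illustration only):
#   y1 = 0.5*y2 + 1
#   y2 = 0.25*y1 + 2
def f1(y):
    return 0.5 * y["y2"] + 1.0

def f2(y):
    return 0.25 * y["y1"] + 2.0

actual = {"y1": 2.2, "y2": 2.4}  # made-up observed values for one period

# Single-equation solution: every right-hand side is evaluated at the
# ACTUAL values, so the equations are solved independently of each other.
single = {"y1": f1(actual), "y2": f2(actual)}
residuals = {k: single[k] - actual[k] for k in actual}  # assumed sign: pred - actual

# Simultaneous solution: both equations must hold at the same values.
# The system is linear, so substituting y2 into y1 gives y1 = 0.125*y1 + 2.
y1_sim = 2.0 / (1.0 - 0.125)
simultaneous = {"y1": y1_sim, "y2": f2({"y1": y1_sim})}
```

In this sketch the single-equation predictions depend on the data, whereas the simultaneous solution depends only on the equations themselves.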

Newton’s Method

The NEWTON option in the SOLVE statement requests Newton’s method to simultaneously solve the equations for each observation. Newton’s method is the default solution method. Newton’s method is an iterative scheme that uses the derivatives of the equations with respect to the solution variables, ${\bJ }$ , to compute a change vector as

${\Delta } \mb {y} ^{i} = -{\bJ }^{-1}\mb {q} (\mb {y} ^{i},\mb {x}, {{\btheta }})$

PROC MODEL builds and solves ${\bJ }$ by using efficient sparse matrix techniques. The solution variables $\mb {y} ^{i}$ at the ith iteration are then updated as

$\mb {y} ^{i+1} = \mb {y} ^{i} + d \times {\Delta } \mb {y} ^{i}$

where d is a damping factor between 0 and 1 chosen iteratively so that

${\Vert } \mb {q} (\mb {y} ^{i+1},\mb {x}, {{\btheta }}) {\Vert } < {\Vert } \mb {q} (\mb {y} ^{i},\mb {x}, {{\btheta }}) {\Vert }$

The number of subiterations that are allowed for finding a suitable d is controlled by the MAXSUBITER= option. The number of iterations of Newton’s method that are allowed for each observation is controlled by the MAXITER= option. See Ortega and Rheinboldt (1970) for more details.
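The damped iteration can be sketched in a few lines of Python. This is a minimal illustration, not the PROC MODEL implementation: it uses dense Gaussian elimination instead of sparse techniques, it takes the change vector as the Newton direction that solves J Δy = −q (a sign convention assumed here so that the step can reduce the residual norm), it halves d at each subiteration (the actual rule for choosing d is not stated in this section), and the test system is hypothetical. The arguments max_iter and max_subiter play the roles of MAXITER= and MAXSUBITER=.

```python
import math

def solve_linear(a, b):
    """Solve a*x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented copy
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def norm(v):
    return math.sqrt(sum(t * t for t in v))

def damped_newton(q, jac, y0, max_iter=50, max_subiter=10, tol=1e-10):
    y = list(y0)
    for _ in range(max_iter):
        r = q(y)
        if norm(r) < tol:
            return y
        delta = solve_linear(jac(y), [-t for t in r])  # Newton direction
        d = 1.0
        for _ in range(max_subiter):
            trial = [yi + d * di for yi, di in zip(y, delta)]
            if norm(q(trial)) < norm(r):   # accept only if the norm decreases
                break
            d /= 2.0                       # assumed damping rule: halve d
        else:
            raise RuntimeError("could not reduce norm of residuals")
        y = trial
    return y  # iteration limit reached; the result might not have converged

# Hypothetical test system: atan(y1) = 0 and y2 - 0.5*y1 = 0.
def q(y):
    return [math.atan(y[0]), y[1] - 0.5 * y[0]]

def jac(y):
    return [[1.0 / (1.0 + y[0] ** 2), 0.0], [-0.5, 1.0]]

solution = damped_newton(q, jac, [3.0, 0.0])
```

For this system, several full Newton steps overshoot and increase the residual norm, so the subiterations reduce d before a step is accepted.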

Optimization Method

The OPTIMIZE option in the SOLVE statement requests that an optimization algorithm be used to minimize a norm of the errors in equations subject to constraints on the solution variables. The OPTIMIZE method is the only solution method that supports constraints on solution variables that are specified using the BOUNDS and RESTRICT statements. Constraints are ignored by the other solution methods. The OPTIMIZE method performs the following optimization:

$\displaystyle \textrm{minimize} \qquad$	$\displaystyle \Vert \mb {q}(\mb {y},\mb {x},\btheta )\Vert$
$\displaystyle \textrm{subject to} \qquad$	$\displaystyle \mb {y}_ l \leq \mb {y} \leq \mb {y}_ u$
$\displaystyle \textrm{and} \qquad$	$\displaystyle f(\mb {y}) \geq 0$

The norm used in the minimization process is

$\Vert \mb {q}(\mb {y},\mb {x},\btheta )\Vert = \mb {q}(\mb {y},\mb {x},\btheta )'\textrm{diag}(\mb {S})^{-1}\mb {q}(\mb {y},\mb {x},\btheta )$

where the $\mb {S}$ matrix is the covariance of equation errors that is specified by the SDATA= option in the SOLVE statement. If no SDATA= option is specified, the identity matrix is used. Both strict inequality and inequality constraints on the solution variables can be imposed using the BOUNDS or RESTRICT statement. For bounded problems, each lower and upper strict inequality is transformed into an inequality by using the equations

$\displaystyle y_ l$	$\displaystyle = (y_{\textrm{lower strict}} + \epsilon )/(1 - \epsilon )$
$\displaystyle y_ u$	$\displaystyle = (y_{\textrm{upper strict}} - \epsilon )/(1 + \epsilon )$
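As a small numeric illustration of these two equations, the following sketch converts a pair of strict bounds into the inclusive bounds that the optimizer sees. The function name and the example bounds are hypothetical; the default of 1e-8 for the tolerance mirrors the default of the EPSILON= option.

```python
def strict_to_inclusive_bounds(lower_strict, upper_strict, eps=1e-8):
    """Apply the two transformation equations to a pair of strict bounds."""
    y_l = (lower_strict + eps) / (1.0 - eps)
    y_u = (upper_strict - eps) / (1.0 + eps)
    return y_l, y_u

# Hypothetical strict bounds 0 < y < 10.
y_l, y_u = strict_to_inclusive_bounds(0.0, 10.0)
```

For these example bounds, the inclusive interval lies slightly inside the strict one, so any y that satisfies the inclusive bounds also satisfies the strict bounds.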

When strict inequality expressions are imposed using the RESTRICT statement, these expressions are transformed into an inequality by using the equation

$f(\mb {y}) = (f_{\textrm{strict}}(\mb {y}) + \epsilon )/(1 - \epsilon )$

where $f_{\textrm{strict}}(\mb {y})$ is a nonlinear strict inequality constraint. The tolerance $\epsilon$ is controlled by the EPSILON= option in the SOLVE statement and defaults to $10^{-8}$ . To achieve the best performance from the minimization algorithm, both the first and second analytic derivatives of the equation errors with respect to the solution variables are used to compute the gradient and second derivatives of the objective function, $\Vert \mb {q}(\mb {y},\mb {x},\btheta )\Vert$ . Analytic derivatives of the restriction expressions that are used to specify constraints are also used in the minimization. The gradient of the objective function is

$\nabla \Vert \mb {q}(\mb {y},\mb {x},\btheta )\Vert = 2\, \mb {J}'\textrm{diag}(\mb {S})^{-1}\mb {q}(\mb {y},\mb {x},\btheta )$
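The gradient formula can be checked numerically. The following sketch evaluates the objective and its analytic gradient for a hypothetical two-equation system with an assumed diagonal of S, and compares the result with a central finite difference. All functions and values are made up for the illustration.

```python
def q(y):
    # Hypothetical equation errors.
    return [y[0] ** 2 + y[1] - 3.0, y[0] - y[1] ** 2]

def jacobian(y):
    # J[k][j] = d q_k / d y_j for the errors above.
    return [[2.0 * y[0], 1.0], [1.0, -2.0 * y[1]]]

S_DIAG = [4.0, 0.25]  # assumed diagonal of the error covariance matrix S

def objective(y):
    """q' diag(S)^{-1} q."""
    return sum(qk * qk / s for qk, s in zip(q(y), S_DIAG))

def gradient(y):
    """2 J' diag(S)^{-1} q."""
    w = [qk / s for qk, s in zip(q(y), S_DIAG)]
    J = jacobian(y)
    return [2.0 * sum(J[k][j] * w[k] for k in range(len(w)))
            for j in range(len(y))]

def fd_gradient(y, h=1e-6):
    """Central finite-difference approximation of the gradient."""
    g = []
    for j in range(len(y)):
        up = list(y)
        dn = list(y)
        up[j] += h
        dn[j] -= h
        g.append((objective(up) - objective(dn)) / (2.0 * h))
    return g

y = [1.5, 0.5]
analytic = gradient(y)
numeric = fd_gradient(y)
```

The two gradients agree to within the accuracy of the finite difference, which is the kind of check that analytic derivatives make unnecessary in the solver itself.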

The matrix of second derivatives of the objective function with respect to the solution variables is

$\frac{\partial ^2\Vert \mb {q}(\mb {y},\mb {x},\btheta )\Vert }{\partial \mb {y}^2} = 2\left(\mb {J}'\textrm{diag}(\mb {S})^{-1}\mb {J} + \sum _{k=1}^ d \frac{\partial ^2q_ k(\mb {y},\mb {x},\btheta )}{\partial \mb {y}^2} \textrm{diag}(\mb {S})^{-1}q_ k(\mb {y},\mb {x},\btheta ) \right)$

where $d$ is the number of equations.

The algorithm that is used to find a minimum of $\Vert \mb {q}(\mb {y},\mb {x},\btheta )\Vert$ subject to bounds on the solution variables employs the interior point technique for nonlinear optimization problems. For further information about this optimization method, see Chapter 8: The Nonlinear Programming Solver in SAS/OR 12.1 User's Guide: Mathematical Programming.

When constraints are active in a solution, the minimum value of the objective function, $\Vert \mb {q}(\mb {y},\mb {x},\btheta )\Vert$ , is typically greater than 0. The diagnostic quantities that are produced by the OUTOBJVALS and OUTVIOLATIONS options are available to help identify and characterize solutions that have active bounds constraints. The following program contains a boundary constraint that becomes active in steps 6, 8, 10, 12, 13, and 16 of a Monte Carlo simulation:

proc model data=d sdata=s;
   dependent rate stock;
   parms theta   0.2
         kappa   0.002
         sigma   0.4
         sinit   1
         vol     .1;
   id i;

   bounds rate >= 0;

   rate   = zlag(rate) + kappa*(theta - zlag(rate));
   h.rate = sigma**2 * zlag(rate);
   eq.stock = log(stock/sinit) - (rate + vol*vol/2);
   h.stock = vol**2;

   solve / optimize converge=1e-6 seed=1 random=1 out=o outobjvals outviolations;
quit;   

proc print data=o(where=(_objval_>1e-6));
run;

Figure 19.84 shows how the OUTOBJVALS option can be used to identify simulation steps with an active bounds constraint, and how the OUTVIOLATIONS option can be used to determine that the RATE equation is not satisfied for those steps.

Figure 19.84: Objective Function and Violation Values

Obs	i	_TYPE_	_MODE_	_REP_	_OBJVAL_	rate	stock
51	6	PREDICT	SIMULATE	1	.000363415	0.000027	1.03050
52	6	VIOL	SIMULATE	1	.000363415	-0.019073	0.00000
55	8	PREDICT	SIMULATE	1	.000123866	0.000045	1.08828
56	8	VIOL	SIMULATE	1	.000123866	-0.011151	0.00000
59	10	PREDICT	SIMULATE	1	.000330766	0.000028	0.96248
60	10	VIOL	SIMULATE	1	.000330766	-0.018207	-0.00000
63	12	PREDICT	SIMULATE	1	.000034095	0.000086	0.85526
64	12	VIOL	SIMULATE	1	.000034095	-0.005895	-0.00000
65	13	PREDICT	SIMULATE	1	.000011997	0.000141	1.10514
66	13	VIOL	SIMULATE	1	.000011997	-0.003573	-0.00000
71	16	PREDICT	SIMULATE	1	.000118982	0.000046	1.07103
72	16	VIOL	SIMULATE	1	.000118982	-0.010931	0.00000

Jacobi Method

The JACOBI option in the SOLVE statement selects a matrix-free alternative to Newton’s method. This method is the traditional nonlinear Jacobi method found in the literature. The Jacobi method as implemented in PROC MODEL substitutes predicted values for the endogenous variables and iterates until a fixed point is reached. Any necessary derivatives are computed only for the diagonal elements of the Jacobian, ${\bJ }$ .

If the normalized form equation is

$\mb {y} = \mb {f} (\mb {y} ,\mb {x} , {{\btheta }})$

the Jacobi iteration has the form

$\mb {y} ^{i+1} = \mb {f} (\mb {y} ^{i},\mb {x} , {{\btheta }})$
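The Jacobi iteration is a plain fixed-point iteration, as the following sketch shows. The two-equation system, the starting values, and the stopping rule (largest change below a tolerance) are assumptions chosen for the illustration.

```python
def jacobi(f, y0, max_iter=200, tol=1e-12):
    """Iterate y <- f(y) until the largest change falls below tol."""
    y = list(y0)
    for iteration in range(1, max_iter + 1):
        y_new = f(y)  # every equation sees only the previous iterate
        if max(abs(a - b) for a, b in zip(y_new, y)) < tol:
            return y_new, iteration
        y = y_new
    raise RuntimeError("Jacobi iteration did not converge")

# Hypothetical normalized-form system:
#   y1 = 0.5*y2 + 1
#   y2 = 0.25*y1 + 2
def f(y):
    y1, y2 = y
    return [0.5 * y2 + 1.0, 0.25 * y1 + 2.0]

solution, iterations = jacobi(f, [0.0, 0.0])
```

No derivatives or matrix factorizations are needed, which is why the method uses little memory. Convergence is not guaranteed in general; it relies on the mapping f being a contraction near the solution, which holds for this example.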

Seidel Method

The Seidel method is an order-dependent alternative to the Jacobi method. You select the Seidel method by specifying the SEIDEL option in the SOLVE statement. The Seidel method is like the Jacobi method, except that in the Seidel method the model is further edited to substitute the predicted values into the solution variables immediately after they are computed. The Seidel method thus differs from the other methods in that the values of the solution variables are not fixed within an iteration. With the other methods, the order of the equations in the model program makes no difference, but the Seidel method might work much differently when the equations are specified in a different sequence. This fixed-point method is the traditional nonlinear Seidel method found in the literature.

The iteration has the form

$y^{i+1}_{j} = f_{j} (\mb {\hat{y}} ^{i},\mb {x}, {{\btheta }})$

where $y^{i+1}_{j}$ is the jth equation variable at iteration i+1 and

$\mb {\hat{y}} ^{i} = ( y^{i+1}_{1}, y^{i+1}_{2}, y^{i+1}_{3}, {\ldots }, y^{i+1}_{j-1}, y^{i}_{j}, y^{i}_{j+1}, {\ldots }, y^{i}_{g})'$

If the model is recursive, and if the equations are in recursive order, the Seidel method converges at once. If the model is block-recursive, the Seidel method might converge faster if the equations are grouped by block and the blocks are placed in block-recursive order. The BLOCK option can be used to determine the block-recursive form.
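The order dependence of the Seidel method can be seen in a short sketch. The three-equation recursive model, the exogenous value, and the stopping rule are hypothetical. The sweep count that is returned includes the final sweep, which only confirms that nothing changed.

```python
def seidel(equations, y0, x, max_iter=100, tol=1e-12):
    """Sweep the equations in the given order until no value changes."""
    y = dict(y0)
    for sweep in range(1, max_iter + 1):
        largest_change = 0.0
        for name, f in equations:
            new_value = f(y, x)
            largest_change = max(largest_change, abs(new_value - y[name]))
            y[name] = new_value  # immediately visible to later equations
        if largest_change < tol:
            return y, sweep
    raise RuntimeError("Seidel iteration did not converge")

# Hypothetical recursive model: each equation depends only on earlier ones.
recursive_order = [
    ("y1", lambda y, x: 2.0 * x),
    ("y2", lambda y, x: y["y1"] + 1.0),
    ("y3", lambda y, x: y["y1"] * y["y2"]),
]
start = {"y1": 0.0, "y2": 0.0, "y3": 0.0}

sol_fwd, sweeps_fwd = seidel(recursive_order, start, x=3.0)
sol_rev, sweeps_rev = seidel(list(reversed(recursive_order)), start, x=3.0)
```

With the equations in recursive order, the first sweep already produces the solution and the second sweep confirms it. With the order reversed, the same solution is reached, but only after additional sweeps.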

Jacobi and Seidel Methods with General Form Equations

Jacobi and Seidel solution methods support general form equations.

There are two cases where derivatives are (automatically) computed. The first case is for equations with the solution variable on the right-hand side and on the left-hand side of the equation

$y^ i = f( \mb {x}, y^ i )$

In this case the derivative of ERROR.y with respect to y is computed, and the new approximation is computed as

$y^{i+1} = y^{ i} - \frac{ f( \mb {x}, y^{i} ) - y^{i}}{ {{\partial }( f( \mb {x}, y^{i}) - y^{i} ) / {\partial }y}}$

The second case is a system of equations that contains one or more EQ. equations. In this case, the MODEL procedure assigns a unique solution variable to each equation if such an assignment exists. Use the DETAILS option in the SOLVE statement to print a listing of the assigned variables.

Once the assignment is made, the new approximation is computed as

$y^{i+1} = y^{i} - \frac{ f( \mb {x}, \mb {y}^{i}) - y^{i}}{ {{\partial }( f( \mb {x}, \mb {y}^{i}) - y^{i} ) / {\partial }y}}$

If $n$ is the number of general form equations, then $n$ derivatives are required.
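The update for general form equations amounts to one scalar Newton step per equation, using only the derivative of each equation error with respect to its assigned variable. The following sketch assumes a hypothetical two-equation system written as residuals that should equal zero, with the first equation assigned to the first variable and the second equation to the second.

```python
def jacobi_general(residuals, diag_derivs, y0, max_iter=100, tol=1e-12):
    """Jacobi iteration that uses only the diagonal derivatives."""
    y = list(y0)
    for iteration in range(1, max_iter + 1):
        g = [r(y) for r in residuals]      # equation errors at the old y
        d = [dr(y) for dr in diag_derivs]  # one derivative per equation
        y_new = [yj - gj / dj for yj, gj, dj in zip(y, g, d)]
        if max(abs(a - b) for a, b in zip(y_new, y)) < tol:
            return y_new, iteration
        y = y_new
    raise RuntimeError("Jacobi iteration did not converge")

# Hypothetical general form equations with solution (2, 1).
residuals = [
    lambda y: y[0] ** 2 + 0.2 * y[1] - 4.2,  # assigned to y[0]
    lambda y: 0.2 * y[0] + y[1] ** 3 - 1.4,  # assigned to y[1]
]
diag_derivs = [
    lambda y: 2.0 * y[0],       # derivative of equation 1 w.r.t. y[0]
    lambda y: 3.0 * y[1] ** 2,  # derivative of equation 2 w.r.t. y[1]
]

solution, iterations = jacobi_general(residuals, diag_derivs, [1.0, 1.0])
```

Only two derivatives are evaluated per iteration here, one for each equation, rather than the four entries of the full Jacobian. The example converges because the equations are only weakly coupled.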

The convergence properties of the Jacobi and Seidel solution methods remain significantly poorer than the default Newton’s method.

Comparison of Methods

Newton’s method is the default and should work better than the others for most small- to medium-sized models. The Seidel method is always faster than the Jacobi method for recursive models with equations in recursive order. For very large models and some highly nonlinear smaller models, the Jacobi or Seidel methods can sometimes be faster. Newton’s method uses more memory than the Jacobi or Seidel methods.

Both Newton’s method and the Jacobi method are order-invariant in the sense that the order in which equations are specified in the model program has no effect on the operation of the iterative solution process. In order-invariant methods, the values of the solution variables are fixed for the entire execution of the model program. Assignments to model variables are automatically changed to assignments to corresponding equation variables. Only after the model program has completed execution are the results used to compute the new solution values for the next iteration.

Troubleshooting Problems

In solving a simultaneous nonlinear dynamic model you might encounter some of the following problems.

Missing Values

For SOLVE tasks, there can be no missing parameter values. Missing right-hand-side variables result in missing left-hand-side variables for that observation.

Unstable Solutions

A solution might exist but be unstable. An unstable system can cause the Jacobi and Seidel methods to diverge.
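A one-equation sketch illustrates the point. The equation below is hypothetical; it has the exact solution y = 1, but the fixed-point update moves away from that solution, whereas a single Newton step on the equation error recovers it.

```python
def f(y):
    # Hypothetical normalized-form equation y = 2*y - 1, solved by y = 1.
    return 2.0 * y - 1.0

# Fixed-point (Jacobi-style) iteration started close to the solution.
y_jacobi = 1.1
for _ in range(20):
    y_jacobi = f(y_jacobi)  # the distance from y = 1 doubles on every pass

# One Newton step on the error f(y) - y, whose derivative is 2 - 1 = 1.
y0 = 1.1
y_newton = y0 - (f(y0) - y0) / (2.0 - 1.0)
```

The fixed-point iteration diverges because the slope of f at the solution exceeds 1 in absolute value. Newton's method does not depend on that condition.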

Explosive Dynamic Systems

A model might have well-behaved solutions at each observation but be dynamically unstable. The solution might oscillate wildly or grow rapidly with time.

Propagation of Errors

During the solution process, solution variables can take on values that cause computational errors. For example, a solution variable that appears in a LOG function might be positive at the solution but might be given a negative value during one of the iterations. When computational errors occur, missing values are generated and propagated, and the solution process might collapse.
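The LOG example can be reproduced with a short sketch. The equation, the starting value, and the use of NaN to stand in for a missing value are assumptions made for the illustration.

```python
import math

def safe_log(v):
    """Return log(v), or NaN (standing in for a missing value) if v <= 0."""
    return math.log(v) if v > 0 else float("nan")

def error(y):
    # Hypothetical equation log(y) + 1 = 0, solved by y = exp(-1) > 0.
    return safe_log(y) + 1.0

def newton_step(y):
    return y - error(y) / (1.0 / y)  # d(error)/dy = 1/y

y0 = 3.0
y1 = newton_step(y0)  # the full step overshoots to a negative value
y2 = newton_step(y1)  # LOG of a negative value yields a missing value
y3 = newton_step(y2)  # the missing value propagates to later iterations
```

Although the solution itself is positive, an undamped step from this starting value lands on a negative iterate, and every later iterate is missing.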

Convergence Problems

The following items can cause convergence problems:

There are illegal function values (for example, ${\sqrt {-1}}$ ).
There are local minima in the model equation.
No solution exists.
Multiple solutions exist.
Initial values are too far from the solution.
The CONVERGE= value is too small.

When PROC MODEL fails to find a solution to the system, the current iteration information and the program data vector are printed. The simulation halts if actual values are not available for the simulation to proceed. Consider the following program, which produces the output shown in Figure 19.85:

data test1;
   do t=1 to 50;
      x1 = sqrt(t) ;
      y = .;
      output;
   end;

proc model data=test1;
   exogenous x1 ;
   control a1 -1 b1 -29 c1 -4 ;
   y = a1 * sqrt(y) + b1 * x1 * x1 + c1 * lag(x1);
   solve y / out=sim forecast dynamic ;
run;

Figure 19.85: SOLVE Convergence Problems

The MODEL Procedure

Dynamic Single-Equation Forecast

Could not reduce norm of residuals in 10 subiterations.

The solution failed because 1 equations are missing or have extreme values for observation 1 at NEWTON iteration 1.

Note:

Additional information on the values of the variables at this observation, which may be helpful in determining the cause of the failure of the solution process, is printed below.

Observation	1	Iteration	1	CC	-1.000000
		Missing	1

Iteration Errors - Missing.

                              The MODEL Procedure                               
                        Dynamic Single-Equation Forecast                        
                                                                                
                     --- Listing of Program Data Vector ---                     
   _N_:               12     ACTUAL.x1:    1.41421     ACTUAL.y:           .    
   ERROR.y:            .     PRED.y:             .     a1:                -1    
   b1:               -29     c1:                -4     x1:           1.41421    
   y:           -0.00109                                                        
   @PRED.y/@y:           .   @ERROR.y/@y:          .

Note:

Check for missing input data or uninitialized lags.

(Note that the LAG and DIF functions return missing values for the initial lag starting observations. This is a change from the 1982 and earlier versions of SAS/ETS which returned zero for uninitialized lags.)

Note:

Simulation aborted.

At the first observation, a solution to the following equation is attempted:

$y = - \sqrt {y} - 62$

There is no solution to this problem. The iterative solution process got as close as it could to making Y negative while still being able to evaluate the model. This problem can be avoided in this case by altering the equation.
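The absence of a solution can be confirmed directly. For any y ≥ 0 the right-hand side is at most −62, so it cannot equal y, and for y < 0 the square root is undefined. The following sketch evaluates the equation error, taken here as the predicted value minus y, on an arbitrary grid of nonnegative values.

```python
import math

def error(y):
    """Predicted value minus y for y = -sqrt(y) - 62; needs y >= 0."""
    return (-math.sqrt(y) - 62.0) - y

grid = [k * 0.01 for k in range(0, 100001)]  # y from 0 to 1000
errors = [error(y) for y in grid]
closest = max(errors)  # all errors are negative; the largest is nearest 0
```

The error is closest to zero at y = 0, where it equals −62, and it moves farther from zero as y increases.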

In other models, the problem of missing values can be avoided by either altering the data set to provide better starting values for the solution variables or by altering the equations.

You should be aware that, in general, a nonlinear system can have any number of solutions and the solution found might not be the one that you want. When multiple solutions exist, the solution that is found is usually determined by the starting values for the iterations. If the value from the input data set for a solution variable is missing, the starting value for it is taken from the solution of the last period (if nonmissing) or else the solution estimate is started at 0.

Iteration Output

The iteration output, produced by the ITPRINT option, is useful in determining the cause of a convergence problem. The ITPRINT option forces the printing of the solution approximation and equation errors at each iteration for each observation. A portion of the ITPRINT output from the following statements is shown in Figure 19.86.

proc model data=test1;
   exogenous x1 ;
   control a1 -1 b1 -29 c1 -4 ;
   y = a1 * sqrt(abs(y)) + b1 * x1 * x1 + c1 * lag(x1);
   solve y / out=sim forecast dynamic itprint;
run;

For each iteration, the equation with the largest error is listed in parentheses after the Newton convergence criteria measure. From this output you can determine which equation or equations in the system are not converging well.

Figure 19.86: SOLVE, ITPRINT Output

The MODEL Procedure

Dynamic Single-Equation Forecast

Observation	1	Iteration	0	CC	613961.39	ERROR.y	-62.01010

Predicted Values
y
0.0001000

Iteration Errors
y
-62.01010

Observation	1	Iteration	1	CC	50.902771	ERROR.y	-61.88684

Predicted Values
y
-1.215784

Iteration Errors
y
-61.88684

Observation	1	Iteration	2	CC	0.364806	ERROR.y	41.752112

Predicted Values
y
-114.4503

Iteration Errors
y
41.75211
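The values in this portion of the output appear to be consistent with full Newton steps applied to the equation error, taken as the predicted value minus y, for the first-observation equation y = −sqrt(abs(y)) − 62. The following sketch is an independent recomputation under that assumption, not the PROC MODEL algorithm itself; the stopping tolerance and iteration limit are arbitrary.

```python
import math

def error(y):
    """Predicted value minus y for y = -sqrt(abs(y)) - 62."""
    return (-math.sqrt(abs(y)) - 62.0) - y

def d_error(y):
    """Analytic derivative of error(y) with respect to y (for y != 0)."""
    return -math.copysign(1.0, y) / (2.0 * math.sqrt(abs(y))) - 1.0

y = 0.0001  # predicted value shown at iteration 0
trace = [(y, error(y))]
for _ in range(20):
    y = y - error(y) / d_error(y)  # full Newton step, no damping applied
    trace.append((y, error(y)))
    if abs(error(y)) < 1e-10:
        break
```

The first three entries of trace match the predicted values and iteration errors listed for iterations 0, 1, and 2 to within rounding. In this sketch the iteration then settles near y = −70.39, which satisfies the modified equation.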