The LOESS Procedure

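The relevant formulae are

\[
\mathrm{AICC}_1 = n \log(\hat{\sigma}^2) + n \, \frac{(\delta_1/\delta_2)(n + \nu_1)}{\delta_1^2/\delta_2 - 2}
\]
\[
\mathrm{AICC} = \log(\hat{\sigma}^2) + 1 + \frac{2(\mathrm{Trace}(L) + 1)}{n - \mathrm{Trace}(L) - 2}
\]
\[
\mathrm{GCV} = \frac{n \hat{\sigma}^2}{(n - \mathrm{Trace}(L))^2}
\]

where n is the number of observations and

\[
\delta_1 = \mathrm{Trace}\left( (I - L)^T (I - L) \right)
\]
\[
\delta_2 = \mathrm{Trace}\left( \left( (I - L)^T (I - L) \right)^2 \right)
\]
\[
\nu_1 = \mathrm{Trace}\left( L^T L \right)
\]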

You invoke automatic smoothing parameter selection by specifying the SELECT=criterion option in the MODEL statement, where criterion is one of AICC1, AICC, or GCV. PROC LOESS evaluates the specified criterion for a sequence of smoothing parameter values and selects the value in this sequence that minimizes the criterion. If several values attain the minimum, the largest of them is selected. The results are summarized in the "Smoothing Criterion" table, which is displayed whenever automatic smoothing parameter selection is performed. To see details of the sequence of models examined, specify the DETAILS(MODELSUMMARY) option in the MODEL statement, which displays the "Model Summary" table.
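The selection rule described above (minimize the criterion, break ties in favor of the largest smoothing parameter) can be sketched in a few lines of Python. This is only an illustration and not PROC LOESS code: the function names and the toy criterion are hypothetical, and ties are detected by exact equality for simplicity.

```python
def select_smoothing_parameter(values, criterion):
    """Return the value that minimizes criterion; ties go to the largest value."""
    scored = [(criterion(s), s) for s in values]
    best_score = min(score for score, _ in scored)
    # Among all values that attain the minimum, keep the largest one.
    return max(s for score, s in scored if score == best_score)


def toy_criterion(s):
    # Hypothetical stand-in for AICC/GCV: exactly zero at both 0.3 and 0.5.
    return min(abs(s - 0.3), abs(s - 0.5))


chosen = select_smoothing_parameter([0.1, 0.3, 0.5, 0.7], toy_criterion)
print(chosen)
```

Here 0.3 and 0.5 both attain the minimum, so the larger value, 0.5, is returned.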
There are several ways in which you can control the sequence of models examined by PROC LOESS. If you specify the SMOOTH=value-list option in the MODEL statement, then only the values in this list are examined in performing the selection. For example, the following statements select the model that minimizes the AICC1 criterion among the three models with smoothing parameter values 0.1, 0.3, and 0.4:
proc loess data=notReal;
   model y = x1 / smooth=0.1 0.3 0.4 select=AICC1;
run;
If you do not specify the SMOOTH= option in the MODEL statement, then by default PROC LOESS uses a golden section search to find a local minimum of the specified criterion in the range (0,1]. You can use the RANGE(lower,upper) modifier in the SELECT= option to change the interval in which the golden section search is performed. For example, the following statements request a golden section search for a local minimizer of the GCV criterion over smoothing parameter values in the interval [0.1,0.5]:
proc loess data=notReal;
   model y = x1 / select=GCV( range(0.1,0.5) );
run;
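For readers unfamiliar with golden section search, the following Python sketch shows the textbook version of the method on a continuous interval. It is not the PROC LOESS implementation: the function name, the tolerance, and the toy criterion are assumptions made for illustration, and the procedure itself works with a finite set of local models rather than a continuum.

```python
import math

# Inverse golden ratio, about 0.618.
INV_PHI = (math.sqrt(5) - 1) / 2


def golden_section_minimize(f, lower, upper, tol=1e-5):
    """Find a local minimizer of f on [lower, upper] by golden section search."""
    a, b = lower, upper
    c = b - INV_PHI * (b - a)
    d = a + INV_PHI * (b - a)
    fc, fd = f(c), f(d)
    while (b - a) > tol:
        if fc < fd:
            # The minimum lies in [a, d]; the old c becomes the new d.
            b, d, fd = d, c, fc
            c = b - INV_PHI * (b - a)
            fc = f(c)
        else:
            # The minimum lies in [c, b]; the old d becomes the new c.
            a, c, fc = c, d, fd
            d = a + INV_PHI * (b - a)
            fd = f(d)
    return (a + b) / 2


def toy_gcv(s):
    # Hypothetical unimodal criterion with its minimum at s = 0.3.
    return (s - 0.3) ** 2 + 1.0


best = golden_section_minimize(toy_gcv, 0.1, 0.5)
print(round(best, 3))
```

Each iteration reuses one of the two interior evaluations, so the interval shrinks by a constant factor at the cost of a single new criterion evaluation. Like any bracketing method, it is guaranteed to find only a local minimum when the criterion has several.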
If you want to be sure of obtaining a global minimum in the range of smoothing parameter values examined, you can specify the GLOBAL modifier in the SELECT= option. For example, the following statements request that a global minimizer of the AICC criterion be obtained for smoothing parameter values in the interval [0.2,0.8]:
proc loess data=notReal;
   model y = x1 / select=AICC( global range(0.2,0.8) );
run;
Note that even though the smoothing parameter is a continuous variable, a given range of smoothing parameter values corresponds to a finite set of local models. For example, for a data set with 100 observations, the range [0.2,0.4] corresponds to models with 20, 21, 22, ..., 40 points in the local neighborhoods. If the GLOBAL modifier is specified, all possible models in the range are evaluated sequentially.
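The correspondence between a range of smoothing parameter values and a finite set of models can be sketched as follows. The Python code assumes that a smoothing parameter s on n observations gives a local neighborhood of s*n points and that only whole-number neighborhood sizes are distinct models; the exact rounding rule used by PROC LOESS is not stated here, so treat that mapping, the function names, and the toy criterion as assumptions.

```python
import math


def neighborhood_sizes(n_obs, lower, upper):
    """List the distinct local neighborhood sizes covered by [lower, upper]."""
    eps = 1e-9  # guards against floating-point error in lower * n_obs
    first = max(1, math.ceil(lower * n_obs - eps))
    last = min(n_obs, math.floor(upper * n_obs + eps))
    return list(range(first, last + 1))


def global_minimizer(n_obs, lower, upper, criterion):
    """Evaluate every model in the range and return the best smoothing parameter."""
    sizes = neighborhood_sizes(n_obs, lower, upper)
    # Sort key: criterion value first, then prefer the larger size on ties.
    best_q = min(sizes, key=lambda q: (criterion(q / n_obs), -q))
    return best_q / n_obs


sizes = neighborhood_sizes(100, 0.2, 0.4)
print(sizes[0], sizes[-1], len(sizes))

# Hypothetical criterion with its minimum at s = 0.33.
best = global_minimizer(100, 0.2, 0.4, lambda s: (s - 0.33) ** 2)
print(best)
```

For 100 observations and the range [0.2,0.4], the sketch enumerates the 21 neighborhood sizes from 20 through 40, matching the example above. An exhaustive search over them cannot miss the global minimum, at the cost of fitting every model in the range.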
Note that by default PROC LOESS displays the "Fit Summary" table and other optionally requested tables only for the selected model. You can request that these tables be displayed for all models in the selection process by adding the STEPS modifier in the SELECT= option. Also note that by default scoring requested with SCORE statements is done only for the selected model. However, if you specify STEPS in both the MODEL and SCORE statements, then all models evaluated in the selection process are scored.
In terms of computation, AICC and GCV depend on the smoothing matrix L only through its trace. In the direct method, this trace can be computed efficiently. In the interpolated method using kd trees, there is some additional computational cost, but the extra work is not significant compared to the rest of the computation. In contrast, the quantities $\delta_1$, $\delta_2$, and $\nu_1$, which appear in the AICC1 criterion, depend on the entire L matrix, and for this reason the time needed to compute them dominates the time required for the model fitting. Hence SELECT=AICC1 is much more computationally expensive than SELECT=AICC and SELECT=GCV, especially when combined with the GLOBAL modifier. Hurvich, Simonoff, and Tsai (1998) note that AICC can be regarded as an approximation of AICC1 and that "the AICC selector generally performs well in all circumstances."
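The difference in cost can be seen in a small Python sketch. It assumes the definitions used by Hurvich, Simonoff, and Tsai (1998): delta1 = Trace((I-L)'(I-L)), delta2 = Trace(((I-L)'(I-L))^2), and nu1 = Trace(L'L). It also assumes the forms AICC = log(sigma2) + 1 + 2(Trace(L)+1)/(n-Trace(L)-2) and GCV = n*sigma2/(n-Trace(L))^2, where the scaling of GCV is an assumption that does not change which smoothing parameter minimizes it. All function names are hypothetical, and the dense list-of-lists matrices are for illustration only.

```python
import math


def trace(m):
    return sum(m[i][i] for i in range(len(m)))


def transpose(m):
    return [list(row) for row in zip(*m)]


def matmul(a, b):
    bt = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]


def identity_minus(m):
    n = len(m)
    return [[(1.0 if i == j else 0.0) - m[i][j] for j in range(n)] for i in range(n)]


def aicc1_quantities(L):
    """delta1, delta2, nu1: each needs full matrix products, not just the diagonal."""
    R = identity_minus(L)
    B = matmul(transpose(R), R)          # (I - L)'(I - L)
    delta1 = trace(B)
    delta2 = trace(matmul(B, B))
    nu1 = trace(matmul(transpose(L), L))
    return delta1, delta2, nu1


def aicc(n, sigma2, trace_L):
    """Assumed AICC form; needs only the trace of L."""
    return math.log(sigma2) + 1 + 2 * (trace_L + 1) / (n - trace_L - 2)


def gcv(n, sigma2, trace_L):
    """Assumed GCV form (scaling is an assumption); needs only the trace of L."""
    return n * sigma2 / (n - trace_L) ** 2


# Toy smoothing matrix: every fitted value is the mean of all 4 observations.
L = [[0.25] * 4 for _ in range(4)]
delta1, delta2, nu1 = aicc1_quantities(L)
print(trace(L), delta1, delta2, nu1)
print(aicc(10, 1.0, 2.0), gcv(10, 1.0, 2.0))
```

For an n-by-n smoothing matrix, the trace is a sum of n diagonal entries, whereas the products needed for the AICC1 quantities take on the order of n^3 operations when formed naively, as in this sketch.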
For models with one dependent variable, PROC LOESS uses SELECT=AICC by default if you specify neither the SMOOTH= nor the SELECT= option in the MODEL statement. With two or more dependent variables, automatic smoothing parameter selection would need to be done separately for each dependent variable. For this reason, automatic smoothing parameter selection is not available for models with multiple dependent variables. In such cases, use a separate PROC LOESS step for each dependent variable if you want automatic smoothing parameter selection.
Copyright © 2000 by SAS Institute Inc., Cary, NC, USA. All rights reserved.