Selection of Smoothing Parameters
The smoothers discussed here have a single smoothing parameter, denoted $\lambda$. Cross validation can be used to choose it. Cross validation works by leaving the points $(x_i, y_i)$ out one at a time, estimating the squared residual for the smooth function at $x_i$ based on the remaining $n-1$ data points, and choosing the smoother that minimizes the sum of those squared residuals. This mimics the use of training and test samples for prediction. The cross validation function is defined as
$$\mathrm{CV}(\lambda) = \frac{1}{n} \sum_{i=1}^{n} \left( y_i - \hat{\eta}_{\lambda}^{-i}(x_i) \right)^2$$
where $\hat{\eta}_{\lambda}^{-i}(x_i)$ indicates the fit at $x_i$, computed by leaving out the $i$th data point. The quantity $n\,\mathrm{CV}(\lambda)$ is sometimes called the prediction sum of squares, or PRESS (Allen 1974).
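All of the smoothers fit by the GAM procedure can be formulated as a linear combination of the sample responses
$$\hat{\eta}(x) = A(\lambda)\, Y$$
for some matrix $A(\lambda)$, which depends on $\lambda$. (The matrix $A(\lambda)$ depends on $x$ and the sample data as well, but this dependence is suppressed in the preceding equation.) Let $a_{ii}$ be the $i$th diagonal element of $A(\lambda)$. Then the CV function can be expressed as
$$\mathrm{CV}(\lambda) = \frac{1}{n} \sum_{i=1}^{n} \left( \frac{y_i - \hat{\eta}_{\lambda}(x_i)}{1 - a_{ii}} \right)^2$$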
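The shortcut form of CV can be checked numerically. The following is a minimal sketch, assuming a Gaussian kernel (Nadaraya-Watson) smoother as a stand-in linear smoother; it is not one of the smoothers that PROC GAM fits, and the function names, the bandwidth (which plays the role of the smoothing parameter), and the data are all hypothetical. For this smoother the leave-one-out fit can be computed directly, so the brute-force and shortcut values can be compared.

```python
import math

def smoother_matrix(x, bandwidth):
    """Build A so that fitted values are A times y (rows of normalized kernel weights)."""
    A = []
    for xi in x:
        w = [math.exp(-0.5 * ((xi - xj) / bandwidth) ** 2) for xj in x]
        total = sum(w)
        A.append([wj / total for wj in w])
    return A

def cv_brute_force(x, y, bandwidth):
    """Recompute the fit n times, each time leaving out one observation."""
    n = len(x)
    total = 0.0
    for i in range(n):
        x_rest = x[:i] + x[i + 1:]
        y_rest = y[:i] + y[i + 1:]
        w = [math.exp(-0.5 * ((x[i] - xj) / bandwidth) ** 2) for xj in x_rest]
        pred = sum(wj * yj for wj, yj in zip(w, y_rest)) / sum(w)
        total += (y[i] - pred) ** 2
    return total / n

def cv_shortcut(x, y, bandwidth):
    """Use a single full-data fit and the diagonal elements a_ii of A."""
    n = len(x)
    A = smoother_matrix(x, bandwidth)
    fitted = [sum(a * yj for a, yj in zip(row, y)) for row in A]
    return sum(((y[i] - fitted[i]) / (1.0 - A[i][i])) ** 2 for i in range(n)) / n

# Hypothetical illustration data
x = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0]
y = [0.1, 0.6, 0.8, 1.1, 0.9, 0.5, 0.2, -0.3, -0.7]

print(cv_brute_force(x, y, 0.8))
print(cv_shortcut(x, y, 0.8))
```

The two printed values should agree up to floating-point rounding, because for this smoother the leave-one-out residual equals the full-data residual divided by $1 - a_{ii}$.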
In most cases, it is very time-consuming to compute each $a_{ii}$ individually. To solve this computational problem, Wahba (1990) has proposed the generalized cross validation function (GCV), which can be used to solve a wide variety of problems involving selection of a parameter to minimize the prediction risk.
The GCV function is defined as
$$\mathrm{GCV}(\lambda) = \frac{n \sum_{i=1}^{n} \left( y_i - \hat{\eta}_{\lambda}(x_i) \right)^2}{\left( n - \mathrm{tr}(A(\lambda)) \right)^2}$$
The GCV formula simply replaces each $a_{ii}$ in the CV expression with the average diagonal element, $\mathrm{tr}(A(\lambda))/n$. Therefore, it can be viewed as a weighted version of CV. In most cases of interest, GCV is closely related to CV but much easier to compute. Specify the METHOD=GCV option in the MODEL statement to use the GCV function to choose the smoothing parameters.
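To illustrate how a GCV criterion can drive the choice of a smoothing parameter, the following is a minimal sketch that evaluates GCV over a small grid and keeps the minimizer. It assumes a Gaussian kernel smoother as a stand-in for a linear smoother with matrix $A$; the grid search, the function names, and the data are hypothetical and do not describe how PROC GAM carries out the minimization internally.

```python
import math

def smoother_matrix(x, bandwidth):
    """Build A so that fitted values are A times y (rows of normalized kernel weights)."""
    A = []
    for xi in x:
        w = [math.exp(-0.5 * ((xi - xj) / bandwidth) ** 2) for xj in x]
        total = sum(w)
        A.append([wj / total for wj in w])
    return A

def gcv(x, y, bandwidth):
    """GCV = n * RSS / (n - tr(A))^2 for the smoother with the given bandwidth."""
    n = len(x)
    A = smoother_matrix(x, bandwidth)
    fitted = [sum(a * yj for a, yj in zip(row, y)) for row in A]
    rss = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    trace = sum(A[i][i] for i in range(n))
    return n * rss / (n - trace) ** 2

# Hypothetical illustration data and candidate bandwidths
x = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0]
y = [0.1, 0.6, 0.8, 1.1, 0.9, 0.5, 0.2, -0.3, -0.7]
candidates = [0.2, 0.4, 0.8, 1.6, 3.2]

scores = {h: gcv(x, y, h) for h in candidates}
best = min(scores, key=scores.get)  # bandwidth with the smallest GCV score
print(best, scores[best])
```

Only the trace of $A$ is needed here, not the individual diagonal elements, which is the computational advantage of GCV over CV.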
The estimated GAM model can be expressed as
$$\hat{\eta}(X) = \hat{s}_0 + \sum_{i=1}^{p} A_i(\lambda_i)\, Y$$
Because the weights are calculated from the previous iteration during the local scoring iteration, the $A_i$ matrices might depend on $Y$ for non-Gaussian data. However, for the final iteration, the $A_i$ matrix for the spline smoothers has the same role as the projection matrix in linear regression; therefore, nonparametric degrees of freedom (DF) for the $i$th spline smoother can be defined as
$$\mathrm{DF}(\text{smoother } i) = \mathrm{tr}\left( A_i(\lambda_i) \right)$$
For loess smoothers, $A_i$ is not symmetric and so is not a projection matrix. In this case PROC GAM uses
$$\mathrm{DF}(\text{smoother } i) = \mathrm{tr}\left( A_i(\lambda_i)' A_i(\lambda_i) \right)$$
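The arithmetic behind the two degrees-of-freedom definitions can be sketched as follows. This example assumes a Gaussian kernel smoother matrix purely as an illustrative (and generally non-symmetric) smoother matrix; it is neither the spline nor the loess matrix that PROC GAM builds, and all names and data are hypothetical. Both traces are computed on the same matrix only to show how each quantity is obtained.

```python
import math

def smoother_matrix(x, bandwidth):
    """Build A so that fitted values are A times y (rows of normalized kernel weights)."""
    A = []
    for xi in x:
        w = [math.exp(-0.5 * ((xi - xj) / bandwidth) ** 2) for xj in x]
        total = sum(w)
        A.append([wj / total for wj in w])
    return A

def df_trace(A):
    """tr(A): the sum of the diagonal elements."""
    return sum(A[i][i] for i in range(len(A)))

def df_trace_ata(A):
    """tr(A'A): equal to the sum of squares of every entry of A."""
    return sum(a * a for row in A for a in row)

# Hypothetical illustration data
x = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0]

for h in (0.3, 0.8, 2.0):
    A = smoother_matrix(x, h)
    print(h, round(df_trace(A), 3), round(df_trace_ata(A), 3))
```

For this stand-in smoother, both quantities shrink toward 1 as the bandwidth grows (a smoother fit) and approach the number of observations as the bandwidth shrinks (a fit that interpolates the data).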
The GAM procedure gives you the option of specifying the degrees of freedom for each individual smoothing component. If you choose a particular value for the degrees of freedom, then during every local scoring iteration the procedure will search for a corresponding smoothing parameter lambda that yields the specified value or comes as close as possible. The final estimate for the smoother during this local scoring iteration will be based on this lambda. Note that for univariate spline and loess components, an additional degree of freedom is used by default to account for the linear portion of the model, so the value displayed in the “Fit Summary” and “Analysis of Deviance” tables will be one less than the value you specify.
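The search for a smoothing parameter that matches a requested DF can be sketched as a one-dimensional root-finding problem. The following is a minimal sketch, assuming a Gaussian kernel smoother whose trace decreases monotonically as the bandwidth grows, and using bisection on a log scale; the source does not state which search algorithm PROC GAM uses, so the bisection, the search bounds, and all names and data are assumptions made for illustration.

```python
import math

def trace_df(x, bandwidth):
    """tr(A) for a Gaussian kernel smoother: the sum of the diagonal weights a_ii."""
    total = 0.0
    for xi in x:
        denom = sum(math.exp(-0.5 * ((xi - xj) / bandwidth) ** 2) for xj in x)
        total += 1.0 / denom  # the self-weight is exp(0) = 1 before normalization
    return total

def find_bandwidth(x, target_df, lo=1e-3, hi=1e3, tol=1e-8, max_iter=200):
    """Bisect on a log scale until tr(A) is within tol of target_df."""
    for _ in range(max_iter):
        mid = math.sqrt(lo * hi)  # geometric midpoint of the current bracket
        df = trace_df(x, mid)
        if abs(df - target_df) < tol:
            return mid
        if df > target_df:
            lo = mid  # fit is too flexible: move toward larger bandwidths
        else:
            hi = mid  # fit is too smooth: move toward smaller bandwidths
    return math.sqrt(lo * hi)  # closest value found within the iteration budget

# Hypothetical illustration data and requested degrees of freedom
x = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0]
h = find_bandwidth(x, target_df=4.0)
print(h, trace_df(x, h))
```

If the requested DF lies outside the range that the bracket can reach, the loop simply returns the closest value it found, which mirrors the "comes as close as possible" behavior described above.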