The GAM Procedure

Selection of Smoothing Parameters

CV and GCV

The smoothers discussed here each have a single smoothing parameter, which can be chosen by cross validation. Cross validation leaves out the points (x_i, y_i) one at a time, fits the smoother to the remaining n-1 data points, computes the squared residual of the omitted point at x_i, and chooses the smoothing parameter that minimizes the sum of those squared residuals. This mimics the use of training and test samples for prediction. The cross validation function is defined as
CV(\lambda)=\frac{1}{n}\sum_{i=1}^n \left(y_i - \hat{\eta}_\lambda^{-i}(x_i)\right)^2
where \hat{\eta}_\lambda^{-i}(x_i) denotes the fit at x_i, computed by leaving out the ith data point. The quantity nCV(\lambda) is sometimes called the prediction sum of squares or PRESS (Allen 1974).
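The leave-one-out definition above can be sketched directly in code. The example below is only an illustration under stated assumptions: it uses a Gaussian-kernel weighted average as a stand-in smoother (not one of the smoothers fit by the GAM procedure), with the bandwidth `lam` playing the role of \lambda. The function names and the toy data are hypothetical.

```python
import math

def kernel_fit(x0, xs, ys, lam):
    """Gaussian-kernel weighted average of ys at x0 (bandwidth lam)."""
    weights = [math.exp(-0.5 * ((x0 - x) / lam) ** 2) for x in xs]
    return sum(w * y for w, y in zip(weights, ys)) / sum(weights)

def cv_brute_force(xs, ys, lam):
    """CV(lambda): refit n times, each time leaving one point out."""
    n = len(xs)
    press = 0.0
    for i in range(n):
        xs_rest = xs[:i] + xs[i + 1:]
        ys_rest = ys[:i] + ys[i + 1:]
        # Squared residual of the omitted point against the fit from the rest
        press += (ys[i] - kernel_fit(xs[i], xs_rest, ys_rest, lam)) ** 2
    return press / n

# Toy data, made up for illustration
xs = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0]
ys = [0.1, 0.6, 0.8, 1.1, 0.9, 0.5, 0.2]

cv_value = cv_brute_force(xs, ys, lam=0.5)
press_value = len(xs) * cv_value  # PRESS = n * CV(lambda)
```

This version refits the smoother n times for every candidate value of the smoothing parameter, which is what makes the direct definition expensive.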

All of the smoothers fit by the GAM procedure can be formulated as a linear combination of the sample responses

\hat{\eta}(x)=A(\lambda)Y
for some matrix A(\lambda), which depends on \lambda. (The matrix A(\lambda) also depends on x and the sample data, but this dependence is suppressed in the preceding equation.) Let a_{ii} denote the diagonal elements of A(\lambda). Then the CV function can be expressed as

CV(\lambda)=\frac{1}{n}\sum_{i=1}^n \left(\frac{y_i - \hat{\eta}_\lambda (x_i)}{1-a_{ii}}\right)^2
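A small numerical check can make this identity concrete. The sketch below again assumes a Gaussian-kernel smoother as a stand-in (an assumption, not the GAM procedure's own smoothers); for that smoother, leaving a point out amounts to renormalizing the remaining weights, so the full-data residuals divided by 1 - a_{ii} reproduce the leave-one-out residuals up to rounding error. All names and data are hypothetical.

```python
import math

def smoother_matrix(xs, lam):
    """Rows of A(lambda) for a Gaussian-kernel smoother; each row sums to 1."""
    rows = []
    for xi in xs:
        w = [math.exp(-0.5 * ((xi - xj) / lam) ** 2) for xj in xs]
        s = sum(w)
        rows.append([wj / s for wj in w])
    return rows

def cv_shortcut(xs, ys, lam):
    """CV(lambda) from a single fit, using the diagonal elements a_ii."""
    A = smoother_matrix(xs, lam)
    n = len(xs)
    total = 0.0
    for i in range(n):
        fit_i = sum(a * y for a, y in zip(A[i], ys))
        total += ((ys[i] - fit_i) / (1.0 - A[i][i])) ** 2
    return total / n

def cv_leave_one_out(xs, ys, lam):
    """CV(lambda) by explicitly refitting without point i."""
    n = len(xs)
    total = 0.0
    for i in range(n):
        w = [math.exp(-0.5 * ((xs[i] - xs[j]) / lam) ** 2)
             for j in range(n) if j != i]
        y_rest = [ys[j] for j in range(n) if j != i]
        fit = sum(wj * yj for wj, yj in zip(w, y_rest)) / sum(w)
        total += (ys[i] - fit) ** 2
    return total / n

# Toy data, made up for illustration
xs = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0]
ys = [0.1, 0.6, 0.8, 1.1, 0.9, 0.5, 0.2]

shortcut = cv_shortcut(xs, ys, lam=0.5)
explicit = cv_leave_one_out(xs, ys, lam=0.5)
```

Here `shortcut` and `explicit` agree to within floating-point rounding, while the shortcut needs only one fit per value of the smoothing parameter.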

In most cases, computing the quantities a_{ii} is very time consuming. To address this computational problem, Wahba (1990) proposed the generalized cross validation (GCV) function, which can be used in a wide variety of problems that involve selecting a parameter to minimize the prediction risk.

The GCV function is defined as

GCV(\lambda)=\frac{n\sum_{i=1}^n \left(y_i - \hat{\eta}_\lambda(x_i)\right)^2}{\left(n-{tr}(A(\lambda))\right)^2}

The GCV formula simply replaces each a_{ii} in the CV formula with the average of the diagonal elements, {tr}(A(\lambda))/n. Therefore, it can be viewed as a weighted version of CV. In most cases of interest, GCV is closely related to CV but much easier to compute. The GAM procedure uses the GCV function as the criterion for choosing the smoothing parameters.
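As a rough illustration of how such a criterion might be used, the sketch below evaluates GCV on a small grid of candidate smoothing parameters and keeps the minimizer. It substitutes tr(A(\lambda))/n for each a_{ii} in the CV formula, which gives n times the residual sum of squares divided by (n - tr(A(\lambda)))^2. The Gaussian-kernel smoother, the candidate grid, and the toy data are all assumptions made for the example; this is not the GAM procedure's actual search algorithm.

```python
import math

def smoother_matrix(xs, lam):
    """Rows of A(lambda) for a Gaussian-kernel smoother; each row sums to 1."""
    rows = []
    for xi in xs:
        w = [math.exp(-0.5 * ((xi - xj) / lam) ** 2) for xj in xs]
        s = sum(w)
        rows.append([wj / s for wj in w])
    return rows

def gcv(xs, ys, lam):
    """GCV(lambda) = n * RSS / (n - tr(A))^2."""
    A = smoother_matrix(xs, lam)
    n = len(xs)
    rss = 0.0
    for i in range(n):
        fit_i = sum(a * y for a, y in zip(A[i], ys))
        rss += (ys[i] - fit_i) ** 2
    trace = sum(A[i][i] for i in range(n))
    return n * rss / (n - trace) ** 2

# Toy data and candidate grid, made up for illustration
xs = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0]
ys = [0.1, 0.6, 0.8, 1.1, 0.9, 0.5, 0.2]
candidates = [0.2, 0.3, 0.5, 0.8, 1.2, 2.0]

# Pick the candidate with the smallest GCV score
best_lam = min(candidates, key=lambda lam: gcv(xs, ys, lam))
```

Only the trace of A(\lambda) is needed here, not the individual diagonal elements, which is the source of the computational saving for smoothers whose trace is cheap to obtain.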

The matrix A(\lambda) plays the same role as the projection (hat) matrix in linear regression; therefore, the nonparametric degrees of freedom (DF) for the model can be defined as {tr}(A(\lambda)).
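The analogy with linear regression can be checked numerically. In the sketch below, the trace of the projection matrix for a straight-line fit equals the number of regression parameters (2), and the trace of a smoother matrix shrinks as the amount of smoothing grows. The Gaussian-kernel smoother is an assumed stand-in, and the function names and data are hypothetical.

```python
import math

def hat_diagonal_linear(xs):
    """Diagonal of the projection (hat) matrix for the model y = b0 + b1*x."""
    n = len(xs)
    mean = sum(xs) / n
    sxx = sum((x - mean) ** 2 for x in xs)
    return [1.0 / n + (x - mean) ** 2 / sxx for x in xs]

def kernel_df(xs, lam):
    """tr(A(lambda)) for a Gaussian-kernel smoother with bandwidth lam."""
    df = 0.0
    for xi in xs:
        w = [math.exp(-0.5 * ((xi - xj) / lam) ** 2) for xj in xs]
        df += 1.0 / sum(w)  # a_ii: the unnormalized self-weight is exp(0) = 1
    return df

# Toy design points, made up for illustration
xs = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0]

df_linear = sum(hat_diagonal_linear(xs))  # equals 2 up to rounding
df_rough = kernel_df(xs, lam=0.2)         # little smoothing, larger DF
df_smooth = kernel_df(xs, lam=2.0)        # heavy smoothing, smaller DF
```

For this smoother the trace always lies between 1 and n, moving toward 1 as the fit approaches a constant.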


Copyright © 2000 by SAS Institute Inc., Cary, NC, USA. All rights reserved.