The GAM Procedure

Confidence Intervals for Smoothers

Buja, Hastie, and Tibshirani (1989) showed that each smoothing function estimate from the backfitting algorithm is the result of a linear mapping applied to the working response, if the backfitting algorithm converges.

The smoothing function estimate can be expressed as

\[  \hat{s}_ j(\mb {x}_ j) = \mb {H}_ j\mb {z}  \]

where $\mb {x}_ j$ is the jth covariate and $\mb {z}$ is the adjusted dependent variable that is formed in the local scoring algorithm. If the errors are independent and identically distributed, then

\[  \mathrm{Cov}(\hat{s}_ j)=\sigma ^2\mb {H}_ j\mb {H}_ j’  \]

where $\sigma ^2=\mathrm{Var}(\mb {z})$.

However, direct computation of $\mb {H}_ j$ is formidable within the backfitting framework. Hastie and Tibshirani (1990) proposed using each individual smoothing matrix $\mb {A}_ j(\lambda _ j)$ as a substitute for the linear operator $\mb {H}_ j$ when computing confidence intervals. In the GAM procedure, curvewise confidence intervals for smoothing splines and pointwise confidence intervals for loess are provided in the output data set.

Curvewise Confidence Interval for Smoothing Spline Smoothers

Viewing the spline model as a Bayesian model, Wahba (1983) proposes Bayesian confidence intervals for smoothing spline estimates as:

\[  \hat{s}_\lambda (x_ i) \pm z_{\alpha /2} \sqrt {\hat{\mb {V}}_{ii}(\lambda )}  \]

where $\hat{\mb {V}}_{ii}(\lambda )$ is the ith diagonal element of the Bayesian posterior covariance matrix $\hat{\mb {V}}$ and $z_{\alpha /2}$ is the $1-\alpha /2$ quantile of the standard normal distribution. The confidence intervals are interpreted as intervals across the function as opposed to pointwise intervals.

Suppose that you fit a spline estimate to experimental data that consist of a true function f and a random error term $\epsilon _ i$. In repeated experiments, it is likely that about $100(1-\alpha )\% $ of the confidence intervals cover the corresponding true values, although some values are covered every time and other values are not covered by the confidence intervals most of the time. This effect is more pronounced when the true response curve or surface has small regions of particularly rapid change.

In the GAM procedure, let the smoothing matrix for the nonlinear part of the jth spline term be $\tilde{\mb {A}}_ j$ after the linear part is separated out from $\mb {A}_ j(\lambda )$. The Bayesian posterior variance for the nonlinear part is computed as

\[  \hat{\mb {V}}_ j=\hat{\phi }\tilde{\mb {A}}_ j\mb {W}^{-1}  \]

where $\hat{\phi }$ is the dispersion parameter estimate and $\mb {W}$ is the weight matrix from the final local scoring iteration. If you specify UCLM, LCLM, ADIAG, and STD options in the OUTPUT statement, the statistics are derived based on $\hat{\mb {V}}_ j$.

When you request both the ADDITIVE and CLM suboptions in the PLOTS=COMPONENTS option, each of the smoothing component plots displays a confidence band for the total contribution of each smoothing spline smoother. The confidence band is derived from the total variance that is contributed by both linear and nonlinear parts by the jth term

\[  \hat{\phi }\left(\mb {x}_ j’(\mb {X}’\mb {W}\mb {X})^{-1}\mb {x}_ j+ \tilde{\mb {A}}_ j\mb {W}^{-1}\right)  \]

Pointwise Confidence Interval for Loess Smoothers

As shown in Cleveland, Devlin, and Grosse (1988), the smoothing matrix $\mb {A}(\lambda )$ for a loess smoother is asymmetric. The confidence intervals are computed as follows:

\[  \hat{s}_\lambda (x_ i) \pm z_{\alpha /2} \sqrt {\hat{\mb {V}}_{ii}(\lambda )}  \]

where $\hat{\mb {V}}_{ii}(\lambda )$ is the ith diagonal element of the covariance matrix $\hat{\mb {V}}$ and $z_{\alpha /2}$ is the $1-\alpha /2$ quantile of the standard normal distribution.

In the GAM procedure, let the smoothing matrix for the nonlinear part of the jth loess term be $\tilde{\mb {A}}_ j$ after the linear part is separated out from $\mb {A}_ j(\lambda )$. The covariance matrix for the nonlinear part is then

\[  \hat{\mb {V}}_ j=\hat{\phi }\tilde{\mb {A}}_ j\mb {W}^{-1}\tilde{\mb {A}}_ j’  \]

where $\hat{\phi }$ is the dispersion parameter estimate and $\mb {W}$ is the weight matrix from the final local scoring iteration. If you specify UCLM, LCLM, and STD options in the OUTPUT statement, the statistics are derived based on $\hat{\mb {V}}_ j$.

When you request both the ADDITVE and CLM suboptions in the PLOTS=COMPONENTS option, each of the smoothing component plots displays confidence intervals for total prediction of each loess smoother. The confidence intervals are derived from the total variance that is contributed by both the linear and nonlinear parts by the jth term

\[  \hat{\phi }\left(\mb {x}_ j’(\mb {X}’\mb {W}\mb {X})^{-1}\mb {x}_ j+ \tilde{\mb {A}}_ j\mb {W}^{-1}\tilde{\mb {A}}_ j’\right)  \]