23805 - Computation of predicted values for the response and smoothing components in PROC GAM

In the "Getting Started" section of the GAM documentation, the following normal response model is fitted:

   proc gam data=diabetes;
      model logCP = spline(Age) spline(BaseDeficit);
      output out=estimates pred;
      run;

The OUTPUT statement is added here to create a data set containing the overall predicted values and the predicted values for each smoothing component. The partial prediction for a predictor is the portion of the predicted response that is attributable to that predictor. Note that the variables produced by the PRED option in the OUTPUT statement, P_Age and P_BaseDeficit, contain only the nonparametric portions of their partial predictions.

For SPLINE-smoothed predictors like these, there are also linear components, which appear in the "Regression Model Analysis, Parameter Estimates" table as Linear(Age) and Linear(BaseDeficit). Adding each linear component (the parameter estimate multiplied by the predictor value) to the corresponding nonparametric component gives the partial prediction for that predictor. The partial prediction for Age is computed as:

   PP_Age = 0.01437*Age + P_Age

Similarly, the partial prediction for BaseDeficit is computed as:

   PP_BaseDeficit = 0.00807*BaseDeficit + P_BaseDeficit

The linear predictor is the sum of the partial predictions and the intercept:

   P_logCP = 1.48141 + PP_Age + PP_BaseDeficit
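
The three formulas above can be checked with a DATA step. The following is a minimal sketch, assuming that the ESTIMATES data set created by the OUTPUT statement contains Age, BaseDeficit, P_Age, P_BaseDeficit, and P_logCP. The data set name PARTIAL and the variables LinPred and Diff are names chosen here for illustration only.

```sas
/* Sketch: rebuild the partial predictions and the linear predictor */
/* from the OUTPUT data set. PARTIAL, LinPred, and Diff are         */
/* illustrative names, not variables created by PROC GAM.           */
data partial;
   set estimates;
   /* linear component + nonparametric component */
   PP_Age         = 0.01437*Age + P_Age;
   PP_BaseDeficit = 0.00807*BaseDeficit + P_BaseDeficit;
   /* intercept + partial predictions */
   LinPred        = 1.48141 + PP_Age + PP_BaseDeficit;
   /* comparison with the overall prediction from PROC GAM */
   Diff           = LinPred - P_logCP;
run;

proc means data=partial min max;
   var Diff;
run;
```

Because the parameter estimates are entered as displayed (rounded to five decimal places), Diff is expected to be close to zero rather than exactly zero.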

If the model involves variables in a PARAM effect (whether as CLASS or as continuous predictors), then these are also added to the linear predictor.

The inverse link function must be applied to the linear predictor to obtain the predicted mean. Beginning in SAS 9.2, the LINP option in the OUTPUT statement provides the linear predictor values, and the inverse link is applied to these to compute the P_<response> variable which contains the predicted means. In this example, P_logCP is both the linear predictor and the predicted mean since the identity link is used for normal models.
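
As a sketch for SAS 9.2 and later, the LINP option can be requested alongside PRED in the same OUTPUT statement. The data set name EST_LINP is hypothetical, and because the names of the added variables are not listed in this note, PROC CONTENTS is used to display them rather than assuming them.

```sas
/* Sketch: request the linear predictor as well as the predictions. */
/* EST_LINP is an illustrative data set name.                       */
proc gam data=diabetes;
   model logCP = spline(Age) spline(BaseDeficit);
   output out=est_linp pred linp;
run;

/* List the variables in creation order to see what was added */
proc contents data=est_linp varnum;
run;
```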

Prior to SAS 9.2, the P_<response> variable contains the linear predictor, not the estimated mean, for nonnormal response models. To compute predicted means, the fitted values must be transformed using the appropriate inverse link function. For example, for a Poisson model, the estimated means are computed by exponentiating the predicted response values. For a logistic model, the binomial mean, p, is estimated using the inverse logit transformation, where P_Y is the predicted response variable:

   p = 1 / (1 + exp(-P_Y))

This estimated binomial mean can also be computed using the LOGISTIC function in the DATA step:

   p = logistic(P_Y);
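
The inverse link transformations described above can be tried in a small DATA step. This is a minimal sketch: the P_Y values are made-up linear predictor values entered in-line, not output from PROC GAM, and the names INVLINK, mu_poisson, p_formula, and p_function are illustrative.

```sas
/* Sketch: apply inverse link functions to linear predictor values. */
/* The P_Y values below are invented for illustration.              */
data invlink;
   input P_Y @@;
   mu_poisson = exp(P_Y);               /* inverse log link (Poisson)     */
   p_formula  = 1 / (1 + exp(-P_Y));    /* inverse logit, written out     */
   p_function = logistic(P_Y);          /* inverse logit, LOGISTIC function */
   datalines;
-2 -0.5 0 0.5 2
;
run;

proc print data=invlink;
run;
```

In practice, the INPUT and DATALINES statements would be replaced by a SET statement that reads the OUTPUT data set from PROC GAM. The p_formula and p_function columns should agree; for example, a linear predictor of 0 gives a binomial mean of 0.5 and a Poisson mean of 1.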

Operating System and Release Information

Product Family	Product	System	SAS Release (Reported)	SAS Release (Fixed*)
SAS System	SAS/STAT	All	n/a	

* For software releases that are not yet generally available, the Fixed Release is the software release in which the problem is planned to be fixed.

Type:	Usage Note
Priority:	low
Topic:	SAS Reference ==> Procedures ==> GAM
	Analytics ==> Nonparametric Analysis
	Analytics ==> Categorical Data Analysis
	Analytics ==> Regression

Date Modified:	2019-05-01 16:07:08
Date Created:	2004-03-02 11:15:22
