
The GAM Procedure

OUTPUT Statement
OUTPUT OUT=SAS-data-set <keyword <=prefix> ... keyword <=prefix>> ;

The OUTPUT statement creates a new SAS data set that contains diagnostic measures calculated after fitting the model.

All the variables in the original data set are included in the new data set, along with the variables created by specifying keywords in the OUTPUT statement. These new variables contain the values of a variety of statistics and diagnostic measures that are calculated for each observation in the data set. If no keywords are present, the OUT= data set contains only the original data set and predicted values. The predicted values include the linear predictor for the response and the prediction for each smoothing term in the model. When you specify a distribution family with the DIST= or LINK= option in the MODEL statement, predicted response values after applying the inverse link function are also included. Predicted values are computed for observations with missing response values whose values of the specified explanatory variables are nonmissing, and whose values of the specified smoothing variables are within the smoothing ranges of the fitted model.
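As a minimal sketch of the default behavior described above, the following step specifies OUT= with no keywords. The data set names mydata and gamout, the variables y and x, and the choice of DIST=POISSON are assumptions made for illustration only.

```sas
/* Hypothetical input data set mydata: count response y, smoothing variable x */
proc gam data=mydata;
   model y = spline(x) / dist=poisson;
   /* No keywords: the OUT= data set holds the original variables */
   /* plus the predicted values                                   */
   output out=gamout;
run;
```

Because a distribution family is specified here, the output data set would also be expected to include predicted response values after the inverse link function is applied.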

Details on the specifications in the OUTPUT statement are as follows.

OUT=SAS-data-set

specifies the name of the new data set to contain the diagnostic measures. This specification is required.

keyword <=prefix>

specifies the statistics to include in the output data set. The keywords and the statistics they represent are as follows:

PREDICTED

predicted values for each smoothing component and overall predicted values on the response scale at design points. The prediction for each spline or loess term is only for the nonlinear component of each smoother.

LINP

linear prediction values on the link scale at design points

UCLM

upper confidence limits for each predicted smoothing component

LCLM

lower confidence limits for each predicted smoothing component

ADIAG

diagonal element of the hat matrix associated with the observation for each smoothing spline component

RESIDUAL

residual standardized by its weights

STD

standard deviation of the prediction for each smoothing component

ALL

all statistics in this list

The names of the new variables that contain the statistics are formed by concatenating the user-supplied prefix and the corresponding variable names. If you do not specify a prefix, the names are formed by using the default prefixes listed in the following table:

Keyword    Prefix

PRED       P_
LINP       LINP_
UCLM       UCLM_
LCLM       LCLM_
ADIAG      ADIAG_
RESID      R_
STD        STD_ (for spline)
           STDP_ (for loess)

For example, suppose that you have a dependent variable y and an independent smoothing variable x, and you specify the keywords PRED=MyP_ and ADIAG=MyA_. In this case, in addition to the variables in the input data set, the output SAS data set will contain the variables MyP_y, MyP_x, and MyA_x. If the keywords PRED and ADIAG are specified without prefixes, the output SAS data set will contain the variables P_y, P_x, and ADIAG_x.
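The example above might be written as the following step. This is a sketch only: the data set names mydata and gamout are hypothetical, and a spline smoother is assumed for x because ADIAG is defined for smoothing spline components.

```sas
/* Hypothetical input data set mydata with response y and smoothing variable x */
proc gam data=mydata;
   model y = spline(x);
   /* User-supplied prefixes replace the defaults P_ and ADIAG_ */
   output out=gamout pred=MyP_ adiag=MyA_;
run;

/* List the variables in the output data set to confirm the generated names */
proc contents data=gamout;
run;
```

Omitting the =prefix parts (that is, specifying only pred adiag) would give the default names described above.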
