The GAM Procedure

MODEL Statement

MODEL dependent = < PARAM(effects) > < smoothing effects > < / options > ;

The MODEL statement specifies the dependent variable and the independent effects you want to use to model its values. Specify the independent parametric variables inside the parentheses of PARAM(). The parametric variables can be either CLASS variables or continuous variables. Any number of smoothing effects can be specified, as follows:

Smoothing Effect                              Meaning
spline(variable <, df=number>)                fits a smoothing spline with the variable and with DF=number
spline2(variable, variable <, df=number>)     fits a bivariate thin-plate spline with DF=number
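
As a brief illustration of the two smoothing effects, the following sketch requests a univariate spline with a user-specified DF and a bivariate thin-plate spline with no DF= option. The data set name mydata and the variables y, x1, x2, and x3 are hypothetical, and DF=3 is an arbitrary choice.

```sas
/* Hypothetical data set and variable names */
proc gam data=mydata;
   /* Univariate smoothing spline on x1 with DF=3,        */
   /* plus a bivariate thin-plate spline on (x2, x3)      */
   model y = spline(x1, df=3) spline2(x2, x3);
run;
```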


Both parametric effects and smoothing effects are optional, but at least one of them must be present.

If only parametric variables are present, PROC GAM fits a parametric linear model using the terms inside the parentheses of PARAM(). If only smoothing effects are present, PROC GAM fits a nonparametric additive model. If both types of effect are present, PROC GAM fits a semiparametric model using the parametric effects as the linear part of the model.

The following table shows how to specify various models for a dependent variable y and independent variables x, x1, and x2.

Table 4.1: Syntax for Common GAM Models
Type of Model        Syntax                               Mathematical Form
Parametric           model y = param(x);                  E(y) = \beta_0 + \beta_1 x
Nonparametric        model y = spline(x);                 E(y) = \beta_0 + s(x)
Semiparametric       model y = param(x1) spline(x2);      E(y) = \beta_0 + \beta_1 x_1 + s(x_2)
Additive             model y = spline(x1) spline(x2);     E(y) = \beta_0 + s_1(x_1) + s_2(x_2)
Thin-plate spline    model y = spline2(x1,x2);            E(y) = \beta_0 + s(x_1,x_2)
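
A semiparametric model of the kind listed in the table might appear in a complete step as follows. This is a minimal sketch: the data set study and the variables y, group, x1, and x2 are hypothetical, and it assumes that classification variables are declared in a CLASS statement before the MODEL statement.

```sas
/* Hypothetical data set and variable names */
proc gam data=study;
   class group;                              /* assumed: declares group as a CLASS variable */
   model y = param(group x1) spline(x2);     /* parametric (linear) part plus one smooth    */
run;
```

Here group and x1 enter the linear part of the model through PARAM(), while x2 enters through a smoothing spline.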


You can specify the following options in the MODEL statement.

ALPHA=number
specifies the significance level \alpha for the confidence limits on the final nonparametric component estimates when you request that confidence limits be included in the output data set. Specify number as a value between 0 and 1. The default value is 0.05. See the "OUTPUT Statement" section for more information.
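
The following sketch shows how ALPHA= might be combined with an output data set. The data set and variable names are hypothetical, and the OUTPUT statement keywords shown (PREDICTED, UCLM, LCLM) are assumptions that should be checked against the "OUTPUT Statement" section.

```sas
/* Hypothetical data set and variable names */
proc gam data=mydata;
   model y = spline(x1) spline(x2) / alpha=0.10;   /* 90% confidence limits instead of 95% */
   output out=est predicted uclm lclm;             /* keywords are assumed, not confirmed   */
run;
```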

DIST=distribution-id
specifies the distribution family used in the model. The distribution-id can be GAUSSIAN or LOGISTIC. Although alternative links are theoretically possible, the final fit of a nonparametric model is relatively insensitive to the precise choice of link function. Therefore, PROC GAM implements only the canonical link for each distribution family.
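
For reference, the canonical links for the two families can be written out as follows. This sketch uses a generic additive predictor with two smooth terms and assumes that, for the logistic family, the response is binary and \mu = E(y) is the event probability.

```latex
% Gaussian family: canonical link is the identity
g(\mu) = \mu, \qquad E(y) = \beta_0 + s_1(x_1) + s_2(x_2)

% Logistic family: canonical link is the logit
g(\mu) = \log\frac{\mu}{1-\mu}, \qquad
\log\frac{E(y)}{1 - E(y)} = \beta_0 + s_1(x_1) + s_2(x_2)
```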

EPSILON=number
specifies the convergence criterion for the back-fitting algorithm.

ITPRINT
produces an iteration table for the smoothing effects.

MAXITER=number
specifies the maximum number of iterations for the back-fitting algorithm.
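
The back-fitting controls can be combined in a single MODEL statement. In the sketch below, the data set and variable names are hypothetical and the numeric values are purely illustrative; they are not the procedure's defaults.

```sas
/* Hypothetical data set and variable names */
proc gam data=mydata;
   model y = spline(x1) spline(x2)
         / epsilon=1e-8     /* convergence criterion (illustrative value)         */
           maxiter=100      /* cap on back-fitting iterations (illustrative value) */
           itprint;         /* print the iteration table for the smoothing effects */
run;
```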

METHOD=GCV
specifies that the value of the smoothing parameter should be selected by generalized cross validation. If you specify both METHOD=GCV and the DF= option for the smoothing effects, the user-specified DF= is used, and the METHOD=GCV option is ignored. Refer to the "Selection of Smoothing Parameters" section for more details on the GCV method.
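
The interaction between METHOD=GCV and DF= can be sketched with two steps. The data set and variable names are hypothetical, and DF=4 is an arbitrary choice.

```sas
/* Hypothetical data set and variable names */

/* No DF= on either effect: smoothing parameters are selected by GCV */
proc gam data=mydata;
   model y = spline(x1) spline(x2) / method=gcv;
run;

/* DF= is specified, so DF=4 is used and METHOD=GCV is ignored */
proc gam data=mydata;
   model y = spline(x1, df=4) / method=gcv;
run;
```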


Copyright © 2000 by SAS Institute Inc., Cary, NC, USA. All rights reserved.