MODEL response <(response-options)> = <PARAM(effects)> <spline-effects> </ model-options>;
MODEL events / trials = <PARAM(effects)> <spline-effects> </ model-options>;
The MODEL statement specifies the response (dependent or target) variable and the predictor (independent or explanatory) effects of the model. You can specify the response in the form of a single variable or in the form of a ratio of two variables, which are denoted events/trials. The first form applies to all distribution families; the second form applies only to summarized binomial response data. When you have binomial data, the events variable contains the number of positive responses (or events) and the trials variable contains the number of trials. The values of both events and (trials – events) must be nonnegative, and the value of trials must be positive. If you specify a single response variable that is in a CLASS statement, then the response is assumed to be binary.
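The two response forms described above can be sketched as follows. This is a minimal illustration, not an example from the procedure's own documentation: the data sets `work.doses` and `work.subjects`, and all of their variable names, are hypothetical and are generated only so that the statements can be run.

```sas
/* Hypothetical data: six dose groups, each summarized as events out of trials. */
data work.doses;
   input dose events trials;
   datalines;
1 2 40
2 5 40
3 11 40
4 19 40
5 27 40
6 33 40
;
run;

/* Second form: the response is a ratio of two variables (summarized binomial data). */
proc gampl data=work.doses;
   model events/trials = param(dose);
run;

/* Hypothetical subject-level data with a two-level character response. */
data work.subjects;
   call streaminit(2024);
   length y $3;
   do i = 1 to 300;
      x1 = rand('uniform');
      x2 = rand('uniform');
      p  = logistic(-1 + 2*x1 + sin(6*x2));
      if rand('bernoulli', p) then y = 'yes';
      else y = 'no';
      output;
   end;
   drop i p;
run;

/* First form: a single response variable; listing y in CLASS makes it binary. */
proc gampl data=work.subjects;
   class y;
   model y = param(x1) spline(x2);
run;
```

In the first call, every row satisfies the constraints stated above: events and (trials – events) are nonnegative, and trials is positive.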
You can specify parametric effects that are constructed from variables in the input data set and include the effects in the parentheses of a PARAM( ) option, which can appear multiple times. For information about constructing the model effects, see the section Specification and Parameterization of Model Effects.
You can specify spline-effects by including independent variables inside the parentheses of the SPLINE( ) option. Only continuous variables (not classification variables) can be specified in spline-effects. Each spline-effect must contain at least one variable and can optionally include spline-options. You can specify any number of spline-effects. The following table shows some examples.
Table 7.3
Spline Effect Specification | Meaning |
---|---|
 | Constructs the univariate spline with |
 | Constructs the univariate spline by using |
 | Constructs the bivariate spline by using |
 | Constructs the trivariate spline by using |
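As a hedged illustration of how spline-effects are written, the following sketch fits one univariate and one bivariate spline term. The data set `work.surface` and the variables `x1`, `x2`, `x3`, and `y` are hypothetical and simulated only for this sketch; no spline-options are shown because their names are not given in this section.

```sas
/* Hypothetical continuous data, simulated so that the step is self-contained. */
data work.surface;
   call streaminit(7);
   do i = 1 to 500;
      x1 = rand('uniform');
      x2 = rand('uniform');
      x3 = rand('uniform');
      y  = sin(6*x1) + x2*x3 + rand('normal', 0, 0.2);
      output;
   end;
   drop i;
run;

proc gampl data=work.surface;
   /* spline(x1)    : univariate spline in x1                   */
   /* spline(x2 x3) : bivariate spline in x2 and x3, one effect */
   model y = spline(x1) spline(x2 x3);
run;
```

Listing two variables inside one SPLINE( ) requests a single joint smooth of both variables, whereas two separate SPLINE( ) terms request two additive univariate smooths.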
Both parametric effects and spline effects are optional. If none are specified, a model that contains only an intercept is fitted. If only parametric effects are present, PROC GAMPL fits a parametric generalized linear model by using the terms inside the parentheses of all PARAM( ) terms. If only spline effects are present, PROC GAMPL fits a nonparametric additive model. If both types of effects are present, PROC GAMPL fits a semiparametric model by using the parametric effects as the linear part of the model.
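The cases above can be summarized with the usual additive-model form. The notation here is an assumption introduced for illustration (it is not defined in this section): $g$ is the link function, $\mu_i$ the mean response, $\mathbf{x}_i$ the design row built from the PARAM( ) effects, and $f_j$ the smooth function for the $j$th spline effect with argument $\mathbf{z}_{ij}$.

```latex
g(\mu_i) = \beta_0
         + \underbrace{\mathbf{x}_i^{\top}\boldsymbol{\beta}}_{\text{PARAM( ) effects}}
         + \underbrace{\sum_{j=1}^{p} f_j(\mathbf{z}_{ij})}_{\text{SPLINE( ) effects}}
```

Dropping both underbraced terms leaves the intercept-only model, dropping only the sum leaves the parametric generalized linear model, dropping only the linear term leaves the nonparametric additive model, and keeping both gives the semiparametric model.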
The MODEL statement has three sets of options. The response-options determine how the GAMPL procedure models probabilities for binary data. The spline-options control how each spline term forms basis expansions. The model-options control other aspects of model formation and inference. Table 7.4 summarizes these options.
Table 7.4: MODEL Statement Options
Option | Description |
---|---|
Response Variable Options for Binary Models | |
 | Reverses the response categories |
 | Specifies the event category |
 | Specifies the sort order |
 | Specifies the reference category |
Smoothing Options for Spline Effects | |
 | Requests detailed spline information |
 | Specifies the fixed degrees of freedom |
 | Specifies the starting value for the smoothing parameter |
 | Specifies the knots to be used for constructing the spline |
 | Specifies polynomial orders for constructing the spline |
 | Specifies the maximum degrees of freedom |
 | Specifies the maximum number of knots to be used for constructing the spline |
 | Specifies the upper bound for the smoothing parameter |
 | Specifies the lower bound for the smoothing parameter |
 | Specifies a fixed smoothing parameter |
Model Options | |
 | Requests all nonmissing values of spline variables for constructing spline basis functions regardless of other model variables |
 | Specifies the model evaluation criterion |
 | Specifies the fixed dispersion parameter |
 | Specifies the response distribution |
 | Requests a finite-difference Hessian for smoothing parameter selection |
 | Specifies the starting value of the dispersion parameter |
 | Specifies the link function |
 | Requests normalized spline basis functions for model fitting |
 | Specifies the upper bound for searching the dispersion parameter |
 | Specifies the algorithm for selecting smoothing parameters |
 | Specifies the lower bound for searching the dispersion parameter |
 | Specifies the offset variable |
 | Specifies the ridge parameter |
 | Specifies the method for estimating the dispersion parameter |
Response variable options determine how the GAMPL procedure models probabilities for binary data.
You can specify the following response-options by enclosing them in parentheses after the response variable.
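The placement of a response-option can be sketched as follows. The option names used here, `EVENT=` for the event category and `DIST=` for the response distribution, are recalled from SAS documentation rather than taken from this section, so treat them as assumptions and confirm them against the option descriptions. The data set `work.binsim` and its variables are hypothetical.

```sas
/* Hypothetical 0/1 response, simulated so that the step is self-contained. */
data work.binsim;
   call streaminit(11);
   do i = 1 to 400;
      x1 = rand('uniform');
      x2 = rand('uniform');
      y  = rand('bernoulli', logistic(-0.5 + x1 + sin(6*x2)));
      output;
   end;
   drop i;
run;

proc gampl data=work.binsim;
   /* The response-option sits in parentheses directly after the response. */
   /* EVENT='1' (assumed name) asks for the probability of y=1 to be modeled. */
   model y(event='1') = param(x1) spline(x2) / dist=binary;
run;
```

Without a response-option, which of the two categories is modeled as the event depends on the procedure's default ordering of the response values, so stating the event category explicitly avoids ambiguity.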