The FMM Procedure

MODEL Statement

  • MODEL response <(response-options)> = <effects> </ model-options>;

  • MODEL events/trials = <effects> </ model-options>;

  • MODEL + <effects> </ model-options>;

The MODEL statement defines elements of the mixture model, such as the model effects, the distribution, and the link function. At least one MODEL statement is required. You can specify more than one MODEL statement. Each MODEL statement identifies one or more components of a mixture. For example, if components differ in their distributions, link functions, or regressor variables, then you can use separate MODEL statements to define the components. If the finite mixture model is homogeneous—meaning that all components share the same regressors, distribution, and link function—then you can specify the mixture model by using a single MODEL statement and specifying the K= option.
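As a minimal sketch of the homogeneous case, the following statements fit a two-component mixture by using a single MODEL statement and the K= option. The data set work.mix, its variables, and the simulation settings are hypothetical and serve only to make the step runnable.

```sas
/* Hypothetical data: two normal components with different intercepts and slopes */
data work.mix;
   call streaminit(12345);
   do i = 1 to 200;
      x = rand('uniform');
      if rand('bernoulli', 0.4) then y = 1 + 2*x + rand('normal');
      else                           y = 5 -   x + rand('normal');
      output;
   end;
   drop i;
run;

/* One MODEL statement; K=2 requests two components that share */
/* the same regressor, distribution, and link function         */
proc fmm data=work.mix;
   model y = x / k=2;
run;
```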

The FMM procedure includes an intercept in each model by default. You can remove it by using the NOINT option.

You can specify the dependent variable by using either the response syntax or the events/trials syntax. The events/trials syntax is specific to models for binomial-type data. A binomial($n$, $\pi$) variable is the sum of $n$ independent Bernoulli trials with event probability $\pi$. Each Bernoulli trial results in either an event or a nonevent (with probability $1-\pi$). The value of the second variable, trials, gives the number $n$ of Bernoulli trials. The value of the first variable, events, is the number of events out of $n$. The values of both events and (trials $-$ events) must be nonnegative, and the value of trials must be positive. Other distributions that allow the events/trials syntax are the beta-binomial distribution and the binomial cluster model.
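For reference, the probabilities implied by this description for a single binomial component can be written as follows. The symbol $y$ is introduced here only for illustration and denotes an observed value of events.

```latex
\Pr(\mathrm{events} = y) \;=\; \binom{n}{y}\,\pi^{y}\,(1-\pi)^{\,n-y},
\qquad y = 0, 1, \ldots, n
```

The restriction $0 \le y \le n$ corresponds to the requirement that both events and the number of nonevents be nonnegative.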

If you use the events/trials syntax, the FMM procedure defaults to the binomial distribution. If you use the response syntax, the procedure defaults to the normal distribution unless the response variable is a character variable or is listed in the CLASS statement.
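The following sketch contrasts the two defaults. The data set work.trials and the variables events, trials, x, and z are hypothetical names chosen for this illustration.

```sas
/* Hypothetical data: binomial counts out of 20 trials plus a continuous response */
data work.trials;
   call streaminit(2024);
   do i = 1 to 100;
      trials = 20;
      x      = rand('uniform');
      events = rand('binomial', 0.3, trials);
      z      = rand('normal', 10, 2);
      output;
   end;
   drop i;
run;

/* Events/trials syntax: no DIST= option, so the binomial distribution is used */
proc fmm data=work.trials;
   model events/trials = x;
run;

/* Response syntax with a numeric variable that is not in the CLASS statement: */
/* no DIST= option, so the normal distribution is used                         */
proc fmm data=work.trials;
   model z = x;
run;
```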

You use a similar syntax to fit multinomial models, Dirichlet-multinomial models, and multinomial cluster models, except that for these distributions you specify multiple dependent variables, one for each value of the multinomial response. For these models, you can specify either multiple response variables or multiple event variables. If you use multiple response or multiple event variables, the FMM procedure defaults to the multinomial distribution. If you use the multiple response syntax, PROC FMM treats the total of these responses as fixed.

The FMM procedure supports a continuation-style syntax in MODEL statements. Because a mixture has only one set of response variables, it is sufficient to specify the response variable in one MODEL statement. Other MODEL statements can use the continuation symbol "+" before the specification of effects. For example, the following statements fit a three-component binomial mixture model:

class A;
model y/n = x / k=2;
model     + A;

The first MODEL statement uses the "=" sign to separate response information from effect information and specifies the response variable by using the events/trials syntax. This determines that the distribution is binomial. This MODEL statement adds to the mixture model two components that have different intercepts and regression slopes. The second MODEL statement adds to the mixture model another component in which the mean is a function of the classification main effect for variable A. The response is also binomial; it is a continuation from the previous MODEL statement.
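The continuation syntax can also be combined with the DIST= option when components differ in their distributions. The following sketch, which assumes a hypothetical data set work.counts with an excess of zero counts, pairs a Poisson component with a constant component. It is one possible illustration, not a recommended analysis.

```sas
/* Hypothetical data: Poisson counts with additional zeros mixed in */
data work.counts;
   call streaminit(7);
   do i = 1 to 300;
      x = rand('uniform');
      if rand('bernoulli', 0.3) then count = 0;
      else count = rand('poisson', exp(0.5 + x));
      output;
   end;
   drop i;
run;

proc fmm data=work.counts;
   model count = x / dist=poisson;   /* first component: Poisson regression on x  */
   model       +   / dist=constant;  /* second component: continuation, no effects */
run;
```

Only the first MODEL statement names the response variable; the second statement inherits it through the "+" symbol.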

There are two sets of options in the MODEL statement. The response-options determine how the FMM procedure models probabilities for binary data. The model-options control other aspects of model formation and inference. Table 39.4 summarizes the response-options and model-options available in the MODEL statement. These are subsequently discussed in detail in alphabetical order by option category.

Table 39.4: Summary of MODEL Statement Options

Option         Description

Response Variable Options
DESCENDING     Reverses the order of response categories
EVENT=         Specifies the event category in binary models
ORDER=         Specifies the sort order for the response variable
REFERENCE=     Specifies the reference category in binary models

Model Building
DIST=          Specifies the response distribution
LINK=          Specifies the link function
K=             Specifies the number of mixture components
KMAX=          Specifies the maximum number of mixture components
KMIN=          Specifies the minimum number of mixture components
KRESTART       Requests that the starting values for each analysis be determined separately instead of sequentially
NOINT          Excludes fixed-effect intercept from model
OFFSET=        Specifies the offset variable for linear predictor

Statistical Computations and Output
ALPHA=$\alpha$ Determines the confidence level ($1-\alpha$)
CL             Displays confidence limits for fixed-effects parameter estimates
EQUATE=        Imposes simple equality constraints on parameters in this model
LABEL=         Identifies the model
PARMS          Provides starting values for the parameters in this model
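As a brief sketch of how several of these model-options can be combined after the slash, the following statements use K=, CL, ALPHA=, KMIN=, and KMAX=. The data set work.opts and its variables are hypothetical, and the option values are arbitrary choices for illustration.

```sas
/* Hypothetical data: a two-component normal mixture with one regressor */
data work.opts;
   call streaminit(99);
   do i = 1 to 200;
      x = rand('uniform');
      if rand('bernoulli', 0.5) then y = 0 + x + rand('normal');
      else                           y = 6 + x + rand('normal');
      output;
   end;
   drop i;
run;

/* Two components, with 90% confidence limits (ALPHA=0.1) for the estimates */
proc fmm data=work.opts;
   model y = x / k=2 cl alpha=0.1;
run;

/* Intercept-only components; KMIN= and KMAX= bound the number of components */
proc fmm data=work.opts;
   model y = / kmin=1 kmax=3;
run;
```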