The FMM procedure calculates the log likelihood that corresponds to a particular response distribution according to the following formulas. The response distribution is the distribution specified (or chosen by default) through the DIST= option in the MODEL statement. The parameterizations used for log-likelihood functions of these distributions were chosen to facilitate expressions in terms of mean parameters that are modeled through an (inverse) link function and in terms of scale parameters. These are not necessarily the parameterizations in which parameters of prior distributions are specified in a Bayesian analysis of homogeneous mixtures. See the section Prior Distributions for details about the parameterizations of prior distributions.
The FMM procedure includes all constant terms in the computation of densities or mass functions. In the expressions that follow, $l$ denotes the log-likelihood function, $\phi$ denotes a general scale parameter, $\mu$ is the “mean”, and $w$ is a weight from the use of a WEIGHT statement.
For some distributions (for example, the Weibull distribution), $\mu$ is not the mean of the distribution. The parameter $\mu$ is the quantity that is modeled as $g^{-1}(\mathbf{x}'\boldsymbol{\beta})$, where $g^{-1}(\cdot)$ is the inverse link function and the $\mathbf{x}$ vector is constructed based on the effects in the MODEL statement. Situations in which the parameter $\mu$ does not represent the mean of the distribution are explicitly mentioned in the list that follows.
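As a minimal sketch of how a modeled parameter is obtained from a linear predictor through an inverse link, the following Python fragment evaluates $g^{-1}(\mathbf{x}'\boldsymbol{\beta})$ for a log link and a logit link. The function names and the numeric values of `x` and `beta` are hypothetical and chosen only for illustration; this is not PROC FMM code.

```python
import math

def linear_predictor(x, beta):
    """Return x'beta for one observation (illustrative helper)."""
    return sum(xi * bi for xi, bi in zip(x, beta))

def inverse_log_link(eta):
    """Inverse of the log link: mu = exp(eta)."""
    return math.exp(eta)

def inverse_logit_link(eta):
    """Inverse of the logit link: mu = 1 / (1 + exp(-eta))."""
    return 1.0 / (1.0 + math.exp(-eta))

# Hypothetical design row (intercept plus one covariate) and coefficients.
x = [1.0, 2.0]
beta = [0.5, -0.25]

eta = linear_predictor(x, beta)
mu_log = inverse_log_link(eta)      # modeled parameter under a log link
mu_logit = inverse_logit_link(eta)  # modeled parameter under a logit link
```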
The parameter $\phi$ is frequently labeled as a “Scale” parameter in output from the FMM procedure. It is not necessarily the scale parameter of the particular distribution.
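\[ f(y) = \frac{\Gamma(\phi)}{\Gamma(\mu\phi)\,\Gamma((1-\mu)\phi)}\; y^{\mu\phi-1}\,(1-y)^{(1-\mu)\phi-1} \]
\[ l(\mu,\phi;y,w) = \log\left\{\frac{\Gamma(\phi/w)}{\Gamma(\mu\phi/w)\,\Gamma((1-\mu)\phi/w)}\right\} + \left(\frac{\mu\phi}{w}-1\right)\log\{y\} + \left(\frac{(1-\mu)\phi}{w}-1\right)\log\{1-y\} \]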
This parameterization of the beta distribution is due to Ferrari and Cribari-Neto (2004) and has properties $\mathrm{E}[Y]=\mu$, $\mathrm{Var}[Y]=\mu(1-\mu)/(1+\phi)$.
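The following Python sketch evaluates an unweighted beta log density in the mean/precision form of Ferrari and Cribari-Neto (2004), assuming shape parameters $\mu\phi$ and $(1-\mu)\phi$. The function names are hypothetical, and the handling of WEIGHT statement weights is deliberately left out.

```python
import math

def beta_loglik(y, mu, phi):
    """Unweighted log density of a beta variable with mean mu and
    precision phi, that is, shapes a = mu*phi and b = (1 - mu)*phi."""
    a = mu * phi
    b = (1.0 - mu) * phi
    return (math.lgamma(phi) - math.lgamma(a) - math.lgamma(b)
            + (a - 1.0) * math.log(y)
            + (b - 1.0) * math.log(1.0 - y))

def beta_variance(mu, phi):
    """Variance implied by the mean/precision form."""
    return mu * (1.0 - mu) / (1.0 + phi)
```

With `mu = 0.5` and `phi = 2` both shape parameters equal 1, so the density reduces to the uniform density on (0, 1): the log density is 0 everywhere and the variance is 1/12.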
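\[ f(y) = \binom{n}{y}\,\frac{\Gamma(\phi)}{\Gamma(\mu\phi)\,\Gamma((1-\mu)\phi)}\;\frac{\Gamma(y+\mu\phi)\,\Gamma(n-y+(1-\mu)\phi)}{\Gamma(n+\phi)} \]
\[ l(\mu,\phi;y) = \log\{\Gamma(n+1)\} - \log\{\Gamma(y+1)\} - \log\{\Gamma(n-y+1)\} + \log\{\Gamma(\phi)\} - \log\{\Gamma(n+\phi)\} + \log\{\Gamma(y+\mu\phi)\} + \log\{\Gamma(n-y+(1-\mu)\phi)\} - \log\{\Gamma(\mu\phi)\} - \log\{\Gamma((1-\mu)\phi)\} \]
\[ l(\mu,\phi;y,w) = w\,l(\mu,\phi;y) \]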
where $y$ and $n$ are the events and trials in the events/trials syntax and $0 < \mu < 1$. This parameterization of the beta-binomial model presents the distribution as a special case of the Dirichlet-multinomial distribution; see, for example, Neerchal and Morel (1998). In this parameterization, $\mathrm{E}[Y]=n\mu$ and $\mathrm{Var}[Y]=n\mu(1-\mu)\{1+(n-1)/(\phi+1)\}$. The FMM procedure models the parameter $\phi$ and labels it “Scale” on the procedure output. For other parameterizations of the beta-binomial model, see Griffiths (1973) or Williams (1975).
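A short numerical check can make the moment expressions concrete. The sketch below assumes the beta-binomial mass function with beta shape parameters $\mu\phi$ and $(1-\mu)\phi$ (an assumption about the parameterization, as are all function names) and computes the total probability, mean, and variance by direct summation.

```python
import math

def betabin_logpmf(y, n, mu, phi):
    """Log mass function of a beta-binomial variable, assuming beta
    shape parameters a = mu*phi and b = (1 - mu)*phi."""
    a = mu * phi
    b = (1.0 - mu) * phi
    return (math.lgamma(n + 1) - math.lgamma(y + 1) - math.lgamma(n - y + 1)
            + math.lgamma(phi) - math.lgamma(n + phi)
            + math.lgamma(y + a) + math.lgamma(n - y + b)
            - math.lgamma(a) - math.lgamma(b))

def betabin_moments(n, mu, phi):
    """Total probability, mean, and variance by summing over y = 0..n."""
    probs = [math.exp(betabin_logpmf(y, n, mu, phi)) for y in range(n + 1)]
    total = sum(probs)
    mean = sum(y * p for y, p in enumerate(probs))
    var = sum((y - mean) ** 2 * p for y, p in enumerate(probs))
    return total, mean, var

total, mean, var = betabin_moments(n=10, mu=0.3, phi=4.0)
# Closed form under the same assumption: n*mu*(1-mu)*(1 + (n-1)/(phi+1)).
closed_form_var = 10 * 0.3 * 0.7 * (1.0 + 9.0 / 5.0)
```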
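\[ l(\mu;y) = y\log\{\mu\} + (n-y)\log\{1-\mu\} + \log\{\Gamma(n+1)\} - \log\{\Gamma(y+1)\} - \log\{\Gamma(n-y+1)\} \]
\[ l(\mu;y,w) = w\,l(\mu;y) \]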
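where $y$ and $n$ are the events and trials in the events/trials syntax and $0 < \mu < 1$. In this parameterization of the binomial distribution, $\mathrm{E}[Y]=n\mu$ and $\mathrm{Var}[Y]=n\mu(1-\mu)$.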
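Because all constant terms are included in the log likelihood, the binomial coefficient is part of the value. The sketch below illustrates this for a binomial response in events/trials form; the function name is hypothetical, and multiplying by the weight is an assumption about how a WEIGHT variable enters.

```python
import math

def binomial_loglik(y, n, mu, w=1.0):
    """Binomial log likelihood for y events in n trials, including the
    log binomial coefficient. The weight w multiplies the whole
    contribution (an assumption made for this sketch)."""
    log_coef = (math.lgamma(n + 1) - math.lgamma(y + 1)
                - math.lgamma(n - y + 1))
    core = y * math.log(mu) + (n - y) * math.log(1.0 - mu)
    return w * (core + log_coef)
```

For one event in two trials with `mu = 0.5`, the probability is 2 × 0.25 = 0.5, so the log likelihood is log(0.5); dropping the constant term would give log(0.25) instead.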
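\[ z = \log\{\Gamma(n+1)\} - \log\{\Gamma(y+1)\} - \log\{\Gamma(n-y+1)\} \]
\[ \mu^{*} = (1-\mu)\pi \]
\[ l(\mu,\pi;y) = z + \log\left\{ \pi\,(\mu^{*}+\mu)^{y}(1-\mu^{*}-\mu)^{n-y} + (1-\pi)\,(\mu^{*})^{y}(1-\mu^{*})^{n-y} \right\} \]
\[ l(\mu,\pi;y,w) = w\,l(\mu,\pi;y) \]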
In this parameterization, $\mathrm{E}[Y]=n\pi$ and $\mathrm{Var}[Y]=n\pi(1-\pi)\{1+\mu^2(n-1)\}$. The binomial cluster model is a two-component mixture of a binomial$(n,\mu^{*}+\mu)$ and a binomial$(n,\mu^{*})$ random variable, where $\mu^{*}=(1-\mu)\pi$. This mixture is unusual in that it fixes the number of components and in that the mixing probability $\pi$ appears in the moments of the mixture components. For further details, see Morel and Nagaraj (1993), Morel and Neerchal (1997), Neerchal and Morel (1998), and Example 37.1 in this chapter. The expressions for the mean and variance in the binomial cluster model are identical to those of the beta-binomial model shown previously, with $\pi$ in the role of the beta-binomial parameter $\mu$ and $\mu^2$ in the role of $1/(\phi+1)$.
The FMM procedure models the parameter $\mu$ through the MODEL statement and the parameter $\pi$ through the PROBMODEL statement.
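The two-component structure of the binomial cluster model can be checked numerically. The sketch below assumes the Morel and Nagaraj (1993) form, in which the components are binomial with success probabilities $(1-\mu)\pi+\mu$ and $(1-\mu)\pi$ and the mixing probability is $\pi$; the function names and parameter values are hypothetical.

```python
import math

def binom_pmf(y, n, p):
    """Binomial mass function."""
    return math.comb(n, y) * p ** y * (1.0 - p) ** (n - y)

def binomial_cluster_pmf(y, n, mu, pi):
    """Mixture mass function under the assumed form: with probability pi
    the success probability is mu_star + mu, otherwise mu_star,
    where mu_star = (1 - mu) * pi."""
    mu_star = (1.0 - mu) * pi
    return (pi * binom_pmf(y, n, mu_star + mu)
            + (1.0 - pi) * binom_pmf(y, n, mu_star))

def binomial_cluster_moments(n, mu, pi):
    """Total probability, mean, and variance by summing over y = 0..n."""
    probs = [binomial_cluster_pmf(y, n, mu, pi) for y in range(n + 1)]
    total = sum(probs)
    mean = sum(y * p for y, p in enumerate(probs))
    var = sum((y - mean) ** 2 * p for y, p in enumerate(probs))
    return total, mean, var

total, mean, var = binomial_cluster_moments(n=8, mu=0.4, pi=0.3)
# Closed forms under the same assumption.
closed_mean = 8 * 0.3
closed_var = 8 * 0.3 * 0.7 * (1.0 + 0.4 ** 2 * 7)
```

The summed moments agree with $n\pi$ and $n\pi(1-\pi)\{1+\mu^2(n-1)\}$, which shows how the mixing probability enters the component success probabilities and yet leaves the overall mean equal to $n\pi$.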
The extreme value when $y \neq c$ is chosen so that $\exp\{l\}$ yields a likelihood of zero. You can change this value with the INVALIDLOGL= option in the PROC FMM statement. The constant distribution is useful for modeling overdispersion due to zero-inflation (or inflation of the process at support $c$).
In this parameterization of the exponential distribution, $\mathrm{E}[Y]=\mu$ and $\mathrm{Var}[Y]=\mu^2$.
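If $X$ has a normal distribution with mean $\mu$ and variance $\phi$, then $Y=|X|$ has a folded normal distribution and log-likelihood function
\[ l(\mu,\phi;y,w) = -\frac{1}{2}\log\{2\pi\} - \frac{1}{2}\log\{\phi/w\} + \log\left\{ \exp\left\{-\frac{w(y-\mu)^2}{2\phi}\right\} + \exp\left\{-\frac{w(y+\mu)^2}{2\phi}\right\} \right\} \]
for $y \geq 0$. The folded normal distribution arises, for example, when normally distributed measurements are observed, but their signs are not observed. The mean and variance of the folded normal in terms of the underlying distribution are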
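\[ \mathrm{E}[Y] = \sqrt{\frac{2\phi}{\pi}}\,\exp\left\{-\frac{\mu^2}{2\phi}\right\} + \mu\left\{1-2\Phi\left(-\mu/\sqrt{\phi}\right)\right\} \]
\[ \mathrm{Var}[Y] = \phi + \mu^2 - \mathrm{E}[Y]^2 \]
where $\Phi(\cdot)$ is the cumulative distribution function of the standard normal distribution.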
The FMM procedure models the folded normal distribution through the mean and variance of the underlying normal distribution. When the FMM procedure computes output statistics for the response variable (for example, when you use the OUTPUT statement), the mean and variance of the response $Y$ are reported. Similarly, the fit statistics apply to the distribution of $Y=|X|$, not the distribution of $X$. When you model a folded normal variable, the response input variable should be positive; the FMM procedure treats negative values of $Y$ as a support violation.
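The distinction between the parameters of the underlying normal variable and the moments of the folded response can be illustrated numerically. The sketch below uses the standard closed-form moments of $|X|$ for $X \sim N(\mu,\phi)$; the function names are hypothetical and the code is not taken from PROC FMM.

```python
import math

def std_normal_cdf(z):
    """Standard normal cumulative distribution function via erf."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def folded_normal_moments(mu, phi):
    """Mean and variance of Y = |X| when X is normal with mean mu
    and variance phi."""
    sd = math.sqrt(phi)
    mean = (math.sqrt(2.0 * phi / math.pi) * math.exp(-mu * mu / (2.0 * phi))
            + mu * (1.0 - 2.0 * std_normal_cdf(-mu / sd)))
    var = phi + mu * mu - mean * mean
    return mean, var

# Underlying mean 0: the folded mean is sqrt(2*phi/pi), not 0.
mean0, var0 = folded_normal_moments(0.0, 1.0)
# Underlying mean far from 0: folding has almost no effect.
mean10, var10 = folded_normal_moments(10.0, 1.0)
```

When the underlying mean is zero, the mean of the response is clearly different from $\mu$, which is why reported output statistics for $Y$ need not match the modeled parameters.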
In this parameterization, $\mathrm{E}[Y]=\mu$ and $\mathrm{Var}[Y]=\mu^2/\phi$. This parameterization of the gamma distribution differs from that in the GLIMMIX procedure, which expresses the log-likelihood function in terms of $1/\phi$ in order to achieve a variance function suitable for mixed model analysis.
\[ l(\mu;y) = y\log\{\mu\} - (y+1)\log\{1+\mu\} \]
\[ l(\mu;y,w) = w\,l(\mu;y) \]
In this parameterization, $\mathrm{E}[Y]=\mu$ and $\mathrm{Var}[Y]=\mu+\mu^2$. The geometric distribution is a special case of the negative binomial distribution with $\phi=1$.
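\[ \xi = 1-\exp\{-\phi\} \]
\[ \mu^{*} = \mu - \xi(\mu-y) \]
\[ l(\mu,\phi;y) = \log\{\mu(1-\xi)\} + (y-1)\log\{\mu^{*}\} - \mu^{*} - \log\{\Gamma(y+1)\} \]
\[ l(\mu,\phi;y,w) = w\,l(\mu,\phi;y) \]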
In this parameterization, with $\xi = 1-\exp\{-\phi\}$, $\mathrm{E}[Y]=\mu$ and $\mathrm{Var}[Y]=\mu/(1-\xi)^2$. The FMM procedure models the mean through the effects in the MODEL statement and applies a log link by default. The generalized Poisson distribution provides an overdispersed alternative to the Poisson distribution; $\phi=0$ produces the mass function of a regular Poisson random variable. For details about the generalized Poisson distribution and a comparison with the negative binomial distribution, see Joe and Zhu (2005).
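The overdispersion relative to the Poisson distribution can be checked by direct summation. The sketch below assumes a generalized Poisson mass function of the form discussed by Joe and Zhu (2005), reparameterized with mean $\mu$ and dispersion $\xi = 1-\exp\{-\phi\}$; this mapping between $\phi$ and $\xi$, the function names, and the truncation point `upper` are assumptions of the sketch.

```python
import math

def genpoisson_logpmf(y, mu, phi):
    """Log mass function of a generalized Poisson variable with mean mu,
    assuming the dispersion mapping xi = 1 - exp(-phi)."""
    xi = 1.0 - math.exp(-phi)
    mu_star = mu - xi * (mu - y)
    return (math.log(mu * (1.0 - xi)) + (y - 1) * math.log(mu_star)
            - mu_star - math.lgamma(y + 1))

def genpoisson_moments(mu, phi, upper=500):
    """Approximate total probability, mean, and variance by summing the
    mass function over y = 0..upper."""
    probs = [math.exp(genpoisson_logpmf(y, mu, phi)) for y in range(upper + 1)]
    total = sum(probs)
    mean = sum(y * p for y, p in enumerate(probs))
    var = sum((y - mean) ** 2 * p for y, p in enumerate(probs))
    return total, mean, var

total, mean, var = genpoisson_moments(mu=2.0, phi=0.5)
# Under the assumed mapping, mu / (1 - xi)^2 = mu * exp(2 * phi).
closed_var = 2.0 * math.exp(1.0)
# With phi = 0 the mass function is that of a Poisson variable.
poisson_check = math.exp(genpoisson_logpmf(3, 2.0, 0.0))
```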
For the inverse Gaussian distribution, the variance is $\mathrm{Var}[Y]=\phi\mu^3$.
If $X=\log\{Y\}$ has a normal distribution with mean $\mu$ and variance $\phi$, then $Y$ has the log-likelihood function
\[ l(\mu,\phi;y,w) = -\frac{1}{2}\left[\log\{\phi/w\} + \frac{w(\log\{y\}-\mu)^2}{\phi} + \log\{2\pi\}\right] - \log\{y\} \]
The FMM procedure models the lognormal distribution and not the “shortcut” version you can obtain by taking the logarithm of a random variable and modeling that as normally distributed. The two approaches are not equivalent, and the approach taken by PROC FMM is the actual lognormal distribution. Although the lognormal model is a member of the exponential family of distributions, it is not in the “natural” exponential family because it cannot be written in canonical form.
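In terms of the parameters $\mu$ and $\phi$ of the underlying normal process for $X$, the mean and variance of $Y$ are $\mathrm{E}[Y]=\exp\{\mu\}\sqrt{\omega}$ and $\mathrm{Var}[Y]=\exp\{2\mu\}\,\omega(\omega-1)$, respectively, where $\omega=\exp\{\phi\}$. When you request predicted values with the OUTPUT statement, the FMM procedure computes $\mathrm{E}[Y]$ and not $\mu$.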
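The difference between a predicted mean on the original scale and the back-transformed mean of the underlying normal variable can be seen in a small calculation. This Python sketch uses the standard lognormal moment formulas; the function name is hypothetical.

```python
import math

def lognormal_moments(mu, phi):
    """Mean and variance of Y when log(Y) is normal with mean mu and
    variance phi, written in terms of omega = exp(phi)."""
    omega = math.exp(phi)
    mean = math.exp(mu) * math.sqrt(omega)
    var = math.exp(2.0 * mu) * omega * (omega - 1.0)
    return mean, var

mean, var = lognormal_moments(mu=0.0, phi=1.0)
naive_back_transform = math.exp(0.0)  # exp(mu), the median of Y, not its mean
```

With `mu = 0` and `phi = 1`, the mean of $Y$ is $\exp\{0.5\}$, whereas exponentiating $\mu$ gives 1, so the two quantities differ whenever $\phi > 0$.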
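\[ l(\mu,\phi;y) = y\log\{\phi\mu\} - (y+1/\phi)\log\{1+\phi\mu\} + \log\left\{\frac{\Gamma(y+1/\phi)}{\Gamma(1/\phi)\,\Gamma(y+1)}\right\} \]
\[ l(\mu,\phi;y,w) = w\,l(\mu,\phi;y) \]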
The variance is $\mathrm{Var}[Y]=\mu+\phi\mu^2$.
For a given $\phi$, the negative binomial distribution is a member of the exponential family. The parameter $\phi$ is related to the scale of the data because it is part of the variance function. However, it cannot be factored from the variance, as is the case with the parameter $\phi$ in many other distributions.
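For the normal distribution, the mean and variance are $\mathrm{E}[Y]=\mu$ and $\mathrm{Var}[Y]=\phi$, respectively.
For the Poisson distribution, the mean and variance are $\mathrm{E}[Y]=\mu$ and $\mathrm{Var}[Y]=\mu$.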
|
|
|
|
|
|
In this parameterization, $\mathrm{E}[Y]=\mu$ and $\mathrm{Var}[Y]=\phi\nu/(\nu-2)$, where $\nu$ denotes the degrees of freedom. Note that this form of the t distribution is not a noncentral distribution but that of a shifted central t random variable.
|
|
|
|
where $\gamma(a,b)=\int_0^b t^{a-1}\exp\{-t\}\,dt$ is the lower incomplete gamma function. The mean and variance are
|
|
|
|
|
|
|
|
|
|
|
|
where $\Phi(\cdot)$ is the cumulative distribution function of the standard normal distribution. The mean and variance are
|
|
|
|
|
|
|
|
|
|
The mean and variance are
|
|
|
|
|
|
|
|
where $\Phi(\cdot)$ is the cumulative distribution function of the standard normal distribution. The mean and variance are
|
|
|
|
|
|
where is the probability density function of the standard normal distribution.
The mean and variance are
|
|
|
|
The mean and variance are and .
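\[ l(\mu,\phi;y) = -\frac{\phi-1}{\phi}\log\left\{\frac{y}{\mu}\right\} - \log\{\mu\phi\} - \exp\left\{\frac{\log\{y/\mu\}}{\phi}\right\} \]
\[ l(\mu,\phi;y,w) = w\,l(\mu,\phi;y) \]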
In this particular parameterization of the two-parameter Weibull distribution, the mean and variance of the random variable $Y$ are $\mathrm{E}[Y]=\mu\,\Gamma(1+\phi)$ and $\mathrm{Var}[Y]=\mu^2\{\Gamma(1+2\phi)-\Gamma^2(1+\phi)\}$.