The FMM Procedure

Log-Likelihood Functions for Response Distributions

The FMM procedure calculates the log likelihood that corresponds to a particular response distribution according to the following formulas. The response distribution is the distribution specified (or chosen by default) through the DIST= option in the MODEL statement. The parameterizations used for log-likelihood functions of these distributions were chosen to facilitate expressions in terms of mean parameters that are modeled through an (inverse) link function and in terms of scale parameters. These are not necessarily the parameterizations in which parameters of prior distributions are specified in a Bayesian analysis of homogeneous mixtures. See the section Prior Distributions for details about the parameterizations of prior distributions.

The FMM procedure includes all constant terms in the computation of densities or mass functions. In the expressions that follow, l denotes the log-likelihood function, $\phi$ denotes a general scale parameter, $\mu _ i$ is the “mean”, and $w_ i$ is a weight from the use of a WEIGHT statement.

For some distributions (for example, the Weibull distribution) $\mu _ i$ is not the mean of the distribution. The parameter $\mu _ i$ is the quantity that is modeled as $g^{-1}(\mb {x}'\bbeta )$ , where $g^{-1}(\cdot )$ is the inverse link function and the $\mb {x}$ vector is constructed based on the effects in the MODEL statement. Situations in which the parameter $\mu$ does not represent the mean of the distribution are explicitly mentioned in the list that follows.

The parameter $\phi$ is frequently labeled as a “Scale” parameter in output from the FMM procedure. It is not necessarily the scale parameter of the particular distribution.

Beta $(\mu ,\phi )$

$\begin{align*} l(\mu _ i,\phi ;y_ i,w_ i) & = \log \left\{ \frac{\Gamma (\phi /w_ i)}{\Gamma (\mu _ i\phi /w_ i)\Gamma ((1-\mu _ i)\phi /w_ i)}\right\} \\ & \mbox{} + \, (\mu _ i\phi /w_ i - 1)\log \{ y_ i\} \\ & \mbox{} + \, ((1-\mu _ i)\phi /w_ i - 1)\log \{ 1-y_ i\} \end{align*}$

This parameterization of the beta distribution is due to Ferrari and Cribari-Neto (2004) and has properties $\mr {E}[Y] = \mu$ , $\mr {Var}[Y] = \mu (1-\mu )/(1+\phi ), \, \phi > 0$ .
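As a numerical illustration of the beta formula, the following Python sketch evaluates the weighted log likelihood with the standard library only. It is not SAS code, and the function names `beta_loglik` and `beta_moments` are hypothetical. With $\mu = 0.5$ and $\phi = 2$ both shape parameters equal 1, so the density is uniform on (0, 1) and the unweighted log likelihood is 0.

```python
import math

def beta_loglik(y, mu, phi, w=1.0):
    """Weighted beta log likelihood in the mean/scale form (hypothetical helper)."""
    a = mu * phi / w            # first shape parameter, mu*phi/w
    b = (1.0 - mu) * phi / w    # second shape parameter, (1-mu)*phi/w
    return (math.lgamma(phi / w) - math.lgamma(a) - math.lgamma(b)
            + (a - 1.0) * math.log(y)
            + (b - 1.0) * math.log(1.0 - y))

def beta_moments(mu, phi):
    """Mean and variance implied by the parameterization (unweighted case)."""
    return mu, mu * (1.0 - mu) / (1.0 + phi)

if __name__ == "__main__":
    print(beta_loglik(0.3, 0.5, 2.0))   # uniform case: log density is 0
    print(beta_moments(0.5, 2.0))       # variance 1/12, as for Uniform(0, 1)
```

Dividing $\phi$ by the weight means that a larger weight lowers the effective precision parameter in this formula; the sketch simply mirrors the expression as written.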

Beta-binomial $(n;\mu ,\phi )$

$\begin{align*} \phi & = (1-\rho ^2)/\rho ^2 \\ l(\mu _ i,\rho ; y_ i) & = \log \{ \Gamma (n_ i+1)\} - \log \{ \Gamma (y_ i+1)\} \\ & - \log \{ \Gamma (n_ i-y_ i+1)\} \\ & + \log \{ \Gamma (\phi )\} - \log \{ \Gamma (n_ i+\phi )\} + \log \{ \Gamma (y_ i+\phi \mu _ i)\} \\ & + \log \{ \Gamma (n_ i-y_ i + \phi (1-\mu _ i))\} - \log \{ \Gamma (\phi \mu _ i)\} \\ & - \log \{ \Gamma (\phi (1-\mu _ i))\} \\ l(\mu _ i,\rho ; y_ i,w_ i) & = w_ i l(\mu _ i,\rho ; y_ i) \end{align*}$

where $y_ i$ and $n_ i$ are the events and trials in the events/trials syntax and $0 < \mu < 1$ . This parameterization of the beta-binomial model presents the distribution as a special case of the Dirichlet-Multinomial distribution—see, for example, Neerchal and Morel (1998). In this parameterization, $\mr {E}[Y] = n\mu$ and $\mr {Var}[Y] = n\mu (1-\mu )(1+(n-1)/(\phi +1)), \, 0 \le \rho \le 1$ . The FMM procedure models the parameter $\phi$ and labels it “Scale” on the procedure output. For other parameterizations of the beta-binomial model, see Griffiths (1973) or Williams (1975).
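The moment expressions can be checked numerically against the log-likelihood formula. The following Python sketch (standard library only; `betabinomial_loglik` and `betabinomial_summary` are hypothetical names, and the parameter values are arbitrary) exponentiates the log likelihood over all outcomes $y = 0, \ldots, n$ and compares the resulting mean and variance with $n\mu$ and $n\mu (1-\mu )(1+(n-1)/(\phi +1))$.

```python
import math

def betabinomial_loglik(y, n, mu, phi, w=1.0):
    """Beta-binomial log likelihood in terms of mu and phi, times the weight w."""
    ll = (math.lgamma(n + 1) - math.lgamma(y + 1) - math.lgamma(n - y + 1)
          + math.lgamma(phi) - math.lgamma(n + phi)
          + math.lgamma(y + phi * mu)
          + math.lgamma(n - y + phi * (1.0 - mu))
          - math.lgamma(phi * mu) - math.lgamma(phi * (1.0 - mu)))
    return w * ll

def betabinomial_summary(n, mu, phi):
    """Total probability, mean, and variance computed by direct summation."""
    pmf = [math.exp(betabinomial_loglik(y, n, mu, phi)) for y in range(n + 1)]
    total = sum(pmf)
    mean = sum(y * p for y, p in enumerate(pmf))
    var = sum((y - mean) ** 2 * p for y, p in enumerate(pmf))
    return total, mean, var

if __name__ == "__main__":
    n, mu, phi = 10, 0.3, 4.0
    total, mean, var = betabinomial_summary(n, mu, phi)
    closed_form_var = n * mu * (1 - mu) * (1 + (n - 1) / (phi + 1))
    print(total, mean, var, closed_form_var)
```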

Binomial $(n;\mu )$

$\begin{align*} l(\mu _ i;y_ i) & = y_ i \log \{ \mu _ i\} + (n_ i - y_ i) \log \{ 1-\mu _ i\} \\ & + \log \{ \Gamma (n_ i+1)\} - \log \{ \Gamma (y_ i+1)\} \\ & - \log \{ \Gamma (n_ i-y_ i+1)\} \\ l(\mu _ i;y_ i,w_ i) & = w_ i \, l(\mu _ i;y_ i) \end{align*}$

where $y_ i$ and $n_ i$ are the events and trials in the events/trials syntax and $0 < \mu < 1$ . In this parameterization $\mr {E}[Y] = n\mu$ , $\mr {Var}[Y] = n\mu (1-\mu )$ .

Binomial cluster $(n;\mu ,\pi )$

$\begin{align*} z & = \log \{ \Gamma (n_ i+1)\} - \log \{ \Gamma (y_ i+1)\} - \log \{ \Gamma (n_ i-y_ i+1)\} \\ \mu ^*_ i & = (1-\mu _ i)\pi \end{align*}$

$\begin{align*} l(\mu _ i,\pi ;y_ i) = z + \log \{ & \pi (\mu _ i^* + \mu _ i)^{y_ i} (1 - \mu _ i^* - \mu _ i)^{n_ i - y_ i} \\ & + (1-\pi ) (\mu _ i^*)^{y_ i} (1-\mu _ i^*)^{n_ i - y_ i} \} \end{align*}$

$\begin{align*} l(\mu _ i,\pi ;y_ i,w_ i) & = w_ i l(\mu _ i,\pi ;y_ i) \end{align*}$

In this parameterization, $\mr {E}[Y] = n\pi$ and $\mr {Var}[Y] = n\pi (1-\pi )\left\{ 1+\mu ^2(n-1)\right\}$ . The binomial cluster model is a two-component mixture of a binomial $(n,\mu ^*+\mu )$ and a binomial $(n,\mu ^*)$ random variable. This mixture is unusual in that it fixes the number of components and because the mixing probability $\pi$ appears in the moments of the mixture components. For further details, see Morel and Nagaraj (1993); Morel and Neerchal (1997); Neerchal and Morel (1998) and Example 39.1 in this chapter. The expressions for the mean and variance in the binomial cluster model are identical to those of the beta-binomial model shown previously, with $\pi _{bc} = \mu _{bb}$ , $\mu _{bc} = \rho _{bb}$ .

The FMM procedure models the parameter $\mu$ through the MODEL statement and the parameter $\pi$ through the PROBMODEL statement.
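To make the two-component structure concrete, here is a minimal Python sketch of the binomial cluster log likelihood (standard library only; the function names and the parameter values are hypothetical). It sums the implied mass function over $y = 0, \ldots, n$ and compares the mean and variance with $n\pi$ and $n\pi (1-\pi )\{ 1+\mu ^2(n-1)\}$.

```python
import math

def binomial_cluster_loglik(y, n, mu, pi, w=1.0):
    """Binomial cluster log likelihood: mixture of binomial(n, mu*+mu) and binomial(n, mu*)."""
    z = math.lgamma(n + 1) - math.lgamma(y + 1) - math.lgamma(n - y + 1)
    mu_star = (1.0 - mu) * pi
    p_hi = mu_star + mu                      # success probability of the first component
    mix = (pi * p_hi ** y * (1.0 - p_hi) ** (n - y)
           + (1.0 - pi) * mu_star ** y * (1.0 - mu_star) ** (n - y))
    return w * (z + math.log(mix))

def binomial_cluster_summary(n, mu, pi):
    """Total probability, mean, and variance by direct summation over y = 0..n."""
    pmf = [math.exp(binomial_cluster_loglik(y, n, mu, pi)) for y in range(n + 1)]
    total = sum(pmf)
    mean = sum(y * p for y, p in enumerate(pmf))
    var = sum((y - mean) ** 2 * p for y, p in enumerate(pmf))
    return total, mean, var

if __name__ == "__main__":
    n, mu, pi = 8, 0.4, 0.3
    print(binomial_cluster_summary(n, mu, pi))
    print(n * pi, n * pi * (1 - pi) * (1 + mu ** 2 * (n - 1)))
```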

Constant(c)

$l(y_ i) = \left\{ \begin{array}{ll} 0 & |y_ i - c | < \epsilon \cr -1\mr {E}20 & |y_ i - c | \ge \epsilon \end{array}\right.$

The extreme value when $|y_ i - c | \ge \epsilon$ is an approximation for $\mr {log}(0)=-\infty$ , chosen so that $\exp \{ l(y_ i)\}$ yields a likelihood of zero. You can change this value with the INVALIDLOGL= option in the PROC FMM statement. The constant distribution is useful for modeling overdispersion due to zero-inflation (or inflation of the process at support c).

The DIST=CONSTANT distribution is useful for modeling an inflated probability of observing a particular value (zero, by default) in data from other discrete distributions, as demonstrated in “Modeling Zero-Inflation: Is it Better to Fish Poorly or Not to Have Fished at All?”. While it is syntactically valid to mix a constant distribution with a continuous distribution, such as DIST=LOGNORMAL, such a mixture is not mathematically appropriate, because the constant log-likelihood is the log of a probability, while a continuous log-likelihood is the log of a probability density function. If you want to mix a constant distribution with a continuous distribution, you could model the constant as a very narrow continuous distribution, such as DIST=UNIFORM( $c-\delta$ , $c+\delta$ ) for a small value $\delta$ . However, using PROC FMM to analyze such mixtures is sensitive to numerical inaccuracy and ultimately unnecessary. Instead, the following approach is mathematically equivalent and more numerically stable:

Estimate the mixing probability $\mr {P}(Y=c)$ as the proportion of observations in the data set such that $|y_ i - c| < \epsilon$ .
Estimate the parameters of the continuous distribution from the observations for which $|y_ i - c | \ge \epsilon$ .
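The two steps above can be sketched outside of PROC FMM as follows. This Python example is only an illustration under stated assumptions: the continuous component is taken to be lognormal and is fitted by maximum likelihood (mean and variance of $\log \{ y_ i\}$ with divisor equal to the number of observations), the tolerance `eps` is an arbitrary choice, and the function name `split_constant_mixture` is hypothetical.

```python
import math

def split_constant_mixture(ys, c=0.0, eps=1e-8):
    """Two-step estimate for a constant-plus-lognormal mixture (illustrative assumption).

    Step 1: P(Y = c) is the share of observations with |y - c| < eps.
    Step 2: the lognormal parameters are estimated from the remaining observations.
    """
    rest = [y for y in ys if abs(y - c) >= eps]
    p_c = (len(ys) - len(rest)) / len(ys)
    logs = [math.log(y) for y in rest]                 # requires y > 0 away from c
    mu_hat = sum(logs) / len(logs)
    phi_hat = sum((v - mu_hat) ** 2 for v in logs) / len(logs)
    return p_c, mu_hat, phi_hat

if __name__ == "__main__":
    data = [0.0, 0.0, 1.0, math.e, math.e ** 2]
    print(split_constant_mixture(data))
```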

Exponential $(\mu )$

$l(\mu _ i;y_ i,w_ i) = \left\{ \begin{array}{ll} -\log \{ \mu _ i\} - y_ i/\mu _ i & w_ i = 1 \cr w_ i\log \left\{ \frac{w_ iy_ i}{\mu _ i}\right\} - \frac{w_ iy_ i}{\mu _ i} - \log \{ y_ i \Gamma (w_ i)\} & w_ i \not= 1 \end{array} \right.$

In this parameterization, $\mr {E}[Y] = \mu$ and $\mr {Var}[Y] = \mu ^2$ .

Folded normal $(\mu , \phi )$

$\begin{align*} l(\mu _ i,\phi ;y_ i,w_ i) =& -\frac{1}{2}\log \{ 2\pi \} -\frac{1}{2}\log \{ \phi /w_ i\} \\ +& \log \left\{ \exp \left\{ \frac{-w_ i (y_ i-\mu _ i)^2}{2\phi } \right\} + \exp \left\{ \frac{-w_ i (y_ i+\mu _ i)^2}{2\phi } \right\} \right\} \end{align*}$

If X has a normal distribution with mean $\mu$ and variance $\phi$ , then $Y = |X|$ has a folded normal distribution and log-likelihood function $l(\mu ,\phi ;y,w)$ for $y \geq 0$ . The folded normal distribution arises, for example, when normally distributed measurements are observed, but their signs are not observed. The mean and variance of the folded normal in terms of the underlying $\mr {N}(\mu ,\phi )$ distribution are

$\begin{align*} \mr {E}[Y] =& \sqrt {\frac{2\phi }{\pi }} \exp \left\{ -\frac{\mu ^2}{2\phi } \right\} + \mu \left(1-2\Phi \left(-\mu /\sqrt {\phi }\right)\right)\\ \mr {Var}[Y] =& \phi + \mu ^2 - \mr {E}[Y]^2 \end{align*}$

The FMM procedure models the folded normal distribution through the mean $\mu$ and variance $\phi$ of the underlying normal distribution. When the FMM procedure computes output statistics for the response variable (for example when you use the OUTPUT statement), the mean and variance of the response Y are reported. Similarly, the fit statistics apply to the distribution of $Y = |X|$ , not the distribution of X. When you model a folded normal variable, the response input variable should be positive; the FMM procedure treats negative values of Y as a support violation.
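The following Python sketch (standard library only; all function names are hypothetical) evaluates the folded normal log likelihood and compares a numerically integrated mean with the standard closed form for $Y = |X|$, $X \sim \mr {N}(\mu ,\phi )$, namely $\sqrt {2\phi /\pi }\exp \{ -\mu ^2/(2\phi )\} + \mu (1-2\Phi (-\mu /\sqrt {\phi }))$. The integration uses a simple midpoint rule, which is an implementation choice made for this illustration.

```python
import math

def norm_cdf(x):
    """Standard normal cumulative distribution function via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def folded_normal_loglik(y, mu, phi, w=1.0):
    """Weighted folded normal log likelihood for y >= 0."""
    return (-0.5 * math.log(2.0 * math.pi) - 0.5 * math.log(phi / w)
            + math.log(math.exp(-w * (y - mu) ** 2 / (2.0 * phi))
                       + math.exp(-w * (y + mu) ** 2 / (2.0 * phi))))

def folded_normal_mean(mu, phi):
    """Closed-form mean of |X| for X ~ N(mu, phi)."""
    return (math.sqrt(2.0 * phi / math.pi) * math.exp(-mu ** 2 / (2.0 * phi))
            + mu * (1.0 - 2.0 * norm_cdf(-mu / math.sqrt(phi))))

def folded_normal_mean_numeric(mu, phi, steps=20000):
    """Midpoint-rule approximation of E[Y] from the density exp(loglik)."""
    upper = abs(mu) + 12.0 * math.sqrt(phi)
    h = upper / steps
    total = 0.0
    for k in range(steps):
        y = (k + 0.5) * h
        total += y * math.exp(folded_normal_loglik(y, mu, phi)) * h
    return total

if __name__ == "__main__":
    print(folded_normal_mean(1.0, 2.0), folded_normal_mean_numeric(1.0, 2.0))
```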

Gamma $(\mu ,\phi )$

$l(\mu _ i,\phi ;y_ i,w_ i) = w_ i\phi \log \left\{ \frac{w_ iy_ i\phi }{\mu _ i}\right\} - \frac{w_ iy_ i\phi }{\mu _ i} - \log \{ y_ i\} - \log \left\{ \Gamma (w_ i\phi )\right\}$

In this parameterization, $\mr {E}[Y] = \mu$ and $\mr {Var}[Y] = \mu ^2/\phi , \, \phi > 0$ . This parameterization of the gamma distribution differs from that in the GLIMMIX procedure, which expresses the log-likelihood function in terms of $1/\phi$ in order to achieve a variance function suitable for mixed model analysis.
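Setting $\phi = 1$ in the gamma log likelihood reproduces both branches of the exponential log likelihood given earlier. The Python sketch below (standard library only; the function names are hypothetical) checks this numerically for a few arbitrary values.

```python
import math

def gamma_loglik(y, mu, phi, w=1.0):
    """Weighted gamma log likelihood in the mean/scale form."""
    return (w * phi * math.log(w * y * phi / mu) - w * y * phi / mu
            - math.log(y) - math.lgamma(w * phi))

def exponential_loglik(y, mu, w=1.0):
    """Weighted exponential log likelihood, written with the same two cases as the text."""
    if w == 1.0:
        return -math.log(mu) - y / mu
    # log{y * Gamma(w)} is computed as log(y) + lgamma(w) to avoid overflow for large w
    return w * math.log(w * y / mu) - w * y / mu - (math.log(y) + math.lgamma(w))

if __name__ == "__main__":
    for y, mu, w in [(0.7, 2.0, 1.0), (3.1, 1.5, 2.5), (0.2, 0.4, 0.5)]:
        print(gamma_loglik(y, mu, 1.0, w), exponential_loglik(y, mu, w))
```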

Geometric $(\mu )$

$\begin{align*} l(\mu _ i;y_ i,w_ i) & = y_ i \log \left\{ \frac{\mu _ i}{w_ i}\right\} - (y_ i + w_ i) \log \left\{ 1 + \frac{\mu _ i}{w_ i}\right\} \\ & + \log \left\{ \frac{\Gamma (y_ i + w_ i)}{\Gamma (w_ i) \Gamma (y_ i + 1)} \right\} \end{align*}$

In this parameterization, $\mr {E}[Y] = \mu$ and $\mr {Var}[Y] = \mu + \mu ^2$ . The geometric distribution is a special case of the negative binomial distribution with $\phi =1$ .

Generalized Poisson $(\mu ,\phi )$

$\begin{align*} \xi _ i & = \left(1-\exp \{ -\phi \} \right)/w_ i \\ \mu ^*_ i & = \mu _ i - \xi _ i (\mu _ i - y_ i)\\ l(\mu ^*_ i,\xi _ i;y_ i,w_ i) & = \log \{ \mu ^*_ i - \xi _ i y_ i\} + (y_ i-1)\log \{ \mu ^*_ i\} \\ & - \mu ^*_ i - \log \{ \Gamma (y_ i+1)\} \end{align*}$

In this parameterization, $\mr {E}[Y]=\mu$ , $\mr {Var}[Y] = \mu /(1-\xi )^2,$ and $\phi \ge 0$ . The FMM procedure models the mean $\mu$ through the effects in the MODEL statement and applies a log link by default. The generalized Poisson distribution provides an overdispersed alternative to the Poisson distribution; $\phi = \xi _ i = 0$ produces the mass function of a regular Poisson random variable. For details about the generalized Poisson distribution and a comparison with the negative binomial distribution, see Joe and Zhu (2005).
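A short Python sketch (standard library only; hypothetical function names, arbitrary parameter values, unit weight) can confirm the stated moments by summing the mass function implied by the log likelihood. Because $1-\xi = \exp \{ -\phi \}$ when the weight is 1, the variance $\mu /(1-\xi )^2$ equals $\mu \exp \{ 2\phi \}$ in that case. The summation is truncated at a large count `y_max`, which is an approximation chosen for this illustration.

```python
import math

def genpoisson_loglik(y, mu, phi, w=1.0):
    """Generalized Poisson log likelihood in terms of the mean mu and scale phi."""
    xi = (1.0 - math.exp(-phi)) / w
    mu_star = mu - xi * (mu - y)
    return (math.log(mu_star - xi * y) + (y - 1) * math.log(mu_star)
            - mu_star - math.lgamma(y + 1))

def genpoisson_summary(mu, phi, y_max=400):
    """Total probability, mean, and variance by summation over y = 0..y_max."""
    pmf = [math.exp(genpoisson_loglik(y, mu, phi)) for y in range(y_max + 1)]
    total = sum(pmf)
    mean = sum(y * p for y, p in enumerate(pmf))
    var = sum((y - mean) ** 2 * p for y, p in enumerate(pmf))
    return total, mean, var

if __name__ == "__main__":
    mu, phi = 3.0, 0.5
    print(genpoisson_summary(mu, phi))
    print(mu, mu * math.exp(2.0 * phi))
```

With `phi = 0` the function reduces to the ordinary Poisson log likelihood, in line with the remark above.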

Inverse Gaussian $(\mu ,\phi )$

$l(\mu _ i,\phi ;y_ i,w_ i) = -\frac{1}{2} \left[ \frac{w_ i(y_ i-\mu _ i)^2}{y_ i\phi \mu _ i^2} + \log \left\{ \frac{\phi y_ i^3}{w_ i} \right\} + \log \{ 2\pi \} \right]$

The variance is $\mr {Var}[Y] = \phi \mu ^3, \, \phi > 0$ .

Lognormal $(\mu ,\phi )$

$\begin{align*} z_ i & = \log \{ y_ i\} - \mu _ i \\ l(\mu _ i,\phi ;y_ i,w_ i) & = -\frac{1}{2}\left( 2\log \{ y_ i\} + \log \left\{ \frac{\phi }{w_ i} \right\} + \log \{ 2\pi \} + \frac{w_ i z_ i^2}{\phi } \right) \end{align*}$

If $X = \log \{ Y\}$ has a normal distribution with mean $\mu$ and variance $\phi$ , then Y has the log-likelihood function $l(\mu _ i,\phi ;y_ i,w_ i)$ . The FMM procedure models the lognormal distribution and not the “shortcut” version you can obtain by taking the logarithm of a random variable and modeling that as normally distributed. The two approaches are not equivalent, and the approach taken by PROC FMM is the actual lognormal distribution. Although the lognormal model is a member of the exponential family of distributions, it is not in the “natural” exponential family because it cannot be written in canonical form.

In terms of the parameters $\mu$ and $\phi$ of the underlying normal process for X, the mean and variance of Y are $\mr {E}[Y] = \exp \{ \mu \} \sqrt {\omega }$ and $\mr {Var}[Y] = \exp \{ 2\mu \} \omega (\omega -1)$ , respectively, where $\omega = \exp \{ \phi \}$ . When you request predicted values with the OUTPUT statement, the FMM procedure computes $\mr {E}[Y]$ and not $\mu$ .
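One visible difference between the lognormal log likelihood and a normal log likelihood applied to $\log \{ y_ i\}$ is the Jacobian term $-\log \{ y_ i\}$, which follows directly from the formula above. The Python sketch below (standard library only; the function names are hypothetical) illustrates that term together with the moment expressions.

```python
import math

def normal_loglik(x, mu, phi, w=1.0):
    """Weighted normal log likelihood with variance phi."""
    return -0.5 * (w * (x - mu) ** 2 / phi + math.log(phi / w) + math.log(2.0 * math.pi))

def lognormal_loglik(y, mu, phi, w=1.0):
    """Weighted lognormal log likelihood for the response on its original scale."""
    z = math.log(y) - mu
    return -0.5 * (2.0 * math.log(y) + math.log(phi / w)
                   + math.log(2.0 * math.pi) + w * z * z / phi)

def lognormal_moments(mu, phi):
    """E[Y] and Var[Y] in terms of the underlying normal parameters."""
    omega = math.exp(phi)
    return math.exp(mu) * math.sqrt(omega), math.exp(2.0 * mu) * omega * (omega - 1.0)

if __name__ == "__main__":
    y, mu, phi = 2.5, 0.3, 0.8
    shortcut = normal_loglik(math.log(y), mu, phi)
    print(lognormal_loglik(y, mu, phi), shortcut - math.log(y))  # the two values agree
    print(lognormal_moments(mu, phi))
```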

Negative binomial $(\mu ,\phi )$

$\begin{align*} l(\mu _ i,\phi ;y_ i,w_ i) & = y_ i \log \left\{ \frac{\phi \mu _ i}{w_ i}\right\} - (y_ i + w_ i / \phi )\log \left\{ 1 + \frac{\phi \mu _ i}{w_ i}\right\} \\ & + \log \left\{ \frac{\Gamma (y_ i + w_ i/\phi )}{\Gamma (w_ i/\phi ) \Gamma (y_ i + 1)} \right\} \end{align*}$

The variance is $\mr {Var}[Y] = \mu + \phi \mu ^2, \, \phi > 0$ .

For a given $\phi$ , the negative binomial distribution is a member of the exponential family. The parameter $\phi$ is related to the scale of the data because it is part of the variance function. However, it cannot be factored from the variance, as is the case with the $\phi$ parameter in many other distributions.
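As a check on the formula, the following Python sketch (standard library only; hypothetical function names and arbitrary parameter values) sums the negative binomial mass function to recover the mean $\mu$ and variance $\mu + \phi \mu ^2$, and confirms that $\phi = 1$ reproduces the geometric log likelihood given earlier. The sum is truncated at `y_max`, an approximation chosen for this illustration.

```python
import math

def negbin_loglik(y, mu, phi, w=1.0):
    """Weighted negative binomial log likelihood in the mean/scale form."""
    k = w / phi
    return (y * math.log(phi * mu / w) - (y + k) * math.log(1.0 + phi * mu / w)
            + math.lgamma(y + k) - math.lgamma(k) - math.lgamma(y + 1))

def geometric_loglik(y, mu, w=1.0):
    """Weighted geometric log likelihood, as given in the geometric entry."""
    return (y * math.log(mu / w) - (y + w) * math.log(1.0 + mu / w)
            + math.lgamma(y + w) - math.lgamma(w) - math.lgamma(y + 1))

def negbin_summary(mu, phi, y_max=500):
    """Total probability, mean, and variance by summation over y = 0..y_max."""
    pmf = [math.exp(negbin_loglik(y, mu, phi)) for y in range(y_max + 1)]
    total = sum(pmf)
    mean = sum(y * p for y, p in enumerate(pmf))
    var = sum((y - mean) ** 2 * p for y, p in enumerate(pmf))
    return total, mean, var

if __name__ == "__main__":
    print(negbin_summary(2.0, 0.5))                      # variance should be near 2 + 0.5 * 4
    print(negbin_loglik(3, 2.0, 1.0, 1.5), geometric_loglik(3, 2.0, 1.5))
```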

Normal $(\mu ,\phi )$

$l(\mu _ i,\phi ;y_ i,w_ i) = -\frac{1}{2} \left[ \frac{w_ i(y_ i-\mu _ i)^2}{\phi } + \log \left\{ \frac{\phi }{w_ i}\right\} + \log \{ 2\pi \} \right]$

The mean and variance are $\mr {E}[Y] = \mu$ and $\mr {Var}[Y] = \phi$ , respectively, where $\phi > 0$ .

Poisson $(\mu )$

$l(\mu _ i;y_ i,w_ i) = w_ i (y_ i \log \{ \mu _ i\} - \mu _ i - \log \{ \Gamma (y_ i + 1)\} )$

The mean and variance are $\mr {E}[Y] = \mu$ and $\mr {Var}[Y] = \mu$ .

(Shifted) T $(\nu ;\mu ,\phi )$

$\begin{align*} z_ i & = -0.5\log \{ \phi /\sqrt {w_ i}\} + \log \left\{ \Gamma (0.5(\nu +1))\right\} \\ & - \log \left\{ \Gamma (0.5\nu )\right\} - 0.5\times \log \left\{ \pi \nu \right\} \\ l(\mu _ i,\phi ;y_ i,w_ i) & = - \left(\frac{\nu +1}{2}\right) \log \left\{ 1+\frac{w_ i}{\nu } \frac{(y_ i-\mu _ i)^2}{\phi }\right\} + z_ i \end{align*}$

In this parameterization, $\mr {E}[Y] = \mu$ and $\mr {Var}[Y] = \phi \nu /(\nu -2)$ , where $\phi > 0$ and $\nu > 0$ ; the mean exists only for $\nu > 1$ and the variance is finite only for $\nu > 2$ . Note that this form of the t distribution is not a non-central distribution, but that of a shifted central t random variable.

Truncated Exponential $(\mu ; a, b)$

$\begin{align*} l(\mu _ i; a, b, y_ i, w_ i) & = w_ i\log \left\{ \frac{w_ iy_ i}{\mu _ i}\right\} - \frac{w_ iy_ i}{\mu _ i} - \log \{ y_ i \Gamma (w_ i)\} \\ & - \log \left[ \frac{\gamma \left(w_ i, \frac{w_ i b}{\mu _ i} \right)}{\Gamma (w_ i)} - \frac{\gamma \left(w_ i, \frac{w_ i a}{\mu _ i} \right)}{\Gamma (w_ i)} \right] \end{align*}$

where

$\gamma (c_1, c_2) = \int _0^{c_2} t^{c_1-1} \exp (-t) \mr {d}t$

is the lower incomplete gamma function. The mean and variance are

$\begin{align*} \mr {E}[Y] & = \frac{(a+\mu _ i) \exp (-a/\mu _ i) - (b+\mu _ i) \exp (-b/\mu _ i)}{\exp (-a/\mu _ i) - \exp (-b/\mu _ i)} \\ \mr {Var}[Y] & = \frac{(a^2+2a\mu _ i+2\mu _ i^2) \exp (-a/\mu _ i) - (b^2+2b\mu _ i+2\mu _ i^2) \exp (-b/\mu _ i)}{\exp (-a/\mu _ i) - \exp (-b/\mu _ i)} \\ & - \left(\mr {E}[Y] \right)^2 \end{align*}$

Truncated Lognormal $(\mu , \phi ; a, b)$

$\begin{align*} z_ i & = \log \{ y_ i\} - \mu _ i \\ l(\mu _ i, \phi ; a, b, y_ i, w_ i) & = -\frac{1}{2}\left( 2\log \{ y_ i\} + \log \left\{ \frac{\phi }{w_ i} \right\} + \log \{ 2\pi \} + \frac{w_ i z_ i^2}{\phi } \right) \\ & - \log \left\{ \Phi \left[ \sqrt {w_ i/\phi }(\log b - \mu _ i) \right] - \Phi \left[ \sqrt {w_ i/\phi }(\log a - \mu _ i) \right]\right\} \end{align*}$

where $\Phi (\cdot )$ is the cumulative distribution function of the standard normal distribution. The mean and variance are

$\begin{align*} \mr {E}[Y] & = \exp (\mu _ i + 0.5\phi ) \frac{\Phi \left(\sqrt {\phi } - \frac{\log a - \mu _ i}{\sqrt {\phi }} \right) - \Phi \left(\sqrt {\phi } - \frac{\log b - \mu _ i}{\sqrt {\phi }} \right)}{\Phi \left(\frac{\log b - \mu _ i}{\sqrt {\phi }} \right) - \Phi \left(\frac{\log a - \mu _ i}{\sqrt {\phi }} \right)} \\ \mr {Var}[Y] & = \exp (2\mu _ i + 2\phi ) \frac{\Phi \left(2\sqrt {\phi } - \frac{\log a - \mu _ i}{\sqrt {\phi }} \right) - \Phi \left(2\sqrt {\phi } - \frac{\log b - \mu _ i}{\sqrt {\phi }} \right)}{\Phi \left(\frac{\log b - \mu _ i}{\sqrt {\phi }} \right) - \Phi \left(\frac{\log a - \mu _ i}{\sqrt {\phi }} \right)} - \left(\mr {E}[Y] \right)^2 \end{align*}$

Truncated Negative binomial $(\mu , \phi )$

The mean and variance are

$\begin{align*} \mr {E}[Y] & = \mu _ i \left\{ 1 - (\phi \mu _ i + 1)^{-1/\phi } \right\} ^{-1} \\ \mr {Var}[Y] & = (1 + \phi \mu _ i + \mu _ i) \mr {E}[Y] - \left(\mr {E}[Y] \right)^2 \end{align*}$

Truncated Normal $(\mu , \phi ; a, b)$

$\begin{align*} l(\mu _ i, \phi ; a, b, y_ i, w_ i) & = -\frac{1}{2} \left[ \frac{w_ i(y_ i-\mu _ i)^2}{\phi } + \log \left\{ \frac{\phi }{w_ i}\right\} + \log \{ 2\pi \} \right] \\ & - \log \left\{ \Phi \left[ \sqrt {w_ i/\phi }(b-\mu _ i) \right] - \Phi \left[ \sqrt {w_ i/\phi }(a-\mu _ i) \right]\right\} \end{align*}$

where $\Phi (\cdot )$ is the cumulative distribution function of the standard normal distribution. The mean and variance are

$\begin{align*} \mr {E}[Y] & = \mu _ i + \sqrt {\phi } \frac{\mr {phi}\left(\frac{a-\mu _ i}{\sqrt {\phi }} \right) - \mr {phi}\left(\frac{b-\mu _ i}{\sqrt {\phi }} \right)}{\Phi \left(\frac{b-\mu _ i}{\sqrt {\phi }} \right) - \Phi \left(\frac{a-\mu _ i}{\sqrt {\phi }} \right)}\\ \mr {Var}[Y] & = \phi \left[1 + \frac{\frac{a-\mu _ i}{\sqrt {\phi }} \mr {phi}\left(\frac{a-\mu _ i}{\sqrt {\phi }} \right) - \frac{b-\mu _ i}{\sqrt {\phi }} \mr {phi}\left(\frac{b-\mu _ i}{\sqrt {\phi }} \right)}{\Phi \left(\frac{b-\mu _ i}{\sqrt {\phi }} \right) - \Phi \left(\frac{a-\mu _ i}{\sqrt {\phi }} \right)} \right. \\ & - \left. \left\{ \frac{\mr {phi}\left(\frac{a-\mu _ i}{\sqrt {\phi }} \right) - \mr {phi}\left(\frac{b-\mu _ i}{\sqrt {\phi }} \right)}{\Phi \left(\frac{b-\mu _ i}{\sqrt {\phi }} \right) - \Phi \left(\frac{a-\mu _ i}{\sqrt {\phi }} \right)}\right\} ^2 \right] \end{align*}$

where $\mr {phi}(\cdot )$ is the probability density function of the standard normal distribution.

Truncated Poisson $(\mu )$

$l(\mu _ i;y_ i,w_ i) = w_ i (y_ i \log \{ \mu _ i\} - \log \{ \exp (\mu _ i) - 1\} - \log \{ \Gamma (y_ i + 1)\} )$

The mean and variance are

$\begin{align*} \mr {E}[Y] & = \frac{\mu _ i}{1-\exp (-\mu _ i)} \\ \mr {Var}[Y] & = \frac{\mu _ i \left[1 - \exp (-\mu _ i) - \mu _ i \exp (-\mu _ i)\right]}{[1-\exp (-\mu _ i)]^2} \end{align*}$
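The truncated Poisson moments can be verified by summing the mass function over the support $y = 1, 2, \ldots$ . The Python sketch below uses the standard library only; the function names, the parameter value, and the truncation point `y_max` of the sum are choices made for this illustration.

```python
import math

def truncpoisson_loglik(y, mu, w=1.0):
    """Weighted zero-truncated Poisson log likelihood for y = 1, 2, ..."""
    return w * (y * math.log(mu) - math.log(math.exp(mu) - 1.0) - math.lgamma(y + 1))

def truncpoisson_moments(mu):
    """Closed-form mean and variance of the zero-truncated Poisson distribution."""
    q = math.exp(-mu)
    mean = mu / (1.0 - q)
    var = mu * (1.0 - q - mu * q) / (1.0 - q) ** 2
    return mean, var

def truncpoisson_summary(mu, y_max=200):
    """Total probability, mean, and variance by summation over y = 1..y_max."""
    ys = range(1, y_max + 1)
    pmf = [math.exp(truncpoisson_loglik(y, mu)) for y in ys]
    total = sum(pmf)
    mean = sum(y * p for y, p in zip(ys, pmf))
    var = sum((y - mean) ** 2 * p for y, p in zip(ys, pmf))
    return total, mean, var

if __name__ == "__main__":
    print(truncpoisson_summary(1.5))
    print(truncpoisson_moments(1.5))
```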

Uniform $(a,b)$

$l(\mu _ i;y_ i,w_ i) = -\log \{ b-a\}$

The mean and variance are $\mr {E}[Y] = 0.5(a+b)$ and $\mr {Var}[Y] = (b-a)^2/12$ .

Weibull $(\mu ,\phi )$

$\begin{align*} l(\mu _ i,\phi ;y_ i) & = -\frac{\phi -1}{\phi }\log \left\{ \frac{y_ i}{\mu _ i}\right\} - \log \{ \mu _ i\phi \} \\ & - \exp \left\{ \log \left\{ \frac{y_ i}{\mu _ i}\right\} /\phi \right\} \end{align*}$

In this particular parameterization of the two-parameter Weibull distribution, the mean and variance of the random variable Y are $\mr {E}[Y] = \mu \Gamma (1+\phi )$ and $\mr {Var}[Y] = \mu ^2\left\{ \Gamma (1+2\phi ) - \Gamma ^2(1+\phi ) \right\}$ .
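Because $\mu$ is not the mean in this parameterization, a numerical check can be helpful. The Python sketch below (standard library only; hypothetical function names, arbitrary parameter values) integrates the density implied by the log likelihood with a midpoint rule and compares the result with $\mu \Gamma (1+\phi )$ and $\mu ^2\{ \Gamma (1+2\phi ) - \Gamma ^2(1+\phi )\}$. The integration range and step count are assumptions that suit these particular values.

```python
import math

def weibull_loglik(y, mu, phi):
    """Weibull log likelihood with scale mu and shape 1/phi (unweighted)."""
    r = math.log(y / mu)
    return -(phi - 1.0) / phi * r - math.log(mu * phi) - math.exp(r / phi)

def weibull_moments(mu, phi):
    """Closed-form mean and variance in this parameterization."""
    g1 = math.gamma(1.0 + phi)
    g2 = math.gamma(1.0 + 2.0 * phi)
    return mu * g1, mu ** 2 * (g2 - g1 ** 2)

def weibull_moments_numeric(mu, phi, upper=20.0, steps=20000):
    """Midpoint-rule approximation of the mean and variance on (0, upper)."""
    h = upper / steps
    m1 = m2 = 0.0
    for k in range(steps):
        y = (k + 0.5) * h
        d = math.exp(weibull_loglik(y, mu, phi)) * h
        m1 += y * d
        m2 += y * y * d
    return m1, m2 - m1 ** 2

if __name__ == "__main__":
    print(weibull_moments(2.0, 0.5))
    print(weibull_moments_numeric(2.0, 0.5))
```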