The FMM Procedure

Latent Variables via Data Augmentation

To fit finite Bayesian mixture models, the FMM procedure treats the mixture model as a missing data problem and introduces an assignment variable $\mb{S}$ as in Dempster, Laird, and Rubin (1977). Because $\mb{S}$ is not observable, it is frequently referred to as a latent variable. The unobservable variable $\mb{S}$ assigns an observation to a component in the mixture model. The number of states, k, might be unknown, but it is known to be finite. Conditional on the latent variable $\mb{S}$, the component membership of each observation is assumed to be known, and Bayesian estimation is straightforward for each component in the finite mixture model. That is, conditional on $S=j$, the distribution of the response is now assumed to be $f(y;\alpha _ j,\beta _ j|S=j)$. In other words, each distinct state of the random variable $\mb{S}$ leads to a distinct set of parameters. The parameters of each component are then updated individually by using a conjugate Gibbs sampler (where available) or a Metropolis-Hastings sampling algorithm.

The FMM procedure assumes that the random variable $\mb{S}$ has a discrete multinomial distribution with probability $\pi _ j$ of belonging to component j; it can occupy one of k states. The distribution for the latent variable $\mb{S}$ is

$\begin{equation*} f(S_ i=j|\pi _1,\hdots ,\pi _ k) = \mbox{multinomial}(1,\pi _1,\hdots ,\pi _ k) \end{equation*}$

where $f(\cdot |\cdot )$ denotes a conditional probability density. Each parameter $\pi _ j$ in the density denotes the probability that S takes on state j.
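To make the multinomial assignment concrete, the following Python sketch draws the latent state for a handful of observations. It is only an illustration, not part of the FMM procedure: the mixing proportions in `pi`, the sample size `n`, and the use of NumPy are assumptions made for this example. Each draw from multinomial$(1,\pi _1,\hdots ,\pi _ k)$ is a vector with a single 1 that marks the occupied state.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical mixing proportions for k = 3 components (they must sum to 1).
pi = np.array([0.5, 0.3, 0.2])
n = 10  # hypothetical number of observations

# Each row is one draw from multinomial(1, pi): a one-hot vector of length k.
one_hot = rng.multinomial(1, pi, size=n)

# The state S_i is the position of the single 1 in row i (0-based in Python).
states = one_hot.argmax(axis=1)

print(one_hot)
print(states)
```

Over many draws, the fraction of observations that fall in state j approaches $\pi _ j$.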

The FMM procedure assumes a conjugate Dirichlet prior distribution on the mixture proportions $\pi _ j$ written as:

$\begin{eqnarray*} p(\bpi ) & = & \mbox{Dirichlet}(a_1,\hdots ,a_ k) \end{eqnarray*}$

where $p(\cdot )$ indicates a prior distribution.

Using Bayes’ theorem, the likelihood function and the prior distributions determine conditionally conjugate posterior distributions for $\mb{S}$ and $\bpi$: a multinomial distribution and a Dirichlet distribution, respectively.
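As a rough illustration of how these two conditional updates fit together, the following Python sketch performs one data-augmentation sweep. It is not the FMM procedure's implementation. It assumes normal components whose means and standard deviations are held fixed (the hypothetical arguments `mu` and `sigma` stand in for $\alpha _ j$ and $\beta _ j$), it uses NumPy, and the function names `normal_pdf` and `gibbs_sweep` are invented for this example. Given the current proportions, each $S_ i$ is drawn from a multinomial distribution with probabilities proportional to $\pi _ j f(y_ i;\alpha _ j,\beta _ j)$. Given the assignments, $\bpi$ is drawn from a Dirichlet distribution with parameters $a_ j + n_ j$, where $n_ j$ is the number of observations currently assigned to component j.

```python
import numpy as np


def normal_pdf(y, mu, sigma):
    """Normal density, evaluated elementwise (assumed component density)."""
    z = (y - mu) / sigma
    return np.exp(-0.5 * z**2) / (sigma * np.sqrt(2.0 * np.pi))


def gibbs_sweep(y, pi, mu, sigma, a, rng):
    """Draw S given pi and y, then pi given S; component parameters stay fixed."""
    k = len(pi)
    # Unnormalized membership weights pi_j * f(y_i; mu_j, sigma_j), shape (n, k).
    w = pi[np.newaxis, :] * normal_pdf(
        y[:, np.newaxis], mu[np.newaxis, :], sigma[np.newaxis, :]
    )
    probs = w / w.sum(axis=1, keepdims=True)
    # Draw each S_i from multinomial(1, probs_i) by inverting the cumulative sums.
    u = rng.random(len(y))
    s = (u[:, np.newaxis] > np.cumsum(probs, axis=1)).sum(axis=1)
    s = np.minimum(s, k - 1)  # guard against floating-point round-off
    # Conjugate update: Dirichlet(a_1 + n_1, ..., a_k + n_k).
    counts = np.bincount(s, minlength=k)
    new_pi = rng.dirichlet(a + counts)
    return s, new_pi


# Hypothetical two-component data set and starting values.
rng = np.random.default_rng(0)
y = np.concatenate([rng.normal(-2.0, 1.0, 150), rng.normal(3.0, 1.0, 50)])
pi = np.array([0.5, 0.5])
mu = np.array([-2.0, 3.0])
sigma = np.array([1.0, 1.0])
a = np.array([1.0, 1.0])  # Dirichlet prior parameters

s, new_pi = gibbs_sweep(y, pi, mu, sigma, a, rng)
print(np.bincount(s, minlength=2), new_pi)
```

A full sampler would repeat this sweep and also update the component parameters at each iteration. For observations far from every component, the weights can underflow to zero, so a more careful version would work with log densities.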