The FMM Procedure

Latent Variables via Data Augmentation

To fit finite Bayesian mixture models, the FMM procedure treats the mixture model as a missing data problem and introduces an assignment variable $\mb {S}$ as in Dempster, Laird, and Rubin (1977). Because $\mb {S}$ is not observable, it is frequently referred to as a latent variable. The unobservable variable $\mb {S}$ assigns an observation to a component in the mixture model. The number of states, k, might be unknown, but it is known to be finite. Conditioning on the latent variable $\mb {S}$, the component membership of each observation is assumed to be known, and Bayesian estimation is straightforward for each component in the finite mixture model. That is, conditional on $S=j$, the distribution of the response is now assumed to be $f(y;\alpha _ j,\beta _ j|S=j)$. In other words, each distinct state of the random variable $\mb {S}$ leads to a distinct set of parameters. The parameters in each component are then updated individually, using a conjugate Gibbs sampler (where available) or a Metropolis-Hastings sampling algorithm.

The FMM procedure assumes that the random variable $\mb {S}$ has a discrete multinomial distribution; it can occupy one of k states, with probability $\pi _ j$ of belonging to component j. The distribution for the latent variable $\mb {S}$ is

\begin{equation*}  f(S_ i=j|\pi _1,\hdots ,\pi _ k) = \mbox{multinomial}(1,\pi _1,\hdots ,\pi _ k) \end{equation*}

where $f(\cdot |\cdot )$ denotes a conditional probability density. The parameter $\pi _ j$ denotes the probability that $\mb {S}$ takes on state j.
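As a minimal illustration (not part of the FMM procedure itself), drawing latent states from this multinomial distribution can be sketched in Python; the proportions and sample size below are assumed values chosen for the example.

```python
import numpy as np

# Hypothetical sketch: draw latent component labels S_i from a
# multinomial(1, pi_1, ..., pi_k) distribution, as described above.
rng = np.random.default_rng(1)

pi = np.array([0.5, 0.3, 0.2])  # assumed mixture proportions pi_1, ..., pi_k
n = 10                          # assumed number of observations

# Each draw of S_i occupies one of k = 3 states, with probability
# pi_j of belonging to component j (states indexed 0, ..., k-1 here).
S = rng.choice(len(pi), size=n, p=pi)
```

With many draws, the empirical frequency of each state converges to the corresponding $\pi _ j$.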

The FMM procedure assumes a conjugate Dirichlet prior distribution on the mixture proportions $\pi _ j$ written as:

\begin{equation*}  p(\bpi ) = \mbox{Dirichlet}(a_1,\hdots ,a_ k) \end{equation*}

where $p(\cdot )$ indicates a prior distribution.

Using Bayes’ theorem, the likelihood function and prior distributions determine conditionally conjugate posterior distributions: the full conditional distribution of $\mb {S}$ is multinomial, and the full conditional distribution of $\bpi $ is Dirichlet.
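One Gibbs sweep over these two full conditionals can be sketched as follows. This is an illustrative Python sketch only, with assumed Gaussian component densities, fixed component parameters, and made-up hyperparameters; it is not the FMM procedure's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def normal_pdf(y, mu, sigma):
    # Gaussian density, standing in for the component density f(y; theta_j)
    return np.exp(-0.5 * ((y - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

y = rng.normal(0.0, 1.0, 50)      # toy data (assumed)
mu = np.array([-1.0, 1.0])        # component means, held fixed for illustration
sigma = np.array([1.0, 1.0])      # component standard deviations, held fixed
a = np.array([1.0, 1.0])          # Dirichlet hyperparameters a_1, ..., a_k
pi = np.array([0.5, 0.5])         # current mixture proportions

# Step 1: sample S_i given y_i and pi from its multinomial full conditional,
# with probabilities proportional to pi_j * f(y_i; theta_j).
w = pi * normal_pdf(y[:, None], mu, sigma)   # (n, k) unnormalized weights
w /= w.sum(axis=1, keepdims=True)
S = np.array([rng.choice(2, p=w_i) for w_i in w])

# Step 2: sample pi given S from its Dirichlet full conditional,
# using the conjugate update a_j + n_j with component counts n_j.
n_j = np.bincount(S, minlength=2)
pi = rng.dirichlet(a + n_j)
```

Alternating these two draws yields a Markov chain whose stationary distribution is the joint posterior of $\mb {S}$ and $\bpi $ under the assumed model.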