The HPFMM Procedure

Basic Features

The HPFMM procedure estimates the parameters in univariate finite mixture models and produces various statistics to evaluate parameters and model fit. The following list summarizes some basic features of the HPFMM procedure:

  • maximum likelihood estimation for all models

  • Markov chain Monte Carlo estimation for many models, including zero-inflated Poisson models

  • many built-in link and distribution functions for modeling, including the beta, shifted t, Weibull, beta-binomial, and generalized Poisson distributions, in addition to many standard members of the exponential family of distributions

  • specialized built-in mixture models such as the binomial cluster model (Morel and Nagaraj 1993; Morel and Neerchal 1997; Neerchal and Morel 1998)

  • acceptance of multiple MODEL statements to build mixture models in which the model effects, distributions, or link functions vary across mixture components

  • model-building syntax using CLASS and effect-based MODEL statements familiar from many other SAS/STAT procedures (for example, the GLM, GLIMMIX, and MIXED procedures)

  • evaluation of sequences of mixture models when you specify ranges for the number of components

  • simple syntax to impose linear equality and inequality constraints among parameters

  • ability to model regression and classification effects in the mixing probabilities through the PROBMODEL statement

  • ability to incorporate fully or partially known component membership into the analysis through the PARTIAL= option in the PROC HPFMM statement

  • OUTPUT statement that produces a SAS data set with important statistics for interpreting mixture models, such as component log likelihoods and prior and posterior probabilities

  • ability to add zero-inflation to any model

  • output data set that contains the posterior parameter values sampled by the Markov chain

  • multithreading and distributed computing for high-performance optimization and Monte Carlo sampling
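Several of the features listed above can be combined in a few statements. The following is a minimal sketch, not a definitive analysis: it assumes a hypothetical data set work.catch with a count response (count), a classification variable (gender), and a continuous covariate (age). It uses two MODEL statements to fit a zero-inflated Poisson model by maximum likelihood, where the second component is a constant (degenerate) distribution that contributes the extra zeros.

```sas
/* Sketch only: work.catch and its variables (count, gender, age)  */
/* are hypothetical names chosen for illustration.                 */
proc hpfmm data=work.catch;
   class gender;                                 /* classification effect           */
   model count = gender*age / dist=Poisson;      /* first component: Poisson        */
   model       +            / dist=Constant;     /* second component: point mass,   */
                                                 /* which adds zero inflation       */
run;
```

Each MODEL statement defines one mixture component, so the effects, distribution, and link function can differ between components. Adding a BAYES statement to such a step would request Markov chain Monte Carlo estimation instead of maximum likelihood for models that support it.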

The HPFMM procedure uses ODS Graphics to create graphs as part of its output. For general information about ODS Graphics, see Chapter 21: Statistical Graphics Using ODS. For specific information about the statistical graphics available with the HPFMM procedure, see the PLOTS options in the PROC HPFMM statement.

Because the HPFMM procedure is a high-performance analytical procedure, it also does the following:

  • enables you to run in distributed mode on a cluster of machines that distribute the data and the computations

  • enables you to run in single-machine mode on the server where SAS is installed

  • exploits all the available cores and concurrent threads, regardless of execution mode

For more information, see the section Processing Modes in SAS/STAT 14.1 User's Guide: High-Performance Procedures.
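As a rough illustration of how the execution settings are controlled, the following sketch adds a PERFORMANCE statement to a single-machine run. The data set work.catch and its variables are hypothetical, and the thread count of 4 is an arbitrary choice; by default the procedure decides how many threads to use.

```sas
/* Sketch only: work.catch, count, gender, and age are hypothetical. */
proc hpfmm data=work.catch;
   class gender;
   model count = gender*age / dist=Poisson;
   model       +            / dist=Constant;
   performance nthreads=4 details;   /* limit to 4 threads and request */
                                     /* a table of timing details      */
run;
```

Omitting the NTHREADS= option leaves the choice of thread count to the procedure, which is usually the simpler setting unless you need to limit resource use on a shared server.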