Mixed Models

A mixed model is a model that contains both fixed and random effects. Over the last few decades, virtually every form of classical statistical model has been enhanced to accommodate random effects: the linear model has been extended to the linear mixed model, generalized linear models have been extended to generalized linear mixed models, and so on. In parallel with this trend, SAS/STAT software offers a number of classical and contemporary mixed modeling tools.

The SAS/STAT mixed models procedures include the following:

GLIMMIX Procedure


The GLIMMIX procedure fits statistical models to data with correlations or nonconstant variability and where the response is not necessarily normally distributed. These models are known as generalized linear mixed models (GLMMs). GLMMs, like linear mixed models, assume normal (Gaussian) random effects. Conditional on these random effects, data can have any distribution in the exponential family. The following are highlights of the GLIMMIX procedure's features:

  • provides the following built-in link functions:
    • cumulative complementary log-log
    • cumulative logit
    • cumulative log-log
    • cumulative probit
    • complementary log-log
    • generalized logit
    • identity
    • log
    • logit
    • log-log
    • probit
    • power with exponent λ = number
    • power with exponent -2
    • reciprocal
  • provides the following built-in distributions and associated variance functions:
    • beta
    • binary
    • binomial
    • exponential
    • gamma
    • normal
    • geometric
    • inverse Gaussian
    • lognormal
    • negative binomial
    • Poisson
    • t
  • use SAS programming statements within the procedure to compute model effects, weights, frequency, subject, group, and other variables, and to define mean and variance functions
  • fits covariance structures including:
    • ANTE(1)
    • AR(1)
    • ARH(1)
    • ARMA(1,1)
    • Cholesky
    • compound symmetry
    • heterogeneous compound symmetry
    • factor analytic
    • Huynh-Feldt
    • general linear
    • P-spline
    • radial smoother
    • simple
    • exponential spatial
    • Gaussian
    • Matérn
    • power
    • anisotropic power
    • spherical
    • Toeplitz
    • unstructured
  • permits subject and group effects that enable blocking and heterogeneity, respectively
  • permits weighted multilevel models for analyzing survey data that arise from multistage sampling
  • choice of linearization approach or integral approximation by quadrature or Laplace method for mixed models with nonlinear random effects or nonnormal distributions
  • choice of linearization about expected values or expansion about current solutions of best linear unbiased predictors (BLUP)
  • flexible covariance structures for random and residual random effects, including variance components, unstructured, autoregressive, and spatial structures
  • produce hypothesis tests and estimable linear combinations of effects
  • provides a mechanism to obtain inferences for the covariance parameters. Significance tests are based on the ratio of (residual) likelihoods or pseudo-likelihoods. Confidence limits and bounds are computed as Wald or likelihood ratio limits.
  • construct special collections of columns for the design matrices in your model. These special collections, which are referred to as constructed effects, can include the following:
    • COLLECTION is a collection effect defining one or more variables as a single effect with multiple degrees of freedom. The variables in a collection are considered as a unit for estimation and inference.
    • MULTIMEMBER | MM is a multimember classification effect whose levels are determined by one or more variables that appear in a CLASS statement.
    • POLYNOMIAL | POLY is a multivariate polynomial effect in the specified numeric variables.
    • SPLINE is a regression spline effect whose columns are univariate spline expansions of one or more variables. A spline expansion replaces the original variable with an expanded or larger set of new variables.
  • provides the following estimation methods:
    • RSPL
    • MSPL
    • RMPL
    • MMPL
    • Laplace
    • adaptive quadrature
  • enables you to exercise control over the numerical optimization. You can choose techniques, update methods, line search algorithms, convergence criteria, and more. Or, you can choose the default optimization strategies selected for the particular class of model you are fitting.
  • enables you to generate variables with SAS programming statements inside of PROC GLIMMIX (except for variables listed in the CLASS statement).
  • performs grouped data analysis
  • supports BY group processing, which enables you to obtain separate analyses on grouped observations
  • use ODS to create a SAS data set corresponding to any table
  • automatically generates graphs by using ODS Graphics
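As a sketch of the basic syntax, the following fits a Poisson GLMM with a log link and a random intercept; the data set `clinics` and its variables are hypothetical:

```sas
/* Poisson generalized linear mixed model:
   fixed treatment effect, random intercept per clinic.
   Data set "clinics" and its variables are hypothetical. */
proc glimmix data=clinics;
   class clinic trt;
   model count = trt / dist=poisson link=log solution;
   random intercept / subject=clinic;
run;
```

By default the procedure uses the RSPL pseudo-likelihood method; specifying METHOD=LAPLACE or METHOD=QUAD on the PROC GLIMMIX statement selects the integral approximations listed above instead.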
For further details, see GLIMMIX Procedure.

HPMIXED Procedure


The HPMIXED procedure uses a number of specialized high-performance techniques to fit linear mixed models with variance component structure. The following are highlights of the HPMIXED procedure's features:

  • specifically designed to cope with estimation problems involving:
    • linear mixed models with thousands of levels for the fixed and/or random effects
    • linear mixed models with hierarchically nested fixed and/or random effects, possibly with hundreds or thousands of levels at each level of the hierarchy
  • enables you to specify a linear mixed model with variance component structure, to estimate the covariance parameters by restricted maximum likelihood, and to perform confirmatory inference in such models
  • computes appropriate standard errors for all specified estimable linear combinations of fixed and random effects, and corresponding t and F tests
  • permits subject and group effects that enable blocking and heterogeneity, respectively
  • provides a mechanism for obtaining custom hypothesis tests
  • computes least squares means (LS-means) of fixed effects
  • performs weighted estimation
  • supports BY group processing, which enables you to obtain separate analyses on grouped observations
  • creates a data set that contains predicted values and residual diagnostics
  • creates a SAS data set that corresponds to any output table
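A minimal variance-component fit might look like the following; the data set `animals` and its variables are hypothetical, standing in for a problem with thousands of levels per random effect:

```sas
/* Variance-component linear mixed model with potentially
   thousands of sire and herd levels.
   Data set "animals" and its variables are hypothetical. */
proc hpmixed data=animals;
   class sire herd;
   model yield = / solution;
   random sire herd;
run;
```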
For further details, see HPMIXED Procedure.

MIXED Procedure


The MIXED procedure fits a variety of mixed linear models to data and enables you to use these fitted models to make statistical inferences about the data. A mixed linear model is a generalization of the standard linear model used in the GLM procedure, the generalization being that the data are permitted to exhibit correlation and nonconstant variability. The mixed linear model, therefore, provides you with the flexibility of modeling not only the means of your data (as in the standard linear model) but their variances and covariances as well. The following are highlights of the MIXED procedure's features:

  • fits general linear models with fixed and random effects under the assumption that the data are normally distributed. The types of models include:
    • simple regression
    • multiple regression
    • analysis of variance for balanced or unbalanced data
    • analysis of covariance
    • response surface models
    • weighted regression
    • polynomial regression
    • multivariate analysis of variance (MANOVA)
    • partial correlation
    • repeated measures analysis of variance
  • fits covariance structures including:
    • variance components
    • compound symmetry
    • unstructured
    • AR(1) and ARMA(1,1)
    • Toeplitz
    • spatial
    • general linear
    • factor analytic
  • offers six estimation methods for the covariance parameters:
    • Restricted Maximum Likelihood (REML)
    • Maximum Likelihood (ML)
    • MIVQUE0
    • Method of Moments: Type I, Type II, and Type III
  • uses PROC GLM-type syntax, with MODEL, RANDOM, and REPEATED statements for model specification and CONTRAST, ESTIMATE, and LSMEANS statements for inference
  • provides appropriate standard errors for all specified estimable linear combinations of fixed and random effects, and corresponding t and F tests
  • enables you to construct custom hypothesis tests
  • enables you to construct custom scalar estimates and their confidence limits
  • computes least squares means and least squares mean differences for classification fixed effects
  • permits subject and group effects that enable blocking and heterogeneity, respectively
  • performs multiple comparisons of main-effect means
  • accommodates unbalanced data
  • computes Type I, Type II, and Type III tests of fixed effects
  • performs sampling-based Bayesian analysis
  • performs weighted estimation
  • performs BY group processing, which enables you to obtain separate analyses on grouped observations
  • creates a SAS data set that corresponds to any output table
  • automatically creates graphs by using ODS Graphics
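As an illustration of the GLM-type syntax, the following fits a randomized complete block design with REML; the data set `trial` and its variables are hypothetical:

```sas
/* Randomized complete block analysis:
   fixed treatment effect, random block effect, REML estimation.
   Data set "trial" and its variables are hypothetical. */
proc mixed data=trial method=reml;
   class block trt;
   model y = trt / solution;
   random block;
   lsmeans trt / diff;
run;
```

Replacing METHOD=REML with METHOD=ML, METHOD=MIVQUE0, or METHOD=TYPE1 through METHOD=TYPE3 selects the other estimation methods listed above.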
For further details, see MIXED Procedure.

NLMIXED Procedure


The NLMIXED procedure fits nonlinear mixed models—that is, models in which both fixed and random effects enter nonlinearly. These models have a wide variety of applications, two of the most common being pharmacokinetics and overdispersed binomial data. The following are highlights of the NLMIXED procedure's features:

  • enables you to specify a conditional distribution for your data (given the random effects) having either a standard form or a general distribution that you code using SAS programming statements. The standard forms include the following:
    • normal
    • binary
    • binomial
    • gamma
    • negative binomial
    • Poisson
  • fits nonlinear mixed models by maximizing an approximation to the likelihood integrated over the random effects. Different integral approximations are available, the principal ones being adaptive Gaussian quadrature and a first-order Taylor series approximation.
  • enables you to use the estimated model to construct predictions of arbitrary functions by using empirical Bayes estimates of the random effects
  • enables you to specify more than one RANDOM statement in order to fit hierarchical nonlinear mixed models
  • enables you to estimate arbitrary functions of the nonrandom parameters and compute their approximate standard errors by using the delta method
  • constructs predictions of an expression across all of the observations in the input data set
  • accommodates models in which different subjects have identical data
  • performs BY group processing, which enables you to obtain separate analyses on grouped observations
  • creates a SAS data set that corresponds to any output table
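A classic example of this kind of model is a logistic-normal fit to overdispersed binomial counts; in the sketch below, the data set `seeds` and its variables are hypothetical:

```sas
/* Logistic-normal model for overdispersed binomial data:
   r successes out of n trials, random intercept u per plate.
   Data set "seeds" and its variables are hypothetical. */
proc nlmixed data=seeds;
   parms b0=0 b1=0 s2u=1;          /* starting values          */
   eta = b0 + b1*x + u;            /* linear predictor         */
   p   = 1 / (1 + exp(-eta));      /* inverse logit            */
   model r ~ binomial(n, p);
   random u ~ normal(0, s2u) subject=plate;
run;
```

The likelihood is integrated over the random effect u by adaptive Gaussian quadrature, the procedure's default integral approximation.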
For further details, see NLMIXED Procedure.

PHREG Procedure


The PHREG procedure performs regression analysis of survival data based on the Cox proportional hazards model. Cox's semiparametric model is widely used in the analysis of survival data to explain the effect of explanatory variables on hazard rates. The following are highlights of the PHREG procedure's features:

  • fits a superset of the Cox model, known as the multiplicative hazards model or the Andersen-Gill model
  • fits frailty models
  • fits the competing-risks model of Fine and Gray
  • performs stratified analysis
  • includes four methods for handling ties in the failure times
  • provides four methods of variable selection
  • permits an offset in the model
  • performs weighted estimation
  • enables you to use SAS programming statements within the procedure to modify values of the explanatory variables or to create new explanatory variables
  • tests linear hypotheses about the regression parameters
  • estimates customized hazard ratios
  • performs graphical and numerical assessment of the adequacy of the Cox regression model
  • creates a new SAS data set that contains the baseline function estimates at the event times of each stratum for every specified set of covariates
  • outputs survivor function estimates, residuals, and regression diagnostics
  • performs conditional logistic regression analysis for matched case-control studies
  • fits multinomial logit choice models for discrete choice data
  • performs sampling-based Bayesian analysis
  • performs BY group processing, which enables you to obtain separate analyses on grouped observations
  • creates an output data set that contains parameter and covariance estimates
  • creates an output data set that contains user-specified statistics
  • creates a SAS data set that corresponds to any output table
  • automatically creates graphs by using ODS Graphics
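A basic Cox regression can be sketched as follows; the data set `surv` and its variables are hypothetical, with `status=0` marking censored times:

```sas
/* Cox proportional hazards regression with Efron's method
   for tied failure times; status=0 identifies censoring.
   Data set "surv" and its variables are hypothetical. */
proc phreg data=surv;
   class trt;
   model time*status(0) = trt age / ties=efron;
   hazardratio trt;
run;
```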
For further details, see PHREG Procedure.

VARCOMP Procedure


The VARCOMP procedure handles general linear models that have random effects. Random effects are classification effects with levels that are assumed to be randomly selected from an infinite population of possible levels. PROC VARCOMP estimates the contribution of each of the random effects to the variance of the dependent variable. The following are highlights of the VARCOMP procedure's features:

  • enables you to specify four general methods of estimation:
    • Type I
    • MIVQUE0
    • ML
    • REML
  • provides a specialized analysis for gauge repeatability and reproducibility (GRR) studies for balanced one-way or two-way designs
  • optionally computes confidence limits using modified large-sample methods when either the Type I or GRR methods are used
  • performs BY group processing, which enables you to obtain separate analyses on grouped observations
  • creates a SAS data set that corresponds to any output table
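A minimal call might look like the following, requesting REML estimates of two variance components; the data set `gauge` and its variables are hypothetical:

```sas
/* REML estimates of the operator and part variance components.
   By default all MODEL effects are treated as random.
   Data set "gauge" and its variables are hypothetical. */
proc varcomp data=gauge method=reml;
   class operator part;
   model y = operator part;
run;
```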
For further details, see VARCOMP Procedure.