Longitudinal data (also known as panel data) arises when you measure a response variable of interest repeatedly through time
for multiple subjects. Thus, longitudinal data combines the characteristics of both cross-sectional data and time series data.
The response variables in longitudinal studies can be either continuous or discrete.
The objective of a statistical analysis of longitudinal data is usually to model the expected value of the response variable as
either a linear or nonlinear function of a set of explanatory variables.
Statistical analysis of longitudinal data must account for possible between-subject heterogeneity and within-subject correlation.
SAS/STAT software provides two approaches for modeling longitudinal data: marginal models (also known as population-average models) and
mixed models (also known as subject-specific models).
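The distinction between the two approaches can be written compactly. The following is a sketch in generic textbook notation, not notation taken from the SAS documentation: Y_ij is the response for subject i at occasion j, g is a link function, x_ij and z_ij are covariate vectors, beta holds the fixed effects, and b_i holds the random effects for subject i with an assumed covariance matrix G.

```latex
% Marginal (population-average) model: the mean is averaged over all subjects
\[
  g\bigl(\mathrm{E}[Y_{ij}]\bigr) = \mathbf{x}_{ij}^{\top}\boldsymbol{\beta}
\]

% Mixed (subject-specific) model: the mean is conditional on subject i's random effects
\[
  g\bigl(\mathrm{E}[Y_{ij} \mid \mathbf{b}_i]\bigr)
    = \mathbf{x}_{ij}^{\top}\boldsymbol{\beta} + \mathbf{z}_{ij}^{\top}\mathbf{b}_i,
  \qquad \mathbf{b}_i \sim N(\mathbf{0}, \mathbf{G})
\]
```

In the marginal model, within-subject correlation is handled separately through a working correlation structure. In the mixed model, it is induced by the shared random effects b_i.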
The SAS/STAT longitudinal data analysis procedures include the following:
GEE Procedure
The GEE procedure fits generalized linear models for longitudinal data by using the generalized estimating equations (GEE)
estimation method of Liang and Zeger (1986). The GEE method fits a marginal model to longitudinal data and is commonly used to analyze
longitudinal data when the population-average effect is of interest.
The following are highlights of the GEE procedure's features:
 performs weighted GEE estimation when data are missing at random (MAR)
 supports the following response variable distributions:
 binomial
 gamma
 inverse Gaussian
 negative binomial
 normal
 Poisson
 multinomial
 supports the following link functions:
 complementary log-log
 identity
 log
 logit
 probit
 reciprocal
 power with exponent 2

 supports the following correlation structures:
 first-order autoregressive
 exchangeable
 independent
 m-dependent
 unstructured
 fixed (user specified)
 performs alternating logistic regression analysis for ordinal and binary data
 supports ESTIMATE, LSMEANS, and OUTPUT statements
 creates a SAS data set that corresponds to any output table
 automatically creates graphs by using ODS Graphics
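
As a rough illustration of a population-average analysis, the following sketch simulates a small binary longitudinal data set and fits a logistic GEE model with an exchangeable working correlation. The data set name work.visits, the variable names (id, trt, visit, y), and all simulation parameters are hypothetical and chosen only for this example.

```sas
/* Hypothetical long-format data: one row per subject per visit */
data work.visits;
   call streaminit(2024);
   do id = 1 to 100;
      trt = mod(id, 2);                  /* alternate treatment assignment */
      u = rand('normal', 0, 1);          /* shared subject effect induces within-subject correlation */
      do visit = 1 to 4;
         p = 1 / (1 + exp(-(-0.5 + 0.6*trt + 0.2*visit + u)));
         y = rand('bernoulli', p);
         output;
      end;
   end;
   drop u p;
run;

/* Marginal logistic model; DESCENDING models the probability that y = 1 */
proc gee data=work.visits descending;
   class id;
   model y = trt visit / dist=binomial link=logit;
   repeated subject=id / corr=exch corrw;   /* exchangeable working correlation, printed by CORRW */
run;
```

The SUBJECT= effect identifies the clusters of repeated measurements, and CORR= selects one of the working correlation structures listed above.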

For further details, see
GEE Procedure
GENMOD Procedure
The GENMOD procedure fits generalized linear models, as defined by Nelder and Wedderburn (1972). The class of generalized
linear models is an extension of traditional linear models that allows the mean of a population to depend on a linear predictor
through a nonlinear link function and allows the response probability distribution to be any member of an exponential family of
distributions. Many widely used statistical models are generalized linear models. These include classical linear models with normal
errors, logistic and probit models for binary data, and log-linear models for multinomial data. Many other useful statistical models
can be formulated as generalized linear models by the selection of an appropriate link function and response probability distribution.
The following are highlights of the GENMOD procedure's features:
 provides the following built-in distributions and associated variance functions:
 normal
 binomial
 Poisson
 gamma
 inverse Gaussian
 negative binomial
 geometric
 multinomial
 zero-inflated Poisson
 Tweedie
 provides the following built-in link functions:
 identity
 logit
 probit
 power
 log
 complementary log-log
 enables you to define your own link functions or distributions through DATA step
programming statements used within the procedure
 fits models to correlated responses by the GEE method

 performs Bayesian analysis for generalized linear models
 performs exact logistic regression
 performs exact Poisson regression
 enables you to fit a sequence of models and to perform Type I and Type III analyses
between each successive pair of models
 computes likelihood ratio statistics for user-defined contrasts
 computes estimated values, standard errors, and confidence limits for user-defined
contrasts and least squares means
 computes confidence intervals for model parameters based on either the profile
likelihood function or asymptotic normality
 produces an overdispersion diagnostic plot for zero-inflated models
 performs BY group processing, which enables you to obtain separate analyses on grouped observations
 creates SAS data sets that correspond to most output tables
 automatically generates graphs by using ODS Graphics
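
A minimal sketch of fitting a model to correlated count responses by the GEE method in PROC GENMOD might look like the following. The data set work.counts, its variables (id, trt, visit, count), and the simulation settings are hypothetical and serve only to make the example self-contained.

```sas
/* Hypothetical repeated counts: four visits for each of 80 subjects */
data work.counts;
   call streaminit(7);
   do id = 1 to 80;
      trt = mod(id, 2);
      u = rand('normal', 0, 0.5);        /* subject-level effect shared across visits */
      do visit = 1 to 4;
         count = rand('poisson', exp(0.5 + 0.3*trt - 0.1*visit + u));
         output;
      end;
   end;
   drop u;
run;

/* Poisson log-linear model; the REPEATED statement requests GEE estimation */
proc genmod data=work.counts;
   class id;
   model count = trt visit / dist=poisson link=log;
   repeated subject=id / type=ar(1) corrw;   /* first-order autoregressive working correlation */
run;
```

Omitting the REPEATED statement would fit an ordinary generalized linear model that treats all observations as independent.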

For further details, see
GENMOD Procedure
GLIMMIX Procedure
The GLIMMIX procedure fits statistical models to data with correlations or nonconstant variability and where the response is not necessarily
normally distributed. These models are known as generalized linear mixed models (GLMM).
GLMMs, like linear mixed models, assume normal (Gaussian) random effects. Conditional on these random effects, data can have any distribution
in the exponential family. The following are highlights of the GLIMMIX procedure's features:
 provides the following built-in link functions:
 cumulative complementary log-log
 cumulative logit
 cumulative log-log
 cumulative probit
 complementary log-log
 generalized logit
 identity
 log
 logit
 log-log
 probit
 power with exponent λ = number
 power with exponent 2
 reciprocal
 provides the following built-in distributions and associated variance functions:
 beta
 binary
 binomial
 exponential
 gamma
 normal
 geometric
 inverse Gaussian
 lognormal
 negative binomial
 Poisson
 t
 enables you to use SAS programming statements within the procedure to compute model effects,
weights, frequency, subject, group, and other variables, and to define mean
and variance functions
 fits covariance structures including:
 ANTE(1)
 AR(1)
 ARH(1)
 ARMA(1,1)
 Cholesky
 compound symmetry
 heterogeneous compound symmetry
 factor analytic
 Huynh-Feldt
 general linear
 P-spline
 radial smoother
 simple
 exponential spatial
 Gaussian
 Matérn
 power
 anisotropic power
 spherical
 Toeplitz
 unstructured
 permits subject and group effects that enable blocking and heterogeneity, respectively
 permits weighted multilevel models for analyzing survey data that arise from multistage sampling
 offers a choice of linearization or integral approximation (by quadrature or the Laplace method)
for mixed models with nonlinear random effects or nonnormal distributions
 offers a choice of linearization about expected values or expansion about current solutions of the best
linear unbiased predictors (BLUPs)

 offers flexible covariance structures for random and residual random effects, including variance
components, unstructured, autoregressive, and spatial structures
 produces hypothesis tests and estimable linear combinations of effects
 provides a mechanism to obtain inferences for the covariance parameters.
Significance tests are based on the ratio of (residual) likelihoods or pseudo-likelihoods.
Confidence limits and bounds are computed as Wald or likelihood ratio limits.
 enables you to construct special collections of columns for the design matrices in your model.
These special collections, which are referred to as constructed effects,
can include the following:
 COLLECTION is a collection effect defining one or more variables as a single effect
with multiple degrees of freedom. The variables in a collection are
considered as a unit for estimation and inference.
 MULTIMEMBER | MM is a multimember classification effect whose levels are determined
by one or more variables that appear in a CLASS statement.
 POLYNOMIAL | POLY is a multivariate polynomial effect in the specified numeric variables.
 SPLINE is a regression spline effect whose columns are univariate spline expansions
of one or more variables. A spline expansion replaces the
original variable with an expanded or larger set of new variables.
 provides the following estimation methods:
 RSPL
 MSPL
 RMPL
 MMPL
 Laplace
 adaptive quadrature
 enables you to exercise control over the numerical optimization.
You can choose techniques, update methods, line search algorithms, convergence criteria,
and more. Or, you can choose the default optimization strategies selected for the particular
class of model you are fitting.
 enables you to generate variables with SAS programming statements inside of PROC GLIMMIX (except
for variables listed in the CLASS statement).
 performs grouped data analysis
 supports BY group processing, which enables you to obtain separate analyses on grouped observations
 uses ODS to create a SAS data set corresponding to any table
 automatically generates graphs by using ODS Graphics
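
To make the subject-specific approach concrete, the following sketch fits a random-intercept logistic GLMM by using the Laplace method. The data set work.binlong, its variables (id, trt, visit, y), and the simulation settings are hypothetical and exist only so that the example is self-contained.

```sas
/* Hypothetical binary longitudinal data with a normal random intercept per subject */
data work.binlong;
   call streaminit(99);
   do id = 1 to 100;
      trt = mod(id, 2);
      b0 = rand('normal', 0, 1);         /* Gaussian random effect for this subject */
      do visit = 1 to 4;
         p = 1 / (1 + exp(-(-0.5 + 0.6*trt + 0.2*visit + b0)));
         y = rand('bernoulli', p);
         output;
      end;
   end;
   drop b0 p;
run;

/* Random-intercept logistic model, estimated by Laplace approximation */
proc glimmix data=work.binlong method=laplace;
   class id;
   model y(event='1') = trt visit / dist=binary link=logit solution;
   random intercept / subject=id;        /* one random intercept for each subject */
run;
```

Replacing METHOD=LAPLACE with METHOD=QUAD would request adaptive quadrature instead. If no METHOD= option is given, the procedure uses a pseudo-likelihood (linearization) method.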

For further details, see
GLIMMIX Procedure
MIXED Procedure
The MIXED procedure fits a variety of mixed linear models to data and enables you to use these fitted models to make statistical inferences
about the data. A mixed linear model is a generalization of the standard linear model used in the GLM procedure, the generalization being
that the data are permitted to exhibit correlation and nonconstant variability. The mixed linear model, therefore, provides you with the
flexibility of modeling not only the means of your data (as in the standard linear model) but their variances and covariances as well.
The following are highlights of the MIXED procedure's features:
 fits general linear models with fixed and random effects under the assumption
that the data are normally distributed. The types of models include:
 simple regression
 multiple regression
 analysis of variance for balanced or unbalanced data
 analysis of covariance
 response surface models
 weighted regression
 polynomial regression
 multivariate analysis of variance (MANOVA)
 partial correlation
 repeated measures analysis of variance
 fits covariance structures including:
 variance components
 compound symmetry
 unstructured
 AR(1) and ARMA(1,1)
 Toeplitz
 spatial
 general linear
 factor analytic
 offers estimation methods for the covariance parameters, including:
 Restricted Maximum Likelihood (REML)
 Maximum Likelihood (ML)
 Method of Moments
 MIVQUE0
 Type I
 Type II
 Type III

 uses PROC GLM-type syntax, with MODEL, RANDOM, and REPEATED statements
for model specification and CONTRAST, ESTIMATE, and LSMEANS statements for inferences
 provides appropriate standard errors for all specified estimable linear combinations
of fixed and random effects, and corresponding t and F tests
 enables you to construct custom hypothesis tests
 enables you to construct custom scalar estimates and their confidence limits
 computes least squares means and least squares mean differences for classification fixed effects
 permits subject and group effects that enable blocking and heterogeneity, respectively
 performs multiple comparisons of main effect means
 accommodates unbalanced data
 computes Type I, Type II, and Type III tests of fixed effects
 performs sampling-based Bayesian analysis
 performs weighted estimation
 performs BY group processing, which enables you to obtain separate analyses on grouped observations
 creates a SAS data set that corresponds to any output table
 automatically creates graphs by using ODS Graphics
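
As a sketch of how the means and the covariance structure are modeled together, the following example fits a repeated measures model with a compound symmetry covariance structure by REML. The data set work.growth, its variables (id, trt, visit, y), and the simulation settings are hypothetical and serve only to make the example runnable.

```sas
/* Hypothetical continuous longitudinal data: four visits for each of 60 subjects */
data work.growth;
   call streaminit(11);
   do id = 1 to 60;
      trt = mod(id, 2);
      b0 = rand('normal', 0, 2);         /* subject-level shift shared across visits */
      do visit = 1 to 4;
         y = 20 + 1.5*trt + 0.8*visit + b0 + rand('normal', 0, 1);
         output;
      end;
   end;
   drop b0;
run;

/* Fixed effects for the mean; compound symmetry for the within-subject covariance */
proc mixed data=work.growth method=reml;
   class id;
   model y = trt visit / solution;
   repeated / subject=id type=cs r;      /* R prints the estimated covariance block for the first subject */
run;
```

Because no repeated effect is named, this sketch relies on every subject's rows appearing in the same visit order. Other structures from the list above, such as TYPE=UN or TYPE=AR(1), can be substituted in the REPEATED statement.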

For further details, see
MIXED Procedure