Analysis of Variance
Analysis of variance in the contemporary sense of statistical modeling and analysis is the study
of the influences on the variation of a phenomenon. This type of analysis may, for example, take
the form of an analysis of variance table based on sums of squares, a deviance decomposition in
a generalized linear model, or a series of Type III tests followed by comparisons of least squares
means in a mixed model.
See the chapter Introduction to Analysis of Variance Procedures in the SAS/STAT User's Guide for a more in-depth discussion of this topic and of the SAS/STAT procedures that enable you to perform this type of statistical analysis.
The SAS/STAT analysis of variance procedures include the following:
ANOVA Procedure
The ANOVA procedure performs analysis of variance for balanced data from a wide variety of experimental designs.
Use PROC ANOVA for the analysis of balanced data only, with the following exceptions: one-way analysis of variance,
Latin square designs, certain partially balanced incomplete block designs, completely nested (hierarchical) designs,
and designs with cell frequencies that are proportional to each other and are also proportional to the background population.
These exceptions have designs in which the factors are all orthogonal to each other.
The procedure enables you to do the following:
 absorb classification effects in a model
 perform multivariate analysis of variance (MANOVA)
 compute and compare means
 perform multiple comparison tests
 perform multivariate and univariate repeated measurements analysis of variance
 specify variables to define subgroups for the analysis

 construct tests that use the sum of squares for effects and the error term you specify
 create a data set that contains sums of squares, degrees of freedom, F statistics, and probability levels for each effect in the model
 create a data set that corresponds to any output table
 automatically produce graphs by using ODS Graphics

For further details, see
ANOVA Procedure
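As a language-neutral illustration of the sums-of-squares decomposition behind a one-way analysis of variance, the following is a minimal Python sketch. It is not PROC ANOVA syntax and not how the procedure is implemented; the function name `one_way_anova` and the data are hypothetical, chosen only to make the arithmetic easy to follow.

```python
from statistics import mean


def one_way_anova(groups):
    """Return (ss_between, ss_within, df_between, df_within, f_value)
    for a one-way layout given as a list of lists of observations."""
    all_values = [y for g in groups for y in g]
    grand_mean = mean(all_values)
    # Between-group sum of squares: group size times squared deviation
    # of each group mean from the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-group (error) sum of squares: squared deviations of each
    # observation from its own group mean.
    ss_within = sum((y - mean(g)) ** 2 for g in groups for y in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return ss_between, ss_within, df_between, df_within, f_value


# Hypothetical balanced data: three treatments, three observations each.
groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [6.0, 7.0, 8.0]]
ss_b, ss_w, df_b, df_w, f = one_way_anova(groups)
print(ss_b, ss_w, df_b, df_w, f)
```

The sketch stops at the F statistic; obtaining the probability level requires the F distribution, which is not part of the Python standard library.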
CATMOD Procedure
The CATMOD procedure performs categorical data modeling of data that can be represented by a contingency table.
PROC CATMOD fits linear models to functions of response frequencies, and it can be used for linear modeling,
loglinear modeling, logistic regression, and repeated measurement analysis.
The procedure enables you to do the following:
 estimate model parameters by using weighted least squares (WLS) for a wide range of general linear
models or maximum likelihood (ML) for loglinear models and the analysis of generalized logits
 supply raw data, where each observation is a subject, supply cell count data,
where each observation is a cell in a contingency table, or directly input a covariance matrix
 construct linear functions of the model parameters or loglinear effects and test the hypothesis that the linear combination equals zero
 perform constrained estimation
 perform BY group processing, which enables you to obtain separate analyses on grouped observations

 create a data set that contains the observed and predicted values of the response
functions, their standard errors, the residuals, and variables that describe the population and response
profiles. In addition, if you use the standard response functions, the data set includes observed
and predicted values for the cell frequencies or the cell probabilities, together with their standard errors and residuals.
 create a data set that contains the estimated parameter vector and its estimated covariance matrix
 create a data set that corresponds to any output table

For further details, see
CATMOD Procedure
GLM Procedure
The GLM procedure uses the method of least squares to fit general linear models.
Among the statistical methods available in PROC GLM are regression, analysis of variance,
analysis of covariance, multivariate analysis of variance, and partial correlation.
The following are highlights of the procedure's features:
 enables you to specify any degree of interaction (crossed effects) and nested effects
 enables you to specify polynomial, continuous-by-class, and continuous-nesting-class effects
 enables you to absorb classification effects in a model
 enables you to specify random effects in a model
 produces expected mean squares for each Type I, Type II, Type III, Type IV, and contrast mean square used in the analysis
 enables you to specify both hypothesis effects and the error effect to use for a multivariate analysis of variance
 performs BY group processing, which enables you to obtain separate analyses on grouped observations
 computes least squares means and least squares mean differences for classification effects
 performs multiple comparison adjustments for the p-values and confidence limits for the least squares mean differences
 computes arithmetic means and standard deviations of all continuous variables in a model within each group corresponding to each effect

 performs multiple comparisons of main-effect means
 tests hypotheses for the effects of a linear model regardless of the number of missing cells or the extent of confounding
 performs F tests that use appropriate mean squares or linear combinations of mean squares as error terms
 estimates linear functions of the model parameters
 tests hypotheses for linear combinations of the model parameters
 displays the sum of squares associated with each hypothesis tested and, upon request, the form of the estimable function employed in a test
 produces the general form of all estimable functions
 creates an output data set that contains the input data set, predicted values, residuals, and other diagnostic measures
 creates a SAS data set that corresponds to any output table
 automatically creates graphs by using ODS Graphics

For further details, see
GLM Procedure
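To make "the method of least squares" concrete, here is a minimal Python sketch for the simplest case, a straight-line regression with one continuous predictor. It is an illustration only, not PROC GLM syntax or its algorithm; the helper name `fit_line` and the data are hypothetical.

```python
from statistics import mean


def fit_line(x, y):
    """Least squares fit of y = intercept + slope * x.
    Returns (intercept, slope, predicted, residuals)."""
    x_bar, y_bar = mean(x), mean(y)
    # Closed-form solution of the normal equations for one predictor.
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    predicted = [intercept + slope * xi for xi in x]
    residuals = [yi - pi for yi, pi in zip(y, predicted)]
    return intercept, slope, predicted, residuals


# Hypothetical data.
x = [1.0, 2.0, 3.0, 4.0]
y = [2.0, 4.0, 5.0, 8.0]
intercept, slope, predicted, residuals = fit_line(x, y)
print(intercept, slope, residuals)
```

With an intercept in the model, the least squares residuals sum to zero, which gives a quick sanity check on the fit.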
INBREED Procedure
The INBREED procedure calculates the covariance or inbreeding coefficients for a pedigree.
PROC INBREED is unique in that it handles very large populations.
The following are highlights of the procedure's features:
 supports two modes of operation:
 Mode 1: carries out analysis on the assumption that all the individuals belong to the same generation
 Mode 2: divides the population into nonoverlapping generations and analyzes each generation separately,
assuming that the parents of individuals in the current generation are defined in the previous generation

 computes averages of the covariance or inbreeding coefficients within sex categories if the sex of individuals is known
 performs BY group processing, which enables you to obtain separate analyses on grouped observations
 creates a SAS data set that corresponds to any output table

For further details, see
INBREED Procedure
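The following Python sketch illustrates one textbook way to obtain these coefficients: a recursive kinship (coancestry) calculation, where the inbreeding coefficient of an individual is the kinship between its parents and the covariance coefficient between two individuals is taken as twice their kinship. This is an assumption-laden illustration and not necessarily the algorithm PROC INBREED uses; the pedigree and all names are hypothetical, and founders are assumed to be unrelated and non-inbred.

```python
from functools import lru_cache

# Hypothetical pedigree: individual -> (parent1, parent2).
# None marks an unknown parent. Parents are listed before offspring.
PEDIGREE = {
    "A": (None, None),
    "B": (None, None),
    "C": ("A", "B"),
    "D": ("A", "B"),
    "E": ("C", "D"),
}
ORDER = {name: i for i, name in enumerate(PEDIGREE)}


@lru_cache(maxsize=None)
def kinship(x, y):
    """Kinship coefficient between x and y (0 if either is unknown)."""
    if x is None or y is None:
        return 0.0
    if x == y:
        p1, p2 = PEDIGREE[x]
        return 0.5 * (1.0 + kinship(p1, p2))
    # Always expand the later-listed individual, so that the other
    # individual can never be one of its descendants.
    if ORDER[x] < ORDER[y]:
        x, y = y, x
    p1, p2 = PEDIGREE[x]
    return 0.5 * (kinship(p1, y) + kinship(p2, y))


def inbreeding(x):
    """Inbreeding coefficient: kinship between the parents of x."""
    p1, p2 = PEDIGREE[x]
    return kinship(p1, p2)


def covariance(x, y):
    """Covariance coefficient, taken here as twice the kinship."""
    return 2.0 * kinship(x, y)


print(inbreeding("E"), covariance("C", "D"))
```

In this pedigree, E is the offspring of full siblings C and D, so the sketch reproduces the familiar full-sib mating value for the inbreeding coefficient of E.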
LATTICE Procedure
The LATTICE procedure computes the analysis of variance and analysis of simple covariance for data from an experiment
with a lattice design. PROC LATTICE analyzes balanced square lattices, partially balanced square lattices, and some
rectangular lattices. The following are highlights of the LATTICE procedure's features:
 determines from the data set which type of design has been used and verifies that the design is valid
 produces output data sets including:
 analysis of variance
 adjusted treatment means
 additional statistics

 performs BY group processing, which enables you to obtain separate analyses on grouped observations
 creates a SAS data set that corresponds to any output table

For further details, see
LATTICE Procedure
NESTED Procedure
The NESTED procedure performs random-effects analysis of variance for data from an experiment with a nested (hierarchical) structure.
The following are highlights of the NESTED procedure's features:
 provides a descriptive analysis of covariation
 accommodates unbalanced data
 automatically displays the following for each dependent variable:
 Coefficients of Expected Mean Squares
 each Variance Source in the model (the different components of variance) and the total variance
 degrees of freedom (DF) for the corresponding sum of squares
 Sum of Squares for each classification factor
 F Value for a factor and the significance levels of a test of the hypothesis that each variance component equals zero
 the appropriate Error Term for an F test
 Mean Square due to a factor
 estimates of the Variance Components
 Percent of Total (the proportion of variance due to each source)
 Mean

 automatically displays the following when there are multiple dependent variables:
 degrees of freedom
 sum of products
 mean products
 covariance component
 variance component correlation
 mean square correlation
 performs BY group processing, which enables you to obtain separate analyses on grouped observations
 creates a SAS data set that corresponds to any output table

For further details, see
NESTED Procedure
PLAN Procedure
The PLAN procedure constructs designs and randomizes plans for factorial experiments, especially nested and crossed experiments and randomized block designs.
PROC PLAN can also be used for generating lists of permutations and combinations of numbers.
The following are highlights of the PLAN procedure's features:
 supports the following randomization selection methods:
 randomized selection, for which the levels are returned in a random order
 ordered selection, for which the levels are returned in a standard order every time a selection is generated
 cyclic selection, for which the levels returned are computed by cyclically permuting the levels of the previous selection
 permuted selection, for which the levels are a permutation of the integers 1, ... , n
 combination selection, for which the m levels are selected as a combination of the integers 1, ... , n taken m at a time

 constructs the following types of experimental designs:
 full factorial designs, with and without randomization
 certain balanced and partially balanced incomplete block designs
 generalized cyclic incomplete block designs
 Latin square designs
 can be used interactively
 creates a SAS data set that corresponds to any output table

For further details, see
PLAN Procedure
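The five selection methods listed above can be mimicked in a few lines of Python. This is only a sketch of the ideas, not PROC PLAN syntax or output; the variable names, the seed, and the shift of one position per cyclic selection are assumptions made for the example.

```python
import itertools
import random

levels = [1, 2, 3, 4]

# Randomized selection: the levels in a random order (seed is arbitrary).
rng = random.Random(12345)
random_order = rng.sample(levels, k=len(levels))

# Ordered selection: the same standard order every time.
ordered = list(levels)


def cyclic_selections(levels, n_selections, shift=1):
    """Each selection is the previous one cyclically permuted by `shift`."""
    out = []
    current = list(levels)
    for _ in range(n_selections):
        out.append(current)
        current = current[shift:] + current[:shift]
    return out


cyclic = cyclic_selections(levels, 3)

# Permuted selection: all permutations of the integers 1, ..., n.
n, m = 4, 2
perms = list(itertools.permutations(range(1, n + 1)))

# Combination selection: combinations of 1, ..., n taken m at a time.
combs = list(itertools.combinations(range(1, n + 1), m))

print(random_order, ordered, cyclic, len(perms), combs)
```

For n = 4 there are 4! = 24 permutations, and with m = 2 there are 6 combinations, which matches the usual counting formulas.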
TTEST Procedure
The TTEST procedure performs t tests and computes confidence limits for one sample, paired observations, two independent samples,
and the AB/BA crossover design. Two-sided, TOST (two one-sided tests) equivalence, and upper and lower one-sided hypotheses are
supported for means, mean differences, and mean ratios for either normal or lognormal data. PROC TTEST also provides the following features:
 performs BY group processing, which enables you to obtain separate analyses on grouped observations
 creates a SAS data set that corresponds to any output table

 automatically creates graphs by using ODS Graphics

For further details, see
TTEST Procedure
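For readers who want to see the arithmetic behind two of the designs named above, the following Python sketch computes the pooled-variance t statistic for two independent samples and the t statistic for paired observations. It is an illustration under the usual normal-theory assumptions, not PROC TTEST syntax; the function names and data are hypothetical, and p-values and confidence limits are omitted because the t distribution is not in the Python standard library.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t(x, y):
    """Two-sample t statistic with pooled variance. Returns (t, df)."""
    nx, ny = len(x), len(y)
    df = nx + ny - 2
    # Pooled variance: weighted average of the two sample variances.
    sp2 = ((nx - 1) * variance(x) + (ny - 1) * variance(y)) / df
    se = sqrt(sp2 * (1.0 / nx + 1.0 / ny))
    return (mean(x) - mean(y)) / se, df


def paired_t(x, y):
    """Paired t statistic based on within-pair differences. Returns (t, df)."""
    d = [a - b for a, b in zip(x, y)]
    n = len(d)
    return mean(d) / sqrt(variance(d) / n), n - 1


# Hypothetical data.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 6.0, 8.0, 10.0]
t_pooled, df_pooled = pooled_t(x, y)
t_paired, df_paired = paired_t(x, y)
print(t_pooled, df_pooled, t_paired, df_paired)
```

The same data give different statistics and degrees of freedom under the two designs, which is why the design has to be specified before the test is chosen.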
Related Papers
Making Comparisons Fair: How LSMeans Unify the Analysis of Linear Models
Having an EFFECT: More General Linear Modeling and Analysis with the New EFFECT Statement in SAS/STAT Software
CONTRAST and ESTIMATE Statements Made Easy: The LSMESTIMATE Statement
Up To Speed With Categorical Data Analysis
Like Wine, the TTEST Procedure Improves with Age