Analysis of Variance

Analysis of variance, in the contemporary sense of statistical modeling and analysis, is the study of the influences on the variation of a phenomenon. This type of analysis may, for example, take the form of an analysis of variance table based on sums of squares, a deviance decomposition in a generalized linear model, or a series of Type III tests followed by comparisons of least squares means in a mixed model.

See the chapter Introduction to Analysis of Variance Procedures in the SAS/STAT User's Guide for a more in-depth discussion of this topic and of the SAS/STAT procedures that enable you to perform this type of statistical analysis.

The SAS/STAT analysis of variance procedures include the following:

ANOVA Procedure


The ANOVA procedure performs analysis of variance for balanced data from a wide variety of experimental designs. Use PROC ANOVA for the analysis of balanced data only, with the following exceptions: one-way analysis of variance, Latin square designs, certain partially balanced incomplete block designs, completely nested (hierarchical) designs, and designs with cell frequencies that are proportional to each other and are also proportional to the background population. These exceptions have designs in which the factors are all orthogonal to each other. The procedure enables you to do the following:

  • absorb classification effects in a model
  • perform multivariate analysis of variance (MANOVA)
  • compute and compare means
  • perform multiple comparison tests
  • perform multivariate and univariate repeated measurements analysis of variance
  • specify variables to define subgroups for the analysis
  • construct tests that use the sum of squares for effects and the error term you specify
  • create a data set that contains sums of squares, degrees of freedom, F statistics, and probability levels for each effect in the model
  • create a data set that corresponds to any output table
  • automatically produce graphs by using ODS Graphics
For further details, see ANOVA Procedure
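
As a minimal sketch of how a balanced one-way analysis might be requested, the following program uses a made-up data set (the names yields, Treatment, Yield, and anova_stats and all of the values are illustrative assumptions, not taken from the SAS documentation). It fits the model, compares treatment means with Tukey's method, and saves the sums of squares and F statistics to a data set.

```sas
/* Hypothetical balanced layout: 3 treatments, 4 plots each (made-up values) */
data yields;
   input Treatment $ Yield @@;
   datalines;
A 20.1 A 21.4 A 19.8 A 20.9
B 23.5 B 24.1 B 22.8 B 23.9
C 18.2 C 17.9 C 19.1 C 18.6
;

/* OUTSTAT= saves sums of squares, DF, F statistics, and p-values per effect */
proc anova data=yields outstat=anova_stats;
   class Treatment;              /* classification (factor) variable     */
   model Yield = Treatment;      /* one-way analysis of variance         */
   means Treatment / tukey;      /* multiple comparisons of group means  */
run;
quit;
```

Because PROC ANOVA is interactive, the QUIT statement ends the procedure after the RUN group is executed.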

CATMOD Procedure


The CATMOD procedure performs categorical data modeling of data that can be represented by a contingency table. PROC CATMOD fits linear models to functions of response frequencies, and it can be used for linear modeling, log-linear modeling, logistic regression, and repeated measurement analysis. The procedure enables you to do the following:

  • estimate model parameters by using weighted least squares (WLS) for a wide range of general linear models or maximum likelihood (ML) for log-linear models and the analysis of generalized logits
  • supply raw data, where each observation is a subject, supply cell count data, where each observation is a cell in a contingency table, or directly input a covariance matrix
  • construct linear functions of the model parameters or log-linear effects and test the hypothesis that the linear combination equals zero
  • perform constrained estimation
  • perform BY group processing, which enables you to obtain separate analyses on grouped observations
  • create a data set that contains the observed and predicted values of the response functions, their standard errors, the residuals, and variables that describe the population and response profiles. In addition, if you use the standard response functions, the data set includes observed and predicted values for the cell frequencies or the cell probabilities, together with their standard errors and residuals.
  • create a data set that contains the estimated parameter vector and its estimated covariance matrix
  • create a data set that corresponds to any output table
For further details, see CATMOD Procedure
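
The cell-count input style described above can be sketched as follows. The data set counts, its variables, and the frequencies are invented for illustration. Each observation is one cell of a 2 x 2 x 2 contingency table, and the WEIGHT statement supplies the cell frequency. With no RESPONSE statement, the model is fit to the default response functions (generalized logits).

```sas
/* Hypothetical cell counts: one observation per contingency-table cell */
data counts;
   input Treatment $ Sex $ Response $ Count;
   datalines;
active  F improved 16
active  F none     11
active  M improved  9
active  M none     14
placebo F improved  8
placebo F none     19
placebo M improved  5
placebo M none     17
;

proc catmod data=counts;
   weight Count;                      /* Count holds the cell frequency   */
   model Response = Treatment Sex;    /* main-effects model for the logit */
run;
quit;
```

If the data were supplied as raw subject-level records instead, the WEIGHT statement would simply be omitted.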

GLM Procedure


The GLM procedure uses the method of least squares to fit general linear models. Among the statistical methods available in PROC GLM are regression, analysis of variance, analysis of covariance, multivariate analysis of variance, and partial correlation. The following are highlights of the procedure's features:

  • enables you to specify any degree of interaction (crossed effects) and nested effects
  • enables you to specify polynomial, continuous-by-class, and continuous-nesting class effects
  • enables you to absorb classification effects in a model
  • enables you to specify random effects in a model
  • produces expected mean squares for each Type I, Type II, Type III, Type IV, and contrast mean square used in the analysis
  • enables you to specify both hypothesis effects and the error effect to use for a multivariate analysis of variance
  • performs BY group processing, which enables you to obtain separate analyses on grouped observations
  • computes least squares means and least squares mean differences for classification effects
  • performs multiple comparison adjustments for the p-values and confidence limits for the least squares mean differences
  • computes arithmetic means and standard deviations of all continuous variables in a model within each group corresponding to each effect
  • performs multiple comparisons of main-effect means
  • tests hypotheses for the effects of a linear model regardless of the number of missing cells or the extent of confounding
  • performs F tests that use appropriate mean squares or linear combinations of mean squares as error terms
  • estimates linear functions of the model parameters
  • tests hypotheses for linear combinations of the model parameters
  • displays the sum of squares associated with each hypothesis tested and, upon request, the form of the estimable function employed in a test
  • produces the general form of all estimable functions
  • creates an output data set that contains the input data set, predicted values, residuals, and other diagnostic measures
  • creates a SAS data set that corresponds to any output table
  • automatically creates graphs by using ODS Graphics
For further details, see GLM Procedure
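
Several of these features can be combined in one step. The sketch below assumes a small, unbalanced, made-up data set (the names drugtrial, Drug, Baseline, Response, and diagnostics are hypothetical) and fits an analysis of covariance with Type III tests, adjusted least squares means, one estimated linear function, and an output data set of diagnostics.

```sas
/* Hypothetical unbalanced data: three drugs with a baseline covariate */
data drugtrial;
   input Drug $ Baseline Response;
   datalines;
A 10 14
A 12 17
A  9 12
B 11 18
B 13 21
C 10 13
C 14 19
C 12 15
;

proc glm data=drugtrial;
   class Drug;
   model Response = Drug Baseline / ss3 solution;   /* Type III tests, estimates    */
   lsmeans Drug / pdiff cl adjust=tukey;            /* adjusted LS-mean differences */
   estimate 'A vs B' Drug 1 -1 0;                   /* linear function of parameters */
   output out=diagnostics p=Predicted r=Residual;   /* predicted values, residuals  */
run;
quit;
```

The ESTIMATE coefficients follow the sorted order of the Drug levels (A, B, C), so the statement compares drug A with drug B at a common baseline value.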

INBREED Procedure


The INBREED procedure calculates the covariance or inbreeding coefficients for a pedigree. PROC INBREED is unique in that it handles very large populations. The following are highlights of the procedure's features:

  • supports two modes of operation:
    • Mode 1: carries out analysis on the assumption that all the individuals belong to the same generation
    • Mode 2: divides the population into nonoverlapping generations and analyzes each generation separately, assuming that the parents of individuals in the current generation are defined in the previous generation
  • computes averages of the covariance or inbreeding coefficients within sex categories if the sex of individuals is known
  • performs BY group processing, which enables you to obtain separate analyses on grouped observations
  • creates a SAS data set that corresponds to any output table
For further details, see INBREED Procedure
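
A minimal sketch of the first mode of operation, in which all individuals are treated as one generation, might look like this. The pedigree and the data set name are invented for illustration. A period marks an unknown parent.

```sas
/* Hypothetical pedigree: a period denotes an unknown parent */
data pedigree;
   input Individual $ Parent1 $ Parent2 $;
   datalines;
Mark  .     .
Lisa  .     .
Kate  Mark  Lisa
Tom   Mark  Lisa
Amy   Tom   Kate
;

/* COVAR requests covariance coefficients; MATRIX displays them as a matrix */
proc inbreed data=pedigree covar matrix;
   var Individual Parent1 Parent2;   /* individual, first parent, second parent */
run;
```

Adding a CLASS statement that names a generation variable would be the way to request the second mode, with each generation analyzed separately.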

LATTICE Procedure


The LATTICE procedure computes the analysis of variance and analysis of simple covariance for data from an experiment with a lattice design. PROC LATTICE analyzes balanced square lattices, partially balanced square lattices, and some rectangular lattices. The following are highlights of the LATTICE procedure's features:

  • determines from the data set which type of design has been used and verifies that the design is valid
  • produces output data sets including:
    • analysis of variance
    • adjusted treatment means
    • additional statistics
  • performs BY group processing, which enables you to obtain separate analyses on grouped observations
  • creates a SAS data set that corresponds to any output table
For further details, see LATTICE Procedure

NESTED Procedure


The NESTED procedure performs random-effects analysis of variance for data from an experiment with a nested (hierarchical) structure. The following are highlights of the NESTED procedure's features:

  • provides a descriptive analysis of covariation
  • accommodates unbalanced data
  • automatically displays the following for each dependent variable:
    • Coefficients of Expected Mean Squares
    • each Variance Source in the model (the different components of variance) and the total variance
    • degrees of freedom (DF) for the corresponding sum of squares
    • Sum of Squares for each classification factor
    • F Value for a factor and the significance levels of a test of the hypothesis that each variance component equals zero
    • the appropriate Error Term for an F test
    • Mean Square due to a factor
    • estimates of the Variance Components
    • Percent of Total (the proportion of variance due to each source)
    • Mean
  • automatically displays the following when there are multiple dependent variables:
    • degrees of freedom
    • sum of products
    • mean products
    • covariance component
    • variance component correlation
    • mean square correlation
  • performs BY group processing, which enables you to obtain separate analyses on grouped observations
  • creates a SAS data set that corresponds to any output table
For further details, see NESTED Procedure
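
The following sketch assumes a made-up two-level hierarchy, leaves nested within plants, with two determinations per leaf (all names and values are illustrative). PROC NESTED expects the input data to be sorted by the classification variables in the order given in the CLASS statement, so a PROC SORT step comes first.

```sas
/* Hypothetical nested data: 3 plants, 2 leaves per plant, 2 readings per leaf */
data leaves;
   input Plant Leaf Calcium @@;
   datalines;
1 1 3.1 1 1 3.3 1 2 3.6 1 2 3.4
2 1 2.5 2 1 2.3 2 2 1.9 2 2 2.0
3 1 2.8 3 1 2.6 3 2 3.5 3 2 3.7
;

proc sort data=leaves;
   by Plant Leaf;
run;

proc nested data=leaves;
   class Plant Leaf;     /* Leaf is nested within Plant */
   var Calcium;          /* dependent variable          */
run;
```

Listing more than one variable in the VAR statement would additionally produce the covariance analysis described for multiple dependent variables.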

PLAN Procedure


The PLAN procedure constructs designs and randomizes plans for factorial experiments, especially nested and crossed experiments and randomized block designs. PROC PLAN can also be used for generating lists of permutations and combinations of numbers. The following are highlights of the PLAN procedure's features:

  • supports the following randomization selection methods:
    • randomized selection, for which the levels are returned in a random order
    • ordered selection, for which the levels are returned in a standard order every time a selection is generated
    • cyclic selection, for which the levels returned are computed by cyclically permuting the levels of the previous selection
    • permuted selection, for which the levels are a permutation of the integers 1, ... , n
    • combination selection, for which the m levels are selected as a combination of the integers 1, ... , n taken m at a time
  • constructs the following types of experimental designs:
    • full factorial designs, with and without randomization
    • certain balanced and partially balanced incomplete block designs
    • generalized cyclic incomplete block designs
    • Latin square designs
  • can be used interactively
  • creates a SAS data set that corresponds to any output table
For further details, see PLAN Procedure
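
As an illustration of ordered and randomized selection, the sketch below generates a randomized complete block plan. The factor names, the seed value, and the output data set name rcbd are arbitrary choices made for this example. Blocks are listed in standard order, and within each block the four treatment levels are returned in a random order.

```sas
/* Hypothetical plan: 3 blocks, 4 treatments randomized within each block */
proc plan seed=12345;
   factors Block=3 ordered        /* levels 1, 2, 3 in standard order      */
           Treatment=4 random;    /* random permutation within each block  */
   output out=rcbd;               /* save the plan as a SAS data set       */
run;
quit;
```

Fixing the SEED= value makes the randomization reproducible from one run to the next.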

TTEST Procedure


The TTEST procedure performs t tests and computes confidence limits for one sample, paired observations, two independent samples, and the AB/BA crossover design. Two-sided, TOST (two one-sided test) equivalence, and upper and lower one-sided hypotheses are supported for means, mean differences, and mean ratios for either normal or lognormal data. PROC TTEST also provides the following features:

  • performs BY group processing, which enables you to obtain separate analyses on grouped observations
  • creates a SAS data set that corresponds to any output table
  • automatically creates graphs by using ODS Graphics
For further details, see TTEST Procedure
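
A TOST equivalence analysis of a mean ratio for paired lognormal data could be requested along the following lines. The data set, variable names, and measurements are made up, and the equivalence bounds 0.8 and 1.25 are an assumption chosen for illustration.

```sas
/* Hypothetical paired measurements (all values must be positive) */
data auc;
   input TestAUC RefAUC @@;
   datalines;
98.2 91.5  61.4 72.0  84.9 80.3  55.7 60.1
77.3 69.8  90.6 95.2  68.1 64.4  73.5 79.9
;

/* DIST=LOGNORMAL analyzes the ratio; TOST gives the equivalence bounds */
proc ttest data=auc dist=lognormal tost(0.8, 1.25);
   paired TestAUC*RefAUC;
run;
```

Replacing the PAIRED statement with CLASS and VAR statements would give the two-independent-sample form of the same test.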

Related Papers

Making Comparisons Fair: How LS-Means Unify the Analysis of Linear Models

Having an EFFECT: More General Linear Modeling and Analysis with the New EFFECT Statement in SAS/STAT Software

CONTRAST and ESTIMATE Statements Made Easy: The LSMESTIMATE Statement

Up To Speed With Categorical Data Analysis

Like Wine, the TTEST Procedure Improves with Age