SAS/STAT Software

Multivariate Analysis

The multivariate analysis procedures are used to investigate relationships among variables without designating some as independent and others as dependent.

Below are highlights of the capabilities of the SAS/STAT procedures that perform multivariate analysis:

CANCORR Procedure


The CANCORR procedure performs canonical correlation, partial canonical correlation, and canonical redundancy analysis. The procedure enables you to do the following:

  • test a series of hypotheses that each canonical correlation and all smaller canonical correlations are zero in the population
  • use multiple regression analysis options to aid in interpreting the canonical correlation analysis
  • compute standardized and unstandardized canonical coefficients
  • compute canonical structure matrices
  • base the canonical analysis on partial correlations
  • create a data set that contains the scores of each observation on each canonical variable
  • create a data set that contains the canonical correlations, coefficients, and most other statistics computed by the procedure
  • create a data set that corresponds to any output table
  • compute weighted product-moment correlation coefficients
  • perform BY group processing, which enables you to obtain separate analyses on grouped observations
For further details, see CANCORR Procedure
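
As a rough illustration of what a canonical correlation is, the following Python sketch computes the two canonical correlations between two sets of two variables each. It uses the textbook result that the squared canonical correlations are the eigenvalues of R11^-1 R12 R22^-1 R21, where the R blocks are correlation matrices within and between the sets. The function names and the toy data are assumptions made for this example; this is not PROC CANCORR code and does not reproduce its tests, coefficients, or output data sets.

```python
import math
from statistics import mean

def corr(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def inv2(m):
    """Inverse of a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def mul2(p, q):
    """Product of two 2x2 matrices."""
    return [[sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def canonical_correlations(x1, x2, y1, y2):
    """Canonical correlations between the sets (x1, x2) and (y1, y2)."""
    r11 = [[1.0, corr(x1, x2)], [corr(x1, x2), 1.0]]
    r22 = [[1.0, corr(y1, y2)], [corr(y1, y2), 1.0]]
    r12 = [[corr(x1, y1), corr(x1, y2)], [corr(x2, y1), corr(x2, y2)]]
    r21 = [[r12[0][0], r12[1][0]], [r12[0][1], r12[1][1]]]  # transpose of r12
    m = mul2(mul2(inv2(r11), r12), mul2(inv2(r22), r21))
    # Closed-form eigenvalues of a 2x2 matrix from its trace and determinant.
    trace = m[0][0] + m[1][1]
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    root = math.sqrt(max(trace * trace - 4 * det, 0.0))
    eigenvalues = [(trace + root) / 2, (trace - root) / 2]
    # Clamp to [0, 1] to guard against floating-point rounding.
    return [math.sqrt(min(max(v, 0.0), 1.0)) for v in eigenvalues]

# Toy data (made up for illustration only).
x1 = [1, 2, 3, 4, 5, 6]
x2 = [2, 1, 4, 3, 6, 5]
y1 = [1, 2, 3, 4, 5, 6]   # identical to x1, so the first correlation is 1
y2 = [1, 3, 2, 5, 4, 4]
print(canonical_correlations(x1, x2, y1, y2))
```

The closed-form eigenvalue step only works for two variables per set; larger sets need a general eigenvalue or singular value routine.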

CORR Procedure


The CORR procedure computes Pearson correlation coefficients, three nonparametric measures of association, and the probabilities associated with these statistics. The following are highlights of the CORR procedure's features:

  • produces the following correlation statistics:
    • Pearson product-moment correlation
    • Spearman rank-order correlation
    • Kendall's tau-b coefficient
    • Hoeffding's measure of dependence, D
    • Pearson, Spearman, and Kendall partial correlation
    • polychoric correlation
    • polyserial correlation
  • computes Cronbach's coefficient alpha for estimating reliability
  • saves the correlation statistics in a SAS data set for use with other statistical and reporting procedures
  • enables you to use Fisher's z transformation to derive confidence limits and p-values under a specified null hypothesis for a Pearson or Spearman correlation
  • performs BY group processing, which enables you to obtain separate analyses on grouped observations
  • creates a SAS data set that corresponds to any output table
  • automatically creates graphs by using ODS Graphics
For further details, see CORR Procedure
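
The statistics listed above can be sketched in a few lines of Python to show what they measure. The example below computes a Pearson correlation, a Spearman rank-order correlation (Pearson on average ranks), confidence limits from the plain Fisher z transformation, and Cronbach's coefficient alpha. The function names and data are assumptions made for this illustration. The Fisher limits use the unadjusted textbook formula, so values reported by PROC CORR may differ if the procedure applies a bias adjustment.

```python
import math
from statistics import NormalDist, mean, variance

def pearson(x, y):
    """Pearson product-moment correlation."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman rank-order correlation: Pearson applied to the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

def fisher_z_limits(r, n, level=0.95):
    """Confidence limits for a correlation via the unadjusted Fisher z."""
    z = math.atanh(r)
    half = NormalDist().inv_cdf(0.5 + level / 2) / math.sqrt(n - 3)
    return math.tanh(z - half), math.tanh(z + half)

def cronbach_alpha(items):
    """items: one list of scores per item, same respondents in each list."""
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]
    return k / (k - 1) * (1 - sum(variance(i) for i in items) / variance(totals))

# Toy data (made up for illustration only).
height = [60, 62, 65, 68, 70, 72]
weight = [100, 112, 121, 150, 148, 170]
r = pearson(height, weight)
print(r, spearman(height, weight), fisher_z_limits(r, len(height)))
```

With only six observations the Fisher limits are very wide, which is the expected behavior of the transformation for small samples.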

CORRESP Procedure


The CORRESP procedure performs simple correspondence analysis and multiple correspondence analysis (MCA). You can use correspondence analysis to find a low-dimensional graphical representation of the rows and columns of a crosstabulation or contingency table. Each row and column is represented by a point in a plot determined from the cell frequencies. PROC CORRESP can also compute coordinates for supplementary rows and columns. The procedure enables you to do the following:

  • use two kinds of input: raw categorical responses on two or more classification variables or a two-way contingency table
  • specify the number of dimensions or axes
  • specify the standardization for the row and column coordinates
  • create a data set that contains coordinates and the results of the correspondence analysis
  • create a data set that contains frequencies and percentages
  • create a data set that corresponds to any output table
  • perform BY group processing, which enables you to obtain separate analyses on grouped observations
  • automatically display the correspondence analysis plot by using ODS Graphics
For further details, see CORRESP Procedure
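
To make the starting point of correspondence analysis concrete, the following Python sketch derives row masses, row profiles, and the total inertia from a two-way contingency table. Total inertia equals the Pearson chi-square statistic divided by the table total, and it is the quantity that the plotted dimensions decompose. The function name and the tables are assumptions for this example; the sketch stops short of the singular value decomposition that produces the actual coordinates.

```python
def correspondence_summary(table):
    """Masses, row profiles, and total inertia of a two-way frequency table."""
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    row_mass = [t / n for t in row_totals]
    col_mass = [t / n for t in col_totals]
    # Each row profile is the row's cell frequencies divided by the row total.
    row_profiles = [[cell / t for cell in row]
                    for row, t in zip(table, row_totals)]
    # Total inertia: sum over cells of (p_ij - r_i * c_j)^2 / (r_i * c_j).
    inertia = 0.0
    for i, row in enumerate(table):
        for j, cell in enumerate(row):
            expected = row_mass[i] * col_mass[j]
            inertia += (cell / n - expected) ** 2 / expected
    # At most min(rows, columns) - 1 dimensions carry nonzero inertia.
    max_dims = min(len(table), len(table[0])) - 1
    return {"row_mass": row_mass, "col_mass": col_mass,
            "row_profiles": row_profiles, "inertia": inertia,
            "max_dims": max_dims}

# Toy tables (made up for illustration only).
independent = [[10, 20], [20, 40]]   # rows proportional: no association
diagonal = [[20, 0], [0, 20]]        # perfect association
print(correspondence_summary(independent)["inertia"])
print(correspondence_summary(diagonal)["inertia"])
```

A table with proportional rows has zero inertia, so there is nothing for the plot to display, while stronger row-column association yields larger inertia.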

FACTOR Procedure


The FACTOR procedure performs a variety of common factor and component analyses and rotations. The following are highlights of the procedure's features:

  • supports the following factor extraction methods:
    • principal component analysis
    • principal factor analysis
    • iterated principal factor analysis
    • unweighted least squares factor analysis
    • maximum likelihood (canonical) factor analysis
    • alpha factor analysis
    • image component analysis
    • Harris component analysis
  • supports the following rotation methods:
    • varimax
    • quartimax
    • biquartimax
    • equamax
    • parsimax
    • factor parsimax
    • quartimin
    • biquartimin
    • covarimin
    • orthomax with user-specified gamma
    • orthogonal Crawford-Ferguson family with user-specified weights on variable parsimony and factor parsimony
    • orthogonal generalized Crawford-Ferguson family with user-specified weights
    • direct oblimin with user-specified tau
    • oblique Crawford-Ferguson family with user-specified weights on variable parsimony and factor parsimony
    • oblique generalized Crawford-Ferguson family with user-specified weights
    • promax with user-specified exponent
    • Harris-Kaiser case II with user-specified exponent
    • Procrustes with a user-specified target pattern
  • provides a variety of methods for prior communality estimation
  • accepts multivariate data, a correlation matrix, a covariance matrix, a factor pattern, or a matrix of scoring coefficients as input
  • enables you to factor either the correlation or covariance matrix
  • processes output from other procedures
  • produces the following output:
    • means
    • standard deviations
    • correlations
    • Kaiser's measure of sampling adequacy
    • eigenvalues
    • a scree plot
    • path diagrams
    • eigenvectors
    • prior and final communality estimates
    • the unrotated factor pattern
    • residual and partial correlations
    • the rotated primary factor pattern
    • the primary factor structure
    • interfactor correlations
    • the reference structure
    • reference axis correlations
    • the variance explained by each factor both ignoring and eliminating other factors
    • plots of both rotated and unrotated factors
    • squared multiple correlation of each factor with the variables
    • standard error estimates
    • confidence limits
    • coverage displays
    • scoring coefficients
  • performs BY group processing, which enables you to obtain separate analyses on grouped observations
  • enables you to use relative weights for each observation in the input data set
  • creates a SAS data set that corresponds to any output table
  • automatically creates graphs by using ODS Graphics
For further details, see FACTOR Procedure
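
Several of the output quantities listed above follow directly from a factor pattern matrix. The Python sketch below computes communalities (row sums of squared loadings) and the variance explained by each factor (column sums of squared loadings), and shows that an orthogonal rotation of two factors changes the pattern but leaves the communalities unchanged. The loadings are invented for illustration, the function names are assumptions, and the rotation angle is arbitrary rather than chosen by a criterion such as varimax.

```python
import math

def communalities(loadings):
    """Sum of squared loadings for each variable (row)."""
    return [sum(v * v for v in row) for row in loadings]

def variance_explained(loadings):
    """Sum of squared loadings for each factor (column), orthogonal factors."""
    return [sum(v * v for v in col) for col in zip(*loadings)]

def rotate_two_factors(loadings, degrees):
    """Apply an orthogonal rotation by the given angle to a two-factor pattern."""
    c = math.cos(math.radians(degrees))
    s = math.sin(math.radians(degrees))
    return [[a * c + b * s, -a * s + b * c] for a, b in loadings]

# Hypothetical unrotated factor pattern: four variables, two factors.
pattern = [[0.8, 0.3], [0.7, 0.4], [0.2, 0.9], [0.3, 0.8]]
rotated = rotate_two_factors(pattern, 20)

print(communalities(pattern))
print(communalities(rotated))
print(variance_explained(pattern), variance_explained(rotated))
```

The rotation redistributes explained variance between the two factors, but the total explained variance and each variable's communality stay the same.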

MDS Procedure


The MDS procedure fits two- and three-way, metric and nonmetric multidimensional scaling models. Multidimensional scaling refers to a class of methods that estimate coordinates for a set of objects in a space of specified dimensionality. The input data are measurements of distances between pairs of objects. A variety of models can be used, which differ in how distances are computed and in the functions that relate the distances to the actual data. The following are highlights of the MDS procedure's features:

  • estimates the following parameters by nonlinear least squares:
    • configuration — the coordinates of each object in a Euclidean or weighted Euclidean space of one or more dimensions
    • dimension coefficients — for each data matrix, the coefficients that multiply each coordinate of the common or group weighted Euclidean space to yield the individual unweighted Euclidean space
    • transformation parameters — intercept, slope, or exponent in a linear, affine, or power transformation relating the distances to the data
  • fits either a regression model of the form
    fit(datum) = fit(trans(distance)) + error
    or a measurement model of the form
    fit(trans(datum)) = fit(distance) + error
    where
    • fit is a predetermined power or logarithmic transformation
    • trans is an estimated ("optimal") linear, affine, power, or monotone transformation
    • datum is a measure of the similarity or dissimilarity of two objects or stimuli
    • distance is a distance computed from the estimated coordinates of the two objects and estimated dimension coefficients in a space of one or more dimensions
    • error is an error term assumed to have an approximately normal distribution and to be independently and identically distributed for all data
  • performs BY group processing, which enables you to obtain separate analyses on grouped observations
  • performs weighted analysis
  • creates a SAS data set that corresponds to any output table
  • automatically creates graphs by using ODS Graphics
For further details, see MDS Procedure
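
The regression model form described above, fit(datum) = fit(trans(distance)) + error, can be made concrete with a small Python sketch. It evaluates the residual sum of squares for a given configuration, an affine trans (intercept plus slope times distance), and a power fit. All names, parameter values, and data are assumptions for this example. PROC MDS estimates the configuration and transformation parameters by nonlinear least squares; the sketch only evaluates the criterion at fixed values and performs no estimation.

```python
import math
from itertools import combinations

def model_distances(coords):
    """Euclidean distance for every pair of objects in the configuration."""
    return {(i, j): math.dist(coords[i], coords[j])
            for i, j in combinations(range(len(coords)), 2)}

def regression_model_rss(data, coords, intercept=0.0, slope=1.0, fit_power=1.0):
    """Residual sum of squares for fit(datum) = fit(trans(distance)) + error.

    trans is affine (intercept + slope * distance) and fit raises its
    argument to fit_power. data maps object pairs (i, j) to dissimilarities.
    """
    rss = 0.0
    for pair, distance in model_distances(coords).items():
        trans = intercept + slope * distance
        rss += (data[pair] ** fit_power - trans ** fit_power) ** 2
    return rss

# Toy configuration and dissimilarities (made up for illustration only).
coords = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]
data = {(0, 1): 11.0, (0, 2): 21.0, (1, 2): 11.0}

print(regression_model_rss(data, coords))                         # identity trans
print(regression_model_rss(data, coords, intercept=1.0, slope=2.0))
```

An estimation routine would search over the coordinates and transformation parameters to make this residual sum of squares as small as possible.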

PRINCOMP Procedure


The PRINCOMP procedure performs principal component analysis. The following are highlights of the PRINCOMP procedure's features:

  • accepts input in the form of raw data, a correlation matrix, a covariance matrix, or a sum-of-squares-and-crossproducts (SSCP) matrix
  • creates output data sets that contain eigenvalues, eigenvectors, and standardized or unstandardized principal component scores
  • automatically creates the scree plot, component pattern plot, component pattern profile plot, matrix plot of component scores, and component score plots by using ODS Graphics
  • performs BY group processing, which enables you to obtain separate analyses on grouped observations
  • performs weighted analysis
  • creates a SAS data set that corresponds to any output table
For further details, see PRINCOMP Procedure
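
For two variables, principal component analysis of the correlation matrix has a closed form, which makes the eigenvalues, eigenvectors, and component scores mentioned above easy to inspect. The correlation matrix with off-diagonal r has eigenvalues 1 + |r| and 1 - |r|, and the components are the scaled sum and difference of the standardized variables. The Python sketch below relies on that special case; the function names and data are assumptions, and real analyses with more variables need a general eigenvalue routine.

```python
import math
from statistics import mean, stdev, variance

def standardize(x):
    """Center to mean 0 and scale to sample standard deviation 1."""
    m, s = mean(x), stdev(x)
    return [(v - m) / s for v in x]

def two_variable_pca(x, y):
    """Principal components of the 2x2 correlation matrix of x and y."""
    zx, zy = standardize(x), standardize(y)
    r = sum(a * b for a, b in zip(zx, zy)) / (len(x) - 1)
    sign = 1.0 if r >= 0 else -1.0
    eigenvalues = [1 + abs(r), 1 - abs(r)]
    root2 = math.sqrt(2)
    eigenvectors = [[1 / root2, sign / root2], [1 / root2, -sign / root2]]
    # Unstandardized component scores: standardized data times eigenvectors.
    scores = [[v[0] * a + v[1] * b for a, b in zip(zx, zy)]
              for v in eigenvectors]
    return eigenvalues, eigenvectors, scores

# Toy data (made up for illustration only).
height = [60, 62, 65, 68, 70, 72]
weight = [100, 112, 121, 150, 148, 170]
eigenvalues, eigenvectors, scores = two_variable_pca(height, weight)

print(eigenvalues)
print([variance(s) for s in scores])  # each matches its eigenvalue
```

The variance of each set of component scores equals the corresponding eigenvalue, and the eigenvalues sum to the number of variables.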

PRINQUAL Procedure


The PRINQUAL procedure performs principal component analysis (PCA) of qualitative, quantitative, or mixed data. PROC PRINQUAL enables you to do the following:

  • find linear and nonlinear transformations of variables, using the method of alternating least squares, that optimize properties of the transformed variables' correlation or covariance matrix. Nonoptimal transformations such as logarithm and rank are also available.
  • fit metric and nonmetric principal component analyses
  • perform metric and nonmetric multidimensional preference (MDPREF) analyses
  • reduce the number of variables for subsequent use in regression analyses, cluster analyses, and other analyses
  • choose among three methods, each of which seeks to optimize a different property of the transformed variables' covariance or correlation matrix. These methods are as follows:
    • maximum total variance, or MTV
    • minimum generalized variance, or MGV
    • maximum average correlation, or MAC
  • transform ordinal variables monotonically by scoring the ordered categories so that order is weakly preserved (adjacent categories can be merged) and the covariance matrix is optimized. You can undo ties optimally or leave them tied. You can also transform ordinal variables to ranks.
  • transform nominal variables by optimally scoring the categories
  • transform interval-scale and ratio-scale variables linearly, or transform them nonlinearly with spline transformations or monotone spline transformations. In addition, nonoptimal transformations for logarithm, rank, exponential, power, logit, and inverse trigonometric sine are available.
  • estimate missing data without constraint, with category constraints (missing values within the same group get the same value), and with order constraints (missing value estimates in adjacent groups can be tied to preserve a specified ordering).
  • detect nonlinear relationships
  • perform weighted estimation
  • perform BY group processing, which enables you to obtain separate analyses on grouped observations
  • create a SAS data set that contains the original variables, transformed variables, components, or data approximations
  • create a SAS data set that corresponds to any output table
  • automatically create graphs by using ODS Graphics
For further details, see PRINQUAL Procedure
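
The idea of a monotone transformation that weakly preserves order, with adjacent categories allowed to merge, can be illustrated with the pool-adjacent-violators algorithm for least-squares monotone regression. The Python sketch below is a generic textbook version offered as an illustration only; it is not presented as the algorithm that PROC PRINQUAL uses, and the function name and input values are assumptions. The input is a list of target scores for the ordered categories, and the output is the closest nondecreasing sequence in the least-squares sense.

```python
def pool_adjacent_violators(values):
    """Least-squares nondecreasing fit to values, listed in category order.

    Adjacent entries that violate the ordering are pooled and replaced by
    their mean, which is how neighboring categories end up tied (merged).
    """
    blocks = []  # each block holds [sum of values, number of values]
    for v in values:
        blocks.append([v, 1])
        # Merge backward while the previous block mean exceeds the last one.
        while (len(blocks) > 1
               and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]):
            total, count = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
    fitted = []
    for total, count in blocks:
        fitted.extend([total / count] * count)
    return fitted

# Hypothetical target scores for four ordered categories.
targets = [1, 3, 2, 4]
print(pool_adjacent_violators(targets))
```

In this toy input the second and third categories are out of order, so they are pooled and receive the same score, while the first and last categories keep their original values.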