The OUTSTAT= data set is similar to the TYPE=CORR data set that the CORR procedure produces but contains many results in addition to those produced by PROC CORR.
The OUTSTAT= data set is TYPE=CORR, and it contains the following variables:
the BY variables, if any
the CLASS variable
_TYPE_, a character variable of length 8 that identifies the type of statistic
_NAME_, a character variable of length 32 that identifies the row of the matrix or the name of the canonical variable
the quantitative variables (those in the VAR statement, or if there is no VAR statement, all numeric variables not listed in any other statement)
The observations, as identified by the variable _TYPE_, have the following _TYPE_ values:
Contents
number of observations for the total sample (CLASS variable missing) and within each class (CLASS variable present)
sum of weights for the total sample (CLASS variable missing) and within each class (CLASS variable present) if a WEIGHT statement is specified
means for the total sample (CLASS variable missing) and within each class (CLASS variable present)
total-standardized class means
pooled within-class standardized class means
standard deviations for the total sample (CLASS variable missing) and within each class (CLASS variable present)
pooled within-class standard deviations
between-class standard deviations
univariate R squares
The following kinds of observations are identified by the combination of the variables _TYPE_ and _NAME_. When the _TYPE_ variable has one of the following values, the _NAME_ variable identifies the row of the matrix:
Contents
corrected SSCP matrix for the total sample (CLASS variable missing) and within each class (CLASS variable present)
pooled within-class corrected SSCP matrix
between-class SSCP matrix
covariance matrix for the total sample (CLASS variable missing) and within each class (CLASS variable present)
pooled within-class covariance matrix
between-class covariance matrix
correlation matrix for the total sample (CLASS variable missing) and within each class (CLASS variable present)
pooled within-class correlation matrix
between-class correlation matrix
When the _TYPE_ variable has one of the following values, the _NAME_ variable identifies the canonical variable:
Contents
canonical correlations
canonical structure
between canonical structure
pooled within-class canonical structure
total-sample standardized canonical coefficients
pooled within-class standardized canonical coefficients
raw canonical coefficients
means of the canonical variables for each class
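To see which _TYPE_ values a particular OUTSTAT= data set actually contains, one option is to tabulate them and print a few rows. The following is a minimal sketch, not a prescribed workflow: it assumes the Sashelp.Iris sample data set (with its CLASS variable Species and four measurement variables) is available, and it reuses the data set name Coef purely for illustration.

```sas
/* Illustrative only: Sashelp.Iris and its variable names are assumptions */
proc candisc data=sashelp.iris outstat=Coef noprint;
   class Species;
   var SepalLength SepalWidth PetalLength PetalWidth;
run;

/* Count the observations stored under each _TYPE_ value */
proc freq data=Coef;
   tables _TYPE_;
run;

/* Print the total-sample rows (CLASS variable missing) for three _TYPE_ values */
proc print data=Coef(where=(_TYPE_ in ('MEAN','STD','SCORE') and Species=' '));
run;
```

Because Species is a character variable, the total-sample rows are selected by comparing it to a blank; a numeric CLASS variable would be compared to a missing value (.) instead.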
You can use this data set in PROC SCORE to get scores on the canonical variables for new data by using one of the following forms:
   * The CLASS variable C is numeric;
   proc score data=NewData score=Coef(where=(c = . )) out=Scores;
   run;

   * The CLASS variable C is character;
   proc score data=NewData score=Coef(where=(c = ' ')) out=Scores;
   run;
The WHERE clause excludes the within-class means and standard deviations. PROC SCORE standardizes the new data by subtracting the original variable means that are stored in the _TYPE_='MEAN' observations and dividing by the original variable standard deviations from the _TYPE_='STD' observations. Then PROC SCORE multiplies the standardized variables by the coefficients from the _TYPE_='SCORE' observations to get the canonical scores.
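The computation just described can be written as a single expression. The symbols are notation introduced here for illustration and do not appear in the data set: x_i is the value of the i-th quantitative variable in the new data, \bar{x}_i and s_i are its total-sample mean and standard deviation from the MEAN and STD observations, b_{ij} is the coefficient for variable i in the SCORE observation for canonical variable j, and p is the number of quantitative variables.

```latex
\[
  \mathrm{Can}_j \;=\; \sum_{i=1}^{p} b_{ij}\,\frac{x_i - \bar{x}_i}{s_i}
\]
```

Only the total-sample means and standard deviations enter this expression, because the WHERE clause keeps just the observations in which the CLASS variable is missing.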