The CANDISC Procedure

Output Data Sets

The OUT= data set contains all the variables in the original data set plus new variables containing the canonical variable scores. You determine the number of new variables by using the NCAN= option. The names of the new variables are formed as described in the PREFIX= option. The new variables have means equal to zero and pooled within-class variances equal to one. An OUT= data set cannot be created if the DATA= data set is not an ordinary SAS data set.
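As a minimal sketch of the options described above, the following step writes two canonical variables to an OUT= data set. It assumes the Sashelp.Iris sample data set is available in your installation; the output name IrisScores is hypothetical and chosen only for illustration.

```sas
/* Assumption: Sashelp.Iris is installed; IrisScores is a made-up name */
proc candisc data=sashelp.iris out=IrisScores ncan=2 prefix=Can noprint;
   class Species;
   var SepalLength SepalWidth PetalLength PetalWidth;
run;

/* The new variables Can1 and Can2 should have means of zero */
proc means data=IrisScores mean;
   var Can1 Can2;
run;
```

With PREFIX=Can and NCAN=2, the new variables are expected to be named Can1 and Can2.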

The OUTSTAT= data set is similar to the TYPE=CORR data set produced by the CORR procedure but contains many results in addition to those produced by the CORR procedure.

The OUTSTAT= data set is TYPE=CORR, and it contains the following variables:

- the BY variables, if any

- the CLASS variable

- _TYPE_, a character variable of length 8 that identifies the type of statistic

- _NAME_, a character variable of length 32 that identifies the row of the matrix or the name of the canonical variable

- the quantitative variables (those in the VAR statement, or if there is no VAR statement, all numeric variables not listed in any other statement)

The observations, as identified by the variable _TYPE_, have the following _TYPE_ values:

| _TYPE_ | Contents |
|---|---|
| N | number of observations both for the total sample (CLASS variable missing) and within each class (CLASS variable present) |
| SUMWGT | sum of weights both for the total sample (CLASS variable missing) and within each class (CLASS variable present) if a WEIGHT statement is specified |
| MEAN | means both for the total sample (CLASS variable missing) and within each class (CLASS variable present) |
| STDMEAN | total-standardized class means |
| PSTDMEAN | pooled within-class standardized class means |
| STD | standard deviations both for the total sample (CLASS variable missing) and within each class (CLASS variable present) |
| PSTD | pooled within-class standard deviations |
| BSTD | between-class standard deviations |
| RSQUARED | univariate R squares |

The following kinds of observations are identified by the combination of the variables _TYPE_ and _NAME_. When the _TYPE_ variable has one of the following values, the _NAME_ variable identifies the row of the matrix:

| _TYPE_ | Contents |
|---|---|
| CSSCP | corrected SSCP matrix for the total sample (CLASS variable missing) and within each class (CLASS variable present) |
| PSSCP | pooled within-class corrected SSCP matrix |
| BSSCP | between-class SSCP matrix |
| COV | covariance matrix for the total sample (CLASS variable missing) and within each class (CLASS variable present) |
| PCOV | pooled within-class covariance matrix |
| BCOV | between-class covariance matrix |
| CORR | correlation matrix for the total sample (CLASS variable missing) and within each class (CLASS variable present) |
| PCORR | pooled within-class correlation matrix |
| BCORR | between-class correlation matrix |

When the _TYPE_ variable has one of the following values, the _NAME_ variable identifies the canonical variable:

| _TYPE_ | Contents |
|---|---|
| CANCORR | canonical correlations |
| STRUCTUR | canonical structure |
| BSTRUCT | between canonical structure |
| PSTRUCT | pooled within-class canonical structure |
| SCORE | total sample standardized canonical coefficients |
| PSCORE | pooled within-class standardized canonical coefficients |
| RAWSCORE | raw canonical coefficients |
| CANMEAN | means of the canonical variables for each class |
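One way to see how these _TYPE_ values appear in practice is to create an OUTSTAT= data set and list its contents. The sketch below assumes the Sashelp.Iris sample data set is available; the data set name Coef is illustrative.

```sas
/* Assumption: Sashelp.Iris is installed; Coef is an illustrative name */
proc candisc data=sashelp.iris outstat=Coef noprint;
   class Species;
   var SepalLength SepalWidth PetalLength PetalWidth;
run;

/* Count how many observations carry each _TYPE_ value */
proc freq data=Coef;
   tables _TYPE_;
run;

/* List the canonical correlations and the raw canonical coefficients;
   for these rows _NAME_ holds the name of the canonical variable */
proc print data=Coef noobs;
   where _TYPE_ in ('CANCORR', 'RAWSCORE');
run;
```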

You can use this data set with PROC SCORE to get scores on the canonical variables for new data by using one of the following forms:

    * The CLASS variable C is numeric;
    proc score data=NewData score=Coef(where=(c = . )) out=Scores;
    run;

    * The CLASS variable C is character;
    proc score data=NewData score=Coef(where=(c = ' ')) out=Scores;
    run;

The WHERE clause is used to exclude the within-class means and standard deviations. PROC SCORE standardizes the new data by subtracting the original variable means that are stored in the _TYPE_='MEAN' observations, and dividing by the original variable standard deviations from the _TYPE_='STD' observations. Then PROC SCORE multiplies the standardized variables by the coefficients from the _TYPE_='SCORE' observations to get the canonical scores.
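The whole sequence can be sketched end to end as follows. This is an illustration under stated assumptions, not a prescribed workflow: it assumes the Sashelp.Iris sample data set is available, that its CLASS variable Species is character (so the blank-string form of the WHERE clause applies), and that the two observations in NewData are made-up values on the same measurement scale as the original data.

```sas
/* Fit the model on the original data and save the statistics */
proc candisc data=sashelp.iris outstat=Coef noprint;
   class Species;
   var SepalLength SepalWidth PetalLength PetalWidth;
run;

/* Hypothetical new observations with the same quantitative variables */
data NewData;
   input SepalLength SepalWidth PetalLength PetalWidth;
   datalines;
51 35 14 2
64 32 45 15
;

/* Keep only the total-sample rows of Coef (CLASS variable missing) */
proc score data=NewData score=Coef(where=(Species = ' ')) out=Scores;
   var SepalLength SepalWidth PetalLength PetalWidth;
run;

proc print data=Scores;
run;
```

The Scores data set should contain the original variables from NewData plus one new variable for each canonical variable.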

Copyright © 2009 by SAS Institute Inc., Cary, NC, USA. All rights reserved.