The CANDISC Procedure

Input Data Set

The input DATA= data set can be an ordinary SAS data set or one of several specially structured data sets created by statistical procedures available in SAS/STAT software. For more information about special types of data sets, see Appendix A: Special SAS Data Sets. The BY variable in these data sets becomes the CLASS variable in PROC CANDISC. These specially structured data sets include the following:

  • TYPE=CORR data sets created by PROC CORR by using a BY statement

  • TYPE=COV data sets created by PROC PRINCOMP by using both the COV option and a BY statement

  • TYPE=CSSCP data sets created by PROC CORR by using the CSSCP option and a BY statement, where the OUT= data set is assigned TYPE=CSSCP by using the TYPE= data set option

  • TYPE=SSCP data sets created by PROC REG by using both the OUTSSCP= option and a BY statement
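As an illustration of the first case in the list, the following statements sketch how a TYPE=CORR data set might be produced and then analyzed. The example assumes that the Sashelp.Iris sample data set is available; the work data set names iris_sorted and iris_corr are hypothetical and chosen only for this sketch.

```sas
/* Sort by the grouping variable so that BY-group processing is valid */
proc sort data=sashelp.iris out=iris_sorted;
   by Species;
run;

/* OUTP= writes N, MEAN, STD, and CORR observations for each BY group */
proc corr data=iris_sorted noprint outp=iris_corr;
   by Species;
   var SepalLength SepalWidth PetalLength PetalWidth;
run;

/* The BY variable of the previous step is named as the CLASS variable */
proc candisc data=iris_corr;
   class Species;
   var SepalLength SepalWidth PetalLength PetalWidth;
run;
```

The raw observations are not needed in the last step, because PROC CANDISC works from the summary statistics stored for each value of Species.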

When the input data set is TYPE=CORR, TYPE=COV, or TYPE=CSSCP, PROC CANDISC reads the number of observations for each class from the observations for which _TYPE_=N and the variable means in each class from the observations for which _TYPE_=MEAN. The procedure then reads the within-class statistics that correspond to the data set type: for a TYPE=CORR data set, the within-class correlations from the observations for which _TYPE_=CORR and the standard deviations from the observations for which _TYPE_=STD; for a TYPE=COV data set, the within-class covariances from the observations for which _TYPE_=COV; and for a TYPE=CSSCP data set, the within-class corrected sums of squares and crossproducts from the observations for which _TYPE_=CSSCP.
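To see which observations the procedure would read, it can help to print such a data set. The following sketch builds a TYPE=COV data set with PROC PRINCOMP and lists the _TYPE_=N, _TYPE_=MEAN, and _TYPE_=COV observations. It assumes that the Sashelp.Iris sample data set is available; the data set names iris_by and iris_cov are hypothetical.

```sas
/* Sort by the grouping variable before BY-group processing */
proc sort data=sashelp.iris out=iris_by;
   by Species;
run;

/* With the COV option, the OUTSTAT= data set is TYPE=COV */
proc princomp data=iris_by cov noprint outstat=iris_cov;
   by Species;
   var SepalLength SepalWidth PetalLength PetalWidth;
run;

/* List only the observations that carry counts, means, and covariances */
proc print data=iris_cov;
   where _TYPE_ in ('N', 'MEAN', 'COV');
run;
```

The OUTSTAT= data set also holds the eigenvalue and score observations that PROC PRINCOMP writes; the WHERE statement here only filters them out of the listing.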

When the data set does not include, for each class, any observations for which _TYPE_=CORR (data set TYPE=CORR), _TYPE_=COV (data set TYPE=COV), or _TYPE_=CSSCP (data set TYPE=CSSCP), PROC CANDISC reads the pooled within-class information from the data set instead. In this case, the procedure reads the statistics that correspond to the data set type: for a TYPE=CORR data set, the pooled within-class correlations from the observations for which _TYPE_=PCORR and the pooled within-class standard deviations from the observations for which _TYPE_=PSTD; for a TYPE=COV data set, the pooled within-class covariances from the observations for which _TYPE_=PCOV; and for a TYPE=CSSCP data set, the pooled within-class corrected SSCP matrix from the observations for which _TYPE_=PSSCP.

When the input data set is TYPE=SSCP, PROC CANDISC reads the following information:

  • the number of observations for each class from the observations for which _TYPE_=N

  • the sum of weights of observations from the variable Intercept in observations for which _TYPE_=SSCP and _NAME_=Intercept

  • the variable sums from the analysis variables in observations for which _TYPE_=SSCP and _NAME_=Intercept

  • the uncorrected sums of squares and crossproducts from the analysis variables in observations for which _TYPE_=SSCP and _NAME_=variable-name
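The following statements sketch one way to create and inspect a TYPE=SSCP data set of this kind. The example assumes that the Sashelp.Iris sample data set is available; the data set names iris_grouped and iris_sscp are hypothetical. No MODEL statement is used, so the VAR statement names the variables to include in the crossproducts matrix.

```sas
/* Sort by the grouping variable before BY-group processing */
proc sort data=sashelp.iris out=iris_grouped;
   by Species;
run;

/* OUTSSCP= writes the uncorrected SSCP matrix for each BY group */
proc reg data=iris_grouped outsscp=iris_sscp noprint;
   by Species;
   var SepalLength SepalWidth PetalLength PetalWidth;
run;
quit;

/* Inspect the _TYPE_ and _NAME_ layout described above */
proc print data=iris_sscp;
run;
```

In the printed data set, the observation for which _NAME_=Intercept should hold the sum of weights in the Intercept column and the variable sums in the analysis variable columns.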