The CANDISC Procedure

Input Data Set

The input DATA= data set can be an ordinary SAS data set or one of several specially structured data sets created by statistical procedures available in SAS/STAT software. For more information about special types of data sets, see Appendix A: Special SAS Data Sets. The BY variable in these data sets becomes the CLASS variable in PROC CANDISC. These specially structured data sets include the following:

  • TYPE=CORR data sets created by PROC CORR by using a BY statement

  • TYPE=COV data sets created by PROC PRINCOMP by using both the COV option and a BY statement

  • TYPE=CSSCP data sets created by PROC CORR by using the CSSCP option and a BY statement, where the OUT= data set is assigned TYPE=CSSCP by using the TYPE= data set option

  • TYPE=SSCP data sets created by PROC REG by using both the OUTSSCP= option and a BY statement
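As an illustration of the first case in the list above, the following is a minimal sketch of building a TYPE=CORR data set with PROC CORR and a BY statement, then passing it to PROC CANDISC. The use of the Sashelp.Iris sample data set and the work data set names iris and iris_corr are assumptions made for this example; they are not part of the original text.

```sas
/* Sort by the future CLASS variable so that BY-group processing works. */
proc sort data=sashelp.iris out=iris;
   by Species;
run;

/* OUTP= writes a TYPE=CORR data set. With a BY statement it holds   */
/* one set of _TYPE_ = MEAN, STD, N, and CORR rows per BY group.     */
proc corr data=iris noprint outp=iris_corr;
   by Species;
   var SepalLength SepalWidth PetalLength PetalWidth;
run;

/* Optional: inspect the _TYPE_ and _NAME_ rows in each BY group. */
proc print data=iris_corr;
run;

/* The BY variable (Species) is now named in the CLASS statement. */
proc candisc data=iris_corr;
   class Species;
   var SepalLength SepalWidth PetalLength PetalWidth;
run;
```

Printing the intermediate data set is optional, but it makes it easy to see which _TYPE_ rows are present for each class before running the analysis.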

When the input data set is TYPE=CORR, TYPE=COV, or TYPE=CSSCP, PROC CANDISC reads the number of observations for each class from the observations for which _TYPE_='N' and the variable means in each class from the observations for which _TYPE_='MEAN'. The procedure then reads the within-class information according to the data set type: for a TYPE=CORR data set, the within-class correlations from the observations for which _TYPE_='CORR' and the standard deviations from the observations for which _TYPE_='STD'; for a TYPE=COV data set, the within-class covariances from the observations for which _TYPE_='COV'; and for a TYPE=CSSCP data set, the within-class corrected sums of squares and crossproducts from the observations for which _TYPE_='CSSCP'.

When, for each class, the data set does not include any observations for which _TYPE_='CORR' (data set TYPE=CORR), _TYPE_='COV' (data set TYPE=COV), or _TYPE_='CSSCP' (data set TYPE=CSSCP), PROC CANDISC reads the pooled within-class information from the data set instead. In this case, the procedure reads the following according to the data set type: for a TYPE=CORR data set, the pooled within-class correlations from the observations for which _TYPE_='PCORR' and the pooled within-class standard deviations from the observations for which _TYPE_='PSTD'; for a TYPE=COV data set, the pooled within-class covariances from the observations for which _TYPE_='PCOV'; and for a TYPE=CSSCP data set, the pooled within-class corrected SSCP matrix from the observations for which _TYPE_='PSSCP'.

When the input data set is TYPE=SSCP, PROC CANDISC reads the number of observations for each class from the observations for which _TYPE_='N', the sum of weights of observations from the variable Intercept in observations for which _TYPE_='SSCP' and _NAME_='Intercept', the variable sums from the analysis variables in observations for which _TYPE_='SSCP' and _NAME_='Intercept', and the uncorrected sums of squares and crossproducts from the analysis variables in observations for which _TYPE_='SSCP' and _NAME_=variablename.
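The TYPE=SSCP case can be sketched in the same way. The following example assumes the Sashelp.Iris sample data set and the hypothetical work data set names iris and iris_sscp. It also assumes that PROC REG is run with only a VAR statement (no MODEL statement), so that its sole purpose is to write the OUTSSCP= data set.

```sas
/* Sort by the future CLASS variable so that BY-group processing works. */
proc sort data=sashelp.iris out=iris;
   by Species;
run;

/* OUTSSCP= writes a TYPE=SSCP data set. With a BY statement it holds */
/* one uncorrected SSCP matrix, plus an N row, for each BY group.     */
proc reg data=iris outsscp=iris_sscp noprint;
   by Species;
   var SepalLength SepalWidth PetalLength PetalWidth;
run;
quit;

/* Inspect the rows that PROC CANDISC reads: the _TYPE_='N' rows and */
/* the _TYPE_='SSCP' rows, including the Intercept row of each group. */
proc print data=iris_sscp;
run;

/* The BY variable (Species) is now named in the CLASS statement. */
proc candisc data=iris_sscp;
   class Species;
   var SepalLength SepalWidth PetalLength PetalWidth;
run;
```

In the printed data set, the row with _NAME_='Intercept' carries the sum of weights in the Intercept column and the variable sums in the analysis variable columns, which matches the description above.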