Input Data Set

The input DATA= data set can be an ordinary SAS data set or one of several specially structured data sets created by statistical procedures available with SAS/STAT software. For more information about special types of data sets, see Appendix A, Special SAS Data Sets. The BY variable in these data sets becomes the CLASS variable in PROC CANDISC. These specially structured data sets include the following:

  • TYPE=CORR data sets created by PROC CORR by using a BY statement

  • TYPE=COV data sets created by PROC PRINCOMP by using both the COV option and a BY statement

  • TYPE=CSSCP data sets created by PROC CORR by using the CSSCP option and a BY statement, where the OUT= data set is assigned TYPE=CSSCP with the TYPE= data set option

  • TYPE=SSCP data sets created by PROC REG by using both the OUTSSCP= option and a BY statement
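As an illustration of the TYPE=CSSCP item in the list above, the following is a minimal sketch, not a definitive recipe. It assumes the SASHELP.IRIS sample data set is available and uses the PROC CORR OUTP= option as the output data set; the work data set names iris and iriscsscp are hypothetical.

```sas
/* Sketch only: SASHELP.IRIS and the work data set names are assumptions. */
proc sort data=sashelp.iris out=iris;
   by Species;                      /* BY-group processing needs sorted input */
run;

/* The CSSCP option adds corrected SSCP matrices to the output data set.  */
/* The TYPE= data set option marks the output data set as TYPE=CSSCP.     */
proc corr data=iris noprint csscp outp=iriscsscp(type=csscp);
   by Species;                      /* becomes the CLASS variable later */
   var SepalLength SepalWidth PetalLength PetalWidth;
run;

proc candisc data=iriscsscp;
   class Species;
   var SepalLength SepalWidth PetalLength PetalWidth;
run;
```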

When the input data set is TYPE=CORR, TYPE=COV, or TYPE=CSSCP, then PROC CANDISC reads the number of observations for each class from the observations with _TYPE_='N' and the variable means in each class from the observations with _TYPE_='MEAN'. The CANDISC procedure then reads the within-class correlations from the observations with _TYPE_='CORR', the standard deviations from the observations with _TYPE_='STD' (data set TYPE=CORR), the within-class covariances from the observations with _TYPE_='COV' (data set TYPE=COV), or the within-class corrected sums of squares and crossproducts from the observations with _TYPE_='CSSCP' (data set TYPE=CSSCP).
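The TYPE=CORR case can be sketched as follows. This example is illustrative only: it assumes the SASHELP.IRIS sample data set is available, and the work data set names iris and iriscorr are hypothetical. The PROC PRINT step lists the observations with _TYPE_='N', 'MEAN', and 'STD' so that you can see the per-class rows described in the preceding paragraph.

```sas
/* Sketch only: SASHELP.IRIS and the work data set names are assumptions. */
proc sort data=sashelp.iris out=iris;
   by Species;
run;

/* OUTP= writes a TYPE=CORR data set with one set of rows per BY group. */
proc corr data=iris noprint outp=iriscorr;
   by Species;
   var SepalLength SepalWidth PetalLength PetalWidth;
run;

/* Inspect the rows that hold class counts, means, and standard deviations. */
proc print data=iriscorr;
   where _TYPE_ in ('N', 'MEAN', 'STD');
run;

/* The BY variable of the input data set is named in the CLASS statement. */
proc candisc data=iriscorr;
   class Species;
   var SepalLength SepalWidth PetalLength PetalWidth;
run;
```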


When the data set does not include any observations with _TYPE_='CORR' (data set TYPE=CORR), _TYPE_='COV' (data set TYPE=COV), or _TYPE_='CSSCP' (data set TYPE=CSSCP) for each class, PROC CANDISC reads the pooled within-class information from the data set. In this case, PROC CANDISC reads the pooled within-class correlations from the observations with _TYPE_='PCORR', the pooled within-class standard deviations from the observations with _TYPE_='PSTD' (data set TYPE=CORR), the pooled within-class covariances from the observations with _TYPE_='PCOV' (data set TYPE=COV), or the pooled within-class corrected SSCP matrix from the observations with _TYPE_='PSSCP' (data set TYPE=CSSCP).

When the input data set is TYPE=SSCP, PROC CANDISC reads the following: the number of observations for each class from the observations with _TYPE_='N'; the sum of weights of observations from the variable INTERCEPT in observations with _TYPE_='SSCP' and _NAME_='INTERCEPT'; the variable sums from the analysis variables in observations with _TYPE_='SSCP' and _NAME_='INTERCEPT'; and the uncorrected sums of squares and crossproducts from the analysis variables in observations with _TYPE_='SSCP' and _NAME_ equal to the name of an analysis variable.
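The TYPE=SSCP case can be sketched in the same way. The following example is a hedged illustration, assuming that the SASHELP.IRIS sample data set is available; the work data set names iris and irissscp are hypothetical. PROC REG is used here only to produce the OUTSSCP= data set, so no MODEL statement is specified.

```sas
/* Sketch only: SASHELP.IRIS and the work data set names are assumptions. */
proc sort data=sashelp.iris out=iris;
   by Species;
run;

/* OUTSSCP= writes a TYPE=SSCP data set with one SSCP matrix per BY group. */
proc reg data=iris outsscp=irissscp noprint;
   by Species;
   var SepalLength SepalWidth PetalLength PetalWidth;
run;
quit;

/* Look at the INTERCEPT row and column that carry the weights and sums. */
proc print data=irissscp;
run;

proc candisc data=irissscp;
   class Species;
   var SepalLength SepalWidth PetalLength PetalWidth;
run;
```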