By default, the HPCANDISC procedure begins by displaying the following output:
The "Performance Information" table, which is produced by default. It displays information about the execution mode. For single-machine mode, the table displays the number of threads used. For distributed mode, the table displays the grid mode (symmetric or asymmetric), the number of compute nodes, and the number of threads per node.
The "Data Access Information" table, which is produced by default. For the input and output data sets, it displays the libref and data set name, the engine used to access the data, the role (input or output) of the data set, and the path that data followed to reach the computation.
Summary information about the variables in the analysis that displays the total sample size, the number of quantitative variables, the number of class levels, and the number of degrees of freedom
The "Number of Observations" table, which displays the number of observations read from the input data set and the number of observations used in the analysis. If you specify a FREQ statement, the table also displays the sum of frequencies read and used.
The "Class Level Information" table, which displays, for each level of the classification variable, the frequency sum, weight sum, and proportion of the total sample
The optional output from PROC HPCANDISC includes the following:
Within-class SSCP matrices for each group
Pooled within-class SSCP matrix
Between-class SSCP matrix
Total-sample SSCP matrix
Within-class covariance matrices for each group
Pooled within-class covariance matrix
Between-class covariance matrix, equal to the between-class SSCP matrix divided by n(c - 1)/c, where n is the number of observations and c is the number of classes
Total-sample covariance matrix
Within-class correlation coefficients and Pr > |r| to test the hypothesis that the within-class population correlation coefficients are zero
Pooled within-class correlation coefficients and Pr > |r| to test the hypothesis that the partial population correlation coefficients are zero
Between-class correlation coefficients and Pr > |r| to test the hypothesis that the between-class population correlation coefficients are zero
Total-sample correlation coefficients and Pr > |r| to test the hypothesis that the total population correlation coefficients are zero
Simple statistics, including N (the number of observations), sum, mean, variance, and standard deviation for the total sample and within each class
Total-sample standardized class means, obtained by subtracting the grand mean from each class mean and dividing by the total sample standard deviation
Pooled within-class standardized class means, obtained by subtracting the grand mean from each class mean and dividing by the pooled within-class standard deviation
Pairwise squared distances between groups
Univariate test statistics, including total-sample standard deviations, pooled within-class standard deviations, between-class standard deviations, R square, R²/(1 - R²), F, and Pr > F (univariate F values and probability levels for one-way analyses of variance)
The "Timing" table, which displays the elapsed time for each main task of the procedure, if you specify the DETAILS option in the PERFORMANCE statement
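The univariate statistics and standardized class means in the optional output follow from ordinary one-way ANOVA sums of squares. The Python sketch below is only an illustration of those definitions for a single variable, not the procedure's implementation. It assumes unweighted observations with unit frequencies; the function name `univariate_stats` and the sample data are hypothetical.

```python
from math import sqrt

def univariate_stats(groups):
    """groups maps a class level to the list of values of one variable.

    Assumes unit weights and frequencies (no WEIGHT or FREQ adjustment).
    """
    values = [x for xs in groups.values() for x in xs]
    n, c = len(values), len(groups)
    grand_mean = sum(values) / n
    class_means = {g: sum(xs) / len(xs) for g, xs in groups.items()}

    # One-way ANOVA decomposition: total = within + between.
    ss_total = sum((x - grand_mean) ** 2 for x in values)
    ss_within = sum((x - class_means[g]) ** 2
                    for g, xs in groups.items() for x in xs)
    ss_between = ss_total - ss_within

    total_std = sqrt(ss_total / (n - 1))    # total-sample standard deviation
    pooled_std = sqrt(ss_within / (n - c))  # pooled within-class standard deviation
    r_square = ss_between / ss_total

    return {
        "total_std": total_std,
        "pooled_std": pooled_std,
        "r_square": r_square,
        "r2_ratio": r_square / (1.0 - r_square),
        "f_value": (ss_between / (c - 1)) / (ss_within / (n - c)),
        # Class means centered at the grand mean, scaled two ways.
        "total_std_means": {g: (m - grand_mean) / total_std
                            for g, m in class_means.items()},
        "pooled_std_means": {g: (m - grand_mean) / pooled_std
                             for g, m in class_means.items()},
    }

stats = univariate_stats({"a": [1, 2, 3], "b": [4, 5, 6]})
print(stats["r_square"], stats["f_value"])
```

The sketch requires some within-class variation; if every class is constant, the pooled standard deviation is zero and the ratios are undefined.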
By default, PROC HPCANDISC displays these statistics:
Multivariate statistics and F approximations, including Wilks’ lambda, Pillai’s trace, Hotelling-Lawley trace, and Roy’s greatest root with F approximations, numerator and denominator degrees of freedom (Num DF and Den DF), and probability values (Pr > F). Each of these four multivariate statistics tests the hypothesis that the class means are equal in the population. For more information, see the section Multivariate Tests in SAS/STAT 14.1 User's Guide.
Canonical correlations
Adjusted canonical correlations (Lawley 1959). These are asymptotically less biased than the raw correlations and can be negative. The adjusted canonical correlations might not be computable and are displayed as missing values if two canonical correlations are nearly equal or if some are close to zero. A missing value is also displayed if an adjusted canonical correlation is larger than a previous adjusted canonical correlation.
Approximate standard error of the canonical correlations
Squared canonical correlations
Eigenvalues of E⁻¹H, where E is the pooled within-class SSCP matrix and H is the between-class SSCP matrix. Each eigenvalue is equal to ρ²/(1 - ρ²), where ρ² is the corresponding squared canonical correlation, and can be interpreted as the ratio of between-class variation to pooled within-class variation for the corresponding canonical variable. The table includes eigenvalues, differences between successive eigenvalues, the proportion of the sum of the eigenvalues, and the cumulative proportion.
Likelihood ratio for the hypothesis that the current canonical correlation and all smaller ones are zero in the population. The likelihood ratio for the hypothesis that all canonical correlations equal zero is Wilks’ lambda.
Approximate F statistic based on Rao’s approximation to the distribution of the likelihood ratio (Rao 1973, p. 556; Kshirsagar 1972, p. 326)
Numerator degrees of freedom (Num DF), denominator degrees of freedom (Den DF), and Pr > F, the probability level associated with the F statistic
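The eigenvalue and likelihood ratio columns can both be derived from the squared canonical correlations: each eigenvalue is the squared correlation divided by one minus itself, and each likelihood ratio is the product of one minus the squared correlation over the current and all smaller canonical correlations. The following Python sketch illustrates that arithmetic under the assumption that the squared correlations are sorted in descending order and are all less than 1. The function names and input values are hypothetical, and the code is not the procedure's implementation.

```python
from itertools import accumulate

def eigen_table(sq_can_corr):
    """Return eigenvalues, successive differences, proportions, cumulative proportions."""
    eigenvalues = [r2 / (1.0 - r2) for r2 in sq_can_corr]
    total = sum(eigenvalues)
    differences = [a - b for a, b in zip(eigenvalues, eigenvalues[1:])]
    proportions = [ev / total for ev in eigenvalues]
    cumulative = list(accumulate(proportions))
    return eigenvalues, differences, proportions, cumulative

def likelihood_ratios(sq_can_corr):
    """Ratio i tests that canonical correlation i and all smaller ones are zero."""
    ratios = []
    running = 1.0
    # Build the products from the smallest correlation upward.
    for r2 in reversed(sq_can_corr):
        running *= 1.0 - r2
        ratios.append(running)
    return ratios[::-1]

sq = [0.75, 0.5]  # hypothetical squared canonical correlations
print(eigen_table(sq))        # eigenvalues are [3.0, 1.0]
print(likelihood_ratios(sq))  # [0.125, 0.5]; the first entry is Wilks' lambda
```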
You can suppress the following statistics by specifying the SHORT option:
Total canonical structure, giving total-sample correlations between the canonical variables and the original variables
Between canonical structure, giving between-class correlations between the canonical variables and the original variables
Pooled within canonical structure, giving pooled within-class correlations between the canonical variables and the original variables
Total-sample standardized canonical coefficients, standardized to give canonical variables that have zero mean and unit pooled within-class variance when applied to the total-sample standardized variables
Pooled within-class standardized canonical coefficients, standardized to give canonical variables that have zero mean and unit pooled within-class variance when applied to the pooled within-class standardized variables
Raw canonical coefficients, standardized to give canonical variables that have zero mean and unit pooled within-class variance when applied to the centered variables
Class means on the canonical variables
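As a rough illustration of how raw canonical coefficients relate to the class means on the canonical variables, the Python sketch below centers each observation at the variable means, applies one coefficient vector per canonical variable, and averages the resulting scores within each class. The function names, coefficients, and data are hypothetical; in particular, the coefficients here are arbitrary numbers chosen for readability and are not scaled to give unit pooled within-class variance.

```python
def canonical_scores(row, means, raw_coefs):
    """Apply raw coefficients (one list per canonical variable) to a centered row."""
    centered = [x - m for x, m in zip(row, means)]
    return [sum(b * z for b, z in zip(coefs, centered)) for coefs in raw_coefs]

def class_means_on_canonical(rows, labels, means, raw_coefs):
    """Average the canonical scores within each class level."""
    sums, counts = {}, {}
    for row, label in zip(rows, labels):
        scores = canonical_scores(row, means, raw_coefs)
        acc = sums.setdefault(label, [0.0] * len(scores))
        for i, s in enumerate(scores):
            acc[i] += s
        counts[label] = counts.get(label, 0) + 1
    return {g: [s / counts[g] for s in acc] for g, acc in sums.items()}

rows = [[1, 2], [3, 2], [5, 6], [7, 6]]   # hypothetical observations
labels = ["a", "a", "b", "b"]             # hypothetical class levels
means = [4, 4]                            # total-sample means of the two variables
raw_coefs = [[0.5, 0.0], [0.0, 0.25]]     # hypothetical raw coefficients
print(class_means_on_canonical(rows, labels, means, raw_coefs))
```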