Multivariate Analysis: Discriminant Analysis

Overview of Discriminant Analysis

For a set of observations that contains one or more interval variables and also a classification variable that defines groups of observations, discriminant analysis derives a discriminant criterion function to classify each observation into one of the groups.

When the distribution within each group is assumed to be multivariate normal, a parametric method can be used to develop a discriminant function. The discriminant function, also known as a classification criterion, is determined by a measure of generalized squared distance. The classification criterion can be based on either the individual within-group covariance matrices (yielding a quadratic function) or the pooled covariance matrix (yielding a linear function). The criterion also takes into account the prior probabilities of the groups.
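As an illustration of the parametric criterion, the following Python sketch computes a generalized squared distance of the commonly used form D²(x) = (x − m)' V⁻¹ (x − m) + ln|V| − 2 ln(q), where m is a group mean, V is either the within-group or the pooled covariance matrix, and q is the group's prior probability. The log-determinant term is included only when within-group matrices are used. This is a minimal sketch and not the DISCRIM implementation: the function names (fit_groups, generalized_sq_distance, classify) are hypothetical, and priors proportional to group sizes are an assumption of the example.

```python
import numpy as np

def fit_groups(X, y):
    """Estimate group means, covariance matrices, pooled covariance, and priors."""
    labels = np.unique(y)
    means = {g: X[y == g].mean(axis=0) for g in labels}
    covs = {g: np.atleast_2d(np.cov(X[y == g], rowvar=False)) for g in labels}
    counts = {g: int((y == g).sum()) for g in labels}
    n, k = len(y), len(labels)
    # Pooled covariance: within-group matrices weighted by degrees of freedom.
    pooled = sum((counts[g] - 1) * covs[g] for g in labels) / (n - k)
    # Assumption: priors proportional to the observed group sizes.
    priors = {g: counts[g] / n for g in labels}
    return labels, means, covs, pooled, priors

def generalized_sq_distance(x, mean, cov, prior, add_log_det):
    """Mahalanobis term, plus ln|V| for the quadratic case, minus 2 ln(prior)."""
    diff = x - mean
    d2 = diff @ np.linalg.solve(cov, diff)
    g1 = np.linalg.slogdet(cov)[1] if add_log_det else 0.0
    g2 = -2.0 * np.log(prior)
    return d2 + g1 + g2

def classify(x, model, pool=True):
    """Assign x to the group with the smallest generalized squared distance.

    pool=True uses the pooled covariance (linear function);
    pool=False uses within-group covariances (quadratic function).
    """
    labels, means, covs, pooled, priors = model
    d = np.array([
        generalized_sq_distance(
            x, means[g], pooled if pool else covs[g], priors[g], not pool
        )
        for g in labels
    ])
    # Posterior probabilities proportional to exp(-0.5 * D^2),
    # shifted by the minimum distance for numerical stability.
    post = np.exp(-0.5 * (d - d.min()))
    post /= post.sum()
    return labels[np.argmin(d)], post
```

Calling classify(x, fit_groups(X, y), pool=False) switches from the linear to the quadratic criterion. When all priors are equal, the prior term is the same for every group and does not affect which group is chosen.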

When no assumptions can be made about the distribution within each group, or when the distribution is not assumed to be multivariate normal, nonparametric methods can be used to estimate the group-specific densities. These methods include the kernel and k-nearest-neighbor methods.
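The two nonparametric approaches can be sketched in Python as follows. The example estimates each group's density at a point either with a normal kernel of radius r or from the share of the k nearest neighbors that belong to the group, and then weights the estimate by the group's prior. It is a simplified sketch under stated assumptions: it uses plain Euclidean distance rather than a covariance-based distance, takes priors proportional to group sizes, and the function names (kernel_density, kernel_classify, knn_classify) are hypothetical.

```python
import numpy as np

def kernel_density(x, points, r):
    """Normal-kernel density estimate at x from one group's points (radius r)."""
    p = points.shape[1]
    sq = np.sum((points - x) ** 2, axis=1)
    norm = (2.0 * np.pi) ** (p / 2.0) * r ** p
    return np.mean(np.exp(-0.5 * sq / r ** 2)) / norm

def kernel_classify(x, X, y, r=1.0):
    """Classify x by prior-weighted kernel density estimates."""
    labels = np.unique(y)
    # Assumption: priors proportional to the observed group sizes.
    priors = np.array([(y == g).mean() for g in labels])
    dens = np.array([kernel_density(x, X[y == g], r) for g in labels])
    post = priors * dens
    # Note: if x is far from all data, the densities can underflow to zero.
    post = post / post.sum()
    return labels[np.argmax(post)], post

def knn_classify(x, X, y, k=5):
    """Classify x from the group membership of its k nearest neighbors."""
    labels = np.unique(y)
    counts = np.array([(y == g).sum() for g in labels])
    priors = counts / len(y)
    nearest = np.argsort(np.sum((X - x) ** 2, axis=1))[:k]
    # k_t: number of the k nearest neighbors that fall in each group.
    k_t = np.array([(y[nearest] == g).sum() for g in labels])
    # Group density estimate is proportional to k_t / n_t.
    post = priors * k_t / counts
    post = post / post.sum()
    return labels[np.argmax(post)], post
```

Both functions return the chosen group together with the estimated posterior probabilities. The radius r and the neighbor count k control how smooth the density estimates are, so results can be sensitive to these choices.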

You can run the discriminant analysis by selecting Analysis → Multivariate Analysis → Discriminant Analysis from the main menu. The analysis is implemented by calling the DISCRIM procedure in SAS/STAT software. See the documentation for the DISCRIM procedure in the SAS/STAT User's Guide for additional details.