Sample 30662: Mahalanobis distance: from each observation to the mean, from each observation to a specific observation, between all possible pairs


Overview

This sample shows one way of computing Mahalanobis distance in each of the following scenarios:

  • from each observation to the mean
  • from each observation to a specific observation
  • from each observation to all other observations (all possible pairs)

1) To compute the Mahalanobis distance from each observation to the mean, first run PROC PRINCOMP with the STD option. The OUT= data set then contains principal component scores that have mean zero and an identity covariance matrix, so for these scores the Mahalanobis distance and the Euclidean distance are the same. Then use a DATA step with a statement such as:

   mahalanobis_distance_to_mean = sqrt(uss(of prin:));

to compute the required distance.
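
A minimal sketch of step 1, assuming the SASHELP.IRIS data set and its four measurement variables as example input (this choice of data, and the data set names SCORES and DIST_TO_MEAN, are illustrative assumptions, not part of the sample itself). It also assumes that all principal components are kept and that the covariance matrix is nonsingular:

```sas
/* STD rescales the principal component scores to unit variance, so the
   scores in OUT= are centered at zero with an identity covariance matrix. */
proc princomp data=sashelp.iris std out=scores noprint;
   var SepalLength SepalWidth PetalLength PetalWidth;
run;

/* The Euclidean length of each score vector is the Mahalanobis distance
   of that observation from the mean. USS returns the sum of squares. */
data dist_to_mean;
   set scores;
   mahalanobis_distance_to_mean = sqrt(uss(of prin:));
run;
```

Because the Mahalanobis distance does not depend on the scale of the input variables, it should not matter here whether PROC PRINCOMP analyzes the correlation matrix (its default) or the covariance matrix.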

2) To compute the Mahalanobis distance from each observation to a specific point, compute the principal component scores for that point using the original scoring coefficients. Then compute the Euclidean distance from each observation's scores to the scores of the reference point. One easy way to do this is to use PROC FASTCLUS, supplying the reference point as the only seed in the SEED= data set.
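
A possible sketch of step 2 for the simpler case in which the reference point is itself one of the observations, so that its scores are already present in the PROC PRINCOMP OUT= data set and no rescoring is needed. The input data (SASHELP.IRIS), the choice of observation 10 as the reference, and the data set names are all illustrative assumptions. The sketch also assumes that, with MAXITER=0, PROC FASTCLUS leaves the seed unchanged, so that the DISTANCE variable in OUT= is the distance to the reference point:

```sas
/* Standardized principal component scores (identity covariance matrix). */
proc princomp data=sashelp.iris std out=scores noprint;
   var SepalLength SepalWidth PetalLength PetalWidth;
run;

/* Hypothetical choice: take observation 10 as the reference point and
   keep only its scores, which serve as the single cluster seed. */
data ref;
   set scores(firstobs=10 obs=10 keep=Prin1-Prin4);
run;

/* One seed and no iterations: every observation is assigned to the
   reference point, and OUT= records its Euclidean distance to that seed. */
proc fastclus data=scores seed=ref maxclusters=1 maxiter=0 noprint
              out=dist_to_ref(rename=(distance=mahalanobis_distance_to_ref));
   var Prin1-Prin4;
run;
```

For a reference point that is not in the data, its scores would first have to be computed from the original scoring coefficients, as described above.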

3) To compute Mahalanobis distances between all possible pairs, run PROC DISTANCE with METHOD=EUCLID on the OUT= data set created by PROC PRINCOMP in step 1. PROC DISTANCE computes the distance between every pair of observations.
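
A minimal sketch of step 3, again assuming SASHELP.IRIS as example input and illustrative data set names (SCORES, PAIRS):

```sas
/* Standardized principal component scores (identity covariance matrix). */
proc princomp data=sashelp.iris std out=scores noprint;
   var SepalLength SepalWidth PetalLength PetalWidth;
run;

/* Euclidean distances between all pairs of score vectors, which equal
   the Mahalanobis distances between the original observations. */
proc distance data=scores method=euclid out=pairs;
   var interval(Prin1-Prin4);
run;
```

The OUT= data set holds the distances in matrix form, one row per observation. The SHAPE= option of PROC DISTANCE controls whether a triangular or a full square matrix is stored.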

These sample files and code examples are provided by SAS Institute Inc. "as is" without warranty of any kind, either express or implied, including but not limited to the implied warranties of merchantability and fitness for a particular purpose. Recipients acknowledge and agree that SAS Institute shall not be liable for any damages whatsoever arising out of their use of this material. In addition, SAS Institute will provide no support for the materials contained herein.