This sample shows one way of computing Mahalanobis distance in each of the following scenarios:
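1) To compute the Mahalanobis distance from each observation to the mean, first run PROC PRINCOMP with the STD option to produce principal component scores in the OUT= data set with an identity covariance matrix. (Hence the Mahalanobis distance and the Euclidean distance are the same for these scores.) Then use a DATA step with a statement such as
mahalanobis_distance_to_mean = sqrt(uss(of prin:));
to complete the required distance.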
2) To compute the Mahalanobis distance from each observation to a specific point, compute the principal component score for that point using the original scoring coefficients. Then compute the Euclidean distance from each observation to the reference point. One easy way to do this is to use PROC FASTCLUS treating the reference point as the SEED.
3) To compute Mahalanobis distances between all possible pairs of observations, run PROC DISTANCE on the OUT= data set created by PROC PRINCOMP in the steps above. PROC DISTANCE computes the distance for every pair of observations.
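The three scenarios above rest on one identity: Euclidean distance between standardized principal component scores equals Mahalanobis distance between the original points. The derivation below is a sketch under stated assumptions, not part of the original sample. The symbols (x_i, r, S, V, Lambda, t_i) are notation introduced here for illustration; it assumes the covariance matrix S is nonsingular and that all p components are retained.

```latex
% Assumed notation: x_i = observation i, \bar{x} = sample mean,
% r = reference point, S = nonsingular sample covariance matrix (p x p).
\begin{align*}
S &= V \Lambda V^{\top}, \qquad V^{\top} V = I, \quad
     \Lambda = \operatorname{diag}(\lambda_1, \dots, \lambda_p), \ \lambda_j > 0 \\
% Standardized scores: each component is scaled to unit variance.
t_i &= \Lambda^{-1/2} V^{\top} (x_i - \bar{x}), \qquad
     \operatorname{Cov}(t) = \Lambda^{-1/2} V^{\top} S \, V \Lambda^{-1/2} = I \\
% Scenario 1: distance to the mean.
\lVert t_i \rVert^{2} &= (x_i - \bar{x})^{\top} V \Lambda^{-1} V^{\top} (x_i - \bar{x})
     = (x_i - \bar{x})^{\top} S^{-1} (x_i - \bar{x}) \\
% Scenario 2: score the reference point with the same coefficients.
t_r &= \Lambda^{-1/2} V^{\top} (r - \bar{x}), \qquad
\lVert t_i - t_r \rVert^{2} = (x_i - r)^{\top} S^{-1} (x_i - r) \\
% Scenario 3: all pairs of observations.
\lVert t_i - t_j \rVert^{2} &= (x_i - x_j)^{\top} S^{-1} (x_i - x_j)
\end{align*}
```

PROC PRINCOMP analyzes the correlation matrix unless the COV option is specified. The same identity holds in that case, because Mahalanobis distance is unchanged when each variable is rescaled.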
These sample files and code examples are provided by SAS Institute Inc. "as is" without warranty of any kind, either express or implied, including but not limited to the implied warranties of merchantability and fitness for a particular purpose. Recipients acknowledge and agree that SAS Institute shall not be liable for any damages whatsoever arising out of their use of this material. In addition, SAS Institute will provide no support for the materials contained herein.
title 'Mahalanobis Distance';
options pagno=1 dtreset nodate;
/* SAS Institute Inc.
updated: 8/29/07
License Agreement for Corrective Code or Additional Functionality
SAS INSTITUTE INC. IS PROVIDING YOU WITH THE COMPUTER SOFTWARE CODE INCLUDED
WITH THIS AGREEMENT ("CODE") ON AN "AS IS" BASIS, AND AUTHORIZES YOU TO USE THE
CODE SUBJECT TO THE TERMS HEREOF. BY USING THE CODE, YOU AGREE TO THESE TERMS.
YOUR USE OF THE CODE IS AT YOUR OWN RISK. SAS INSTITUTE INC. MAKES NO
REPRESENTATION OR WARRANTY, EXPRESS OR IMPLIED, INCLUDING, BUT NOT LIMITED TO,
WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE, NONINFRINGEMENT
AND TITLE, WITH RESPECT TO THE CODE.
The Code is intended to be used solely as part of a product ("Software") you
currently have licensed from SAS Institute Inc. or one of its subsidiaries or
authorized agents ("SAS"). The Code is designed to either correct an error in
the Software or to add functionality to the Software, but has not necessarily
been tested. Accordingly, SAS makes no representation or warranty that the Code
will operate error-free. SAS is under no obligation to maintain or support the
Code.
Neither SAS nor its licensors shall be liable to you or any third party for any
general, special, direct, indirect, consequential, incidental or other damages
whatsoever arising out of or related to your use or inability to use the Code,
even if SAS has been advised of the possibility of such damages.
Except as otherwise provided above, the Code is governed by the same agreement
that governs the Software. If you do not have an existing agreement with SAS
governing the Software, you may not use the Code.
SAS and all other SAS Institute Inc. product or service names
are registered trademarks or trademarks of SAS Institute Inc. in the USA and
other countries. (r) indicates USA registration. Other brand and product names
are registered trademarks or trademarks of their respective companies.
*/
/*
To get the Mahalanobis distance of each observation
to the mean, first run PRINCOMP with the STD option to produce
principal component scores in the OUT= data set with
an identity covariance matrix. (Hence the Mahalanobis distance and
Euclidean distance are the same for these scores.) Then use a DATA step
with a statement such as Mah_dist=sqrt(uss(of PRIN1-PRINn)); to
complete the required distance.
To get the Mahalanobis distance of each observation to a specific point,
compute the principal component score for that point
using the original scoring coefficients. Then compute the
Euclidean distance from each observation to the reference point.
One easy way to do this is to use PROC FASTCLUS treating the reference
point as the SEED.
To get Mahalanobis distances between all possible pairs, run PROC DISTANCE
on the OUT= data set as created by PRINCOMP in the steps above.
*/
data points;
   drop i;
   do i=1 to 10;
      id=put(i,$9.);
      x=rannor(34343);
      y=rannor(12345);
      z=rannor(54321);
      output;
   end;
run;
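title2 'find Mahalanobis distance from each point to the mean';
/* STD scales the principal component scores to unit variance; */
/* OUTSTAT= saves the scoring coefficients for later use       */
proc princomp data=points std out=out outstat=outstat noprint;
   var x y z;
run;
/* Euclidean length of the standardized scores = Mahalanobis distance to the mean */
data mahalanobis_to_mean;
   set out;
   mahalanobis_distance_to_mean = sqrt(uss(of prin:));
run;
proc print data=mahalanobis_to_mean uniform noobs;
   id id;
run;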
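title2 'find Mahalanobis distance from each point to a reference point';
data reference_point;
   x=-1; y=-2; z=-2.5;
   id='reference';
run;
/* Score the reference point with the original scoring coefficients */
proc score data=reference_point score=outstat out=reference_point;
   var x y z;
run;
/* Append the scored observations so that the reference point is the first observation */
proc append data=out base=reference_point;
run;
/* MAXC=1, REPLACE=NONE, and MAXITER=0 keep the first observation as the single fixed seed, */
/* so DISTANCE is the distance from each observation to the reference point                 */
proc fastclus data=reference_point maxc=1 replace=none maxiter=0 noprint
              out=mahalanobis_to_point(rename=(distance=mahalanobis_distance_to_point)
                                       drop=cluster);
   var prin:;
run;
proc print data=mahalanobis_to_point uniform noobs;
   id id;
run;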
title2 'find Mahalanobis distances between all possible pairs of points';
proc distance data=out out=distance_matrix;
   var interval(prin:);
   id id;
run;
proc print data=distance_matrix uniform noobs;
run;
Type: Sample
Date Modified: 2008-06-16 12:59:08
Date Created: 2007-11-30 13:34:37
Product Family | Product | Host | Starting SAS Release | Ending SAS Release
SAS System | SAS/STAT | z/OS | 9.1 TS1M0 |
SAS System | SAS/STAT | Microsoft® Windows® for 64-Bit Itanium-based Systems | 9.1 TS1M0 |
SAS System | SAS/STAT | Microsoft Windows Server 2003 Datacenter 64-bit Edition | 9.1 TS1M0 |
SAS System | SAS/STAT | Microsoft Windows Server 2003 Enterprise 64-bit Edition | 9.1 TS1M0 |
SAS System | SAS/STAT | Microsoft Windows 2000 Advanced Server | 9.1 TS1M0 |
SAS System | SAS/STAT | Microsoft Windows 2000 Datacenter Server | 9.1 TS1M0 |
SAS System | SAS/STAT | Microsoft Windows 2000 Server | 9.1 TS1M0 |
SAS System | SAS/STAT | Microsoft Windows 2000 Professional | 9.1 TS1M0 |
SAS System | SAS/STAT | Microsoft Windows NT Workstation | 9.1 TS1M0 |
SAS System | SAS/STAT | Microsoft Windows Server 2003 Datacenter Edition | 9.1 TS1M0 |
SAS System | SAS/STAT | Microsoft Windows Server 2003 Enterprise Edition | 9.1 TS1M0 |
SAS System | SAS/STAT | Microsoft Windows Server 2003 Standard Edition | 9.1 TS1M0 |
SAS System | SAS/STAT | Microsoft Windows XP Professional | 9.1 TS1M0 |
SAS System | SAS/STAT | 64-bit Enabled AIX | 9.1 TS1M0 |
SAS System | SAS/STAT | 64-bit Enabled HP-UX | 9.1 TS1M0 |
SAS System | SAS/STAT | 64-bit Enabled Solaris | 9.1 TS1M0 |
SAS System | SAS/STAT | HP-UX IPF | 9.1 TS1M0 |
SAS System | SAS/STAT | Linux | 9.1 TS1M0 |
SAS System | SAS/STAT | OpenVMS Alpha | 9.1 TS1M0 |
SAS System | SAS/STAT | Tru64 UNIX | 9.1 TS1M0 |