Example 9.1 Correlation
The following statements show how you can define modules that compute the correlation coefficients between numeric variables and the standardized values for a set of data. For more efficient computations, use the built-in CORR function to compute correlations and the built-in STD function to compute the standard deviations that are needed for standardizing.
proc iml;
/* Module to compute correlations */
start corr;
n = nrow(x); /* number of observations */
sum = x[+,] ; /* compute column sums */
xpx = t(x)*x-t(sum)*sum/n; /* compute sscp matrix */
s = diag(1/sqrt(vecdiag(xpx))); /* scaling matrix */
corr = s*xpx*s; /* correlation matrix */
print "Correlation Matrix",,corr[rowname=nm colname=nm] ;
finish corr;
/* Module to standardize data */
start std;
n = nrow(x); /* number of observations (so STD does not rely on CORR having run first) */
mean = x[+,] /n; /* means for columns */
x = x-repeat(mean,n,1); /* center x to mean zero */
ss = x[##,] ; /* sum of squares for columns */
std = sqrt(ss/(n-1)); /* standard deviation estimate*/
x = x*diag(1/std); /* scaling to std dev 1 */
print ,"Standardized Data",,X[colname=nm] ;
finish std;
/* Sample run */
x = { 1 2 3,
3 2 1,
4 2 1,
0 4 1,
24 1 0,
1 3 8};
nm={age weight height};
run corr;
run std;
The results are shown in Output 9.1.1.
Output 9.1.1
Correlation Coefficients and Standardized Values
Correlation Matrix

               AGE     WEIGHT     HEIGHT
AGE              1  -0.717102  -0.436558
WEIGHT   -0.717102          1  0.3508232
HEIGHT   -0.436558  0.3508232          1

Standardized Data

       AGE     WEIGHT     HEIGHT
 -0.490116  -0.322749  0.2264554
 -0.272287  -0.322749  -0.452911
 -0.163372  -0.322749  -0.452911
  -0.59903  1.6137431  -0.452911
 2.0149206  -1.290994  -0.792594
 -0.490116  0.6454972   1.924871
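As a point of comparison, the built-in functions mentioned at the start of this example can reproduce both results in a few statements. The following is a minimal sketch, not part of the original example. It assumes a SAS/IML release that provides the CORR, MEAN, and STD functions, and it assumes that elementwise operations between a matrix and a row vector are applied column by column. The names r and z are illustrative only.

```sas
proc iml;
/* Same sample data as in the example above */
x = { 1 2 3,
      3 2 1,
      4 2 1,
      0 4 1,
     24 1 0,
      1 3 8};
nm = {age weight height};

/* CORR returns the Pearson correlation matrix of the columns of x */
r = corr(x);

/* MEAN and STD return row vectors of column means and column      */
/* standard deviations (n-1 denominator); subtracting and dividing */
/* by a row vector is assumed to operate on each column            */
z = (x - mean(x)) / std(x);

print "Correlation Matrix",, r[rowname=nm colname=nm];
print ,"Standardized Data",, z[colname=nm];
quit;
```

If these assumptions hold, r and z should agree with the matrices printed by the CORR and STD modules, apart from display formatting. Unlike the STD module, this sketch leaves x unchanged and stores the standardized values in a separate matrix.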
Copyright © SAS Institute Inc. All rights reserved.