General Statistics Examples

Example 8.1: Correlation

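The following statements define two modules. The CORR module computes the correlation coefficients between the numeric variables (the columns of x), and the STD module standardizes each column to mean 0 and standard deviation 1. Neither module takes arguments, so both read the matrices x and nm from the calling environment, and the STD module overwrites x with the standardized values: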

  
       /* Module to compute correlations  */ 
    start corr; 
       n=nrow(x);                      /* number of observations */ 
       sum=x[+,] ;                        /* compute column sums */ 
       xpx=t(x)*x-t(sum)*sum/n;         /* compute sscp matrix   */ 
       s=diag(1/sqrt(vecdiag(xpx)));           /* scaling matrix */ 
       corr=s*xpx*s;                       /* correlation matrix */ 
       print "Correlation Matrix",,corr[rowname=nm colname=nm] ; 
    finish corr; 
  
       /* Module to standardize data */ 
    start std; 
       n=nrow(x);                      /* number of observations */ 
       mean=x[+,] /n;                       /* means for columns */ 
       x=x-repeat(mean,n,1);            /* center x to mean zero */ 
       ss=x[##,] ;                 /* sum of squares for columns */ 
       std=sqrt(ss/(n-1));         /* standard deviation estimate*/ 
       x=x*diag(1/std);                  /* scaling to std dev 1 */ 
       print ,"Standardized Data",,X[colname=nm] ; 
    finish std; 
  
  
  
  
       /* Sample run */ 
    x = { 1 2 3, 
          3 2 1, 
          4 2 1, 
          0 4 1, 
         24 1 0, 
          1 3 8}; 
    nm={age weight height}; 
    run corr; 
    run std;
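
In conventional matrix notation, the two modules can be read as follows. This is a restatement of the code above, not part of the original example; the symbols t, C, S, R, and z are introduced here for illustration only, with X the n-by-p data matrix and 1 an n-vector of ones.

```latex
% CORR module (code names in parentheses)
\[
\begin{aligned}
\mathbf{t} &= \mathbf{1}^{\top} X
  && \text{column sums (\texttt{sum})} \\
C &= X^{\top} X - \tfrac{1}{n}\,\mathbf{t}^{\top}\mathbf{t}
  && \text{corrected SSCP matrix (\texttt{xpx})} \\
S &= \operatorname{diag}\!\left(1/\sqrt{C_{11}},\,\dots,\,1/\sqrt{C_{pp}}\right)
  && \text{scaling matrix (\texttt{s})} \\
R &= S\,C\,S, \qquad R_{jk} = \frac{C_{jk}}{\sqrt{C_{jj}\,C_{kk}}}
  && \text{correlation matrix (\texttt{corr})}
\end{aligned}
\]

% STD module
\[
\bar{x}_j = \frac{1}{n}\sum_{i=1}^{n} x_{ij}, \qquad
s_j = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n} \left(x_{ij}-\bar{x}_j\right)^2}, \qquad
z_{ij} = \frac{x_{ij}-\bar{x}_j}{s_j}
\]
```

Because each entry of R is a ratio, dividing C by n-1 first (to obtain the covariance matrix) would leave R unchanged, which is why the CORR module works directly from the SSCP matrix.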
 
The results are shown in Output 8.1.1.

Output 8.1.1: Correlation Coefficients and Standardized Values

Correlation Matrix

CORR
              AGE     WEIGHT     HEIGHT
AGE             1  -0.717102  -0.436558
WEIGHT  -0.717102          1  0.3508232
HEIGHT  -0.436558  0.3508232          1

Standardized Data

X
      AGE     WEIGHT     HEIGHT
-0.490116  -0.322749  0.2264554
-0.272287  -0.322749  -0.452911
-0.163372  -0.322749  -0.452911
 -0.59903  1.6137431  -0.452911
2.0149206  -1.290994  -0.792594
-0.490116  0.6454972   1.924871
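
As a hand check on the listing, the AGE and WEIGHT entries can be recomputed from the sample data with the same formulas the modules use. This worked calculation is added here for illustration and uses only the six observations in the sample run; the symbol C denotes the corrected SSCP matrix (xpx in the code).

```latex
\[
\begin{aligned}
C_{\text{AGE},\text{AGE}} &= (1 + 9 + 16 + 0 + 576 + 1) - \tfrac{33^2}{6} = 603 - 181.5 = 421.5 \\
C_{\text{WEIGHT},\text{WEIGHT}} &= (4 + 4 + 4 + 16 + 1 + 9) - \tfrac{14^2}{6} = 38 - 32.6667 = 5.3333 \\
C_{\text{AGE},\text{WEIGHT}} &= (2 + 6 + 8 + 0 + 24 + 3) - \tfrac{33 \cdot 14}{6} = 43 - 77 = -34 \\
R_{\text{AGE},\text{WEIGHT}} &= \frac{-34}{\sqrt{421.5 \times 5.3333}} = \frac{-34}{\sqrt{2248}} \approx -0.7171 \\[6pt]
\bar{x}_{\text{AGE}} &= \tfrac{33}{6} = 5.5, \qquad
s_{\text{AGE}} = \sqrt{421.5/5} \approx 9.1815 \\
z_{1,\text{AGE}} &= \frac{1 - 5.5}{9.1815} \approx -0.4901
\end{aligned}
\]
```

Both values agree with the listing: -0.717102 in the correlation matrix and -0.490116 in the first row of the standardized data.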


