The example titled "Combining Correlation Coefficients" in the PROC MIANALYZE documentation shows how to use Fisher's z transformation to combine estimates of the correlation between two variables across a set of imputed data sets. This note extends that example to the correlations among several variables, that is, to a full correlation matrix.
The following observations were taken from men involved in a physical fitness course at N.C. State University. Three variables were recorded for each participant: oxygen intake (Oxygen) in ml per kg of body weight per minute, time to run 1.5 miles (RunTime) in minutes, and heart rate while running (RunPulse).
data FitMiss;
   input Oxygen RunTime RunPulse @@;
   datalines;
44.609 11.37 178   45.313 10.07 185   54.297  8.65 156
59.571   .     .   49.874  9.22   .   44.811 11.63 176
  .    11.95 176     .    10.85   .   39.442 13.08 174
60.055  8.63 170   50.541   .     .   37.388 14.03 186
44.754 11.12 176   47.273   .     .   51.855 10.33 166
49.156  8.95 180   40.836 10.95 168   46.672 10.00   .
46.774 10.25   .   50.388 10.08 168   39.407 12.63 174
46.080 11.17 156   45.441  9.63 164     .     8.92   .
45.118 11.08   .   39.203 12.88 168   45.790 10.47 186
50.545  9.93 148   48.673  9.40 186   47.920 11.50 170
47.467 10.50 170
;
The following statements run PROC MI to impute the missing values (results are not shown) and create the OUTMI data set containing the default five imputed data sets.
proc mi data=FitMiss seed=3237851 noprint out=outmi;
   var Oxygen RunTime RunPulse;
run;
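The number of imputations can also be stated explicitly instead of relying on the default. The following is a minimal sketch that assumes the same seed and variables as above. NIMPUTE=5 simply spells out the five imputations that this example relies on, which may be useful if the default differs in your release. The PROC FREQ step is an optional check that is not part of the original example.

```sas
/* Same imputation step, with the number of imputations stated explicitly. */
proc mi data=FitMiss seed=3237851 nimpute=5 noprint out=outmi;
   var Oxygen RunTime RunPulse;
run;

/* Optional check: count the observations in each imputed data set. */
proc freq data=outmi;
   tables _Imputation_ / nocum nopercent;
run;
```

Each value of _Imputation_ should show the same number of observations as the FitMiss data set.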
These statements use PROC CORR to compute the correlation coefficients among the three variables and their associated Fisher's z statistics for each imputed data set. The ODS OUTPUT statement saves the Fisher's z statistics in an output data set.
proc corr data=outmi fisher(biasadj=no);
   var Oxygen RunTime RunPulse;
   by _Imputation_;
   ods output FisherPearsonCorr=outz;
run;
The following statements compute the standard error associated with each z statistic, StdZ = (n-3)^(-1/2), where n is the number of observations (NObs). They also create the variable Pair, which identifies the pair of variables associated with each correlation coefficient.
data outz;
   set outz;
   StdZ = 1/sqrt(NObs-3);
   Pair = catx(' ',var,withvar);
run;
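As a cross-check on these formulas, the following sketch computes Fisher's z and its standard error by hand for a single made-up correlation. The values r = 0.5 and n = 31 are illustrative assumptions, not results from the imputed data, and the variable names are hypothetical. The step uses the identity z = (1/2) log((1+r)/(1-r)) and confirms that TANH inverts it.

```sas
data _null_;
   r = 0.5;                            /* hypothetical sample correlation   */
   n = 31;                             /* hypothetical number of observations */
   z      = 0.5*log((1 + r)/(1 - r));  /* Fisher's z transformation         */
   StdZ   = 1/sqrt(n - 3);             /* approximate standard error of z   */
   r_back = tanh(z);                   /* inverse transform, should equal r */
   put z= StdZ= r_back=;               /* write the values to the SAS log   */
run;
```

Note that the standard error depends only on n, so it is the same for every pair and every imputation when all imputed data sets have the same number of observations.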
To combine the correlation coefficients, the data must be sorted by Pair and then by _Imputation_. The PROC PRINT step displays the structure of the resulting data set.
proc sort data=outz;
   by Pair _Imputation_;
run;

proc print data=outz noobs;
   title 'Fisher''s Correlation Statistics';
   by Pair;
   var _Imputation_ Corr ZVal;
run;
These statements use PROC MIANALYZE to generate combined Fisher's z values and their variances. The BY statement runs the procedure for each Pair of variables to get the combined Fisher's z coefficient for each correlation. The ODS OUTPUT statement saves these estimates in an output data set.
proc mianalyze data=outz;
   by Pair;
   ods output ParameterEstimates=parms;
   modeleffects ZVal;
   stderr StdZ;
run;
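To see what the combination step does, the following sketch applies the standard multiple-imputation combining rules by hand: the combined estimate is the mean of the z values across imputations, and the total variance is T = W + (1 + 1/m)B, where W is the average within-imputation variance, B is the between-imputation variance, and m is the number of imputations. It assumes that OUTZ is already sorted by Pair and contains StdZ, as created above. The data set names (zvar, zsum, rubin) and the variable names (VarZ, m, zbar, W, B, T, StdErr_T) are introduced here for illustration only.

```sas
/* Within-imputation variance of each z value. */
data zvar;
   set outz;
   VarZ = StdZ**2;
run;

/* Per pair: number of imputations, mean z, between- and within-imputation variance. */
proc means data=zvar noprint;
   by Pair;
   var ZVal VarZ;
   output out=zsum n(ZVal)=m mean(ZVal)=zbar var(ZVal)=B mean(VarZ)=W;
run;

/* Total variance and back-transformed correlation for each pair. */
data rubin;
   set zsum;
   T        = W + (1 + 1/m)*B;   /* total variance of the combined z */
   StdErr_T = sqrt(T);           /* standard error of the combined z */
   r        = tanh(zbar);        /* combined correlation estimate    */
   keep Pair m zbar W B T StdErr_T r;
run;

proc print data=rubin noobs;
run;
```

The values of zbar and StdErr_T should agree with the Estimate and StdErr columns that PROC MIANALYZE writes to the PARMS data set.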
Finally, the inverse transform (TANH) is applied to the z statistics to obtain the correlation coefficients. The following statements generate and display the correlation coefficient estimates along with their p-values.
data corr_est;
   set parms;
   r = tanh(Estimate);
   rename estimate=Fishersz;
run;

proc print data=corr_est noobs;
   var Pair r stderr Fishersz probt;
   title 'Final Combined Correlation Estimates and p-values';
run;
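Confidence limits for the correlations can be obtained in the same way, by applying TANH to the confidence limits of the combined z values. The following sketch assumes that the ParameterEstimates table saved in PARMS includes the variables LCLMean and UCLMean and that the default 95% confidence level is in effect. The names corr_ci, r_lower, and r_upper are chosen here for illustration.

```sas
/* Back-transform the combined z estimate and its confidence limits. */
data corr_ci;
   set parms;
   r       = tanh(Estimate);
   r_lower = tanh(LCLMean);
   r_upper = tanh(UCLMean);
run;

proc print data=corr_ci noobs;
   var Pair r r_lower r_upper;
   title 'Combined Correlation Estimates with Confidence Limits';
run;
```

Because TANH is monotone increasing, the transformed limits keep their order, although the resulting interval is generally not symmetric about r.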
Product Family | Product | System | SAS Release Reported | SAS Release Fixed* |
SAS System | SAS/STAT | z/OS | | |
| | OpenVMS VAX | | |
| | Microsoft® Windows® for 64-Bit Itanium-based Systems | | |
| | Microsoft Windows Server 2003 Datacenter 64-bit Edition | | |
| | Microsoft Windows Server 2003 Enterprise 64-bit Edition | | |
| | Microsoft Windows XP 64-bit Edition | | |
| | Microsoft® Windows® for x64 | | |
| | OS/2 | | |
| | Microsoft Windows 95/98 | | |
| | Microsoft Windows 2000 Advanced Server | | |
| | Microsoft Windows 2000 Datacenter Server | | |
| | Microsoft Windows 2000 Server | | |
| | Microsoft Windows 2000 Professional | | |
| | Microsoft Windows NT Workstation | | |
| | Microsoft Windows Server 2003 Datacenter Edition | | |
| | Microsoft Windows Server 2003 Enterprise Edition | | |
| | Microsoft Windows Server 2003 Standard Edition | | |
| | Microsoft Windows Server 2008 | | |
| | Microsoft Windows XP Professional | | |
| | Windows 7 Enterprise 32 bit | | |
| | Windows 7 Enterprise x64 | | |
| | Windows 7 Home Premium 32 bit | | |
| | Windows 7 Home Premium x64 | | |
| | Windows 7 Professional 32 bit | | |
| | Windows 7 Professional x64 | | |
| | Windows 7 Ultimate 32 bit | | |
| | Windows 7 Ultimate x64 | | |
| | Windows Millennium Edition (Me) | | |
| | Windows Vista | | |
| | 64-bit Enabled AIX | | |
| | 64-bit Enabled HP-UX | | |
| | 64-bit Enabled Solaris | | |
| | ABI+ for Intel Architecture | | |
| | AIX | | |
| | HP-UX | | |
| | HP-UX IPF | | |
| | IRIX | | |
| | Linux | | |
| | Linux for x64 | | |
| | Linux on Itanium | | |
| | OpenVMS Alpha | | |
| | OpenVMS on HP Integrity | | |
| | Solaris | | |
| | Solaris for x64 | | |
| | Tru64 UNIX | | |
Type: | Usage Note |
Topic: | SAS Reference ==> Procedures ==> MI; SAS Reference ==> Procedures ==> MIANALYZE; Analytics ==> Descriptive Statistics; Analytics ==> Missing Value Imputation |
Date Modified: | 2010-03-17 14:49:12 |
Date Created: | 2010-02-16 08:40:21 |