This example creates an EST-type data set that contains regression coefficients and their corresponding covariance matrices computed from imputed data sets. These estimates are then combined to generate valid statistical inferences about the regression model.
The following statements use the REG procedure to generate regression coefficients for each imputed data set:
proc reg data=outmi outest=outreg covout noprint;
   model Oxygen= RunTime RunPulse;
   by _Imputation_;
run;
The following statements display the regression coefficients and their covariance matrices from PROC REG for the first two imputed data sets (Output 76.3.1):
proc print data=outreg(obs=8);
   var _Imputation_ _Type_ _Name_ Intercept RunTime RunPulse;
   title 'REG Model Coefficients and Covariance Matrices'
         ' (First Two Imputations)';
run;
Output 76.3.1: EST-Type Data Set
REG Model Coefficients and Covariance Matrices (First Two Imputations)

Obs | _Imputation_ | _TYPE_ | _NAME_ | Intercept | RunTime | RunPulse |
---|---|---|---|---|---|---|
1 | 1 | PARMS | | 86.544 | -2.82231 | -0.05873 |
2 | 1 | COV | Intercept | 100.145 | -0.53519 | -0.55077 |
3 | 1 | COV | RunTime | -0.535 | 0.10774 | -0.00345 |
4 | 1 | COV | RunPulse | -0.551 | -0.00345 | 0.00343 |
5 | 2 | PARMS | | 83.021 | -3.00023 | -0.02491 |
6 | 2 | COV | Intercept | 79.032 | -0.66765 | -0.41918 |
7 | 2 | COV | RunTime | -0.668 | 0.11456 | -0.00313 |
8 | 2 | COV | RunPulse | -0.419 | -0.00313 | 0.00264 |
The following statements combine the results for the imputed data sets. The EDF= option is specified to request that the adjusted degrees of freedom be used in the analysis. For a regression model with three regression parameters (the intercept and two independent variables) and 31 observations, the complete-data error degrees of freedom is 31 - 3 = 28.
proc mianalyze data=outreg edf=28;
   modeleffects Intercept RunTime RunPulse;
run;
Output 76.3.2: Variance Information
Variance Information (25 Imputations)

Parameter | Between Variance | Within Variance | Total Variance | DF | Relative Increase in Variance | Fraction Missing Information | Relative Efficiency |
---|---|---|---|---|---|---|---|
Intercept | 22.485821 | 75.413875 | 98.799129 | 19.102 | 0.310092 | 0.240234 | 0.990482 |
RunTime | 0.021126 | 0.124930 | 0.146902 | 21.823 | 0.175870 | 0.151147 | 0.993990 |
RunPulse | 0.000656 | 0.002622 | 0.003304 | 20.042 | 0.260376 | 0.209393 | 0.991694 |
The "Variance Information" table in Output 76.3.2 displays the between-imputation, within-imputation, and total variances for combining complete-data inferences.
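As a rough check on these quantities, the following DATA step sketches how the Intercept row of the table can be recomputed from its between-imputation and within-imputation variances. It assumes the standard multiple-imputation combining rules together with the small-sample degrees-of-freedom adjustment that the EDF= option requests. The variable names (total, vadj, and so on) are illustrative and are not part of PROC MIANALYZE.

```sas
/* Sketch only: recompute the Intercept row of the Variance Information table. */
/* m and v0 come from the example (25 imputations, EDF=28).                    */
data _null_;
   m  = 25;           /* number of imputations                     */
   v0 = 28;           /* complete-data error degrees of freedom    */
   B  = 22.485821;    /* between-imputation variance (Intercept)   */
   W  = 75.413875;    /* within-imputation variance (Intercept)    */

   total  = W + (1 + 1/m)*B;               /* total variance                      */
   r      = (1 + 1/m)*B / W;               /* relative increase in variance       */
   vm     = (m - 1)*(1 + 1/r)**2;          /* degrees of freedom, no adjustment   */
   g      = (1 + 1/m)*B / total;           /* share of total variance due to B    */
   vobs   = (1 - g)*v0*(v0 + 1)/(v0 + 3);  /* observed-data degrees of freedom    */
   vadj   = 1/(1/vm + 1/vobs);             /* adjusted degrees of freedom         */
   lambda = (r + 2/(vm + 3))/(r + 1);      /* fraction of missing information     */
   releff = 1/(1 + lambda/m);              /* relative efficiency                 */

   /* Write the results to the SAS log for comparison with the table. */
   put total= r= vadj= lambda= releff=;
run;
```

The values written to the log should agree with the Intercept row of Output 76.3.2 up to rounding of the displayed inputs.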
The "Parameter Estimates" table in Output 76.3.3 displays the combined estimate and standard error of each regression coefficient. The inferences are based on the t distribution. The table also displays a 95% confidence interval for each coefficient and a t test with the associated p-value for the hypothesis that the regression coefficient is equal to zero. Because the p-value for RunPulse is 0.1812, which exceeds the conventional 0.05 significance level, this variable can be removed from the regression model.
Output 76.3.3: Parameter Estimates
Parameter Estimates (25 Imputations)

Parameter | Estimate | Std Error | Lower 95% Confidence Limit | Upper 95% Confidence Limit | DF | Minimum | Maximum | Theta0 | t for H0: Parameter=Theta0 | Pr > \|t\| |
---|---|---|---|---|---|---|---|---|---|---|
Intercept | 92.700420 | 9.939775 | 71.90376 | 113.4971 | 19.102 | 83.020730 | 100.839807 | 0 | 9.33 | <.0001 |
RunTime | -3.030325 | 0.383278 | -3.82557 | -2.2351 | 21.823 | -3.280042 | -2.754668 | 0 | -7.91 | <.0001 |
RunPulse | -0.079621 | 0.057482 | -0.19951 | 0.0403 | 20.042 | -0.135862 | -0.024910 | 0 | -1.39 | 0.1812 |
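The RunPulse row above can be checked with a short DATA step. This is a minimal sketch that assumes the usual t-based inference: the t statistic is the estimate minus Theta0 divided by the standard error, and the confidence limits are the estimate plus or minus the t quantile times the standard error. The inputs are copied from the table, and the variable names are illustrative.

```sas
/* Sketch only: recompute the t test and 95% limits for RunPulse. */
data _null_;
   est    = -0.079621;   /* combined estimate from the table    */
   se     = 0.057482;    /* combined standard error             */
   df     = 20.042;      /* adjusted degrees of freedom         */
   theta0 = 0;           /* null-hypothesis value               */

   tstat = (est - theta0)/se;            /* t statistic                     */
   tcrit = tinv(0.975, df);              /* two-sided 95% t quantile        */
   lower = est - tcrit*se;               /* lower 95% confidence limit      */
   upper = est + tcrit*se;               /* upper 95% confidence limit      */
   p     = 2*(1 - probt(abs(tstat), df)); /* two-sided p-value              */

   /* Write the results to the SAS log for comparison with the table. */
   put tstat= lower= upper= p=;
run;
```

The TINV and PROBT functions accept noninteger degrees of freedom, so the adjusted DF value can be passed directly. The logged values should match the RunPulse row up to rounding.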