The data below represent an experiment in which researchers tested four cheese additives and obtained 52 response ratings for each additive. Each response was measured on a scale of nine categories ranging from strong dislike (1) to excellent taste (9). The response is missing for six observations, so PROC MI is used to impute those values. Because the response is ordinal, the monotone logistic imputation method is used.
data Cheese;
   input Additive y @@;
   datalines;
1 3 1 4 1 4 1 4 1 4 1 4 1 4 1 4 1 5 1 5 1 5 1 5 1 5
1 5 1 5 1 5 1 6 1 6 1 6 1 6 1 6 1 6 1 6 1 6 1 7 1 7
1 7 1 . 1 7 1 7 1 7 1 7 1 7 1 7 1 7 1 7 1 7 1 7 1 7
1 7 1 7 1 7 1 7 1 8 1 8 1 8 1 8 1 8 1 8 1 8 1 8 1 9
2 1 2 1 2 1 2 1 2 1 2 1 2 2 2 2 2 2 2 2 2 2 2 2 2 2
2 2 2 2 2 3 2 3 2 3 2 3 2 3 2 3 2 3 2 3 2 3 2 3 2 3
2 3 2 4 2 4 2 4 2 4 2 4 2 4 2 4 2 4 2 4 2 4 2 4 2 5
2 5 2 5 2 5 2 5 2 5 2 5 2 6 2 6 2 . 2 6 2 6 2 6 2 7
3 1 3 2 3 3 3 3 3 3 3 3 3 3 3 3 3 4 3 4 3 4 3 4 3 4
3 4 3 4 3 4 3 5 3 5 3 5 3 5 3 5 3 5 3 5 3 5 3 5 3 5
3 5 3 5 3 5 3 5 3 5 3 5 3 5 3 5 3 5 3 5 3 5 3 5 3 5
3 . 3 6 3 6 3 6 3 6 3 6 3 6 3 7 3 7 3 7 3 7 3 7 3 8
4 4 4 5 4 5 4 5 4 6 4 6 4 6 4 6 4 6 4 6 4 6 4 7 4 7
4 7 4 7 4 7 4 7 4 7 4 7 4 7 4 7 4 7 4 7 4 7 4 7 4 8
4 8 4 8 4 8 4 8 4 8 4 8 4 . 4 8 4 8 4 8 4 8 4 8 4 8
4 8 4 8 4 . 4 9 4 9 4 9 4 9 4 9 4 9 4 9 4 9 4 . 4 9
;

proc mi data=Cheese out=cheese_mi seed=1;
   class additive y;
   var additive y;
   monotone logistic (y=additive);
run;
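Before fitting models, it can be useful to confirm that the imputation filled in every missing response. The following statements are a minimal sketch of such a check, not part of the original example. They assume only the cheese_mi data set and the _Imputation_ variable that PROC MI creates. For each imputation and additive, the NMISS column should be zero and the minimum and maximum of y should lie in the range 1 to 9.

```sas
/* Sketch: check the imputed data. NMISS should be 0 in every cell, */
/* and MIN and MAX should stay within the 1-9 rating scale.         */
proc means data=cheese_mi n nmiss min max;
   class _Imputation_ Additive;
   var y;
   title 'Check of Imputed Responses';
run;
```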
The following statements fit a proportional odds model for each imputed data set. The OUTEST= and COVOUT options save the parameter estimates and the estimated covariance matrix of the estimates to a data set.
proc logistic data=cheese_mi outest=ordinal_parms covout;
   by _imputation_;
   class additive;
   model y=additive;
run;
To specify the parameters in the MODELEFFECTS statement in PROC MIANALYZE, you need to know how PROC LOGISTIC names the variables that contain the parameter estimates in the OUTEST= data set. The intercept variables are named Intercept_xxx, where xxx is the value (formatted if a format is applied) of the corresponding response category.
For continuous explanatory variables, the variable names containing the parameters are the same as the corresponding model variables. For CLASS variables, the variable names are obtained by concatenating the corresponding CLASS variable name with the CLASS category. For interaction and nested effects, the parameter names are created by concatenating the names of each effect. See "Input and Output Data Sets: Parameter Names in the OUTEST= Data Set" in the Details section of the PROC LOGISTIC documentation for more details.
The names of the variables containing the parameter estimates are easily seen using PROC PRINT or PROC CONTENTS. The following statements display the parameter estimates from the first imputed data set. Note the names of variables containing the parameter estimates.
proc print data=ordinal_parms noobs;
   where _imputation_=1 and _TYPE_="PARMS";
   var int: add:;
   title 'Parameter Estimates for the First Imputation';
run;
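PROC CONTENTS, mentioned above as an alternative, lists the variable names without printing any estimates. The following statements are a short sketch of that approach. The VARNUM option lists the variables in the order in which they were created instead of alphabetically, which keeps the intercepts together and ahead of the additive parameters.

```sas
/* Sketch: list the OUTEST= variable names in creation order. */
proc contents data=ordinal_parms varnum;
   title 'Variables in the OUTEST= Data Set';
run;
```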
PROC MIANALYZE can now be used to combine the results from the imputed data sets. The parameter variables are individually listed in the MODELEFFECTS statement. Note that variable lists such as intercept_1-intercept_8 or int: cannot be used. The EDF= option is also specified because the calculated degrees of freedom far exceed the complete data degrees of freedom. In this case it is set to 197 which is the number of observations (208) minus the number of parameters (11).
proc mianalyze data=ordinal_parms edf=197;
   modeleffects intercept_1 intercept_2 intercept_3 intercept_4
                intercept_5 intercept_6 intercept_7 intercept_8
                additive1 additive2 additive3;
run;
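For reference, the combining rules that PROC MIANALYZE applies to each parameter can be summarized as follows. This is a sketch of the standard multiple-imputation formulas (Rubin's rules with the Barnard and Rubin degrees-of-freedom adjustment), written with notation chosen here for illustration: m is the number of imputations, the estimate and its variance from imputation i are written with a hat, and v_0 is the complete-data degrees of freedom supplied by EDF=, which is 208 - 11 = 197 in this example.

```latex
\begin{aligned}
\bar{Q} &= \frac{1}{m}\sum_{i=1}^{m}\hat{Q}_i, \qquad
\bar{W} = \frac{1}{m}\sum_{i=1}^{m}\hat{W}_i, \qquad
B = \frac{1}{m-1}\sum_{i=1}^{m}\bigl(\hat{Q}_i-\bar{Q}\bigr)^2, \\
T &= \bar{W} + \Bigl(1+\frac{1}{m}\Bigr)B, \qquad
v_m = (m-1)\Bigl[1+\frac{\bar{W}}{(1+m^{-1})\,B}\Bigr]^{2}, \\
v_m^{*} &= \Bigl[\frac{1}{v_m}+\frac{1}{\hat{v}_{\mathrm{obs}}}\Bigr]^{-1}, \qquad
\hat{v}_{\mathrm{obs}} = (1-\gamma)\,\frac{v_0\,(v_0+1)}{v_0+3}, \qquad
\gamma = \frac{(1+m^{-1})\,B}{T}.
\end{aligned}
```

When the between-imputation variance B is small relative to the within-imputation variance, v_m can become very large. The adjusted value v_m* never exceeds v_0, which is why supplying EDF= matters here.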
For the purpose of illustration, missing values are introduced into the variables Program and Style in the school instruction style data set, which is used to demonstrate how to fit a generalized logit model with PROC LOGISTIC.
data school;
   length Program $ 9;
   input School Program $ Style $ NumStudent Count @@;
   datalines;
1 regular self 21 10 1 regular team 22 17 1 regular class 16 26
1 afternoon self 23 5 1 afternoon team 26 12 1 afternoon class 21 50
2 . self 22 21 2 regular team 31 17 2 regular class 32 26
2 . . 18 16 2 afternoon team 28 12 2 afternoon class 27 36
3 regular self 14 15 3 regular team 32 15 3 regular class 31 16
3 afternoon self 19 12 3 afternoon team 30 12 3 . class 33 20
;
PROC MI is used to impute the missing values. The DISCRIM method is used for Style since it is nominal with three levels and the LOGISTIC method is used for Program since it has two levels.
proc mi data=school out=school_imp;
   freq count;
   class school style program;
   var NumStudent school style program;
   monotone discrim (style=NumStudent);
   monotone logistic (program=school style);
   title 'Proc MI results for monotone Logistic model';
run;
These statements fit a generalized logit model to each of the imputed data sets using PROC LOGISTIC. The OUTEST= and COVOUT options create a data set containing the parameter estimates and the estimated covariance matrix of the estimates.
proc logistic data=school_imp outest=imp_parms covout;
   by _imputation_;
   freq Count;
   class School Program(ref=first);
   model Style(order=data)=School Program NumStudent / link=glogit;
run;
The parameter variables needed for the MODELEFFECTS statement in PROC MIANALYZE are named as described above for the ordinal model. However, for the generalized logit model, names of parameters corresponding to each nonreference category contain _xxx as the suffix, where xxx is the value (formatted if a format is applied) of the corresponding nonreference category. See "Input and Output Data Sets: Parameter Names in the OUTEST= Data Set" in the Details section of the PROC LOGISTIC documentation for more details and an example. As before, the names can be displayed using PROC PRINT.
proc print data=imp_parms noobs;
   where _imputation_=1 and _TYPE_="PARMS";
   var int: sch: pro: num:;
   title 'Parameter Estimates for the First Imputation';
run;
The following statements use PROC MIANALYZE to combine the results from the imputed data sets. As in the previous example, the individual parameters are listed in the MODELEFFECTS statement. The EDF= option is set to 328, which is the number of observations (338, the sum of the Count frequencies) minus the number of parameters (10).
proc mianalyze data=imp_parms edf=328;
   modeleffects Intercept_self Intercept_team
                School1_self School1_team School2_self School2_team
                Programregular_self Programregular_team
                NumStudent_self NumStudent_team;
run;
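The EDF= value can also be computed instead of typed by hand. The following statements are a minimal sketch under stated assumptions: the macro variable names n_obs, n_parms, and edf are hypothetical, and the parameter count of 10 is entered manually to match the names listed in the MODELEFFECTS statement. Because the FREQ statement weights each row by Count, the number of observations is the sum of Count in the school data set.

```sas
/* Sketch: derive EDF = (sum of frequencies) - (number of parameters). */
/* n_obs, n_parms, and edf are hypothetical macro variable names.      */
proc sql noprint;
   select sum(Count) into :n_obs from school;
quit;

%let n_parms = 10;   /* assumption: number of names in MODELEFFECTS */
%let edf = %eval(&n_obs - &n_parms);
%put NOTE: EDF = &edf;
```

With these data, the sum of Count is 338, so the macro variable edf resolves to 328 and could be supplied as edf=&edf in the PROC MIANALYZE statement.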
Product Family: | SAS System |
Product: | SAS/STAT |
System: | z/OS; Z64; OpenVMS VAX; Microsoft® Windows® for 64-Bit Itanium-based Systems; Microsoft Windows Server 2003 Datacenter 64-bit Edition; Microsoft Windows Server 2003 Enterprise 64-bit Edition; Microsoft Windows XP 64-bit Edition; Microsoft® Windows® for x64; OS/2; Microsoft Windows 8 Pro; Microsoft Windows 95/98; Microsoft Windows 2000 Advanced Server; Microsoft Windows 2000 Datacenter Server; Microsoft Windows 2000 Server; Microsoft Windows 2000 Professional; Microsoft Windows NT Workstation; Microsoft Windows Server 2003 Datacenter Edition; Microsoft Windows Server 2003 Enterprise Edition; Microsoft Windows Server 2003 Standard Edition; Microsoft Windows Server 2003 for x64; Microsoft Windows Server 2008; Microsoft Windows Server 2008 for x64; Microsoft Windows Server 2012; Microsoft Windows XP Professional; Windows 7 Enterprise 32 bit; Windows 7 Enterprise x64; Windows 7 Home Premium 32 bit; Windows 7 Home Premium x64; Windows 7 Professional 32 bit; Windows 7 Professional x64; Windows 7 Ultimate 32 bit; Windows 7 Ultimate x64; Windows Millennium Edition (Me); Windows Vista; Windows Vista for x64; 64-bit Enabled AIX; 64-bit Enabled HP-UX; 64-bit Enabled Solaris; ABI+ for Intel Architecture; AIX; HP-UX; HP-UX IPF; IRIX; Linux; Linux for x64; Linux on Itanium; OpenVMS Alpha; OpenVMS on HP Integrity; Solaris; Solaris for x64; Tru64 UNIX |
Type: | Usage Note |
Topic: | Analytics ==> Categorical Data Analysis; Analytics ==> Missing Value Imputation; SAS Reference ==> Procedures ==> LOGISTIC; SAS Reference ==> Procedures ==> MI; SAS Reference ==> Procedures ==> MIANALYZE |
Date Modified: | 2013-07-31 16:18:57 |
Date Created: | 2013-04-03 14:40:01 |