This error message indicates that PROC MDC found no chosen alternative for at least one subject. The error can occur even if the decision variable contains a choice for every subject, because missing values in other variables can remove the chosen observation. PROC MDC excludes an observation if at least one of the model variables has a missing value. If such a missing value appears in the observation for the chosen alternative, that observation is excluded from the analysis and PROC MDC sees no choice for that subject. To avoid the error message, exclude all observations for those subjects for which no choice remains. This is illustrated in the following example.
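The following DATA step creates data set NEW. Note that for subject 1 (PID=1), the independent variable TTIME is missing for alternative 2 (MODE=2), which is the alternative that this subject chose. Because the input records alternate between subjects, the PROC SORT step then orders the observations by PID.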
data new;
   input pid mode ttime decision @@;
   datalines;
1 1 16.481 0 26 1 15.237 0
1 2 . 1 26 2 14.345 1
1 3 23.890 0 26 3 19.984 0
2 1 15.123 0 27 1 10.840 1
2 2 11.373 1 27 2 11.071 0
2 3 14.182 0 27 3 10.188 0
3 1 19.469 0 28 1 16.841 0
3 2 8.822 1 28 2 11.224 1
3 3 20.819 0 28 3 13.417 0
4 1 18.847 0 29 1 13.913 0
4 2 15.649 1 29 2 16.991 0
4 3 21.280 0 29 3 26.618 1
5 1 12.578 0 30 1 13.089 0
5 2 10.671 1 30 2 9.822 1
5 3 18.335 0 30 3 19.162 0
6 1 11.513 1 31 1 16.626 0
6 2 20.582 0 31 2 10.725 0
6 3 27.838 0 31 3 15.285 1
7 1 10.651 1 32 1 13.477 0
7 2 15.537 0 32 2 15.509 1
7 3 17.418 0 32 3 24.421 0
8 1 8.359 1 33 1 20.851 0
8 2 15.675 0 33 2 14.557 1
8 3 21.050 0 33 3 19.800 0
9 1 11.679 1 34 1 11.365 0
9 2 12.668 0 34 2 12.673 1
9 3 23.104 0 34 3 22.212 0
10 1 23.237 0 35 1 13.296 0
10 2 10.356 1 35 2 10.076 1
10 3 21.346 0 35 3 17.810 0
11 1 13.236 0 36 1 15.417 1
11 2 16.019 0 36 2 14.103 0
11 3 10.087 1 36 3 21.050 0
12 1 20.052 0 37 1 15.938 0
12 2 16.861 0 37 2 11.180 1
12 3 14.168 1 37 3 19.851 0
13 1 18.917 0 38 1 19.034 0
13 2 14.764 1 38 2 14.125 1
13 3 21.564 0 38 3 19.764 0
14 1 18.200 0 39 1 10.466 1
14 2 6.868 1 39 2 12.841 0
14 3 19.095 0 39 3 18.540 0
15 1 10.777 1 40 1 15.799 0
15 2 16.554 0 40 2 16.979 0
15 3 15.938 0 40 3 13.074 1
16 1 20.003 0 41 1 12.713 0
16 2 6.377 1 41 2 15.105 1
16 3 9.314 0 41 3 13.629 0
17 1 19.768 0 42 1 16.908 0
17 2 8.523 1 42 2 10.958 1
17 3 18.960 0 42 3 19.713 0
18 1 8.151 0 43 1 17.098 0
18 2 13.845 1 43 2 6.853 1
18 3 17.643 0 43 3 14.502 0
19 1 22.173 1 44 1 18.608 0
19 2 18.045 0 44 2 14.286 1
19 3 15.535 0 44 3 18.301 0
20 1 13.134 0 45 1 11.059 1
20 2 11.067 1 45 2 10.812 0
20 3 19.108 0 45 3 20.121 0
21 1 14.051 1 46 1 15.641 0
21 2 14.247 0 46 2 10.754 1
21 3 15.764 0 46 3 24.669 0
22 1 14.685 0 47 1 7.822 1
22 2 10.811 0 47 2 18.949 0
22 3 12.361 1 47 3 16.904 0
23 1 11.666 1 48 1 12.824 0
23 2 10.758 0 48 2 5.697 1
23 3 16.445 0 48 3 19.183 0
24 1 17.211 0 49 1 11.852 0
24 2 15.201 0 49 2 12.147 1
24 3 17.059 1 49 3 15.672 0
25 1 13.930 1 50 1 15.557 0
25 2 16.227 0 50 2 8.307 1
25 3 22.024 0 50 3 22.286 0
;

proc sort data=new;
   by pid;
run;
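Before fitting the model, it can help to check where missing values occur. The following is a minimal sketch, assuming the data set NEW created above and a model that uses only MODE and TTIME besides PID and DECISION. The first step counts nonmissing and missing values for each variable. The second step lists chosen observations that have a missing value in MODE or TTIME, which are the observations most likely to trigger the error.

```sas
/* Count nonmissing (N) and missing (NMISS) values for each variable. */
proc means data=new n nmiss;
   var pid mode ttime decision;
run;

/* List chosen observations that would be excluded because a model   */
/* variable is missing. Adjust the variable list to match your model. */
proc print data=new;
   where decision = 1 and nmiss(mode, ttime) > 0;
run;
```

For data set NEW, the second step should list the single observation with PID=1 and MODE=2.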
The following PROC MDC analysis of data set NEW results in the error message described above because the observation in which subject 1's choice was made is excluded from the analysis. As a result, PROC MDC does not see a chosen alternative for subject 1.
proc mdc data=new;
   model decision = ttime / type=clogit choice=(mode 1 2 3)
                            optmethod=qn covest=hess;
   id pid;
run;
The following steps allow you to exclude subjects for which no choice is seen. Begin with a DATA step that flags the observations that will be included in the analysis. The NMISS function returns the number of missing values among the specified variables. If the PID, DECISION, MODE, and TTIME variables are all nonmissing, NMISS returns 0 and the variable _INCLUDE is set to 1, indicating that the observation will be included in the analysis. Otherwise, _INCLUDE is left missing. Note that if you use the NCHOICE= option in your PROC MDC analysis rather than the CHOICE= option, you should remove MODE from the NMISS function, since missing MODE values will not cause observations to be excluded.
data newtwo;
   set new;
   if nmiss(of pid decision mode ttime) = 0 then _include = 1;
run;
Since a subject should choose one alternative, the sum of the DECISION variable should be nonzero across all of the observations included in the analysis for that subject. The following PROC MEANS step computes the sums, for each subject, using only the observations that would be included in the analysis. The data set NEWTHREE contains the sums (_SUM) for each subject (PID).
proc means data=newtwo noprint;
   where _include = 1;
   by pid;
   var decision;
   output out=newthree(keep=pid _sum) sum = _sum;
run;
The following statements add the _SUM variable to the original data set NEW. Subjects whose included observations do not contain a choice have _SUM=0 for all their observations. By excluding all observations with _SUM=0, you can now exclude all subjects that would cause the error message.
data new;
   merge new newthree;
   by pid;
run;
These statements display the observations for the first two subjects (PID=1 and 2).
proc print data=new(obs = 6);
run;
Note that the _SUM variable is 0 for the observations of subject 1 because the MODE=2 observation, which contains the chosen alternative, would be excluded in the analysis due to the missing value of TTIME.
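To see which subjects will be dropped before running the final analysis, you can list the subject-level sums directly. This is a small sketch that assumes the data set NEWTHREE produced by the PROC MEANS step above, with one row per subject.

```sas
/* List subjects whose included observations contain no choice. */
proc print data=newthree;
   where _sum = 0;
run;
```

For the example data, this should list only PID=1.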
The analysis can now be done in PROC MDC by specifying a WHERE statement that includes only those subjects whose choices would not be excluded due to missing values.
proc mdc data=new;
   where _sum ne 0;
   model decision = ttime / type=clogit choice=(mode 1 2 3)
                            optmethod=qn covest=hess;
   id pid;
run;
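As a more compact variant of the same idea, the flagging and summing can be sketched in PROC SQL. This is an illustrative alternative, not part of the original note. The data set names NOCHOICE and NEWOK are hypothetical, and the variable list assumes the CHOICE= form of the model used in this example. Unlike the approach with _SUM, this version also drops subjects whose observations would all be excluded.

```sas
/* Identify subjects with no chosen alternative among the observations */
/* that have no missing values. NOCHOICE is a hypothetical name.       */
proc sql;
   create table nochoice as
   select pid
   from new
   group by pid
   having sum(case
                 when pid is not missing and decision is not missing
                      and mode is not missing and ttime is not missing
                 then decision
                 else 0
              end) = 0;

   /* Keep only subjects that are not in NOCHOICE. NEWOK is hypothetical. */
   create table newok as
   select *
   from new
   where pid not in (select pid from nochoice)
   order by pid, mode;
quit;

/* Fit the same model on the filtered data. */
proc mdc data=newok;
   model decision = ttime / type=clogit choice=(mode 1 2 3)
                            optmethod=qn covest=hess;
   id pid;
run;
```

The WHERE statement approach shown earlier avoids creating a second copy of the data, so the SQL version is mainly useful when you want a separate data set of the excluded subjects.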
Product Family: | SAS System |
Product: | SAS/ETS |
System: | z/OS; OpenVMS VAX; Microsoft® Windows® for 64-Bit Itanium-based Systems; Microsoft Windows Server 2003 Datacenter 64-bit Edition; Microsoft Windows Server 2003 Enterprise 64-bit Edition; Microsoft Windows XP 64-bit Edition; Microsoft® Windows® for x64; OS/2; Microsoft Windows 7; Microsoft Windows 95/98; Microsoft Windows 2000 Advanced Server; Microsoft Windows 2000 Datacenter Server; Microsoft Windows 2000 Server; Microsoft Windows 2000 Professional; Microsoft Windows NT Workstation; Microsoft Windows Server 2003 Datacenter Edition; Microsoft Windows Server 2003 Enterprise Edition; Microsoft Windows Server 2003 Standard Edition; Microsoft Windows Server 2008; Microsoft Windows XP Professional; Windows Millennium Edition (Me); Windows Vista; 64-bit Enabled AIX; 64-bit Enabled HP-UX; 64-bit Enabled Solaris; ABI+ for Intel Architecture; AIX; HP-UX; HP-UX IPF; IRIX; Linux; Linux for x64; Linux on Itanium; OpenVMS Alpha; OpenVMS on HP Integrity; Solaris; Solaris for x64; Tru64 UNIX |
Type: | Usage Note |
Priority: | |
Topic: | SAS Reference ==> Procedures ==> MDC; Analytics ==> Regression; Analytics ==> Econometrics; Analytics ==> Categorical Data Analysis |
Date Modified: | 2009-11-12 12:58:41 |
Date Created: | 2009-10-20 15:41:49 |