In the example titled "Logistic Modeling with Categorical Predictors" (see the Examples section of the LOGISTIC documentation), the model fit at the end of that example contains the categorical predictors Sex and Treatment and the continuous predictor Age. The response variable, Pain, has levels No and Yes. Treatment has levels P, B, and A, and Sex has levels M and F. The following statements refit the model. The EVENT="No" option tells PROC LOGISTIC to model the probability that Pain=No. The PARAM=REF option specifies reference coding (or parameterization) for the categorical predictors. The effect of coding is discussed below.
proc logistic data=Neuralgia;
   class Treatment Sex / param=ref;
   model Pain(event="No") = Treatment Sex Age;
run;
The following are the Class Level Information and the Analysis of Maximum Likelihood Estimates tables produced by PROC LOGISTIC.
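(Output tables not reproduced here. The design-variable coding and the parameter estimates from these tables are restated in the text and equations that follow.)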
With this information, you can write the logistic model in terms of the log odds of no pain:
log odds = log[Pr(Pain=No)/Pr(Pain=Yes)] = 15.8669 + 3.1790*TA + 3.7264*TB + 1.8235*SF - 0.2650*Age ,
or in terms of the odds of no pain:
Odds = Pr(Pain=No)/Pr(Pain=Yes) = exp(15.8669 + 3.1790*TA + 3.7264*TB + 1.8235*SF - 0.2650*Age) ,
or in terms of the probability of no pain:
Pr(Pain=No) = 1/(1+exp(-log odds)) = 1/(1+exp[-(15.8669 + 3.1790*TA + 3.7264*TB + 1.8235*SF - 0.2650*Age)]) .
Categorical predictors are represented in the model by sets of coded design variables and the parameters multiply the design variable values. The coding of these design variables depends on the PARAM= option in the CLASS statement and is shown in the Class Level Information table near the beginning of the PROC LOGISTIC results. In the logistic model above, TA and TB are the two design variables created to represent the Treatment predictor. One design variable, SF, is created for the Sex predictor. As shown in the Class Level Information table, Treatment A is represented by TA=1 and TB=0. Similarly, Treatment B is represented by TA=0 and TB=1. For Treatment=P, both design variables equal zero. This is the "reference" parameterization produced by the PARAM=REF option in the CLASS statement.
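The reference coding described above can be reproduced by hand in a DATA step. The following is only a sketch: the data set name DesignRef is hypothetical, and the coding of Sex (SF=1 for F, SF=0 for M) is inferred from the scoring example in this note rather than read from the Class Level Information table.

```sas
/* Hypothetical illustration: build the reference-coded design variables by hand. */
data DesignRef;
   input Treatment $ Sex $;
   /* A comparison in SAS evaluates to 1 when true and 0 when false. */
   TA = (Treatment = 'A');   /* 1 for Treatment A, otherwise 0            */
   TB = (Treatment = 'B');   /* 1 for Treatment B, otherwise 0            */
   SF = (Sex = 'F');         /* 1 for Sex F, otherwise 0 (assumed coding) */
   datalines;
A F
B F
P F
A M
B M
P M
;
run;

proc print data=DesignRef noobs;
run;
```

In the printed rows, Treatment P has TA=0 and TB=0, and Sex M has SF=0, which are the reference levels.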
A new subject with Treatment P, Sex M, and Age 70 would be scored using the model with TA=0, TB=0, SF=0, and Age=70:
log odds = 15.8669 + 3.1790*0 + 3.7264*0 + 1.8235*0 - 0.2650*70 = 15.8669 - 18.55 = -2.68 .
His odds of no pain are:
Odds = exp(15.8669 + 3.1790*0 + 3.7264*0 + 1.8235*0 - 0.2650*70) = exp(15.8669 - 18.55) = 0.068 ,
and his probability of no pain is:
Pr(Pain=No) = 1/(1+exp[2.6831]) = 0.064 .
For a 70-year-old male receiving Treatment P, the model estimates the probability of no pain as 0.064, so his probability of pain is 1 - 0.064 = 0.936. Using a maximum probability decision rule, his predicted response is Pain=Yes.
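The hand calculation above can also be carried out in a DATA step. This sketch uses the parameter estimates rounded to four decimals exactly as they appear in the equations, so the results match the hand calculation rather than the full-precision values that PROC LOGISTIC uses. The data set name HandScore, the variable names, and the 0.5 cutoff used to express the maximum probability rule for a two-level response are assumptions made for illustration.

```sas
/* Hypothetical illustration: score one subject by hand with rounded estimates. */
data HandScore;
   length Predicted $3;
   /* Design-variable values for Treatment=P, Sex=M under reference coding */
   TA = 0; TB = 0; SF = 0; Age = 70;
   /* Log odds of no pain, using the estimates shown in the model equation */
   LogOdds = 15.8669 + 3.1790*TA + 3.7264*TB + 1.8235*SF - 0.2650*Age;
   Odds    = exp(LogOdds);              /* odds of no pain        */
   P_No    = 1 / (1 + exp(-LogOdds));   /* probability of no pain */
   P_Yes   = 1 - P_No;                  /* probability of pain    */
   /* Maximum probability rule: predict the level with the larger probability */
   if P_No >= 0.5 then Predicted = 'No';
   else Predicted = 'Yes';
run;

proc print data=HandScore noobs;
   format LogOdds Odds P_No P_Yes 8.4;
run;
```

Changing TA, TB, SF, and Age to the values for another subject scores that subject with the same equation.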
To avoid rounding errors in these predicted values, always use the full precision of the parameter estimates, as PROC LOGISTIC does in its calculations. This is discussed and illustrated in section 4 of this note, which also shows ways to easily score new observations using capabilities built into PROC LOGISTIC.
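One built-in way to score at full precision is the SCORE statement in PROC LOGISTIC. The following is a minimal sketch and not necessarily the approach shown in the section referred to above. The data set names NewSubject and Scored are hypothetical, and the code assumes that the Neuralgia data set from the documentation example is available.

```sas
/* Hypothetical new subject: Treatment P, Sex M, Age 70 */
data NewSubject;
   input Treatment $ Sex $ Age;
   datalines;
P M 70
;
run;

proc logistic data=Neuralgia;
   class Treatment Sex / param=ref;
   model Pain(event="No") = Treatment Sex Age;
   /* SCORE applies the fitted model to the new data using the unrounded estimates */
   score data=NewSubject out=Scored;
run;

proc print data=Scored noobs;
run;
```

The Scored data set should contain a predicted probability for each response level along with the predicted level. The probabilities may differ slightly from a hand calculation because the estimates are not rounded.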
The first model fit in the Logistic Modeling with Categorical Predictors example uses "effects" parameterization. This is the parameterization used when the PARAM= option is not specified or when you specify the PARAM=EFFECT option. Note the difference in the coding of the design variables shown in the Class Level Information table:
(Class Level Information table for the effects parameterization not reproduced here.)
Fitting the same model as above with effects parameterization will result in different parameter estimates. However, the model is equivalent to the model using reference parameterization as evidenced by having the same log likelihood value (-2 Log L in the Model Fit Statistics table) and by producing identical scores for observations. The written form of the model equation, as above, does not change, but when using the model to score new observations, the values of the design variables, TA, TB, and SF are different as shown in the Class Level Information table. For example, Treatment P is represented by TA=-1 and TB=-1. A model using effects parameterization is shown and used to score observations in this note.
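The difference in coding can be sketched in a DATA step, followed by a refit that scores a new subject under effects parameterization. The data set names DesignEffect, NewSubjectEff, and ScoredEff are hypothetical. The coding of Sex M as SF=-1 is not stated in the text above and is assumed here from the usual effects-coding convention, in which the last level is coded -1 in every design variable.

```sas
/* Hypothetical illustration: effects-coded design variables built by hand. */
data DesignEffect;
   input Treatment $ Sex $;
   if Treatment = 'P' then do;
      TA = -1; TB = -1;          /* Treatment P is -1 in both design variables */
   end;
   else do;
      TA = (Treatment = 'A');    /* 1 for A, 0 for B */
      TB = (Treatment = 'B');    /* 1 for B, 0 for A */
   end;
   if Sex = 'M' then SF = -1;    /* assumed: M is -1 under effects coding */
   else SF = 1;                  /* F is 1 */
   datalines;
A F
B F
P F
A M
B M
P M
;
run;

proc print data=DesignEffect noobs;
run;

/* Refit with effects parameterization and score a new subject. */
data NewSubjectEff;
   input Treatment $ Sex $ Age;
   datalines;
P M 70
;
run;

proc logistic data=Neuralgia;
   class Treatment Sex / param=effect;
   model Pain(event="No") = Treatment Sex Age;
   score data=NewSubjectEff out=ScoredEff;
run;

proc print data=ScoredEff noobs;
run;
```

Because the two parameterizations describe the same model, the predicted probabilities in ScoredEff should agree with those from the reference-coded fit, even though the parameter estimates differ.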
Type: Usage Note
Topic: Analytics ==> Categorical Data Analysis; Analytics ==> Regression; SAS Reference ==> Procedures ==> HPLOGISTIC; SAS Reference ==> Procedures ==> LOGISTIC
Date Modified: 2013-11-06 16:25:46
Date Created: 2013-11-06 16:21:12