There are two types of relative risks that might be of interest when modeling a multinomial response. You might want to compare two populations with respect to an individual response level probability (P(Y=i|X=j)/P(Y=i|X=k)), or you might want to compare response level probabilities in a given population (P(Y=i|X=j)/P(Y=k|X=j)). Both situations are discussed below.
In the following examples, a generalized logit model is fit to the nominal, multinomial response. As discussed in this note, the generalized logit model can also be fit in other procedures such as GLIMMIX, HPGENSELECT, FMM, CATMOD, and SURVEYLOGISTIC. For repeated measures data with a multinomial response, use PROC SURVEYLOGISTIC with a CLUSTER statement. In LOGISTIC, and most other procedures, generalized logits are requested by the LINK=GLOGIT option in the MODEL statement.
When the response is binary, you can obtain relative risk estimates as discussed in this note. In the multinomial case, relative risk estimates are nonlinear functions of the parameters in a generalized logit model, which can be fit using PROC LOGISTIC. These nonlinear functions can be estimated using the NLMeans macro, the NLEstimate macro, or by fitting the model and doing the estimation in PROC NLMIXED. PROC CATMOD can also be used by fitting a model to the log probabilities rather than logits.
The data below are presented in the example titled "Nominal Response Data: Generalized Logits Model" in the PROC LOGISTIC documentation comparing styles of instruction at several schools. There are three response levels (Styles) and three populations (Schools).
data School;
   length Program $ 9;
   input School Program $ Style $ Count @@;
   datalines;
1 regular   self 10  1 regular   team 17  1 regular   class 26
1 afternoon self  5  1 afternoon team 12  1 afternoon class 50
2 regular   self 21  2 regular   team 17  2 regular   class 26
2 afternoon self 16  2 afternoon team 12  2 afternoon class 36
3 regular   self 15  3 regular   team 15  3 regular   class 16
3 afternoon self 12  3 afternoon team 12  3 afternoon class 20
;
Using PROC LOGISTIC and the NLMeans macro
The following statements fit a generalized logit model to the multinomial response, Style. The ORDER=DATA response variable option orders the Styles as they first appear in the data so that the logits are log(p_{self}/p_{class}) and log(p_{team}/p_{class}). The LSMEANS statement provides estimates of the log odds for each School. The ILINK option adds estimates of the Style level probabilities for each School. The E option produces a table of coefficients of the linear combination of parameters that define the log odds for each School on each Style logit. The table is saved by the ODS OUTPUT statement for use later with the NLMeans macro. The STORE statement saves the fitted model for use with the NLMeans and NLEstimate macros.
proc logistic data=School;
   freq count;
   class school / param=glm;
   model style(order=data) = School / link=glogit;
   lsmeans School / e ilink;
   ods output coef=coeffs;
   store out=logmod;
run;
Following are the model parameter estimates and estimated Style probabilities for each School. Note that estimates are only possible for two of the three Style levels due to the constraint that the probabilities must sum to one.

In spite of the constraint, the NLMeans macro can estimate and test the differences among the Schools on all three of the Style probabilities. To use the macro, you provide the saved model from the STORE statement, the saved table of coefficients from the LSMEANS / E statement, and the link function used in the model. By default, the NLMeans macro estimates and tests pairwise differences among the mean estimates. To request that the ratio be estimated rather than the difference, specify options=ratio.
%NLMeans(instore=logmod, coef=coeffs, link=glogit, options=ratio, title=School RRs on each Style)
The Label in each row identifies the relative risk estimate in that row using the same order as seen in the Least Squares Means table above. The last set of three values corresponds to the third Style (class). So, the label in the first row indicates that the estimate, 0.4324, is the relative risk comparing School 1 and School 2 on Style=self – that is, 0.1250/0.2891. If the reciprocal of this is desired, add the reverse option: options=ratio reverse. Similarly, the last estimate, 1.2109, is the relative risk comparing Schools 2 and 3 on Style=class.
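The two estimates quoted above can be cross-checked by hand. The following Python sketch is not part of the SAS workflow. It assumes that, because School is the only predictor, the model is saturated, so the fitted Style probabilities equal the observed proportions. The counts are the School data summed over Program, and the names counts, style_probs, and school_rr are illustrative only.

```python
from itertools import combinations

# Cell counts from the School data set, summed over Program.
counts = {
    1: {"self": 15, "team": 29, "class": 76},
    2: {"self": 37, "team": 29, "class": 62},
    3: {"self": 27, "team": 27, "class": 36},
}

def style_probs(school):
    """Observed proportion of each Style within one School."""
    total = sum(counts[school].values())
    return {style: n / total for style, n in counts[school].items()}

def school_rr(style, school_a, school_b):
    """Relative risk P(style | school_a) / P(style | school_b)."""
    return style_probs(school_a)[style] / style_probs(school_b)[style]

# Same ordering as the macro output: all School pairs within each Style.
for style in ("self", "team", "class"):
    for a, b in combinations((1, 2, 3), 2):
        print(f"P({style},S{a})/P({style},S{b}) = {school_rr(style, a, b):.4f}")
```

This reproduces only the point estimates. The standard errors and tests come from the macro.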

Using PROC LOGISTIC and the NLEstimate macro
Relative risk estimates can also be produced using the NLEstimate macro by writing the expressions for the estimates in terms of the model parameters.
The formula for the predicted probabilities of the individual response levels within a population is given in "Linear Predictor, Predicted Probability, and Confidence Limits" in the Details section of the LOGISTIC documentation. Using that formula, the ratio of individual probabilities can be written in terms of the model parameters. For example, the estimated probability of Style=self in School=1 is:
exp(Intercept_{Self}+School_{1,Self}) / [1+exp(Intercept_{Self}+School_{1,Self})+exp(Intercept_{Team}+School_{1,Team})]
A similar expression can be written for the estimated probability of Style=Self in School=3:
exp(Intercept_{Self}) / [1+exp(Intercept_{Self})+exp(Intercept_{Team})]
The relative risk is then the ratio of these two expressions. In order to use the NLEstimate macro, the expressions for the relative risks must be written using the names of the parameters. These names can be displayed by specifying shownames as the first argument. See the description of the NLEstimate macro for details.
%NLEstimate(shownames, instore=logmod)
The parameter names are displayed in the following table.

The expressions for each of the relative risks can then be written. When estimating several functions, it is useful to specify them in a data set with appropriate labels. This is done in the following statements and the resulting data set is specified in the fdata= NLEstimate macro parameter.
data fd;
   length label f $32767;
   /* label and expression are separated by a vertical bar */
   infile datalines delimiter='|';
   input label f;
   datalines;
P(self,S1)/P(self,S2)|(exp(b_p1+b_p3)/(1+(exp(b_p1+b_p3)+exp(b_p2+b_p4)))) / (exp(b_p1+b_p5)/(1+(exp(b_p1+b_p5)+exp(b_p2+b_p6))))
P(self,S1)/P(self,S3)|(exp(b_p1+b_p3)/(1+(exp(b_p1+b_p3)+exp(b_p2+b_p4)))) / (exp(b_p1)/(1+(exp(b_p1)+exp(b_p2))))
P(self,S2)/P(self,S3)|(exp(b_p1+b_p5)/(1+(exp(b_p1+b_p5)+exp(b_p2+b_p6)))) / (exp(b_p1)/(1+(exp(b_p1)+exp(b_p2))))
P(team,S1)/P(team,S2)|(exp(b_p2+b_p4)/(1+(exp(b_p1+b_p3)+exp(b_p2+b_p4)))) / (exp(b_p2+b_p6)/(1+(exp(b_p1+b_p5)+exp(b_p2+b_p6))))
P(team,S1)/P(team,S3)|(exp(b_p2+b_p4)/(1+(exp(b_p1+b_p3)+exp(b_p2+b_p4)))) / (exp(b_p2)/(1+(exp(b_p1)+exp(b_p2))))
P(team,S2)/P(team,S3)|(exp(b_p2+b_p6)/(1+(exp(b_p1+b_p5)+exp(b_p2+b_p6)))) / (exp(b_p2)/(1+(exp(b_p1)+exp(b_p2))))
P(class,S1)/P(class,S2)|(1/(1+(exp(b_p1+b_p3)+exp(b_p2+b_p4)))) / (1/(1+(exp(b_p1+b_p5)+exp(b_p2+b_p6))))
P(class,S1)/P(class,S3)|(1/(1+(exp(b_p1+b_p3)+exp(b_p2+b_p4)))) / (1/(1+(exp(b_p1)+exp(b_p2))))
P(class,S2)/P(class,S3)|(1/(1+(exp(b_p1+b_p5)+exp(b_p2+b_p6)))) / (1/(1+(exp(b_p1)+exp(b_p2))))
;
%NLEstimate(instore=logmod, fdata=fd, title=School RRs on each Style)
The macro produces the same results as from the NLMeans macro above.
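To see how these expressions turn generalized logit parameters into a relative risk, the following Python sketch evaluates the same probability formula numerically. It is an illustration under assumptions, not the macro's implementation. The parameter values are derived from the observed counts, which is assumed to match the maximum likelihood fit because the model is saturated, and the names intercept, school_effect, and prob are hypothetical.

```python
import math

# Cell counts from the School data set, summed over Program.
counts = {
    1: {"self": 15, "team": 29, "class": 76},
    2: {"self": 37, "team": 29, "class": 62},
    3: {"self": 27, "team": 27, "class": 36},
}

def logit_vs_class(school, style):
    """Observed generalized logit log(p_style / p_class) for one School."""
    return math.log(counts[school][style] / counts[school]["class"])

# GLM-style parameters with School 3 as the reference level (assumed to equal
# the maximum likelihood estimates because the model is saturated).
intercept = {s: logit_vs_class(3, s) for s in ("self", "team")}
school_effect = {
    (k, s): logit_vs_class(k, s) - intercept[s]
    for k in (1, 2) for s in ("self", "team")
}

def prob(style, school):
    """Predicted probability of a Style in a School from the parameters."""
    eta = {s: intercept[s] + school_effect.get((school, s), 0.0)
           for s in ("self", "team")}
    denom = 1.0 + math.exp(eta["self"]) + math.exp(eta["team"])
    return (math.exp(eta[style]) if style != "class" else 1.0) / denom

# Ratio of the two probability expressions: School 1 versus School 3 on self.
rr_self_s1_s3 = prob("self", 1) / prob("self", 3)
print(f"P(self,S1)/P(self,S3) = {rr_self_s1_s3:.4f}")
```

Because the reference School contributes no School term, its probabilities depend on the two intercepts only, which is why the School 3 expressions in the fd data set are shorter.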
Using PROC CATMOD
Another approach is to model the log of the probabilities of the first two Styles (self and team). This can be done in PROC CATMOD using weighted least squares estimation of the model. Maximum likelihood estimation is only available in CATMOD when modeling logits. In the statements below, the RESPONSE statement defines the log of the two Style probabilities. The PROB option prints a table of the observed probabilities. The PARAM=REF option uses reference parameterization, just like when the same option is used in the CLASS statement in GENMOD, LOGISTIC, and other procedures. You can then write CONTRAST statements to estimate each relative risk of interest. The labels indicate the relative risk that is estimated. The ALL_PARMS keyword in the CONTRAST statements allows you to specify a list of multipliers for all of the model parameter estimates in the order they appear in the "Analysis of Weighted Least Squares Estimates" table. The specified linear combinations estimate the log relative risks. For example, the first CONTRAST labeled P(self,S1)/P(self,S3) estimates the log relative risk of selecting Style=self in School 1 vs. School 3. The ESTIMATE=EXP option exponentiates the linear combinations resulting in relative risk estimates.
proc catmod data=School order=data;
   weight Count;
   /* response functions: log(p_self) and log(p_team) */
   response log 1 0 0, 0 1 0;
   model Style = School / prob param=ref;
   contrast 'P(self,S1)/P(self,S3)' all_parms 0 0 1 0 0 0 / estimate=exp;
   contrast 'P(team,S1)/P(team,S3)' all_parms 0 0 0 1 0 0 / estimate=exp;
   contrast 'P(self,S2)/P(self,S3)' all_parms 0 0 0 0 1 0 / estimate=exp;
   contrast 'P(team,S2)/P(team,S3)' all_parms 0 0 0 0 0 1 / estimate=exp;
   /* School 1 vs. School 2 is the difference of the two School effects */
   contrast 'P(self,S1)/P(self,S2)' all_parms 0 0 1 0 -1 0 / estimate=exp;
   contrast 'P(team,S1)/P(team,S2)' all_parms 0 0 0 1 0 -1 / estimate=exp;
run;
quit;
From the "Analysis of Weighted Least Squares Estimates" table the fitted model is:
log(p_{Self}) = -1.2040 - 0.8755*I(School=1) - 0.0371*I(School=2)
log(p_{Team}) = -1.2040 - 0.2162*I(School=1) - 0.2808*I(School=2)
where I(condition) is the indicator function which returns 1 if the condition is true and returns 0 otherwise.
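The coefficients of this log-probability model can be reproduced from the observed proportions. The Python sketch below assumes that the weighted least squares fit of this saturated model reproduces the observed log probabilities exactly, with School 3 as the reference level. The names log_prob and coef are illustrative.

```python
import math

# Cell counts from the School data set, summed over Program.
counts = {
    1: {"self": 15, "team": 29, "class": 76},
    2: {"self": 37, "team": 29, "class": 62},
    3: {"self": 27, "team": 27, "class": 36},
}

def log_prob(school, style):
    """Log of the observed proportion of a Style within a School."""
    return math.log(counts[school][style] / sum(counts[school].values()))

# Reference parameterization: intercept is School 3, effects are differences.
coef = {}
for style in ("self", "team"):
    coef[style] = {
        "Intercept": log_prob(3, style),
        "School=1": log_prob(1, style) - log_prob(3, style),
        "School=2": log_prob(2, style) - log_prob(3, style),
    }
    terms = ", ".join(f"{name}: {value:.4f}" for name, value in coef[style].items())
    print(f"log(p_{style}): {terms}")
```

Every coefficient is negative here: the log of a probability is negative, and both Styles are less likely in Schools 1 and 2 than in School 3.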

The following table of contrasts shows that the estimated relative risk for Style=self comparing School 1 vs. School 3 is 0.4167. This agrees with the observed probabilities in the "Response Probabilities" table (0.125/0.3 = 0.4167). The standard error and 95% confidence interval for the relative risks are also given. These results are essentially identical to those obtained from the NLEstimate macro above. The 95% confidence limits differ because of the use of different methods for forming the interval. CATMOD forms a large sample confidence interval around the log relative risk and then exponentiates the limits of that interval to produce the confidence limits for the relative risk. Notice that the interval is not symmetric about the relative risk estimate. The macros and NLMIXED use the delta method to produce a confidence interval for the relative risk which is symmetric about the estimate. See this note that discusses methods for obtaining confidence intervals for ratios.
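The difference between the two interval methods can be illustrated with a short Python sketch. It uses the textbook large-sample standard error for the log of a ratio of two independent proportions, which is an assumption here rather than a statement of how either SAS tool computes its standard error, so the limits may not match the SAS output digit for digit. All variable names are illustrative.

```python
import math
from statistics import NormalDist

# Style=self in School 1 (15 of 120) versus School 3 (27 of 90).
x1, n1 = 15, 120
x3, n3 = 27, 90
p1, p3 = x1 / n1, x3 / n3

rr = p1 / p3
log_rr = math.log(rr)

# Large-sample standard error of log(RR) for two independent proportions.
se_log_rr = math.sqrt((1 - p1) / x1 + (1 - p3) / x3)
z = NormalDist().inv_cdf(0.975)

# Method 1: interval on the log scale, then exponentiate the limits.
log_scale_ci = (math.exp(log_rr - z * se_log_rr), math.exp(log_rr + z * se_log_rr))

# Method 2: delta method on the ratio itself, SE(RR) ~ RR * SE(log RR).
se_rr = rr * se_log_rr
delta_ci = (rr - z * se_rr, rr + z * se_rr)

print(f"RR = {rr:.4f}")
print(f"log-scale interval: ({log_scale_ci[0]:.4f}, {log_scale_ci[1]:.4f})")
print(f"delta-method interval: ({delta_ci[0]:.4f}, {delta_ci[1]:.4f})")
```

The log-scale interval is symmetric about log(RR) and therefore asymmetric about RR, and its lower limit can never fall below zero. The delta-method interval is symmetric about RR by construction.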

The relative risks for Style=class comparing Schools cannot be estimated with CONTRAST statements in this parameterization of the model. The easiest way to obtain these estimates is to refit the model using a different ordering of the response levels. In the above example, this can be done by removing the ORDER=DATA option. This will produce an equivalent model in which Style=class is the first level rather than Style=self.
Using PROC NLMIXED
These relative risk estimates can also be obtained in PROC NLMIXED using maximum likelihood estimation, though it is a little more difficult since the multinomial likelihood must be coded in the procedure. NLMIXED requires a numeric response, so variable Y is created from the Style variable. In the PROC NLMIXED statements that follow, a large value is specified in the df= option to provide large-sample tests and confidence intervals. The linear component of the model for the two probabilities is specified in the ep1= and ep2= assignment statements. The p3= assignment statement ensures that the probabilities properly sum to 1. The following statements define the multinomial log likelihood and specify that the response, Y, is distributed according to that log likelihood. The p= assignment statement ensures that each probability is valid (between 0 and 1). Finally, the ESTIMATE statements provide estimates of each of the relative risks of interest. Notice that the relative risk estimates for Style=class are computed using the fact that p_{class} + p_{self} + p_{team} = 1.
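proc nlmixed data=School df=1e8;
   if Style="self" then y=1;
   else if Style="team" then y=2;
   else y=3;
   /* generalized logits with Style=class as the reference level */
   ep1=exp(int1 + b11*(School=1) + b12*(School=2));
   ep2=exp(int2 + b21*(School=1) + b22*(School=2));
   p3=1/(1+ep1+ep2);
   p1=p3*ep1;
   p2=p3*ep2;
   if y=1 then p=p1;
   else if y=2 then p=p2;
   else p=p3;
   /* keep p in (0,1] so that log(p) is defined */
   p = (p>0 and p<=1)*p + (p<=0)*1e-8 + (p>1);
   loglik = Count*log(p);
   model y ~ general(loglik);
   /* each relative risk is a ratio of two school-specific probabilities */
   estimate 'P(self,S1)/P(self,S2)' (exp(int1+b11)/(1+exp(int1+b11)+exp(int2+b21))) / (exp(int1+b12)/(1+exp(int1+b12)+exp(int2+b22)));
   estimate 'P(self,S1)/P(self,S3)' (exp(int1+b11)/(1+exp(int1+b11)+exp(int2+b21))) / (exp(int1)/(1+exp(int1)+exp(int2)));
   estimate 'P(self,S2)/P(self,S3)' (exp(int1+b12)/(1+exp(int1+b12)+exp(int2+b22))) / (exp(int1)/(1+exp(int1)+exp(int2)));
   estimate 'P(team,S1)/P(team,S2)' (exp(int2+b21)/(1+exp(int1+b11)+exp(int2+b21))) / (exp(int2+b22)/(1+exp(int1+b12)+exp(int2+b22)));
   estimate 'P(team,S1)/P(team,S3)' (exp(int2+b21)/(1+exp(int1+b11)+exp(int2+b21))) / (exp(int2)/(1+exp(int1)+exp(int2)));
   estimate 'P(team,S2)/P(team,S3)' (exp(int2+b22)/(1+exp(int1+b12)+exp(int2+b22))) / (exp(int2)/(1+exp(int1)+exp(int2)));
   estimate 'P(class,S1)/P(class,S2)' (1/(1+exp(int1+b11)+exp(int2+b21))) / (1/(1+exp(int1+b12)+exp(int2+b22)));
   estimate 'P(class,S1)/P(class,S3)' (1/(1+exp(int1+b11)+exp(int2+b21))) / (1/(1+exp(int1)+exp(int2)));
   estimate 'P(class,S2)/P(class,S3)' (1/(1+exp(int1+b12)+exp(int2+b22))) / (1/(1+exp(int1)+exp(int2)));
run;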
The results are the same as from the NLMeans and NLEstimate macros above.
In the example above, suppose you want to estimate relative risks comparing the instruction styles at each school. For example, you might want to estimate the relative risk comparing the self and class styles in school 1. If you fit a generalized logit model to the data, then the logit response functions that are modeled are the log of two relative risks defined on the three response levels such as log(p_{self}/p_{class}) and log(p_{team}/p_{class}). Exponentiating the estimates of these logit response functions yields estimates of the relative risks.
The following statements again fit the generalized logit model to the data using PROC LOGISTIC. The ORDER=DATA response variable option orders the Styles as they first appear in the data so that the logits are log(p_{self}/p_{class}) and log(p_{team}/p_{class}). The LSMEANS statement provides estimates of the log odds for each School. The ILINK option adds estimates of the Style level probabilities for each School. The E option produces a table of coefficients of the linear combination of parameters that define the log odds for each School on each Style logit. The table is saved by the ODS OUTPUT statement for use later with the NLMeans macro. The STORE statement saves the fitted model for use with the NLMeans and NLEstimate macros. The XBETA= option in the OUTPUT statement computes the log relative risks using the model parameter estimates and saves the values in variable XB in data set LogRR. The standard errors of the log relative risks are also computed and saved in variable S. The OUT= data set is structured to have one observation per response level for each input observation. As a result, the LogRR data set has three times as many observations as the input data set, School.
proc logistic data=School;
   freq count;
   class school / param=glm;
   model style(order=data) = school / link=glogit;
   lsmeans School / e ilink;
   ods output coef=coeffs;
   output out=LogRR xbeta=xb stdxbeta=s;
   store out=logmod;
run;
The model parameter estimates and LSMEANS results are repeated below.

Using the NLMeans macro
The NLMeans macro requires the saved model from the STORE statement, the saved table of coefficients from the LSMEANS / E statement, and the link function used in the model. By default, the NLMeans macro estimates the pairwise differences of the School probabilities within each of the Styles. Specifying options=ratio produces ratios rather than differences so that relative risks can be estimated. However, the desired relative risks are defined across, rather than within, the Styles. This can be accomplished by specifying the desired contrasts of the probabilities as rows in a data set which is then specified in the contrasts= parameter of the macro.
The contrasts= data set must contain variables named SET and K1, K2, ... , Kn. Optionally, a LABEL variable can be included to label each of the specified contrasts. The SET variable is primarily used when multiple sets of means are estimated by the SLICE statement or by multiple LSMEANS, SLICE, or ESTIMATE statements. Since there is only a single LSMEANS statement and therefore only a single set of LSmeans, SET=1. For a generalized logit model, n is the number of estimates from the LSMEANS statement plus the number of estimates within a logit. There are six probability estimates provided by the LSMEANS statement above. These are the estimated probabilities for the three Schools on two of the three Styles. Probability estimates can also be obtained for the third Style, Style=class, making a total of nine probability estimates. So, in this case n=9.
In the following data set, the K variables are in the same order as shown in the LSMEANS table: K1-K3 correspond to the three Schools on Style=self, K4-K6 correspond to the three Schools on Style=team, and the additional three variables, K7-K9, correspond to the three Schools on Style=class. In the first row, the 1 value selects the School=1, Style=self probability for the relative risk numerator. The -1 value selects the School=1, Style=class probability for the denominator. With options=ratio, the result from this row is, as shown in the label, the relative risk comparing the self and class Styles within School 1. Each of the following rows defines a relative risk comparing a pair of the Styles within one of the Schools.
data cont;
   length label $40;
   infile datalines missover;
   input label k1-k9;
   set=1;
   datalines;
P(self,S1)/P(class,S1) 1 0 0 0 0 0 -1 0 0
P(team,S1)/P(class,S1) 0 0 0 1 0 0 -1 0 0
P(self,S1)/P(team,S1)  1 0 0 -1 0 0 0 0 0
P(self,S2)/P(class,S2) 0 1 0 0 0 0 0 -1 0
P(team,S2)/P(class,S2) 0 0 0 0 1 0 0 -1 0
P(self,S2)/P(team,S2)  0 1 0 0 -1 0 0 0 0
P(self,S3)/P(class,S3) 0 0 1 0 0 0 0 0 -1
P(team,S3)/P(class,S3) 0 0 0 0 0 1 0 0 -1
P(self,S3)/P(team,S3)  0 0 1 0 0 -1 0 0 0
;
%NLMeans(instore=logmod, coef=coeffs, link=glogit, options=ratio, contrasts=cont,
         title=Style RRs in each School)
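One way to read a row of the contrasts data set is sketched below in Python. It assumes the coding in which a 1 marks the numerator probability and a -1 marks the denominator probability, and it uses the observed proportions as stand-ins for the nine fitted probabilities (K1 to K9 order). It illustrates the coding only and is not the NLMeans macro's internal computation. The function name ratio_from_row is hypothetical.

```python
# Cell counts from the School data set, summed over Program.
counts = {
    1: {"self": 15, "team": 29, "class": 76},
    2: {"self": 37, "team": 29, "class": 62},
    3: {"self": 27, "team": 27, "class": 36},
}

styles = ("self", "team", "class")
schools = (1, 2, 3)

# Probabilities in K1..K9 order: Schools 1-3 on self, then team, then class.
probs = [counts[k][s] / sum(counts[k].values()) for s in styles for k in schools]

# Rows for School 1, coded with 1 for the numerator and -1 for the denominator.
rows = {
    "P(self,S1)/P(class,S1)": [1, 0, 0, 0, 0, 0, -1, 0, 0],
    "P(team,S1)/P(class,S1)": [0, 0, 0, 1, 0, 0, -1, 0, 0],
    "P(self,S1)/P(team,S1)":  [1, 0, 0, -1, 0, 0, 0, 0, 0],
}

def ratio_from_row(k):
    """Ratio of the probability flagged 1 to the probability flagged -1."""
    numerator = probs[k.index(1)]
    denominator = probs[k.index(-1)]
    return numerator / denominator

for label, k in rows.items():
    print(f"{label} = {ratio_from_row(k):.4f}")
```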
Following are the results from the NLMeans macro.

Using the NLEstimate macro
These statements create a data set with one observation for each relative risk to be estimated. The NLEstimate macro is then called to estimate the relative risks. See the example above and the description of the NLEstimate macro for details.
data fd;
   length label f $32767;
   /* label and expression are separated by a vertical bar */
   infile datalines delimiter='|';
   input label f;
   datalines;
P(self,S1)/P(class,S1)|exp(b_p1+b_p3)
P(team,S1)/P(class,S1)|exp(b_p2+b_p4)
P(self,S1)/P(team,S1)|exp(b_p1+b_p3-b_p2-b_p4)
P(self,S2)/P(class,S2)|exp(b_p1+b_p5)
P(team,S2)/P(class,S2)|exp(b_p2+b_p6)
P(self,S2)/P(team,S2)|exp(b_p1+b_p5-b_p2-b_p6)
P(self,S3)/P(class,S3)|exp(b_p1)
P(team,S3)/P(class,S3)|exp(b_p2)
P(self,S3)/P(team,S3)|exp(b_p1-b_p2)
;
%NLEstimate(instore=logmod, fdata=fd)
The results are the same as from the NLMeans macro above.
Using the predicted log relative risks
Another approach is to form confidence intervals around the log relative risk estimates and exponentiate the limits. This is different from the way the limits are obtained by the NLMeans and NLEstimate macros and by PROC NLMIXED. In the following DATA step, the log relative risks are exponentiated to obtain the relative risk estimates comparing response level probabilities. Confidence limits for 95% large-sample confidence intervals are also computed for each relative risk. The PRINT step displays the relative risk estimates and confidence intervals for each school. The WHERE statement is used to remove the repetition in the data set.
data RR;
   set LogRR;
   RR = exp(xb);
   /* 95% limits formed on the log scale, then exponentiated */
   RR_LCL = exp(xb-probit(.975)*s);
   RR_UCL = exp(xb+probit(.975)*s);
run;
proc print data=RR;
   where program="regular" and style="self" and _level_ ne "class";
   id school _level_;
   var RR:;
run;
The two relative risk estimates per school are labeled in the _LEVEL_ column using the Style that appears in the numerator of the relative risk. The class Style is in the denominator of each. These results essentially agree with those from the NLMeans and NLEstimate macros. Again, the 95% confidence limits differ somewhat for the reasons discussed in the previous section.
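The DATA step computation can be mimicked outside SAS. In the Python sketch below, xb and s are stand-ins for the XBETA= and STDXBETA= values, computed from the cell counts under the assumption that the model is saturated, so that the log relative risk is the observed log ratio of counts and its standard error is sqrt(1/n_numerator + 1/n_denominator). Treat the limits as approximate checks on the SAS output, not as a replacement for it. The function name rr_vs_class is illustrative.

```python
import math
from statistics import NormalDist

# Cell counts from the School data set, summed over Program.
counts = {
    1: {"self": 15, "team": 29, "class": 76},
    2: {"self": 37, "team": 29, "class": 62},
    3: {"self": 27, "team": 27, "class": 36},
}

z = NormalDist().inv_cdf(0.975)  # plays the role of probit(.975)

def rr_vs_class(school, style):
    """Relative risk of a Style versus class in one School, with 95% limits."""
    n_num = counts[school][style]
    n_den = counts[school]["class"]
    xb = math.log(n_num / n_den)           # log relative risk (assumed XBETA)
    s = math.sqrt(1 / n_num + 1 / n_den)   # its standard error (assumed STDXBETA)
    return math.exp(xb), math.exp(xb - z * s), math.exp(xb + z * s)

for school in (1, 2, 3):
    for style in ("self", "team"):
        rr, lcl, ucl = rr_vs_class(school, style)
        print(f"School {school}, {style} vs class: "
              f"RR={rr:.4f} ({lcl:.4f}, {ucl:.4f})")
```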

If the relative risk estimate for the third possible pair of levels, self and team, is desired, the easiest way is to refit the model using a different ordering of the response levels. In this example, this can be done by removing the ORDER=DATA response option. This will produce an equivalent model in which the second logit compares self and team.
Using PROC NLMIXED
The relative risks can also be estimated by fitting the generalized logit model in PROC NLMIXED. As in the example above, the linear model for the two logits is specified in the ep1= and ep2= assignment statements and the individual level probabilities are derived from the logits in the p1, p2, and p3 statements. The following statements define the multinomial log likelihood and specify that the numeric response, Y, is distributed according to that log likelihood. The p= assignment statement ensures that each probability is valid (between 0 and 1). The ESTIMATE statements, for each school, provide estimates of the relative risks comparing each pair of response levels. Note that the relative risk comparing self and team can be obtained from the difference of the two logits since log(p_{self}/p_{team}) = log(p_{self}/p_{class}) - log(p_{team}/p_{class}).
proc nlmixed data=School df=1e8;
   if Style="self" then y=1;
   else if Style="team" then y=2;
   else y=3;
   /* generalized logits with Style=class as the reference level */
   ep1=exp(int1 + b11*(School=1) + b12*(School=2));
   ep2=exp(int2 + b21*(School=1) + b22*(School=2));
   p3=1/(1+ep1+ep2);
   p1=p3*ep1;
   p2=p3*ep2;
   if y=1 then p=p1;
   else if y=2 then p=p2;
   else p=p3;
   /* keep p in (0,1] so that log(p) is defined */
   p = (p>0 and p<=1)*p + (p<=0)*1e-8 + (p>1);
   loglik = Count*log(p);
   model y ~ general(loglik);
   estimate 'P(self,S1)/P(class,S1)' exp(int1+b11);
   estimate 'P(team,S1)/P(class,S1)' exp(int2+b21);
   estimate 'P(self,S1)/P(team,S1)' exp(int1+b11-int2-b21);
   estimate 'P(self,S2)/P(class,S2)' exp(int1+b12);
   estimate 'P(team,S2)/P(class,S2)' exp(int2+b22);
   estimate 'P(self,S2)/P(team,S2)' exp(int1+b12-int2-b22);
   estimate 'P(self,S3)/P(class,S3)' exp(int1);
   estimate 'P(team,S3)/P(class,S3)' exp(int2);
   estimate 'P(self,S3)/P(team,S3)' exp(int1-int2);
run;
The results are the same as those from the NLMeans and NLEstimate macros.
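For reference, the step behind the self-versus-team ESTIMATE statements can be written out as a short derivation. The parameter names int1, b11, int2, and b21 are those of the NLMIXED program above. The derivation shows School 1; the other Schools follow the same pattern with their own School terms.

```latex
\begin{align*}
\log\frac{p_{self}}{p_{team}}
  &= \log\frac{p_{self}/p_{class}}{p_{team}/p_{class}}
   = \log\frac{p_{self}}{p_{class}} - \log\frac{p_{team}}{p_{class}} \\
\intertext{so that, in School 1,}
\frac{p_{self}}{p_{team}}
  &= \exp\bigl\{(int1 + b11) - (int2 + b21)\bigr\}
\end{align*}
```

The class probability cancels, which is why these ratios need only the two logits and no denominator term of the form 1 + exp(...) + exp(...).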
Type:  Usage Note 
Priority:  
Topic:  Analytics ==> Categorical Data Analysis Analytics ==> Regression SAS Reference ==> Procedures ==> CATMOD SAS Reference ==> Procedures ==> LOGISTIC SAS Reference ==> Procedures ==> NLMIXED SAS Reference ==> Macro 
Date Modified:  2018-06-01 09:51:16 
Date Created:  2016-03-04 14:47:57 