When modeling the probability of an event, p, the INVERSECL option in PROC PROBIT provides estimates of a continuous predictor (often a dose variable) that yield a range of response probabilities. See the examples in the PROBIT documentation. Suppose you have several groups of subjects and you fit a model to each group. You may want to compare groups with respect to the dose that yields, say, a 50% response probability (commonly called the ED50 or LD50). The following discusses and illustrates how tests and confidence intervals of differences of such dose values can be obtained.
PROC PROBIT fits the model F^{1}(p)=x'β, where F is the cumulative distribution function used to model the probability (normal for probit models or logistic for logistic models), x is the vector of predictor values, and β is the corresponding vector of parameters.
The first step is to write a single model that encompasses all groups by simultaneously fitting separate curves to all of the groups. This is discussed in generality in this usage note, but is illustrated below for the case of modeling a response probability. Group dose values can then be compared by testing a function of the parameters in this model. The model that simultaneously estimates curves for all groups can be fitted by including the group variable in the CLASS and MODEL statements. For example, these statements
proc probit; class group; model response = group x; run;
fit the following parallel lines probit model with separate lines (lines on the probit scale, but curves on the probability scale) for each of the groups:
F^{1}(p) = probit(p) = α + γ_{i} + βX ,
where α is the overall intercept, γ_{i} is the parameter for the i^{th} group representing the deviation of the group i intercept from the overall intercept, and β is the common slope on X.
A model that allows for nonparallel lines on the probit scale can be obtained by adding the GROUP×X interaction. These statements
proc probit; class group; model response = group x group*x; run;
fit the nonparallel lines probit model
F^{1}(p) = probit(p) = α + γ_{i} + βX + δ_{i}X = α + γ_{i} + (β+δ_{i})X ,
where δ_{i} is the deviation of the group i slope from the common slope, β.
By simple algebraic rearrangement of the model, F^{1}(p)=x'β, the following estimator of the dose, ^ x , yielding a given response probability, p, is obtained as shown in "Inverse Confidence Limits" in the Details section of the PROBIT documentation:
^ x = β^{1}[F^{1}(p)  x^{*}'b^{*}]
where β is the dose variable's parameter, x^{*} is the vector of predictors other than the dose variable, and b^{*} is the corresponding vector of parameter estimates. Suppose you have two groups, 0 and 1, and want to compare their ED50 values. Since the ED50 is the dose giving a 50% response, the response probability is p=0.5. Then the difference in group dose estimates in the parallel lines model is
^ x_{1}  ^ x_{0} = β^{1}[F^{1}(0.5)  α]  β^{1}[F^{1}(0.5)  (α+γ_{0})] = γ_{0}/β
With the group variable in the CLASS statement, the last level (1 in this case) is treated as the reference category, so its parameter, γ_{1}, is set to zero. Note that the difference is invariant with respect to the chosen response probability, p, since the lines are parallel. For the nonparallel lines model, the difference in group dose estimates is
^ x_{1}  ^ x_{0} = β^{1}[F^{1}(0.5)  α]  (β+δ_{0})^{1}[F^{1}(0.5)  (α+γ_{0})]
The deviation of the group 1 slope, δ_{1}, is zero since group 1 is the reference level.
While these differences cannot be estimated in PROC PROBIT, they can be estimated using the NLEstimate macro or equivalently by fitting the model in PROC NLMIXED and using the ESTIMATE statement.
The data in the example titled "An Epidemiology Study" in the PROBIT documentation records the number of individuals surviving after an epidemic, r, out of the number treated, n, for combinations of dosage of a medicine (dose), treatment (treat = A, B), and sex (sex = 0(Female), 1(Male)). For purposes of this discussion, the treatment variable is not used.
data epidemic; input treat$ dose n r sex @@; label dose = Dose; datalines; A 2.17 142 142 0 A .57 132 47 1 A 1.68 128 105 1 A 1.08 126 100 0 A 1.79 125 118 0 B 1.66 117 115 1 B 1.49 127 114 0 B 1.17 51 44 1 B 2.00 127 126 0 B .80 129 100 1 ;
These statements fit the nonparallel lines model that simultaneously fits unrestricted lines to both sexes. However, with the SEX×DOSE interaction in the model, PROC PROBIT can provide neither the separate ED50 estimates for the sexes nor the difference in the estimates using the INVERSECL option. The STORE statement is used to save the model information for use later by the NLEstimate macro.
proc probit data=epidemic; class sex; model r/n = sex dose sex*dose; store out=prbmod; run;
The estimated parameters of the model appear below.

To obtain the ED50 estimates for the two sexes, the following statements are used instead. They model the effect of dosage on the response probability, separately for each sex, using a probit model. Note that the models for the two sexes are unrestricted, allowing the sexes to have separate intercepts and slopes. The INVERSECL(PROB=0.5) option requests an estimate of the dosage that produces a 50% response probability (ED50) for each sex.
proc sort data=epidemic; by sex; run; proc probit data=epidemic; by sex; model r/n = dose / inversecl(prob=0.5); run;
Note that the ED50 estimates are 0.581 for females (sex=0) and .571 for males (sex=1). The difference (males  females) is 0.01.
sex=0
sex=1

Difference estimation using the NLEstimate macro
Since these are nonlinear functions of the model parameters, they can be estimated using the NLEstimate macro. The NLEstimate macro uses the fitted model information saved with the STORE statement in PROC PROBIT. It then uses PROC NLMIXED to estimate the functions and the delta method to obtain confidence limits. When estimating several functions, it is useful to specify them in a data set with appropriate labels. This is done in the following statements and the resulting data set is specified in the fdata= NLEstimate macro parameter. You write the functions to be estimated using the parameter names. See the description of the NLEstimate macro for details about displaying parameter names and using the macro.
data fd; length label f $32767; infile datalines delimiter=','; input label f; datalines; Male ED50,(1/b_p4)*(probit(0.5)b_p1) Female ED50,(1/(b_p4+b_p5))*(probit(0.5)(b_p1+b_p2)) Male  Female ED50,(1/b_p4)*(probit(0.5)b_p1)  (1/(b_p4+b_p5))*(probit(0.5)(b_p1+b_p2)) ; %NLEstimate(instore=prbmod, fdata=fd, df=10)
The estimated difference in ED50 values (males  females) is 0.01 as found earlier when fitting separate models^{Note}. In addition to the point estimate, the standard error of the difference (0.1560), a 95% confidence interval (0.3575, 0.3375), and a test that the difference equals zero (p=0.9502) are given. The test and confidence interval indicate that the small difference is not significant.

Since the test of the SEX×DOSE interaction in the above model is not significant (p ≈ 0.4), you might conclude that the effects of dose are the same in the two sexes and prefer a parallel lines model. The following statements fit the parallel lines model and obtain the ED50 estimates for each sex. To do this, PROC PROBIT is run two times, fitting the same model using all of the data from both sexes, but using different XDATA= data sets to estimate the two ED50 values. The XDATA= data set allows you to specify the group for which the INVERSECL option obtains estimates. The data set should contain one observation which defines the setting of the covariates for which you want dose estimates. The data set must also contain some nonmissing value for the response, though the actual value does not affect the computation. In data set FEMALE, SEX=0 and in data set MALE, SEX=1. In both data sets, DOSE is arbitrarily set to 1. Specifying XDATA=FEMALE in PROC PROBIT causes the INVERSECL(PROB=0.5) option to report the ED50 estimate for females. Similarly, XDATA=MALE provides the ED50 estimate for males. The STORE statement is used in the first PROBIT step to save the model for use in the NLEstimate macro.
/* Request ED50 estimate for females */ data female; input sex dose; datalines; 0 1 ; proc probit data=epidemic xdata=female; class sex; model r/n = sex dose / inversecl(prob=0.5); store out=prllmod; title 'Female ED50 estimate'; run; /* Request ED50 estimate for males */ data male; input sex dose; datalines; 1 1 ; proc probit data=epidemic xdata=male; class sex; model r/n = sex dose / inversecl(prob=0.5); title 'Male ED50 estimate'; run;

The ED50 for each sex and their difference are again specified in a data set and estimated by the NLEstimate macro.
data fd; length label f $32767; infile datalines delimiter=','; input label f; datalines; Male ED50,(1/b_p4)*(probit(0.5)b_p1) Female ED50,(1/b_p4)*(probit(0.5)(b_p1+b_p2)) Male  Female ED50,b_p2/b_p4 ; %NLEstimate(instore=prllmod, fdata=fd, df=10)
While the ED50 values are slightly more different than under the nonparallel lines model (0.12 vs 0.01), the difference is still not significant (p=0.2172).

Difference estimation using PROC NLMIXED
The ED50 estimates and difference can also be obtained by using PROC NLMIXED directly. The following fits the nonparallel lines model using PROC NLMIXED and estimates the difference in ED50 values. The model is specified by the assignment statement defining the response probability, p, as F(x'β), and by the MODEL statement indicating the number of responses, r, is distributed as binomial with number of trials, n, and response probability p. The first ESTIMATE statement estimates the ED50 for males using the first half of the difference for nonparallel models shown above. Similarly, the second ESTIMATE statement estimates the ED50 for females using the second half. The third ESTIMATE statement estimates the male  female difference in ED50 values. Note that since this is a probit model, F is the cumulative standard normal distribution. The PROBNORM function returns values from this distribution, and the PROBIT function returns values of its inverse.
proc nlmixed data=epidemic; p=probnorm(b0 + bsex*(sex=0) + bdose*dose + bsexdose*dose*(sex=0)); model r ~ binomial(n,p); estimate "Male ED50" (1/bdose)*(probit(0.5)b0); estimate "Female ED50" (1/(bdose+bsexdose))*(probit(0.5)(b0+bsex)); estimate "Male  Female ED50" (1/bdose)*(probit(0.5)b0)  (1/(bdose+bsexdose))*(probit(0.5)(b0+bsex)); run;
The results are the same as from the NLEstimate macro above.
PROC NLMIXED can again be used to fit the parallel lines model and provide an estimate, test, and confidence interval comparing the ED50 values. When fitting a parallel lines model, you assume that the difference in values that provides a 50% response is the same as the difference at any response probability. Note that as shown above, the estimator of the difference does not depend on the chosen response probability. In this example, the parameter for the nonreference level of the group variable, γ_{0} is BSEX, and the common slope, β, for the dose variable is BDOSE.
proc nlmixed data=epidemic; p=probnorm(b0 + bsex*(sex=0) + bdose*dose); model r ~ binomial(n,p); estimate "Male ED50" (1/bdose)*(probit(0.5)b0); estimate "Female ED50" (1/bdose)*(probit(0.5)(b0+bsex)); estimate "Male  Female ED50" bsex/bdose; run;
The results again match those from the NLEstimate macro above.
_____
Note: The PROC PROBIT step without the BY statement uses all the data from both sexes to fit the nonparallel lines model. The PROC PROBIT step with the BY statement fits separate models for the two sexes using only the data from one sex in each model. Because of this, the results may not be identical. Fitting a single model to all groups, using all the data, is usually preferable.
Product Family  Product  System  SAS Release  
Reported  Fixed*  
SAS System  SAS/STAT  z/OS  
OpenVMS VAX  
Microsoft® Windows® for 64Bit Itaniumbased Systems  
Microsoft Windows Server 2003 Datacenter 64bit Edition  
Microsoft Windows Server 2003 Enterprise 64bit Edition  
Microsoft Windows XP 64bit Edition  
Microsoft® Windows® for x64  
OS/2  
Microsoft Windows 95/98  
Microsoft Windows 2000 Advanced Server  
Microsoft Windows 2000 Datacenter Server  
Microsoft Windows 2000 Server  
Microsoft Windows 2000 Professional  
Microsoft Windows NT Workstation  
Microsoft Windows Server 2003 Datacenter Edition  
Microsoft Windows Server 2003 Enterprise Edition  
Microsoft Windows Server 2003 Standard Edition  
Microsoft Windows Server 2003 for x64  
Microsoft Windows Server 2008  
Microsoft Windows Server 2008 for x64  
Microsoft Windows XP Professional  
Windows 7 Enterprise 32 bit  
Windows 7 Enterprise x64  
Windows 7 Home Premium 32 bit  
Windows 7 Home Premium x64  
Windows 7 Professional 32 bit  
Windows 7 Professional x64  
Windows 7 Ultimate 32 bit  
Windows 7 Ultimate x64  
Windows Millennium Edition (Me)  
Windows Vista  
Windows Vista for x64  
64bit Enabled AIX  
64bit Enabled HPUX  
64bit Enabled Solaris  
ABI+ for Intel Architecture  
AIX  
HPUX  
HPUX IPF  
IRIX  
Linux  
Linux for x64  
Linux on Itanium  
OpenVMS Alpha  
OpenVMS on HP Integrity  
Solaris  
Solaris for x64  
Tru64 UNIX 
Type:  Usage Note 
Priority:  
Topic:  SAS Reference ==> Procedures ==> PROBIT Analytics ==> Categorical Data Analysis Analytics ==> Regression SAS Reference ==> Macro 
Date Modified:  20190503 13:58:17 
Date Created:  20111122 14:44:09 