The PROBIT Procedure

ODS Graphics

Statistical procedures use ODS Graphics to create graphs as part of their output. ODS Graphics is described in detail in Chapter 21: Statistical Graphics Using ODS.

Before you create graphs, you must enable ODS Graphics (for example, by specifying the ODS GRAPHICS ON statement). For more information about enabling and disabling ODS Graphics, see the section Enabling and Disabling ODS Graphics in Chapter 21: Statistical Graphics Using ODS.

The overall appearance of graphs is controlled by ODS styles. Styles and other aspects of using ODS Graphics are discussed in the section A Primer on ODS Statistical Graphics in Chapter 21: Statistical Graphics Using ODS.

These ODS graphs are controlled by the PLOTS= option in the PROC PROBIT statement. You can specify more than one graph request with the PLOTS= option. Table 75.41 summarizes these requests.

Table 75.41: Options for Plots

Option       Plot
ALL          All appropriate plots
CDFPLOT      Estimated cumulative probability
IPPPLOT      Inverse predicted probability
LPREDPLOT    Linear predictor
NONE         No plot
PREDPPLOT    Predicted probability


The following subsections provide information about these graphs.

ODS Graph Names

PROC PROBIT assigns a name to each graph it creates using ODS. You can use these names to reference the graphs when using ODS. The names are listed in Table 75.42.

Table 75.42: Graphs Produced by PROC PROBIT

ODS Graph Name   Plot Description                   Statement   PLOTS= Option
CDFPlot          Estimated cumulative probability   PROC        CDFPLOT
IPPPlot          Inverse predicted probability      PROC        IPPPLOT
LPredPlot        Linear predictor                   PROC        LPREDPLOT
PredPPlot        Predicted probability              PROC        PREDPPLOT


CDF Plot

For a multinomial model, the predicted cumulative distribution function is defined as

\[ {\hat F}_{j}(\mb {x}) = C + (1-C)F({\hat a}_ j + \mb {x}^{\prime } \mb {\hat b}) \]

where $j = 1,\ldots ,k$ are the indexes of the k levels of the multinomial response variable, F is the CDF of the distribution used to model the cumulative probabilities, $\mb {\hat b}$ is the vector of estimated parameters, $\mb {x}$ is the covariate vector, ${\hat a}_ j$ are estimated ordinal intercepts with ${\hat a}_1 = 0$, and C is the threshold parameter, either known or estimated from the model. Let $x_1$ be the covariate corresponding to the dose variable and $\mb {x}_{-1}$ be the vector of the rest of the covariates. Let the corresponding estimated parameters be ${\hat b}_1$ and ${\mb {\hat b}}_{-1}$. Then

$\displaystyle  {\hat F}_{j}(\mb {x})= C + (1-C)F({\hat a}_ j + x_1 {\hat b}_1 + \mb {x}^{\prime }_{-1} {\mb {\hat b}}_{-1})  $

To plot ${\hat F}_{j}$ as a function of $x_1$, $\mb {x}_{-1}$ must be specified. You can use the XDATA= option to provide the values of $\mb {x}_{-1}$ (see the XDATA= option in the PROC PROBIT statement for details), or use the default values that follow these rules:

  • If the effect contains a continuous variable (or variables), the overall mean of this effect is used.

  • If the effect is a single classification variable, the highest level of the variable is used.

The LEVEL= suboption specifies the levels of the multinomial response variable for which the CDF curves are requested. A k-level multinomial response variable has k – 1 curves (for the highest level, the curve is the constant line at 1). You can request any of them with the LEVEL= suboption. See the plot in Output 75.2.6 for an example.
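The CDF formula above can be illustrated with a minimal Python sketch. It assumes a standard normal F (the probit case) and uses made-up values for the intercepts, the dose slope, and the dose grid; none of these numbers come from a fitted model, and the function name predicted_cdf is hypothetical. The argument rest stands for the fixed contribution $\mb {x}^{\prime }_{-1} {\mb {\hat b}}_{-1}$ of the remaining covariates.

```python
from statistics import NormalDist


def predicted_cdf(x1, a_j, b1, rest=0.0, c=0.0):
    """Return C + (1 - C) * F(a_j + x1*b1 + rest), with F the standard normal CDF.

    rest is the fixed contribution x_{-1}' b_{-1} of the other covariates.
    """
    eta = a_j + x1 * b1 + rest
    return c + (1.0 - c) * NormalDist().cdf(eta)


# Hypothetical estimates for a 3-level response (not from any fitted model).
intercepts = [0.0, 1.2]   # a_1 = 0 by convention, then a_2
b1 = 0.8                  # assumed slope for the dose variable
doses = [-2.0, -1.0, 0.0, 1.0, 2.0]

# One curve per level j = 1, ..., k - 1; the curve for level k is constant at 1.
curves = {
    j: [predicted_cdf(x, a_j, b1) for x in doses]
    for j, a_j in enumerate(intercepts, start=1)
}

for j, values in curves.items():
    print(j, [round(v, 3) for v in values])
```

With increasing intercepts, the curve for level 2 lies on or above the curve for level 1 at every dose, which is what a cumulative probability requires.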

Inverse Predicted Probability Plot

For the binomial model, the response variable is a probability. An estimate of the dose level ${\hat x}_1$ needed for a response of p is given by

$\displaystyle  {\hat x}_1 = (F^{-1}(p) - \mb {x}_{-1}^{\prime } {\mb {\hat b}}_{-1})/{\hat b}_1  $

where F is the cumulative distribution function used to model the probability, $\mb {x}_{-1}$ is the vector of the rest of the covariates, ${\mb {\hat b}}_{-1}$ is the vector of the estimated parameters corresponding to $\mb {x}_{-1}$, and ${\hat b}_1$ is the estimated parameter for the dose variable of interest.

To plot ${\hat x}_1$ as a function of p, $\mb {x}_{-1}$ must be specified. You can use the XDATA= option to provide the values of $\mb {x}_{-1}$ (see the XDATA= option in the PROC PROBIT statement for details), or use the default values that follow these rules:

  • If the effect contains a continuous variable (or variables), the overall mean of this effect is used.

  • If the effect is a single classification variable, the highest level of the variable is used.

Output 75.4.12 in Example 75.4 shows an inverse predicted probability plot.
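As a rough numerical illustration of the inverse prediction formula in this section, the following Python sketch evaluates ${\hat x}_1$ over a few probabilities. It assumes a standard normal F; the slope b1, the offset rest (standing for $\mb {x}_{-1}^{\prime } {\mb {\hat b}}_{-1}$), and the function name inverse_predicted_dose are hypothetical choices for illustration only.

```python
from statistics import NormalDist


def inverse_predicted_dose(p, b1, rest=0.0):
    """Return (F^{-1}(p) - rest) / b1, with F the standard normal CDF."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    if b1 == 0.0:
        raise ValueError("the dose slope b1 must be nonzero")
    return (NormalDist().inv_cdf(p) - rest) / b1


# Hypothetical estimates (not from any fitted model).
b1, rest = 2.0, -3.0
probabilities = [0.1, 0.25, 0.5, 0.75, 0.9]
doses = [inverse_predicted_dose(p, b1, rest) for p in probabilities]

for p, d in zip(probabilities, doses):
    print(p, round(d, 3))
```

Substituting each computed dose back into F(rest + b1 * dose) recovers the requested probability, which is a quick check that the inversion is consistent.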

Linear Predictor Plot

For both binomial models and multinomial models, the linear predictor $\mb {x}^{\prime }\mb {b}$ can be plotted against the first single continuous variable (dose variable) in the MODEL statement.

Let $x_1$ be the covariate of the dose variable, $\mb {x}_{-1}$ be the vector of the rest of the covariates, ${\mb {\hat b}}_{-1}$ be the vector of estimated parameters corresponding to $\mb {x}_{-1}$, and ${\hat b}_1$ be the estimated parameter for the dose variable of interest.

To plot $\mb {x}^{\prime }{\mb {\hat b}}$ as a function of $x_1$, $\mb {x}_{-1}$ must be specified. You can use the XDATA= option to provide the values of $\mb {x}_{-1}$ (see the XDATA= option in the PROC PROBIT statement for details), or use the default values that follow these rules:

  • If the effect contains a continuous variable (or variables), the overall mean of this effect is used.

  • If the effect is a single classification variable, the highest level of the variable is used.

For the multinomial model, you can use the LEVEL= suboption to specify the levels for which the linear predictor lines are plotted.

The confidence limits for the predicted values are only available for the binomial model. Output 75.4.13 in Example 75.4 shows a linear predictor plot for a binomial model.
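The default-value rules for the remaining covariates can be sketched in Python as follows. This is only an illustration under stated assumptions: the data rows, the parameter estimates, and the helper name default_covariates are hypothetical, and "highest level" is taken here to mean the last level in sorted order, which might differ from the ordering that the procedure actually applies.

```python
from statistics import mean


def default_covariates(rows, continuous, classification):
    """Continuous variables default to their overall mean; classification
    variables default to their highest level (assumed: largest sorted value)."""
    defaults = {name: mean(row[name] for row in rows) for name in continuous}
    defaults.update({name: max(row[name] for row in rows) for name in classification})
    return defaults


# Hypothetical input data (not from any example in this chapter).
rows = [
    {"dose": 1.0, "age": 30.0, "sex": "F"},
    {"dose": 2.0, "age": 40.0, "sex": "M"},
    {"dose": 3.0, "age": 50.0, "sex": "F"},
]
defaults = default_covariates(rows, continuous=["age"], classification=["sex"])

# Hypothetical estimates; the last level of sex is treated as the reference level.
intercept, b_dose, b_age = -1.0, 0.5, 0.01
b_sex = {"F": -0.3, "M": 0.0}


def linear_predictor(dose):
    """Linear predictor as a function of dose, other covariates held at defaults."""
    return intercept + b_dose * dose + b_age * defaults["age"] + b_sex[defaults["sex"]]


print(defaults)
print([round(linear_predictor(d), 3) for d in (1.0, 2.0, 3.0)])
```

Holding the other covariates fixed makes the linear predictor a straight line in the dose variable, with slope equal to the dose estimate.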

Predicted Probability Plot

The predicted probability is

$\displaystyle  {\hat p} = C + (1-C) F( \mb {x}^{\prime } {\mb {\hat b}})  $

for the binomial model and

\[ \begin{aligned} {\hat p}_1 & = C + (1-C) F( \mb {x}^{\prime } {\mb {\hat b}}) \\ {\hat p}_ j & = (1-C) (F({\hat a}_ j + \mb {x}^{\prime } {\mb {\hat b}}) - F({\hat a}_{j-1} + \mb {x}^{\prime } {\mb {\hat b}})), \; \; j=2,\ldots ,k-1 \\ {\hat p}_ k & = (1-C) (1- F({\hat a}_{k-1} + \mb {x}^{\prime } {\mb {\hat b}})) \end{aligned} \]

for the multinomial model with k response levels, where F is the cumulative distribution function used to model the probability, $\mb {x}$ is the vector of covariates, ${\hat a}_ j$ are the estimated ordinal intercepts with ${\hat a}_1 = 0$, C is the threshold parameter, either known or estimated from the model, and ${\mb {\hat b}}$ is the vector of estimated parameters.
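The multinomial formulas can be checked numerically with a short Python sketch. It assumes a standard normal F and hypothetical values for the linear predictor, the intercepts, and the threshold; the function name multinomial_probabilities is not part of any SAS interface.

```python
from statistics import NormalDist


def multinomial_probabilities(eta, intercepts, c=0.0):
    """Return [p_1, ..., p_k] for linear predictor eta = x'b.

    intercepts holds a_1, ..., a_{k-1} with a_1 = 0; F is the standard normal CDF.
    """
    cdf = NormalDist().cdf
    cumulative = [cdf(a + eta) for a in intercepts]
    probs = [c + (1.0 - c) * cumulative[0]]                    # p_1
    probs += [(1.0 - c) * (cumulative[j] - cumulative[j - 1])  # p_2, ..., p_{k-1}
              for j in range(1, len(cumulative))]
    probs.append((1.0 - c) * (1.0 - cumulative[-1]))           # p_k
    return probs


# Hypothetical values for a 3-level response (not from any fitted model).
probs = multinomial_probabilities(eta=0.3, intercepts=[0.0, 1.2], c=0.1)
print([round(p, 4) for p in probs], round(sum(probs), 12))
```

The differences of cumulative terms telescope, so the k probabilities sum to C + (1 - C) = 1 for any value of the linear predictor.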

To plot ${\hat p}$ (or ${\hat p}_ j$) as a function of a continuous variable $x_1$, the remaining covariates $\mb {x}_{-1}$ must be specified. You can use the XDATA= option to provide the values of $\mb {x}_{-1}$ (see the XDATA= option in the PROC PROBIT statement for details), or use the default values that follow these rules:

  • If the effect contains a continuous variable (or variables), the overall mean of this effect is used.

  • If the effect is a single classification variable, the highest level of the variable is used.

For the multinomial model, you can use the LEVEL= suboption to specify the levels for which the predicted probability curves are plotted.

Confidence limits are plotted only for the binomial model. Output 75.1.7 in Example 75.1 shows a predicted probability plot for a binomial model, and Output 75.2.3 in Example 75.2 shows a predicted probability plot for a multinomial model.