Linear Predictor, Predicted Probability, and Confidence Limits

This section describes how predicted probabilities and confidence limits are calculated by using the pseudo-estimates (MLEs) obtained from PROC SURVEYLOGISTIC. For a specific example, see the section Getting Started: SURVEYLOGISTIC Procedure. Predicted probabilities and confidence limits can be output to a data set with the OUTPUT statement.

Let $\Delta _{\alpha /2}$ be the $100(1-\alpha /2)$th percentile point of a standard normal distribution or a $t$ distribution, according to the DF= specification:

\begin{eqnarray*} \Delta _{\alpha /2}= \left\{ \begin{array}{ll} 100(1-\alpha /2)\mbox{th percentile point of a standard normal distribution } z_{\alpha /2} & \mbox{if DF=INFINITY} \\ 100(1-\alpha /2)\mbox{th percentile point of a } t \mbox{ distribution } t_{\alpha /2} & \mbox{otherwise} \end{array} \right. \end{eqnarray*}
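The choice of $\Delta _{\alpha /2}$ can be illustrated with a short Python sketch. This is not how PROC SURVEYLOGISTIC computes the percentile; the function names `t_cdf` and `critical_value` are hypothetical, `df=None` stands in for DF=INFINITY, and the $t$ quantile is found by simple numerical integration and bisection so that only the standard library is needed.

```python
import math
from statistics import NormalDist


def t_cdf(x, df, steps=2000):
    """CDF of Student's t with `df` degrees of freedom (Simpson's rule on the density)."""
    if x < 0:
        return 1.0 - t_cdf(-x, df, steps)
    # Log of the normalizing constant of the t density.
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))

    def pdf(u):
        return math.exp(log_c - (df + 1) / 2 * math.log1p(u * u / df))

    h = x / steps  # steps must be even for Simpson's rule
    total = pdf(0.0) + pdf(x)
    for j in range(1, steps):
        total += (4 if j % 2 else 2) * pdf(j * h)
    return 0.5 + total * h / 3


def critical_value(alpha, df=None):
    """100(1 - alpha/2)th percentile: normal if df is None (assumed DF=INFINITY), else t."""
    p = 1 - alpha / 2
    if df is None:
        return NormalDist().inv_cdf(p)
    # Bracket the quantile, then bisect on the t CDF.
    lo, hi = 0.0, 1.0
    while t_cdf(hi, df) < p:
        hi *= 2
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2
```

For a given $\alpha$, the $t$ percentile is larger than the normal percentile, so a finite DF= value yields wider confidence limits.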

Cumulative Response Models

For a row vector of explanatory variables $\mb{x}$, the linear predictor

\[ \eta _ i= g(\mbox{Pr}(Y\leq i~ |~ \mb{x})) = \alpha _ i+\mb{x}\bbeta , \quad 1 \leq i \leq k \]

is estimated by

\[ \hat{\eta }_ i=\hat{\alpha }_ i+\mb{x}\hat{\bbeta } \]

where $\hat{\alpha }_ i$ and $\hat{\bbeta }$ are the MLEs of $\alpha _ i$ and $\bbeta $. The estimated standard error of $\hat{\eta }_ i$ is $\hat{\sigma }({\hat{\eta }}_ i)$, which can be computed as the square root of the quadratic form $(1, \mb{x}){\hat{\mb{V}}_\mb {b}}(1, \mb{x})^\prime $, where $\hat{\mb{V}}_\mb {b}$ is the estimated covariance matrix of the parameter estimates. The asymptotic $100(1-\alpha )\% $ confidence interval for ${\eta }_{i}$ is given by

\[ \hat{\eta }_ i\pm \Delta _{\alpha /2}\hat{\sigma }({\hat{\eta }}_ i) \]
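As a minimal sketch of the calculation above, the following Python function evaluates the linear predictor, its standard error from the quadratic form, and the confidence limits. The function name and arguments are hypothetical. It assumes that `cov` is the block of the estimated covariance matrix that corresponds to $(\hat{\alpha }_ i, \hat{\bbeta })$, with the intercept first, and that `delta` defaults to the standard normal percentile (the DF=INFINITY case).

```python
import math
from statistics import NormalDist


def linear_predictor_ci(alpha_i_hat, beta_hat, x, cov, alpha=0.05, delta=None):
    """Return (eta_hat, standard error, (lower, upper)) for one cumulative logit.

    cov is assumed to be the covariance block for (alpha_i_hat, beta_hat),
    ordered with the intercept first.
    """
    if delta is None:
        # Assumption: normal percentile, as with DF=INFINITY.
        delta = NormalDist().inv_cdf(1 - alpha / 2)
    z = [1.0] + list(x)  # the vector (1, x)
    eta = alpha_i_hat + sum(xj * bj for xj, bj in zip(x, beta_hat))
    # Quadratic form (1, x) V (1, x)'.
    n = len(z)
    var = sum(z[r] * cov[r][c] * z[c] for r in range(n) for c in range(n))
    se = math.sqrt(var)
    return eta, se, (eta - delta * se, eta + delta * se)


# Illustrative values only, not taken from any real fit.
eta, se, limits = linear_predictor_ci(
    0.5, [1.0, -2.0], [2.0, 1.0],
    [[0.04, 0.0, 0.0], [0.0, 0.01, 0.0], [0.0, 0.0, 0.09]],
    delta=2.0,
)
```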

The predicted value and the $100(1-\alpha )\% $ confidence limits for Pr$(Y\leq i~ |~ \mb{x})$ are obtained by back-transforming the corresponding measures for the linear predictor, as shown in the following table:


\[ \begin{array}{lll} \mbox{Link} & \mbox{Predicted Probability} & 100(1-\alpha )\% \mbox{ Confidence Limits} \\ \mbox{LOGIT} & 1/(1+e^{-\hat{\eta }_ i}) & 1/(1+e^{-\hat{\eta }_ i \pm \Delta _{\alpha /2}\hat{\sigma }({\hat{\eta }}_ i)}) \\ \mbox{PROBIT} & \Phi (\hat{\eta }_ i) & \Phi (\hat{\eta }_ i \pm \Delta _{\alpha /2}\hat{\sigma }({\hat{\eta }}_ i)) \\ \mbox{CLOGLOG} & 1-e^{-e^{\hat{\eta }_ i}} & 1-e^{-e^{\hat{\eta }_ i\pm \Delta _{\alpha /2}\hat{\sigma }({\hat{\eta }}_ i)}} \end{array} \]
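The back-transformations can be sketched in Python as follows. The dictionary `INVERSE_LINKS` and the function `predicted_probability` are hypothetical names; the inputs `eta`, `se`, and `delta` are assumed to be the estimated linear predictor, its standard error, and $\Delta _{\alpha /2}$. Because each inverse link is increasing, the lower limit of the linear predictor maps to the lower limit of the probability.

```python
import math
from statistics import NormalDist

# Inverse link functions for the logistic, normal, and complementary log-log links.
INVERSE_LINKS = {
    "logit": lambda eta: 1.0 / (1.0 + math.exp(-eta)),
    "probit": lambda eta: NormalDist().cdf(eta),
    "cloglog": lambda eta: 1.0 - math.exp(-math.exp(eta)),
}


def predicted_probability(eta, se, delta, link="logit"):
    """Return (probability, lower limit, upper limit) by back-transforming eta."""
    inv = INVERSE_LINKS[link]
    return inv(eta), inv(eta - delta * se), inv(eta + delta * se)
```

Limits obtained this way always lie between 0 and 1, since they are images of the linear-predictor limits under the inverse link.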

Generalized Logit Model

For a row vector of explanatory variables $\mb{x}$, let $\pi _ i$ denote the probability of obtaining the response value $i$:

\[ \pi _ i = \left\{ \begin{array}{ll} \pi _{k+1} {e}^{\alpha _ i+\mb{x}\bbeta _ i} & 1\le i\le k \\ \displaystyle \frac{1}{1+\sum _{j=1}^{k} {e}^{\alpha _ j+\mb{x} {\bbeta }_ j}} & i=k+1 \end{array} \right. \]
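The response probabilities defined above can be evaluated with a few lines of Python. This is an illustrative sketch; the function name and argument layout (a list of $k$ intercepts and a list of $k$ slope vectors) are assumptions, not part of the procedure.

```python
import math


def generalized_logit_probs(alphas, betas, x):
    """Return [pi_1, ..., pi_k, pi_{k+1}] where level k+1 is the reference.

    alphas: k intercepts; betas: k slope vectors, each the same length as x.
    """
    etas = [a + sum(xj * bj for xj, bj in zip(x, b))
            for a, b in zip(alphas, betas)]
    # pi_{k+1} = 1 / (1 + sum_j exp(alpha_j + x beta_j))
    ref = 1.0 / (1.0 + sum(math.exp(e) for e in etas))
    # pi_i = pi_{k+1} * exp(alpha_i + x beta_i) for the non-reference levels
    return [ref * math.exp(e) for e in etas] + [ref]
```

The returned probabilities sum to 1 by construction.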

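By the delta method, with $\btheta $ denoting the vector of all model parameters and $\bV (\btheta )$ the covariance matrix of its estimator,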

\[ \sigma ^2({\pi }_ i) = \biggl ( \frac{\partial \pi _ i}{\partial \btheta } \biggr )^\prime \bV ({\btheta }) \frac{\partial \pi _ i}{\partial \btheta } \]

A $100(1-\alpha )\% $ confidence interval for $\pi _ i$ is given by

\[ \hat{\pi }_ i \pm \Delta _{\alpha /2} \hat{\sigma }(\hat{\pi }_ i) \]

where $\hat{\pi }_ i$ is the estimated expected probability of response $i$ and $\hat{\sigma }(\hat{\pi }_ i)$ is obtained by evaluating $\sigma ({\pi }_ i)$ at $\btheta =\hat{\btheta }$.
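The delta-method interval can be sketched in Python under a few stated assumptions. The function name is hypothetical, the parameters are assumed to be ordered as $(\alpha _1, \bbeta _1, \ldots , \alpha _ k, \bbeta _ k)$, and `cov` is assumed to be the estimated covariance matrix in that same order. The gradient uses the standard result that the derivative of $\pi _ i$ with respect to the $j$th parameter block is $\pi _ i(\delta _{ij}-\pi _ j)(1, \mb{x})$, where $\delta _{ij}$ is 1 when $i=j$ and 0 otherwise.

```python
import math


def generalized_logit_ci(theta, x, cov, i, delta):
    """Return (pi_i, standard error, (lower, upper)) by the delta method.

    theta: list of k parameter blocks [alpha_j, beta_j1, ..., beta_jp]
    cov:   covariance matrix of the flattened theta (assumed block order)
    i:     0-based response index; i == k selects the reference level
    delta: percentile point Delta_{alpha/2}
    """
    z = [1.0] + list(x)  # the vector (1, x)
    k = len(theta)
    etas = [sum(t * zj for t, zj in zip(block, z)) for block in theta]
    denom = 1.0 + sum(math.exp(e) for e in etas)
    pi = [math.exp(e) / denom for e in etas] + [1.0 / denom]

    # Gradient of pi_i with respect to theta, one block per non-reference level.
    grad = []
    for j in range(k):
        factor = pi[i] * ((1.0 if i == j else 0.0) - pi[j])
        grad.extend(factor * zj for zj in z)

    # Quadratic form grad' V grad, evaluated at the estimates.
    n = len(grad)
    var = sum(grad[r] * cov[r][c] * grad[c] for r in range(n) for c in range(n))
    se = math.sqrt(var)
    return pi[i], se, (pi[i] - delta * se, pi[i] + delta * se)
```

Unlike the back-transformed limits for cumulative response models, limits of this form are computed directly on the probability scale, so they are symmetric about $\hat{\pi }_ i$ and can fall outside the interval from 0 to 1 when $\hat{\pi }_ i$ is near a boundary.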