The LOGISTIC Procedure

Confidence Intervals for Parameters

There are two methods of computing confidence intervals for the regression parameters. One is based on the profile-likelihood function, and the other is based on the asymptotic normality of the parameter estimators. The normality-based (Wald) method is less time-consuming because it does not involve an iterative scheme; however, it is not thought to be as accurate as the profile-likelihood method, especially with small sample sizes. You use the CLPARM= option in the MODEL statement to request confidence intervals for the parameters.

Likelihood Ratio-Based Confidence Intervals

The likelihood ratio-based confidence interval is also known as the profile-likelihood confidence interval. The construction of this interval is derived from the asymptotic $\chi ^2$ distribution of the generalized likelihood ratio test (Venzon and Moolgavkar, 1988). Suppose that the parameter vector is $\bbeta = (\beta _{0},\beta _{1},\ldots ,\beta _{s})'$ and you want to compute a confidence interval for $\beta _{j}$. The profile-likelihood function for $\beta _{j}=\gamma$ is defined as

$l_ j^*(\gamma ) = \max _{\bbeta \in {\mc{B}}_ j(\gamma )} l(\bbeta )$

where ${\mc{B}}_ j(\gamma )$ is the set of all $\bbeta$ with the $j$th element fixed at $\gamma$, and $l(\bbeta )$ is the log-likelihood function for $\bbeta$. If $l_{\max } = l({\widehat{\bbeta }})$ is the log likelihood evaluated at the maximum likelihood estimate ${\widehat{\bbeta }}$, then $2( l_{\max } - l_ j^{*}(\beta _{j} ))$ has a limiting chi-square distribution with one degree of freedom if $\beta _{j}$ is the true parameter value. Let $l_0=l_{\max } - 0.5\chi ^{2}_{1}(1-\alpha )$, where $\chi ^{2}_{1}(1-\alpha )$ is the $100(1-\alpha )$th percentile of the chi-square distribution with one degree of freedom. A $100(1-\alpha )$% confidence interval for $\beta _{j}$ is

$\{ \gamma : l_ j^*(\gamma ) \geq l_{0} \}$
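The definition can be illustrated by brute force. The following Python sketch is not how the procedure computes the limits; it evaluates $l_ j^*(\gamma )$ for the slope of a one-covariate logistic model and bisects for the two values of $\gamma$ at which $l_ j^*(\gamma ) = l_0$. The data set and every function name are hypothetical and chosen only for illustration, and the chi-square percentile with one degree of freedom is obtained as the square of the corresponding standard normal percentile.

```python
import math
from statistics import NormalDist

# Hypothetical grouped data (illustrative only): R events out of N trials per dose.
DOSE = [0.0, 1.0, 2.0, 3.0, 4.0]
N = [20, 20, 20, 20, 20]
R = [2, 5, 10, 15, 18]

def sigmoid(t):
    """Numerically stable logistic function."""
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    return math.exp(t) / (1.0 + math.exp(t))

def loglik(b0, b1):
    """Binomial log likelihood (constant term dropped) for logit(p) = b0 + b1*dose."""
    total = 0.0
    for x, n, r in zip(DOSE, N, R):
        eta = b0 + b1 * x
        # r*eta - n*log(1 + exp(eta)), written to avoid overflow
        total += r * eta - n * (max(eta, 0.0) + math.log1p(math.exp(-abs(eta))))
    return total

def profile_loglik(gamma):
    """l_j*(gamma): maximize over the intercept with the slope fixed at gamma."""
    lo, hi = -60.0, 60.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        # The intercept score is decreasing in the intercept, so bisect for its root.
        score = sum(r - n * sigmoid(mid + gamma * x) for x, n, r in zip(DOSE, N, R))
        if score > 0:
            lo = mid
        else:
            hi = mid
    return loglik(0.5 * (lo + hi), gamma)

def argmax_profile(lo=-5.0, hi=5.0):
    """Ternary search for the slope MLE; relies on the profile being concave."""
    for _ in range(100):
        m1, m2 = lo + (hi - lo) / 3.0, hi - (hi - lo) / 3.0
        if profile_loglik(m1) < profile_loglik(m2):
            lo = m1
        else:
            hi = m2
    return 0.5 * (lo + hi)

def endpoint(b1_hat, l0, direction):
    """Bisect for the gamma on one side of the MLE where l_j*(gamma) = l0."""
    inside, outside = b1_hat, b1_hat + direction
    while profile_loglik(outside) >= l0:  # widen until outside the interval
        outside += direction
    for _ in range(60):
        mid = 0.5 * (inside + outside)
        if profile_loglik(mid) >= l0:
            inside = mid
        else:
            outside = mid
    return 0.5 * (inside + outside)

alpha = 0.05
b1_hat = argmax_profile()
l_max = profile_loglik(b1_hat)
# chi-square(1) percentile at 1 - alpha equals the squared normal percentile at 1 - alpha/2
l0 = l_max - 0.5 * NormalDist().inv_cdf(1 - alpha / 2) ** 2
lower, upper = endpoint(b1_hat, l0, -1.0), endpoint(b1_hat, l0, +1.0)
print(f"slope MLE {b1_hat:.4f}, 95% profile-likelihood CI ({lower:.4f}, {upper:.4f})")
```

Nested bisection is slow but transparent, which is why it is used here; it is practical only when there are very few parameters.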

The endpoints of the confidence interval are found by solving numerically for values of $\beta _{j}$ that satisfy equality in the preceding relation. To obtain an iterative algorithm for computing the confidence limits, the log-likelihood function in a neighborhood of $\bbeta$ is approximated by the quadratic function

$\tilde{l}(\bbeta + \bdelta ) = l(\bbeta ) + \bdelta '\mb{g} + \frac{1}{2}\bdelta ' \bV \bdelta$

where $\mb{g}=\mb{g}(\bbeta )$ is the gradient vector and $\bV =\bV (\bbeta )$ is the Hessian matrix. The increment $\bdelta$ for the next iteration is obtained by solving the likelihood equations

$\frac{d}{d\bdelta }\{ \tilde{l}(\bbeta + \bdelta ) + \lambda ( \mb{e}_ j'\bdelta - \gamma )\} = \bm {0}$

where $\lambda$ is the Lagrange multiplier, $\mb{e}_ j$ is the $j$th unit vector, and $\gamma$ is an unknown constant. The solution is

$\bdelta = -\bV ^{-1}(\mb{g} + \lambda \mb{e}_ j)$

By substituting this $\bdelta$ into the equation $\tilde{l}(\bbeta + \bdelta ) = l_0$, you can estimate $\lambda$ as

$\lambda = \pm \biggl (\frac{2(l_0 - l(\bbeta ) + \frac{1}{2}\mb{g}'\bV ^{-1}\mb{g})}{\mb{e}_ j'\bV ^{-1}\mb{e}_ j}\biggr )^{ \frac{1}{2}}$
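
As a check on the preceding expression, here is a short sketch of the substitution, assuming that $\bV$ is symmetric (as a Hessian is) and nonsingular. With $\bdelta = -\bV ^{-1}(\mb{g} + \lambda \mb{e}_ j)$, the two $\bdelta$-dependent terms of the quadratic approximation are

$\bdelta '\mb{g} = -\mb{g}'\bV ^{-1}\mb{g} - \lambda \mb{e}_ j'\bV ^{-1}\mb{g}$

$\frac{1}{2}\bdelta '\bV \bdelta = \frac{1}{2}\mb{g}'\bV ^{-1}\mb{g} + \lambda \mb{e}_ j'\bV ^{-1}\mb{g} + \frac{1}{2}\lambda ^2\mb{e}_ j'\bV ^{-1}\mb{e}_ j$

The cross terms cancel, so

$\tilde{l}(\bbeta + \bdelta ) = l(\bbeta ) - \frac{1}{2}\mb{g}'\bV ^{-1}\mb{g} + \frac{1}{2}\lambda ^2\mb{e}_ j'\bV ^{-1}\mb{e}_ j$

Setting the right-hand side equal to $l_0$ and solving for $\lambda ^2$ gives the expression above. Near the maximum, $\bV$ is negative definite, so both the numerator and the denominator under the square root are negative and their ratio is positive.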

The upper confidence limit for $\beta _ j$ is computed by starting at the maximum likelihood estimate of $\bbeta$ and iterating with positive values of $\lambda$ until convergence is attained. The process is repeated for the lower confidence limit by using negative values of $\lambda$.

Convergence is controlled by the value $\epsilon$ specified with the PLCONV= option in the MODEL statement (the default value of $\epsilon$ is 1E-4). Convergence is declared on the current iteration if the following two conditions are satisfied:

$|l(\bbeta )-l_{0}| \leq \epsilon$

and

$({\mb{g}} + \lambda {\mb{e}_ j})'{\bV }^{-1}({\mb{g}} + \lambda {\mb{e}_ j}) \leq \epsilon$
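
The iteration and its stopping rule can be sketched in Python for a two-parameter model. This is an illustrative reading of the steps described above, not the procedure's actual implementation. The data and all function names are hypothetical. Two safeguards are assumptions of the sketch rather than part of the source: the quantity under the square root is clipped at zero, and the second convergence condition is applied to the absolute value of the quadratic form, because that form is nonpositive when $\bV$ is a negative definite Hessian.

```python
import math
from statistics import NormalDist

# Hypothetical grouped data (illustrative only): R events out of N trials per dose.
DOSE = [0.0, 1.0, 2.0, 3.0, 4.0]
N = [20, 20, 20, 20, 20]
R = [2, 5, 10, 15, 18]

def evaluate(beta):
    """Return log likelihood l (constant dropped), gradient g, and Hessian V at beta."""
    l, g, V = 0.0, [0.0, 0.0], [[0.0, 0.0], [0.0, 0.0]]
    for x, n, r in zip(DOSE, N, R):
        eta = beta[0] + beta[1] * x
        p = 1.0 / (1.0 + math.exp(-eta))
        l += r * eta - n * math.log1p(math.exp(eta))
        z = (1.0, x)  # design row: intercept and dose
        for a in range(2):
            g[a] += (r - n * p) * z[a]
            for b in range(2):
                V[a][b] -= n * p * (1.0 - p) * z[a] * z[b]
    return l, g, V

def inverse2(M):
    """Inverse of a 2x2 matrix."""
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    return [[M[1][1] / det, -M[0][1] / det], [-M[1][0] / det, M[0][0] / det]]

def matvec(M, v):
    return [M[0][0] * v[0] + M[0][1] * v[1], M[1][0] * v[0] + M[1][1] * v[1]]

def dot(u, v):
    return u[0] * v[0] + u[1] * v[1]

def fit_mle(steps=25):
    """Newton-Raphson from zero: beta <- beta - V^{-1} g."""
    beta = [0.0, 0.0]
    for _ in range(steps):
        _, g, V = evaluate(beta)
        step = matvec(inverse2(V), g)
        beta = [beta[0] - step[0], beta[1] - step[1]]
    return beta

def pl_limit(beta_hat, l0, j, sign, eps=1e-4, max_iter=50):
    """Iterate toward one profile-likelihood limit for beta_j (sign=+1 upper, -1 lower)."""
    beta = list(beta_hat)
    e = [1.0 if k == j else 0.0 for k in range(2)]
    for _ in range(max_iter):
        l, g, V = evaluate(beta)
        Vinv = inverse2(V)
        ratio = 2.0 * (l0 - l + 0.5 * dot(g, matvec(Vinv, g))) / dot(e, matvec(Vinv, e))
        lam = sign * math.sqrt(max(ratio, 0.0))  # clipping is a safeguard of this sketch
        resid = [g[k] + lam * e[k] for k in range(2)]
        quad = dot(resid, matvec(Vinv, resid))
        # Assumption: compare |quad| with eps, since quad <= 0 for negative definite V.
        if abs(l - l0) <= eps and abs(quad) <= eps:
            break
        step = matvec(Vinv, resid)
        beta = [beta[k] - step[k] for k in range(2)]  # delta = -V^{-1}(g + lam*e_j)
    return beta

alpha = 0.05
beta_hat = fit_mle()
l_max = evaluate(beta_hat)[0]
l0 = l_max - 0.5 * NormalDist().inv_cdf(1 - alpha / 2) ** 2  # chi-square(1) percentile = z^2
beta_lower = pl_limit(beta_hat, l0, j=1, sign=-1.0)
beta_upper = pl_limit(beta_hat, l0, j=1, sign=+1.0)
print(f"slope MLE {beta_hat[1]:.4f}, 95% PL limits ({beta_lower[1]:.4f}, {beta_upper[1]:.4f})")
```

Starting from the maximum likelihood estimate, where $\mb{g} = \bm {0}$, the first step of this sketch lands on the Wald limit for $\beta _ j$; later steps correct for the departure of the log likelihood from its quadratic approximation.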

Wald Confidence Intervals

Wald confidence intervals are sometimes called normal confidence intervals. They are based on the asymptotic normality of the parameter estimators. The $100(1-\alpha )$% Wald confidence interval for $\beta _ j$ is given by

${\widehat{\beta }}_ j \pm z_{1-\alpha /2}\widehat{\sigma }_ j$

where $z_{p}$ is the $100p$th percentile of the standard normal distribution, ${\widehat{\beta }}_ j$ is the maximum likelihood estimate of $\beta _ j$, and $\widehat{\sigma }_ j$ is the standard error estimate of ${\widehat{\beta }}_ j$.
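
A minimal Python sketch of the Wald computation follows. The data and function names are hypothetical, and the sketch assumes that the standard error is taken from the inverse of the information matrix evaluated at the maximum likelihood estimate, which is a common choice but is not stated in the text above.

```python
import math
from statistics import NormalDist

# Hypothetical grouped data (illustrative only): R events out of N trials per dose.
DOSE = [0.0, 1.0, 2.0, 3.0, 4.0]
N = [20, 20, 20, 20, 20]
R = [2, 5, 10, 15, 18]

def score_and_information(b0, b1):
    """Score vector and information matrix entries for logit(p) = b0 + b1*dose."""
    g0 = g1 = i00 = i01 = i11 = 0.0
    for x, n, r in zip(DOSE, N, R):
        p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
        w = n * p * (1.0 - p)
        g0 += r - n * p
        g1 += (r - n * p) * x
        i00 += w
        i01 += w * x
        i11 += w * x * x
    return (g0, g1), (i00, i01, i11)

def fit_mle(steps=25):
    """Newton-Raphson from zero: beta <- beta + I^{-1} * score."""
    b0 = b1 = 0.0
    for _ in range(steps):
        (g0, g1), (i00, i01, i11) = score_and_information(b0, b1)
        det = i00 * i11 - i01 * i01
        d0 = (i11 * g0 - i01 * g1) / det
        d1 = (i00 * g1 - i01 * g0) / det
        b0, b1 = b0 + d0, b1 + d1
    return b0, b1

alpha = 0.05
b0_hat, b1_hat = fit_mle()
_, (i00, i01, i11) = score_and_information(b0_hat, b1_hat)
det = i00 * i11 - i01 * i01
# Assumed standard error: square root of the slope entry of the inverse information.
se_b1 = math.sqrt(i00 / det)
z = NormalDist().inv_cdf(1 - alpha / 2)  # z_{1 - alpha/2}
wald_lower, wald_upper = b1_hat - z * se_b1, b1_hat + z * se_b1
print(f"slope {b1_hat:.4f}, SE {se_b1:.4f}, 95% Wald CI ({wald_lower:.4f}, {wald_upper:.4f})")
```

Unlike the profile-likelihood limits, this interval needs no iteration beyond the model fit itself and is symmetric about the estimate by construction.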