The HPSEVERITY Procedure

Parameter Estimation Method

If you do not specify a custom objective function by specifying programming statements and the OBJECTIVE= option in the PROC HPSEVERITY statement, then PROC HPSEVERITY uses the maximum likelihood (ML) method to estimate the parameters of each model. A nonlinear optimization process is used to maximize the log of the likelihood function. If you specify a custom objective function, then PROC HPSEVERITY uses a nonlinear optimization algorithm to estimate the parameters of each model that minimize the value of your specified objective function. For more information, see the section Custom Objective Functions.

Likelihood Function

Let $f_\Theta (x)$ and $F_\Theta (x)$ denote the PDF and CDF, respectively, evaluated at $x$ for a set of parameter values $\Theta$. Let $Y$ denote the random response variable, and let $y$ denote its value recorded in an observation in the input data set. Let $T^ l$ and $T^ r$ denote the random variables for the left-truncation and right-truncation thresholds, respectively, and let $t^ l$ and $t^ r$ denote their values for an observation. If there is no left-truncation, then $t^ l = \tau _ l$, where $\tau _ l$ is the smallest value in the support of the distribution; so $F_\Theta (t^ l)=0$. If there is no right-truncation, then $t^ r = \tau _ h$, where $\tau _ h$ is the largest value in the support of the distribution; so $F_\Theta (t^ r)=1$. Let $C^ l$ and $C^ r$ denote the random variables for the left-censoring and right-censoring limits, respectively, and let $c^ l$ and $c^ r$ denote their values for an observation. If there is no left-censoring, then $c^ l = \tau _ h$; so $F_\Theta (c^ l)=1$. If there is no right-censoring, then $c^ r = \tau _ l$; so $F_\Theta (c^ r)=0$.

The set of input observations can be categorized into the following four subsets within each BY group:

$E$ is the set of uncensored and untruncated observations. The likelihood of an observation in $E$ is

$l_{E} = \Pr (Y=y) = f_\Theta (y)$

$E_ t$ is the set of uncensored observations that are truncated. The likelihood of an observation in $E_ t$ is

$l_{E_ t} = \Pr (Y=y | t^ l < Y \leq t^ r) = \frac{f_\Theta (y)}{F_\Theta (t^ r) - F_\Theta (t^ l)}$

$C$ is the set of censored observations that are not truncated. The likelihood of an observation in $C$ is

$l_{C} = \Pr (c^ r < Y \leq c^ l) = F_\Theta (c^ l) - F_\Theta (c^ r)$

$C_ t$ is the set of censored observations that are truncated. The likelihood of an observation in $C_ t$ is

$l_{C_ t} = \Pr (c^ r < Y \leq c^ l | t^ l < Y \leq t^ r) = \frac{F_\Theta (c^ l) - F_\Theta (c^ r)}{F_\Theta (t^ r) - F_\Theta (t^ l)}$

Note that $(E \cup E_ t) \cap (C \cup C_ t) = \emptyset$ . Also, the sets $E_ t$ and $C_ t$ are empty when you do not specify truncation, and the sets $C$ and $C_ t$ are empty when you do not specify censoring.
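The four cases above can be illustrated with a short Python sketch. This is not PROC HPSEVERITY code: the exponential distribution with scale `theta`, the function names, and the convention of passing `None` for an absent limit are all assumptions made for illustration. An absent limit is replaced by the appropriate end of the support, so a single expression covers all four subsets.

```python
import math

def exp_pdf(x, theta):
    """PDF of an exponential distribution with scale theta (illustrative choice)."""
    return math.exp(-x / theta) / theta if x >= 0 else 0.0

def exp_cdf(x, theta):
    """CDF of the same distribution; returns 1.0 for x = +inf."""
    if x <= 0:
        return 0.0
    return 1.0 - math.exp(-x / theta)

def observation_likelihood(theta, y=None, c_r=None, c_l=None, t_l=None, t_r=None):
    """Likelihood of one observation (hypothetical helper).

    y given      -> uncensored observation (sets E and E_t)
    y is None    -> censored observation with c_r < Y <= c_l (sets C and C_t)
    t_l / t_r    -> truncation thresholds; None means no truncation on that side
    """
    support_lo, support_hi = 0.0, math.inf  # ends of the exponential support
    t_l = support_lo if t_l is None else t_l
    t_r = support_hi if t_r is None else t_r
    # Denominator F(t^r) - F(t^l); equals 1 when there is no truncation.
    denom = exp_cdf(t_r, theta) - exp_cdf(t_l, theta)
    if y is not None:
        numer = exp_pdf(y, theta)
    else:
        c_r = support_lo if c_r is None else c_r  # no right-censoring: F(c^r) = 0
        c_l = support_hi if c_l is None else c_l  # no left-censoring: F(c^l) = 1
        numer = exp_cdf(c_l, theta) - exp_cdf(c_r, theta)
    return numer / denom

theta = 2.0
l_E = observation_likelihood(theta, y=1.0)               # uncensored, untruncated
l_Et = observation_likelihood(theta, y=1.0, t_l=0.5)     # uncensored, left-truncated
l_C = observation_likelihood(theta, c_r=3.0)             # right-censored at 3
l_Ct = observation_likelihood(theta, c_r=3.0, t_l=0.5)   # right-censored and left-truncated
```

Treating "no limit" as the end of the support mirrors the conventions $F(t^ l)=0$, $F(t^ r)=1$, $F(c^ l)=1$, and $F(c^ r)=0$ stated earlier, which is why the untruncated cases need no separate branch.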

Given this, the likelihood of the data $L$ is as follows:

$L = \prod _{E} f_\Theta (y) \; \prod _{E_ t} \frac{f_\Theta (y)}{F_\Theta (t^ r) - F_\Theta (t^ l)} \; \prod _{C} \left( F_\Theta (c^ l) - F_\Theta (c^ r) \right) \; \prod _{C_ t} \frac{F_\Theta (c^ l) - F_\Theta (c^ r)}{F_\Theta (t^ r) - F_\Theta (t^ l)}$

The maximum likelihood procedure used by PROC HPSEVERITY finds an optimal set of parameter values $\hat{\Theta }$ that maximizes $\log (L)$ subject to the boundary constraints on parameter values. For a distribution dist, you can specify such boundary constraints by using the dist_LOWERBOUNDS and dist_UPPERBOUNDS subroutines. For more information, see the section Defining a Severity Distribution Model with the FCMP Procedure. Some aspects of the optimization process can be controlled by using the NLOPTIONS statement.
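As a rough illustration of maximizing $\log (L)$ under boundary constraints, the following Python sketch fits a one-parameter exponential model to uncensored, left-truncated data. It is a sketch under stated assumptions, not the optimizer that PROC HPSEVERITY uses: the golden-section search, the function names, the sample data, and the bound values are all hypothetical, and the search relies on the log likelihood being unimodal in `theta`, which holds for this particular model.

```python
import math

def log_likelihood(theta, data):
    """log(L) for uncensored, left-truncated exponential data.

    data is a list of (y, t_l) pairs; t_l = 0 means no truncation.
    Each term is log f(y) - log(F(t_r) - F(t_l)) with t_r = +inf.
    """
    total = 0.0
    for y, t_l in data:
        log_f = -math.log(theta) - y / theta  # log of the exponential PDF
        log_denom = -t_l / theta              # log(1 - F(t_l))
        total += log_f - log_denom
    return total

def maximize_bounded(func, lower, upper, tol=1e-8):
    """Golden-section search for the maximizer of a unimodal func on [lower, upper]."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lower, upper
    while b - a > tol:
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if func(c) > func(d):
            b = d  # the maximum lies in [a, d]
        else:
            a = c  # the maximum lies in [c, b]
    return (a + b) / 2.0

# Hypothetical sample: (response value, left-truncation threshold)
data = [(1.2, 0.5), (3.0, 0.5), (0.9, 0.0), (2.4, 1.0)]

# Wide bounds: the optimum is interior.
theta_hat = maximize_bounded(lambda th: log_likelihood(th, data), 1e-3, 100.0)

# Tight upper bound: the optimum is pushed onto the boundary.
theta_bounded = maximize_bounded(lambda th: log_likelihood(th, data), 1e-3, 1.0)
```

For this model the unconstrained maximizer has the closed form of the mean of $y - t^ l$, which gives a simple check on the numerical result. When the upper bound lies below that value, the search returns the bound itself, which is the behavior that a boundary constraint is meant to enforce.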

Estimating Covariance and Standard Errors

PROC HPSEVERITY computes an estimate of the covariance matrix of the parameters by using the asymptotic theory of the maximum likelihood estimators (MLE). If $N$ denotes the number of observations used for estimating a parameter vector $\pmb {\theta }$ , then the theory states that as $N \rightarrow \infty$ , the distribution of $\hat{\pmb {\theta }}$ , the estimate of $\pmb {\theta }$ , converges to a normal distribution with mean $\pmb {\theta }$ and covariance $\hat{\mathbf{C}}$ such that $\mathbf{I}(\pmb {\theta }) \cdot \hat{\mathbf{C}} \rightarrow 1$ , where $\mathbf{I}(\pmb {\theta }) = -E\left[ \nabla ^2 \log (L(\pmb {\theta }))\right]$ is the information matrix for the likelihood of the data, $L(\pmb {\theta })$ . The covariance estimate is obtained by using the inverse of the information matrix.

In particular, if $\mathbf{G} = \nabla ^2 \left( -\log (L(\pmb {\theta })) \right)$ denotes the Hessian matrix of the negative of the log likelihood, then the covariance estimate is computed as

$\hat{\mathbf{C}} = \frac{N}{d} \mathbf{G}^{-1}$

where $d$ is a denominator that is determined by the VARDEF= option. If VARDEF=N, then $d = N$, which yields the asymptotic covariance estimate. If VARDEF=DF, then $d=N - k$, where $k$ is the number of parameters (the model’s degrees of freedom). The VARDEF=DF option is the default, because it attempts to correct the potential bias introduced by the finite sample.

The standard error $s_ i$ of the parameter $\theta _ i$ is computed as the square root of the $i$ th diagonal element of the estimated covariance matrix; that is, $s_ i = \sqrt {\hat{C}_{ii}}$ .
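The covariance and standard error formulas can be traced through a small Python sketch. It assumes a two-parameter model so that the Hessian can be inverted in closed form; the function names, the numeric Hessian, and the singularity tolerance are hypothetical and are not part of PROC HPSEVERITY.

```python
import math

def covariance_estimate(hessian, n_obs, vardef="DF"):
    """Return C = (N / d) * G^{-1} for a 2x2 Hessian G of the negative log likelihood.

    vardef="N"  -> d = N
    vardef="DF" -> d = N - k, where k is the number of parameters (2 here)
    """
    (g11, g12), (g21, g22) = hessian
    det = g11 * g22 - g12 * g21
    if abs(det) < 1e-12:  # crude illustrative tolerance
        raise ValueError("Hessian is singular; no covariance estimate is available")
    k = 2
    d = n_obs if vardef == "N" else n_obs - k
    factor = n_obs / d
    # Closed-form inverse of a 2x2 matrix, scaled by N / d.
    return [[factor * g22 / det, -factor * g12 / det],
            [-factor * g21 / det, factor * g11 / det]]

def standard_errors(cov):
    """Square roots of the diagonal elements of the covariance matrix."""
    return [math.sqrt(cov[i][i]) for i in range(len(cov))]

G = [[4.0, 1.0], [1.0, 2.0]]  # hypothetical Hessian at the optimum
cov_n = covariance_estimate(G, n_obs=10, vardef="N")
cov_df = covariance_estimate(G, n_obs=10, vardef="DF")
se_n = standard_errors(cov_n)
se_df = standard_errors(cov_df)
```

With $N = 10$ and $k = 2$, the VARDEF=DF estimate is larger than the VARDEF=N estimate by the factor $N/(N-k) = 1.25$, so its standard errors are larger by $\sqrt{1.25}$.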

If you specify a custom objective function, then the covariance matrix of the parameters is still computed by inverting the information matrix, except that the Hessian matrix $\mathbf{G}$ is computed as $\mathbf{G} = \nabla ^2 \log (U(\pmb {\theta }))$ , where $U$ denotes your custom objective function that is minimized by the optimizer.

Covariance and standard error estimates might not be available if the Hessian matrix is found to be singular at the end of the optimization process. This is especially likely when the optimization process stops without converging.