The HPSEVERITY Procedure

Parameter Estimation Method

PROC HPSEVERITY uses the maximum likelihood (ML) method to estimate the parameters of each model. A nonlinear optimization process is used to maximize the log of the likelihood function.

Likelihood Function

Let $f_\Theta(x)$ and $F_\Theta(x)$ denote the PDF and CDF, respectively, evaluated at $x$ for a set of parameter values $\Theta$. Let $Y$ denote the random response variable, and let $y$ denote its value recorded in an observation in the input data set. Let $T^l$ and $T^r$ denote the random variables for the left-truncation and right-truncation threshold, respectively, and let $t^l$ and $t^r$ denote their values for an observation, respectively. If there is no left-truncation, then $t^l = \tau^l$, where $\tau^l$ is the smallest value in the support of the distribution; so $F_\Theta(t^l) = 0$. If there is no right-truncation, then $t^r = \tau_h$, where $\tau_h$ is the largest value in the support of the distribution; so $F_\Theta(t^r) = 1$. Let $C^l$ and $C^r$ denote the random variables for the left-censoring and right-censoring limit, respectively, and let $c^l$ and $c^r$ denote their values for an observation, respectively. If there is no left-censoring, then $c^l = \tau_h$; so $F_\Theta(c^l) = 1$. If there is no right-censoring, then $c^r = \tau^l$; so $F_\Theta(c^r) = 0$.

The set of input observations can be categorized into the following four subsets:

$E$ is the set of uncensored and untruncated observations. The likelihood of an observation in $E$ is

$l_{E} = \Pr(Y=y) = f_\Theta(y)$

$E_t$ is the set of uncensored observations that are truncated. The likelihood of an observation in $E_t$ is

$l_{E_t} = \Pr(Y=y | t^l < Y \leq t^r) = \frac{f_\Theta(y)}{F_\Theta(t^r) - F_\Theta(t^l)}$

$C$ is the set of censored observations that are not truncated. The likelihood of an observation in $C$ is

$l_{C} = \Pr(c^r < Y \leq c^l) = F_\Theta(c^l) - F_\Theta(c^r)$

$C_t$ is the set of censored observations that are truncated. The likelihood of an observation in $C_t$ is

$l_{C_t} = \Pr(c^r < Y \leq c^l | t^l < Y \leq t^r) = \frac{F_\Theta(c^l) - F_\Theta(c^r)}{F_\Theta(t^r) - F_\Theta(t^l)}$

Note that $(E \cup E_t) \cap (C \cup C_t) = \emptyset$. Also, the sets $E_t$ and $C_t$ are empty when no truncation is specified, and the sets $C$ and $C_t$ are empty when no censoring is specified.

Given this, the likelihood of the data is as follows:

$L = \prod_{E} f_\Theta(y) \; \prod_{E_t} \frac{f_\Theta(y)}{F_\Theta(t^r) - F_\Theta(t^l)} \; \prod_{C} \left[ F_\Theta(c^l) - F_\Theta(c^r) \right] \; \prod_{C_t} \frac{F_\Theta(c^l) - F_\Theta(c^r)}{F_\Theta(t^r) - F_\Theta(t^l)}$
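The product above can be evaluated on the log scale one observation at a time. The following Python sketch is illustrative only and is not how PROC HPSEVERITY is implemented: it assumes an exponential distribution with scale `theta`, and the function names and the dictionary keys (`y`, `tl`, `tr`, `cl`, `cr`) are hypothetical. Because a missing threshold or limit defaults to the corresponding end of the support, a single expression covers all four subsets.

```python
import math

def exp_pdf(x, theta):
    """PDF of an exponential distribution with scale theta (support [0, inf))."""
    return math.exp(-x / theta) / theta if x >= 0 else 0.0

def exp_cdf(x, theta):
    """CDF of the same distribution; x = inf stands for 'no upper limit'."""
    if x <= 0:
        return 0.0
    if math.isinf(x):
        return 1.0
    return 1.0 - math.exp(-x / theta)

def log_likelihood(observations, theta, pdf=exp_pdf, cdf=exp_cdf):
    """Sum the log-likelihood contributions of all observations.

    Uncensored rows carry 'y'. Censored rows carry 'cr' and/or 'cl' instead.
    Optional 'tl' and 'tr' hold the truncation thresholds.
    """
    total = 0.0
    for obs in observations:
        tl = obs.get("tl", 0.0)            # no left-truncation: F(t^l) = 0
        tr = obs.get("tr", math.inf)       # no right-truncation: F(t^r) = 1
        denom = cdf(tr, theta) - cdf(tl, theta)
        if "y" in obs:                     # subsets E and E_t
            numer = pdf(obs["y"], theta)
        else:                              # subsets C and C_t
            cr = obs.get("cr", 0.0)        # no right-censoring: F(c^r) = 0
            cl = obs.get("cl", math.inf)   # no left-censoring: F(c^l) = 1
            numer = cdf(cl, theta) - cdf(cr, theta)
        total += math.log(numer) - math.log(denom)
    return total

# Hypothetical data: one observation from each subset.
data = [
    {"y": 2.0},                            # E
    {"y": 3.0, "tl": 1.0},                 # E_t
    {"cr": 4.0},                           # C (right-censored at 4)
    {"cr": 1.5, "cl": 5.0, "tl": 1.0},     # C_t
]
ll = log_likelihood(data, theta=2.0)
```

When no truncation is present the denominator equals 1, so its log contributes nothing and the term reduces to the untruncated form.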

The maximum likelihood procedure used by PROC HPSEVERITY finds an optimal set of parameter values $\hat{\Theta }$ that maximizes $\log (L)$ subject to the boundary constraints on parameter values. For a distribution dist, such boundary constraints can be specified by using the dist_LOWERBOUNDS and dist_UPPERBOUNDS subroutines. For more information, see the section Defining a Severity Distribution Model with the FCMP Procedure. Some aspects of the optimization process can be controlled by using the NLOPTIONS statement.
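As a rough illustration of maximizing $\log(L)$ within parameter bounds, the following Python sketch fits a one-parameter exponential model to left-truncated data. It uses a golden-section search, which is a stand-in chosen for brevity and not the optimizer that PROC HPSEVERITY uses. The data, the threshold, and the bounds are hypothetical. For this model the estimate has a closed form, the mean of $y - t^l$, which makes the result easy to check.

```python
import math

def neg_log_likelihood(theta, ys, tl):
    """Negative log likelihood for exponential(scale=theta) data left-truncated at tl.

    Each term is -log(f(y) / (1 - F(tl))) = log(theta) + (y - tl) / theta.
    """
    return sum(math.log(theta) + (y - tl) / theta for y in ys)

def golden_section_minimize(func, lower, upper, tol=1e-8):
    """Minimize a unimodal one-dimensional function on [lower, upper]."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lower, upper
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    fc, fd = func(c), func(d)
    while b - a > tol:
        if fc < fd:
            # The minimum lies in [a, d]; reuse c as the new upper probe.
            b, d, fd = d, c, fc
            c = b - inv_phi * (b - a)
            fc = func(c)
        else:
            # The minimum lies in [c, b]; reuse d as the new lower probe.
            a, c, fc = c, d, fd
            d = a + inv_phi * (b - a)
            fd = func(d)
    return (a + b) / 2.0

ys = [1.4, 2.1, 3.3, 1.9, 5.3]   # hypothetical losses, all above the threshold
tl = 1.0                          # hypothetical left-truncation threshold
lower, upper = 1e-6, 100.0        # hypothetical bounds on theta

theta_hat = golden_section_minimize(
    lambda theta: neg_log_likelihood(theta, ys, tl), lower, upper
)
closed_form = sum(y - tl for y in ys) / len(ys)
```

Minimizing the negative log likelihood is equivalent to maximizing $\log(L)$; the search interval plays the role of the lower and upper bounds on the parameter.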

Estimating Covariance and Standard Errors

PROC HPSEVERITY computes an estimate of the covariance matrix of the parameters by using the asymptotic theory of the maximum likelihood estimators (MLE). If $N$ denotes the number of observations used for estimating a parameter vector $\pmb{\theta}$, then the theory states that as $N \rightarrow \infty$, the distribution of $\hat{\pmb{\theta}}$, the estimate of $\pmb{\theta}$, converges to a normal distribution with mean $\pmb{\theta}$ and covariance $\hat{\mathbf{C}}$ such that $\mathbf{I}(\pmb{\theta}) \cdot \hat{\mathbf{C}} \rightarrow 1$, where $\mathbf{I}(\pmb{\theta}) = -E\left[ \nabla^2 \log(L(\pmb{\theta}))\right]$ is the information matrix for the likelihood of the data, $L(\pmb{\theta})$. The covariance estimate is obtained by using the inverse of the information matrix.

In particular, if $\mathbf{G} = \nabla^2 (-\log(L(\pmb{\theta})))$ denotes the Hessian matrix of the negative of log likelihood, then the covariance estimate is computed as

$\hat{\mathbf{C}} = \frac{N}{d} \mathbf{G}^{-1}$

where $d$ is a denominator that is determined by the VARDEF= option. If VARDEF=N, then $d = N$, which yields the asymptotic covariance estimate. If VARDEF=DF, then $d = N - k$, where $k$ is the number of parameters (the model's degrees of freedom). The VARDEF=DF option is the default, because it attempts to correct the potential bias introduced by the finite sample.

The standard error of the parameter $\theta_i$ is computed as the square root of the $i$th diagonal element of the estimated covariance matrix; that is, $s_i = \sqrt{\hat{C}_{ii}}$.
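The computation described above can be sketched in Python for a two-parameter lognormal model fit to uncensored, untruncated data. This is a minimal illustration under stated assumptions, not the procedure's implementation: the Hessian is approximated by central differences, the 2x2 inverse is written out by hand, and the data and all function names are hypothetical. The `vardef` argument mimics the VARDEF= option by switching the denominator between $N$ and $N - k$.

```python
import math

def neg_log_likelihood(params, ys):
    """Negative log likelihood of a lognormal(mu, sigma) model."""
    mu, sigma = params
    total = 0.0
    for y in ys:
        z = (math.log(y) - mu) / sigma
        total += math.log(y) + math.log(sigma) + 0.5 * math.log(2.0 * math.pi) + 0.5 * z * z
    return total

def hessian(func, params, step=1e-4):
    """Central-difference approximation of the Hessian of func at params."""
    k = len(params)
    hess = [[0.0] * k for _ in range(k)]
    for i in range(k):
        for j in range(k):
            def shifted(di, dj):
                p = list(params)
                p[i] += di * step
                p[j] += dj * step
                return func(p)
            hess[i][j] = (shifted(1, 1) - shifted(1, -1)
                          - shifted(-1, 1) + shifted(-1, -1)) / (4.0 * step * step)
    return hess

def invert_2x2(m):
    """Invert a 2x2 matrix; raise if it is (numerically) singular."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    if abs(det) < 1e-12:
        raise ValueError("Hessian is singular; covariance estimate is unavailable")
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]

def covariance(ys, params_hat, vardef="DF"):
    """Covariance estimate (N / d) * G^{-1}, with d = N - k (DF) or d = N (N)."""
    n, k = len(ys), len(params_hat)
    d = n - k if vardef == "DF" else n
    g_inv = invert_2x2(hessian(lambda p: neg_log_likelihood(p, ys), params_hat))
    return [[(n / d) * value for value in row] for row in g_inv]

ys = [1.2, 2.5, 0.8, 3.1, 1.9, 4.4, 2.2, 1.5]   # hypothetical loss amounts
logs = [math.log(y) for y in ys]
mu_hat = sum(logs) / len(logs)                   # closed-form lognormal MLE
sigma_hat = math.sqrt(sum((v - mu_hat) ** 2 for v in logs) / len(logs))

cov = covariance(ys, [mu_hat, sigma_hat], vardef="DF")
std_errs = [math.sqrt(cov[i][i]) for i in range(2)]
```

For this model the Hessian at the estimate is diagonal with entries $N/\hat{\sigma}^2$ and $2N/\hat{\sigma}^2$, so the standard errors should come out close to $\hat{\sigma}/\sqrt{d}$ and $\hat{\sigma}/\sqrt{2d}$, which gives a quick sanity check on the numerical result.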

If you have specified a custom objective function, then the covariance matrix of the parameters is still computed by inverting the information matrix, except that the Hessian matrix $\mathbf{G}$ is computed as $\mathbf{G} = \nabla^2 \log(U(\pmb{\theta}))$, where $U$ denotes your custom objective function that is minimized by the optimizer.

Covariance and standard error estimates might not be available if the Hessian matrix is found to be singular at the end of the optimization process. This is especially likely when the optimization process stops without converging.