PROC SEVERITY: Predefined Distribution Models :: SAS/ETS(R) 9.22 User's Guide

The SEVERITY Procedure

Predefined Distribution Models

A set of predefined distribution models is provided with the SEVERITY procedure. A summary of the models is provided in Table 22.3. For each distribution model, the table lists the parameters in the order in which they appear in the signature of the functions or subroutines that accept distribution parameters as input or output arguments. The table also mentions the bounds on the parameters. If the bounds are different from their default values, then the distribution model contains appropriately defined name_LOWERBOUNDS or name_UPPERBOUNDS subroutines.

All the predefined distribution models, except LOGN, are parameterized such that their first parameter is the scale parameter. For LOGN, the first parameter $\text{[math]}$ is a log-transformed scale parameter, which is specified by using the LOGN_SCALETRANSFORM subroutine. The presence of a scale parameter enables you to use any of the predefined distributions as a candidate for estimating regression effects.

If you need to use the functions or subroutines defined in the predefined distributions in SAS statements other than the PROC SEVERITY step (such as in a DATA step), then they are available to you in the SASHELP.SVRTDIST library. Specify the library by using the OPTIONS global statement to set the CMPLIB= system option prior to using these functions. Note that you do not need to use the CMPLIB= option in order to use the predefined distributions with PROC SEVERITY.

Table 22.3 Predefined SEVERITY Distributions
Name	Distribution	Parameters	PDF ( $\text{[math]}$ ) and CDF ( $\text{[math]}$ )
BURR	Burr	$\text{[math]}$ , $\text{[math]}$ ,	$\text{[math]}$	$\text{[math]}$
		$\text{[math]}$	$\text{[math]}$	$\text{[math]}$
EXP	Exponential	$\text{[math]}$	$\text{[math]}$	$\text{[math]}$
			$\text{[math]}$	$\text{[math]}$
GAMMA	Gamma	$\text{[math]}$ , $\text{[math]}$	$\text{[math]}$	$\text{[math]}$
			$\text{[math]}$	$\text{[math]}$
GPD	Generalized	$\text{[math]}$ , $\text{[math]}$	$\text{[math]}$	$\text{[math]}$
	Pareto		$\text{[math]}$	$\text{[math]}$
IGAUSS	Inverse Gaussian	$\text{[math]}$ , $\text{[math]}$	$\text{[math]}$	$\text{[math]}$
	(Wald)		$\text{[math]}$	$\text{[math]}$
				$\text{[math]}$
LOGN	Lognormal	$\text{[math]}$ (no bounds),	$\text{[math]}$	$\text{[math]}$
		$\text{[math]}$	$\text{[math]}$	$\text{[math]}$
PARETO	Pareto	$\text{[math]}$ , $\text{[math]}$	$\text{[math]}$	$\text{[math]}$
			$\text{[math]}$	$\text{[math]}$
WEIBULL	Weibull	$\text{[math]}$ , $\text{[math]}$	$\text{[math]}$	$\text{[math]}$
			$\text{[math]}$	$\text{[math]}$
Notes:
1. $\text{[math]}$ , wherever $\text{[math]}$ is used.
2. $\text{[math]}$ denotes the scale parameter for all the distributions. For LOGN, $\text{[math]}$ .
3. Parameters are listed in the order in which they are defined in the distribution model.
4. $\text{[math]}$ is the lower incomplete gamma function.
5. $\text{[math]}$ is the standard normal CDF.

Parameter Initialization for Predefined Distribution Models

The definition of each distribution model also contains a name_PARMINIT subroutine to initialize the parameters. The parameters are initialized by using the method of moments for all the distributions, except for the gamma and the Weibull distributions. For the gamma distribution, approximate maximum likelihood estimates are used. For the Weibull distribution, the method of percentile matching is used.

Given $\text{[math]}$ observations of the severity value $\text{[math]}$ ( $\text{[math]}$ ), the estimate of the $\text{[math]}$ th raw moment is denoted by $\text{[math]}$ and computed as

$\text{[math]}$
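
As an illustration of the raw moment estimate, the following Python sketch averages the k-th powers of the observations. The function name raw_moment and the sample values are hypothetical; they are not part of PROC SEVERITY.

```python
def raw_moment(y, k):
    """Return the k-th sample raw moment: the average of y_i**k."""
    if not y:
        raise ValueError("need at least one observation")
    return sum(v ** k for v in y) / len(y)


# Hypothetical severity values, for illustration only.
losses = [1.0, 2.0, 3.0, 4.0]
m1 = raw_moment(losses, 1)  # sample mean
m2 = raw_moment(losses, 2)  # mean of the squared values
```

The first two raw moments are the only sample statistics that most of the method-of-moments initializations described later require.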

The 100 $\text{[math]}$ th percentile is denoted by $\text{[math]}$ ( $\text{[math]}$ ). By definition, $\text{[math]}$ satisfies

$\text{[math]}$

where $\text{[math]}$ . PROC SEVERITY uses the following practical method of computing $\text{[math]}$ . Let $\text{[math]}$ denote the empirical distribution function (EDF) estimate at a severity value $\text{[math]}$ . This estimate is computed by PROC SEVERITY and supplied to the name_PARMINIT subroutine. Let $\text{[math]}$ and $\text{[math]}$ denote two consecutive values in the array of $\text{[math]}$ values such that $\text{[math]}$ and $\text{[math]}$ . Then, the estimate $\text{[math]}$ is computed as

$\text{[math]}$

where $\text{[math]}$ and $\text{[math]}$ .
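
The percentile computation can be sketched in Python as follows. This sketch assumes that the estimate is a linear interpolation between the two consecutive severity values whose EDF estimates bracket the requested probability; the function name, the argument layout, and the handling of a probability at or below the first EDF value are assumptions made for illustration.

```python
def percentile_from_edf(ys, edf, p):
    """Estimate the 100p-th percentile by linear interpolation on the EDF.

    ys  : distinct severity values in increasing order
    edf : EDF estimate at each value in ys (nondecreasing)
    p   : probability strictly between 0 and 1
    """
    if not 0.0 < p < 1.0:
        raise ValueError("p must be strictly between 0 and 1")
    for i, f_hi in enumerate(edf):
        if f_hi >= p:
            if i == 0:
                # Assumption: no smaller value exists to interpolate from.
                return ys[0]
            y_lo, y_hi, f_lo = ys[i - 1], ys[i], edf[i - 1]
            # f_lo < p <= f_hi, so the denominator is positive.
            return y_lo + (p - f_lo) / (f_hi - f_lo) * (y_hi - y_lo)
    raise ValueError("the EDF never reaches p")


# Hypothetical EDF over four distinct values.
values = [1.0, 2.0, 3.0, 4.0]
edf_values = [0.25, 0.5, 0.75, 1.0]
median = percentile_from_edf(values, edf_values, 0.5)
```

In PROC SEVERITY the EDF estimates are supplied to the name_PARMINIT subroutine; here they are passed in directly.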

Let $\text{[math]}$ denote the smallest double-precision floating-point number such that $\text{[math]}$ . This machine precision constant can be obtained by using the CONSTANT function in Base SAS software.
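
For readers who want to reproduce the checks outside SAS, the following Python sketch obtains the conventional double-precision machine epsilon and uses it as a threshold for an almost-zero sample variance. The helper name and the exact form of the variance test are assumptions for illustration.

```python
import sys

# Gap between 1.0 and the next representable double (2**-52).
EPS = sys.float_info.epsilon


def is_almost_zero_variance(m1, m2, tol=EPS):
    """Return True when the sample variance m2 - m1**2 falls below tol."""
    return (m2 - m1 * m1) < tol
```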

The details of how parameters are initialized for each predefined distribution model are as follows:

BURR

The parameters are initialized by using the method of moments. The $\text{[math]}$ th raw moment of the Burr distribution is:

$\text{[math]}$

Three moment equations $\text{[math]}$ ( $\text{[math]}$ ) need to be solved for initializing the three parameters of the distribution. In order to get an approximate closed form solution, the second shape parameter $\text{[math]}$ is initialized to a value of $\text{[math]}$ . If $\text{[math]}$ , then simplifying and solving the moment equations yields the following feasible set of initial values:

$\text{[math]}$

If $\text{[math]}$ , then the parameters are initialized as follows:

$\text{[math]}$

EXP

The parameters are initialized by using the method of moments. The $\text{[math]}$ th raw moment of the exponential distribution is:

$\text{[math]}$

Solving $\text{[math]}$ yields the initial value of $\text{[math]}$ .
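
Because the mean of an exponential distribution with scale parameter theta is theta itself, the moment equation reduces to the sample mean. A minimal Python sketch, with a hypothetical function name and sample:

```python
def exp_parminit(y):
    """Method-of-moments start for the exponential scale: theta = m1."""
    if not y:
        raise ValueError("need at least one observation")
    return sum(y) / len(y)


# Hypothetical severity values, for illustration only.
theta0 = exp_parminit([1.0, 2.0, 3.0, 6.0])
```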

GAMMA

The parameter $\text{[math]}$ is initialized by using its approximate maximum likelihood (ML) estimate. For a set of $\text{[math]}$ iid observations $\text{[math]}$ ( $\text{[math]}$ ), drawn from a gamma distribution, the log likelihood, $\text{[math]}$ , is defined as follows:

	$\text{[math]}$	$\text{[math]}$
	$\text{[math]}$	$\text{[math]}$

Using a shorter notation of $\text{[math]}$ to denote $\text{[math]}$ and solving the equation $\text{[math]}$ yields the following ML estimate of $\text{[math]}$ :

$\text{[math]}$

Substituting this estimate in the expression of $\text{[math]}$ and simplifying gives

$\text{[math]}$

Let $\text{[math]}$ be defined as follows:

$\text{[math]}$

Solving the equation $\text{[math]}$ yields the following expression in terms of the digamma function, $\text{[math]}$ :

$\text{[math]}$

The digamma function can be approximated as follows:

$\text{[math]}$

This approximation is within 1.4% of the true value for all the values of $\text{[math]}$ except when $\text{[math]}$ is arbitrarily close to the positive root of the digamma function (which is approximately 1.461632). Even for the values of $\text{[math]}$ that are close to the positive root, the absolute error between true and approximate values is still acceptable ( $\text{[math]}$ for $\text{[math]}$ ). Solving the equation that arises from this approximation yields the following estimate of $\text{[math]}$ :

$\text{[math]}$

If this approximate ML estimate is infeasible, then the method of moments is used. The $\text{[math]}$ th raw moment of the gamma distribution is:

$\text{[math]}$

Solving $\text{[math]}$ and $\text{[math]}$ yields the following initial value for $\text{[math]}$ :

$\text{[math]}$

If $\text{[math]}$ (almost zero sample variance), then $\text{[math]}$ is initialized as follows:

$\text{[math]}$

After computing the estimate of $\text{[math]}$ , the estimate of $\text{[math]}$ is computed as follows:

$\text{[math]}$

Both the maximum likelihood method and the method of moments arrive at the same relationship between $\text{[math]}$ and $\text{[math]}$ .
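
The approximate ML initialization can be sketched in Python as follows. The formulas are not reproduced in this text, so the sketch rests on two stated assumptions: the quantity d is the log of the sample mean minus the mean of the logs, and the digamma function is approximated by log(a) - (1/a)(0.5 + 1/(12a + 2)). Under those assumptions the equation log(a) - psi(a) = d becomes a quadratic in a. The function name is hypothetical.

```python
import math


def gamma_parminit(y):
    """Approximate-ML starting values (theta, alpha) for a gamma sample."""
    n = len(y)
    mean = sum(y) / n
    # d = log(arithmetic mean) - mean of logs; positive unless y is constant.
    d = math.log(mean) - sum(math.log(v) for v in y) / n
    if d <= 0.0:
        raise ValueError("approximate ML estimate is infeasible")
    # Positive root of 12*d*a**2 + (2*d - 6)*a - 2 = 0, obtained from
    # (1/a) * (0.5 + 1/(12*a + 2)) = d (assumed digamma approximation).
    alpha = (3.0 - d + math.sqrt((d - 3.0) ** 2 + 24.0 * d)) / (12.0 * d)
    theta = mean / alpha  # scale follows from the mean: m1 = theta * alpha
    return theta, alpha


# Hypothetical severity values, for illustration only.
sample = [1.0, 2.0, 3.0, 4.0, 10.0]
theta0, alpha0 = gamma_parminit(sample)
```

The sketch raises an error where the text falls back to the method of moments; that fallback is not implemented here.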

GPD

The parameters are initialized by using the method of moments. Notice that for $\text{[math]}$ , the CDF of the generalized Pareto distribution (GPD) is:

	$\text{[math]}$	$\text{[math]}$
	$\text{[math]}$	$\text{[math]}$

This is equivalent to a Pareto distribution with scale parameter $\text{[math]}$ and shape parameter $\text{[math]}$ . Using this relationship, the parameter initialization method used for the PARETO distribution model is used to get the following initial values for the parameters of the GPD distribution model:

$\text{[math]}$

If $\text{[math]}$ (almost zero sample variance) or $\text{[math]}$ , then the parameters are initialized as follows:

$\text{[math]}$
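
The mapping from Pareto starting values to GPD starting values can be sketched in Python. The sketch assumes the GPD CDF 1 - (1 + xi*x/theta)^(-1/xi), so that the equivalent Pareto distribution has scale theta/xi and shape 1/xi, and it assumes the standard Pareto raw moments. The function names are hypothetical, and the fallback values used in the infeasible case are not reproduced here.

```python
import sys

EPS = sys.float_info.epsilon  # double-precision machine epsilon


def pareto_mom(m1, m2):
    """Method-of-moments (scale, shape) for a Pareto from two raw moments."""
    var = m2 - m1 * m1
    denom = m2 - 2.0 * m1 * m1
    if var < EPS or denom < EPS:
        # The source assigns fallback values here; this sketch only
        # signals the condition.
        raise ValueError("moment solution is infeasible for this sample")
    return m1 * m2 / denom, 2.0 * var / denom


def gpd_parminit(m1, m2):
    """Starting values (theta, xi) for the GPD via the Pareto equivalence."""
    pareto_scale, pareto_shape = pareto_mom(m1, m2)
    xi = 1.0 / pareto_shape      # Pareto shape = 1/xi
    theta = pareto_scale * xi    # Pareto scale = theta/xi
    return theta, xi


# Raw moments of a GPD with theta = 3 and xi = 0.25 (illustrative values).
theta0, xi0 = gpd_parminit(4.0, 48.0)
```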

IGAUSS

The parameters are initialized by using the method of moments. Note that the standard parameterization of the inverse Gaussian distribution (also known as the Wald distribution), in terms of the location parameter $\text{[math]}$ and shape parameter $\text{[math]}$ , is as follows (Klugman, Panjer, and Willmot 1998, p. 583):

	$\text{[math]}$	$\text{[math]}$
	$\text{[math]}$	$\text{[math]}$

For this parameterization, it is known that the mean is $\text{[math]}$ and the variance is $\text{[math]}$ , which yields the second raw moment as $\text{[math]}$ (computed by using $\text{[math]}$ ).

The predefined IGAUSS distribution model in PROC SEVERITY uses the following alternate parameterization to allow the distribution to have a scale parameter, $\text{[math]}$ :

	$\text{[math]}$	$\text{[math]}$
	$\text{[math]}$	$\text{[math]}$

The parameters $\text{[math]}$ (scale) and $\text{[math]}$ (shape) of this alternate form are related to the parameters $\text{[math]}$ and $\text{[math]}$ of the preceding form such that $\text{[math]}$ and $\text{[math]}$ . Using this relationship, the first and second raw moments of the IGAUSS distribution are:

	$\text{[math]}$	$\text{[math]}$
	$\text{[math]}$	$\text{[math]}$

Solving $\text{[math]}$ and $\text{[math]}$ yields the following initial values:

$\text{[math]}$

If $\text{[math]}$ (almost zero sample variance), then the parameters are initialized as follows:

$\text{[math]}$
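
A Python sketch of the IGAUSS moment solution follows. It assumes that the alternate parameterization sets the scale equal to the mean and the shape equal to the ratio of the standard shape parameter to the mean, which gives a mean of theta and a variance of theta^2/alpha. That relationship is an assumption here because the formulas are not reproduced in this text; the function name is hypothetical, and the almost-zero-variance fallback is not implemented.

```python
import sys

EPS = sys.float_info.epsilon  # double-precision machine epsilon


def igauss_parminit(m1, m2):
    """Method-of-moments start (theta, alpha) from two raw moments.

    Assumes m1 = theta and m2 = theta**2 * (1 + 1/alpha).
    """
    var = m2 - m1 * m1
    if var < EPS:
        # The source assigns fallback values here; this sketch only
        # signals the condition.
        raise ValueError("sample variance is almost zero")
    return m1, m1 * m1 / var


# Raw moments for theta = 2 and alpha = 4 under the assumed moments.
theta0, alpha0 = igauss_parminit(2.0, 5.0)
```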

LOGN

The parameters are initialized by using the method of moments. The $\text{[math]}$ th raw moment of the lognormal distribution is:

$\text{[math]}$

Solving $\text{[math]}$ and $\text{[math]}$ yields the following initial values:

$\text{[math]}$
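
The lognormal moment equations can be solved in closed form. The Python sketch below assumes the standard lognormal raw moment exp(k*mu + k^2*sigma^2/2); the function name is hypothetical.

```python
import math


def logn_parminit(m1, m2):
    """Method-of-moments start (mu, sigma) from the first two raw moments."""
    log_m1, log_m2 = math.log(m1), math.log(m2)
    # log(m2) - 2*log(m1) equals sigma**2 under the assumed moments.
    sigma_sq = log_m2 - 2.0 * log_m1
    if sigma_sq <= 0.0:
        raise ValueError("moments imply a nonpositive sigma**2")
    mu = 2.0 * log_m1 - 0.5 * log_m2
    return mu, math.sqrt(sigma_sq)


# Raw moments of a lognormal with mu = 1 and sigma = 0.5 (illustrative).
true_mu, true_sigma = 1.0, 0.5
m1 = math.exp(true_mu + true_sigma ** 2 / 2.0)
m2 = math.exp(2.0 * true_mu + 2.0 * true_sigma ** 2)
mu0, sigma0 = logn_parminit(m1, m2)
```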

PARETO

The parameters are initialized by using the method of moments. The $\text{[math]}$ th raw moment of the Pareto distribution is:

$\text{[math]}$

Solving $\text{[math]}$ and $\text{[math]}$ yields the following initial values:

$\text{[math]}$

If $\text{[math]}$ (almost zero sample variance) or $\text{[math]}$ , then the parameters are initialized as follows:

$\text{[math]}$
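
The Pareto moment solution can be sketched in Python. The sketch assumes a mean of theta/(alpha - 1) and a second raw moment of 2*theta^2/((alpha - 1)(alpha - 2)), and it assumes that the second infeasibility condition is an almost-zero value of m2 - 2*m1^2. The function name is hypothetical, and the fallback values are not reproduced here.

```python
import sys

EPS = sys.float_info.epsilon  # double-precision machine epsilon


def pareto_parminit(m1, m2):
    """Method-of-moments start (theta, alpha) from the first two raw moments."""
    var = m2 - m1 * m1
    denom = m2 - 2.0 * m1 * m1
    if var < EPS or denom < EPS:
        # The source assigns fallback values here; this sketch only
        # signals the condition.
        raise ValueError("moment solution is infeasible for this sample")
    theta = m1 * m2 / denom
    alpha = 2.0 * var / denom
    return theta, alpha


# Raw moments of a Pareto with theta = 12 and alpha = 4 (illustrative).
theta0, alpha0 = pareto_parminit(4.0, 48.0)
```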

WEIBULL

The parameters are initialized by using the percentile matching method. Let $\text{[math]}$ and $\text{[math]}$ denote the estimates of the $\text{[math]}$ th and $\text{[math]}$ th percentiles, respectively. Using the formula for the CDF of the Weibull distribution, they can be written as

	$\text{[math]}$	$\text{[math]}$
	$\text{[math]}$	$\text{[math]}$

Simplifying and solving these two equations yields the following initial values:

$\text{[math]}$

where $\text{[math]}$ . These initial values agree with those suggested in Klugman, Panjer, and Willmot (1998).
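
Percentile matching for the Weibull distribution can be sketched in Python. The sketch assumes the CDF 1 - exp(-(x/theta)^tau) and, as default probabilities, the 25th and 75th percentiles; both defaults and the function name are assumptions for illustration. The two equations are solved by subtraction, which is algebraically equivalent to a solution written in terms of a ratio of log-log constants.

```python
import math


def weibull_parminit(q_lo, q_hi, p_lo=0.25, p_hi=0.75):
    """Percentile-matching start (theta, tau) from two percentile estimates."""
    if not 0.0 < q_lo < q_hi:
        raise ValueError("need 0 < q_lo < q_hi")
    # From the CDF: tau * (log(q) - log(theta)) = log(-log(1 - p)).
    c_lo = math.log(-math.log(1.0 - p_lo))
    c_hi = math.log(-math.log(1.0 - p_hi))
    tau = (c_hi - c_lo) / (math.log(q_hi) - math.log(q_lo))
    theta = math.exp(math.log(q_hi) - c_hi / tau)
    return theta, tau


# Exact quartiles of a Weibull with theta = 2 and tau = 1.5 (illustrative).
q1 = 2.0 * (-math.log(0.75)) ** (1.0 / 1.5)
q3 = 2.0 * (-math.log(0.25)) ** (1.0 / 1.5)
theta0, tau0 = weibull_parminit(q1, q3)
```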

A summary of the initial values of all the parameters for all the predefined distributions is given in Table 22.4. The table also provides the names of the parameters to use in the INIT= option in the DIST statement if you want to provide a different initial value.

Table 22.4 Parameter Initialization for Predefined Distributions
Distribution	Parameter	Name for INIT option	Default Initial Value
BURR	$\text{[math]}$	theta	$\text{[math]}$
	$\text{[math]}$	alpha	$\text{[math]}$
	$\text{[math]}$	gamma	$\text{[math]}$
EXP	$\text{[math]}$	theta	$\text{[math]}$
GAMMA	$\text{[math]}$	theta	$\text{[math]}$
	$\text{[math]}$	alpha	$\text{[math]}$
GPD	$\text{[math]}$	theta	$\text{[math]}$
	$\text{[math]}$	xi	$\text{[math]}$
IGAUSS	$\text{[math]}$	theta	$\text{[math]}$
	$\text{[math]}$	alpha	$\text{[math]}$
LOGN	$\text{[math]}$	mu	$\text{[math]}$
	$\text{[math]}$	sigma	$\text{[math]}$
PARETO	$\text{[math]}$	theta	$\text{[math]}$
	$\text{[math]}$	alpha	$\text{[math]}$
WEIBULL	$\text{[math]}$	theta	$\text{[math]}$
	$\text{[math]}$	tau	$\text{[math]}$
Notes:
$\text{[math]}$ $\text{[math]}$ denotes the $\text{[math]}$ th raw moment
$\text{[math]}$ $\text{[math]}$
$\text{[math]}$ $\text{[math]}$ and $\text{[math]}$ denote the $\text{[math]}$ th and $\text{[math]}$ th percentiles, respectively
$\text{[math]}$ $\text{[math]}$

Note: This procedure is experimental.
