DIST distribution-name-or-keyword <(distribution-option) <distribution-name-or-keyword <(distribution-option)>> ...></ preprocess-options> ;
The DIST statement specifies candidate distributions to be estimated by the SEVERITY procedure. You can specify multiple DIST
statements, and each statement can contain one or more distribution specifications.
For your convenience, PROC SEVERITY provides the following 10 different predefined distributions (the name in the parentheses
is the name to use in the DIST statement): Burr (BURR), exponential (EXP), gamma (GAMMA), generalized Pareto (GPD), inverse
Gaussian or Wald (IGAUSS), lognormal (LOGN), Pareto (PARETO), Tweedie (TWEEDIE), scaled Tweedie (STWEEDIE), and Weibull (WEIBULL).
These are described in detail in the section Predefined Distributions.
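As a minimal sketch of the syntax described above, the following statements fit three predefined distributions by using two DIST statements. The data set Work.Claims and the loss variable LossAmount are hypothetical names chosen for illustration only.

```sas
/* Assumption: Work.Claims is a hypothetical data set that contains */
/* a numeric loss variable named LossAmount.                        */
proc severity data=work.claims;
   loss lossamount;
   dist logn gamma;   /* two candidate distributions in one statement */
   dist weibull;      /* a second DIST statement adds another candidate */
run;
```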
You can specify any of the predefined distributions or any distribution that you have defined. If the specified distribution
is not a predefined distribution, then you must submit the CMPLIB= system option with appropriate libraries before you submit
the PROC SEVERITY step to enable the procedure to find the functions associated with your distribution. The predefined distributions
are defined in the Sashelp.Svrtdist library. However, you are not required to specify this library in the CMPLIB= system option.
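The following sketch shows one way to make a user-defined distribution visible to the procedure. It assumes that a distribution named MYDIST has already been defined in a hypothetical function library Work.Mydist; both names, as well as the data set and variable names, are illustrative and do not come from the product.

```sas
/* Assumption: Work.Mydist is a hypothetical library that contains  */
/* the functions that define a distribution named MYDIST.           */
options cmplib=(work.mydist);

proc severity data=work.claims;   /* hypothetical input data set */
   loss lossamount;               /* hypothetical loss variable  */
   dist mydist logn;              /* one user-defined, one predefined */
run;
```

The OPTIONS statement is submitted before the PROC SEVERITY step. Sashelp.Svrtdist is not listed, because the predefined LOGN distribution does not require it.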
As a convenience, you can also use a shortcut keyword to indicate a list of distributions. You can specify one or more of
the following keywords:
-
_ALL_
-
specifies all the predefined distributions and the distributions that you have defined in the libraries that are specified
in the CMPLIB= system option. In addition to the eight predefined distributions included by the _PREDEFINED_ keyword, this
list also includes the Tweedie and scaled Tweedie distributions that are defined in the Sashelp.Svrtdist library.
-
_PREDEFINED_
-
specifies the list of eight predefined distributions: BURR, EXP, GAMMA, GPD, IGAUSS, LOGN, PARETO, and WEIBULL. Although the
TWEEDIE and STWEEDIE distributions are available in the Sashelp.Svrtdist library along with these eight distributions,
they are not included by this keyword. If you want to fit the TWEEDIE and STWEEDIE
distributions, then you must specify them explicitly or use the _ALL_ keyword.
-
_USER_
-
specifies the list of all the distributions that you have defined in the libraries that are specified in the CMPLIB= system
option. This list does not include the distributions defined in the Sashelp.Svrtdist library, even if you have specified
Sashelp.Svrtdist in the CMPLIB= option.
The use of these keywords, especially _ALL_, can result in a large list of distributions, which might take a long time to
estimate. A warning is printed to the SAS log if the total number of distribution models to estimate exceeds 10.
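For illustration, the following sketch combines a shortcut keyword with explicit distribution names in order to fit the eight distributions that _PREDEFINED_ includes plus the two Tweedie variants, without also pulling in user-defined distributions as _ALL_ would. The data set and variable names are hypothetical.

```sas
/* Assumption: Work.Claims and LossAmount are hypothetical names. */
proc severity data=work.claims;
   loss lossamount;
   /* _PREDEFINED_ omits TWEEDIE and STWEEDIE, so they are named explicitly */
   dist _predefined_ tweedie stweedie;
run;
```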
If you specify the OUTCDF= option or request a CDF plot and you do not specify any DIST statement, then PROC SEVERITY does
not fit any distributions and produces the empirical estimates of the cumulative distribution function.
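A step of the following form illustrates this case. It contains no DIST statement, so only the empirical estimates are expected in the OUTCDF= data set. The names Work.Claims, LossAmount, and Work.Empcdf are hypothetical.

```sas
/* Assumption: all data set and variable names are hypothetical. */
/* No DIST statement is specified, so no distributions are fit.  */
proc severity data=work.claims outcdf=work.empcdf;
   loss lossamount;
run;
```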
The following distribution-option values can be used in the DIST statement for a distribution name that is not a shortcut keyword:
-
INIT=(name=value … name=value)
-
specifies the initial values to be used for the distribution parameters to start the parameter estimation process. The values
must be specified by parameter names. The parameter names must match the names used in the model definition. For example,
suppose the definition of a model M contains an M_PDF function with the following signature:
function M_PDF(x, alpha, beta);
For this model, the names alpha and beta must be used in the INIT= option. The names are case-insensitive. If you do not
specify initial values for some parameters in the INIT= option, then a default value of 0.001 is assumed for those parameters.
If you specify an incorrect parameter, PROC SEVERITY prints a warning to the SAS log and does not fit the model. All specified
values must be nonmissing.
If you are modeling regression effects, then the initial value of the first distribution parameter (alpha in the preceding
example) should be the initial base value of the scale parameter or log-transformed scale parameter. For more information,
see the section Estimating Regression Effects.
Using the INIT= option is one of three methods available for initializing the parameters. For more information, see the
section Parameter Initialization. If none of the initialization methods is used, then PROC SEVERITY initializes all parameters to 0.001.
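Continuing with the model M from the preceding example, the INIT= option might be written as in the following sketch. The library Work.Mymodels, the data set, the loss variable, and the initial values 2 and 0.5 are all hypothetical; only the parameter names alpha and beta are taken from the M_PDF signature shown earlier.

```sas
/* Assumption: Work.Mymodels is a hypothetical library that defines model M, */
/* whose M_PDF function has the signature M_PDF(x, alpha, beta).             */
options cmplib=(work.mymodels);

proc severity data=work.claims;   /* hypothetical input data set */
   loss lossamount;               /* hypothetical loss variable  */
   /* Initial values are given by parameter name; the values are illustrative */
   dist m(init=(alpha=2 beta=0.5));
run;
```

If beta were omitted from the list, it would start from the default value of 0.001 as described above.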
You can specify the following preprocess-options in the DIST statement:
-
LISTONLY
-
specifies that the list of all candidate distributions be printed to the SAS log without doing any further processing on them.
This option is especially useful when you use a shortcut keyword to include a list of distributions. It enables you to find
out which distributions are included by the keyword.
-
VALIDATEONLY
-
specifies that all candidate distributions be checked for validity without doing any further processing on them. If a distribution
is invalid, the reason for invalidity is written to the SAS log. If all distributions are valid, then the distribution information
is written to the SAS log. The information includes name, description, validity status (valid or invalid), and number of distribution
parameters. The information is not written to the SAS log if you have specified an OUTMODELINFO= data set or the PRINT=DISTINFO
or PRINT=ALL option in the PROC SEVERITY statement. This option is especially useful when you specify your own distributions
or when you specify the _USER_ or _ALL_ keywords in the DIST statement. It enables you to check whether your custom distribution
definitions satisfy PROC SEVERITY’s requirements for the specified modeling task. If you intend to fit a model with regression
effects, you should also specify the SCALEMODEL statement, because it instructs PROC SEVERITY to
perform additional checks to validate whether regression effects can be modeled on each candidate distribution.
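The two preprocess-options might be used as in the following sketch. The first step only lists the distributions that _ALL_ expands to. The second step validates user-defined distributions and includes a SCALEMODEL statement so that the additional regression-related checks are performed. The library Work.Mydist, the data set Work.Claims, and the variables LossAmount and Age are hypothetical names.

```sas
/* Assumption: all library, data set, and variable names are hypothetical. */
options cmplib=(work.mydist);

/* Step 1: print the candidate list to the SAS log without further processing */
proc severity data=work.claims;
   loss lossamount;
   dist _all_ / listonly;
run;

/* Step 2: check user-defined distributions for validity, including */
/* whether regression effects can be modeled on each of them        */
proc severity data=work.claims;
   loss lossamount;
   scalemodel age;
   dist _user_ / validateonly;
run;
```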