The SEVERITY Procedure

ODS Graphics

Statistical procedures use ODS Graphics to create graphs as part of their output. ODS Graphics is described in detail in Chapter 21: Statistical Graphics Using ODS in SAS/STAT 12.1 User's Guide.

Before you create graphs, ODS Graphics must be enabled (for example, with the ODS GRAPHICS ON statement). For more information about enabling and disabling ODS Graphics, see the section Enabling and Disabling ODS Graphics in that chapter.

The overall appearance of graphs is controlled by ODS styles. Styles and other aspects of using ODS Graphics are discussed in the section A Primer on ODS Statistical Graphics in that chapter.

This section describes the use of ODS for creating graphics with the SEVERITY procedure.

ODS Graph Names

PROC SEVERITY assigns a name to each graph it creates by using ODS. You can use these names to selectively reference the graphs. The names are listed in Table 23.6.

Table 23.6: ODS Graphics Produced by PROC SEVERITY

| ODS Graph Name | Plot Description | PLOTS= Option |
|---|---|---|
| CDFPlot | Comparative CDF Plot | CDF |
| CDFDistPlot | CDF Plot per Distribution | CDFPERDIST |
| PDFPlot | Comparative PDF Plot | PDF |
| PDFDistPlot | PDF Plot per Distribution | PDFPERDIST |
| PPPlot | P-P Plot of CDF and EDF | PP |
| QQPlot | Q-Q Plot | QQ |


Comparative CDF Plot

The comparative CDF plot helps you visually compare the cumulative distribution function (CDF) estimates of all the candidate distribution models and the empirical distribution function (EDF) estimate. The plot does not contain CDF estimates for models whose parameter estimation process does not converge. The horizontal axis represents the values of the response variable. The vertical axis represents the values of the CDF or EDF estimates.

If truncation is specified, then conditional CDF estimates are plotted. Otherwise, unconditional CDF estimates are plotted. The conditional estimates are computed using the method described in the section Truncation and Conditional CDF Estimates.

If regressor variables are specified, then the plotted CDF estimates are from a mixture distribution. See the section CDF and PDF Estimates with Regression Effects for more information.

CDF Plot per Distribution

The CDF plot per distribution shows the CDF estimates of each candidate distribution model unless that model’s parameter estimation process does not converge. The plot also contains estimates of the EDF. The horizontal axis represents the values of the response variable. The vertical axis represents the values of the CDF or EDF estimates.

This plot shows the lower and upper pointwise confidence limits for the EDF estimates. For an EDF estimate $F_n$ with standard error $\sigma_n$, the limits are computed as $\mbox{MAX}(0, F_n - z_{(1-\alpha/2)} \sigma_n)$ and $\mbox{MIN}(1, F_n + z_{(1-\alpha/2)} \sigma_n)$, respectively, where $z_p$ is the $p$th quantile of the standard normal distribution and $\alpha$ is the value that you specify in the EDFALPHA= option (the default, $\alpha=0.05$, yields 95% confidence limits).
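The limit computation above can be sketched in a few lines of Python. This is only an illustration of the formula, not PROC SEVERITY's implementation; the function name `edf_confidence_limits` and the sample inputs are hypothetical, and the standard error is assumed to be supplied by the caller.

```python
from statistics import NormalDist


def edf_confidence_limits(f_n, sigma_n, alpha=0.05):
    """Return (lower, upper) pointwise limits for one EDF estimate.

    f_n     : EDF estimate at some response value (hypothetical input)
    sigma_n : standard error of that estimate (hypothetical input)
    alpha   : value playing the role of the EDFALPHA= option
    """
    # z_(1 - alpha/2): quantile of the standard normal distribution
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    # Clip to [0, 1] because the limits bound a probability.
    lower = max(0.0, f_n - z * sigma_n)
    upper = min(1.0, f_n + z * sigma_n)
    return lower, upper


if __name__ == "__main__":
    print(edf_confidence_limits(0.50, 0.10))  # limits well inside (0, 1)
    print(edf_confidence_limits(0.98, 0.05))  # upper limit is clipped at 1
```

The MAX and MIN operations matter mostly in the tails, where an unclipped limit would otherwise fall outside the valid probability range.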

If truncation is specified, then conditional CDF estimates are plotted. Otherwise, unconditional CDF estimates are plotted. The conditional estimates are computed using the method described in the section Truncation and Conditional CDF Estimates.

If regressor variables are specified, then the plotted CDF estimates are from a mixture distribution. See the section CDF and PDF Estimates with Regression Effects for more information.

Comparative PDF Plot

The comparative PDF plot helps you visually compare the probability density function (PDF) estimates of all the candidate distribution models. The plot does not contain PDF estimates for models whose parameter estimation process does not converge. The horizontal axis represents the values of the response variable. The vertical axis represents the values of the PDF estimates.

If the HISTOGRAM option is specified, then the plot also contains the histogram of response variable values. If the KERNEL option is specified, then the plot also contains the kernel density estimate for the response variable values.

If regressor variables are specified, then the plotted PDF estimates are from a mixture distribution. See the section CDF and PDF Estimates with Regression Effects for more information.

PDF Plot per Distribution

The PDF plot per distribution shows the PDF estimates of each candidate distribution model unless that model’s parameter estimation process does not converge. The horizontal axis represents the values of the response variable. The vertical axis represents the values of the PDF estimates.

If the HISTOGRAM option is specified, then the plot also contains the histogram of response variable values. If the KERNEL option is specified, then the plot also contains the kernel density estimate for the response variable values.

If regressor variables are specified, then the plotted PDF estimates are from a mixture distribution. See the section CDF and PDF Estimates with Regression Effects for more information.

P-P Plot of CDF and EDF

The P-P plot of CDF and EDF is the probability-probability plot that compares the CDF estimates of a distribution with the EDF estimates. A plot is not prepared for models whose parameter estimation process does not converge. The horizontal axis represents the CDF estimates of a candidate distribution and the vertical axis represents the EDF estimates.

This plot can be interpreted as displaying the data that are used for computing the EDF-based statistics of fit for the given candidate distribution. As described in the section EDF-Based Statistics, these statistics are computed by comparing the EDF, denoted by $F_n(y)$, and the CDF, denoted by $F(y)$, at each of the response variable values $y$. Using the probability integral transform $z = F(y)$, this is equivalent to comparing the EDF of $z$, denoted by $F_n(z)$, and the CDF of $z$, denoted by $F(z)$ (D’Agostino and Stephens 1986, Ch. 4). Because $z$ follows a uniform distribution on $(0,1)$ under the candidate distribution, its CDF is $F(z)=z$, so the EDF-based statistics can be computed by comparing the EDF estimate of $z$ with the estimate of $z$ itself.

The horizontal axis of the plot represents the estimated CDF $\hat{z}=\hat{F}(y)$. The vertical axis represents the estimated EDF of $z$, $\hat{F}_n(z)$. The plot contains a scatter plot of ($\hat{z}$, $\hat{F}_n(z)$) points and a reference line $F_n(z)=z$ that represents the expected uniform distribution of $z$. Points that lie closer to the reference line indicate a better fit than points that lie farther from it.
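As a rough illustration of how the plotted points arise, the following Python sketch builds ($\hat{z}$, EDF) pairs for a small sample. It assumes the standard EDF (no censoring or truncation) and uses an exponential distribution with a made-up scale as the candidate model; `exponential_cdf`, `pp_points`, and the sample values are hypothetical and are not part of PROC SEVERITY.

```python
import math
from bisect import bisect_right


def exponential_cdf(y, scale):
    """CDF of an exponential distribution (illustrative candidate model)."""
    return 1.0 - math.exp(-y / scale)


def pp_points(values, cdf):
    """Return (z_hat, EDF) pairs, one per observation, ordered by response value."""
    n = len(values)
    ordered = sorted(values)
    points = []
    for y in ordered:
        z_hat = cdf(y)                      # horizontal axis: estimated CDF at y
        edf = bisect_right(ordered, y) / n  # vertical axis: share of observations <= y
        points.append((z_hat, edf))
    return points


if __name__ == "__main__":
    losses = [0.4, 1.1, 1.9, 3.2, 5.0]  # made-up response values
    fitted = lambda y: exponential_cdf(y, scale=2.0)  # made-up scale estimate
    for z_hat, edf in pp_points(losses, fitted):
        # Points near the line edf == z_hat suggest a closer fit.
        print(f"{z_hat:.4f}  {edf:.4f}")
```

Because the candidate CDF is monotone, the EDF of $z$ evaluated at $\hat{z}$ equals the EDF of the response evaluated at $y$, which is why the sketch ranks the original values directly.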

If truncation is specified, then the EDF estimates are conditional, as described in the section EDF Estimates and Truncation. Therefore, the plot displays conditional CDF estimates, which are computed by using the method described in the section Truncation and Conditional CDF Estimates.

If regressor variables are specified, then the displayed CDF estimates, both unconditional and conditional, are from a mixture distribution. See the section CDF and PDF Estimates with Regression Effects for more information.

Q-Q Plot

The Q-Q plot is a quantile-quantile scatter plot that compares the empirical quantiles with the quantiles from a candidate distribution. A plot is not prepared for models whose parameter estimation process does not converge. The horizontal axis represents the quantiles from a candidate distribution, and the vertical axis represents the empirical quantiles.

Each point in the plot corresponds to a specific value of the EDF estimate, $F_n$. The Y coordinate is the value of the response variable for which $F_n$ is computed. The X coordinate is computed by using one of the following two methods for a candidate distribution named dist:

  • If you have defined the dist_QUANTILE function that satisfies the requirements listed in the section dist_QUANTILE, then that function is invoked with $F_n$ and the estimated distribution parameters as arguments. The QUANTILE function is defined in the Sashelp.Svrtdist library for all the predefined distributions except for the Burr distribution.

  • If the dist_QUANTILE function is not defined, then PROC SEVERITY numerically inverts the dist_CDF function at the CDF value of $F_n$ for the estimated distribution parameters. If the dist_CDF function is not defined, then the exp(dist_LOGCDF) function is inverted. If the inversion fails, the corresponding point is not plotted in the Q-Q plot.
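The two methods above can be mimicked with a short Python sketch. The source does not say which numerical method PROC SEVERITY uses to invert the CDF, so bisection is substituted here purely for illustration; `invert_cdf`, `qq_x_coordinate`, and the assumption of a nonnegative support are hypothetical choices.

```python
import math


def invert_cdf(cdf, p, lo=0.0, hi=1.0, rel_tol=1e-10, max_iter=200):
    """Solve cdf(x) = p for x >= lo by bisection; return None on failure."""
    if not 0.0 <= p < 1.0:
        return None
    # Grow the upper bound until it brackets the target probability.
    for _ in range(max_iter):
        if cdf(hi) >= p:
            break
        hi *= 2.0
    else:
        return None  # the CDF never reached p, so the inversion fails
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        if cdf(mid) < p:
            lo = mid
        else:
            hi = mid
        if hi - lo <= rel_tol * max(1.0, abs(hi)):
            return 0.5 * (lo + hi)
    return None


def qq_x_coordinate(f_n, cdf, quantile=None):
    """Use a quantile function when one is supplied; otherwise invert the CDF."""
    if quantile is not None:
        return quantile(f_n)
    return invert_cdf(cdf, f_n)


if __name__ == "__main__":
    scale = 2.0  # made-up parameter estimate for an exponential model
    cdf = lambda y: 1.0 - math.exp(-y / scale)
    x = qq_x_coordinate(0.5, cdf)
    if x is None:
        print("inversion failed; the point would be omitted")
    else:
        print(f"x = {x:.6f}")
```

Returning `None` on failure mirrors the behavior described above, where a point whose inversion fails is simply left out of the Q-Q plot.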

If truncation is specified, then the EDF estimates are conditional as described in the section EDF Estimates and Truncation. The CDF inversion process, whether done numerically or by evaluating the dist_QUANTILE function, needs to accept an unconditional CDF value. So, the $F_n$ value is first transformed to an unconditional estimate $F_n^u$ as

\[  F_n^u = F_n \cdot \left( \hat{F}(t^r_{\text{max}}) - \hat{F}(t^l_{\text{min}}) \right) + \hat{F}(t^l_{\text{min}})  \]

where $\hat{F}(t^r_{\text{max}})$ and $\hat{F}(t^l_{\text{min}})$ are as defined in the section Truncation and Conditional CDF Estimates.
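A small Python sketch of this transformation follows. The function names and the sample CDF values are hypothetical; `to_conditional` is simply the algebraic inverse of the formula above, included as a consistency check, and is not quoted from the referenced section.

```python
def to_unconditional(f_n, cdf_at_t_l_min, cdf_at_t_r_max):
    """Map a conditional EDF value to an unconditional CDF value.

    cdf_at_t_l_min : estimated CDF at the smallest left-truncation threshold
    cdf_at_t_r_max : estimated CDF at the largest right-truncation threshold
    """
    span = cdf_at_t_r_max - cdf_at_t_l_min
    return f_n * span + cdf_at_t_l_min


def to_conditional(f_u, cdf_at_t_l_min, cdf_at_t_r_max):
    """Algebraic inverse of to_unconditional (for checking only)."""
    span = cdf_at_t_r_max - cdf_at_t_l_min
    return (f_u - cdf_at_t_l_min) / span


if __name__ == "__main__":
    # Made-up values: 20% of the fitted mass lies below the left threshold
    # and there is no right truncation, so the upper CDF value is 1.
    f_u = to_unconditional(0.5, 0.2, 1.0)
    print(f_u)
    print(to_conditional(f_u, 0.2, 1.0))
```

The transformation rescales the conditional value onto the interval between the two threshold CDF values, so a conditional value of 0 maps to $\hat{F}(t^l_{\text{min}})$ and a conditional value of 1 maps to $\hat{F}(t^r_{\text{max}})$.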

If regressor variables are specified, then the value of the first distribution parameter is the mean scale value computed from the scale values that are implied by all the observations in the current BY group (or in the entire DATA= data set if the BY statement is not specified).