The UNIVARIATE Procedure

CDFPLOT Statement

  • CDFPLOT <variables> < / options>;

The CDFPLOT statement plots the observed cumulative distribution function (cdf) of a variable, defined as

\begin{eqnarray*} F_{N}(x) & = & \mbox{percent of nonmissing values} \leq x \\ & = & \frac{\mbox{number of values} \leq x}{N} \times 100\% \end{eqnarray*}

where N is the number of nonmissing observations. The cdf is an increasing step function that has a vertical jump of $\frac{1}{N}$ at each value of x equal to an observed value. The cdf is also referred to as the empirical cumulative distribution function (ECDF).
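
As a small worked illustration of the definition (the five values are hypothetical and not part of the original text), take the sample 2, 3, 3, 5, 8, so that N = 5. Evaluating the formula at x = 3 and x = 5 gives

\[ F_{5}(3) = \frac{3}{5} \times 100\% = 60\% , \qquad F_{5}(5) = \frac{4}{5} \times 100\% = 80\% \]

Because the value 3 occurs twice in this sample, the step at x = 3 rises by $\frac{2}{5}$; a jump of exactly $\frac{1}{N}$ occurs at an observed value that is not tied with another observation.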

You can use any number of CDFPLOT statements in the UNIVARIATE procedure. The components of the CDFPLOT statement are as follows.

variables

specify variables for which to create cdf plots. If you specify a VAR statement, the variables must also be listed in the VAR statement. Otherwise, the variables can be any numeric variables in the input data set. If you do not specify a list of variables, then by default the procedure creates a cdf plot for each variable listed in the VAR statement, or for each numeric variable in the DATA= data set if you do not specify a VAR statement.

For example, suppose a data set named Steel contains exactly three numeric variables: Length, Width, and Height. The following statements create a cdf plot for each of the three variables:

proc univariate data=Steel;
   cdfplot;
run;

The following statements create a cdf plot for Length and a cdf plot for Width:

proc univariate data=Steel;
   var Length Width;
   cdfplot;
run;

The following statements create a cdf plot for Width:

proc univariate data=Steel;
   var Length Width;
   cdfplot Width;
run;

options

specify the theoretical distribution for the plot or add features to the plot. If you specify more than one variable, the options apply equally to each variable. Specify all options after the slash (/) in the CDFPLOT statement. You can specify only one option that names a distribution in each CDFPLOT statement, but you can specify any number of other options. The distributions available are listed in Table 4.2. By default, the procedure produces a plot for the normal distribution.

Table 4.2 through Table 4.4 list the CDFPLOT options by function. For complete descriptions, see the sections Dictionary of Options and Dictionary of Common Options. Options can be any of the following:

  • primary options

  • secondary options

  • general options
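
The following is a minimal sketch of how the three kinds of options can appear together in one statement. The data set Steel2, its simulated values, and the seed are hypothetical and serve only to make the example self-contained. NORMAL is the primary option, L= is a secondary option inside its parentheses, and VREF= is a general option; because two variables are listed, the options apply to both plots.

```sas
/* Hypothetical data for illustration only */
data Steel2;
   call streaminit(2024);
   do i = 1 to 100;
      Length = rand('NORMAL', 10, 0.5);
      Width  = rand('NORMAL', 4, 0.2);
      output;
   end;
   drop i;
run;

proc univariate data=Steel2;
   /* NORMAL = primary option, L=2 = secondary option,   */
   /* VREF=50 = general option (reference line at 50%)   */
   cdfplot Length Width / normal(l=2) vref=50;
run;
```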

Distribution Options

Table 4.2 lists primary options for requesting a theoretical distribution.

Table 4.2: Primary Options for Theoretical Distribution

Option

Description

BETA(beta-options)

plots two-parameter beta distribution function, parameters $\theta $ and $\sigma $ assumed known

EXPONENTIAL(exponential-options)

plots one-parameter exponential distribution function, parameter $\theta $ assumed known

GAMMA(gamma-options)

plots two-parameter gamma distribution function, parameter $\theta $ assumed known

GUMBEL(Gumbel-options)

plots Gumbel distribution with location parameter $\mu $ and scale parameter $\sigma $

IGAUSS(iGauss-options)

plots inverse Gaussian distribution with mean $\mu $ and shape parameter $\lambda $

LOGNORMAL(lognormal-options)

plots two-parameter lognormal distribution function, parameter $\theta $ assumed known

NORMAL(normal-options)

plots normal distribution function

PARETO(Pareto-options)

plots generalized Pareto distribution with threshold parameter $\theta $, scale parameter $\sigma $, and shape parameter $\alpha $

POWER(power-options)

plots power function distribution with threshold parameter $\theta $, scale parameter $\sigma $, and shape parameter $\alpha $

RAYLEIGH(Rayleigh-options)

plots Rayleigh distribution with threshold parameter $\theta $ and scale parameter $\sigma $

WEIBULL(Weibull-options)

plots two-parameter Weibull distribution function, parameter $\theta $ assumed known


Table 4.3 lists secondary options that specify distribution parameters and control the display of a theoretical distribution function. Specify these options in parentheses after the distribution keyword. For example, you can request a cdf plot with a fitted normal distribution function by specifying the NORMAL option as follows:

proc univariate;
   cdfplot / normal(mu=10 sigma=0.5 color=red);
run;

The COLOR= option specifies the color for the curve, and the normal-options MU= and SIGMA= specify the parameters $\mu = 10$ and $\sigma = 0.5$ for the distribution function. If you do not specify these parameters, maximum likelihood estimates are computed.

Table 4.3: Secondary Distribution Options

Option

Description

Options Used with All Distributions

COLOR=

specifies color of theoretical distribution function

L=

specifies line type of theoretical distribution function

W=

specifies width of theoretical distribution function

Beta-Options

ALPHA=

specifies first shape parameter $\alpha $ for beta distribution function

BETA=

specifies second shape parameter $\beta $ for beta distribution function

SIGMA=

specifies scale parameter $\sigma $ for beta distribution function

THETA=

specifies lower threshold parameter $\theta $ for beta distribution function

Exponential-Options

SIGMA=

specifies scale parameter $\sigma $ for exponential distribution function

THETA=

specifies threshold parameter $\theta $ for exponential distribution function

Gamma-Options

ALPHA=

specifies shape parameter $\alpha $ for gamma distribution function

ALPHADELTA=

specifies change in successive estimates of $\alpha $ at which the Newton-Raphson approximation of $\hat{\alpha }$ terminates

ALPHAINITIAL=

specifies initial value for $\alpha $ in the Newton-Raphson approximation of $\hat{\alpha }$

MAXITER=

specifies maximum number of iterations in the Newton-Raphson approximation of $\hat{\alpha }$

SIGMA=

specifies scale parameter $\sigma $ for gamma distribution function

THETA=

specifies threshold parameter $\theta $ for gamma distribution function

Gumbel-Options

MU=

specifies location parameter $\mu $ for Gumbel distribution function

SIGMA=

specifies scale parameter $\sigma $ for Gumbel distribution function

IGauss-Options

LAMBDA=

specifies shape parameter $\lambda $ for inverse Gaussian distribution function

MU=

specifies mean $\mu $ for inverse Gaussian distribution function

Lognormal-Options

SIGMA=

specifies shape parameter $\sigma $ for lognormal distribution function

THETA=

specifies threshold parameter $\theta $ for lognormal distribution function

ZETA=

specifies scale parameter $\zeta $ for lognormal distribution function

Normal-Options

MU=

specifies mean $\mu $ for normal distribution function

SIGMA=

specifies standard deviation $\sigma $ for normal distribution function

Pareto-Options

ALPHA=

specifies shape parameter $\alpha $ for generalized Pareto distribution function

SIGMA=

specifies scale parameter $\sigma $ for generalized Pareto distribution function

THETA=

specifies threshold parameter $\theta $ for generalized Pareto distribution function

Power-Options

ALPHA=

specifies shape parameter $\alpha $ for power function distribution

SIGMA=

specifies scale parameter $\sigma $ for power function distribution

THETA=

specifies threshold parameter $\theta $ for power function distribution

Rayleigh-Options

SIGMA=

specifies scale parameter $\sigma $ for Rayleigh distribution function

THETA=

specifies threshold parameter $\theta $ for Rayleigh distribution function

Weibull-Options

C=

specifies shape parameter c for Weibull distribution function

ITPRINT

requests table of iteration history and optimizer details

MAXITER=

specifies maximum number of iterations in the Newton-Raphson approximation of $\hat{c}$

SIGMA=

specifies scale parameter $\sigma $ for Weibull distribution function

THETA=

specifies threshold parameter $\theta $ for Weibull distribution function


General Options

Table 4.4 summarizes general options for enhancing cdf plots.

Table 4.4: General Graphics Options

Option

Description

General Graphics Options

HREF=

specifies reference lines perpendicular to the horizontal axis

HREFLABELS=

specifies labels for HREF= lines

HREFLABPOS=

specifies position for HREF= line labels

NOECDF

suppresses plot of empirical (observed) distribution function

NOHLABEL

suppresses label for horizontal axis

NOVLABEL

suppresses label for vertical axis

NOVTICK

suppresses tick marks and tick mark labels for vertical axis

STATREF=

specifies reference lines at values of summary statistics

STATREFLABELS=

specifies labels for STATREF= lines

STATREFSUBCHAR=

specifies substitution character for displaying statistic values in STATREFLABELS= labels

VAXISLABEL=

specifies label for vertical axis

VREF=

specifies reference lines perpendicular to the vertical axis

VREFLABELS=

specifies labels for VREF= lines

VREFLABPOS=

specifies position for VREF= line labels

VSCALE=

specifies scale for vertical axis

Options for Traditional Graphics Output

ANNOTATE=

specifies annotate data set

CAXIS=

specifies color for axis

CFRAME=

specifies color for frame

CHREF=

specifies colors for HREF= lines

CSTATREF=

specifies colors for STATREF= lines

CTEXT=

specifies color for text

CVREF=

specifies colors for VREF= lines

DESCRIPTION=

specifies description for graphics catalog member

FONT=

specifies text font

HAXIS=

specifies AXIS statement for horizontal axis

HEIGHT=

specifies height of text used outside framed areas

HMINOR=

specifies number of horizontal axis minor tick marks

INFONT=

specifies software font for text inside framed areas

INHEIGHT=

specifies height of text inside framed areas

LHREF=

specifies line types for HREF= lines

LSTATREF=

specifies line types for STATREF= lines

LVREF=

specifies line types for VREF= lines

NAME=

specifies name for plot in graphics catalog

NOFRAME

suppresses frame around plotting area

TURNVLABELS

turns and vertically strings out characters in labels for vertical axis

VAXIS=

specifies AXIS statement for vertical axis

VMINOR=

specifies number of vertical axis minor tick marks

WAXIS=

specifies line thickness for axes and frame

Options for ODS Graphics Output

ODSFOOTNOTE=

specifies footnote displayed on plot

ODSFOOTNOTE2=

specifies secondary footnote displayed on plot

ODSTITLE=

specifies title displayed on plot

ODSTITLE2=

specifies secondary title displayed on plot

OVERLAY

overlays plots for different class levels

Options for Comparative Plots

ANNOKEY

applies annotation requested in ANNOTATE= data set to key cell only

CFRAMESIDE=

specifies color for filling row label frames

CFRAMETOP=

specifies color for filling column label frames

CPROP=

specifies color for proportion of frequency bar

CTEXTSIDE=

specifies color for row labels

CTEXTTOP=

specifies color for column labels

INTERTILE=

specifies distance between tiles in comparative plot

NCOLS=

specifies number of columns in comparative plot

NROWS=

specifies number of rows in comparative plot

Miscellaneous Options

CONTENTS=

specifies table of contents entry for cdf plot grouping


Dictionary of Options

The following entries provide detailed descriptions of the options specific to the CDFPLOT statement. See the section Dictionary of Common Options for detailed descriptions of options common to all plot statements.

ALPHA=value

specifies the shape parameter $\alpha $ for distribution functions requested with the BETA, GAMMA, PARETO, and POWER options. Enclose the ALPHA= option in parentheses after the distribution keyword. If you do not specify a value for $\alpha $, the procedure calculates a maximum likelihood estimate. For examples, see the entries for the BETA and GAMMA options.

BETA<(beta-options)>

displays a fitted beta distribution function on the cdf plot. The equation of the fitted cdf is

\[ F(x) = \left\{ \begin{array}{ll} 0 & \mbox{for }x \leq \theta \\ I_{\frac{x - \theta }{\sigma }} (\alpha , \beta ) & \mbox{for }\theta < x < \theta + \sigma \\ 1 & \mbox{for }x \geq \sigma + \theta \end{array} \right. \]

where $I_ y (\alpha , \beta )$ is the incomplete beta function and

  • $\theta =$ lower threshold parameter (lower endpoint)

  • $\sigma =$ scale parameter $(\sigma >0)$

  • $\alpha =$ shape parameter $(\alpha >0)$

  • $\beta =$ shape parameter $(\beta >0)$

The beta distribution is bounded below by the parameter $\theta $ and above by the value $\theta + \sigma $. You can specify $\theta $ and $\sigma $ by using the THETA= and SIGMA= beta-options, as illustrated in the following statements, which fit a beta distribution bounded between 50 and 75. The default values for $\theta $ and $\sigma $ are 0 and 1, respectively.

proc univariate;
   cdfplot / beta(theta=50 sigma=25);
run;

The beta distribution has two shape parameters: $\alpha $ and $\beta $. If these parameters are known, you can specify their values with the ALPHA= and BETA= beta-options. If you do not specify values for $\alpha $ and $\beta $, the procedure calculates maximum likelihood estimates.

The BETA option can appear only once in a CDFPLOT statement. Table 4.3 lists options you can specify with the BETA distribution option.

BETA=value
B=value

specifies the second shape parameter $\beta $ for beta distribution functions requested by the BETA option. Enclose the BETA= option in parentheses after the BETA keyword. If you do not specify a value for $\beta $, the procedure calculates a maximum likelihood estimate. For examples, see the preceding entry for the BETA option.

C=value

specifies the shape parameter c for Weibull distribution functions requested with the WEIBULL option. Enclose the C= option in parentheses after the WEIBULL keyword. If you do not specify a value for c, the procedure calculates a maximum likelihood estimate. You can specify the SHAPE= option as an alias for the C= option.

EXPONENTIAL<(exponential-options)>
EXP<(exponential-options)>

displays a fitted exponential distribution function on the cdf plot. The equation of the fitted cdf is

\[ F(x) = \left\{ \begin{array}{ll} 0 & \mbox{for }x \leq \theta \\ 1 - \exp \left(-\frac{x - \theta }{\sigma } \right) & \mbox{for }x > \theta \end{array} \right. \]

where

  • $\theta =$ threshold parameter

  • $\sigma =$ scale parameter $(\sigma >0)$

The parameter $\theta $ must be less than or equal to the minimum data value. You can specify $\theta $ with the THETA= exponential-option. The default value for $\theta $ is 0. You can specify $\sigma $ with the SIGMA= exponential-option. By default, a maximum likelihood estimate is computed for $\sigma $. For example, the following statements fit an exponential distribution with $\theta = 10$ and a maximum likelihood estimate for $\sigma $:

proc univariate;
   cdfplot / exponential(theta=10 l=2 color=green);
run;

The exponential curve is green and has a line type of 2.

The EXPONENTIAL option can appear only once in a CDFPLOT statement. Table 4.3 lists the options you can specify with the EXPONENTIAL option.
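
As a quick numerical check of the exponential cdf above, take $\theta = 10$ as in the example and assume, purely for illustration, a scale of $\sigma = 5$. At x = 15 the fitted function evaluates to

\[ F(15) = 1 - \exp \left( -\frac{15 - 10}{5} \right) = 1 - e^{-1} \approx 0.632 \]

so about 63.2% of the distribution lies at or below one scale unit above the threshold.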

GAMMA<(gamma-options)>

displays a fitted gamma distribution function on the cdf plot. The equation of the fitted cdf is

\[ F(x) = \left\{ \begin{array}{ll} 0 & \mbox{for }x \leq \theta \\ \frac{1}{\Gamma (\alpha ) \sigma } \int _{\theta }^ x \left(\frac{t-\theta }{\sigma } \right)^{\alpha - 1} \exp \left( -\frac{t - \theta }{\sigma } \right) dt & \mbox{for }x > \theta \end{array} \right. \]

where

  • $\theta =$ threshold parameter

  • $\sigma =$ scale parameter $(\sigma >0)$

  • $\alpha =$ shape parameter $(\alpha >0)$

The parameter $\theta $ for the gamma distribution must be less than the minimum data value. You can specify $\theta $ with the THETA= gamma-option. The default value for $\theta $ is 0. In addition, the gamma distribution has a shape parameter $\alpha $ and a scale parameter $\sigma $. You can specify these parameters with the ALPHA= and SIGMA= gamma-options. By default, maximum likelihood estimates are computed for $\alpha $ and $\sigma $. For example, the following statements fit a gamma distribution function with $\theta =4$ and maximum likelihood estimates for $\alpha $ and $\sigma $:

proc univariate;
   cdfplot / gamma(theta=4);
run;

Note that the maximum likelihood estimate of $\alpha $ is calculated iteratively using the Newton-Raphson approximation. The gamma-options ALPHADELTA=, ALPHAINITIAL=, and MAXITER= control the approximation.

The GAMMA option can appear only once in a CDFPLOT statement. Table 4.3 lists the options you can specify with the GAMMA option.

GUMBEL<(Gumbel-options)>

displays a fitted Gumbel distribution (also known as Type 1 extreme value distribution) function on the cdf plot. The equation of the fitted cdf is

\[ F(x) = \exp \left( -e^{-(x - \mu )/\sigma } \right) \]

where

  • $\mu =$ location parameter

  • $\sigma =$ scale parameter $(\sigma >0)$

You can specify known values for $\mu $ and $\sigma $ with the MU= and SIGMA= Gumbel-options. By default, maximum likelihood estimates are computed for $\mu $ and $\sigma $.

The GUMBEL option can appear only once in a CDFPLOT statement. Table 4.3 lists secondary options you can specify with the GUMBEL option.
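
The following sketch shows one way the GUMBEL option might be used with known parameters. The data set GumbelSim, the seed, and the parameter values $\mu = 20$ and $\sigma = 3$ are assumptions made for illustration; the values are simulated by inverting the cdf given above, which yields $x = \mu - \sigma \log (-\log u)$ for a uniform value u.

```sas
/* Hypothetical data: Gumbel(mu=20, sigma=3) values by inverse cdf */
data GumbelSim;
   call streaminit(101);
   do i = 1 to 200;
      u = rand('UNIFORM');           /* uniform value in (0,1)      */
      X = 20 - 3*log(-log(u));       /* solves u = exp(-exp(-z))    */
      output;
   end;
   keep X;
run;

proc univariate data=GumbelSim;
   /* Overlay the Gumbel cdf with known mu and sigma on the ECDF of X */
   cdfplot X / gumbel(mu=20 sigma=3);
run;
```

Omitting MU= and SIGMA= in the parentheses would instead request the default maximum likelihood estimates described above.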

IGAUSS<(iGauss-options)>

displays a fitted inverse Gaussian distribution function on the cdf plot. The equation of the fitted cdf is

\[ F(x) = \Phi \left\{ \sqrt {\frac{\lambda }{x}} \left( \frac{x}{\mu } - 1 \right) \right\} + e^{2\lambda /\mu } \Phi \left\{ -\sqrt {\frac{\lambda }{x}} \left( \frac{x}{\mu } + 1 \right) \right\} \]

where $\Phi (\cdot )$ is the standard normal cumulative distribution function and

  • $\mu =$ mean parameter $(\mu > 0)$

  • $\lambda =$ shape parameter $(\lambda >0)$

You can specify known values for $\mu $ and $\lambda $ with the MU= and LAMBDA= iGauss-options. By default, maximum likelihood estimates are computed for $\mu $ and $\lambda $.

The IGAUSS option can appear only once in a CDFPLOT statement. Table 4.3 lists secondary options you can specify with the IGAUSS option.
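
As a hedged illustration of the two ways to use the IGAUSS option, the following statements first rely on the default maximum likelihood estimates and then supply both parameters. The data set PosSim and the values $\mu = 2$ and $\lambda = 8$ are hypothetical; the simulated values are simply positive gamma draws, not true inverse Gaussian data.

```sas
/* Hypothetical positive-valued data for illustration only */
data PosSim;
   call streaminit(77);
   do i = 1 to 200;
      X = 0.5 * rand('GAMMA', 4);   /* gamma draws with mean 2 */
      output;
   end;
   keep X;
run;

proc univariate data=PosSim;
   /* mu and lambda estimated by maximum likelihood (the default) */
   cdfplot X / igauss;
   /* mu and lambda treated as known (illustrative values) */
   cdfplot X / igauss(mu=2 lambda=8);
run;
```

Two CDFPLOT statements are used because each statement can name only one distribution.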

LAMBDA=value

specifies the shape parameter $\lambda $ for distribution functions requested with the IGAUSS option. Enclose the LAMBDA= option in parentheses after the IGAUSS distribution keyword. If you do not specify a value for $\lambda $, the procedure calculates a maximum likelihood estimate.

LOGNORMAL<(lognormal-options)>

displays a fitted lognormal distribution function on the cdf plot. The equation of the fitted cdf is

\[ F(x) = \left\{ \begin{array}{ll} 0 & \mbox{for }x \leq \theta \\ \Phi \left( \frac{\log (x-\theta )-\zeta }{\sigma } \right) & \mbox{for }x > \theta \end{array} \right. \]

where $\Phi (\cdot )$ is the standard normal cumulative distribution function and

  • $\theta =$ threshold parameter

  • $\zeta =$ scale parameter

  • $\sigma =$ shape parameter $(\sigma >0)$

The parameter $\theta $ for the lognormal distribution must be less than the minimum data value. You can specify $\theta $ with the THETA= lognormal-option. The default value for $\theta $ is 0. In addition, the lognormal distribution has a shape parameter $\sigma $ and a scale parameter $\zeta $. You can specify these parameters with the SIGMA= and ZETA= lognormal-options. By default, maximum likelihood estimates are computed for $\sigma $ and $\zeta $. For example, the following statements fit a lognormal distribution function with $\theta =10$ and maximum likelihood estimates for $\sigma $ and $\zeta $:

proc univariate;
   cdfplot / lognormal(theta = 10);
run;

The LOGNORMAL option can appear only once in a CDFPLOT statement. Table 4.3 lists options that you can specify with the LOGNORMAL option.

MU=value

specifies the parameter $\mu $ for theoretical cumulative distribution functions requested with the GUMBEL, IGAUSS, and NORMAL options. Enclose the MU= option in parentheses after the distribution keyword. For the inverse Gaussian and normal distributions, the default value is the sample mean. If you do not specify a value for $\mu $ for the Gumbel distribution, the procedure calculates a maximum likelihood estimate. For an example, see the entry for the NORMAL option.

NOECDF

suppresses the observed distribution function (the empirical cumulative distribution function) of the variable, which is drawn by default. This option enables you to create theoretical cdf plots without displaying the data distribution. The NOECDF option can be used only with a theoretical distribution (such as the NORMAL option).
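
A minimal sketch of the NOECDF option follows. The data set NormSim and its simulated values are hypothetical; the NORMAL parameters are the ones used elsewhere in this section. Only the theoretical normal cdf is drawn, and the step function for the data is suppressed.

```sas
/* Hypothetical data for illustration only */
data NormSim;
   call streaminit(314);
   do i = 1 to 100;
      X = rand('NORMAL', 14, 0.05);
      output;
   end;
   keep X;
run;

proc univariate data=NormSim;
   /* NOECDF is a general option, so it goes outside the NORMAL parentheses */
   cdfplot X / normal(mu=14 sigma=0.05) noecdf;
run;
```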

NORMAL<(normal-options)>

displays a fitted normal distribution function on the cdf plot. The equation of the fitted cdf is

\[ F(x) = \left.\begin{array}{ll} \Phi \left( \frac{x - \mu }{\sigma } \right) & \mbox{for }-\infty < x < \infty \end{array} \right. \]

where $\Phi (\cdot )$ is the standard normal cumulative distribution function and

  • $\mu =$ mean

  • $\sigma =$ standard deviation $(\sigma >0)$

You can specify known values for $\mu $ and $\sigma $ with the MU= and SIGMA= normal-options, as shown in the following statements:

proc univariate;
   cdfplot / normal(mu=14 sigma=.05);
run;

By default, the sample mean and sample standard deviation are calculated for $\mu $ and $\sigma $. The NORMAL option can appear only once in a CDFPLOT statement. Table 4.3 lists options that you can specify with the NORMAL option.

PARETO<(Pareto-options)>

displays a fitted generalized Pareto distribution function on the cdf plot. The equation of the fitted cdf is

\[ F(x) = 1 - { \left( 1 - \frac{\alpha (x - \theta )}{\sigma } \right) }^{\frac {1}{\alpha }} \]

where

  • $\theta =$ threshold parameter

  • $\sigma =$ scale parameter $(\sigma >0)$

  • $\alpha =$ shape parameter

The parameter $\theta $ for the generalized Pareto distribution must be less than the minimum data value. You can specify $\theta $ with the THETA= Pareto-option. The default value for $\theta $ is 0. In addition, the generalized Pareto distribution has a shape parameter $\alpha $ and a scale parameter $\sigma $. You can specify these parameters with the ALPHA= and SIGMA= Pareto-options. By default, maximum likelihood estimates are computed for $\alpha $ and $\sigma $.

The PARETO option can appear only once in a CDFPLOT statement. Table 4.3 lists options that you can specify with the PARETO option.
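
To make the generalized Pareto formula concrete, consider the hypothetical values $\alpha = 0.5$, $\theta = 0$, and $\sigma = 2$ (chosen only for illustration). At x = 1 the cdf evaluates to

\[ F(1) = 1 - \left( 1 - \frac{0.5 \, (1 - 0)}{2} \right)^{1/0.5} = 1 - (0.75)^{2} = 0.4375 \]

With $\alpha > 0$ the quantity inside the parentheses reaches zero at $x = \theta + \sigma / \alpha $ (here x = 4), at which point the formula gives F(x) = 1.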

POWER<(power-options)>

displays a fitted power function distribution on the cdf plot. The equation of the fitted cdf is

\[ F(x) = \left\{ \begin{array}{ll} 0 & \mbox{for }x \leq \theta \\ {\left( \frac{x - \theta }{\sigma } \right)}^{\alpha } & \mbox{for }\theta < x < \theta + \sigma \\ 1 & \mbox{for }x \geq \theta + \sigma \end{array} \right. \]

where

  • $\theta =$ lower threshold parameter (lower endpoint)

  • $\sigma =$ scale parameter $(\sigma > 0)$

  • $\alpha =$ shape parameter $(\alpha > 0)$

The power function distribution is bounded below by the parameter $\theta $ and above by the value $\theta + \sigma $. You can specify $\theta $ and $\sigma $ by using the THETA= and SIGMA= power-options. The default values for $\theta $ and $\sigma $ are 0 and 1, respectively.

You can specify a value for the shape parameter, $\alpha $, with the ALPHA= power-option. If you do not specify a value for $\alpha $, the procedure calculates a maximum likelihood estimate.

The power function distribution is a special case of the beta distribution with its second shape parameter, $\beta = 1$.

The POWER option can appear only once in a CDFPLOT statement. Table 4.3 lists options that you can specify with the POWER option.
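
The relationship to the beta distribution can be verified directly. Assuming that $I_ y (\alpha , \beta )$ in the BETA entry denotes the regularized incomplete beta function (which the beta cdf implies, because it rises to 1), setting $\beta = 1$ and using $B(\alpha , 1) = 1/\alpha $ for the beta function gives

\[ I_{y}(\alpha , 1) = \frac{1}{B(\alpha , 1)} \int _{0}^{y} t^{\alpha - 1} \, dt = \alpha \cdot \frac{y^{\alpha }}{\alpha } = y^{\alpha } , \qquad y = \frac{x - \theta }{\sigma } \]

which is the middle branch of the power function cdf shown above.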

RAYLEIGH<(Rayleigh-options)>

displays a fitted Rayleigh distribution function on the cdf plot. The equation of the fitted cdf is

\[ F(x) = 1 - e^{-(x - \theta )^2/(2\sigma ^2)} \]

where

  • $\theta =$ threshold parameter

  • $\sigma =$ scale parameter $(\sigma >0)$

The parameter $\theta $ for the Rayleigh distribution must be less than the minimum data value. You can specify $\theta $ with the THETA= Rayleigh-option. The default value for $\theta $ is 0. You can specify $\sigma $ with the SIGMA= Rayleigh-option. By default, a maximum likelihood estimate is computed for $\sigma $.

The RAYLEIGH option can appear only once in a CDFPLOT statement. Table 4.3 lists options that you can specify with the RAYLEIGH option.
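
The following sketch shows a possible use of the RAYLEIGH option with a known threshold and an estimated scale. The data set RaySim, the seed, and the values $\theta = 5$ and $\sigma = 2$ are assumptions for illustration; the data are simulated by inverting the cdf above, so every value exceeds the threshold.

```sas
/* Hypothetical data: Rayleigh(theta=5, sigma=2) values by inverse cdf */
data RaySim;
   call streaminit(555);
   do i = 1 to 200;
      u = rand('UNIFORM');              /* uniform value in (0,1)        */
      X = 5 + 2*sqrt(-2*log(u));        /* 1-u and u have the same law   */
      output;
   end;
   keep X;
run;

proc univariate data=RaySim;
   /* theta fixed at 5; sigma estimated by maximum likelihood (default) */
   cdfplot X / rayleigh(theta=5);
run;
```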

SIGMA=value | EST

specifies the parameter $\sigma $ for distribution functions requested by the BETA, EXPONENTIAL, GAMMA, GUMBEL, LOGNORMAL, NORMAL, PARETO, POWER, RAYLEIGH, and WEIBULL options. Enclose the SIGMA= option in parentheses after the distribution keyword. The following table summarizes the use of the SIGMA= option:

Distribution Option | SIGMA= Specifies | Default Value | Alias

BETA | scale parameter $\sigma $ | 1 | SCALE=

EXPONENTIAL | scale parameter $\sigma $ | maximum likelihood estimate | SCALE=

GAMMA | scale parameter $\sigma $ | maximum likelihood estimate | SCALE=

GUMBEL | scale parameter $\sigma $ | maximum likelihood estimate |

LOGNORMAL | shape parameter $\sigma $ | maximum likelihood estimate | SHAPE=

NORMAL | scale parameter $\sigma $ | standard deviation |

PARETO | scale parameter $\sigma $ | maximum likelihood estimate |

POWER | scale parameter $\sigma $ | 1 |

RAYLEIGH | scale parameter $\sigma $ | maximum likelihood estimate |

WEIBULL | scale parameter $\sigma $ | maximum likelihood estimate | SCALE=

THETA=value | EST
THRESHOLD=value | EST

specifies the lower threshold parameter $\theta $ for theoretical cumulative distribution functions requested with the BETA, EXPONENTIAL, GAMMA, LOGNORMAL, PARETO, POWER, RAYLEIGH, and WEIBULL options. Enclose the THETA= option in parentheses after the distribution keyword. The default value is 0.
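
The EST keyword in the syntax above is not elaborated in this entry; the following sketch reads it as a request to estimate $\theta $ from the data instead of fixing it at the default of 0, which is an interpretation rather than something stated here. The data set LogSim and its simulated values (a lognormal sample shifted by 10) are hypothetical.

```sas
/* Hypothetical data: lognormal values shifted by a threshold of 10 */
data LogSim;
   call streaminit(909);
   do i = 1 to 200;
      X = 10 + exp(rand('NORMAL', 1, 0.4));
      output;
   end;
   keep X;
run;

proc univariate data=LogSim;
   /* THETA=EST: threshold estimated from the data (assumed meaning of EST) */
   cdfplot X / lognormal(theta=est);
run;
```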

VSCALE=PERCENT | PROPORTION

specifies the scale of the vertical axis. The value PERCENT scales the axis in units of cumulative percent of observations. The value PROPORTION scales the axis in units of cumulative proportion of observations. The default is PERCENT.
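
For instance, the following sketch requests a proportion scale for the vertical axis. The data set ScaleSim and its simulated values are hypothetical and exist only to make the statements self-contained.

```sas
/* Hypothetical data for illustration only */
data ScaleSim;
   call streaminit(42);
   do i = 1 to 100;
      X = rand('NORMAL', 0, 1);
      output;
   end;
   keep X;
run;

proc univariate data=ScaleSim;
   /* Vertical axis in proportions (0 to 1) instead of the default percent */
   cdfplot X / vscale=proportion;
run;
```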

WEIBULL<(Weibull-options)>

displays a fitted Weibull distribution function on the cdf plot. The equation of the fitted cdf is

\[ F(x) = \left\{ \begin{array}{ll} 0 & \mbox{for }x \leq \theta \\ 1 - \exp \left( - \left( \frac{x - \theta }{\sigma } \right)^ c \right) & \mbox{for }x > \theta \end{array} \right. \]

where

  • $\theta =$ threshold parameter

  • $\sigma =$ scale parameter $(\sigma >0)$

  • $c =$ shape parameter $(c >0)$

The parameter $\theta $ must be less than the minimum data value. You can specify $\theta $ with the THETA= Weibull-option. The default value for $\theta $ is 0. In addition, the Weibull distribution has a shape parameter c and a scale parameter $\sigma $. You can specify these parameters with the SIGMA= and C= Weibull-options. By default, maximum likelihood estimates are computed for c and $\sigma $. For example, the following statements fit a Weibull distribution function with $\theta =15$ and maximum likelihood estimates for $\sigma $ and c:

proc univariate;
   cdfplot / weibull(theta=15);
run;

The WEIBULL option can appear only once in a CDFPLOT statement. Table 4.3 lists options that you can specify with the WEIBULL option.
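
As a brief check on the Weibull formula, setting the shape parameter to c = 1 reduces it to the exponential cdf given in the EXPONENTIAL entry:

\[ F(x) = 1 - \exp \left( - \left( \frac{x - \theta }{\sigma } \right)^{1} \right) = 1 - \exp \left( -\frac{x - \theta }{\sigma } \right) \quad \mbox{for }x > \theta \]

With $\theta = 15$ as in the example and the illustrative (assumed) values $\sigma = 10$ and c = 2, the cdf at x = 25 is $1 - e^{-1} \approx 0.632$.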

ZETA=value

specifies a value for the scale parameter $\zeta $ for a lognormal distribution function requested with the LOGNORMAL option. Enclose the ZETA= option in parentheses after the LOGNORMAL keyword. If you do not specify a value for $\zeta $, a maximum likelihood estimate is computed. You can specify the SCALE= option as an alias for the ZETA= option.