The PROBPLOT statement creates a probability plot, which compares ordered variable values with the percentiles of a specified theoretical distribution. If the data distribution matches the theoretical distribution, the points on the plot form a linear pattern. Consequently, you can use a probability plot to determine how well a theoretical distribution models a set of measurements.
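To make the comparison concrete, the pairing of ordered values with theoretical percentiles can be sketched as follows. This is an illustration rather than a statement of the procedure's internals: it assumes the commonly documented default adjustments of $-3/8$ to the ranks and $1/4$ to the sample size (the quantities that the RANKADJ= and NADJ= options are understood to control), and it writes $F$ for the standardized theoretical distribution function.

```latex
p_i = \frac{i - 3/8}{\,n + 1/4\,}, \qquad
\text{point } i:\ \bigl(F^{-1}(p_i),\; x_{(i)}\bigr), \qquad i = 1, \dots, n
```

Here $x_{(1)} \le \dots \le x_{(n)}$ are the ordered data values. If the data follow the theoretical distribution with location $\theta$ and scale $\sigma$, then $x_{(i)} \approx \theta + \sigma F^{-1}(p_i)$, which is why the points fall close to a straight line.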
Probability plots are similar to Q-Q plots, which you can create with the QQPLOT statement. Probability plots are preferable for graphical estimation of percentiles, whereas Q-Q plots are preferable for graphical estimation of distribution parameters.
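As a rough illustration of the difference, the following sketch requests both plot types for the same variable. The data set `Measures` and the variable `Length` are hypothetical and are simulated here only so that the step can run on its own.

```sas
/* Hypothetical data: 50 simulated lengths (illustrative values only) */
data Measures;
   call streaminit(12345);
   do i = 1 to 50;
      Length = rand('NORMAL', 10, 0.3);
      output;
   end;
   drop i;
run;

proc univariate data=Measures noprint;
   probplot Length / normal;   /* theoretical axis labeled in percentiles */
   qqplot   Length / normal;   /* theoretical axis labeled in quantiles   */
run;
```

The point patterns in the two plots should look alike; what differs is the scaling of the theoretical axis, which is why one plot is better suited to reading off percentiles and the other to reading off parameters.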
You can use any number of PROBPLOT statements in the UNIVARIATE procedure. The components of the PROBPLOT statement are as follows.
Table 4.18 lists options for requesting a theoretical distribution.
Table 4.18: Primary Options for Theoretical Distributions

| Option | Description |
|---|---|
| BETA(beta-options) | specifies beta probability plot for shape parameters $\alpha$ and $\beta$ specified with mandatory ALPHA= and BETA= beta-options |
| EXPONENTIAL(exponential-options) | specifies exponential probability plot |
| GAMMA(gamma-options) | specifies gamma probability plot for shape parameter $\alpha$ specified with mandatory ALPHA= gamma-option |
| GUMBEL(Gumbel-options) | specifies Gumbel probability plot |
| LOGNORMAL(lognormal-options) | specifies lognormal probability plot for shape parameter $\sigma$ specified with mandatory SIGMA= lognormal-option |
| NORMAL(normal-options) | specifies normal probability plot |
| PARETO(Pareto-options) | specifies generalized Pareto probability plot for shape parameter $\alpha$ specified with mandatory ALPHA= Pareto-option |
| POWER(power-options) | specifies power function probability plot for shape parameter $\alpha$ specified with mandatory ALPHA= power-option |
| RAYLEIGH(Rayleigh-options) | specifies Rayleigh probability plot |
| WEIBULL(Weibull-options) | specifies three-parameter Weibull probability plot for shape parameter $c$ specified with mandatory C= Weibull-option |
| WEIBULL2(Weibull2-options) | specifies two-parameter Weibull probability plot |

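The following sketch shows how a distribution with a mandatory shape parameter might be requested. The data set `Sizes` and the variable `Diameter` are hypothetical and simulated for illustration. The `EST` keyword, which asks the procedure to estimate the shape parameter from the data, is used here on the assumption that it is accepted in place of a numeric value.

```sas
/* Hypothetical data: 100 positive values, lognormal by construction */
data Sizes;
   call streaminit(2024);
   do i = 1 to 100;
      Diameter = exp(rand('NORMAL', 0, 0.5));
      output;
   end;
   drop i;
run;

proc univariate data=Sizes noprint;
   probplot Diameter / lognormal(sigma=0.5);  /* shape value supplied directly */
   probplot Diameter / gamma(alpha=est);      /* shape estimated from the data */
run;
```

Each PROBPLOT statement produces its own plot, so the two candidate distributions can be compared by seeing which point pattern is closer to linear.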
Table 4.19 lists secondary options that specify distribution parameters and control the display of a distribution reference line. Specify these options in parentheses after the distribution keyword. For example, you can request a normal probability plot with a distribution reference line by specifying the NORMAL option as follows:
    proc univariate;
       probplot Length / normal(mu=10 sigma=0.3 color=red);
    run;
The MU= and SIGMA= normal-options display a distribution reference line that corresponds to the normal distribution with mean $\mu_0 = 10$ and standard deviation $\sigma_0 = 0.3$, and the COLOR= normal-option specifies the color for the line.
Table 4.19: Secondary Distribution Options

| Option | Description |
|---|---|
| Options Used with All Distributions | |
| COLOR= | specifies color of distribution reference line |
| L= | specifies line type of distribution reference line |
| W= | specifies width of distribution reference line |
| Beta-Options | |
| ALPHA= | specifies mandatory shape parameter $\alpha$ |
| BETA= | specifies mandatory shape parameter $\beta$ |
| SIGMA= | specifies $\sigma_0$ for distribution reference line |
| THETA= | specifies $\theta_0$ for distribution reference line |
| Exponential-Options | |
| SIGMA= | specifies $\sigma_0$ for distribution reference line |
| THETA= | specifies $\theta_0$ for distribution reference line |
| Gamma-Options | |
| ALPHA= | specifies mandatory shape parameter $\alpha$ |
| ALPHADELTA= | specifies change in successive estimates of $\hat{\alpha}$ at which the Newton-Raphson approximation of $\hat{\alpha}$ terminates |
| ALPHAINITIAL= | specifies initial value for $\hat{\alpha}$ in the Newton-Raphson approximation of $\hat{\alpha}$ |
| MAXITER= | specifies maximum number of iterations in the Newton-Raphson approximation of $\hat{\alpha}$ |
| SIGMA= | specifies $\sigma_0$ for distribution reference line |
| THETA= | specifies $\theta_0$ for distribution reference line |
| Gumbel-Options | |
| MU= | specifies $\mu_0$ for distribution reference line |
| SIGMA= | specifies $\sigma_0$ for distribution reference line |
| Lognormal-Options | |
| SIGMA= | specifies mandatory shape parameter $\sigma$ |
| SLOPE= | specifies slope of distribution reference line |
| THETA= | specifies $\theta_0$ for distribution reference line |
| ZETA= | specifies $\zeta_0$ for distribution reference line (slope is $\exp(\zeta_0)$) |
| Normal-Options | |
| MU= | specifies $\mu_0$ for distribution reference line |
| SIGMA= | specifies $\sigma_0$ for distribution reference line |
| Pareto-Options | |
| ALPHA= | specifies mandatory shape parameter $\alpha$ |
| SIGMA= | specifies $\sigma_0$ for distribution reference line |
| THETA= | specifies $\theta_0$ for distribution reference line |
| Power-Options | |
| ALPHA= | specifies mandatory shape parameter $\alpha$ |
| SIGMA= | specifies $\sigma_0$ for distribution reference line |
| THETA= | specifies $\theta_0$ for distribution reference line |
| Rayleigh-Options | |
| SIGMA= | specifies $\sigma_0$ for distribution reference line |
| THETA= | specifies $\theta_0$ for distribution reference line |
| Weibull-Options | |
| C= | specifies mandatory shape parameter $c$ |
| ITPRINT | requests table of iteration history and optimizer details |
| MAXITER= | specifies maximum number of iterations in the Newton-Raphson approximation of $\hat{c}$ |
| SIGMA= | specifies $\sigma_0$ for distribution reference line |
| THETA= | specifies $\theta_0$ for distribution reference line |
| Weibull2-Options | |
| C= | specifies $c_0$ for distribution reference line (slope is $1/c_0$) |
| ITPRINT | requests table of iteration history and optimizer details |
| MAXITER= | specifies maximum number of iterations in the Newton-Raphson approximation of $\hat{c}$ |
| SIGMA= | specifies $\sigma_0$ for distribution reference line (intercept is $\log(\sigma_0)$) |
| SLOPE= | specifies slope of distribution reference line |
| THETA= | specifies known lower threshold $\theta_0$ |

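As an illustration of combining a mandatory shape parameter with secondary options for the reference line, the sketch below fixes the lognormal shape and asks for the threshold and scale of the line to be estimated. The data set `Sizes` and the variable `Diameter` are hypothetical, and the use of `EST` for THETA= and ZETA= reflects the syntax as it is generally documented rather than anything stated in the table itself.

```sas
/* Hypothetical data: 100 positive values, lognormal by construction */
data Sizes;
   call streaminit(2024);
   do i = 1 to 100;
      Diameter = exp(rand('NORMAL', 0, 0.5));
      output;
   end;
   drop i;
run;

proc univariate data=Sizes noprint;
   /* sigma= is the mandatory shape; theta= and zeta= place the reference line; */
   /* color= and l= control the appearance of the line                          */
   probplot Diameter / lognormal(sigma=0.5 theta=est zeta=est color=red l=2);
run;
```

All secondary options go inside the parentheses that follow the distribution keyword, which keeps them separate from the general plot options that come after the closing parenthesis.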
Table 4.20 summarizes the general options for enhancing probability plots.
Table 4.20: General Graphics Options

| Option | Description |
|---|---|
| General Graphics Options | |
| GRID | creates a grid |
| HREF= | specifies reference lines perpendicular to the horizontal axis |
| HREFLABELS= | specifies labels for HREF= lines |
| HREFLABPOS= | specifies position for HREF= line labels |
| NOHLABEL | suppresses label for horizontal axis |
| NOVLABEL | suppresses label for vertical axis |
| NOVTICK | suppresses tick marks and tick mark labels for vertical axis |
| PCTLORDER= | specifies tick mark labels for percentile axis |
| ROTATE | switches horizontal and vertical axes |
| SQUARE | displays plot in square format |
| VREF= | specifies reference lines perpendicular to the vertical axis |
| VREFLABELS= | specifies labels for VREF= lines |
| VREFLABPOS= | specifies horizontal position of labels for VREF= lines |
| VAXISLABEL= | specifies label for vertical axis |
| Options for Traditional Graphics Output | |
| ANNOTATE= | specifies annotate data set |
| CAXIS= | specifies color for axis |
| CFRAME= | specifies color for frame |
| CGRID= | specifies color for grid lines |
| CHREF= | specifies colors for HREF= lines |
| CSTATREF= | specifies colors for STATREF= lines |
| CTEXT= | specifies color for text |
| CVREF= | specifies colors for VREF= lines |
| DESCRIPTION= | specifies description for plot in graphics catalog |
| FONT= | specifies software font for text |
| HAXIS= | specifies AXIS statement for horizontal axis |
| HEIGHT= | specifies height of text used outside framed areas |
| HMINOR= | specifies number of horizontal minor tick marks |
| INFONT= | specifies software font for text inside framed areas |
| INHEIGHT= | specifies height of text inside framed areas |
| LGRID= | specifies a line type for grid lines |
| LHREF= | specifies line types for HREF= lines |
| LSTATREF= | specifies line types for STATREF= lines |
| LVREF= | specifies line types for VREF= lines |
| NAME= | specifies name for plot in graphics catalog |
| NOFRAME | suppresses frame around plotting area |
| PCTLMINOR | requests minor tick marks for percentile axis |
| WAXIS= | specifies line thickness for axes and frame |
| WGRID= | specifies line thickness for grid |
| TURNVLABELS | turns and vertically strings out characters in labels for vertical axis |
| VAXIS= | specifies AXIS statement for vertical axis |
| VMINOR= | specifies number of vertical minor tick marks |
| Options for ODS Graphics Output | |
| ODSFOOTNOTE= | specifies footnote displayed on plot |
| ODSFOOTNOTE2= | specifies secondary footnote displayed on plot |
| ODSTITLE= | specifies title displayed on plot |
| ODSTITLE2= | specifies secondary title displayed on plot |
| OVERLAY | overlays plots for different class levels (ODS Graphics only) |
| Options for Comparative Plots | |
| ANNOKEY | applies annotation requested in ANNOTATE= data set to key cell only |
| CFRAMESIDE= | specifies color for filling frame for row labels |
| CFRAMETOP= | specifies color for filling frame for column labels |
| CPROP= | specifies color for proportion of frequency bar |
| CTEXTSIDE= | specifies color for row labels |
| CTEXTTOP= | specifies color for column labels |
| INTERTILE= | specifies distance between tiles |
| NCOLS= | specifies number of columns in comparative probability plot |
| NROWS= | specifies number of rows in comparative probability plot |
| Miscellaneous Options | |
| CONTENTS= | specifies table of contents entry for probability plot grouping |
| NADJ= | adjusts sample size when computing percentiles |
| RANKADJ= | adjusts ranks when computing percentiles |

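A few of the general options can be combined in one statement, as in the sketch below. It assumes that ODS Graphics is available and that the options for a grid, a square plot, percentile tick labels, and an ODS title are named GRID, SQUARE, PCTLORDER=, and ODSTITLE=. The data set `Measures`, the variable `Length`, and the title text are hypothetical.

```sas
/* Hypothetical data: 50 simulated lengths (illustrative values only) */
data Measures;
   call streaminit(12345);
   do i = 1 to 50;
      Length = rand('NORMAL', 10, 0.3);
      output;
   end;
   drop i;
run;

ods graphics on;

proc univariate data=Measures noprint;
   probplot Length / normal(mu=est sigma=est)       /* reference line from sample estimates */
                     grid square                    /* grid lines, square plot area         */
                     pctlorder=1 10 25 50 75 90 99  /* tick labels on the percentile axis   */
                     odstitle="Normal Probability Plot for Length";
run;
```

General options follow the distribution keyword and its parenthesized secondary options, separated by spaces. Options from the traditional graphics group would be ignored or unnecessary when ODS Graphics output is produced, so they are left out here.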
The following entries provide detailed descriptions of options in the PROBPLOT statement. Options marked with † are applicable only when traditional graphics are produced. See the section Dictionary of Common Options for detailed descriptions of options common to all plot statements.