Usage Note 23135: Testing fit of continuous and discrete distributions to observed data

Continuous Distributions

You can test the fit of many different continuous distributions to your data using the UNIVARIATE procedure in Base SAS®, the CAPABILITY procedure in SAS/QC® (see the distribution options in the various plotting statements of these procedures), the SEVERITY and HPSEVERITY procedures in SAS/ETS®, or the RELIABILITY procedure in SAS/QC.

The following list shows the available distribution families and the procedures that can fit each of them:

  • Beta (also known as Pearson Type I or Type II distributions and includes Uniform, power-function and arc-sine distributions as special cases) - UNIVARIATE, CAPABILITY
  • Burr - SEVERITY, HPSEVERITY
  • Exponential (special case of gamma, Weibull, Pareto distributions) - UNIVARIATE, CAPABILITY, SEVERITY, HPSEVERITY, RELIABILITY
  • Extreme Value - RELIABILITY
  • Gamma (also known as Pearson Type III distributions and includes chi-square, Erlang, and exponential distributions as special cases) - UNIVARIATE, CAPABILITY, SEVERITY, HPSEVERITY
  • Pareto and Generalized Pareto (Uniform and exponential are special cases) - UNIVARIATE, SEVERITY, HPSEVERITY
  • Gumbel (also known as Type I extreme value distribution) - UNIVARIATE, CAPABILITY
  • Inverse Gaussian (the Wald distribution is a special case) - UNIVARIATE, SEVERITY, HPSEVERITY
  • Johnson SU and SB - UNIVARIATE, CAPABILITY (parameters can be estimated but tests of fit are not available)
  • Logistic and Log Logistic - RELIABILITY
  • Lognormal - UNIVARIATE, CAPABILITY, SEVERITY, HPSEVERITY, RELIABILITY
  • Normal/Gaussian - UNIVARIATE, CAPABILITY, RELIABILITY
  • Power Function - UNIVARIATE, CAPABILITY
  • Rayleigh - UNIVARIATE, CAPABILITY
  • Tweedie - SEVERITY, HPSEVERITY
  • Weibull - UNIVARIATE, CAPABILITY, SEVERITY, HPSEVERITY, RELIABILITY
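
As an illustration of the list above, the following minimal sketch fits three candidate families with the HISTOGRAM statement of the UNIVARIATE procedure. The data set WORK.CLAIMS and the variable AMOUNT are hypothetical and are simulated here only so that the step can run on its own. For each fitted family, the procedure reports parameter estimates and goodness-of-fit tests.

```sas
/* Hypothetical example data: 200 simulated positive values */
data work.claims;
   call streaminit(12345);
   do i = 1 to 200;
      amount = exp(rand('NORMAL', 1, 0.5));
      output;
   end;
   drop i;
run;

/* Fit lognormal, gamma, and Weibull curves to AMOUNT.
   With no suboptions, the threshold is taken as 0 and the
   remaining parameters are estimated from the data. */
proc univariate data=work.claims;
   var amount;
   histogram amount / lognormal gamma weibull;
run;
```

Comparing the goodness-of-fit tables across the three families gives a first indication of which one describes the data best.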

Modeling procedures, such as SAS/STAT® procedures GENMOD, GLIMMIX, NLMIXED, and FMM, can also be used to estimate the parameters of specified distributions. See this note illustrating the use of GENMOD to estimate parameters of several distributions.
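
As a rough sketch of this approach, an intercept-only model in GENMOD estimates the parameters of a single distribution. The data set WORK.CLAIMS and the variable AMOUNT are hypothetical. With DIST=GAMMA and LINK=LOG, the intercept estimates the log of the mean, and the scale parameter is reported alongside it.

```sas
/* Hypothetical example data: 200 simulated positive values */
data work.claims;
   call streaminit(12345);
   do i = 1 to 200;
      amount = exp(rand('NORMAL', 1, 0.5));
      output;
   end;
   drop i;
run;

/* Intercept-only gamma model: no covariates on the right-hand side */
proc genmod data=work.claims;
   model amount = / dist=gamma link=log;
run;
```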

Kernel density estimation is also available for fitting distributions of unspecified or more general types such as multimodal distributions.
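
One way to obtain a kernel density estimate is the KERNEL option of the HISTOGRAM statement in the UNIVARIATE procedure, sketched below. The bimodal data set WORK.MIXED and the variable X are hypothetical and are simulated only for illustration.

```sas
/* Hypothetical bimodal data: an equal mixture of two normal populations */
data work.mixed;
   call streaminit(2024);
   do i = 1 to 300;
      if rand('UNIFORM') < 0.5 then x = rand('NORMAL', 0, 1);
      else x = rand('NORMAL', 5, 1);
      output;
   end;
   drop i;
run;

/* Overlay a kernel density estimate on the histogram of X */
proc univariate data=work.mixed;
   var x;
   histogram x / kernel;
run;
```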

The SEVERITY and HPSEVERITY procedures can automatically fit all of their predefined distributions to the data and identify the best-fitting distribution according to criteria such as AIC and BIC.
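
A minimal sketch of this automatic comparison follows, assuming a hypothetical data set WORK.CLAIMS with a positive loss variable AMOUNT. The CRIT= option names the statistic used to rank the fitted models. Check the SEVERITY documentation for exactly which distributions the _PREDEFINED_ keyword includes in your release.

```sas
/* Hypothetical example data: 200 simulated positive loss amounts */
data work.claims;
   call streaminit(12345);
   do i = 1 to 200;
      amount = exp(rand('NORMAL', 1, 0.5));
      output;
   end;
   drop i;
run;

/* Fit the predefined candidate distributions and rank them by corrected AIC */
proc severity data=work.claims crit=aicc;
   loss amount;
   dist _predefined_;
run;
```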

The RELIABILITY procedure can estimate the parameters for the common life distributions when the data are complete, right censored, or interval censored.
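
For right-censored data, a sketch along the following lines fits a Weibull distribution. The data set WORK.FANS, its variables, and the data values are hypothetical. CENSOR=1 marks a unit that was still running when observation stopped.

```sas
/* Hypothetical life data: HOURS is the observed time,
   CENSOR = 1 marks a right-censored observation */
data work.fans;
   input hours censor @@;
   datalines;
450 0  460 1  1150 0  1150 0  1560 1  1600 0  1660 1  1850 1
;
run;

/* Weibull probability plot with maximum likelihood parameter estimates */
proc reliability data=work.fans;
   distribution weibull;
   pplot hours*censor(1);
run;
```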

For details, see these sections of procedure documentation:

  • "Details: Formulas for Fitted Continuous Distributions" in the UNIVARIATE procedure documentation,
  • "Histogram Statement Details: Formulas for Fitted Curves" in the CAPABILITY documentation,
  • "Details: Predefined Distributions" in the SEVERITY or HPSEVERITY documentation, and
  • "Detail: Probability Distributions" in the RELIABILITY documentation.

To compare the distributions (of unspecified type) from two or more samples, use the EDF option in the NPAR1WAY procedure in SAS/STAT.
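
For example, the following sketch compares two simulated samples. The data set WORK.TWO and the variables GROUP and Y are hypothetical. The EDF option requests statistics based on the empirical distribution functions, such as the Kolmogorov-Smirnov statistic.

```sas
/* Hypothetical data: two groups that differ in location and spread */
data work.two;
   call streaminit(7);
   do i = 1 to 50;
      group = 'A'; y = rand('NORMAL', 0, 1);   output;
      group = 'B'; y = rand('NORMAL', 0.5, 2); output;
   end;
   drop i;
run;

/* Compare the empirical distribution functions of Y across groups */
proc npar1way data=work.two edf;
   class group;
   var y;
run;
```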

Discrete Distributions

You can use features in the FREQ procedure to test the fit of many discrete distributions. Use the TABLES statement to specify a one-way table of observed frequencies or probabilities. In the TABLES statement, specify the CHISQ option to request a Pearson chi-square test of fit, and the TESTF= or TESTP= option to specify the expected frequencies or probabilities of the hypothesized distribution. If you estimated distribution parameters from the data to compute the expected values, also specify the DF= option so that the degrees of freedom of the test are adjusted for the estimated parameters.
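
The following sketch tests the fit of a Poisson distribution to hypothetical event counts. The data set WORK.COUNTS and its values are invented for illustration. The null probabilities are Poisson probabilities for 0, 1, 2, and 3 or more events, computed under the assumption that the mean was estimated from the data as 1.0. Because one parameter is assumed to have been estimated, DF=-1 reduces the default degrees of freedom by one.

```sas
/* Hypothetical observed frequencies; EVENTS = 3 stands for "3 or more" */
data work.counts;
   input events count;
   datalines;
0 35
1 40
2 17
3 8
;
run;

/* TESTP= values follow the order of the table levels (0, 1, 2, 3+)
   and must sum to 1. They are Poisson probabilities for mean 1.0. */
proc freq data=work.counts;
   weight count;
   tables events / chisq(df=-1)
                   testp=(0.3679 0.3679 0.1839 0.0803);
run;
```

Pooling the upper tail into a single "3 or more" category keeps the expected count in every cell reasonably large, which the chi-square approximation relies on.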

Testing the fit of discrete distributions is further discussed and illustrated in this note.



Operating System and Release Information

Product Family | Product | System | SAS Release (Reported, Fixed*)
SAS System | SAS/QC | All | n/a
SAS System | Base SAS | All | n/a
SAS System | SAS/ETS | the systems listed below
z/OS
OpenVMS VAX
Microsoft® Windows® for 64-Bit Itanium-based Systems
Microsoft Windows Server 2003 Datacenter 64-bit Edition
Microsoft Windows Server 2003 Enterprise 64-bit Edition
Microsoft Windows XP 64-bit Edition
Microsoft® Windows® for x64
OS/2
Microsoft Windows 95/98
Microsoft Windows 2000 Advanced Server
Microsoft Windows 2000 Datacenter Server
Microsoft Windows 2000 Server
Microsoft Windows 2000 Professional
Microsoft Windows NT Workstation
Microsoft Windows Server 2003 Datacenter Edition
Microsoft Windows Server 2003 Enterprise Edition
Microsoft Windows Server 2003 Standard Edition
Microsoft Windows Server 2003 for x64
Microsoft Windows Server 2008
Microsoft Windows Server 2008 for x64
Microsoft Windows XP Professional
Windows 7 Enterprise 32 bit
Windows 7 Enterprise x64
Windows 7 Home Premium 32 bit
Windows 7 Home Premium x64
Windows 7 Professional 32 bit
Windows 7 Professional x64
Windows 7 Ultimate 32 bit
Windows 7 Ultimate x64
Windows Millennium Edition (Me)
Windows Vista
Windows Vista for x64
64-bit Enabled AIX
64-bit Enabled HP-UX
64-bit Enabled Solaris
ABI+ for Intel Architecture
AIX
HP-UX
HP-UX IPF
IRIX
Linux
Linux for x64
Linux on Itanium
OpenVMS Alpha
OpenVMS on HP Integrity
Solaris
Solaris for x64
Tru64 UNIX
* For software releases that are not yet generally available, the Fixed Release is the software release in which the problem is planned to be fixed.