Distributions

New SAS Functions for Computing Probabilities

Beginning with Release 6.12, Base SAS software provides functions for computing probabilities, densities, and logarithms of probabilities for numerous continuous and discrete distributions. The CDF, PDF, LOGPDF, SDF, and LOGSDF functions offer a simpler and more consistent mechanism for computing probability values, and they can calculate upper and lower tail probabilities directly.

The syntax of the new functions simplifies the calling and naming mechanisms required for computing probability and density values. With this syntax, you specify a string identifying the distribution, then the value of the random variable, and then any additional parameters that describe the shape, scale, location, and other appropriate features of the distribution.
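As a minimal sketch of this calling pattern, the DATA step below passes the distribution name first, then the value of the random variable, then the distribution parameters. It assumes the 'NORMAL' distribution takes its location (mean) and scale (standard deviation) as the third and fourth arguments and defaults them to 0 and 1 when they are omitted; the variable names are illustrative only.

```sas
data _null_;
   /* argument order: distribution name, value of the random variable, parameters */
   p_dflt = cdf('NORMAL', 1.96);         /* parameters omitted: standard normal assumed */
   p_expl = cdf('NORMAL', 1.96, 0, 1);   /* location 0 and scale 1 given explicitly */
   p_old  = probnorm(1.96);              /* older single-purpose function */
   put p_dflt= 8.4 p_expl= 8.4 p_old= 8.4;
run;
```

Under these assumptions, all three values written to the log should agree.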

CDF Function

The CDF function computes the cumulative distribution function for a wide range of distributions and consolidates the computations available in existing functions such as PROBF, PROBNORM, and PROBT. For example, consider calculating the probability of experiencing 6 or fewer successes in 20 independent Bernoulli trials with a 0.45 probability of success. You can use the CDF function to compute the probability P(X <= 6) as

    CDF('Binomial',6,0.45,20) = 0.1300
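The consolidation can be checked with a short DATA step. This is a sketch rather than a definitive program: it compares the CDF call above with the older PROBBNML function, whose arguments are assumed here to be in the order (p, n, m), and the variable names are illustrative.

```sas
data _null_;
   p = 0.45;   /* probability of success on each trial */
   n = 20;     /* number of independent Bernoulli trials */
   m = 6;      /* number of successes */
   p_cdf = cdf('BINOMIAL', m, p, n);   /* P(X <= 6) with the new syntax */
   p_old = probbnml(p, n, m);          /* same probability from the older function */
   put p_cdf= 8.4 p_old= 8.4;
run;
```

Note that the two functions take the same quantities in a different order, which is one of the inconsistencies the new syntax removes.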

PDF and LOGPDF Functions

The PDF function computes probability density and mass functions for continuous and discrete distributions, and the LOGPDF function computes the logarithm of the probability density or mass function. Using the previous example, you can use the PDF function to compute the probability of exactly 6 successes in 20 trials:

    PDF('Binomial',6,0.45,20) = 0.0746
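A hedged sketch of how the PDF and LOGPDF functions relate for this example: for a discrete distribution the mass at 6 should equal the difference between the CDF at 6 and at 5, and exponentiating the LOGPDF value should recover the PDF value. The variable names are illustrative.

```sas
data _null_;
   p_eq6    = pdf('BINOMIAL', 6, 0.45, 20);      /* P(X = 6) */
   logp_eq6 = logpdf('BINOMIAL', 6, 0.45, 20);   /* logarithm of P(X = 6) */
   /* cross-checks: difference of adjacent CDF values, and EXP of the log value */
   diff_cdf = cdf('BINOMIAL', 6, 0.45, 20) - cdf('BINOMIAL', 5, 0.45, 20);
   back     = exp(logp_eq6);
   put p_eq6= 8.4 diff_cdf= 8.4 back= 8.4 logp_eq6= 8.4;
run;
```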

SDF and LOGSDF Functions

The SDF function computes the survivor distribution function, that is, the upper tail probability P(X > x) of a specified distribution. You can use the SDF function to compute the probability of more than 6 successes in 20 trials as

    SDF('Binomial',6,0.45,20) = 0.8700

which is equal to 1 - P(X <= 6) as computed by the CDF function. In addition, you can use the LOGSDF function to compute the logarithm of the survivor function, which is useful when the upper tail probabilities of a distribution are extremely small.
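The relationship between the upper and lower tails can be sketched in a DATA step as follows. The variable names are illustrative; the function calls are the ones shown in this article.

```sas
data _null_;
   upper  = sdf('BINOMIAL', 6, 0.45, 20);      /* P(X > 6) */
   lower  = cdf('BINOMIAL', 6, 0.45, 20);      /* P(X <= 6) */
   total  = upper + lower;                     /* the two tails should sum to 1 */
   log_up = logsdf('BINOMIAL', 6, 0.45, 20);   /* logarithm of P(X > 6) */
   put upper= 8.4 lower= 8.4 total= 8.4 log_up= 8.4;
run;
```

For a probability of this size, either route gives a usable answer; the distinction matters only far out in the tail.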

Accurately Computing Extreme Upper Tail Probabilities

To avoid cancellation error caused by finite precision arithmetic, use the SDF or LOGSDF function to compute upper tail probabilities directly. Consider calculating an upper tail probability for a random variable X that is distributed as chi-squared with 100 degrees of freedom. Using the CDF function, you can compute the upper tail probability P(X > 265) as

    1-CDF('chisquared',265,100) = 1.1102E-16

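However, because the SAS System stores numerical results in double precision, this answer is incorrect. The CDF value is so close to 1 that the subtraction cancels every significant digit: the result, 1.1102E-16, is simply 2**(-53), the spacing between double precision numbers immediately below 1. If you use the SDF function instead,

    SDF('chisquared',265,100) = 7.2119E-17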

the result is accurate to at least 10 digits of relative precision.
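The comparison can be reproduced with a short DATA step. This is a sketch under the assumption that the functions behave as described in this article; the variable names are illustrative.

```sas
data _null_;
   by_sub  = 1 - cdf('CHISQUARED', 265, 100);   /* subtraction: subject to cancellation */
   direct  = sdf('CHISQUARED', 265, 100);       /* upper tail computed directly */
   log_dir = logsdf('CHISQUARED', 265, 100);    /* logarithm of the upper tail */
   put by_sub= e12. direct= e12. log_dir= 10.4;
run;
```

Working on the logarithmic scale with LOGSDF is a reasonable choice when tail probabilities become small enough that even the SDF value would approach the limits of double precision.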
