Nonparametric Analysis

In statistical inference, or hypothesis testing, the traditional tests are called parametric tests because they assume that the data follow a specified probability distribution (such as the normal) that is fully determined except for a set of free parameters. Parametric tests are therefore said to depend on distributional assumptions. Nonparametric tests, on the other hand, do not require any strict distributional assumptions. Even if the data are normally distributed, nonparametric methods are often almost as powerful as parametric methods.

The SAS/STAT nonparametric analysis procedures include the following:

FREQ Procedure


The FREQ procedure produces one-way to n-way frequency and contingency (crosstabulation) tables. For two-way tables, PROC FREQ computes tests and measures of association. For n-way tables, PROC FREQ provides stratified analysis by computing statistics across, as well as within, strata. The following are highlights of the FREQ procedure's features:

  • computes goodness-of-fit tests for equal proportions or specified null proportions for one-way frequency tables
  • provides confidence limits and tests for binomial proportions, including tests for noninferiority and equivalence for one-way frequency tables
  • computes various statistics to examine the relationships between two classification variables. The statistics for contingency tables include the following:
    • chi-square tests and measures
    • measures of association
    • risks (binomial proportions) and risk differences for 2 x 2 tables
    • odds ratios and relative risks for 2 x 2 tables
    • tests for trend
    • tests and measures of agreement
    • Cochran-Mantel-Haenszel statistics
  • computes asymptotic standard errors, confidence intervals, and tests for measures of association and measures of agreement
  • computes score confidence limits for odds ratios
  • computes exact p-values, exact mid-p-values, and confidence intervals for many test statistics and measures
  • performs BY group processing, which enables you to obtain separate analyses on grouped observations
  • accepts either raw data or cell count data to produce frequency and crosstabulation tables
  • creates a SAS data set that contains the computed statistics
  • creates a SAS data set that corresponds to any output table
  • automatically creates graphs by using ODS Graphics
For further details, see FREQ Procedure
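
To make a few of the two-way table statistics listed above concrete, here is a minimal Python sketch (standard library only) of the textbook formulas for the Pearson chi-square statistic, the odds ratio, and the relative risk. It is not PROC FREQ's implementation; the function names, the orientation of the 2 x 2 table, and the cell counts are hypothetical choices made for illustration.

```python
from collections import Counter


def crosstab(row_values, col_values, row_levels, col_levels):
    """Build a two-way table of counts from raw (row, column) observations."""
    counts = Counter(zip(row_values, col_values))
    return [[counts[(r, c)] for c in col_levels] for r in row_levels]


def pearson_chi_square(table):
    """Pearson chi-square statistic for a two-way table of cell counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    return stat


def odds_ratio(table):
    """Odds ratio for a 2 x 2 table [[a, b], [c, d]]."""
    (a, b), (c, d) = table
    return (a * d) / (b * c)


def relative_risk(table):
    """Ratio of the column 1 risks: row 1 risk divided by row 2 risk."""
    (a, b), (c, d) = table
    return (a / (a + b)) / (c / (c + d))


# Hypothetical cell counts: rows = exposure (yes, no), columns = outcome (yes, no).
table = [[20, 10], [10, 20]]
print(pearson_chi_square(table))
print(odds_ratio(table))
print(relative_risk(table))
```

For these made-up counts, every expected cell count is 15, so the chi-square statistic is about 6.67, the odds ratio is 4, and the relative risk is 2. The crosstab helper mirrors the idea of starting from raw data rather than cell counts.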

KDE Procedure


The KDE procedure performs univariate and bivariate kernel density estimation. Statistical density estimation involves approximating a hypothesized probability density function from observed data. Kernel density estimation is a nonparametric technique for density estimation in which a known density function (the kernel) is averaged across the observed data points to create a smooth approximation. PROC KDE uses a Gaussian density as the kernel, and its assumed variance determines the smoothness of the resulting estimate. The following are highlights of the KDE procedure's features:

  • computes a variety of common statistics, including estimates of the percentiles of the hypothesized probability density function
  • produces a variety of plots, including univariate and bivariate histograms, plots of the kernel density estimates, and contour plots
  • saves kernel density estimates into SAS data sets
  • performs BY group processing, which enables you to obtain separate analyses on grouped observations
  • performs weighted estimation
  • creates a SAS data set that corresponds to any output table
  • automatically creates graphs by using ODS Graphics
For further details, see KDE Procedure
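
The averaging idea described above can be sketched in a few lines of Python. This is only an illustration of a univariate Gaussian kernel density estimate, not PROC KDE's algorithm: the bandwidth (the standard deviation of each kernel) is passed in by hand, and the function names and data values are hypothetical.

```python
import math


def gaussian_kernel(u):
    """Standard normal density evaluated at u."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def kernel_density(x, data, bandwidth):
    """Average of Gaussian kernels centered at each observation, evaluated at x.

    A larger bandwidth spreads each kernel out and gives a smoother estimate.
    """
    n = len(data)
    total = sum(gaussian_kernel((x - xi) / bandwidth) for xi in data)
    return total / (n * bandwidth)


# Hypothetical observations and bandwidths, chosen only for illustration.
data = [1.2, 1.9, 2.1, 2.4, 3.8]
for h in (0.2, 1.0):
    estimate = [kernel_density(x / 2.0, data, h) for x in range(0, 11)]
    print(h, [round(value, 3) for value in estimate])
```

Comparing the two printed rows shows the role of the kernel variance: the small bandwidth follows individual observations closely, while the large one yields a flatter, smoother curve.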

NPAR1WAY Procedure


The NPAR1WAY procedure performs nonparametric tests for location and scale differences across a one-way classification. PROC NPAR1WAY also provides a standard analysis of variance on the raw data and tests based on the empirical distribution function. The following are highlights of the NPAR1WAY procedure's features:

  • performs nonparametric tests for location and scale differences across a one-way classification based on the following scores of a response variable:
    • Wilcoxon
    • median
    • Van der Waerden (normal)
    • Savage
    • Siegel-Tukey
    • Ansari-Bradley
    • Klotz
    • Mood
    • Conover
    • raw data
  • computes tests based on simple linear rank statistics when the data are classified into two samples
  • computes tests based on one-way ANOVA statistics when the data are classified into more than two samples
  • provides asymptotic p-values, exact p-values, and exact mid-p-values for tests
  • provides the Hodges-Lehmann estimate of location shift, including exact confidence limits
  • provides tests based on Conover scores, including exact tests
  • provides stratified rank-based analysis of two-sample data
  • computes the following empirical distribution function (EDF) statistics:
    • Kolmogorov-Smirnov test
    • Cramer-von Mises test
    • Kuiper test
  • performs BY group processing, which enables you to obtain separate analyses on grouped observations
  • creates a SAS data set that corresponds to any output table
  • automatically creates graphs by using ODS Graphics
For further details, see NPAR1WAY Procedure
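
As a rough illustration of three of the two-sample quantities named above, the following Python sketch computes a Wilcoxon rank sum (using average ranks for ties), a Hodges-Lehmann shift estimate (the median of all pairwise differences), and the maximum distance between two empirical distribution functions. These are textbook definitions, not PROC NPAR1WAY's implementation; the function names, the direction of differencing, and the data are assumptions, and no p-values or confidence limits are computed.

```python
from statistics import median


def midranks(values):
    """Rank values from 1 to n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return ranks


def wilcoxon_rank_sum(sample_a, sample_b):
    """Sum of the ranks that sample_a receives in the pooled data."""
    ranks = midranks(list(sample_a) + list(sample_b))
    return sum(ranks[:len(sample_a)])


def hodges_lehmann_shift(sample_a, sample_b):
    """Median of all pairwise differences (b - a); the direction is a choice."""
    return median(b - a for a in sample_a for b in sample_b)


def max_edf_difference(sample_a, sample_b):
    """Largest absolute gap between the two empirical distribution functions."""
    def edf(sample, x):
        return sum(v <= x for v in sample) / len(sample)

    points = sorted(set(sample_a) | set(sample_b))
    return max(abs(edf(sample_a, x) - edf(sample_b, x)) for x in points)


# Hypothetical two-sample data, chosen only for illustration.
group_a = [1, 2, 3]
group_b = [4, 5, 6]
print(wilcoxon_rank_sum(group_a, group_b))
print(hodges_lehmann_shift(group_a, group_b))
print(max_edf_difference(group_a, group_b))
```

With these completely separated samples, the first group takes ranks 1 through 3 (rank sum 6), the median pairwise difference is 3, and the empirical distribution functions differ by the maximum possible amount of 1.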