Use the Distribution Analysis transformation to generate distribution
analysis data in a target table and on the Output tab of the Job Editor.
The target receives data only for the columns that are involved in the
analysis. You can control many aspects of how data is generated,
including choosing the type of analysis and which columns are analyzed.
The Distribution Analysis transformation is based on the UNIVARIATE
procedure, which is documented in "The UNIVARIATE Procedure" section of
the Base SAS Procedures Guide: Statistical Procedures.
The UNIVARIATE procedure provides the following:
- descriptive statistics based on moments (including skewness and
  kurtosis), quantiles or percentiles (such as the median), frequency
  tables, and extreme values
- histograms and comparative histograms. These can also be fitted with
  probability density curves for various distributions and with
  threaded kernel density estimates.
- quantile-quantile plots (Q-Q plots) and probability plots. These
  plots facilitate the comparison of a data distribution with various
  theoretical distributions.
- goodness-of-fit tests for a variety of distributions including the
  normal
- the ability to inset summary statistics on plots produced on a
  graphics device
- the ability to analyze data sets with a frequency variable
- the ability to create output data sets containing summary statistics,
  histogram intervals, and parameters of fitted curves
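
Several of the capabilities listed above can be seen in one short
PROC UNIVARIATE step. The following is a minimal sketch written directly
in SAS, not code taken from the transformation. The table work.scores,
its columns score and count, and the output table work.score_bins are
hypothetical names invented for this illustration.

```sas
/* Hypothetical sample data: each row is a score value and the  */
/* number of times that value was observed (a frequency column). */
data work.scores;
   input score count;
   datalines;
1 4
2 9
3 15
4 8
5 3
;
run;

/* NORMAL on the PROC statement requests tests for normality.    */
/* FREQ names the frequency variable.                            */
/* HISTOGRAM fits a normal curve and saves the histogram         */
/* intervals to an output data set through OUTHISTOGRAM=.        */
proc univariate data=work.scores normal;
   var score;
   freq count;
   histogram score / normal outhistogram=work.score_bins;
run;
```

The default output includes the moments, quantiles, and extreme values
for score, with each row weighted by count.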
You can use the UNIVARIATE
procedure, together with the VAR statement, to compute summary statistics.
In addition, you can use the following statements to request plots:
- the HISTOGRAM statement for creating histograms, the QQPLOT statement
  for creating Q-Q plots, and the PROBPLOT statement for creating
  probability plots.
- the CLASS statement together with the HISTOGRAM, QQPLOT, and PROBPLOT
  statements for creating comparative histograms, Q-Q plots, and
  probability plots.
- the INSET statement with any of the plot statements for enhancing the
  plot with an inset table of summary statistics. The INSET statement
  is applicable only to plots produced on graphics devices.
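
The statements described above can be combined in a single step. The
following sketch assumes the sample table sashelp.class that ships with
SAS (columns Sex and Height); the choice of table, columns, and plot
options is illustrative only.

```sas
/* CLASS Sex makes each plot statement produce a comparative plot, */
/* with one panel per value of Sex.                                */
proc univariate data=sashelp.class;
   class Sex;
   var Height;                      /* summary statistics for Height */

   /* Comparative histogram with a fitted normal density curve */
   histogram Height / normal;
   /* INSET applies to the plot statement that precedes it */
   inset n mean std / position=ne;

   /* Comparative Q-Q and probability plots against a normal       */
   /* distribution whose parameters are estimated from the data    */
   qqplot Height / normal(mu=est sigma=est);
   probplot Height / normal(mu=est sigma=est);
run;
```

Removing the CLASS statement yields a single histogram, Q-Q plot, and
probability plot for all rows.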
You can specify grouping
columns in the Distribution Analysis transformation. Doing so causes
a SAS BY statement to order target rows according to the values in
the grouping columns. The Distribution Analysis transformation requires
that grouping columns be sorted in ascending order in the source.
If the source is not already sorted by the grouping columns, you can
sort it by placing a SAS Sort transformation before the Distribution
Analysis transformation in the job.
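
In plain SAS code, the sort-then-analyze pattern looks roughly like the
sketch below. This is not the code that the transformation generates; it
only illustrates why BY processing needs sorted input. The tables
work.class_sorted and work.dist_stats and the output column names are
hypothetical, and sashelp.class is used as a stand-in source.

```sas
/* Sort the source by the grouping column in ascending order, */
/* which is what BY-group processing expects.                  */
proc sort data=sashelp.class out=work.class_sorted;
   by Sex;
run;

/* One analysis per value of Sex. NOPRINT suppresses the printed */
/* tables; OUTPUT writes one row of statistics per BY group.     */
proc univariate data=work.class_sorted noprint;
   by Sex;
   var Height Weight;
   output out=work.dist_stats
          mean=mean_height mean_weight
          median=median_height median_weight;
run;
```

If the input is not sorted by the BY variable, SAS stops the step with
an error stating that the data set is not sorted in ascending sequence.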