Distribution Analysis: Outlier Detection
The Method Tab
You can use the Method tab to specify how the location and scale
parameters for the data are estimated, and to set the scale
multiplier. The Method tab is shown in Figure 17.3.
- Location estimate
- lists the statistics used to estimate the location parameter for the
data. Each statistic is described in the "Details" section of
the UNIVARIATE procedure documentation in the Base SAS Procedures Guide. The
statistics are as follows:
- Mean
- estimates the location parameter by using the mean of the data.
(Caution: The mean is not a robust statistic; it is influenced by outliers.)
- Median
- estimates the location parameter by using the median of the data.
- Trimmed mean
- estimates the location parameter by using the trimmed mean of the data.
- Winsorized mean
- estimates the location parameter by using the Winsorized mean of the data.
- Trimmed/Winsorized cutoff
- specifies the number or proportion of observations
used to estimate a trimmed or Winsorized mean.
- Scale estimate
- lists the statistics used to estimate the scale parameter for the
(uncontaminated) data.
The statistics are as follows:
- Standard deviation
- estimates the scale parameter by using the standard deviation of the data.
(Caution: The standard deviation is not a robust statistic; it is influenced by outliers.)
- MAD
- estimates the scale parameter by using 1.4826 times the median absolute
deviation from the median of the data.
- Sn
- estimates the scale parameter by using a constant times the robust statistic
Sn of the data.
- Qn
- estimates the scale parameter by using a constant times the robust statistic
Qn of the data.
- Interquartile range
- estimates the scale parameter by using the interquartile
range of the data divided by 1.34898.
- Gini's mean difference
- estimates the scale parameter by using (√π)/2 times Gini's mean difference.
- Scale multiplier
- specifies the constant used to multiply the scale estimate. The
resulting product (scale multiplier times scale estimate) determines
outliers: all values whose distance to the location estimate is
greater than this product are labeled as outliers.
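
The options above can be illustrated with a short Python sketch. It is not the code that the analysis itself runs. The function names, the sample data, and the multiplier value of 3 are hypothetical and chosen only for illustration. The sketch assumes that an integer cutoff k means k observations are trimmed or Winsorized in each tail, and it uses Python's "inclusive" quartile definition, which may differ from the quartile definition used by the UNIVARIATE procedure. Sn and Qn are omitted for brevity.

```python
import math
import statistics
from itertools import combinations


def trimmed_mean(data, k):
    """Mean after dropping the k smallest and k largest values (assumed convention)."""
    xs = sorted(data)
    if k < 0 or 2 * k >= len(xs):
        raise ValueError("k must satisfy 0 <= 2k < n")
    return statistics.fmean(xs[k:len(xs) - k])


def winsorized_mean(data, k):
    """Mean after replacing the k smallest and k largest values
    with the nearest retained value (assumed convention)."""
    xs = sorted(data)
    if k < 0 or 2 * k >= len(xs):
        raise ValueError("k must satisfy 0 <= 2k < n")
    core = xs[k:len(xs) - k]
    return statistics.fmean([core[0]] * k + core + [core[-1]] * k)


def mad_scale(data):
    """1.4826 times the median absolute deviation from the median."""
    med = statistics.median(data)
    return 1.4826 * statistics.median([abs(x - med) for x in data])


def iqr_scale(data):
    """Interquartile range divided by 1.34898 (quartile definition is an assumption)."""
    q1, _, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return (q3 - q1) / 1.34898


def gini_scale(data):
    """sqrt(pi)/2 times Gini's mean difference (mean |x_i - x_j| over all pairs)."""
    g = statistics.fmean([abs(a - b) for a, b in combinations(data, 2)])
    return g * math.sqrt(math.pi) / 2


def flag_outliers(data, location, scale, multiplier):
    """Values whose distance to the location estimate exceeds multiplier * scale."""
    cutoff = multiplier * scale
    return [x for x in data if abs(x - location) > cutoff]


# Hypothetical sample: six similar values and one contaminating value.
data = [9.8, 10.1, 10.0, 9.9, 10.2, 10.0, 25.0]

# Non-robust choice: mean and standard deviation.
classic = flag_outliers(data, statistics.fmean(data), statistics.stdev(data), 3.0)

# Robust choice: median and MAD.
robust = flag_outliers(data, statistics.median(data), mad_scale(data), 3.0)

print("mean / standard deviation:", classic)
print("median / MAD:", robust)
```

With this sample, the extreme value inflates both the mean and the standard deviation enough that the non-robust rule flags nothing, whereas the median and MAD rule flags 25.0. This is the masking effect that the cautions on the mean and the standard deviation refer to.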
Copyright © 2008 by SAS Institute Inc., Cary, NC, USA. All rights reserved.