HISTOGRAM Statement: CAPABILITY Procedure

Kernel Density Estimates

You can use the KERNEL option to superimpose kernel density estimates on histograms. Smoothing the data distribution with a kernel density estimate can be more effective than using a histogram to examine features that might be obscured by the choice of histogram bins or sampling variation. A kernel density estimate can also be more effective than a parametric curve fit when the process distribution is multimodal. See Example 5.12.

The general form of the kernel density estimator is

\[  \hat{f}_{\lambda }(x) = \frac{1}{n\lambda } \sum ^ n_{i=1}K_{0}\left(\frac{x-x_{i}}{\lambda }\right)  \]

where $K_0(\cdot )$ is a kernel function, $\lambda $ is the bandwidth, n is the sample size, and $x_ i$ is the ith observation.

The KERNEL option provides three kernel functions ($K_0$): normal, quadratic, and triangular. You can specify the function with the K= kernel-option in parentheses after the KERNEL option. Values for the K= option are NORMAL, QUADRATIC, and TRIANGULAR (with aliases of N, Q, and T, respectively). By default, a normal kernel is used. The formulas for the kernel functions are

\[  \begin{array}{lll} \mbox{Normal} &  K_0(t) = \frac{1}{\sqrt {2\pi }} \exp (-\frac{1}{2}t^{2}) &  \mbox{for } -\infty < t < \infty \\[.1in] \mbox{Quadratic} &  K_0(t) = \frac{3}{4}(1-t^2) &  \mbox{for } |t| \leq 1 \\[.1in] \mbox{Triangular} &  K_0(t) = 1-|t| &  \mbox{for } |t| \leq 1 \end{array}  \]

The value of $\lambda $, referred to as the bandwidth parameter, determines the degree of smoothness in the estimated density function. You specify $\lambda $ indirectly by specifying a standardized bandwidth c with the C= kernel-option. If Q is the interquartile range, and n is the sample size, then c is related to $\lambda $ by the formula

\[  \lambda = cQn^{-\frac{1}{5}}  \]

For a specific kernel function, the discrepancy between the density estimator $\hat{f}_{\lambda }(x)$ and the true density $f(x)$ is measured by the mean integrated square error (MISE):

\[  \mbox{MISE}(\lambda ) = \int _{x}\{  E(\hat{f}_{\lambda }(x)) - f(x)\} ^{2}dx + \int _{x}var(\hat{f}_{\lambda }(x))dx  \]

The MISE is the sum of the integrated squared bias and the variance. An approximate mean integrated square error (AMISE) is

\[  \mbox{AMISE}(\lambda ) = \frac{1}{4}\lambda ^{4} \left(\int _{t}t^{2}K(t)dt\right)^2 \int _ x\left(f^{\prime \prime }(x)\right)^2dx + \frac{1}{n\lambda }\int _{t}K(t)^2dt  \]

A bandwidth that minimizes AMISE can be derived by treating $f(x)$ as the normal density having parameters $\mu $ and $\sigma $ estimated by the sample mean and standard deviation. If you do not specify a bandwidth parameter or if you specify C=MISE, the bandwidth that minimizes AMISE is used. The value of AMISE can be used to compare different density estimates. For each estimate, the bandwidth parameter c, the kernel function type, and the value of AMISE are reported in the SAS log.

The general kernel density estimates assume that the domain of the density to estimate can take on all values on a real line. However, sometimes the domain of a density is an interval bounded on one or both sides. For example, if a variable Y is a measurement of only positive values, then the kernel density curve should be bounded so that it is zero for negative Y values.

The CAPABILITY procedure uses a reflection technique to create the bounded kernel density curve, as described in Silverman (1986, pp 30-31). It adds the reflections of kernel density that are outside the boundary to the bounded kernel estimates. The general form of the bounded kernel density estimator is computed by replacing $K_{0}\left(\frac{x-x_{i}}{\lambda }\right)$ in the original equation with

\[  \left\{  K_0\left(\frac{x - x_ i}{\lambda }\right) + K_0\left(\frac{(x - x_ l) + (x_ i - x_ l)}{\lambda }\right) + K_0\left(\frac{(x_ u - x) + (x_ u - x_ i)}{\lambda }\right) \right\}   \]

where $x_ l$ is the lower bound and $x_ u$ is the upper bound.

Without a lower bound, $x_ l = \infty $ and $K_0\left(\frac{(x-x_ l)+(x_ i-x_ l)}{\lambda }\right)$ equals zero. Similarly, without an upper bound, $x_ u = \infty $ and $K_0\left(\frac{(x_ u-x)+(x_ u-x_ i)}{\lambda }\right)$ equals zero.

When C=MISE is used with a bounded kernel density, the CAPABILITY procedure uses a bandwidth that minimizes the AMISE for its corresponding unbounded kernel.