
The KDE Procedure

Kernel Density Estimates

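A weighted univariate kernel density estimate involves a variable $X$ and a weight variable $W$. Let $(X_i, W_i)$, $i = 1, 2, \ldots, n$, denote a sample of $X$ and $W$ of size $n$. The weighted kernel density estimate of $f(x)$, the density of $X$, is as follows:

$$\hat{f}(x) = \frac{1}{\sum_{i=1}^{n} W_i} \sum_{i=1}^{n} W_i \, \varphi_h(x - X_i)$$

where $h$ is the bandwidth and

$$\varphi_h(x) = \frac{1}{\sqrt{2\pi}\, h} \exp\left(-\frac{x^2}{2h^2}\right)$$

is the standard normal density rescaled by the bandwidth. If $h \to 0$ and $nh \to \infty$, then the optimal bandwidth is

$$h_{\mathrm{AMISE}} = \left[\frac{1}{2\sqrt{\pi}\, n \int (f'')^2}\right]^{1/5}$$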

This optimal value is unknown because it depends on the second derivative of the density being estimated, and so approximation methods are required. For a derivation and discussion of these results, refer to Silverman (1986, Chapter 3) and Jones, Marron, and Sheather (1996).
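The univariate estimate and bandwidth formula can be sketched in a few lines of Python. This is only an illustration, not the procedure's implementation: the function names, the sample, and the weights are hypothetical, and the bandwidth is computed by plugging in the roughness of a standard normal density, $\int (f'')^2 = 3/(8\sqrt{\pi})$, which is an assumption made here purely to obtain a number.

```python
import math


def normal_kernel(u, h):
    """Standard normal density rescaled by the bandwidth h, evaluated at u."""
    return math.exp(-0.5 * (u / h) ** 2) / (math.sqrt(2.0 * math.pi) * h)


def weighted_kde(x, sample, weights, h):
    """Weighted kernel density estimate at the point x."""
    total_weight = sum(weights)
    kernel_sum = sum(w * normal_kernel(x - xi, h) for xi, w in zip(sample, weights))
    return kernel_sum / total_weight


def h_amise(n, roughness):
    """Optimal bandwidth for sample size n, where roughness is the integral of (f'')^2."""
    return (1.0 / (2.0 * math.sqrt(math.pi) * n * roughness)) ** 0.2


# Hypothetical data: five observations with unequal weights.
sample = [-1.2, -0.3, 0.1, 0.8, 1.5]
weights = [1.0, 2.0, 1.0, 1.0, 0.5]

# Assumption for illustration: treat the unknown density as standard normal,
# whose roughness is 3 / (8 * sqrt(pi)).
h = h_amise(len(sample), 3.0 / (8.0 * math.sqrt(math.pi)))
density_at_zero = weighted_kde(0.0, sample, weights, h)
```

Under the standard normal assumption the bandwidth simplifies to $(4/(3n))^{1/5}$. In practice the roughness is unknown, which is why an approximation method is needed.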

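For the bivariate case, let $\mathbf{X} = (X, Y)$ be a bivariate random element taking values in $\mathbb{R}^2$ with joint density function

$$f(x, y), \quad (x, y) \in \mathbb{R}^2$$

and let $\mathbf{X}_i = (X_i, Y_i)$, $i = 1, 2, \ldots, n$, be a sample of size $n$ drawn from this distribution. The kernel density estimate of $f(x, y)$ based on this sample is

$$\begin{aligned}
\hat{f}(x, y) &= \frac{1}{n} \sum_{i=1}^{n} \varphi_{\mathbf{h}}(x - X_i,\, y - Y_i) \\
&= \frac{1}{n h_X h_Y} \sum_{i=1}^{n} \varphi\left(\frac{x - X_i}{h_X},\, \frac{y - Y_i}{h_Y}\right)
\end{aligned}$$

where $(x, y) \in \mathbb{R}^2$, $h_X > 0$ and $h_Y > 0$ are the bandwidths, and $\varphi_{\mathbf{h}}(x, y)$ is the rescaled normal density

$$\varphi_{\mathbf{h}}(x, y) = \frac{1}{h_X h_Y}\, \varphi\left(\frac{x}{h_X},\, \frac{y}{h_Y}\right)$$

where $\varphi(x, y)$ is the standard normal density function

$$\varphi(x, y) = \frac{1}{2\pi} \exp\left(-\frac{x^2 + y^2}{2}\right)$$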
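As a minimal sketch of the bivariate estimate, the following Python evaluates the kernel sum directly at a single point. The function names, the sample, and the bandwidth values are hypothetical and chosen only for illustration. A direct sum like this costs one kernel evaluation per observation per evaluation point, so it is meant to show the formula rather than an efficient algorithm.

```python
import math


def std_normal_2d(x, y):
    """Standard bivariate normal density with independent unit-variance components."""
    return math.exp(-0.5 * (x * x + y * y)) / (2.0 * math.pi)


def bivariate_kde(x, y, sample, hx, hy):
    """Kernel density estimate at (x, y) with separate bandwidths hx and hy."""
    n = len(sample)
    kernel_sum = sum(
        std_normal_2d((x - xi) / hx, (y - yi) / hy) for xi, yi in sample
    )
    return kernel_sum / (n * hx * hy)


# Hypothetical sample of four (X, Y) pairs and arbitrary bandwidths.
sample = [(0.0, 0.0), (1.0, 0.5), (-0.5, 1.2), (0.3, -0.8)]
estimate = bivariate_kde(0.2, 0.1, sample, hx=0.6, hy=0.9)
```

Using two bandwidths lets the amount of smoothing differ between the two coordinate directions, which matters when the variables are on different scales.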

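Under mild regularity assumptions about $f(x, y)$, the mean integrated squared error (MISE) of $\hat{f}(x, y)$ is

$$\begin{aligned}
\mathrm{MISE}(h_X, h_Y) &= E \iint \left(\hat{f} - f\right)^2 dx\, dy \\
&= \frac{1}{4\pi n h_X h_Y} + \frac{h_X^4}{4} \iint \left(\frac{\partial^2 f}{\partial x^2}\right)^2 dx\, dy + \frac{h_Y^4}{4} \iint \left(\frac{\partial^2 f}{\partial y^2}\right)^2 dx\, dy \\
&\quad + O\left(h_X^4 + h_Y^4 + \frac{1}{n h_X h_Y}\right)
\end{aligned}$$

as $h_X \to 0$, $h_Y \to 0$, and $n h_X h_Y \to \infty$.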

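Now set

$$\begin{aligned}
\mathrm{AMISE}(h_X, h_Y) &= \frac{1}{4\pi n h_X h_Y} + \frac{h_X^4}{4} \iint \left(\frac{\partial^2 f}{\partial x^2}\right)^2 dx\, dy \\
&\quad + \frac{h_Y^4}{4} \iint \left(\frac{\partial^2 f}{\partial y^2}\right)^2 dx\, dy
\end{aligned}$$

which is the asymptotic mean integrated squared error (AMISE). For fixed $n$, this has a minimum at $(h_{\mathrm{AMISE}_X}, h_{\mathrm{AMISE}_Y})$ defined as

$$h_{\mathrm{AMISE}_X} = \left[\frac{1}{4\pi n \iint \left(\frac{\partial^2 f}{\partial x^2}\right)^2 dx\, dy}\right]^{1/6} \left[\frac{\iint \left(\frac{\partial^2 f}{\partial y^2}\right)^2 dx\, dy}{\iint \left(\frac{\partial^2 f}{\partial x^2}\right)^2 dx\, dy}\right]^{1/24}$$

and

$$h_{\mathrm{AMISE}_Y} = \left[\frac{1}{4\pi n \iint \left(\frac{\partial^2 f}{\partial y^2}\right)^2 dx\, dy}\right]^{1/6} \left[\frac{\iint \left(\frac{\partial^2 f}{\partial x^2}\right)^2 dx\, dy}{\iint \left(\frac{\partial^2 f}{\partial y^2}\right)^2 dx\, dy}\right]^{1/24}$$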

These are the optimal asymptotic bandwidths in the sense that they minimize the AMISE, that is, the leading terms of the MISE. However, as in the univariate case, these expressions contain the second derivatives of the unknown density being estimated, and so approximations are required. Refer to Wand and Jones (1993) for further details.
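To make the minimization concrete, the sketch below assumes the AMISE has the form $1/(4\pi n h_X h_Y) + (a\, h_X^4 + b\, h_Y^4)/4$, where $a$ and $b$ stand for the integrated squared second partial derivatives in $x$ and $y$. Setting both partial derivatives of this expression to zero gives $a\, h_X^5 h_Y = b\, h_Y^5 h_X = 1/(4\pi n)$, which the code solves in closed form. The function names and the symbols `a` and `b` are hypothetical. The example values of `a` and `b` are those of a standard bivariate normal density with independent components, $3/(16\pi)$, used here only as an assumption to obtain numbers.

```python
import math


def amise(hx, hy, n, a, b):
    """AMISE for bandwidths (hx, hy); a and b are the integrated squared
    second partial derivatives of the density in x and y."""
    return 1.0 / (4.0 * math.pi * n * hx * hy) + 0.25 * (a * hx**4 + b * hy**4)


def amise_bandwidths(n, a, b):
    """Bandwidths that solve the first-order conditions of amise()."""
    c = 1.0 / (4.0 * math.pi * n)
    hx = (c * b**0.25 / a**1.25) ** (1.0 / 6.0)
    hy = (c * a**0.25 / b**1.25) ** (1.0 / 6.0)
    return hx, hy


# Assumption for illustration: standard bivariate normal with independent
# components, for which both integrals equal 3 / (16 * pi).
n = 100
a = b = 3.0 / (16.0 * math.pi)
hx, hy = amise_bandwidths(n, a, b)
```

With $a = b$ the two bandwidths coincide, and for these values they reduce to $(4/(3n))^{1/6}$. When the density is rougher in one direction (a larger integral), the corresponding bandwidth shrinks.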
