The KDE Procedure |
Convolutions |
The formulas for the binned estimator in the previous subsection are in the form of a convolution product between two matrices, one of which contains the bin counts, the other of which contains the rescaled kernels evaluated at multiples of grid increments. This section defines these two matrices explicitly, and shows that is their convolution.
Beginning with the weighted univariate case, define the following matrices:
The first thing to note is that many terms in are negligible. The term is taken to be when , so you can define
as the maximum integer multiple of the grid increment to get nonzero evaluations of the rescaled kernel. Here denotes the largest integer less than or equal to .
Next, let be the smallest power of that is greater than ,
where denotes the smallest integer greater than or equal to .
Modify as follows:
Essentially, the negligible terms of are omitted, and the rest are symmetrized (except for one term). The whole matrix is then padded to size with zeros in the middle. The dimension is a highly composite number—that is, one that decomposes into many factors—leading to the most efficient fast Fourier transform operation (refer to Wand; 1994).
The third operation is to pad the bin count matrix with zeros to the same size as :
The convolution is then a matrix, and the preceding formulas show that its first entries are exactly the estimates .
For bivariate smoothing, the matrix is defined similarly as
where , and so forth, and .
The bin count matrix is defined as
As with the univariate case, the upper-left corner of the convolution is the matrix of the estimates .
Most of the results in this subsection are found in Wand (1994).
Copyright © SAS Institute, Inc. All Rights Reserved.