The KDE Procedure

Convolutions

The formulas for the binned estimator $\hat{f}$ in the previous subsection are in the form of a convolution product between two matrices, one of which contains the bin counts, the other of which contains the rescaled kernels evaluated at multiples of grid increments. This section defines these two matrices explicitly, and shows that $\hat{f}$ is their convolution.

Beginning with the weighted univariate case, define the following matrices:

\begin{eqnarray*} \bK & = & \frac{1}{n}(\varphi _{h}(0),\varphi _{h}(\delta ), \ldots ,\varphi _{h}((g-1)\delta ))’ \\ \bC & = & (c_{1},c_{2},\ldots ,c_{g})’ \end{eqnarray*}

The first thing to note is that many terms in $\bK $ are negligible. The term $\varphi _{h}(i\delta )$ is taken to be 0 when $|i\delta / h|\geq 5$, so you can define

\[ l = \min (g-1,\mr{floor}(5h/\delta )) \]

as the maximum integer multiple of the grid increment to get nonzero evaluations of the rescaled kernel. Here $\mr{floor}(x)$ denotes the largest integer less than or equal to x.

Next, let p be the smallest power of 2 that is greater than $ g+l+1$,

\[ p = 2^{\mr{ceil}(\log _{2} (g+l+1))} \]

where $\mr{ceil}(x)$ denotes the smallest integer greater than or equal to x.

Modify $\bK $ as follows:

\[ \bK = \frac{1}{n}(\varphi _{h}(0), \varphi _{h}(\delta ), \ldots ,\varphi _{h}(l\delta ), \underbrace{0,\ldots ,0}_{p-2l-1}, \varphi _{h}(l\delta ), \ldots ,\varphi _{h}(\delta ))’ \]

Essentially, the negligible terms of $\bK $ are omitted, and the rest are symmetrized (except for one term). The whole matrix is then padded to size $p\times 1$ with zeros in the middle. The dimension p is a highly composite number—that is, one that decomposes into many factors—leading to the most efficient fast Fourier transform operation (see Wand; 1994).

The third operation is to pad the bin count matrix $\bC $ with zeros to the same size as $\bK $:

\[ \bC = (c_{1},c_{2},\ldots ,c_{g}, \underbrace{0,\ldots ,0}_{p-g})’ \]

The convolution $\bK *\bC $ is then a $p\times 1$ matrix, and the preceding formulas show that its first g entries are exactly the estimates $\hat{f}(x_{i}), \  i=1,2,\ldots ,g$.

For bivariate smoothing, the matrix $\bK $ is defined similarly as

\[ \bK = \left[ \begin{array}{cccccccccc}\kappa _{0,0} & \kappa _{0,1} & \ldots & \kappa _{0,l_{Y}} & \mb{0} & \kappa _{0,l_{Y}} & \ldots & \kappa _{0,1} \\ \kappa _{1,0} & \kappa _{1,1} & \ldots & \kappa _{1,l_{Y}} & \mb{0} & \kappa _{1,l_{Y}} & \ldots & \kappa _{1,1} \\ \vdots \\ \kappa _{l_{X},0} & \kappa _{l_{X},1} & \ldots & \kappa _{l_{X},l_{Y}} & \mb{0} & \kappa _{l_{X},l_{Y}} & \ldots & \kappa _{l_{X},1} \\ \mb{0} & \mb{0} & \ldots & \mb{0} & \mb{0} & \mb{0} & \ldots & \mb{0} \\ \kappa _{l_{X},0} & \kappa _{l_{X},1} & \ldots & \kappa _{l_{X},l_{Y}} & \mb{0} & \kappa _{l_{X},l_{Y}} & \ldots & \kappa _{l_{X},1} \\ \vdots \\ \kappa _{1,0} & \kappa _{1,1} & \ldots & \kappa _{1,l_{Y}} & \mb{0} & \kappa _{1,l_{Y}} & \ldots & \kappa _{1,1} \end{array} \right]_{p_{X} \times p_{Y}} \]

where $l_{X} = \min (g_{X}-1, \mr{floor}(5h_{X}/\delta _{X})),  p_{X} = 2^{\mr{ceil}(\log _{2}(g_{X}+l_{X}+1))}$, and so forth, and $\kappa _{i,j} = \frac{1}{n}\varphi _{\mb{h}} (i\delta _{X},j\delta _{Y}) \  i = 0,1,\ldots ,l_{X}, \  j = 0,1,\ldots ,l_{Y}$.

The bin count matrix $\bC $ is defined as

\[ \bC = \left[ \begin{array}{ccccccc} {c}_{1,1} & {c}_{1,2} & \ldots & {c}_{1,g_{Y}} & 0 & \ldots & 0\\ {c}_{2,1} & {c}_{2,2} & \ldots & {c}_{2,g_{Y}} & 0 & \ldots & 0\\ \vdots \\ {c}_{g_{X},1} & {c}_{g_{X},2} & \ldots & {c}_{g_{X},g_{Y}} & 0 & \ldots & 0 \\ 0 & 0 & \ldots & 0 & 0 & \ldots & 0 \\ \vdots \\ 0 & 0 & \ldots & 0 & 0 & \ldots & 0 \end{array} \right]_{p_{X} \times p_{Y}} \]

As with the univariate case, the $g_{X} \times g_{Y}$ upper-left corner of the convolution $\bK *\bC $ is the matrix of the estimates $\hat{f}(\mi{grid})$.

Most of the results in this subsection are found in Wand (1994).