The SPP Procedure

Statistics Based on Second-Order Characteristics

Statistics that are based on second-order characteristics include Ripley’s K function, Besag’s L function, and the pair correlation function (also called the g function). To understand why these functions are based on second-order characteristics, see Illian et al. (2008, pp. 223–243). Computing these functions usually involves the pairwise distances between points.

The K function of a stationary point process is defined such that $\lambda K(r)$ is the expected number of points within a distance of r from an arbitrary point of the process. The empirical K function of a set of points is the weighted and renormalized empirical distribution function of the set of pairwise distances between points. The empirical K function can be written as

$\hat{K}(r) = \frac{1}{\hat{\lambda }^{2} |W|}\sum _ i \sum _{j\ne i} \mathbf{1}\{ ||x_ i - x_ j|| \leq r\} e(x_ i,x_ j;r)$

where $e(x_ i,x_ j;r)$ is the border edge correction that is described in the section Border Edge Correction for Distance Functions.
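The double sum in the empirical K function can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the PROC SPP implementation: the edge-correction weight $e(x_ i,x_ j;r)$ is set to 1 for every pair (no correction), the intensity is estimated as $\hat{\lambda } = n/|W|$, and the function name `empirical_k` and the sample points are hypothetical.

```python
import math

def empirical_k(points, r, window_area):
    """Uncorrected empirical K function at distance r.

    Assumption: the edge-correction weight e(x_i, x_j; r) is 1 for all pairs.
    """
    n = len(points)
    lam_hat = n / window_area  # assumed intensity estimate, n / |W|
    count = 0
    for i, (xi, yi) in enumerate(points):
        for j, (xj, yj) in enumerate(points):
            # Ordered pairs with j != i, so each unordered pair is counted twice.
            if i != j and math.hypot(xi - xj, yi - yj) <= r:
                count += 1
    return count / (lam_hat ** 2 * window_area)

# Illustrative data: the four corners of a unit square, |W| = 1.
pts = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
k_at_1 = empirical_k(pts, 1.0, 1.0)  # 8 ordered pairs / (4**2 * 1) = 0.5
```

The nested loop is quadratic in the number of points, which is acceptable for a sketch but would normally be replaced by a spatial index for large patterns.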

For a homogeneous Poisson process, $K_ P(r)$ can be written as

$K_ P(r) = \pi r^{2}$

Exploratory analysis usually involves computing both the empirical K function, $\hat{K}(r)$, and the K function for a Poisson process, $K_ P(r)$. Comparing $\hat{K}(r)$ with $K_ P(r)$ might indicate clustering, if $\hat{K}(r) > K_ P(r)$, or regularity, if $\hat{K}(r) < K_ P(r)$.

Besag’s L function is a transformation of the K function and is defined as

$L(r) = \sqrt {\frac{K(r)}{\pi } }$

For a homogeneous Poisson process, $L_ P(r) = r$.
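The transformation is simple enough to check numerically. The following sketch, with the hypothetical helper names `besag_l` and `poisson_k`, applies the L transformation to the Poisson K function and confirms that it returns the distance itself.

```python
import math

def poisson_k(r):
    """K function of a homogeneous Poisson process: pi * r^2."""
    return math.pi * r ** 2

def besag_l(k_value):
    """Besag's L transformation of a K value: sqrt(K / pi)."""
    return math.sqrt(k_value / math.pi)

# For the Poisson case, L(r) recovers r up to floating-point rounding.
differences = [besag_l(poisson_k(r)) - r for r in (0.1, 0.5, 2.0)]
```

Because the Poisson benchmark becomes the straight line $L(r) = r$, departures of an empirical L function from that line are often easier to read than departures of $\hat{K}(r)$ from a parabola.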

The pair correlation function, g(r), can also be expressed as a transformation of the K function:

$g(r) = \frac{K'(r)}{2 \pi r}$
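As a quick consistency check, substituting the Poisson form $K_ P(r) = \pi r^{2}$ into this expression gives

$K_ P'(r) = 2 \pi r$

$g_ P(r) = \frac{2 \pi r}{2 \pi r} = 1$

so the pair correlation function of a homogeneous Poisson process is constant at 1 for every distance $r > 0$. The subscript in $g_ P$ is notation introduced here only for this check.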

Illian et al. (2008), Stoyan (1987), and Fiksel (1988) suggest an alternative expression for $g(r)$:

$g(r) = \rho (r)/ \lambda ^{2}$

where $\rho (r)$ is the second-order product density function. Cressie and Collins (2001) provide the following expression for $\rho (r)$:

$\rho (r) = \frac{\hat{\lambda }^{2} K'(r)}{2\pi r}$

where $\hat{\lambda }^{2} K'(r)$ can be written as a kernel estimate,

$\hat{\lambda }^{2} K'(r) = \frac{1}{a}\sum _{i=1}^{n}\sum _{j\ne i}k_ h (||x_ i-x_ j||-r)$

where $a$ is the area, $k_ h(u) = k(u/h) / h$, and $k(\cdot )$ is a kernel such as the uniform kernel or the Epanechnikov kernel (Silverman, 1986). PROC SPP uses the version that is based on the uniform kernel; for more information about the uniform kernel, see the section Nonparametric Intensity Estimation. Based on the formula for the second-order product density $\rho (r)$ in terms of the kernel estimate, Stoyan (1987) gives an edge-corrected kernel estimate for $\rho (r)$ as

$\rho (r) = \frac{1}{2\pi r }\sum _ i \sum _{j\ne i} \frac{k_ h(||x_ i - x_ j||-r)}{a(W_ i \cap W_ j)}$

Based on the preceding expression for the product density and a planar version of Moller and Waagepetersen (2004), $g(r)$ can be written as

$g(r) = \frac{\rho (r)}{\hat{\lambda }^{2}} = \frac{1}{2\pi r \hat{\lambda }^{2} |W|} \sum _ i \sum _{j\ne i}\frac{k_ h(||x_ i - x_ j||-r)}{|W \cap W_{i-j}|}$

A border-edge-corrected version of $g(r)$ can be written as

$g(r) = \frac{1}{2\pi r \hat{\lambda }} \frac{\sum _ i \sum _{j\ne i} k_{h}(||x_ i-x_ j||-r)}{\sum _ i\mathbf{1}\{ b_ i \geq r\} }$

where $x_ i$ and $x_ j$ are points within the boundary at a distance greater than or equal to $r$; $b_ i$ is the distance from $x_ i$ to $\partial W$, the boundary of $W$; and $k_ h(u) = k(u/h)/ h$ for a kernel $k(\cdot )$, such as the uniform kernel or the Epanechnikov kernel. For more information about the uniform kernel, see the section Nonparametric Intensity Estimation. For a homogeneous Poisson process, $g(r) = 1$. For any point pattern, values of $g(r)$ greater than 1 indicate clustering or attraction at distance $r$, whereas values of $g(r)$ less than 1 indicate regularity.
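The border-edge-corrected estimate can be sketched as follows. This is an illustration under several assumptions, not the PROC SPP implementation: the window is an axis-aligned rectangle, the kernel is the one-dimensional uniform kernel $k(u) = 1/2$ for $|u| \leq 1$ (the exact kernel scaling that PROC SPP uses is not shown here), $\hat{\lambda } = n/|W|$, and only the outer index $i$ is restricted to points with $b_ i \geq r$, which is one reading of the formula. The names `uniform_kernel` and `border_corrected_g` are hypothetical.

```python
import math

def uniform_kernel(u):
    """Assumed 1-D uniform kernel: 1/2 on [-1, 1], 0 elsewhere."""
    return 0.5 if abs(u) <= 1.0 else 0.0

def border_corrected_g(points, r, h, window):
    """Border-corrected pair correlation estimate at distance r.

    window = (x0, y0, x1, y1) is an assumed rectangular study region;
    h is the kernel bandwidth.
    """
    x0, y0, x1, y1 = window
    area = (x1 - x0) * (y1 - y0)
    lam_hat = len(points) / area  # assumed intensity estimate, n / |W|

    def border_distance(p):
        # b_i: distance from the point to the nearest edge of the rectangle.
        return min(p[0] - x0, x1 - p[0], p[1] - y0, y1 - p[1])

    # Points that contribute to the denominator, sum_i 1{b_i >= r}.
    interior = [i for i, p in enumerate(points) if border_distance(p) >= r]
    if not interior:
        return float("nan")  # no point is far enough from the border

    total = 0.0
    for i in interior:
        xi, yi = points[i]
        for j, (xj, yj) in enumerate(points):
            if j != i:
                d = math.hypot(xi - xj, yi - yj)
                total += uniform_kernel((d - r) / h) / h  # k_h(d - r)
    return total / (2.0 * math.pi * r * lam_hat * len(interior))
```

Returning NaN when no point satisfies $b_ i \geq r$ is a design choice of this sketch; it signals that the estimate is undefined at that distance rather than silently reporting zero.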