The SPP Procedure

Introduction to Point Pattern Analysis

In point pattern analysis, you want to describe characteristics of the events (observations) that compose the pattern. The events are manifestations of a phenomenon or process at random locations. Therefore, your analysis goal is to investigate underlying connections among these events that could explain the phenomenon.

In some cases, events might have additional attributes, known as marks. If a point pattern has a mark, then it is called a marked point pattern. There can be continuous marks or categorical marks, depending on whether the mark attribute takes continuous values or values from a list of discrete levels, respectively. A marked point pattern that has a categorical mark attribute is known as a multitype point pattern. A multitype point pattern is also called a multivariate point pattern because you can view it as a collection of point patterns, one for each type.

To study the events, you use the concept of a study region (also called a study window) to represent the area where the point pattern is defined. The window selection can be a subjective choice, and it can affect the analysis. When the window is a subregion of a larger region where the point process operates, you might need to account for edge effects. This term describes discrepancies that can appear in the analysis because events close to the window edges might have unobserved neighbors outside the window area.

Point pattern analysis often focuses on whether interaction exists among the observations in a spatial point pattern. That is, you test whether the points are scattered around the study region with no particular pattern, or alternatively whether there tends to be more or less clumping of points than you would expect purely from randomness. To this end, you usually test the hypothesis of complete spatial randomness (CSR) in the point pattern. Under CSR, the number of events in any subregion follows a Poisson distribution whose mean is proportional to the area of the subregion (that is, the intensity is constant), and the events do not interact. A point pattern that follows CSR is a realization of a homogeneous Poisson process. Alternatively, a point pattern can demonstrate event interaction, such as clustering or regularity.
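As an illustration of what CSR means in practice, the following minimal Python sketch simulates a homogeneous Poisson process on a rectangular window. It is not PROC SPP syntax; the function name simulate_csr, the window dimensions, and the intensity value are assumptions chosen only for this example.

```python
import random

def simulate_csr(intensity, width, height, rng):
    """Simulate a homogeneous Poisson process on [0, width] x [0, height]."""
    mean_count = intensity * width * height
    # Draw N ~ Poisson(mean_count) by counting the arrivals of a unit-rate
    # exponential clock that occur before time mean_count.
    n = 0
    t = rng.expovariate(1.0)
    while t < mean_count:
        n += 1
        t += rng.expovariate(1.0)
    # Given N, the event locations are independent and uniform on the window.
    return [(rng.uniform(0.0, width), rng.uniform(0.0, height)) for _ in range(n)]

rng = random.Random(42)  # fixed seed so the example is reproducible
events = simulate_csr(intensity=0.01, width=100.0, height=50.0, rng=rng)
print(len(events), "events; expected count is", 0.01 * 100.0 * 50.0)
```

The two steps mirror the definition: a Poisson count with mean equal to intensity times area, followed by locations that are placed independently of one another.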

You can test CSR by using heuristic approaches that use sparse sampling methods in exploratory and summary analysis. Two general approaches to this are as follows:

  • distance methods, where you compare the empirical distribution function (EDF) of distances between events with the distribution function that is expected under the CSR assumption

  • quadrats, where you partition the spatial framework into smaller subregions and study the number of events (also known as the quadrat count) in each subregion
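The distance approach can be sketched in a few lines of Python. The example below is an illustration only, not PROC SPP syntax: it uses nearest-neighbor distances as one possible choice of distance, compares their EDF with the CSR reference 1 - exp(-lambda * pi * r^2), and ignores edge effects. The function names and the toy grid pattern are assumptions.

```python
import math

def nearest_neighbor_distances(points):
    """Distance from each event to its closest other event (no edge correction)."""
    dists = []
    for i, (xi, yi) in enumerate(points):
        dists.append(min(math.hypot(xi - xj, yi - yj)
                         for j, (xj, yj) in enumerate(points) if j != i))
    return dists

def empirical_g(dists, r):
    """EDF of nearest-neighbor distances evaluated at distance r."""
    return sum(d <= r for d in dists) / len(dists)

def csr_g(r, intensity):
    """Nearest-neighbor distance distribution function under CSR."""
    return 1.0 - math.exp(-intensity * math.pi * r * r)

# Toy pattern (an assumption): a regular 5 x 5 grid in a 10 x 10 window.
points = [(1 + 2 * i, 1 + 2 * j) for i in range(5) for j in range(5)]
intensity = len(points) / (10.0 * 10.0)  # events per unit area
dists = nearest_neighbor_distances(points)

for r in (0.5, 1.0, 1.5, 2.0):
    print(r, empirical_g(dists, r), round(csr_g(r, intensity), 3))
```

For this regular pattern the EDF stays at zero until r reaches the grid spacing, well below the CSR curve at short distances, which is the signature of events that are more evenly spaced than randomness would produce. A clustered pattern would show the opposite: an EDF above the CSR curve at short distances.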

The SPP procedure provides options for implementing both of these approaches. For more information, see the sections Testing for Complete Spatial Randomness and Statistics Based on Second-Order Characteristics.
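To make the quadrat idea concrete, here is a minimal Python sketch, offered as an illustration rather than as the procedure's implementation. It counts events in a rectangular grid of equal-sized quadrats and computes a Pearson chi-square statistic against equal expected counts; under CSR this statistic is approximately chi-square distributed with (number of quadrats - 1) degrees of freedom. The function names and the toy data are assumptions.

```python
def quadrat_counts(points, width, height, nx, ny):
    """Count events in an nx-by-ny grid of equal quadrats over the window."""
    counts = [[0] * nx for _ in range(ny)]
    for x, y in points:
        # Clamp so that events on the upper or right edge fall in the last quadrat.
        col = min(int(x / width * nx), nx - 1)
        row = min(int(y / height * ny), ny - 1)
        counts[row][col] += 1
    return counts

def chi_square_statistic(counts):
    """Pearson statistic comparing quadrat counts with equal expected counts."""
    flat = [c for row in counts for c in row]
    expected = sum(flat) / len(flat)
    return sum((c - expected) ** 2 / expected for c in flat)

# Toy patterns (assumptions) in a 10 x 10 window with 2 x 2 quadrats.
clustered = [(1, 1), (1, 2), (2, 1), (2, 2), (3, 1), (1, 3), (3, 3), (2, 3)]
spread = [(2, 2), (3, 3), (7, 2), (8, 3), (2, 7), (3, 8), (7, 7), (8, 8)]

for name, pts in (("clustered", clustered), ("spread", spread)):
    counts = quadrat_counts(pts, 10.0, 10.0, 2, 2)
    print(name, counts, chi_square_statistic(counts))
```

With four quadrats there are 3 degrees of freedom, so a statistic far above the 5% critical value of about 7.81 suggests clumping. With only eight events the chi-square approximation is rough; the toy data are meant to show the mechanics, not to support a real test.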

You can tell a lot about the behavior of a point pattern if you have an expression for the point pattern intensity, which gives the expected number of events per unit area. A simple way to estimate intensity from the point pattern events is to produce kernel density estimates. You can also model the intensity by maximizing suitable pseudolikelihood expressions for the logarithmic intensity. Intensity models can also incorporate information about covariates; together with distance methods, they enable you to examine whether a covariate plays a significant role in the underlying process.
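A kernel intensity estimate can be sketched as follows. This Python example is an illustration under stated assumptions, not the estimator that the procedure uses: it assumes an isotropic Gaussian kernel with a fixed bandwidth, applies no edge correction, and uses a made-up set of events.

```python
import math

def kernel_intensity(points, x, y, bandwidth):
    """Gaussian kernel estimate of intensity at (x, y), without edge correction."""
    norm = 1.0 / (2.0 * math.pi * bandwidth ** 2)
    total = 0.0
    for px, py in points:
        d2 = (x - px) ** 2 + (y - py) ** 2
        # Each event contributes one kernel, so the estimate integrates to the
        # number of events (an intensity), not to 1 (a probability density).
        total += norm * math.exp(-d2 / (2.0 * bandwidth ** 2))
    return total

# Toy pattern (an assumption): a small cluster near (2, 2) and one isolated event.
points = [(1.8, 2.1), (2.0, 2.0), (2.2, 1.9), (2.1, 2.3), (8.0, 8.0)]

print(kernel_intensity(points, 2.0, 2.0, bandwidth=1.0))  # near the cluster
print(kernel_intensity(points, 8.0, 8.0, bandwidth=1.0))  # near the isolated event
```

The bandwidth controls the smoothness of the estimate: small values produce a spiky surface that follows individual events, and large values blur local structure.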

A SAS/STAT procedure that is comparable to PROC SPP is the KDE procedure, which fits the special case of Gaussian bivariate kernels for the purpose of nonparametric density estimation. PROC SPP enables you to perform much more extensive nonparametric intensity estimation by using different types of kernels, and it provides support for adaptive kernel estimation. In addition, PROC SPP enables you to fit parametric inhomogeneous Poisson process models and use a variety of residual diagnostics to perform model validation.