The SPP Procedure

PROC SPP Statement

  • PROC SPP options;

The PROC SPP statement invokes the SPP procedure. Table 105.1 summarizes the options available in the PROC SPP statement.

Table 105.1: Options Available in the PROC SPP Statement

Option       Description
DATA=        Specifies the input data set
EDGECORR=    Requests edge correction in the analysis
NODUP=       Specifies inclusion or exclusion of collocated observations
NOPRINT      Suppresses normal display of results
PLOTS        Specifies the plot display and options
SEED=        Specifies the seed value for the random number generator


You can specify the following options in the PROC SPP statement.

DATA=SAS-data-set

specifies a SAS data set that contains the x and y coordinate variables of one or more point patterns, associated mark variables, and event identifiers. Mark variables and event identifiers are specified using MARK= and EVENT= options, respectively, in the PROCESS statement. If your analysis involves covariates, you must also include them in the DATA= data set. When you include covariates, you must identify individual point patterns by specifying the EVENT= option in the PROCESS statement. You must specify a DATA=SAS-data-set; there is no default.
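
As a minimal sketch of how the DATA= option fits into a complete step, the following statements read events and covariates from one data set. They assume the SASHELP.BEI sample data set (variables X, Y, Trees, Gradient, and Elevation) is available at your site. The PROCESS and TREND statement syntax shown here is not described in this section and should be checked against the documentation of those statements.

```sas
/* Sketch only: assumes SASHELP.BEI is installed and that the     */
/* PROCESS and TREND syntax below matches your SAS/STAT release.  */
proc spp data=sashelp.bei;
   /* EVENT= identifies the point pattern, which is required      */
   /* because the data set also contains covariate records.       */
   process trees = (x, y / area=(0,0,1000,500) event=Trees);
   /* Covariates come from the same DATA= data set.               */
   trend grad = field(x, y, gradient);
   trend elev = field(x, y, elevation);
run;
```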

EDGECORR=ON | OFF

specifies whether you want to correct edge effects in the distance function computations and kernel density estimation. Edge correction is not applicable for the J function. For more information about how SPP implements edge correction, see the section Border Edge Correction for Distance Functions. By default, EDGECORR=ON.

NODUP=nodup-option

specifies whether to eliminate multiple records of data that have the same pairs of coordinates in the DATA= data set. When multiple such records exist among observations of the event, or among observations of the same covariate variable, they are known as duplicates. For example, if two or more event records feature the same coordinates, then your data contain duplicates. However, if your data contain a record of an event and a record of a covariate that happen to be sampled at the same coordinates, then they are not duplicates.

The analysis of a spatial point pattern usually requires that no two events can share the same location. If your data include such duplicates, this option enables you to deal with them in different ways. You can specify the following values:

TRUE <(true-suboption)>

removes duplicates from the analysis. You can also specify the following true-suboption:

KEEP=AVG | ONE

specifies how to handle the removal of duplicate records. You can specify the following values:

AVG

removes all but one record from a set of records that contain duplicate coordinates. In addition, if the duplicates are records of a numeric mark or covariate, then the average attribute value of all duplicate records is assigned to the single record that is retained. If any of the duplicate records has a missing value for the numeric mark or covariate, then it does not contribute to the average. Character variables ignore the KEEP=AVG suboption and retain only the last value in any series of collocated records.

ONE

keeps only a single record out of multiple records that have the same coordinates. When you specify KEEP=ONE, PROC SPP retains the last record in any series of collocated records.

By default, KEEP=ONE.

FALSE

retains and uses all duplicates in the analysis.

If mark or covariate variables are included in the analysis, the NODUP= option specification applies the same mode of action to each individual variable. If PROC SPP finds duplicates, then it issues a note. By default, NODUP=TRUE(KEEP=ONE).
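
The following sketch illustrates the NODUP= option on a small, made-up data set (WORK.PTS and its variable Diam are hypothetical). Two records share the coordinates (1,1) and two share (4,2). According to the description above, NODUP=TRUE(KEEP=AVG) should retain one record per location, assign the average mark 15 to the location (1,1), and assign 30 to the location (4,2), because the missing value does not contribute to the average. The placement of MARK= inside the PROCESS statement is an assumption based on the DATA= description.

```sas
/* Hypothetical data: duplicate coordinates at (1,1) and (4,2). */
data work.pts;
   input x y diam;
   datalines;
1 1 10
1 1 20
2 3 15
4 2 .
4 2 30
3 4 12
;
run;

/* Remove duplicates and average the numeric mark over each set. */
/* MARK= placement in the PROCESS statement is assumed.          */
proc spp data=work.pts nodup=true(keep=avg);
   process p = (x, y / area=(0,0,5,5) mark=diam);
run;
```

Replacing the option with NODUP=FALSE would keep all six records in the analysis.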

NOPRINT

suppresses the normal display of results. This option is useful when you want only to create one or more output data sets with the procedure. Note: This option temporarily disables the Output Delivery System (ODS). For more information, see the section ODS Graphics.

PLOTS <(global-plot-options)> <= plot-request <(options)>>
PLOTS <(global-plot-options)> <= (plot-request <(options)> <... plot-request <(options)>>)>

controls the plots that are produced through ODS Graphics. When you specify only one plot-request, you can omit the parentheses around the plot request. Here are some examples:

plots=none
plots=observ
plots=(observ intensity)
plots(unpack)=observ
plots=(observ(attr=mark) observ(attr=event))

ODS Graphics must be enabled before plots can be requested. For example:

ods graphics on;

For more information about enabling and disabling ODS Graphics, see the section Enabling and Disabling ODS Graphics in Chapter 21: Statistical Graphics Using ODS.

You can specify the following global-plot-options:

EQUATE

produces all plots that have coordinate axes so that both axes use units of equal size. This option is ignored for panel plots.

ONLY

suppresses the default plots. Only plots that are specifically requested are displayed.

UNPACKPANEL
UNPACK

suppresses paneling. By default, multiple plots can appear in some output panels. Specify UNPACKPANEL to get each plot in a separate panel. You can specify PLOTS(UNPACKPANEL) to unpack the default plots. You can also unpack individual panel plots by specifying the UNPACK suboption in an individual plot-request, such as F, G, K, or L.
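
As an illustration of the global-plot-options, the following sketch enables ODS Graphics, suppresses the default plots with ONLY, and equates the axes of the single requested plot. It assumes the SASHELP.BEI sample data set and a PROCESS statement syntax that is not defined in this section.

```sas
ods graphics on;

/* ONLY: show just the requested plot.                       */
/* EQUATE: use equal-sized units on both coordinate axes.    */
proc spp data=sashelp.bei plots(only equate)=(observ(attr=event));
   /* PROCESS syntax is an assumption; see its own section.  */
   process trees = (x, y / area=(0,0,1000,500) event=Trees);
run;

ods graphics off;
```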

You can specify the following individual plot-requests and options:

ALL

produces all appropriate plots. You can also specify other options with ALL.

CSRKSTEST

produces a plot for the Kolmogorov-Smirnov weighted EDF test for complete spatial randomness in the presence of covariates. To request this plot, you must specify the COVTEST statement and include trends on the right side of the COVTEST statement.

EMPTYSPACE <(emptyspace-plot-options)>

produces a plot of the nearest-neighbor distance for every grid node in the window. You can specify the following emptyspace-plot-options:

FILL=ON | OFF

specifies whether to produce a surface plot of the nearest-neighbor distances. By default, FILL=ON.

LINE=ON | OFF

specifies whether to produce a contour line plot of the nearest-neighbor distances. By default, LINE=OFF.

OBS=ON | OFF

specifies whether to produce an overlaid scatter plot of the observations in addition to the nearest-neighbor distances. By default, OBS=OFF.

F <(UNPACK)>

produces a panel of diagnostics for the empty space function F. The F function is the empirical distribution function of observed distances to the nearest observation from any location in the point pattern window. The panel contains four plots: an EDF plot that shows simulation envelopes for CSR, an EDF-CSR difference plot, a PP plot that compares the EDF of the summary statistic, and a confidence interval plot that shows envelopes for the confidence intervals of the summary statistic. If you specify the PLOTS=F option without requesting any distance function calculations in the PROCESS statement, then it is ignored. You can specify the following option:

UNPACK

suppresses paneling of the F function plots and produces each constituent plot in the panel separately.

The F plot is produced when you specify the F option in the PROCESS statement.

G <(UNPACK)>

produces a panel of diagnostics for the nearest-neighbor distance function G. The G function is the empirical distribution function of observed distances to the nearest observation from any other observation in the point pattern window. The panel contains four plots: an EDF plot that shows simulation envelopes for CSR, an EDF-CSR difference plot, a PP plot that compares the EDF of the summary statistic, and a confidence interval plot that shows envelopes for the confidence intervals of the summary statistic. If you specify the PLOTS=G option without requesting any distance function calculations in the PROCESS statement, then it is ignored.

You can specify the following option:

UNPACK

suppresses paneling of the G function plots and produces each constituent plot in the panel separately.

The G plot is produced when you specify the G option in the PROCESS statement.

INTENSITY <(intensity-plot-options)>

produces a plot of the estimated intensity function for every grid node in the window. You can specify the following intensity-plot-options:

EST=KERNEL | FIT

specifies the source to use for the intensity estimate. You can specify the following values:

KERNEL

produces a plot of the kernel density estimate of the intensity. This suboption is incompatible with requests for standard errors in the FILL= and LINE= intensity plot options. If you specify EST=KERNEL and either the FILL=SE suboption or the LINE=SE suboption, then the intensity plot request is ignored.

FIT

produces a plot of the estimated intensity on the basis of a model fit when you fit an intensity model by specifying the MODEL statement.

FILL=INTENSITY | NONE | SE

specifies which type of surface plot to produce. You can specify the following values:

INTENSITY

produces an estimated intensity surface plot.

NONE

produces no surface plot.

SE

produces a standard errors surface plot.

The default behavior depends on the LINE= suboption as follows: If you specify LINE=NONE or entirely omit the LINE= suboption, then FILL=INTENSITY. If you specify LINE=INTENSITY or LINE=SE, then the FILL= suboption is set to the same value as the LINE= suboption.

LINE=INTENSITY | NONE | SE

specifies which type of contour line plot to produce. You can specify the following values:

INTENSITY

produces an estimated intensity contour line plot.

NONE

produces no contour line plot.

SE

produces a standard errors contour line plot.

If you omit the LINE suboption, the behavior depends on the FILL suboption as follows: If you specify FILL=NONE or entirely omit the FILL= suboption, then LINE=INTENSITY. If you specify FILL=INTENSITY or FILL=SE, then the LINE suboption is set to the same value as the FILL suboption.

OBS=ON | OFF

specifies whether to produce an overlaid scatter plot of the observations in addition to the intensity plot. By default, OBS=OFF.

You can specify multiple instances of the INTENSITY plot option to produce intensity plots that have different characteristics. If you specify multiple instances of any of the FILL=, LINE=, or OBS= suboptions in the same INTENSITY plot request, then one plot is produced that honors the last value specified for any of these suboptions. If you explicitly specify (or the suboptions imply) the combination FILL=NONE and LINE=NONE, then the intensity plot is not produced.
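
In the style of the PLOTS= examples earlier in this section, the following option fragments sketch a few INTENSITY requests. They are fragments of the PROC SPP statement, not complete steps, and each one assumes that the corresponding estimate is available: a kernel estimate for EST=KERNEL, or a model fit from the MODEL statement for EST=FIT.

```sas
plots=intensity(est=kernel obs=on)
plots=intensity(est=fit fill=se)
plots=(intensity(fill=intensity line=none) intensity(fill=none line=intensity))
```

The first fragment overlays the observations on the kernel intensity surface. The second requests a standard error surface, which is why it uses EST=FIT rather than EST=KERNEL. The third uses two INTENSITY instances to produce a surface plot and a contour line plot as separate plots.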

J <(UNPACK)>

produces a combined plot of the J function. The J function is the ratio of transformations of the F and G nearest-neighbor functions. The combined plot shows both the confidence intervals for the summary statistic and the simulation envelope for comparison with CSR. You can specify the following option:

UNPACK

produces each constituent J plot separately.

J plots are produced when you specify the J option in the PROCESS statement. If you specify PLOTS=J without specifying the J option in the PROCESS statement, then PLOTS=J is ignored.

K <(UNPACK)>

produces a panel of diagnostics for Ripley’s K function. The K function is the expected number of point pattern observations within distance r of any other observation, divided by the average intensity value of the point pattern. The panel contains four plots: an EDF plot that shows simulation envelopes for CSR, an EDF-CSR difference plot, a PP plot that compares the EDF of the summary statistic, and a confidence interval plot that shows envelopes for the confidence intervals of the summary statistic.

The K plot is produced when you specify the K option in the PROCESS statement. If you specify PLOTS=K without specifying the K option in the PROCESS statement, then PLOTS=K is ignored. You can specify the following option:

UNPACK

suppresses paneling of the K function plots and produces each constituent plot separately.

L <(UNPACK)>

produces a panel of the L function, which is a transformation of the K function. The panel contains four plots: an EDF plot with simulation envelopes for CSR, an EDF-CSR difference plot, a PP plot that compares the EDF of the summary statistic, and a confidence interval plot that shows envelopes for the confidence intervals of the summary statistic.

The L plot is produced when you specify the L option in the PROCESS statement. If you specify the PLOTS=L option without requesting the L option in the PROCESS statement, then PLOTS=L is ignored. You can specify the following option:

UNPACK

suppresses paneling of the L function plots and produces each constituent plot separately.
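
Because each distance function plot is produced only when the matching option appears in the PROCESS statement, a plot request and its PROCESS option are normally specified together. The following sketch requests the F, G, K, and L plots and unpacks the G panel. It assumes the SASHELP.BEI sample data set, and the position of the F, G, K, and L options after the second slash in the PROCESS statement is an assumption that is not confirmed by this section.

```sas
ods graphics on;

/* SEED= fixes the random number stream so that the results  */
/* are reproducible from run to run.                         */
proc spp data=sashelp.bei seed=12345 plots=(F G(unpack) K L);
   /* Each plot request above needs its matching option here; */
   /* a request without a match would be ignored.             */
   process trees = (x, y / area=(0,0,1000,500) event=Trees) / F G K L;
run;
```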

LURKING <(lurking-plot-options)>

requests lurking variable plots, which show the cumulative raw residual with respect to the covariates or the coordinate variables or both. By default, PROC SPP computes lurking variable panel plots with respect to both covariates and coordinates. You can specify the following lurking-plot-options:

ALL

creates lurking variable plots of the model’s covariates and of the coordinate variables that are specified in the PROCESS statement.

COORD

creates lurking variable plots only of the coordinate variables that are specified in the PROCESS statement.

COVAR

creates lurking variable plots only of the covariates and does not create plots with respect to the coordinate variables X and Y.

UNPACK

unpacks the lurking variable panel plots into individual lurking variable plots.

The default is LURKING(ALL).

NONE

suppresses all plots.

OBSERVATIONS <(observations-plot-option)>
OBSERV <(observations-plot-option)>
OBS <(observations-plot-option)>

produces the observed data plot. You can specify the following observations-plot-options:

ATTR=EVENT | MARK

specifies the observations attribute that you want to plot. You can specify the following values:

EVENT

specifies a plot of the locations of the point-pattern event observations.

MARK

specifies a plot of the locations and the mark values of the point-pattern event observations. If you do not specify a mark variable in the PROCESS statement or if the analysis skips the specified mark variable, then the observations plot request is ignored.

PCF <(UNPACK)>

produces a combined plot of the pair correlation function, g. The combined plot shows both the confidence intervals for the summary statistic and the simulation envelope for comparison with CSR.

The PCF plot is produced when you specify the PCF option in the PROCESS statement. If you specify the PLOTS=PCF option without specifying the PCF option in the PROCESS statement, then PLOTS=PCF is ignored. You can specify the following option:

UNPACK

suppresses the combination of different PCF plots into a single plot and produces each constituent plot separately.

RESIDUAL <(residual-plot-options)>

produces a plot of the residual diagnostics. By default, the SPP procedure produces a panel plot that contains smoothed raw residuals, raw residuals, and lurking variable plots with respect to the X and Y coordinates. In addition, you can specify the following residual-plot-options:

TYPE=CUM | RES

specifies the type of residual to be plotted in the lurking variable plots of the coordinate variables. You can specify the following values:

CUM

plots the cumulative residual.

RES

plots a noncumulative residual as a scatter plot.

UNPACK

unpacks the panel plot, which contains smoothed raw residuals, raw residuals, and lurking variable plots with respect to the X and Y coordinates, into four separate plots.
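
The RESIDUAL and LURKING plot requests relate to a fitted intensity model. The following sketch shows one way they might be combined with a MODEL statement. The SASHELP.BEI data set, the TREND and MODEL statement syntax, and the covariate names grad and elev are assumptions. The MODEL statement might also need additional options to request residual diagnostics, so consult its documentation before relying on this form.

```sas
ods graphics on;

/* Sketch only: TREND and MODEL syntax are assumed, not taken */
/* from this section.                                         */
proc spp data=sashelp.bei plots=(residual(type=cum) lurking(covar));
   process trees = (x, y / area=(0,0,1000,500) event=Trees);
   trend grad = field(x, y, gradient);
   trend elev = field(x, y, elevation);
   /* Fit an intensity model so that residuals exist to plot. */
   model trees = grad elev;
run;
```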

TRENDS

produces a plot of all trend covariates. This option is ignored if no trend covariates are specified in the TREND statement.

SEED=seed-value

specifies the seed to use for the random number generator. The SEED= value must be an integer.