The KRIGE2D Procedure

Theoretical Semivariogram Models

Consider a stochastic spatial process represented by the stationary spatial random field (SRF) {$Z(\bm {s}), \bm {s} \in D \subset \mathcal{R}^2 $} (Christakos, 1992). The VARIOGRAM procedure computes the empirical (also known as sample or experimental) semivariance of $Z(\bm {s})$. Prediction of the spatial process $Z(\bm {s})$ at unsampled locations by techniques such as ordinary kriging requires a theoretical semivariogram or covariance.

When you use PROC VARIOGRAM and PROC KRIGE2D to perform spatial prediction, you must determine a suitable theoretical semivariogram based on the sample semivariogram. Various methods exist to fit semivariogram models, such as least squares, maximum likelihood, and robust methods (Cressie, 1993, section 2.6). You can use PROC VARIOGRAM to perform automated fitting of a semivariogram model with weighted or ordinary least squares. A different approach is manual fitting, in which a theoretical semivariogram is chosen based on visual inspection of the empirical estimate; see, for example, Hohn (1988, p. 25).

In some cases, a plot of the experimental semivariogram suggests that a single theoretical model is inadequate. Nested models, anisotropic models, and the nugget effect increase the scope of theoretical models available. All of these concepts are discussed in this section. The specification of the final theoretical model is provided by the syntax of PROC KRIGE2D.

Figure 49.5 shows the general flow of investigation. The empirical semivariogram is computed after a suitable choice is made for the LAGDISTANCE= and MAXLAGS= options in PROC VARIOGRAM, and possibly the NDIR= option or the DIRECTIONS statement for computations in more than one directions. Potential theoretical models (which can also incorporate nesting, anisotropy, and the nugget effect) are then plotted against the empirical semivariogram and evaluated. A suitable theoretical model is found by using the methodology presented in the section Examples: VARIOGRAM Procedure in the VARIOGRAM procedure.

Eight theoretical models are supported by PROC KRIGE2D: the Gaussian, exponential, Matérn, spherical, cubic, pentaspherical, sine hole effect and power models. See also the section Theoretical Semivariogram Models in the VARIOGRAM procedure. These eight model forms are now examined in more detail: the Gaussian, exponential, and Matérn forms are examined as one group; the spherical, cubic, and pentaspherical as a second group; and the remaining power and sine hole effect models are examined individually. For comparison purposes, the axes in the forms’ illustrations are kept the same across the plots, and the corresponding parameters of the different forms have the same values.

In PROC KRIGE2D the parameters $a_0$ and $c_0$ for all forms correspond to the RANGE= and SCALE= options, respectively, in the MODEL statement. For all model forms, the dimension of $c_0$ is the same as the dimension of the variance of the spatial process $Z(\bm {s})$. For all forms but the power model, the dimension of $a_0$ is length with same units as the distance h in the semivariance $\gamma _ z(h)$. See the section The Power Semivariogram Model for more details about interpretation of the power model $a_0$ parameter.

Figure 49.5: Flowchart for Semivariogram Selection

Flowchart for Semivariogram Selection


The Gaussian Semivariogram Model

The form of the Gaussian model is

\[  \gamma _ z(h) = c_0\left[1-\exp \left(-\frac{h^2}{a_0^2}\right)\right]  \]

The shape is displayed in Figure 49.6, using range $a_0=1$ and scale $c_0=4$.

The vertical line at $h=r_{\epsilon }=\sqrt {3}a_0$ shows the effective (or practical) range as defined by Deutsch and Journel (1992) or the range $\epsilon $ defined by Christakos (1992). The effective range is the h-value where the covariance is approximately 5% of its value at zero. Alternatively, the stationarity assumption implies that the effective range is the h value where the semivariance is approximately 5% of the sill value, as shown in Figure 49.6.

In the Gaussian model the semivariance $\gamma _ z(h)$ approaches the sill asymptotically at $c_0$.

The Exponential Semivariogram Model

The form of the exponential model is

\[  \gamma _ z(h) = c_0\left[1-\exp \left(-\frac{h}{a_0}\right)\right]  \]

The shape is displayed in Figure 49.6, using range $a_0=1$ and scale $c_0=4$.

The vertical line at $h=r_{\epsilon }=3a_0$ is the effective (or practical) range or the range $\epsilon $ (that is, the h-value where the covariance is approximately 5% of its value at zero).

As in the Gaussian model, the sill in this example is at 4.0 variance units (corresponding to $c_0=4$) and is approached asymptotically.

The major distinguishing feature of the Gaussian and exponential forms is the shape in the neighborhood of the origin h = 0, as Figure 49.6 illustrates. In general, small lags are important in determining an appropriate theoretical form based on an empirical semivariogram.

The Matérn Semivariogram Model

The form of the Matérn model is

\[  \gamma _ z(h) = c_0 \left[ 1 - \frac{ 2 }{ \Gamma (\nu ) } \left( \frac{ h \sqrt {\nu } }{ a_0 } \right)^{\nu } K_{\nu } \left( 2 \frac{ h \sqrt {\nu } }{ a_0 } \right) \right]  \]

where $\nu > 0$ is the smoothness factor parameter. Figure 49.6 shows an example of the Matérn form, where range $a_0=1$, scale $c_0=4$, and $\nu = 1.5$.

The Matérn semivariance $\gamma _ z(h)$ is a class of semivariance models that emerge for different values of the smoothing parameter $\nu $. The Matérn form reaches its sill value $c_0$ asymptotically.

The Gaussian and exponential semivariances are two frequently used members of the Matérn class of semivariances. In particular, the exponential semivariance model is derived from the Matérn class of models for $\nu =0.5$. Also, when $\nu \rightarrow \infty $ then the Matérn semivariance gives the Gaussian model. In Figure 49.6 the selected value of $\nu = 1.5$ places the Matérn form in between the Gaussian and the exponential. The Matérn semivariance typically begins to look and behave as the Gaussian for values of $\nu > 10$.

Figure 49.6: Gaussian, Exponential, and Matérn Semivariograms with Parameters $a_0=1$, $c_0=4$, and $\nu =1.5$

Gaussian, Exponential, and Matérn Semivariograms with Parameters a0=1, c0=4, and =1.5


The Spherical Semivariogram Model

The form of the spherical model is

\[  \gamma _ z(h) = \left\{  \begin{array}{lc} c_0\left[\frac{3}{2}\frac{h}{a_0}-\frac{1}{2}(\frac{h}{a_0})^3\right] &  \mbox{for $h \le a_0$} \\ c_0 &  \mbox{for $h > a_0$} \end{array} \right.  \]

The shape is displayed in Figure 49.7, using range $a_0=1$ and scale $c_0=4$.

The vertical line at h = 1 shows the range $a_0$ of the model.

In the case of the spherical model, $\gamma _ z(h)$ actually reaches the sill value at $c_0$, unlike the Gaussian and exponential types where the sill is a horizontal asymptote.

The Cubic Semivariogram Model

The form of the cubic model is

\[  \gamma _ z(h) = \left\{  \begin{array}{lc} c_0 \left[ 7(\frac{h}{a_0})^2 - \frac{35}{4}(\frac{h}{a_0})^3 + \frac{7}{2}(\frac{h}{a_0})^5 - \frac{3}{4}(\frac{h}{a_0})^7 \right] &  \mbox{for $h \le a_0$} \\ c_0 &  \mbox{for $h > a_0$} \end{array} \right.  \]

The cubic form shape is displayed in Figure 49.7, using range $a_0=1$ and scale $c_0=4$.

The vertical line at h = 1 shows the range $a_0$ of the model.

Similarly to the spherical model, the cubic model, $\gamma _ z(h)$ reaches the sill value at $c_0$ and maintains this value after a distance h equal to the model range.

The Pentaspherical Semivariogram Model

The form of the pentaspherical model is

\[  \gamma _ z(h) = \left\{  \begin{array}{lc} c_0 \left[ \frac{15}{8}\frac{h}{a_0}- \frac{5}{4}(\frac{h}{a_0})^3 + \frac{3}{8}(\frac{h}{a_0})^5 \right] &  \mbox{for $h \le a_0$} \\ c_0 &  \mbox{for $h > a_0$} \end{array} \right.  \]

The pentaspherical form shape is displayed in Figure 49.7, using range $a_0=1$ and scale $c_0=4$.

The vertical line at h = 1 shows the range $a_0$ of the model.

The pentaspherical semivariance behaves like the spherical and cubic semivariances, in that $\gamma _ z(h)$ increases with distance until it reaches the sill value $c_0$ at the distance h equal to the model range $a_0$.

Figure 49.7 accents the differences in the behavior of the featured semivariances. Specifically, the cubic and pentaspherical forms reach the sill value faster than the spherical form. Also, the spherical and pentaspherical forms exhibit a more linear behavior at distances close to the origin h = 0.

Figure 49.7: Spherical, Cubic, and Pentaspherical Semivariograms with Parameters $a_0=1$ and $c_0=4$

Spherical, Cubic, and Pentaspherical Semivariograms with Parameters a0=1 and c0=4


The Sine Hole Effect Semivariogram Model

The form of the sine hole effect model is

\[  \gamma _ z(h) = c_0 \left[ 1 - \frac{\sin (\pi h / a_0)}{\pi h / a_0} \right]  \]

Figure 49.8 shows an example of the sine hole effect form, where range $a_0=1$ and scale $c_0=4$.

The vertical line at h = 1 shows the range $a_0$ of the model.

The sine hole effect semivariance $\gamma _ z(h)$ increases with distance. It has the distinct characteristic that it reaches the sill at a distance $h=a_0$ equal to the model range and then it oscillates around the sill value with a decreasing amplitude as it moves to higher values of h.

The Power Semivariogram Model

The form of the power model is

\[  \gamma _ z(h) = c_0h^{a_0}  \]

For this model, the parameter $a_0$ is known as the power exponent. This is a dimensionless quantity which must range within $0 \leq a_0 < 2$ so that the power model is a permissible semivariance model.

The KRIGE2D procedure enables you to specify power exponent values that are outside this range when you also explicitly specify the POWNOBOUND option in the MODEL statement. However, parameter values equal to or greater than 2 can result in singular covariance matrices or negative prediction errors.

For the special case of $a_0 = 1$ the form yields a straight line. In this case the power model reduces to the linear model. The parameter $c_0$ designates the slope of the power form and has dimensions of the variance as in the other models.

The power model has no sill; this differentiates it from the rest of the models presented earlier. Spatial correlation that is described by a power model indicates that the stochastic process variance increases constantly with distance. The shape of the power model with $a_0=0.4$ and $c_0=4$ is displayed in Figure 49.8.

Figure 49.8: Sine Hole Effect Semivariogram with Range $a_0=1$ and Scale $c_0=4$, and Power Semivariogram with Exponent $a_0=0.4$ and Slope $c_0=4$

Sine Hole Effect Semivariogram with Range a0=1 and Scale c0=4, and Power Semivariogram with Exponent a0=0.4 and Slope c0=4


For comparison purposes, Figure 49.9 displays all eight semivariance forms that you can use with PROC KRIGE2D. The figure displays a composition of the different forms with the parameter values selected earlier throughout this section. Depending on the empirical semivariogram, these models provide you with flexibility to select an appropriate theoretical semivariance model for prediction.

Figure 49.9: Semivariogram Forms Used in PROC KRIGE2D

Semivariogram Forms Used in PROC KRIGE2D


Nested Models

For a given set of spatial data, a plot of an experimental semivariogram might not seem to fit any of the individual theoretical models. In such a case, you might obtain a more accurate fit if you consider your covariance model to be the sum of two or more covariance structures. Such covariance models are called nested models. Nesting is common in geologic applications where correlations can exist at different length scales. At small lag distances h, the smaller scale correlations dominate, while the large scale correlations dominate at larger lag distances.

Nested models are permissible covariances if they are the sum of permissible models. Therefore, you can include in a sum any combination of the models presented in the preceding subsections and produce permissible covariance models. As an illustration, consider two semivariogram models: an exponential and a spherical,

\[  \gamma _{z,1}(h) = c_{0,1}\exp (-\frac{h}{a_{0,1}})  \]

and

\[  \gamma _{z,2}(h) = \left\{  \begin{array}{lc} c_{0,2}\left[\frac{3}{2}\frac{h}{a_{0,2}}-\frac{1}{2}(\frac{h}{a_{0,2}})^3\right], &  \mbox{for $h \le a_{0,2}$} \\ c_{0,2}, &  \mbox{for $h > a_{0,2}$} \end{array} \right\}   \]

with $c_{0,1}=1, a_{0,1}=2.5, c_{0,2}=2$, and $a_{0,2}=1$. If both of these correlation structures are present in a spatial process {$Z(\bm {s}), \bm {s} \in D$}, then the semivariance $\gamma _{z}(h)$ of this process can be expressed as

\[  \gamma _{z}(h) = \gamma _{z,1}(h) + \gamma _{z,2}(h)  \]

In this case, the experimental semivariogram $\gamma _{z}(h)$ for the process $Z(\bm {s})$ resembles the semivariogram of the sum of $\gamma _{z,1}(h)$ and $\gamma _{z,2}(h)$. This is illustrated in Figure 49.10.

The sum of $\gamma _{z,1}(h)$ and $\gamma _{z,2}(h)$ in Figure 49.10 does not resemble any single theoretical semivariogram; however, its shape at h = 1 is similar to a spherical form. The asymptotic approach to a sill at three variance units, along with the shape around h = 0, indicates an exponential structure. The sill value $c_{0}$ of the sum is the sum of the individual sills $c_{0,1}=1$ and $c_{0,2}=2$. In general, a nested model has a sill equal to the sum of the sills of its nested structures plus the nugget effect, if present.

See Hohn (1988, p. 38ff) for further examples of nested correlation structures.

Figure 49.10: Sum of Exponential and Spherical Structures at Different Scales

Sum of Exponential and Spherical Structures at Different Scales