The VARIOGRAM Procedure

Example 95.2 An Anisotropic Case Study with Surface Trend in the Data

This example shows how to examine data for nonrandom surface trends and anisotropy. You use simulated data in which the variable is atmospheric ozone (O3) concentration measured in Dobson units (DU). The coordinates are offsets from a point in the southwest corner of the measurement area, with the east and north distances in kilometers (km). The data set contains 300 measurements in a square area of 100 km × 100 km.

The following statements read the data into the ozoneSet data set:

   title 'Semivariogram in Anisotropic Case With Trend Removal Example';
   data ozoneSet;
      input East North Ozone @@;
      datalines;
      34.9 68.2 286  39.2 12.5 270  44.4 37.7 275  90.5 27.0 282 
      91.1 40.8 285  98.6 61.6 294  61.8 26.7 281  64.0 11.5 274
      22.4 26.5 274  89.3 18.3 279  32.3 28.3 274  31.1 53.1 279
      43.0 17.5 272  79.3 42.3 283  99.9 57.9 291   1.8 24.1 273
      81.7 73.5 294  22.9 32.0 273  64.9 67.5 292  76.5 56.3 285
      78.7 11.7 276  61.8 99.3 307  49.1 86.6 299  40.0 35.8 273
      69.3  3.8 278  23.4  9.3 270  66.3 94.3 304  71.3  6.5 275
       9.7 54.4 280  85.2 81.7 300  30.3 60.9 284  94.6 94.3 309
      10.6 10.3 271  73.0 43.0 280   4.9 50.7 280  19.0 79.4 289
       2.4 73.1 287  77.7 25.2 278   8.4 27.1 276  93.5 19.7 279
       0.2 34.5 275  50.4 91.3 302  55.7 26.2 279  50.3  2.3 274
      16.3 84.4 293  19.0  6.9 272  57.1 92.3 303  61.0  0.4 275
      10.7 18.7 271  15.2 43.5 277  67.0 87.4 301  79.0 54.0 285
      36.0 53.3 279  58.3 52.1 282  56.6 79.7 294  40.4 32.4 275
      48.9 64.1 286  54.0 54.9 281  27.5 48.5 279  36.4 30.3 275
      10.5 31.0 273  87.0 39.4 283  47.9 37.5 274  64.7 63.4 288
       0.5 90.8 294  22.8 22.4 275  31.1 78.8 291  93.6 49.8 290
       2.5 39.3 273  83.6 25.6 282  49.8 24.1 278  73.1 91.8 305
      30.5 90.6 297  26.0 61.2 284  58.4 66.2 289  30.5  4.3 273
      38.3 85.6 298  89.2 96.6 309  53.4  6.3 275  27.3 12.8 271
      43.4 56.5 281  99.5 86.9 305  85.8 22.8 281  83.0 10.9 278
      24.8 16.7 271  51.1 18.8 275  59.0 54.3 283  35.5 91.4 298
      18.1 56.0 279  78.0 36.4 277  56.8  6.9 275  21.1 44.5 277
      73.9 75.9 296  54.2  0.1 274  33.2 75.1 290  38.2  3.3 274
      15.2 14.7 272  15.9 84.2 292  60.2 95.2 304   9.8 27.2 276
      91.2 56.4 289  94.7 86.9 303  56.7 49.6 281  24.2  9.5 270
      43.0 17.0 272  85.9 10.7 278  53.9 41.1 276  30.4 63.4 286
      62.8 86.3 299  76.8 24.6 279  31.6 94.0 300  26.9 73.8 287
      18.9 68.4 284  99.4 37.2 285  79.1  3.3 277  34.9 74.7 289
       6.4 33.8 277  48.4 82.2 294  86.0 58.0 289  92.0 60.4 293
      50.2 91.6 300  12.2 38.3 275  72.7 48.9 283  82.7 34.1 279
      77.0 51.0 286  86.6 15.8 278  42.0 42.7 277  99.3  8.2 278
      17.4 70.6 286  11.2 92.4 295  60.2 28.8 280  92.0 73.3 297
      25.3 30.6 273  36.6  8.9 274  34.2  4.4 273  26.6 54.7 278
       1.7 27.4 278  49.6  1.1 275  62.8 89.3 301  28.0 49.3 279
      51.2 75.1 293  59.3 93.5 304  83.6 90.5 304  79.4 87.0 302
      78.0 28.3 281  16.8 19.1 272   9.1 81.2 292  23.7 55.8 277
      75.5 21.3 279  64.4 43.3 279  38.9 98.9 303  22.5 87.9 293
      96.7 37.9 285  92.3 93.9 308  16.9 25.4 273  15.2 61.5 283
      73.8 94.0 306  57.4 97.2 305  73.2  4.9 276  39.2 82.3 294
      95.7 99.4 315  66.0 98.4 306  95.3 26.9 283  45.4 75.3 291
      64.8 15.4 276  69.8 55.4 284  36.3 74.9 290   9.9 22.2 276
      65.8 13.9 276  13.0 82.0 293  95.6 77.2 301  32.5 55.6 279
      45.8 35.5 275  62.2  6.6 274  25.2 51.2 279  92.4  8.1 277
      40.5 35.3 273   9.9  3.9 271  43.5 44.0 278  68.6 61.3 287
      64.2 77.5 296  57.6 81.6 294  69.5 64.7 291  64.3 95.1 304
       2.8 62.4 283  33.2 83.3 294  10.7 71.0 285  24.3 88.2 294
      94.5 32.2 283  21.0 67.6 286  20.1 71.6 286  85.2 71.3 296
      94.8 30.7 283  53.4 92.0 301  81.0 50.0 287  54.6 29.9 277
      71.1 90.1 303  15.2  2.9 271  83.6 17.8 278  76.0 21.8 279
      55.6 37.4 275  86.7 83.7 303  43.6 83.6 295  44.2 31.7 274
      90.0 83.3 300   6.2  0.5 270  42.2 87.7 298  31.7  4.3 273
      91.4 41.2 285  78.0 50.6 286  27.1 56.1 278  72.6 63.9 291
      29.3 49.9 281  49.0 36.9 275  13.9 53.5 280  93.1 83.2 300
      73.0 61.6 289  63.1 27.5 280  38.3 72.5 287  72.7 34.2 277
       6.9 32.3 274  17.1 58.6 280  19.6 94.6 297   2.7 36.5 276
      34.5  5.5 275  98.6 95.9 313   9.1 71.1 285  88.6 55.8 287
      26.8 78.5 289  64.8 66.6 292  59.7 25.7 280  47.3 70.2 288
       6.1 94.4 296  50.5 82.7 296   9.1 41.6 276  86.0 71.0 296
      75.2 69.8 293  73.3 84.8 300  42.5 15.9 274  56.1 76.1 292
      87.9 41.2 285  65.1  9.8 274  79.0 41.2 282  44.6 65.1 287
      54.7 68.3 289  57.0 26.8 279   8.7 12.3 270  33.7 61.9 286
      25.0 55.8 278  69.3 94.9 306  49.2 64.6 287  78.2 93.7 307
      47.9 26.6 277  96.9 51.4 292  39.6 73.4 287  37.9 66.1 285
      94.5 71.4 296  51.6 18.3 276  37.6 73.2 287  68.5 10.7 274
      46.7  9.6 273  87.4 38.9 282  45.6 43.9 277  70.7 76.9 296
      82.8 53.6 287  82.5 55.4 286  37.8  5.1 275  89.8 96.1 309
      63.9  4.9 276   2.0 11.7 270  31.3 59.2 282  93.9 65.3 296
      47.9 93.0 301  29.9 36.0 274  14.6 28.3 274  17.5 70.1 286
       2.6 68.5 282  23.1 12.0 268  36.8 20.4 273  80.9  9.0 276
      39.2  0.0 274  26.2 44.3 276  81.9 12.9 277   3.2 21.4 272
      76.9 76.7 297  88.6  7.7 277   9.7  8.4 273  26.7 91.5 296
      73.8  6.1 276  33.7 39.3 276  64.0 58.4 286   5.7 91.2 295
      85.8 93.8 307  85.8 39.1 281  93.9 63.4 295  53.1 46.3 278   
      51.9 42.9 277  16.8 75.7 288  29.2 66.9 285  37.4 72.5 287
      ;
   run;
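
Before you continue, a quick check can confirm that the data were read as described. The following sketch, which is not part of the original example, uses PROC MEANS to report the number of nonmissing values and the range of each variable. If the DATA step ran as intended, you would expect 300 observations and coordinate values between 0 and 100 km.

   /* Optional sanity check: observation count and range of each variable */
   proc means data=ozoneSet n nmiss min max mean;
      var East North Ozone;
   run;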

The first step is to explore the data set by inspecting the spatial distribution of the data. Run PROC VARIOGRAM and specify the NOVARIOGRAM option in the COMPUTE statement, as follows:

   ods graphics on;
   proc variogram data=ozoneSet; 
      compute novariogram nhc=35;
      coord xc=East yc=North;
      var Ozone;
   run;

The result is a scatter plot of the observed data, shown in Output 95.2.1. The scatter plot suggests an almost uniform spread of the measurements throughout the prediction area. The plot alone does not allow a direct inference about the existence of a surface trend in the data. However, the apparent stratification of ozone values in the northeast–southwest direction might indicate a nonrandom trend.

Output 95.2.1 Ozone Observation Data Scatter Plot
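
One informal way to follow up on the apparent stratification is to look at how the ozone values vary with each coordinate separately. The following sketch is not part of the original example. It assumes only the ozoneSet data set created earlier, and it uses PROC CORR and PROC SGPLOT as a rough screening tool rather than as a formal trend test.

   /* Screening sketch: linear association of Ozone with each coordinate */
   proc corr data=ozoneSet;
      var Ozone;
      with East North;
   run;

   /* Marginal scatter plot of Ozone against the North coordinate */
   proc sgplot data=ozoneSet;
      scatter x=North y=Ozone;
      xaxis label='North (km)';
      yaxis label='Ozone (DU)';
   run;

A strong association with one or both coordinates would support the suspicion of a surface trend, but it does not replace the trend analysis in the following section.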

You need to define the size and count of the data classes by specifying suitable values for the LAGDISTANCE= and MAXLAGS= options, respectively. Compared to the smaller sample of thickness data used in Theoretical Semivariogram Model Fitting, the larger size of the ozoneSet data results in more densely populated distance classes for the same value of the NHC= option. After you experiment with a variety of values for the NHC= option, you can choose a relatively small value for LAGDISTANCE=. A small lag distance in turn allows a large value of MAXLAGS=, so that you obtain many sample semivariogram points within the correlation range of your data. Specifying these values requires some exploration, and you might need to return to this point from a later stage in your semivariogram analysis. For illustration purposes, you specify NHC=35 here.
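
If you want to compare several NHC= values in one run, a small macro can repeat the exploratory step. The following is a minimal sketch, not part of the original example. The macro name tryNhc and the candidate values 20, 35, and 50 are hypothetical, and the PROC VARIOGRAM statements are the same ones shown earlier.

   /* Hypothetical helper: rerun the exploratory step for each NHC= value */
   %macro tryNhc(values);
      %local i nhc;
      %do i = 1 %to %sysfunc(countw(&values));
         %let nhc = %scan(&values, &i);
         title2 "Pairwise distance classes with NHC=&nhc";
         proc variogram data=ozoneSet;
            compute novariogram nhc=&nhc;
            coord xc=East yc=North;
            var Ozone;
         run;
      %end;
      title2;   /* clear the secondary title */
   %mend tryNhc;

   /* Illustrative candidate values only */
   %tryNhc(20 35 50)

Each run produces its own pairwise distance intervals table, so you can compare how the pair counts per class change with the number of classes.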

Your choice of NHC=35 yields the pairwise distance intervals table in Output 95.2.2 and the corresponding histogram in Output 95.2.3.

Output 95.2.2 Pairwise Distance Intervals Table
Pairwise Distance Intervals
Lag Class  Lower Bound  Upper Bound  Number of Pairs  Percentage of Pairs
        0         0.00         2.01               52                0.12%
        1         2.01         6.03              420                0.94%
        2         6.03        10.06              815                1.82%
        3        10.06        14.08             1143                2.55%
        4        14.08        18.10             1518                3.38%
        5        18.10        22.12             1680                3.75%
        6        22.12        26.15             1931                4.31%
        7        26.15        30.17             2135                4.76%
        8        30.17        34.19             2285                5.09%
        9        34.19        38.21             2408                5.37%
       10        38.21        42.24             2551                5.69%
       11        42.24        46.26             2444                5.45%
       12        46.26        50.28             2535                5.65%
       13        50.28        54.30             2487                5.55%
       14        54.30        58.33             2460                5.48%
       15        58.33        62.35             2391                5.33%
       16        62.35        66.37             2302                5.13%
       17        66.37        70.39             2285                5.09%
       18        70.39        74.41             2079                4.64%
       19        74.41        78.44             1786                3.98%
       20        78.44        82.46             1640                3.66%
       21        82.46        86.48             1493                3.33%
       22        86.48        90.50             1243                2.77%
       23        90.50        94.53              925                2.06%
       24        94.53        98.55              710                1.58%
       25        98.55       102.57              421                0.94%
       26       102.57       106.59              274                0.61%
       27       106.59       110.62              200                0.45%
       28       110.62       114.64              120                0.27%
       29       114.64       118.66               55                0.12%
       30       118.66       122.68               35                0.08%
       31       122.68       126.71               14                0.03%
       32       126.71       130.73               11                0.02%
       33       130.73       134.75                2                0.00%
       34       134.75       138.77                0                0.00%
       35       138.77       142.80                0                0.00%
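
The bounds in this table appear to follow a simple pattern. The class width is about 4.02 km, which is consistent with the diagonal of the bounding box of the coordinates divided by NHC=35, and class 0 is half as wide as the other classes. The 300 points form 300 × 299 / 2 = 44,850 pairs, which equals the total of the number of pairs over all classes. The following PROC SQL sketch rebuilds comparable counts by hand under that assumed class definition, which is inferred from the printed bounds rather than taken from the procedure documentation. The data set names ozoneId, bbox, and pairClass are hypothetical.

   /* Assumed class definition, inferred from the printed bounds:       */
   /* width = bounding-box diagonal / NHC, class = floor(d/width + 0.5) */
   data ozoneId;
      set ozoneSet;
      Id = _n_;          /* row identifier so each pair is counted once */
   run;

   proc sql;
      /* Assumed class width for NHC=35 */
      create table bbox as
         select sqrt((max(East) - min(East))**2
                   + (max(North) - min(North))**2) / 35 as Width
         from ozoneSet;

      /* Count the pairs in each assumed distance class */
      create table pairClass as
         select floor(sqrt((a.East - b.East)**2
                         + (a.North - b.North)**2) / w.Width + 0.5) as LagClass,
                count(*) as NumPairs,
                100 * count(*) / 44850 as PctPairs format=6.2
         from ozoneId as a, ozoneId as b, bbox as w
         where a.Id < b.Id
         group by calculated LagClass
         order by LagClass;
   quit;

   proc print data=pairClass noobs;
   run;

Classes that contain no pairs do not appear in pairClass, and counts near class boundaries can differ slightly from the table if the procedure defines its classes differently from this assumption.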

Notice the overall high pair count in the majority of classes in Output 95.2.2. These counts suggest that, even for higher values of NHC=, the classes would still be sufficiently populated for your semivariogram analysis according to the rule of thumb stated in the section Choosing the Size of Classes. Based on the information displayed in Output 95.2.3, you specify a lag distance of 4 km (LAGDISTANCE=4). You can further experiment with smaller lag sizes to obtain more points in your sample semivariogram.

You will return to the MAXLAGS= specification later. The important step now is to investigate the presence of trends in the measurements. The following section suggests how to remove surface trends from your data, and then continues the semivariogram analysis with the detrended data.

Output 95.2.3 Distribution of Pairwise Distances for Ozone Observation Data
