
 The LOESS Procedure

## Example 50.4 El Niño Southern Oscillation

The following data set contains measurements of monthly averaged atmospheric pressure differences between Easter Island and Darwin, Australia, for a period of 168 months (National Institute of Standards and Technology; 1998):

data ENSO;
   input Pressure @@;
   Month=_N_;
   format Pressure 4.1;
   format Month 3.0;
   datalines;
12.9  11.3  10.6  11.2  10.9   7.5   7.7  11.7
12.9  14.3  10.9  13.7  17.1  14.0  15.3   8.5
5.7   5.5   7.6   8.6   7.3   7.6  12.7  11.0
12.7  12.9  13.0  10.9  10.4  10.2   8.0  10.9
13.6  10.5   9.2  12.4  12.7  13.3  10.1   7.8
4.8   3.0   2.5   6.3   9.7  11.6   8.6  12.4
10.5  13.3  10.4   8.1   3.7  10.7   5.1  10.4
10.9  11.7  11.4  13.7  14.1  14.0  12.5   6.3
9.6  11.7   5.0  10.8  12.7  10.8  11.8  12.6
15.7  12.6  14.8   7.8   7.1  11.2   8.1   6.4
5.2  12.0  10.2  12.7  10.2  14.7  12.2   7.1
5.7   6.7   3.9   8.5   8.3  10.8  16.7  12.6
12.5  12.5   9.8   7.2   4.1  10.6  10.1  10.1
11.9  13.6  16.3  17.6  15.5  16.0  15.2  11.2
14.3  14.5   8.5  12.0  12.7  11.3  14.5  15.1
10.4  11.5  13.4   7.5   0.6   0.3   5.5   5.0
4.6   8.2   9.9   9.2  12.5  10.9   9.9   8.9
7.6   9.5   8.4  10.7  13.6  13.7  13.7  16.5
16.8  17.1  15.4   9.5   6.1  10.1   9.3   5.3
11.2  16.6  15.6  12.0  11.5   8.6  13.8   8.7
8.6   8.6   8.7  12.8  13.2  14.0  13.4  14.8
;

The following PROC SGPLOT statements produce the simple scatter plot of these data, displayed in Output 50.4.1.

proc sgplot data=ENSO;
   scatter y=Pressure x=Month;
run;

Output 50.4.1 Scatter Plot of ENSO Data

You can compute a loess fit and obtain graphical results for these data by using the following statements:

ods graphics on;

proc loess data=ENSO plots=residuals(smooth);
   model Pressure=Month;
run;

The "Smoothing Criterion" and "Fit Summary" tables are shown in Output 50.4.2, and the fit plot is shown in Output 50.4.3.

Output 50.4.2 Output from PROC LOESS
The LOESS Procedure
Dependent Variable: Pressure

Optimal Smoothing Criterion

| AICC    | Smoothing Parameter |
|---------|---------------------|
| 3.41105 | 0.22321             |

The LOESS Procedure
Selected Smoothing Parameter: 0.223
Dependent Variable: Pressure

| Fit Summary                  |            |
|------------------------------|------------|
| Fit Method                   | kd Tree    |
| Blending                     | Linear     |
| Number of Observations       | 168        |
| Number of Fitting Points     | 33         |
| kd Tree Bucket Size          | 7          |
| Degree of Local Polynomials  | 1          |
| Smoothing Parameter          | 0.22321    |
| Points in Local Neighborhood | 37         |
| Residual Sum of Squares      | 1654.27725 |
| Trace[L]                     | 8.74180    |
| GCV                          | 0.06522    |
| AICC                         | 3.41105    |
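
The criterion values in the Fit Summary can be checked by hand. The following worked calculation is a sketch that assumes the bias-corrected AIC and the GCV statistic take the forms shown; these formulas are not stated in this example and are used here as an assumption. The inputs $n = 168$, $\mathrm{RSS} = 1654.27725$, and $\mathrm{Trace}(L) = 8.74180$ are read from the table.

$$
\hat{\sigma}^2 = \frac{\mathrm{RSS}}{n} = \frac{1654.27725}{168} \approx 9.84689
$$

$$
\mathrm{AICC} = \log(\hat{\sigma}^2) + 1 + \frac{2\,(\mathrm{Trace}(L) + 1)}{n - \mathrm{Trace}(L) - 2}
\approx 2.28716 + 1 + \frac{19.4836}{157.2582} \approx 3.41105
$$

$$
\mathrm{GCV} = \frac{n\,\hat{\sigma}^2}{(n - \mathrm{Trace}(L))^2}
= \frac{1654.27725}{(159.2582)^2} \approx 0.06522
$$

Both results agree with the tabulated AICC and GCV values to the digits shown, which suggests that the assumed forms are the ones in use.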

Output 50.4.3 Oversmoothed Loess Fit for the ENSO Data

These weather-related data should exhibit an annual cycle. However, the loess fit in Output 50.4.3 indicates a longer cycle but no annual cycle, which suggests that the fit is oversmoothed. One way to detect oversmoothing is to look for patterns in the fit residuals. With ODS Graphics enabled, PROC LOESS produces a scatter plot of the residuals versus each regressor in the model. To make patterns in these scatter plots easier to see, it is useful to superimpose a nonparametric fit on them. You can request this by specifying the SMOOTH suboption of the PLOTS=RESIDUALS option in the PROC LOESS statement. The superimposed fit is itself a loess fit, computed independently of the loess fit that produced the residuals.

With the superimposed loess fit shown in Output 50.4.4, you can clearly identify an annual cycle in the residuals, which confirms that the loess fit for the ENSO data is oversmoothed. What accounts for this poor fit?

Output 50.4.4 Residuals for the Loess Fit for the ENSO Data
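
A similar residual plot can also be built by hand, which makes it explicit that the superimposed smooth is a separate fit. The following statements are a minimal sketch, not the method PROC LOESS uses internally. They assume that the RESIDUAL option in the MODEL statement adds a variable named Residual to the OutputStatistics ODS table; the data set name EnsoResid is made up for this illustration.

```sas
/* Refit with the default smoothing parameter selection and save the
   per-observation statistics. The variable name Residual in the
   OutputStatistics table is an assumption of this sketch. */
proc loess data=ENSO;
   model Pressure=Month / residual;
   ods output OutputStatistics=EnsoResid;
run;

/* Plot residuals against Month and overlay an independent loess smooth.
   EnsoResid is a hypothetical data set name. */
proc sgplot data=EnsoResid;
   scatter x=Month y=Residual;
   loess x=Month y=Residual / nomarkers;
   refline 0 / axis=y;
run;
```

If the smooth drawn by the LOESS statement of PROC SGPLOT oscillates about the zero reference line with a period of roughly 12 months, that is the same evidence of oversmoothing described above.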

The smoothing parameter value used for the loess fit shown in Output 50.4.3 was chosen by the default method of PROC LOESS, namely a golden section minimization of the AICC criterion over the default range of smoothing parameter values. One possibility is that the golden section search has found a local rather than a global minimum of the AICC criterion. You can test this by refitting with a request for the global minimum, as in the following statements:

proc loess data=ENSO;
   model Pressure=Month/select=AICC(global);
run;

Output 50.4.5 AICC versus Smoothing Parameter Showing Local Minima

The explanation for the oversmoothed fit in Output 50.4.3 is now apparent. Output 50.4.5 shows that the golden section search algorithm found the local minimum that occurs near the value 0.22 of the smoothing parameter rather than the global minimum that occurs near 0.06. Note that if you restrict the range of smoothing parameter values examined to lie below 0.2, then the golden section search finds the global minimum, as the following statements demonstrate:

proc loess data=ENSO;
   model Pressure=Month/select=AICC(range(0.03,0.2));
run;

Output 50.4.6 Selected Smoothing Parameter Value
The LOESS Procedure
Dependent Variable: Pressure

Optimal Smoothing Criterion

| AICC    | Smoothing Parameter |
|---------|---------------------|
| 2.86660 | 0.05655             |
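
The way a golden section search can settle on a local minimum when started from a wide bracket can be reproduced with a toy criterion. The following DATA step is only an illustration: the function is invented for the purpose (a broad minimum at 0.25 plus a narrow, deeper dip near 0.06) and is not the AICC curve of the ENSO data, and the loop is a textbook golden section search rather than the PROC LOESS implementation. For simplicity it recomputes both interior points at every iteration instead of reusing one of them.

```sas
data GoldenDemo;
   phi = (sqrt(5) - 1) / 2;           /* golden ratio conjugate, about 0.618 */
   a = 0;  b = 1;                     /* initial bracket */
   do iter = 1 to 40 while (b - a > 1e-6);
      c = b - phi*(b - a);            /* left interior point  */
      d = a + phi*(b - a);            /* right interior point */
      /* toy criterion evaluated at both interior points */
      fc = (c - 0.25)**2 - 0.05*exp(-(((c - 0.06)/0.02)**2));
      fd = (d - 0.25)**2 - 0.05*exp(-(((d - 0.06)/0.02)**2));
      if fc < fd then b = d;          /* keep the subinterval [a, d] */
      else a = c;                     /* keep the subinterval [c, b] */
   end;
   GoldenSection = (a + b) / 2;       /* midpoint of the final bracket */

   /* exhaustive grid search over the same range, for comparison */
   best = .;
   do s = 0.01 to 1 by 0.01;
      f = (s - 0.25)**2 - 0.05*exp(-(((s - 0.06)/0.02)**2));
      if best = . or f < best then do;
         best = f;
         GridSearch = s;
      end;
   end;
   keep GoldenSection GridSearch;
run;

proc print data=GoldenDemo noobs;
run;
```

With this toy function, the bracket shrinks to an interval that excludes the narrow dip after three iterations, so the golden section search should end near 0.25, while the grid search picks the deeper minimum at 0.06. Restricting the initial bracket to a range around the dip changes the outcome in the same way that the RANGE suboption does for the ENSO fit.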

Output 50.4.6 shows that with the restricted range of smoothing parameter values examined, PROC LOESS finds the global minimum of the AICC criterion. Often you might not know an appropriate range of smoothing parameter values to examine. In such cases, you can use the PRESEARCH suboption of the SELECT= option in the MODEL statement. When you specify this option, PROC LOESS does a preliminary search to try to locate a smoothing parameter value range that contains just the first local minimum of the criterion being used for the selection. The following statements provide an example.

proc loess data=ENSO plots=residuals(smooth);
   model Pressure=Month/select=AICC(presearch);
run;

ods graphics off;

Output 50.4.7 Selected Smoothing Parameter Value When Presearch Is Specified
The LOESS Procedure
Dependent Variable: Pressure

Optimal Smoothing Criterion
AICC Smoothing
Parameter
2.86660 0.05655

Output 50.4.7 shows that with the PRESEARCH suboption specified, PROC LOESS selects the smoothing parameter value that yields the global minimum of the AICC criterion. The fit obtained is shown in Output 50.4.8, and a plot of the residuals with a superimposed loess fit is shown in Output 50.4.9.

Output 50.4.8 Loess Fit Showing an Annual Cycle

Output 50.4.9 Residuals of the Selected Model

In contrast to the residual plot shown in Output 50.4.4, the residuals plotted in Output 50.4.9 do not exhibit any pattern, indicating that the corresponding loess fit has captured all the systematic variation in the data.
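
To view the two candidate fits in a single run, one option is to request both smoothing parameter values explicitly instead of relying on automatic selection. The following statements are a sketch that assumes a list of values in the SMOOTH= option of the MODEL statement; the two values are the ones reported in Output 50.4.2 and Output 50.4.6.

```sas
ods graphics on;

/* Fit the global-minimum (0.05655) and local-minimum (0.22321) smoothing
   parameter values in one PROC LOESS step, with smoothed residual plots */
proc loess data=ENSO plots=residuals(smooth);
   model Pressure=Month / smooth=0.05655 0.22321;
run;

ods graphics off;
```

Because the smoothing parameters are specified directly, no criterion-based search is involved, so the comparison does not depend on which minimum a search would find.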

An interesting question is whether there is some phenomenon captured in the data that would explain the presence of the local minimum near 0.22 in the AICC curve. Note that there is some evidence of a cycle of about 42 months in the oversmoothed fit in Output 50.4.3. You can see this cycle there because the strong annual cycle that appears in Output 50.4.8 has been smoothed out. The physical phenomenon that accounts for the existence of this cycle has been identified as the periodic warming of the Pacific Ocean known as "El Niño."
