The SSM Procedure

Example 27.3 Backcasting, Forecasting, and Interpolation

This example illustrates how you can do model-based extrapolation (backcasting, forecasting, or interpolation) of a response variable. All you need to do is augment the input data set with the relevant ID and predictor information and assign missing values to the response variable at the points where you want predictions. The following DATA step creates one such augmented data set from a well-known data set that contains recordings of the Nile River water level measured between the years 1871 and 1970. Suppose you want to backcast the Nile water level for the two years before 1871, forecast it for the two years after 1970, and interpolate its value for the year 1921. For illustration purposes, the 1921 value is assumed to be missing in the available data set.

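data Nile;
   input level @@;
   /* _n_ = 1 maps to 1869, two years before the first recording */
   year = intnx( 'year', '1jan1869'd, _n_-1 );
   format year year4.;
   /* treat the 1921 recording as missing so that it is interpolated */
   if year = '1jan1921'd then level=.;
datalines;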
. .
1120  1160  963  1210  1160  1160  813  1230   1370  1140
995   935   1110 994   1020  960   1180 799    958   1140
1100  1210  1150 1250  1260  1220  1030 1100   774   840
874   694   940  833   701   916   692  1020   1050  969
831   726   456  824   702   1120  1100 832    764   821
768   845   864  862   698   845   744  796    1040  759
781   865   845  944   984   897   822  1010   771   676
649   846   812  742   801   1040  860  874    848   890
744   749   838  1050  918   986   797  923    975   815
1020  906   901  1170  912   746   919  718    714   740
. .
;
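
As a quick check of the augmentation, the following sketch lists the observations whose response is missing. It assumes only the Nile data set created by the preceding DATA step. With the DATA step as written, the listing should contain five rows, for the years 1869, 1870, 1921, 1971, and 1972.

```sas
/* List the rows whose response is missing: these are the points
   that the model will backcast, interpolate, or forecast. */
proc print data=Nile;
   where level = .;
   var year level;
run;
```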

It is also known that for this time span the Nile water level can be reasonably modeled as the sum of a random walk trend, a level shift in the year 1899, and an observation error. The following statements fit this model to the data:

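proc ssm data=Nile;
   id year interval=year;
   /* dummy regressor: 0 before 1899, 1 from 1899 onward */
   shift1899 = ( year >= '1jan1899'd );
   /* random walk trend named rw */
   trend rw(rw);
   /* white noise observation error named wn */
   irregular wn;
   model level = shift1899 rw wn / print=smooth;
   output out=nileOut;
quit;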
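
The model that these statements specify can be written out as follows. This is a sketch in common state space notation. The symbols $y_t$, $x_t$, $\beta$, $\mu_t$, $\epsilon_t$, $\eta_t$, and the two variance names are assumptions introduced here for illustration and do not appear in the procedure statements.

```latex
\begin{aligned}
y_t   &= \beta \, x_t + \mu_t + \epsilon_t,
      & \epsilon_t &\sim \mathrm{WN}(0, \sigma^2_{\epsilon}) \\
\mu_t &= \mu_{t-1} + \eta_t,
      & \eta_t &\sim \mathrm{WN}(0, \sigma^2_{\eta}) \\
x_t   &= \begin{cases} 1 & \text{if } t \ge 1899 \\ 0 & \text{otherwise} \end{cases}
\end{aligned}
```

Here $y_t$ corresponds to level, $x_t$ to shift1899, $\mu_t$ to the trend rw, and $\epsilon_t$ to the irregular component wn. Because level is missing in 1869, 1870, 1921, 1971, and 1972, the values of $y_t$ at those points are predicted from the remaining observations.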

The model-based interpolated and extrapolated values of the Nile water level are shown in Output 27.3.1, which is produced by using the PRINT=SMOOTH option in the MODEL statement.

Output 27.3.1: Interpolated and Extrapolated Nile Water Level

The SSM Procedure

Full-Sample Prediction of Missing Values for level

Obs   ID     Estimate   Standard Error   95% Confidence Limits
  1   1869       1098              130      843      1353
  2   1870       1098              130      843      1353
 53   1921        851              129      599      1104
103   1971        851              129      599      1104
104   1972        851              129      599      1104
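
To work with the predictions outside the printed table, you can inspect the data set named in the OUTPUT statement. The following is a minimal sketch. It assumes that nileOut retains the response variable level alongside the estimates, and it deliberately does not name the estimate columns, because their exact names are not shown in this example. PROC CONTENTS lists whatever variables the procedure wrote.

```sas
/* List the variables that PROC SSM wrote to the output data set. */
proc contents data=nileOut;
run;

/* Print the rows where the response was missing
   (assumes nileOut carries the level variable). */
proc print data=nileOut;
   where level = .;
run;
```

Once you know the variable names from the PROC CONTENTS listing, you can add a VAR statement to the PROC PRINT step to restrict the display to the columns of interest.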