This example illustrates how you can do model-based extrapolation (backcasting, forecasting, or interpolation) of a response variable. You only need to augment the input data set with the relevant ID and predictor information at the time points of interest and assign missing values to the response variable at those points. The following DATA step creates one such augmented data set from a well-known data set that contains yearly measurements of the Nile River water level between 1871 and 1970. Suppose you want to backcast the Nile water level for the two years before 1871, forecast it for the two years after 1970, and interpolate its value for the year 1921. For illustration purposes, the 1921 value is treated as missing in the available data set.
data Nile;
   input level @@;
   year = intnx( 'year', '1jan1869'd, _n_-1 );
   format year year4.;
   if year = '1jan1921'd then level=.;
datalines;
.    .
1120 1160  963 1210 1160 1160  813 1230 1370 1140
 995  935 1110  994 1020  960 1180  799  958 1140
1100 1210 1150 1250 1260 1220 1030 1100  774  840
 874  694  940  833  701  916  692 1020 1050  969
 831  726  456  824  702 1120 1100  832  764  821
 768  845  864  862  698  845  744  796 1040  759
 781  865  845  944  984  897  822 1010  771  676
 649  846  812  742  801 1040  860  874  848  890
 744  749  838 1050  918  986  797  923  975  815
1020  906  901 1170  912  746  919  718  714  740
.    .
;
It is also known that for this time span the Nile water level can be reasonably modeled as a sum of a random walk trend, a level shift in the year 1899, and the observation error. The following statements fit this model to the data:
proc ssm data=Nile;
   id year interval=year;
   shift1899 = ( year >= '1jan1899'd );
   trend rw(rw);
   irregular wn;
   model level = shift1899 rw wn / print=smooth;
   output out=nileOut;
quit;
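For reference, the model described above can be written out as follows. This is a sketch in generic state space notation: the symbols $\mu_t$, $\beta$, $\epsilon_t$, $\eta_t$, and the variance names are chosen here for illustration and do not appear in the statements, and the Gaussian assumption on the disturbances is the customary one for this kind of model rather than something stated in the text.

```latex
\begin{align*}
  \text{level}_t &= \mu_t + \beta \,\text{shift1899}_t + \epsilon_t,
      & \epsilon_t &\sim N(0, \sigma^2_{\epsilon})
      && \text{(observation error, \texttt{wn})} \\
  \mu_t &= \mu_{t-1} + \eta_t,
      & \eta_t &\sim N(0, \sigma^2_{\eta})
      && \text{(random walk trend, \texttt{rw})} \\
  \text{shift1899}_t &=
      \begin{cases}
        1 & \text{if } t \ge 1899 \\
        0 & \text{otherwise}
      \end{cases}
      &&&& \text{(level shift regressor)}
\end{align*}
```

Because the regressor shift1899 is computed from the year alone, it is defined at every time point in the augmented data set, including the years where level is missing. This is what allows the fitted model to produce estimates of level at those points.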
The model-based interpolated and extrapolated values of the Nile water level are shown in Output 34.3.1, which is produced by using the PRINT=SMOOTH option in the MODEL statement.
Output 34.3.1: Interpolated and Extrapolated Nile Water Level
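To look at only the backcast, interpolated, and forecast time points, you can subset the OUT= data set on the ID variable. The following statements are a minimal sketch, not part of the original example. They assume that nileOut retains the ID variable year, and they do not select specific columns because the names of the smoothed-estimate variables are not given here (PROC CONTENTS on nileOut lists them).

```sas
/* Sketch: list only the augmented time points from the OUT= data set. */
/* Assumes nileOut retains the ID variable year created in data Nile.  */
proc print data=nileOut noobs;
   where year < '1jan1871'd      /* backcast years 1869 and 1870 */
      or year = '1jan1921'd      /* interpolated year 1921       */
      or year > '1jan1970'd;     /* forecast years 1971 and 1972 */
   format year year4.;
run;
```

The WHERE conditions mirror the way the input data set was augmented: two leading missing values, the value for 1921 set to missing, and two trailing missing values.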