Working with Time Series Data


Output Data Sets of SAS/ETS Procedures

Some SAS/ETS procedures (such as PROC FORECAST) produce interleaved output data sets, and other SAS/ETS procedures produce standard form time series data sets. The form a procedure uses depends on whether the procedure is normally used to produce multiple result series for each of many input series in one step (as PROC FORECAST does). Procedures that are normally used this way produce interleaved output; the others produce standard form output.

For example, the ARIMA procedure can output actual series, forecast series, residual series, and confidence limit series just as the FORECAST procedure does. The PROC ARIMA output data set uses the standard form because PROC ARIMA is designed for the detailed analysis of one series at a time and so forecasts only one series at a time.

The following statements use the ARIMA procedure to forecast the CPI series in the USCPI data set. Figure 3.7 shows part of the output data set that is produced by the ARIMA procedure’s FORECAST statement. (The printed output from PROC ARIMA is not shown.) Compare the PROC ARIMA output data set shown in Figure 3.7 with the PROC FORECAST output data set shown in Figure 3.6.

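title "PROC ARIMA Output Data Set";

/* Fit an MA(1) model to the first difference of CPI, */
/* then forecast 12 months ahead into ARIMAOUT.       */
proc arima data=uscpi;
   identify var=cpi(1);
   estimate q=1;
   forecast id=date interval=month
            lead=12 out=arimaout;
run;

/* List the first six observations of the output data set */
proc print data=arimaout(obs=6);
run;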

Figure 3.7: Partial Listing of Output Data Set Produced by PROC ARIMA

PROC ARIMA Output Data Set

Obs  date       cpi  FORECAST      STD      L95      U95  RESIDUAL
  1  JUN1990  129.9         .        .        .        .         .
  2  JUL1990  130.4   130.368  0.36160  129.660  131.077   0.03168
  3  AUG1990  131.6   130.881  0.36160  130.172  131.590   0.71909
  4  SEP1990  132.7   132.354  0.36160  131.645  133.063   0.34584
  5  OCT1990  133.5   133.306  0.36160  132.597  134.015   0.19421
  6  NOV1990  133.8   134.046  0.36160  133.337  134.754  -0.24552



The output data set produced by the ARIMA procedure’s FORECAST statement stores the actual values in a variable with the same name as the response series, stores the forecast series in a variable named FORECAST, stores the residuals in a variable named RESIDUAL, stores the 95% confidence limits in variables named L95 and U95, and stores the standard error of the forecast in the variable STD.
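The relationships among these variables can be checked with a short DATA step. The following statements are a minimal sketch, not part of the original example. They assume that the ARIMAOUT data set was created by the PROC ARIMA step shown earlier, and that the confidence limits are the default 95% normal-theory limits. The data set name ARIMACHK and the variables Z, RESID_CHK, L95_CHK, and U95_CHK are hypothetical names introduced only for this illustration.

```sas
title "Checking the Variables in ARIMAOUT";

/* Assumption: ARIMAOUT comes from the PROC ARIMA step shown earlier. */
/* ARIMACHK and the *_CHK variables are illustrative names only.      */
data arimachk;
   set arimaout;
   /* 97.5th percentile of the standard normal distribution (about 1.96) */
   z = quantile('NORMAL', 0.975);
   /* Residual = actual minus forecast; missing wherever CPI or FORECAST is missing */
   resid_chk = cpi - forecast;
   /* Limits = forecast plus or minus z times the forecast standard error */
   l95_chk = forecast - z*std;
   u95_chk = forecast + z*std;
run;

proc print data=arimachk(obs=6);
   var date cpi forecast residual resid_chk l95 l95_chk u95 u95_chk;
run;
```

For the second observation in Figure 3.7, 130.4 minus 130.368 gives 0.032, which agrees with the printed RESIDUAL value of 0.03168 up to the rounding of the displayed forecast.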

This method of storing several different result series as a standard form time series data set is simple and convenient. However, it works well only for a single input series. The forecast of a single series can be stored in the variable FORECAST. But if two series are forecast, two different FORECAST variables are needed.

The STATESPACE procedure handles this problem by generating forecast variable names FOR1, FOR2, and so forth. The SPECTRA procedure uses a similar method. Names such as FOR1, FOR2, RES1, RES2, and so forth require you to remember the order in which the input series are listed. This is why PROC FORECAST, which is designed to forecast a whole list of input series at once, stores its results in interleaved form.
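The numbered naming scheme can be illustrated with a small sketch. The following statements are not from the original example. The data set TWOSER and its variables X and Y are hypothetical and are simulated here only so that the step is self-contained. The fitted model is of no interest, only the variable names in the OUT= data set.

```sas
/* Hypothetical two-series data set, simulated for illustration only */
data twoser;
   call streaminit(12345);
   format date monyy7.;
   x = 100;
   y = 50;
   do i = 0 to 59;
      date = intnx('month', '01jan1990'd, i);
      e = rand('NORMAL');
      x = x + e;                        /* random walk */
      y = y + 0.5*e + rand('NORMAL');   /* second walk, correlated with the first */
      output;
   end;
   drop i e;
run;

/* X is listed first in the VAR statement, so its results are stored in */
/* FOR1 and RES1; Y is listed second, so its results are FOR2 and RES2. */
proc statespace data=twoser out=ssout lead=12 interval=month noprint;
   var x(1) y(1);
   id date;
run;

title "PROC STATESPACE Output Data Set";
proc print data=ssout(obs=6);
   var date x for1 res1 y for2 res2;
run;
```

Swapping the order of X and Y in the VAR statement swaps the meaning of FOR1 and FOR2, which is the bookkeeping burden described above.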

Other SAS/ETS procedures are often used for a single input series but can also be used to process several series in a single step. In the number of input series they are designed to work with, they are neither clearly like PROC FORECAST nor clearly like PROC ARIMA. These procedures use a third method for storing multiple result series in an output data set: they store output time series in standard form (as PROC ARIMA does) but require an OUTPUT statement to give names to the result series.
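As one illustration of this third method, the following sketch uses the X11 procedure, whose OUTPUT statement assigns a name to each result series. The choice of PROC X11 is an assumption made for this example; the text above does not name specific procedures. The data set RETAIL and the variables SALES1, SALES2, ADJ1, and ADJ2 are hypothetical, and the data are simulated so that the step is self-contained.

```sas
/* Hypothetical monthly data set with two seasonal series (simulated) */
data retail;
   call streaminit(2024);
   format date monyy7.;
   do i = 0 to 47;
      date   = intnx('month', '01jan1990'd, i);
      season = 1 + 0.1*sin(2*constant('PI')*month(date)/12);
      sales1 = (100 + i)   * season + rand('NORMAL');
      sales2 = (200 + 2*i) * season + rand('NORMAL');
      output;
   end;
   drop i season;
run;

/* The OUTPUT statement names each result series explicitly:        */
/* table B1 (original series) keeps the input names, and table D11  */
/* (seasonally adjusted series) is stored in ADJ1 and ADJ2, listed  */
/* in the same order as the variables in the VAR statement.         */
proc x11 data=retail noprint;
   monthly date=date;
   var sales1 sales2;
   output out=adjusted
          b1=sales1 sales2
          d11=adj1 adj2;
run;

title "PROC X11 Output Data Set";
proc print data=adjusted(obs=6);
run;
```

Because the names are chosen in the OUTPUT statement, the resulting data set is in standard form with one observation per time period, and no numbered suffixes need to be remembered.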