SAS/ETS Examples

Overlaying Multiple Forecast Methods in Time Series Plots





Overview

It is said that forecasting is much more of an art than a science. Many practitioners use a visual comparison of several forecasts to assess their accuracy and choose among the various forecasting methods.

This example shows how to plot on the same graph the original values of a time series variable and the predicted values from three different forecasting methods, thus facilitating a visual comparison.



Analysis

This example uses lead production data as the forecast variable. The values represent monthly totals of U.S. lead production, in tons, for the period January 1990 to September 1992.

Begin by resetting all graphics options and entering the lead production data with a SAS DATA step.




   goptions reset=all;

   /* -------  Read Initial Data  ------- */

   data leadprd;
      input date:monyy5. leadprod @@;
      format date monyy5.;
      title 'Lead Production Data';
      title2 '(in tons)';
      datalines;
   jan90 38500 feb90 37900 mar90 36900 apr90 38600
   may90 36400 jun90 33300 jul90 34000 aug90 38000
   sep90 37400 oct90 42300 nov90 36900 dec90 34800
   jan91 33900 feb91 34000 mar91 37200 apr91 33300
   may91 29800 jun91 24700 jul91 30800 aug91 31100
   sep91 32400 oct91 32900 nov91 29100 dec91 31800
   jan92 32100 feb92 30500 mar92 36800 apr92 30300
   may92 29500 jun92 24700 jul92 27600 aug92 23800
   sep92 21400
   ;
   run;
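
As an optional check, assuming the DATA step above ran without errors, a short PROC PRINT shows how the input was read. The colon modifier applies the MONYY5. informat to DATE, and the trailing @@ holds the input line so that several observations are read from each data line. The OBS= value of 4 is an arbitrary choice for illustration.

   /* -- Optional check: list the four observations read from the first data line -- */

   proc print data=leadprd(obs=4);
      var date leadprod;
   run;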

Next, produce the forecasts and save their predicted values to SAS data sets. This example uses the forecasting capabilities of the FORECAST, ARIMA, and REG procedures.

The OUT1STEP option of PROC FORECAST writes the one-step-ahead predicted values for the sample period to the output data set LEADOUT1, along with the forecasts. The LEAD= option requests forecasts for 12 months beyond the sample period. Because no METHOD= option is specified, the procedure uses its default, the stepwise autoregressive method.



   /* -- Using PROC FORECAST -- */

   proc forecast data=leadprd out=leadout1 out1step
      lead=12 interval=month;
      id date;
      var leadprod;
   run;
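
To see what PROC FORECAST wrote, you can list a few observations around the end of the sample period. This is only a sketch for inspection. The OUT= data set contains the ID variable DATE, the automatic variables _TYPE_ and _LEAD_, and the predicted values stored under the original variable name LEADPROD. The date range in the WHERE statement is an arbitrary choice.

   /* -- Optional check: observations on either side of September 1992 -- */

   proc print data=leadout1;
      where date >= '01jul92'd and date <= '01dec92'd;
      var date _type_ _lead_ leadprod;
   run;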

The F (FORECAST) statement in PROC ARIMA generates forecasts from the model fitted by the E (ESTIMATE) statement, here a first-order autoregressive model (P=1). The I (IDENTIFY) statement names LEADPROD as the series to analyze. As in the FORECAST procedure, the LEAD= option requests forecasts for 12 months beyond the sample period.



   /* -- Using PROC ARIMA -- */

   proc arima data=leadprd;
      i var=leadprod nlag=15;
      e p=1;
      f lead=12 interval=month id=date out=leadout2;
   run;
   quit;
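
For readers less familiar with the one-letter abbreviations, the same step can be written with the full statement names. This is intended as an equivalent form of the code above, not an additional model. It writes to the same data set, LEADOUT2, so run one version or the other.

   /* -- Same PROC ARIMA step with the statement names spelled out -- */

   proc arima data=leadprd;
      identify var=leadprod nlag=15;       /* I statement */
      estimate p=1;                        /* E statement: AR(1) model */
      forecast lead=12 interval=month      /* F statement */
               id=date out=leadout2;
   run;
   quit;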

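To estimate a time trend for the lead production data, it is necessary to create a new variable T that spans both the sample and forecast periods. The DATA step below reads LEADOUT2, which already contains one observation for every month in both periods, and uses the sum statement T+1 to number the observations consecutively.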



   /* -- Using PROC REG (Time Trend Regression) -- */

   data ttrend;
      set leadout2;
      t+1;
   run;
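
The sum statement T+1 initializes T to 0, retains its value across observations, and adds 1 on each iteration. A minimal sketch of an alternative, assuming (as here) that the step reads a single data set and deletes no observations, is to copy the automatic variable _N_. The data set name TTREND2 is hypothetical and is used only to avoid overwriting TTREND.

   /* -- Alternative sketch: T taken from the DATA step iteration counter -- */

   data ttrend2;          /* hypothetical name, for illustration only */
      set leadout2;
      t = _n_;            /* 1, 2, 3, ... one value per observation read */
   run;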

Use PROC REG to create an output data set, LEADOUT3, containing the predicted values from a regression of LEADPROD on a time trend.



   proc reg data=ttrend;
      model leadprod = t;
      output out=leadout3 p=ptrend;
   run;
   quit;
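
LEADPROD is missing in the 12 forecast-period observations, so those observations do not enter the regression fit. PROC REG still computes a predicted value for them because T is not missing. The following optional check, a sketch rather than part of the original program, lists the forecast-period observations of LEADOUT3.

   /* -- Optional check: trend predictions where the actual value is missing -- */

   proc print data=leadout3;
      where leadprod = .;
      var date t leadprod ptrend;
   run;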

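The key to overlaying plots of time series forecasts is in the data management. Use a DATA step to collect all of the forecasts in a data set named FINAL. The KEEP= and RENAME= data set options select the needed variables and give each set of predicted values its own name. Because the three output data sets contain the same dates in the same order, three consecutive SET statements combine them observation by observation.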



   /* -------  Data Management  ------- */

   data final;
      set leadout1(keep=date leadprod
                   rename=(leadprod=pfore));
      set leadout2(keep=date leadprod forecast
                   rename=(leadprod=actual forecast=parima));
      set leadout3(keep=date ptrend);
   run;
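
The consecutive SET statements above pair observations by position, which works only while the three data sets stay aligned. A more defensive sketch, assuming each data set is sorted by DATE and has exactly one observation per date, is a match-merge on DATE. The data set name FINAL_BY is hypothetical. The GPLOT step later in this example reads FINAL, so use this version only if you also change the DATA= option there.

   /* -- Alternative sketch: combine the forecasts by matching on DATE -- */

   data final_by;         /* hypothetical name, for illustration only */
      merge leadout1(keep=date leadprod
                     rename=(leadprod=pfore))
            leadout2(keep=date leadprod forecast
                     rename=(leadprod=actual forecast=parima))
            leadout3(keep=date ptrend);
      by date;
   run;

If a date were missing from one of the inputs, the match-merge would leave that forecast missing for the date instead of silently shifting values to the wrong month.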

Use graphics options along with SYMBOL, AXIS, and LEGEND statements to provide a more polished look to your graph.



   /* -------  Graphics Output  ------- */

   goptions cback=white ctitle=bl ctext=bl border
            ftitle=centx ftext=centx;

   title 'Lead Production Data';
   title2 'Plot of Forecast for Lead Production';
   symbol1 i=spline width=1 v=dot  c=black;  /* for actual */
   symbol2 i=spline width=2 v=none c=red;    /* for pfore forecast */
   symbol3 i=spline width=2 v=none c=green;  /* for parima forecast */
   symbol4 i=spline width=2 v=none c=blue;   /* for ptrend forecast */

   axis1 offset=(1 cm)
         label=('Year') minor=none
         order=('01jan90'd to '01jan94'd by year);
   axis2 label=(angle=90 'Lead Production')
         order=(20000 to 45000 by 5000);

   legend1 across=1
           cborder=black
           position=(top inside right)
           offset=(-2,0)
           value=(tick=1 'ACTUAL'
                  tick=2 'PROC FORECAST'
                  tick=3 'PROC ARIMA'
                  tick=4 'TIME TREND')
           shape=symbol(2,.25)
           mode=share
           label=none;

Use the GPLOT procedure with the OVERLAY option to plot the actual time series data along with the predictions from the three forecasting methods. The HREF= option draws a reference line perpendicular to the horizontal axis at September 1, 1992, the dividing date between the sample and forecast periods.



   proc gplot data=final;
      format date year4.;
      plot actual * date = 1
           pfore  * date = 2
           parima * date = 3
           ptrend * date = 4 / overlay noframe
                               href='01sep92'd
                               chref=red
                               vaxis=axis2
                               vminor=1
                               haxis=axis1
                               legend=legend1;
   run;
   quit;

The resulting graph is shown in Figure 1.



Figure 1: Forecasts of Lead Production




