Large-Scale Automatic Forecasting: Millions of Forecasts
Michael Leonard, SAS Institute
Web sites and transactional databases collect large amounts of time-stamped data. Businesses often want to forecast future values from many sets of time-stamped data (sets of transactions). Many time series analysis techniques support forecasting, and an experienced analyst can use them effectively to analyze, model, and forecast time series data. However, the number of time series to forecast may be enormous, or the forecasts may need to be updated frequently, which makes human interaction impractical. Additionally, these techniques require that the data be recorded at fixed time intervals, whereas transactions can be recorded at any time.

This paper proposes the following technique for automatically forecasting sets of transactions. For each set of transactions recorded in the database:

1. The time-stamped data are accumulated to form a time series.
2. The time series is diagnosed to choose an appropriate set of candidate forecasting models.
3. Each of the diagnosed candidate models is fitted (trained) to the time series data with the most recent data excluded (the holdout sample, or test data).
4. Based on a model selection criterion, the candidate model that performs best within the holdout sample is selected to forecast the time series.

This automatic forecasting technique can efficiently generate millions of forecasts from time-stamped data. This paper demonstrates the technique using SAS® High-Performance Forecasting Software.
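The accumulation and holdout-selection steps described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the SAS High-Performance Forecasting implementation: the function names, the monthly interval, the four simple candidate models, the smoothing weight of 0.3, and the use of mean absolute error as the selection criterion are all choices made for this example. The diagnosis step is omitted, so every series gets the same fixed candidate set, and the selected model is refitted to the full series before forecasting, which is also an assumption.

```python
from collections import defaultdict
from datetime import date


def accumulate_monthly(transactions, start, end):
    """Sum (date, amount) transactions into a gap-free monthly series.

    start and end are (year, month) tuples; empty months are filled with 0.0.
    """
    totals = defaultdict(float)
    for stamp, amount in transactions:
        totals[(stamp.year, stamp.month)] += amount
    series = []
    year, month = start
    while (year, month) <= end:
        series.append(totals[(year, month)])
        year, month = (year + 1, 1) if month == 12 else (year, month + 1)
    return series


# Hypothetical candidate models: each maps (history, horizon) to forecasts.
def mean_forecast(history, horizon):
    return [sum(history) / len(history)] * horizon


def naive_forecast(history, horizon):
    return [history[-1]] * horizon


def drift_forecast(history, horizon):
    slope = (history[-1] - history[0]) / (len(history) - 1)
    return [history[-1] + slope * (step + 1) for step in range(horizon)]


def ses_forecast(history, horizon, alpha=0.3):
    # Simple exponential smoothing with an assumed, fixed smoothing weight.
    level = history[0]
    for value in history[1:]:
        level = alpha * value + (1 - alpha) * level
    return [level] * horizon


CANDIDATES = {
    "mean": mean_forecast,
    "naive": naive_forecast,
    "drift": drift_forecast,
    "ses": ses_forecast,
}


def mean_absolute_error(actual, predicted):
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def select_and_forecast(series, holdout, horizon):
    """Score each candidate on the holdout sample, then forecast with the best."""
    if holdout < 1 or len(series) - holdout < 2:
        raise ValueError("need at least two training points and one holdout point")
    train, test = series[:-holdout], series[-holdout:]
    scores = {
        name: mean_absolute_error(test, model(train, holdout))
        for name, model in CANDIDATES.items()
    }
    best = min(scores, key=scores.get)
    # Assumption: refit the winning model on the full series before forecasting.
    return best, CANDIDATES[best](series, horizon), scores
```

Because each series is handled independently, a loop (or a parallel map) over the sets of transactions applies the same two calls to every series, which is what makes the approach scale to very large numbers of forecasts.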