The HPFENGINE Procedure
PROC HPFENGINE Statement
The following options can be used in the PROC HPFENGINE statement.
BACK=n
specifies the number of observations before the end of the data where the multistep forecasts are to begin. This option is often used to obtain performance statistics. See the PRINT= option for details about printing performance statistics. The default is BACK=0.
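As a rough illustration of how BACK= is typically combined with performance printing, the following sketch holds back the last 12 observations of the SASHELP.AIR sample data set and forecasts them. The output data set name WORK.AIR_FOR is a hypothetical choice, and the ID and FORECAST statements are assumed to be used as they are elsewhere in PROC HPFENGINE.

```sas
/* Sketch only: start the multistep forecasts 12 observations   */
/* before the end of the data and print performance statistics. */
proc hpfengine data=sashelp.air
               out=_null_            /* suppress the OUT= data set        */
               outfor=work.air_for   /* hypothetical output data set name */
               back=12
               lead=12
               print=performance;
   id date interval=month;
   forecast air;
run;
```

With BACK=12 and LEAD=12, the forecast horizon covers periods for which actual values exist, so the held-back actuals can be compared with the forecasts.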
DATA=SAS-data-set
names the SAS data set that contains the input data for the procedure to forecast. If the DATA= option is not specified, the most recently created SAS data set is used.
ERRORCONTROL=(SEVERITY=(severity-options) STAGE=(stage-options))
enables finer control of message printing. The error severity level and HPFENGINE procedure processing stages are set independently. A logical 'and' is taken over all the specified options, and any message that tests true against the results of the 'and' is printed.
Available severity-options are as follows:
LOW
specifies that only low-severity, minor issues be printed.
MEDIUM
specifies that only medium-severity problems be printed.
HIGH
specifies that only severe errors be printed.
ALL
specifies that errors of all severity levels (LOW, MEDIUM, and HIGH) be printed.
NONE
specifies that no messages from the HPFENGINE procedure be printed.
Available stage-options are as follows:
PROCEDURELEVEL
specifies that only errors that occur during option processing and validation be printed.
DATAPREP
specifies that only errors that occur during the accumulation of data and the application of SETMISS= and ZEROMISS= options be printed.
SELECTION
specifies that only errors that occur during the model selection process be printed.
ESTIMATION
specifies that only errors that occur during the model parameter estimation process be printed.
FORECASTING
specifies that only errors that occur during the model evaluation and forecasting process be printed.
ALL
is the same as specifying all of the PROCEDURELEVEL, DATAPREP, SELECTION, ESTIMATION, and FORECASTING options.
Examples are as follows:
errorcontrol=(severity=(high medium) stage=all);
prints high- and medium-severity errors at any processing stage of PROC HPFENGINE.
errorcontrol=(severity=high stage=dataprep);
prints high-severity errors only during the data preparation.
errorcontrol=(severity=none stage=all);
turns off messages from PROC HPFENGINE.
errorcontrol=( severity=(high medium low)
stage=( procedurelevel dataprep selection
estimation forecasting) );
specifies the default behavior. The following specification is equivalent:
errorcontrol=(severity=all stage=all)
EXCEPTIONS=except-option
specifies the desired handling of arithmetic exceptions during the run. You can specify except-option as one of the following:
specifies that PROC HPFENGINE stop on an arithmetic exception. No recovery is attempted. This is the default behavior if the EXCEPTIONS= option is not specified.
CATCH(ESM)
specifies that PROC HPFENGINE generate a forecast by using its default exponential smoothing model (ESM) for the variable that produces the arithmetic exception in the current BY group. This model is equivalent to the default model used by the HPFESMSPEC procedure with no modifications.
CATCH(RW)
specifies that PROC HPFENGINE generate a forecast based on a zero-drift random walk model for the variable that produces the exception in the current BY group.
specifies that PROC HPFENGINE generate a forecast of missing values for the variable that produces the exception in the current BY group.
The order of CATCH handling corresponds to the order of the preceding list. If CATCH(ESM) handling produces an arithmetic exception, it attempts to generate a forecast by using CATCH(RW) semantics. Likewise, if CATCH(RW) handling produces an arithmetic exception, it generates a missing value forecast.
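The fallback chain described above can be requested as in the following sketch, which assumes that the CATCH(ESM) value is written exactly as it appears in the text. The output data set name WORK.AIR_OUT is hypothetical, and SASHELP.AIR is used only as convenient sample data.

```sas
/* Sketch only: on an arithmetic exception, fall back to the default   */
/* ESM; per the text above, a further exception falls back to the      */
/* random walk and then to a missing-value forecast.                   */
proc hpfengine data=sashelp.air
               out=work.air_out      /* hypothetical output data set name */
               exceptions=catch(esm);
   id date interval=month;
   forecast air;
run;
```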
GLOBALSELECTION=catalog-entry-name
specifies the name of a catalog entry that serves as a model selection list. This is the selection list used to forecast all series if no INEST= data set is provided. It is also the selection list used if individual model selections are missing in the INEST= data set when INEST= is provided. If REPOSITORY= is not present, GLOBALSELECTION= defaults to BEST, specified in SASHELP.HPFDFLT.
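A minimal sketch of supplying a custom selection list follows. It relies on the companion HPFESMSPEC and HPFSELECT procedures, which are not described in this section, so the statements shown for them are assumptions about typical usage. The catalog WORK.MYREPO and the entry names MYESM and MYLIST are hypothetical.

```sas
/* Sketch only: build a one-model selection list and use it globally. */
proc hpfesmspec repository=work.myrepo name=myesm;
   esm;                      /* default exponential smoothing model */
run;

proc hpfselect repository=work.myrepo name=mylist;
   spec myesm;               /* candidate model in the selection list */
run;

proc hpfengine data=sashelp.air
               out=work.air_out          /* hypothetical name */
               repository=work.myrepo
               globalselection=mylist;
   id date interval=month;
   forecast air;
run;
```

Because no INEST= data set is given here, the MYLIST selection list would apply to every forecast variable.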
specifies that the CHOOSE= option in the HPFSELECT procedure be ignored when selecting a model in the candidate model list. The best model is selected regardless of the model chosen in the CHOOSE= option in the HPFSELECT procedure.
INEST=SAS-data-set
contains information that maps forecast variables to models or selection lists, and data set variables to model variables. It can also contain parameter estimates used if the TASK=FORECAST or TASK=UPDATE options are present. INEST= is optional. See the description of the GLOBALSELECTION= option for more information.
contains information that describes predefined events. This data set is usually created by the HPFEVENTS procedure. This option is only used if events are included in a model.
LEAD=n
specifies the number of periods ahead to forecast (forecast lead or horizon). The default is LEAD=12.
The LEAD= value is relative to the last observation in the input data set and not to the last nonmissing observation of a particular series. Thus, if a series has missing values at the end, the actual number of forecasts computed for that series will be greater than the LEAD= value.
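As a worked illustration of this rule, suppose (hypothetically) that a series has $k = 3$ missing values at the end of the input data and that the default LEAD=12 is in effect. Assuming forecasts begin immediately after the last nonmissing value and extend LEAD= periods past the last observation of the input data set, the number of forecasts computed for that series would be:

```latex
\[
n_{\text{forecasts}} = k + \text{LEAD} = 3 + 12 = 15
\]
```

A series with no trailing missing values ($k = 0$) would receive exactly 12 forecasts under the same assumption.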
OUT=SAS-data-set
names the output data set to contain the forecasts of the variables specified in the subsequent FORECAST statements. If an ID variable is specified, it is also included in the OUT= data set. The values are accumulated based on the ACCUMULATE= option, and forecasts are appended to these values. If the OUT= data set is not specified, a default output data set DATAn is created. If you do not want the OUT= data set created, then use OUT=_NULL_.
names the output data set to contain the forecast components. The components included in the output depend on the model.
OUTEST=SAS-data-set
contains information that maps forecast variables to model specifications, and data set variables to model variables and parameter estimates.
An OUTEST= data set will frequently be used as the INEST= data set for subsequent invocations of PROC HPFENGINE. In such a case, if the PROC HPFENGINE statement option TASK=FORECAST is used, forecasts are generated using the parameter estimates found in this data set and are not reestimated.
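The round trip described above might look like the following sketch. The data set names WORK.AIR_EST, WORK.AIR_OUT, and WORK.AIR_FOR are hypothetical, and SASHELP.AIR stands in for real input data.

```sas
/* Step 1 (sketch): select a model, estimate it, and save the mapping */
/* and parameter estimates.                                           */
proc hpfengine data=sashelp.air
               out=_null_
               outest=work.air_est;   /* hypothetical name */
   id date interval=month;
   forecast air;
run;

/* Step 2 (sketch): reuse the saved estimates without reestimation. */
proc hpfengine data=sashelp.air
               inest=work.air_est
               task=forecast
               out=work.air_out       /* hypothetical name */
               outfor=work.air_for    /* hypothetical name */
               lead=24;
   id date interval=month;
   forecast air;
run;
```

In practice the second step would usually run against updated input data; the same data set is used twice here only to keep the sketch self-contained.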
OUTFOR=SAS-data-set
names the output data set to contain the forecast time series components (actual, predicted, lower confidence limit, upper confidence limit, prediction error, and prediction standard error). The OUTFOR= data set is particularly useful for displaying the forecasts in tabular or graphical form.
names the output data set to contain input in the forecasting process. This information is useful if future values of input variables are automatically supplied by the HPFENGINE procedure. Such a case would occur if one or more input variables are listed in either the STOCHASTIC or the CONTROLLABLE statement and if there are missing future values of these input variables.
names the output data set to contain detailed information about the selected forecast model. The data set has information such as the model family, presence or absence of inputs, events and outliers, and so forth.
OUTSUM=SAS-data-set
names the output data set to contain the summary information that is also written to the SAS log, specifically the number of notes, errors, and warnings and the number of series processed, forecasts requested, and forecasts failed.
OUTSTAT=SAS-data-set
names the output data set to contain the statistics of fit (or goodness-of-fit statistics). The OUTSTAT= data set is useful for evaluating how well the model fits the series. The statistics of fit are based on the entire range of the time series.
OUTSTATSELECT=SAS-data-set
names the output data set to contain statistics of fit for all of the candidate models fit during model selection. The OUTSTATSELECT= data set is useful for comparing the performance of various models.
PLOT=option | (options)
specifies the graphical output desired. By default, the HPFENGINE procedure produces no graphical output. The following plotting options are available:
plots prediction error time series graphics.
plots prediction error autocorrelation function graphics.
plots prediction error partial autocorrelation function graphics.
plots prediction error inverse autocorrelation function graphics.
plots white noise graphics.
plots forecast graphics.
plots forecast seasonal cycles graphics.
plots the forecast in the forecast horizon only.
plots the forecast components.
plots model and error graphics for each candidate model fit to the forecast series.
specifies all of the preceding PLOT= options.
For example, PLOT=FORECASTS plots the forecasts for each series.
PRINT=option | (options)
specifies the printed output desired. By default, the HPFENGINE procedure produces no printed output.
The following printing options are available:
ESTIMATES
prints the results of parameter estimation.
FORECASTS
prints the forecasts.
STATISTICS
prints the statistics of fit.
SUMMARY
prints the forecast summary.
PERFORMANCE
prints the performance statistics for each forecast.
PERFORMANCESUMMARY
prints the performance summary for each BY group.
PERFORMANCEOVERALL
prints the performance summary for all of the BY groups.
SELECT
prints the label and fit statistics for each model in the selection list.
BIAS
prints model bias information.
COMPONENTS
prints forecast model components.
CANDIDATES
prints parameter estimates for each candidate model fit to the forecast series.
DESCSTATS
prints descriptive statistics for the forecast series.
ALL
is the same as specifying PRINT=(ESTIMATES SELECT FORECASTS STATISTICS BIAS DESCSTATS). PRINT=(ALL CANDIDATES COMPONENTS PERFORMANCE PERFORMANCESUMMARY PERFORMANCEOVERALL) prints all the options listed.
specifies that output requested with the PRINT= option be printed in greater detail.
REPOSITORY=catalog
is a two-level SAS catalog name that specifies the location of the model repository. The REPOSITORY= option can also be specified as MODELREPOSITORY=, MODELREP=, or REP=. The default for this option is SASHELP.HPFDFLT.
SCOREREPOSITORY=catalog
is a two-level SAS catalog name that specifies the location of the model score repository. This repository is where score files are written if the SCORE statement is used in the HPFENGINE procedure. There is no default score repository. The presence of a SCORE statement requires that the SCOREREPOSITORY= option also be present.
SEASONALITY=n
specifies the length of the seasonal cycle. For example, SEASONALITY=3 means that every group of three observations forms a seasonal cycle. The SEASONALITY= option is applicable only for seasonal forecasting models. By default, the length of the seasonal cycle is 1 (no seasonality) or the length implied by the INTERVAL= option specified in the ID statement. For example, INTERVAL=MONTH implies that the length of the seasonal cycle is 12.
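To illustrate how SEASONALITY= can override the cycle length implied by INTERVAL=, the following sketch forces a six-observation cycle on monthly data. The value 6 is chosen purely for illustration, not as a recommendation for this series, and WORK.AIR_OUT is a hypothetical output data set name.

```sas
/* Sketch only: INTERVAL=MONTH would imply a cycle length of 12;  */
/* SEASONALITY=6 replaces that with a six-observation cycle.      */
proc hpfengine data=sashelp.air
               out=work.air_out     /* hypothetical name */
               seasonality=6;
   id date interval=month;
   forecast air;
run;
```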
specifies that the variables specified in the FORECAST statement be processed in sorted order.
TASK=option
controls the model selection and parameter estimation process. Available options are as follows:
SELECT
performs model selection, estimates parameters of the selected model, and produces forecasts. This is the default.
SELECT(options)
performs model selection, estimates parameters of the selected model, produces forecasts, and potentially overrides settings in the model selection list. If a selection list does not specify a particular item and that item is specified with a TASK=SELECT option, the value as set in TASK=SELECT is used. If an option is specified in the selection list, the corresponding value set in TASK=SELECT is not used unless the OVERRIDE option is also present. The available options for TASK=SELECT are as follows:
HOLDOUT=n
specifies the size of the holdout sample to be used for model selection. The holdout sample is a subset of the actual time series that ends at the last nonmissing observation.
specifies the size of the holdout sample as a percentage of the length of the time series. If HOLDOUT=5 and HOLDOUTPCT=10, the size of the holdout sample is $\min(5,\ 0.1T)$, where $T$ is the length of the time series with beginning and ending missing values removed.
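As a worked example, assume the holdout size $h$ is the smaller of the HOLDOUT= value and HOLDOUTPCT= percent of the trimmed series length $T$. The two series lengths below are hypothetical and are chosen so that each limit is binding once:

```latex
\[
h = \min\!\left(\text{HOLDOUT},\ \frac{\text{HOLDOUTPCT}}{100}\,T\right)
\]
\[
T = 120:\quad h = \min(5,\ 0.1 \times 120) = \min(5,\ 12) = 5
\]
\[
T = 30:\quad h = \min(5,\ 0.1 \times 30) = \min(5,\ 3) = 3
\]
```

For long series the absolute HOLDOUT= value caps the sample; for short series the percentage keeps the holdout from consuming too much of the data.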
CRITERION=option
specifies the model selection criterion (statistic of fit) to be used to select from several candidate models. The following list shows the statistics of fit that the valid values of the CRITERION= option specify:
sum of squared errors
mean squared error
root mean squared error
unbiased mean squared error
unbiased root mean squared error
maximum percent error
minimum percent error
mean percent error
mean absolute percent error
median absolute percent error
geometric mean absolute percent error
minimum absolute error percent of standard deviation
maximum absolute error percent of standard deviation
mean absolute error percent of standard deviation
median absolute error percent of standard deviation
geometric mean absolute error percent of standard deviation
minimum predictive percent error
maximum predictive percent error
mean predictive percent error
mean absolute predictive percent error
median absolute predictive percent error
geometric mean absolute predictive percent error
minimum symmetric percent error
maximum symmetric percent error
mean symmetric percent error
symmetric mean absolute percent error
median absolute symmetric percent error
geometric mean absolute symmetric percent error
minimum relative error
maximum relative error
mean relative error
mean relative absolute error
median relative absolute error
geometric mean relative absolute error
maximum error
minimum error
mean error
mean absolute error
mean absolute scaled error
R-square
adjusted R-square
Amemiya’s adjusted R-square
random walk R-square
Akaike information criterion
finite sample corrected AIC
Schwarz Bayesian information criterion
Amemiya’s prediction criterion
specifies a number greater than one that is used to determine whether or not a time series is intermittent. If the average demand interval is greater than this number, then the series is assumed to be intermittent.
SEASONTEST=option
specifies the options related to the seasonality test.
The following values for the SEASONTEST= option are allowed:
no test
(SIGLEVEL=number)
specifies the significance probability value to use in testing whether seasonality is present in the time series. The value must be between 0 and 1. A smaller value of the SIGLEVEL= option means that stronger evidence of a seasonal pattern in the data is required before PROC HPFENGINE uses seasonal models to forecast the time series.
The default is SEASONTEST=(SIGLEVEL=0.01).
ALPHA=number
specifies the significance level to use in computing the confidence limits of the forecast. The ALPHA= value must be between 0 and 1. As an example, ALPHA=0.05 produces 95% confidence intervals.
OVERRIDE
forces the use of any options listed.
disables the default action and sets the forecast to missing if no models can be fit from the initial selection list. (By default, if none of the models in a selection list can be successfully fit to a series, PROC HPFENGINE returns to the selection list SASHELP.HPFDFLT.BEST and restarts the selection process.) There will be an observation in OUTSUM=, if requested, that corresponds to the variable and BY group in question, and the _STATUS_ variable will be nonzero.
specifies that no trend model be fit to any series with fewer than n nonmissing values. Normally the models in a selection list are not subset by trend.
Incorporation of a trend is checked only for smoothing, UCM, and ARIMA models. For the smoothing case, only simple smoothing is a non-trend model. For UCM, the absence of a slope component qualifies it as a non-trend model. For ARIMA, there must be no differencing of the dependent variable for PROC HPFENGINE to consider it a non-trend model.
The value of n must be greater than or equal to 1. The default is n = 1.
specifies that no seasonal model be fit to any series with fewer observations than n multiplied by the seasonal cycle length. The value of n must be greater than or equal to 1. The default is n = 2.
specifies that any series with fewer than n nonmissing values not be fit using the models in the selection list, but instead be forecast as the mean of the observations in the series. The value of n must be greater than or equal to 1. The default is n = 1.
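Several of the TASK=SELECT options described above can be combined as in the following sketch. The criterion value MAPE is assumed to be the option value for mean absolute percent error, the numeric settings are illustrative only, and WORK.AIR_OUT is a hypothetical output data set name.

```sas
/* Sketch only: select models on a holdout sample and force these */
/* settings to take precedence over the selection list.           */
proc hpfengine data=sashelp.air
               out=work.air_out     /* hypothetical name */
               task=select(holdout=12                  /* at most 12 observations */
                           holdoutpct=10               /* and at most 10% of T    */
                           criterion=mape              /* assumed criterion value */
                           seasontest=(siglevel=0.05)
                           alpha=0.1                   /* 90% confidence limits   */
                           override);                  /* override selection list */
   id date interval=month;
   forecast air;
run;
```

Without OVERRIDE, each of these values would apply only where the selection list itself leaves the corresponding item unspecified.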
FIT
estimates parameters by using the model specified in the INEST= data set, then forecasts. No model selection is performed.
FIT(options)
estimates parameters by using the model specified in the INEST= data set, then forecasts, potentially overriding the significance level in the model selection list that was used to compute forecast confidence intervals. No model selection is performed. If a selection list does not specify ALPHA and ALPHA is specified in the TASK=FIT option, the value as set in TASK=FIT is used. If ALPHA is specified in the selection list, the corresponding value set in TASK=FIT is not used unless the OVERRIDE option is also present. The available options for TASK=FIT are as follows:
ALPHA=number
specifies the significance level to use in computing the confidence limits of the forecast. The ALPHA= value must be between 0 and 1.
OVERRIDE
forces the use of any options listed.
UPDATE
estimates parameters by using the model specified in the INEST= data set, then forecasts. TASK=UPDATE differs from TASK=FIT in that the parameters found in the INEST= data set are used as starting values in the estimation. No model selection is performed.
UPDATE(options)
estimates parameters by using the model specified in the INEST= data set, using the parameter estimates as starting values, then forecasts, potentially overriding the significance level in the model selection list used to compute forecast confidence intervals. No model selection is performed. If a selection list does not specify ALPHA and ALPHA is specified in the TASK=UPDATE option, the value as set in TASK=UPDATE is used. If ALPHA is specified in the selection list, the corresponding value set in TASK=UPDATE is not used unless the OVERRIDE option is also present. The available options for TASK=UPDATE are as follows:
ALPHA=number
specifies the significance level to use in computing the confidence limits of the forecast. The ALPHA= value must be between 0 and 1.
OVERRIDE
forces the use of any options listed.
FORECAST
forecasts by using the model and parameters specified in the INEST= data set. No parameter estimation occurs.
FORECAST(options)
forecasts by using the model and parameters specified in the INEST= data set, potentially overriding the significance level in the model selection list used to compute forecast confidence intervals. No parameter estimation occurs. If a selection list does not specify ALPHA and ALPHA is specified in the TASK=FORECAST option, the value as set in TASK=FORECAST is used. If ALPHA is specified in the selection list, the corresponding value set in TASK=FORECAST is not used unless the OVERRIDE option is also present. The available options for TASK=FORECAST are as follows:
ALPHA=number
specifies the significance level to use in computing the confidence limits of the forecast. The ALPHA= value must be between 0 and 1.
OVERRIDE
forces the use of any options listed.
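The following sketch shows one way TASK=UPDATE with ALPHA= and OVERRIDE might be used: a first run saves the selected model and its estimates, and a second run reestimates from those starting values with a different significance level. The data set names WORK.AIR_EST and WORK.AIR_OUT are hypothetical, and ALPHA=0.2 is an arbitrary illustrative value.

```sas
/* Step 1 (sketch): initial selection and estimation. */
proc hpfengine data=sashelp.air
               out=_null_
               outest=work.air_est;   /* hypothetical name */
   id date interval=month;
   forecast air;
run;

/* Step 2 (sketch): reestimate from the saved starting values and */
/* force 80% confidence limits regardless of the selection list.  */
proc hpfengine data=sashelp.air
               inest=work.air_est
               task=update(alpha=0.2 override)
               out=work.air_out;      /* hypothetical name */
   id date interval=month;
   forecast air;
run;
```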
Copyright © 2008 by SAS Institute Inc., Cary, NC, USA. All rights reserved.