PROC ESM: Data Set Output

Data Set Output

The ESM procedure can create the OUT=, OUTEST=, OUTFOR=, OUTSTAT=, and OUTSUM= data sets. These data sets contain the variables listed in the BY statement and statistics related to the variables listed in the FORECAST statement. In general, if a forecasting step related to an output data set fails, the values for that step are either not recorded or set to missing in that data set, and appropriate error or warning messages are written to the log.
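As a minimal sketch of how these data sets might be requested in a single run, the following step names each of them on the PROC ESM statement. The input data set SASHELP.AIR (with variables DATE and AIR), the WORK data set names, and the choice of MODEL=ADDWINTERS are illustrative assumptions, not part of the description above.

```sas
/* Sketch: request every output data set described in this section.  */
/* SASHELP.AIR, the WORK.* names, and MODEL=ADDWINTERS are           */
/* illustrative assumptions.                                          */
proc esm data=sashelp.air
         out=work.air_out       /* extended time series            */
         outest=work.air_est    /* parameter estimates             */
         outfor=work.air_for    /* forecast time series            */
         outstat=work.air_stat  /* statistics of fit               */
         outsum=work.air_sum    /* summary statistics and forecasts */
         lead=12;
   id date interval=month;
   forecast air / model=addwinters;
run;
```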

OUT= Data Set

The OUT= data set contains the variables specified in the BY, ID, and FORECAST statements. If the ID statement is specified, the ID variable values are aligned and extended based on the ALIGN= and INTERVAL= options. The values of the variables specified in the FORECAST statements are accumulated based on the ACCUMULATE= option, and missing values are interpreted based on the SETMISSING= option. If the REPLACEMISSING option is specified, embedded missing values are replaced by the one-step-ahead predicted values.

These FORECAST variables are then extrapolated based on the forecasts from the fitted models, or extended with missing values when the MODEL=NONE option is specified. If USE=LOWER is specified, the variable is extrapolated with the lower confidence limits; if USE=UPPER is specified, the variable is extrapolated with the upper confidence limits; otherwise, the variable is extrapolated with the predicted values. If the TRANSFORM= option is specified, the predicted values contain median forecasts when the MEDIAN option is specified and mean forecasts otherwise.

If any of the forecasting steps fail for a particular variable, the variable is extended by missing values.
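The following sketch shows how the options mentioned above could be combined for transactional data. The data set WORK.SALES and its variables REGION, SALEDATE, and AMOUNT are hypothetical, and the option values are chosen only for illustration.

```sas
/* Hypothetical input: WORK.SALES with REGION, SALEDATE, and AMOUNT. */
/* BY-group processing expects the input sorted by the BY variables. */
proc sort data=work.sales;
   by region saledate;
run;

proc esm data=work.sales out=work.sales_out lead=6;
   by region;
   /* Accumulate transactions to monthly totals, treat empty months  */
   /* as zero, and align ID values to the start of each month.       */
   id saledate interval=month accumulate=total setmissing=0
      align=beginning;
   /* Fill embedded missing values with one-step-ahead predictions   */
   /* and extend the series with the lower confidence limits.        */
   forecast amount / model=simple replacemissing use=lower;
run;
```

In this sketch, WORK.SALES_OUT would hold REGION, SALEDATE, and AMOUNT, with AMOUNT extended six months past the end of each region's data.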

OUTEST= Data Set

The OUTEST= data set contains the variables specified in the BY statement as well as the variables listed below. For variables listed in FORECAST statements where the option MODEL=NONE is specified, no observations are recorded in the OUTEST= data set. For variables listed in FORECAST statements where the option MODEL=NONE is not specified, the following variables in the OUTEST= data set contain observations related to the parameter estimation step:

_NAME_: variable name
_MODEL_: forecasting model
_TRANSFORM_: transformation
_PARM_: parameter name
_EST_: parameter estimate
_STDERR_: standard errors
_TVALUE_: t values
_PVALUE_: probability values

If the parameter estimation step fails for a particular variable, no observations are output to the OUTEST= data set for that variable.
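A short sketch for inspecting the parameter estimates follows. SASHELP.AIR, the WORK.AIR_EST name, and MODEL=LINEAR are assumptions made for illustration; OUT=_NULL_ is used here to suppress the OUT= data set.

```sas
/* Sketch: write only the parameter estimates and print them. */
proc esm data=sashelp.air out=_null_ outest=work.air_est;
   id date interval=month;
   forecast air / model=linear;
run;

/* One observation per estimated parameter of the fitted model. */
proc print data=work.air_est noobs;
   var _name_ _model_ _transform_ _parm_ _est_ _stderr_ _tvalue_ _pvalue_;
run;
```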

OUTFOR= Data Set

The OUTFOR= data set contains the variables specified in the BY statement as well as the variables listed below. For variables listed in FORECAST statements where the option MODEL=NONE is specified, no observations are recorded in the OUTFOR= data set for these variables. For variables listed in FORECAST statements where the option MODEL=NONE is not specified, the following variables in the OUTFOR= data set contain observations related to the forecasting step:

_NAME_: variable name
_TIMEID_: time ID values
ACTUAL: actual values
PREDICT: predicted values
STD: prediction standard errors
LOWER: lower confidence limits
UPPER: upper confidence limits
ERROR: prediction errors

If the forecasting step fails for a particular variable, no observations are recorded in the OUTFOR= data set for that variable. If the TRANSFORM= option is specified, the values in the preceding variables are the inverse transform forecasts. If the MEDIAN option is specified, the median forecasts are stored; otherwise, the mean forecasts are stored.
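The interaction of TRANSFORM= and MEDIAN with the OUTFOR= data set can be sketched as follows. The input data set, the output names, and the option values are illustrative assumptions.

```sas
/* Sketch: log-transformed Winters model with median forecasts.    */
/* The OUTFOR= values are on the original scale (inverse           */
/* transformed), as described above.                               */
proc esm data=sashelp.air out=_null_ outfor=work.air_for lead=12;
   id date interval=month;
   forecast air / model=winters transform=log median;
run;

/* Keep the extrapolated periods. This assumes the historical      */
/* series has no missing actuals, so ACTUAL is missing only past   */
/* the end of the data.                                            */
data work.air_future;
   set work.air_for;
   where actual is missing;
run;
```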

OUTPROCINFO= Data Set

The OUTPROCINFO= data set contains information about the run of the ESM procedure. The following variables are present:

_SOURCE_: set to the name of the procedure, in this case ESM
_NAME_: name of an item being reported; can be the number of errors, notes, or warnings, number of forecasts requested, and so on
_LABEL_: descriptive label for the item in _NAME_
_STAGE_: set to the current stage of the procedure; for ESM, this is set to ALL
_VALUE_: value of the item specified in _NAME_
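
As a sketch, the run information could be captured and listed as shown below. The input data set and the WORK.AIR_INFO name are illustrative assumptions.

```sas
/* Sketch: capture information about the procedure run. */
proc esm data=sashelp.air out=_null_ outprocinfo=work.air_info;
   id date interval=month;
   forecast air;
run;

/* Each observation reports one item, such as a count of errors, */
/* warnings, or notes.                                            */
proc print data=work.air_info noobs;
   var _source_ _name_ _label_ _stage_ _value_;
run;
```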

OUTSTAT= Data Set

The OUTSTAT= data set contains the variables specified in the BY statement as well as the variables listed below. For variables listed in FORECAST statements where the option MODEL=NONE is specified, no observations are recorded for these variables in the OUTSTAT= data set. For variables listed in FORECAST statements where the option MODEL=NONE is not specified, the following variables in the OUTSTAT= data set contain observations related to the statistics of fit:

_NAME_: variable name
_REGION_: the region in which the statistics are calculated. Statistics calculated in the fit region are indicated by FIT. Statistics calculated in the forecast region, which happens only if the BACK= option is greater than zero, are indicated by FORECAST.
DFE: degrees of freedom error
N: number of observations
NOBS: number of observations used
NMISSA: number of missing actuals
NMISSP: number of missing predicted values
NPARMS: number of parameters
TSS: total sum of squares
SST: corrected total sum of squares
SSE: sum of squared errors
MSE: mean square error
UMSE: unbiased mean square error
RMSE: root mean square error
URMSE: unbiased root mean square error
MAPE: mean absolute percent error
MAE: mean absolute error
MASE: mean absolute scaled error
RSQUARE: R square
ADJRSQ: adjusted R square
AADJRSQ: Amemiya’s adjusted R square
RWRSQ: random walk R square
AIC: Akaike information criterion
AICC: finite sample corrected AIC
SBC: Schwarz Bayesian information criterion
APC: Amemiya’s prediction criterion
MAXERR: maximum error
MINERR: minimum error
MINPE: minimum percent error
MAXPE: maximum percent error
ME: mean error
MPE: mean percent error
MDAPE: median absolute percent error
GMAPE: geometric mean absolute percent error
MINPPE: minimum predictive percent error
MAXPPE: maximum predictive percent error
MSPPE: mean predictive percent error
MAPPE: mean absolute predictive percent error
MDAPPE: median absolute predictive percent error
GMAPPE: geometric mean absolute predictive percent error
MINSPE: minimum symmetric percent error
MAXSPE: maximum symmetric percent error
MSPE: mean symmetric percent error
SMAPE: symmetric mean absolute percent error
MDASPE: median absolute symmetric percent error
GMASPE: geometric mean absolute symmetric percent error
MINRE: minimum relative error
MAXRE: maximum relative error
MRE: mean relative error
MRAE: mean relative absolute error
MDRAE: median relative absolute error
GMRAE: geometric mean relative absolute error
MINAPES: minimum absolute error percent of standard deviation
MAXAPES: maximum absolute error percent of standard deviation
MAPES: mean absolute error percent of standard deviation
MDAPES: median absolute error percent of standard deviation
GMAPES: geometric mean absolute error percent of standard deviation

If the statistics of fit cannot be computed for a particular variable, no observations are recorded in the OUTSTAT= data set for that variable. If the TRANSFORM= option is specified, the values in the preceding variables are computed based on the inverse transform forecasts. If the MEDIAN option is specified, the median forecasts are the basis; otherwise, the mean forecasts are the basis.

See Chapter 50, Forecasting Process Details, for more information about the calculation of forecasting statistics of fit.
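
The role of the BACK= option in producing both _REGION_ values can be sketched as follows. The input data set, the output name, the holdout length of 12, and the statistics selected for printing are illustrative assumptions.

```sas
/* Sketch: hold back the last 12 observations so that statistics  */
/* are reported for both the fit region and the forecast region.  */
proc esm data=sashelp.air out=_null_ outstat=work.air_stat
         back=12 lead=12;
   id date interval=month;
   forecast air / model=addwinters;
run;

/* Compare a few statistics across _REGION_ = FIT and FORECAST. */
proc print data=work.air_stat noobs;
   var _name_ _region_ nobs rmse mape aic;
run;
```

Without BACK=, only the FIT region would be expected in this data set, because the forecast region is populated only when BACK= is greater than zero.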

OUTSUM= Data Set

The OUTSUM= data set contains the variables specified in the BY statement as well as the variables listed below. The OUTSUM= data set records the summary statistics for each variable specified in a FORECAST statement. For variables listed in FORECAST statements where the option MODEL=NONE is specified, the values related to forecasts are set to missing for those variables in the OUTSUM= data set. For variables listed in FORECAST statements where the option MODEL=NONE is not specified, the forecast values are set based on the USE= option.

The following variables related to summary statistics are based on the ACCUMULATE= and SETMISSING= options:

_NAME_: variable name
_STATUS_: forecasting status. Nonzero values imply that no forecast was generated for the series.
NOBS: number of observations
N: number of nonmissing observations
NMISS: number of missing observations
MIN: minimum value
MAX: maximum value
MEAN: mean value
STDDEV: standard deviation

The following variables related to forecast summation are based on the LEAD= and STARTSUM= options:

PREDICT: forecast summation predicted values
STD: forecast summation prediction standard errors
LOWER: forecast summation lower confidence limits
UPPER: forecast summation upper confidence limits

Variance-related computations are performed only when no transformation is specified (TRANSFORM=NONE).

The following variables related to multistep forecasts are based on the LEAD= and USE= options:

_LEADn_: multistep forecast (n ranges from one to the value of the LEAD= option). If USE=LOWER, this variable contains the lower confidence limits; if USE=UPPER, this variable contains the upper confidence limits; otherwise, this variable contains the predicted values.

If the forecasting step fails for a particular variable, the variables that are related to forecasting are set to missing for that variable. The OUTSUM= data set contains a summary of each (accumulated) time series and, optionally, its forecasts.
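
A minimal sketch of requesting and listing the OUTSUM= data set follows. The input data set, the output name, and the values LEAD=6, STARTSUM=4, and USE=UPPER are illustrative assumptions; STARTSUM=4 is intended to begin the forecast summation at the fourth lead.

```sas
/* Sketch: one summary observation per forecast variable. */
proc esm data=sashelp.air out=_null_ outsum=work.air_sum
         lead=6 startsum=4;
   id date interval=month;
   /* No transformation, so variance-related values are available. */
   /* USE=UPPER places upper confidence limits in the multistep    */
   /* forecast variables.                                          */
   forecast air / model=winters use=upper;
run;

/* The _LEAD: prefix list selects all multistep forecast variables. */
proc print data=work.air_sum noobs;
   var _name_ _status_ nobs n nmiss predict lower upper _lead: ;
run;
```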