The HPFENGINE Procedure
ID Statement
The ID statement names a numeric variable that identifies observations in the input and output data sets. The ID variable’s values are assumed to be SAS date, time, or datetime values. In addition, the ID statement specifies the (desired) frequency associated with the actual time series. The ID statement options also specify how the observations are accumulated and how the time ID values are aligned to form the actual time series. The information specified affects all variables specified in subsequent FORECAST statements. If the ID statement is specified, the INTERVAL= option must also be specified. If an ID statement is not specified, the observation number (with respect to the BY group) is used as the time ID.
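As a minimal sketch of the statement in context, the following step names DATE as the time ID variable and declares a monthly frequency. The SASHELP.AIR sample data set, the output data set name WORK.AIR_FCST, and the 12-period lead are illustrative assumptions. The step also assumes that the default model selection list is acceptable, because no model repository is named.

```sas
/* Hypothetical example: monthly airline passenger counts from SASHELP.AIR */
proc hpfengine data=sashelp.air out=work.air_fcst lead=12;
   id date interval=month;   /* DATE holds SAS date values; INTERVAL= is required */
   forecast air;             /* the ID settings apply to this FORECAST variable   */
run;
```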
The following options can be used with the ID statement.
ACCUMULATE=option
specifies how the data set observations are accumulated within each time period. The frequency (width of each time interval) is specified by the INTERVAL= option. The ID variable contains the time ID values. Each time ID variable value corresponds to a specific time period. The accumulated values form the actual time series, which is used in subsequent model fitting and forecasting.
The ACCUMULATE= option is particularly useful when there are zero or more than one input observations that coincide with a particular time period (for example, transactional data). The EXPAND procedure offers additional frequency conversions and transformations that can also be useful in creating a time series.
The following options determine how the observations are accumulated within each time period based on the ID variable and the frequency specified by the INTERVAL= option:
NONE: No accumulation occurs; the ID variable values must be equally spaced with respect to the frequency. This is the default option.
TOTAL: Observations are accumulated based on the total sum of their values.
AVERAGE | AVG: Observations are accumulated based on the average of their values.
MINIMUM | MIN: Observations are accumulated based on the minimum of their values.
MEDIAN | MED: Observations are accumulated based on the median of their values.
MAXIMUM | MAX: Observations are accumulated based on the maximum of their values.
N: Observations are accumulated based on the number of nonmissing observations.
NMISS: Observations are accumulated based on the number of missing observations.
NOBS: Observations are accumulated based on the number of observations.
FIRST: Observations are accumulated based on the first of their values.
LAST: Observations are accumulated based on the last of their values.
STDDEV | STD: Observations are accumulated based on the standard deviation of their values.
CSS: Observations are accumulated based on the corrected sum of squares of their values.
USS: Observations are accumulated based on the uncorrected sum of squares of their values.
If the ACCUMULATE= option is specified, the SETMISSING= option is useful for specifying how accumulated missing values are treated. If missing values should be interpreted as zero, then SETMISSING=0 should be used. The section Details: HPFENGINE Procedure describes accumulation in greater detail.
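The interaction between accumulation and missing-value handling can be sketched with a small transactional data set. Everything named here (WORK.SALES_TXN, SALEDATE, AMOUNT, the five-day spacing, and the skipped August transactions) is a made-up illustration, not part of the procedure. Transactions are summed into monthly totals, and months with no transactions are treated as zero sales.

```sas
/* Hypothetical transactional data: one sale every 5 days, none in August */
data work.sales_txn;
   format saledate date9.;
   do saledate = '01JAN2005'd to '31DEC2007'd by 5;
      amount = 10 + mod(saledate, 7);
      if month(saledate) ne 8 then output;   /* leave August empty each year */
   end;
run;

proc hpfengine data=work.sales_txn out=work.sales_fcst lead=6;
   /* Sum all transactions that fall in the same month;      */
   /* a month with no transactions accumulates to missing,   */
   /* which SETMISSING=0 then interprets as zero sales.      */
   id saledate interval=month accumulate=total setmissing=0;
   forecast amount;
run;
```

SETMISSING=0 suits this case only because an empty month here means no activity. If empty months meant unrecorded data, the option would be left at its default.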
ALIGN=option
controls the alignment of SAS dates that are used to identify output observations. The ALIGN= option accepts the following values: BEGINNING | BEG | B, MIDDLE | MID | M, and ENDING | END | E. The default is BEGINNING.
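A short sketch of a non-default alignment follows. The SASHELP.AIR sample data set and the output data set name are assumptions chosen for illustration.

```sas
proc hpfengine data=sashelp.air out=work.air_fcst lead=12;
   /* Time ID values in the output are aligned to the end of each month */
   /* instead of the default beginning of each month.                   */
   id date interval=month align=ending;
   forecast air;
run;
```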
END=option
specifies a SAS date, datetime, or time value that represents the end of the data. If the last time ID variable value is less than the END= value, the series is extended with missing values. If the last time ID variable value is greater than the END= value, the series is truncated. For example, END="&sysdate"D uses the automatic macro variable SYSDATE to extend or truncate the series to the current date.
FORMAT=format
specifies a SAS format that is used for the DATE variable in the output data sets. The default format is the same as that of the DATE variable in the DATA= data set.
HORIZONSTART=option
specifies a SAS date, datetime, or time value that represents the start of the forecast horizon. If the specified HORIZONSTART= date falls beyond the end of the historical data, then forecasts are computed from the last observation with a nonmissing dependent variable value until LEAD= intervals from the HORIZONSTART= date. Therefore, the effective forecast horizon for any particular BY group might differ from another due to differences in the lengths of the historical data across the BY groups, but all forecasts end at the same date, as determined by the HORIZONSTART= and LEAD= options.
An important feature of the HORIZONSTART= option is that it truncates values only in forecast variables. Any future values of input variables are retained.
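The following sketch shows a forecast horizon that starts inside the available history. It assumes the SASHELP.AIR sample data set, whose monthly AIR series runs through December 1960; the output data set name and the lead are also illustrative.

```sas
proc hpfengine data=sashelp.air out=work.air_fcst lead=12;
   /* Start the forecast horizon at January 1960. Per the description  */
   /* above, values of the FORECAST variable from that date onward are */
   /* truncated, so the 12 forecasts should span January through       */
   /* December 1960.                                                   */
   id date interval=month horizonstart='01JAN1960'd;
   forecast air;
run;
```

Holding back the last year in this way is one plausible use of the option, because the forecasts can then be compared with the actual 1960 values in the input data.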
INTERVAL=interval
specifies the frequency of the input time series. For example, if the input data set consists of quarterly observations, then INTERVAL=QTR should be used. If the SEASONALITY= option is not specified, the length of the seasonal cycle is implied from the INTERVAL= option. For example, INTERVAL=QTR implies a seasonal cycle of length 4. If the ACCUMULATE= option is also specified, the INTERVAL= option determines the time periods for the accumulation of observations. See the SAS/ETS User’s Guide for the intervals that can be specified.
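The quarterly case can be sketched as follows. The data set WORK.QTR_SALES, its variables, and the synthetic fourth-quarter bump are hypothetical and exist only to give the step something to forecast.

```sas
/* Hypothetical quarterly series: 24 quarters with a fourth-quarter bump */
data work.qtr_sales;
   format date yyq6.;
   do i = 0 to 23;
      date  = intnx('qtr', '01JAN2002'd, i);
      sales = 100 + 5*i + 20*(qtr(date) = 4);
      output;
   end;
   drop i;
run;

proc hpfengine data=work.qtr_sales out=work.qtr_fcst lead=4;
   /* INTERVAL=QTR declares quarterly data; because SEASONALITY= is */
   /* not specified, a seasonal cycle of length 4 is implied.       */
   id date interval=qtr;
   forecast sales;
run;
```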
SETMISSING=option | number
specifies how missing values (either actual or accumulated) are assigned in the accumulated time series. If a number is specified, missing values are set to that number. If a missing value indicates an unknown value, this option should not be used. If a missing value indicates no value, SETMISSING=0 should be used. You would typically use SETMISSING=0 for transactional data because the absence of recorded data usually implies no activity. The following options can also be used to determine how missing values are assigned:
MISSING: Missing values are set to missing. This is the default option.
AVERAGE | AVG: Missing values are set to the accumulated average value.
MINIMUM | MIN: Missing values are set to the accumulated minimum value.
MEDIAN | MED: Missing values are set to the accumulated median value.
MAXIMUM | MAX: Missing values are set to the accumulated maximum value.
FIRST: Missing values are set to the accumulated first nonmissing value.
LAST: Missing values are set to the accumulated last nonmissing value.
PREVIOUS | PREV: Missing values are set to the previous accumulated nonmissing value. Missing values at the beginning of the accumulated series remain missing.
NEXT: Missing values are set to the next accumulated nonmissing value. Missing values at the end of the accumulated series remain missing.
If SETMISSING=MISSING is specified and the MODEL= option specifies a smoothing model, the missing observations are smoothed over. If MODEL=IDM is specified, missing values are assumed to be periods of no demand; that is, SETMISSING=MISSING is equivalent to SETMISSING=0.
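As a sketch of a keyword value rather than a number, the following step carries the previous nonmissing value forward with SETMISSING=PREVIOUS. The data set WORK.STOCK_LEVEL and its contents are hypothetical. The example assumes a stock level that persists between recordings, so a gap means "unchanged" rather than "unknown".

```sas
/* Hypothetical daily stock levels with some days left unrecorded */
data work.stock_level;
   format date date9.;
   do date = '01JAN2007'd to '31MAR2007'd;
      level = 500 - mod(date, 30);
      if mod(date, 10) = 0 then level = .;   /* simulate unrecorded days */
      output;
   end;
run;

proc hpfengine data=work.stock_level out=work.stock_fcst lead=7;
   /* Each missing value is set to the previous nonmissing value;    */
   /* missing values at the very start of the series stay missing.   */
   id date interval=day setmissing=previous;
   forecast level;
run;
```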
START=option
specifies a SAS date, datetime, or time value that represents the beginning of the data. If the first time ID variable value is greater than the START= value, the series is prefixed with missing values. If the first time ID variable value is less than the START= value, the series is truncated. This option and the END= option can be used to ensure that data associated with each BY group contain the same number of observations.
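The BY-group use case can be sketched as follows. The data set WORK.REGION_SALES, the REGION and SALES variables, and the date range are hypothetical. One region has a year less history than the other, and START= and END= put both series on the same 36-month span.

```sas
/* Hypothetical data: EAST has 36 months of history, WEST only the last 24 */
data work.region_sales;
   format date monyy7.;
   length region $5;
   do i = 0 to 35;
      date = intnx('month', '01JAN2005'd, i);
      region = 'EAST'; sales = 200 + i; output;
   end;
   do i = 12 to 35;
      date = intnx('month', '01JAN2005'd, i);
      region = 'WEST'; sales = 150 + 2*i; output;
   end;
   drop i;
run;

proc hpfengine data=work.region_sales out=work.region_fcst lead=12;
   by region;   /* the input is already ordered by REGION and DATE */
   /* Both BY groups are framed from January 2005 to December 2007; */
   /* WEST is prefixed with missing values for its first 12 months. */
   id date interval=month start='01JAN2005'd end='01DEC2007'd;
   forecast sales;
run;
```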
TRIMMISS=option
specifies how missing values (either actual or accumulated) are trimmed from the accumulated time series for variables listed in the FORECAST statement. The following options are provided:
NONE: No missing value trimming is applied.
LEFT: Beginning missing values are trimmed.
RIGHT: Ending missing values are trimmed.
BOTH: Both beginning and ending missing values are trimmed. This is the default.
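A sketch of trimming only the beginning of a series follows. The data set WORK.NEW_PRODUCT and its variables are hypothetical; the first six months are recorded as missing because the product had not yet launched.

```sas
/* Hypothetical series: missing before the July 2005 launch */
data work.new_product;
   format date monyy7.;
   do i = 0 to 35;
      date = intnx('month', '01JAN2005'd, i);
      if i < 6 then units = .;
      else units = 40 + i;
      output;
   end;
   drop i;
run;

proc hpfengine data=work.new_product out=work.new_product_fcst lead=6;
   /* Trim only the beginning missing values; any ending missing */
   /* values would be kept as part of the series.                */
   id date interval=month trimmiss=left;
   forecast units;
run;
```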
ZEROMISS=option
specifies how beginning and/or ending zero values (either actual or accumulated) are interpreted in the accumulated time series. The following options determine how beginning and/or ending zero values are assigned:
NONE: Beginning and/or ending zeros are unchanged. This is the default.
LEFT: Beginning zeros are set to missing.
RIGHT: Ending zeros are set to missing.
BOTH: Both beginning and ending zeros are set to missing.
If the accumulated series is all missing and/or zero, the series is not changed.
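The zero-value case can be sketched in the same way. The data set WORK.LAUNCH_SALES and its variables are hypothetical; the series records zero units for six months before sales begin, and those leading zeros are assumed to mean "not yet on sale" rather than genuine zero demand.

```sas
/* Hypothetical series: zero for the first six months, then growing */
data work.launch_sales;
   format date monyy7.;
   do i = 0 to 35;
      date  = intnx('month', '01JAN2005'd, i);
      units = max(0, 3*(i - 5));
      output;
   end;
   drop i;
run;

proc hpfengine data=work.launch_sales out=work.launch_fcst lead=6;
   /* Beginning zeros are set to missing; zeros elsewhere are unchanged */
   id date interval=month zeromiss=left;
   forecast units;
run;
```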
Copyright © 2008 by SAS Institute Inc., Cary, NC, USA. All rights reserved.