The TIMESERIES Procedure

ID Statement

  • ID variable INTERVAL=interval <options>;

The ID statement names a numeric variable that identifies observations in the input and output data sets. The ID variable’s values are assumed to be SAS date or datetime values. In addition, the ID statement specifies the frequency to be associated with the time series. The ID statement options also specify how the observations are accumulated and how the time ID values are aligned to form the time series. The specified information affects all variables that are listed in subsequent VAR statements. If you do not specify an ID statement, the observation number, with respect to the BY group, is used as the time ID.

You must specify the following argument:

INTERVAL=interval

specifies the frequency of the accumulated time series. For example, if the input data set consists of quarterly observations, then specify INTERVAL=QTR. If the PROC TIMESERIES statement SEASONALITY= option is not specified, the length of the seasonal cycle is implied from the INTERVAL= option. For example, INTERVAL=QTR implies a seasonal cycle of length 4. If the ACCUMULATE= option is also specified, the INTERVAL= option determines the time periods for the accumulation of observations. The INTERVAL= option is required and must be the first option specified in the ID statement.
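As a minimal sketch of the required argument, the following step reads the monthly SASHELP.AIR sample data set (variables DATE and AIR) and declares DATE as the time ID with a monthly frequency. The output data set name WORK.AIR_TS is a hypothetical name chosen for illustration.

```sas
/* Minimal sketch: INTERVAL= is the first option in the ID statement. */
/* WORK.AIR_TS is a hypothetical output data set name.                */
proc timeseries data=sashelp.air out=work.air_ts;
   id date interval=month;   /* monthly series; implies a seasonal cycle of length 12 */
   var air;
run;
```

Because the SEASONALITY= option is not specified here, the length of the seasonal cycle is implied by INTERVAL=MONTH.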

You can also specify the following options:

ACCUMULATE=option

specifies how the data set observations are to be accumulated within each time period. The frequency (width of each time interval) is specified by the INTERVAL= option. The ID variable contains the time ID values. Each time ID variable value corresponds to a specific time period. The accumulated values form the time series, which is used in subsequent analysis.

This option is useful when zero observations, or more than one observation, coincide with a particular time period (for example, time-stamped transactional data). The EXPAND procedure offers additional frequency conversions and transformations that can also be useful in creating a time series.

You can specify the following options, which determine how the observations are accumulated within each time period based on the ID variable and on the frequency specified in the INTERVAL= option:

NONE

does not accumulate observations; the ID variable values must be equally spaced with respect to the frequency.

TOTAL

accumulates observations based on the total sum of their values.

AVERAGE | AVG

accumulates observations based on the average of their values.

MINIMUM | MIN

accumulates observations based on the minimum of their values.

MEDIAN | MED

accumulates observations based on the median of their values.

MAXIMUM | MAX

accumulates observations based on the maximum of their values.

N

accumulates observations based on the number of nonmissing observations.

NMISS

accumulates observations based on the number of missing observations.

NOBS

accumulates observations based on the number of observations.

FIRST

accumulates observations based on the first of their values.

LAST

accumulates observations based on the last of their values.

STDDEV | STD

accumulates observations based on the standard deviation of their values.

CSS

accumulates observations based on the corrected sum of squares of their values.

USS

accumulates observations based on the uncorrected sum of squares of their values.

If you specify the ACCUMULATE= option, the SETMISSING= option is useful for specifying how accumulated missing values are to be treated. If missing values are to be interpreted as 0, then specify SETMISSING=0. For more information about accumulation, see the section Details: TIMESERIES Procedure.

By default, ACCUMULATE=NONE.
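The following sketch illustrates accumulation of transactional data into a monthly series. The data set WORK.SALES, its variables SALEDATE and AMOUNT, and the output data set WORK.MONTHLY are hypothetical names and values made up for illustration.

```sas
/* Hypothetical time-stamped transactions: two in January, none in */
/* February, one in March.                                          */
data work.sales;
   input saledate :date9. amount;
   format saledate date9.;
   datalines;
03JAN2024 10
17JAN2024 5
09MAR2024 7
;
run;

/* Sum the transactions within each month. February has no input   */
/* observations, so its accumulated value is missing; SETMISSING=0 */
/* then interprets that missing value as 0.                         */
proc timeseries data=work.sales out=work.monthly;
   id saledate interval=month accumulate=total setmissing=0;
   var amount;
run;
```

Under the descriptions above, the January value should be the sum of the two January transactions, and February should be set to 0 rather than left missing.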

ALIGN=option

controls the alignment of SAS dates that are used to identify output observations. The ALIGN= option accepts the following values: BEGINNING | BEG | B, MIDDLE | MID | M, and ENDING | END | E. BEGINNING is the default.
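As a brief sketch, the following step requests that each output time ID value be aligned to the end of its month instead of the beginning. It uses the SASHELP.AIR sample data set; WORK.AIR_END is a hypothetical output data set name.

```sas
/* Sketch: align the output time ID values to the end of each month. */
/* WORK.AIR_END is a hypothetical output data set name.              */
proc timeseries data=sashelp.air out=work.air_end;
   id date interval=month align=ending;
   var air;
run;
```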

BOUNDARYALIGN=option

controls how the ACCUMULATE= option is processed for the two boundary time intervals, which include the START= and END= time ID values. Some time ID values might fall inside the first and last accumulation intervals but fall outside the START= and END= boundaries. In these cases the BOUNDARYALIGN= option determines which values to include in the accumulation operation. You can specify the following options:

NONE

does not accumulate any values outside the START= and END= boundaries.

START

accumulates all observations in the first time interval.

END

accumulates all observations in the last time interval.

BOTH

accumulates all observations in the first and last time intervals.

For more information, see the section Details: TIMESERIES Procedure. By default, BOUNDARYALIGN=NONE.
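The following sketch shows a case where the boundary intervals matter. The data set WORK.EVENTS, its variables, and the output data set WORK.EVENTS_TS are hypothetical. The START= and END= values fall in the middle of January and March, so one observation lies inside the first interval but before START=, and one lies inside the last interval but after END=.

```sas
/* Hypothetical event counts. 05JAN2024 precedes START= and          */
/* 25MAR2024 follows END=, but both fall inside boundary intervals.  */
data work.events;
   input eventdate :date9. count;
   format eventdate date9.;
   datalines;
05JAN2024 1
20JAN2024 2
10FEB2024 3
25MAR2024 4
;
run;

/* BOUNDARYALIGN=BOTH accumulates all observations in the first and  */
/* last intervals, including those outside the START=/END= bounds.   */
proc timeseries data=work.events out=work.events_ts;
   id eventdate interval=month accumulate=total
      start='15JAN2024'd end='15MAR2024'd boundaryalign=both;
   var count;
run;
```

With the default BOUNDARYALIGN=NONE, the 05JAN2024 and 25MAR2024 observations would not be included in the accumulation.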

END=value

specifies a SAS date or datetime value that represents the end of the data. If the last time ID variable value is less than value, the series is extended with missing values. If the last time ID variable value is greater than value, the series is truncated. For example, END="&sysdate"D uses the automatic macro variable SYSDATE to extend or truncate the series to the current date. You can use the START= and END= options to ensure that data associated with each BY group contains the same number of observations.

FORMAT=format

specifies the SAS format for the time ID values. The default format is implied from the INTERVAL= option.

NOTSORTED

specifies that the time ID values might not be in sorted order. Prior to analysis, the TIMESERIES procedure sorts the data with respect to the time ID.
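A short sketch of the NOTSORTED option follows. The data set WORK.UNSORTED, its variables, and the output data set WORK.SORTED_TS are hypothetical; the input rows are deliberately out of time order.

```sas
/* Hypothetical input whose time ID values are not in sorted order. */
data work.unsorted;
   input saledate :date9. amount;
   format saledate date9.;
   datalines;
09MAR2024 7
03JAN2024 10
14FEB2024 5
;
run;

/* NOTSORTED tells the procedure that the time ID values might not  */
/* be sorted; the data are sorted by time ID before analysis.       */
proc timeseries data=work.unsorted out=work.sorted_ts;
   id saledate interval=month accumulate=total notsorted;
   var amount;
run;
```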

SETMISSING=option | number

specifies how missing values (either actual or accumulated) are to be interpreted in the accumulated time series. If you specify a number, missing values are set to the number. If a missing value indicates an unknown value, this option should not be used. If a missing value indicates no value, specify SETMISSING=0. You would typically use SETMISSING=0 for transactional data because no recorded data usually implies no activity. Instead of specifying a number, you can specify one of the following options to determine how missing values are assigned:

MISSING

sets missing values to missing.

AVERAGE | AVG

sets missing values to the accumulated average value.

MINIMUM | MIN

sets missing values to the accumulated minimum value.

MEDIAN | MED

sets missing values to the accumulated median value.

MAXIMUM | MAX

sets missing values to the accumulated maximum value.

FIRST

sets missing values to the accumulated first nonmissing value.

LAST

sets missing values to the accumulated last nonmissing value.

PREVIOUS | PREV

sets missing values to the previous period’s accumulated nonmissing value. Missing values at the beginning of the accumulated series remain missing.

NEXT

sets missing values to the next period’s accumulated nonmissing value. Missing values at the end of the accumulated series remain missing.

By default, SETMISSING=MISSING.
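The following sketch carries the previous period's value forward into a period that has no observations. The data set WORK.READINGS, its variables, and the output data set WORK.READINGS_TS are hypothetical.

```sas
/* Hypothetical readings: two in January, none in February, one in March. */
data work.readings;
   input readdate :date9. level;
   format readdate date9.;
   datalines;
10JAN2024 12
28JAN2024 14
15MAR2024 9
;
run;

/* ACCUMULATE=LAST keeps the last reading in each month.              */
/* February accumulates to missing, and SETMISSING=PREVIOUS replaces  */
/* it with the previous period's accumulated nonmissing value.        */
proc timeseries data=work.readings out=work.readings_ts;
   id readdate interval=month accumulate=last setmissing=previous;
   var level;
run;
```

Under the descriptions above, February should take January's accumulated value (the 28JAN2024 reading). This choice suits level-type data; for transactional data, SETMISSING=0 is usually the more natural interpretation.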

START=value

specifies a SAS date or datetime value that represents the beginning of the data. If the first time ID variable value is greater than value, missing values are added to the beginning of the series. If the first time ID variable value is less than value, the series is truncated. You can specify the START= and END= options to ensure that data associated with each BY group contains the same number of observations.
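As a sketch of using START= and END= together with BY groups, the following step forces every group onto the same January-to-April span. The data set WORK.ORDERS, its variables STORE, ORDERDATE, and QTY, and the output data set WORK.ORDERS_TS are hypothetical; the input is already sorted by STORE.

```sas
/* Hypothetical orders for two stores that cover different date ranges. */
data work.orders;
   input store $ orderdate :date9. qty;
   format orderdate date9.;
   datalines;
A 05JAN2024 4
A 20FEB2024 2
B 11FEB2024 6
B 02APR2024 1
;
run;

/* START= pads store B at the beginning and END= pads store A at the  */
/* end, so each BY group should span the same four monthly periods.   */
/* END= is set to the last day of April so that the 02APR2024 order   */
/* lies inside the END= boundary.                                     */
proc timeseries data=work.orders out=work.orders_ts;
   by store;
   id orderdate interval=month accumulate=total
      start='01JAN2024'd end='30APR2024'd;
   var qty;
run;
```

Periods that are added by START= or END= contain missing values unless the SETMISSING= option specifies otherwise.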