The SIMILARITY Procedure

Accumulation

If the ACCUMULATE= option is specified in the ID, INPUT, or TARGET statement, data set observations are accumulated within each time period. The frequency (width of each time interval) is specified by the INTERVAL= option in the ID statement. The ID variable contains the time ID values. Each time ID value corresponds to a specific time period. Accumulation is particularly useful when the input data set contains transactional data, whose observations are not spaced with respect to any particular time interval. The accumulated values form the time series, which is used in subsequent analyses.

For example, suppose a data set contains the following observations:

   19MAR1999    10
   19MAR1999    30
   11MAY1999    50
   12MAY1999    20
   23MAY1999    20

If INTERVAL=MONTH is specified, the preceding observations span three time periods: March 1999, April 1999, and May 1999. No observation falls in April 1999. The observations are accumulated within each time period as follows:

If the ACCUMULATE=NONE option is specified, an error is generated because the ID variable values are not equally spaced with respect to the specified frequency (MONTH).
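One way to picture this check is the following minimal Python sketch. It is an illustration only, not SAS code and not the procedure's actual implementation. The helper names `month_periods` and `spacing_problems` are hypothetical, and the sketch assumes that "equally spaced" means exactly one observation in each monthly period.

```python
from collections import Counter
from datetime import date

# The five example observations as (time ID, value) pairs.
observations = [
    (date(1999, 3, 19), 10),
    (date(1999, 3, 19), 30),
    (date(1999, 5, 11), 50),
    (date(1999, 5, 12), 20),
    (date(1999, 5, 23), 20),
]

def month_periods(first, last):
    """Return the first day of every month from `first` through `last`."""
    periods = []
    year, month = first.year, first.month
    while (year, month) <= (last.year, last.month):
        periods.append(date(year, month, 1))
        year, month = (year + 1, 1) if month == 12 else (year, month + 1)
    return periods

def spacing_problems(obs):
    """Map each monthly period that does not hold exactly one
    observation to the number of observations it does hold."""
    counts = Counter(d.replace(day=1) for d, _ in obs)
    dates = [d for d, _ in obs]
    return {
        period: counts[period]  # Counter yields 0 for empty periods
        for period in month_periods(min(dates), max(dates))
        if counts[period] != 1
    }

problems = spacing_problems(observations)
for period, n in problems.items():
    print(period.isoformat(), "holds", n, "observation(s)")
```

In this sketch every period is flagged: March and May hold more than one observation and April holds none, so the raw data cannot serve as a monthly series without accumulation.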

If the ACCUMULATE=TOTAL option is specified, the data are accumulated as follows:

   01MAR1999    40
   01APR1999    .
   01MAY1999    90

If the ACCUMULATE=AVERAGE option is specified, the data are accumulated as follows:

   01MAR1999    20
   01APR1999    .
   01MAY1999    30

If the ACCUMULATE=MINIMUM option is specified, the data are accumulated as follows:

   01MAR1999    10
   01APR1999    .
   01MAY1999    20

If the ACCUMULATE=MEDIAN option is specified, the data are accumulated as follows:

   01MAR1999    20
   01APR1999    .
   01MAY1999    20

If the ACCUMULATE=MAXIMUM option is specified, the data are accumulated as follows:

   01MAR1999    30
   01APR1999    .
   01MAY1999    50

If the ACCUMULATE=FIRST option is specified, the data are accumulated as follows:

   01MAR1999    10
   01APR1999    .
   01MAY1999    50

If the ACCUMULATE=LAST option is specified, the data are accumulated as follows:

   01MAR1999    30
   01APR1999    .
   01MAY1999    20

If the ACCUMULATE=STDDEV option is specified, the data are accumulated as follows:

   01MAR1999    14.14
   01APR1999    .
   01MAY1999    17.32

As can be seen from the preceding examples, even though the data set observations contain no missing values, the accumulated time series can have missing values.
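The accumulated values in the preceding examples can be reproduced with a short Python sketch. It is offered only to make the arithmetic concrete and does not describe how the procedure is implemented. The `ACCUMULATORS` table and the `accumulate` function are hypothetical names. The sketch assumes that observations arrive in data set order (which matters for FIRST and LAST) and that STDDEV is the sample standard deviation with an n - 1 divisor, which is consistent with the values 14.14 and 17.32 shown above. A missing value is represented by `None`.

```python
import statistics
from collections import defaultdict
from datetime import date

# The five example observations as (time ID, value) pairs, in data set order.
observations = [
    (date(1999, 3, 19), 10),
    (date(1999, 3, 19), 30),
    (date(1999, 5, 11), 50),
    (date(1999, 5, 12), 20),
    (date(1999, 5, 23), 20),
]

# Hypothetical mapping from ACCUMULATE= option names to Python functions.
ACCUMULATORS = {
    "TOTAL": sum,
    "AVERAGE": statistics.mean,
    "MINIMUM": min,
    "MEDIAN": statistics.median,
    "MAXIMUM": max,
    "FIRST": lambda values: values[0],
    "LAST": lambda values: values[-1],
    # Sample standard deviation; needs at least two values per period.
    "STDDEV": statistics.stdev,
}

def accumulate(obs, option):
    """Accumulate observations into a monthly series keyed by the first
    day of each month. Periods with no observations map to None."""
    buckets = defaultdict(list)
    for d, value in obs:
        buckets[d.replace(day=1)].append(value)
    first, last = min(buckets), max(buckets)
    func = ACCUMULATORS[option]
    series = {}
    year, month = first.year, first.month
    while (year, month) <= (last.year, last.month):
        period = date(year, month, 1)
        values = buckets.get(period)  # .get does not create empty buckets
        series[period] = func(values) if values else None
        year, month = (year + 1, 1) if month == 12 else (year, month + 1)
    return series

for option in ACCUMULATORS:
    print(option, list(accumulate(observations, option).values()))
```

In every series the April 1999 entry is `None`, which mirrors the missing value (.) in the tables: the gap comes from the empty period, not from any missing value in the input.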