The SIMILARITY Procedure

The SIMILARITY procedure can be used to form time series data from transactional data.

- accumulation: ACCUMULATE= option
- missing value interpretation: SETMISSING= option
- zero value interpretation: ZEROMISS= option
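To illustrate what accumulation and missing value interpretation mean conceptually, the following is a minimal Python sketch, not SAS syntax and not the procedure's implementation. The function names `accumulate_monthly` and `interpret_missing`, the monthly interval, and the `"total"` and `"average"` modes are assumptions chosen for illustration only.

```python
from collections import defaultdict
from datetime import date


def accumulate_monthly(transactions, how="total"):
    """Accumulate (date, value) transactions into one value per calendar month.

    Months with no transactions stay in the series as None (missing).
    Assumes at least one transaction is supplied.
    """
    buckets = defaultdict(list)
    for day, value in transactions:
        buckets[(day.year, day.month)].append(value)

    year, month = min(buckets)
    last = max(buckets)
    series = {}
    while (year, month) <= last:
        values = buckets.get((year, month))
        if values is None:
            series[(year, month)] = None  # no transactions in this period
        elif how == "total":
            series[(year, month)] = sum(values)
        elif how == "average":
            series[(year, month)] = sum(values) / len(values)
        else:
            raise ValueError(f"unsupported accumulation: {how}")
        # Step to the next calendar month.
        year, month = (year + 1, 1) if month == 12 else (year, month + 1)
    return series


def interpret_missing(series, fill):
    """Replace missing periods with a chosen value (hypothetical helper)."""
    return {k: (fill if v is None else v) for k, v in series.items()}


transactions = [
    (date(2008, 1, 5), 10.0),
    (date(2008, 1, 20), 5.0),
    (date(2008, 3, 2), 7.0),
]
monthly = accumulate_monthly(transactions)
filled = interpret_missing(monthly, fill=0.0)
```

In this sketch, February has no transactions, so it appears as a missing period until `interpret_missing` assigns it a value.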

The accumulated time series can then be transformed to form the working time series. Transformations are useful when you want to stabilize the time series before computing the similarity measures. Simple and seasonal differencing are useful when you want to detrend or deseasonalize the time series before computing the similarity measures. Often, but not always, the TRANSFORM=, DIF=, and SDIF= options should be specified in the same way for both the target and input variables.

- time series transformation: TRANSFORM= option
- time series differencing: DIF= and SDIF= options
- time series missing value trimming: TRIMMISSING= option
- time series descriptive statistics: PRINT=DESCSTATS option
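As a rough illustration of transformation and differencing, the following Python sketch applies a log transform and then simple and seasonal differences to a list of values. It is not the procedure's implementation: the function names are hypothetical, and this sketch drops the first `lag` observations rather than keeping them as missing values.

```python
import math


def log_transform(series):
    """Log transform, often used to stabilize variance (values must be > 0)."""
    return [math.log(x) for x in series]


def difference(series, lag=1):
    """Difference at the given lag.

    lag=1 gives a simple difference (removes a linear trend);
    lag=s gives a seasonal difference for a seasonal cycle of length s.
    """
    return [series[t] - series[t - lag] for t in range(lag, len(series))]


trend = [1, 3, 6, 10]
simple = difference(trend)                   # lag-1 (simple) difference
seasonal = difference([1, 2, 3, 4, 5, 6], lag=4)  # lag-4 (seasonal) difference
```

Applying the same transform and differencing to both the input and the target keeps the two working series on a comparable footing, which is the usual reason for specifying these options identically.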

After the working series is formed, it can be treated as an ordered sequence that can be normalized or scaled. Normalizations are useful when you want to compare the "shape" or "profile" of the time series. Scaling is useful when you want to compare the input sequence to the target sequence while discounting the variation of the target sequence.

- normalization: NORMALIZE= option
- scaling: SCALE= option
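The difference between normalizing a sequence and scaling an input to a target can be sketched in Python as follows. The formulas here are assumptions for illustration (a z-score using the population standard deviation, a min-max normalization, and a range-matching rescale); the exact definitions used by the procedure are given in the Sequence Normalization and Sequence Scaling sections.

```python
from statistics import mean, pstdev


def normalize_standard(seq):
    """Center by the mean and divide by the (population) standard deviation."""
    m, s = mean(seq), pstdev(seq)
    return [(x - m) / s for x in seq]


def normalize_absolute(seq):
    """Map the sequence onto [0, 1] using its own minimum and maximum."""
    lo, hi = min(seq), max(seq)
    return [(x - lo) / (hi - lo) for x in seq]


def scale_to_target(input_seq, target_seq):
    """Rescale the input so that its range matches the target's range."""
    lo_i, hi_i = min(input_seq), max(input_seq)
    lo_t, hi_t = min(target_seq), max(target_seq)
    return [(x - lo_i) / (hi_i - lo_i) * (hi_t - lo_t) + lo_t for x in input_seq]


shape = normalize_absolute([2, 4, 6])
rescaled = scale_to_target([0, 5, 10], [100, 200])
```

Normalization treats each sequence on its own, so only its shape remains; scaling uses the target's statistics to adjust the input. Both helpers assume a non-constant sequence, because a constant sequence would cause a division by zero.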

After the working sequences are formed, similarity measures can be computed between input and target sequences.

- sliding: SLIDE= option
- warping: COMPRESS= and EXPAND= options
- similarity measure: MEASURE= and PATH= options

The SLIDE= option specifies observation-index sliding, seasonal-index sliding, or no sliding. The COMPRESS= and EXPAND= options specify the warping limits. The MEASURE= and PATH= options specify how the similarity measures are computed.
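The ideas of warping and sliding can be sketched with a textbook dynamic time warping (DTW) distance and a simple index slide. This is a generic Python illustration, not the procedure's algorithm: it uses absolute deviation as the local cost, imposes no compression or expansion limits, and the names `dtw_distance` and `best_slide` are hypothetical.

```python
def dtw_distance(a, b):
    """Classic dynamic time warping distance with absolute-deviation cost.

    cost[i][j] is the cheapest cumulative cost of aligning a[:i] with b[:j].
    """
    inf = float("inf")
    n, m = len(a), len(b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(
                cost[i - 1][j],      # repeat b[j-1] against the next a value
                cost[i][j - 1],      # repeat a[i-1] against the next b value
                cost[i - 1][j - 1],  # advance both sequences
            )
    return cost[n][m]


def best_slide(pattern, series):
    """Slide the pattern along the series by observation index.

    Returns (offset, distance) for the window of len(pattern) observations
    with the smallest DTW distance. Assumes len(pattern) <= len(series).
    """
    width = len(pattern)
    candidates = [
        (dtw_distance(pattern, series[k:k + width]), k)
        for k in range(len(series) - width + 1)
    ]
    distance, offset = min(candidates)
    return offset, distance


warped = dtw_distance([1, 2, 3], [1, 2, 2, 3])
offset, distance = best_slide([5, 9, 5], [0, 0, 5, 9, 5, 0])
```

Because warping lets one observation align with several, the sequences `[1, 2, 3]` and `[1, 2, 2, 3]` match exactly in this sketch. Warping limits would restrict how many such repeats the alignment path can take.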

- Accumulation
- Missing Value Interpretation
- Zero Value Interpretation
- Time Series Transformation
- Time Series Differencing
- Time Series Missing Value Trimming
- Time Series Descriptive Statistics
- Input and Target Sequences
- Sliding Sequences
- Time Warping
- Sequence Normalization
- Sequence Scaling
- Similarity Measures
- User-Defined Functions and Subroutines
- Output Data Sets
- OUT= Data Set
- OUTMEASURE= Data Set
- OUTSUM= Data Set
- OUTSEQUENCE= Data Set
- OUTPATH= Data Set
- _STATUS_ Variable Values
- Printed Output
- ODS Table Names
- ODS Graphics

Note: This procedure is experimental.

Copyright © 2008 by SAS Institute Inc., Cary, NC, USA. All rights reserved.