The SIMILARITY Procedure
PROC SIMILARITY Statement
The following options can be used in the PROC SIMILARITY statement.
DATA=SAS-data-set
names the SAS data set that contains the time series, transactional, or sequence input data for the procedure. If the DATA= option is not specified, the most recently created SAS data set is used.
ORDER=order-opt
specifies the order in which the variables listed in the INPUT and TARGET statements are processed. This ordering affects the OUTSEQUENCE=, OUTPATH=, OUTMEASURE=, and OUTSUM= data sets, as well as the printed and graphical output. The SORTNAMES option also affects the ordering of the analysis.
OUT=SAS-data-set
names the output data set to contain the time series variables specified in the subsequent INPUT and TARGET statements. If an ID variable is specified, it is also included in the OUT= data set. The values are accumulated based on the ID statement INTERVAL= option, the ACCUMULATE= option, or both. The values are then transformed based on the INPUT or TARGET statement TRANSFORM=, DIF=, and SDIF= options, applied in that order. The OUT= data set is particularly useful when you want to further analyze, model, or forecast the resulting time series with other SAS/ETS procedures.
OUTMEASURE=SAS-data-set
names the output data set to contain the detailed similarity measures by time ID value. The form of the OUTMEASURE= data set is determined by the PROC SIMILARITY statement SORTNAMES and ORDER= options.
OUTPATH=SAS-data-set
names the output data set to contain the path used to compute the similarity measures for each slide and warp. The form of the OUTPATH= data set is determined by the PROC SIMILARITY statement SORTNAMES and ORDER= options.
OUTSEQUENCE=SAS-data-set
names the output data set to contain the sequences used to compute the similarity measures for each slide and warp. The form of the OUTSEQUENCE= data set is determined by the PROC SIMILARITY statement SORTNAMES and ORDER= options.
OUTSUM=SAS-data-set
names the output data set to contain the similarity measure summary. The OUTSUM= data set is particularly useful when you analyze large numbers of series and need only the summary of the results. The form of the OUTSUM= data set is determined by the PROC SIMILARITY statement SORTNAMES and ORDER= options.
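As a rough illustration of how the data set options fit together, the following sketch reads one input data set and writes both the prepared series and the similarity measure summary. The data set SASHELP.WORKERS, its variables DATE, ELECTRIC, and MASONRY, and the output data set names are assumptions chosen for illustration only.

```sas
/* Sketch only: the input data set, variable names, and    */
/* output data set names are assumed for illustration.     */
proc similarity data=sashelp.workers   /* input series          */
                out=work.series        /* accumulated series    */
                outsum=work.summary;   /* measure summary       */
   id date interval=month accumulate=total;
   input electric;
   target masonry;
run;
```

The OUT= data set produced this way can then be passed to another SAS/ETS procedure, while the OUTSUM= data set holds one summary of the similarity measures for the requested input and target combinations.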
The following values can be specified for the ORDER= option:
INPUT
specifies that each INPUT variable is processed and then the TARGET variables are processed. The results are stored and printed based only on the INPUT variables.
INPUTTARGET
specifies that each INPUT variable is processed and then the TARGET variables are processed. The results are stored and printed based on both the INPUT and TARGET variables. This is the default.
TARGET
specifies that each TARGET variable is processed and then the INPUT variables are processed. The results are stored and printed based only on the TARGET variables.
TARGETINPUT
specifies that each TARGET variable is processed and then the INPUT variables are processed. The results are stored and printed based on both the TARGET and INPUT variables.
PLOTS=option | (options)
specifies the graphical output desired. The options are separated by spaces. By default, the SIMILARITY procedure produces no graphical output. The following graphical options are available:
ALL
same as PLOTS=(INPUTS TARGETS SEQUENCES NORMALIZED SCALED DISTANCES PATHS MAPS WARPS COSTS MEASURES).
COSTS
plots time warp costs graphics.
DISTANCES
plots similarity absolute and relative distances graphics. (OUTPATH= data set)
INPUTS
plots input variable time series graphics. (OUT= data set)
MAPS
plots time warp maps graphics. (OUTPATH= data set)
MEASURES
plots similarity measure graphics. (OUTMEASURE= data set)
NORMALIZED
plots both the input and target variable normalized sequence graphics. These plots are displayed only when the INPUT or TARGET statement NORMALIZE= option is specified.
PATHS
plots time warp paths graphics. (OUTPATH= data set)
SCALED
plots the input variable scaled sequence graphics. These plots are displayed only when the INPUT statement SCALE= option is specified.
SEQUENCES
plots both the input and target variable sequence graphics. (OUTSEQUENCE= data set)
TARGETS
plots target variable time series graphics. (OUT= data set)
WARPS
plots time warps graphics. (OUTPATH= data set)
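A minimal sketch of a graphics request follows. It assumes that ODS Graphics must be enabled for the plots to be produced, and it reuses the illustrative data set SASHELP.WORKERS with the variables DATE, ELECTRIC, and MASONRY, none of which come from this section.

```sas
/* Sketch only: data set and variable names are assumed.   */
ods graphics on;

proc similarity data=sashelp.workers
                plots=(inputs targets sequences);
   id date interval=month;
   input electric;
   target masonry;
run;

ods graphics off;
```

Because the options are separated by spaces inside the parentheses, adding or removing a plot type is a one-word change; PLOTS=ALL requests every plot type listed above.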
PRINT=option | (options)
specifies the printed output desired. The options are separated by spaces. By default, the SIMILARITY procedure produces no printed output. The following printing options are available:
DESCSTATS
prints the descriptive statistics for the working time series.
PATHS
prints the path statistics table.
COSTS
prints the cost statistics table.
WARPS
prints the warp summary table.
SLIDES
prints the slides summary table.
SUMMARY
prints the similarity measure summary table.
ALL
same as PRINT=(DESCSTATS PATHS COSTS WARPS SLIDES SUMMARY).
PRINTDETAILS
specifies that output requested with the PRINT= option be printed in greater detail.
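The printing options can be combined in the same way as the graphical options. The sketch below requests two tables and asks for the more detailed form of each through the PRINTDETAILS option; the data set and variable names are again assumptions for illustration.

```sas
/* Sketch only: data set and variable names are assumed.   */
proc similarity data=sashelp.workers
                print=(descstats summary)  /* two tables        */
                printdetails;              /* greater detail    */
   id date interval=month;
   input electric;
   target masonry;
run;
```

Without a PRINT= option the procedure prints nothing, so PRINTDETAILS on its own has no tables to expand.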
SEASONALITY=integer
specifies the length of the seasonal cycle, where integer ranges from 1 to 10,000. For example, SEASONALITY=3 means that every group of three time periods forms a seasonal cycle. By default, the length of the seasonal cycle is 1 (no seasonality) or the length implied by the INTERVAL= option specified in the ID statement. For example, INTERVAL=MONTH implies that the length of the seasonal cycle is 12.
SORTNAMES
specifies that the variables specified in the INPUT and TARGET statements are processed in order sorted by the variable names. By default, the SIMILARITY procedure processes the variables in the order they are listed. The ORDER= option also affects the ordering of the analysis.
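The following sketch shows one way SEASONALITY= and SORTNAMES might be combined. The data set WORK.MYDATA and the variables X1, X2, and Y are hypothetical. No ID statement is used, so the seasonal cycle length comes from SEASONALITY= rather than from an INTERVAL= option.

```sas
/* Sketch only: WORK.MYDATA, X1, X2, and Y are hypothetical. */
proc similarity data=work.mydata
                seasonality=4      /* cycle of four periods    */
                sortnames          /* process X1 before X2     */
                outsum=work.summary;
   input x2 x1;                    /* listed out of name order */
   target y;
run;
```

With SORTNAMES the input variables are processed in name order (X1, then X2) even though they are listed as X2 X1; without it, the listed order is used.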
Note: This procedure is experimental.
Copyright © 2008 by SAS Institute Inc., Cary, NC, USA. All rights reserved.