The SIMILARITY Procedure

INPUT Statement

INPUT variable-list < / options> ;

The INPUT statement lists the numeric input variables in the DATA= data set. Their values are accumulated to form the time series or, when no ID statement is specified, they represent ordered numeric sequences.

An input data set variable can be specified in only one INPUT or TARGET statement. Any number of INPUT statements can be used. The following options can be used with an INPUT statement:

ACCUMULATE=option

specifies how the data set observations are accumulated within each time period for the variables listed in the INPUT statement. If the ACCUMULATE= option is not specified in the INPUT statement, accumulation is determined by the ACCUMULATE= option of the ID statement. If the ACCUMULATE= option is not specified in the ID statement or the INPUT statement, no accumulation is performed. See the ID statement ACCUMULATE= option for more details.
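As a minimal sketch of how several INPUT statements can carry different options, the following step assumes a hypothetical data set WORK.SALES that contains a date variable DATE and numeric variables STORE_A, STORE_B, and STORE_C. The TARGET statement, the INTERVAL= option, and the OUTSUM= data set belong to other parts of the procedure and appear here only to make the step complete.

```sas
/* Hypothetical data: WORK.SALES with DATE, STORE_A, STORE_B, STORE_C */
proc similarity data=work.sales outsum=work.simsum;
   /* Default accumulation for variables that do not override it */
   id date interval=month accumulate=total;
   /* STORE_A overrides the ID statement: values are averaged within each month */
   input store_a / accumulate=average;
   /* STORE_B has no ACCUMULATE= option, so ACCUMULATE=TOTAL from the ID statement applies */
   input store_b;
   /* Each variable appears in only one INPUT or TARGET statement */
   target store_c;
run;
```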

DIF=(numlist)

specifies the differencing to be applied to the accumulated time series. The list of differencing orders must be separated by spaces or commas. For example, DIF=(1,3) specifies first-order differencing followed by third-order differencing. Differencing is applied after the time series transformation; that is, the TRANSFORM= option is applied before the DIF= option. Simple differencing is useful when you want to detrend the time series before computing the similarity measures.

NORMALIZE=option

specifies the sequence normalization to be applied to the working input sequence. The following normalization options are provided:

NONE

No normalization is applied. This option is the default.

ABSOLUTE

Absolute normalization is applied.

STANDARD

Standard normalization is applied.

User-Defined

Normalization is computed by a user-defined subroutine that is created using the FCMP procedure, where User-Defined is the subroutine name.

Normalization is applied to the working input sequence, which can be a subset of the working input time series if the SLIDE=INDEX or SLIDE=SEASON option is specified.

SCALE=option

specifies the scaling of the working input sequence with respect to the working target sequence. Scaling is performed after normalization. The following scaling options are provided:

NONE

No scaling is applied. This option is the default.

ABSOLUTE

Absolute scaling is applied.

STANDARD

Standard scaling is applied.

User-Defined

Scaling is computed by a user-defined subroutine that is created using the FCMP procedure, where User-Defined is the subroutine name.

Scaling is applied to the working input sequence, which can be a subset of the working input time series if the SLIDE=INDEX or SLIDE=SEASON option is specified.
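The following sketch shows one way the NORMALIZE= and SCALE= options might be combined on a single INPUT statement. It assumes a hypothetical data set WORK.SALES with variables DATE, STORE_A, and STORE_C; the NORMALIZE= option on the TARGET statement is likewise an assumption about a statement that this section does not describe.

```sas
/* Hypothetical data: WORK.SALES with DATE, STORE_A, STORE_C */
proc similarity data=work.sales outsum=work.simsum;
   id date interval=month accumulate=total;
   /* The input sequence is normalized first, then scaled
      with respect to the working target sequence */
   input store_a / normalize=standard scale=standard;
   target store_c / normalize=standard;
run;
```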

SDIF=(numlist)

specifies the seasonal differencing to be applied to the accumulated time series. The list of seasonal differencing orders must be separated by spaces or commas. For example, SDIF=(1,3) specifies first-order seasonal differencing followed by third-order seasonal differencing. Seasonal differencing is applied after the time series transformation; that is, the TRANSFORM= option is applied before the SDIF= option. Seasonal differencing is useful when you want to deseasonalize the time series before computing the similarity measures.
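As an illustration of combining transformation with simple and seasonal differencing, the following sketch assumes that the SASHELP.WORKERS sample data set is available and contains the monthly variables DATE, ELECTRIC, and MASONRY. Applying the same options on the TARGET statement is an assumption made so that both series are compared on the same footing.

```sas
/* Assumes SASHELP.WORKERS (monthly DATE, ELECTRIC, MASONRY) is installed */
proc similarity data=sashelp.workers outsum=work.simsum;
   id date interval=month accumulate=total;
   /* The log transformation is applied first; differencing follows.
      DIF=(1) detrends and SDIF=(1) deseasonalize the series. */
   input electric / transform=log dif=(1) sdif=(1);
   target masonry / transform=log dif=(1) sdif=(1);
run;
```

Both series are strictly positive counts, which the LOG transformation requires.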

SETMISSING=option | number
SETMISS=option | number

specifies how missing values (either actual or accumulated) are interpreted in the accumulated time series or ordered sequence for variables listed in the INPUT statement. If the SETMISSING= option is not specified in the INPUT statement, missing values are set based on the SETMISSING= option in the ID statement. If the SETMISSING= option is not specified in the ID statement or the INPUT statement, no missing value interpretation is performed. See the ID statement SETMISSING= option for more details.

TRANSFORM=option

specifies the time series transformation to be applied to the accumulated time series. The following transformations are provided:

NONE

No transformation is applied. This option is the default.

LOG

Logarithmic transformation is applied.

SQRT

Square-root transformation is applied.

LOGISTIC

Logistic transformation is applied.

BOXCOX(number)

Box-Cox transformation with parameter number is applied, where number is a real number between –5 and 5.

User-Defined

Transformation is computed by a user-defined subroutine that is created using the FCMP procedure, where User-Defined is the subroutine name.

When the TRANSFORM= option is specified, the time series must be strictly positive unless a user-defined function is used.

TRIMMISSING=option
TRIMMISS=option

specifies how missing values (either actual or accumulated) are trimmed from the accumulated time series or ordered sequence for variables that are listed in the INPUT statement. The following trimming options are provided:

NONE

No missing value trimming is applied.

LEFT

Beginning missing values are trimmed.

RIGHT

Ending missing values are trimmed.

BOTH

Both beginning and ending missing values are trimmed. This option is the default.

ZEROMISS=option

specifies how beginning and ending zero values (either actual or accumulated) are interpreted in the accumulated time series or ordered sequence for variables listed in the INPUT statement. If the ZEROMISS= option is not specified in the INPUT statement, beginning and ending zero values are set based on the ZEROMISS= option of the ID statement. If the ZEROMISS= option is not specified in the ID statement or the INPUT statement, no zero value interpretation is performed. See the ID statement ZEROMISS= option for more details.
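The missing-value and zero-value options can differ from one INPUT statement to the next. The sketch below assumes a hypothetical data set WORK.SALES with variables DATE, STORE_A, STORE_B, and STORE_C. The value ZEROMISS=BOTH is taken from the ID statement's option values, which this section does not list, so treat it as an assumption.

```sas
/* Hypothetical data: WORK.SALES with DATE, STORE_A, STORE_B, STORE_C */
proc similarity data=work.sales outsum=work.simsum;
   id date interval=month accumulate=total;
   /* Missing values in STORE_A are set to 0; only beginning
      missing values are trimmed */
   input store_a / setmissing=0 trimmissing=left;
   /* Beginning and ending zeros in STORE_B are interpreted according to
      ZEROMISS=BOTH (assumed value from the ID statement documentation) */
   input store_b / zeromiss=both trimmissing=both;
   target store_c;
run;
```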