VAR
level (variables </ options>)<level (variables </ options>)…> ;
where the syntax for the options is as follows:
ABSENT=value MISSING=missmethod | value ORDER=orderoption STD=stdmethod WEIGHTS=weightlist
The VAR statement lists variables from which distances are to be computed. The VAR statement is required. The variables can
be numeric or character depending on their measurement levels. A variable cannot appear more than once in either the same
list or a different list.
level is required. It declares the levels of measurement for those variables specified within the parentheses. Available values for level are as follows:
 ANOMINAL

variables are asymmetric nominal and can be either numeric or character.
 NOMINAL

variables are symmetric nominal and can be either numeric or character.
 ORDINAL

variables are ordinal and can be either numeric or character. Values of ordinal variables are replaced by their corresponding
rank scores. If standardization is required, the standardized rank scores are output to the data set specified in the OUTSDZ=
option. See the RANKSCORE= option in the PROC DISTANCE statement for methods available for assigning rank scores to ordinal
variables. After being replaced by scores, ordinal variables are considered interval.
 INTERVAL

variables are interval and numeric.
 RATIO

variables are ratio and numeric. Ratio variables should always contain positive measurements.
Each variable list can be followed by an option list. Use “/ ” after the list of variables to start the option list. An option list contains options that are applied to the variables.
The following options are available in the option list:
 ABSENT=

specifies the value to be used as an absence value in an irrelevant absent-absent match for asymmetric nominal variables.
 MISSING=

specifies the method (or numeric value) with which to replace missing data.
 ORDER=

selects the order for assigning scores to ordinal variables.
 STD=

selects the standardization method.
 WEIGHTS=

assigns weights to the variables in the list.
If an option is missing from the current attribute list, PROC DISTANCE provides default values for all the variables in the
current list.
For example, in the VAR statement
var ratio(x1-x4 / std=mad weights=.5 .5 .1 .5 missing=-99)
    interval(x5 / std=range)
    ordinal(x6 / order=desc);
the first option list defines x1-x4 as ratio variables to be standardized by the MAD method. Also, any missing values in x1-x4 should be replaced by -99. x1 is given a weight of 0.5, x2 is given a weight of 0.5, x3 is given a weight of 0.1, and x4 is given a weight of 0.5.
The second option list defines x5 as an interval variable to be standardized by the RANGE method. If the REPLACE option is specified in the PROC DISTANCE statement, missing values in x5 are replaced by the location estimate from the RANGE method. By default, x5 is given a weight of 1.
The last option list defines x6 as an ordinal variable. The scores are assigned from highest to lowest by its unformatted values. Although the STD= option is not specified, x6 is standardized by the default method (STD) because there is more than one level of measurement (ratio, interval, and ordinal) in the VAR statement. Again, if the REPLACE option is specified, missing values in x6 are replaced by the location estimate from the STD method. Finally, by default, x6 is given a weight of 1.
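To try the VAR statement discussed above, it can be placed in a complete PROC DISTANCE step. The following is a minimal sketch, not a definitive program: the data set WORK.DEMO, its values, the ID variable subject, the output data set WORK.DIST, and the choice of METHOD=EUCLID are all assumptions made for illustration. REPLACE is included so that missing values in x5 and x6 are replaced as described.

```sas
/* Hypothetical data: six measurements on five subjects.          */
/* x1-x4 are positive (ratio), x5 is interval, x6 is ordinal.     */
data work.demo;
   input subject $ x1-x6;
   datalines;
A 1.2 3.4 5.0 2.2 10 1
B 2.5 .   4.1 3.0 12 3
C 0.9 2.8 6.3 1.7 .  2
D 3.1 4.0 5.5 2.9 15 3
E 1.8 3.3 4.9 2.4 11 1
;
run;

/* METHOD=EUCLID is an assumed choice; any method that accepts    */
/* ratio, interval, and ordinal variables could be used instead.  */
proc distance data=work.demo method=euclid replace out=work.dist;
   var ratio(x1-x4 / std=mad weights=.5 .5 .1 .5 missing=-99)
       interval(x5 / std=range)
       ordinal(x6 / order=desc);
   id subject;
run;

proc print data=work.dist;
run;
```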
More details for the options are explained as follows.

STD=stdmethod

specifies the standardization method. Valid values for stdmethod are MEAN, MEDIAN, SUM, EUCLEN, USTD, STD, RANGE, MIDRANGE, MAXABS, IQR, MAD, ABW, AHUBER, AWAVE, AGK, SPACING, and L. Table 34.9 lists available methods of standardization as well as their corresponding location and scale measures.
Table 34.9: Available Standardization Methods
Method       Scale                                    Location
MEAN         1                                        mean
MEDIAN       1                                        median
SUM          sum                                      0
EUCLEN       Euclidean length                         0
USTD         standard deviation about origin          0
STD          standard deviation                       mean
RANGE        range                                    minimum
MIDRANGE     range/2                                  midrange
MAXABS       maximum absolute value                   0
IQR          interquartile range                      median
MAD          median absolute deviation from median    median
ABW(c)       biweight A-estimate                      biweight 1-step M-estimate
AHUBER(c)    Huber A-estimate                         Huber 1-step M-estimate
AWAVE(c)     Wave A-estimate                          Wave 1-step M-estimate
AGK(p)       AGK estimate (ACECLUS)                   mean
SPACING(p)   minimum spacing                          mid-minimum spacing
L(p)

These standardization methods are further documented in the section on the METHOD= option in the PROC STDIZE statement of
the STDIZE procedure (see the section Standardization Methods in Chapter 87: The STDIZE Procedure).
Standardization is not required if there is only one level of measurement, or if only asymmetric nominal and nominal levels
are specified; otherwise, standardization is mandatory. When standardization is mandatory, a default method is provided when
the STD= option is not specified. You can suppress the mandatory standardization by using the NOSTD option in the PROC DISTANCE
statement. See the NOSTD option in the section PROC DISTANCE Statement and the section Mandatory Standardization for details.
The default method is STD for standardizing interval variables and MAXABS for standardizing ratio variables unless METHOD=GOWER
or METHOD=DGOWER is specified. If METHOD=GOWER is specified, interval variables are standardized by the RANGE method, and
whatever is specified in the STD= option is ignored; if METHOD=DGOWER is specified, the RANGE method is the default standardization
method for interval variables. The MAXABS method is the default standardization method for ratio variables for both the GOWER
and DGOWER methods.
Notice that a ratio variable should always be positive.
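The default behavior described above can be inspected by omitting STD= and writing the standardized values to an OUTSDZ= data set. The following is a minimal sketch under stated assumptions: the data set WORK.PLANTS, its variables and values, the output data set names, and METHOD=EUCLID are hypothetical. Because the VAR statement mixes interval and ratio levels, standardization is mandatory, so height should be standardized by STD and mass by MAXABS.

```sas
/* Hypothetical data: one interval and one positive ratio variable. */
data work.plants;
   input plant $ height mass;
   datalines;
P1 12.5 30
P2 15.0 42
P3  9.8 25
P4 14.1 39
;
run;

/* No STD= option is given, so the default methods apply.          */
/* OUTSDZ= saves the standardized data for inspection.             */
proc distance data=work.plants method=euclid
              out=work.dist outsdz=work.sdz;
   var interval(height) ratio(mass);
   id plant;
run;

proc print data=work.sdz;
run;
```

Adding the NOSTD option to the PROC DISTANCE statement in this sketch would suppress the mandatory standardization.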
Table 34.10 lists standardization methods and the levels of measurement that can be accepted by each method. For example, the SUM method
can be used to standardize ratio variables but not interval or ordinal variables. Also, the AGK and SPACING methods should
not be used to standardize ordinal variables. If you apply AGK and SPACING to ranks, the results are degenerate because all
the spacings of a given order are equal.
Table 34.10: Legitimate Levels of Measurements for Each Method
Standardization Method    Legitimate Levels of Measurement
MEAN                      ratio, interval, ordinal
MEDIAN                    ratio, interval, ordinal
SUM                       ratio
EUCLEN                    ratio
USTD                      ratio
STD                       ratio, interval, ordinal
RANGE                     ratio, interval, ordinal
MIDRANGE                  ratio, interval, ordinal
MAXABS                    ratio
IQR                       ratio, interval, ordinal
MAD                       ratio, interval, ordinal
ABW(c)                    ratio, interval, ordinal
AHUBER(c)                 ratio, interval, ordinal
AWAVE(c)                  ratio, interval, ordinal
AGK(p)                    ratio, interval
SPACING(p)                ratio, interval
L(p)                      ratio, interval, ordinal


ABSENT=number | qs

specifies the value to be used as an absence value in an irrelevant absent-absent match for asymmetric nominal variables.
The absence value specified here overrides the absence value specified in the ABSENT= option in the PROC DISTANCE statement
for those variables in the current variable list.
An absence value for a variable can be either a numeric value or a quoted string consisting of combinations of characters.
For instance, ., -999, and "NA" are legal values for the ABSENT= option.
The default absence value is "NONE" for a character variable (notice that a blank value is considered a missing value) and 0 for a numeric variable.
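As an illustration of a quoted-string absence value, the following sketch declares three character variables as asymmetric nominal and treats "NA" as absent. The data set WORK.SYMPTOMS, its variables and values, the output data set name, and the choice of METHOD=DJACCARD are assumptions made for this example only.

```sas
/* Hypothetical data: "NA" marks the absence of a symptom. */
data work.symptoms;
   input patient $ fever $ cough $ rash $;
   datalines;
P1 YES NA  NA
P2 YES YES NA
P3 NA  NA  YES
;
run;

/* ABSENT="NA" applies only to the variables in this list;    */
/* matches where both observations are "NA" are irrelevant.   */
proc distance data=work.symptoms method=djaccard out=work.dist;
   var anominal(fever cough rash / absent="NA");
   id patient;
run;

proc print data=work.dist;
run;
```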

MISSING=missmethod | value

specifies the method or a numeric value for replacing missing values. If you omit the MISSING= option, the REPLACE option replaces missing values with the location measure
given by the STD= option. Specify the MISSING= option when you want to replace missing values with a different value. You
can specify any method that is valid in the STD= option. The corresponding location measure is used to replace missing values.
If a numeric value is given, the value replaces missing values after standardizing the data. However, when standardization
is not mandatory, you can specify the REPONLY option with the MISSING= option to suppress standardization for cases in which
you want only to replace missing values.
If the NOSTD option is specified, there is no standardization, but missing values are replaced by the corresponding location
measures or by the numeric value of the MISSING= option. See the section Missing Values for details about replacing missing values with and without standardization.
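The REPONLY case mentioned above can be sketched as follows. Only interval variables are listed, so standardization is not mandatory, and REPONLY with MISSING=MEDIAN should replace each missing value with that variable's median without standardizing. The data set WORK.READINGS, its variables and values, the output data set name, and METHOD=EUCLID are hypothetical.

```sas
/* Hypothetical data with one missing value in each variable. */
data work.readings;
   input site $ r1-r3;
   datalines;
S1 4.0 7.5 1.2
S2 .   6.9 1.5
S3 5.2 .   1.1
S4 4.8 7.1 .
S5 5.0 7.3 1.4
;
run;

/* REPONLY: replace missing values only, do not standardize.  */
/* MISSING=MEDIAN: use the median as the replacement value.   */
proc distance data=work.readings method=euclid reponly out=work.dist;
   var interval(r1-r3 / missing=median);
   id site;
run;

proc print data=work.dist;
run;
```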

ORDER=ASCENDING | DESCENDING | ASCFORMATTED | DESFORMATTED | DSORDER
ORDER=ASC | DESC | ASCFMT | DESFMT | DATA

specifies the order for assigning scores to ordinal variables. The value for the ORDER= option can be one of the following:
 ASCENDING

scores are assigned in lowest-to-highest order of unformatted values.
 DESCENDING

scores are assigned in highest-to-lowest order of unformatted values.
 ASCFORMATTED

scores are assigned in ascending order by their formatted values. This option can be applied to character variables only,
since unformatted values are always used for numeric variables.
 DESFORMATTED

scores are assigned in descending order by their formatted values. This option can be applied to character variables only,
since unformatted values are always used for numeric variables.
 DSORDER

scores are assigned according to the order of their appearance in the input data set.
The default value is ASCENDING.
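Ordering by appearance is useful when the alphabetical order of a character variable does not match its intended ranking. In the following sketch, the levels small, medium, and large first appear in that order in the input data, so ORDER=DATA should score them in that sequence, whereas the default ASCENDING order would sort them alphabetically. The data set WORK.SHIRTS, its variables and values, the output data set name, and METHOD=EUCLID are assumptions for illustration.

```sas
/* Hypothetical data: the first appearance of each level follows */
/* the intended ranking small < medium < large.                  */
data work.shirts;
   input item $ size $;
   datalines;
S1 small
S2 medium
S3 large
S4 medium
S5 small
;
run;

/* ORDER=DATA assigns scores by order of appearance in the input. */
proc distance data=work.shirts method=euclid out=work.dist;
   var ordinal(size / order=data);
   id item;
run;

proc print data=work.dist;
run;
```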

WEIGHTS=weightlist

specifies a list of values for weighting individual variables while computing the proximity. Values in this list can be separated by blanks or commas. You can include one or more items
of the form start TO stop BY increment. This list should contain at least one weight. The maximum number of weights you can list is equal to the number of variables.
If the number of weights is less than the number of variables, the last value in the weightlist is used for the rest of the variables; conversely, if the number of weights is greater than the number of variables, the
trailing weights are discarded.
The default value is 1.
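The following sketch shows a weight list that is shorter than the variable list and that uses the start TO stop BY increment form. The data set WORK.SCORES, its variables and values, the output data set name, and METHOD=CITYBLOCK are hypothetical. The list 1 TO 2 BY 0.5 expands to 1, 1.5, and 2; because four variables are listed, the last weight (2) should also be used for t4, according to the rule stated above.

```sas
/* Hypothetical data: four test scores for three students. */
data work.scores;
   input student $ t1-t4;
   datalines;
Ann 78 85 90 88
Bob 65 70 72 80
Cy  92 88 95 91
;
run;

/* Three weights for four variables: t1=1, t2=1.5, t3=2,     */
/* and t4 reuses the last weight in the list.                */
proc distance data=work.scores method=cityblock out=work.dist;
   var interval(t1-t4 / weights=1 to 2 by 0.5);
   id student;
run;

proc print data=work.dist;
run;
```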