The STDIZE Procedure

WEIGHT Statement

  • WEIGHT variable;

The WEIGHT statement specifies a numeric variable in the input data set with values that are used to weight each observation. Only one variable can be specified.

The WEIGHT variable values can be nonintegers. An observation is used in the analysis only if the value of the WEIGHT variable is greater than zero.

The WEIGHT variable applies only when you specify one of the following standardization methods: AGK, EUCLEN, IQR, L, MAD, MEAN, MEDIAN, STD, SUM, or USTD. For METHOD=MAD, MEDIAN, or IQR, weights are used only when PCTLMTD=ORD_STAT is specified; if PCTLMTD=ONEPASS is specified, the WEIGHT statement is ignored.
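The rule in the preceding paragraph can be restated as a small lookup. The following Python sketch is illustrative only and is not part of SAS; the helper name `weight_is_used` is hypothetical, and stripping a parenthesized argument such as `L(2)` down to `L` is an assumption about how a caller might pass the method name.

```python
# Methods for which the WEIGHT variable applies, as listed in the text.
WEIGHTED_METHODS = {
    "AGK", "EUCLEN", "IQR", "L", "MAD",
    "MEAN", "MEDIAN", "STD", "SUM", "USTD",
}
# Percentile-based methods that additionally require PCTLMTD=ORD_STAT.
PERCENTILE_METHODS = {"MAD", "MEDIAN", "IQR"}


def weight_is_used(method: str, pctlmtd: str) -> bool:
    """Return True if the WEIGHT variable would affect the method (hypothetical helper)."""
    # Assumption: drop any parenthesized argument, e.g. "L(2)" -> "L".
    name = method.upper().split("(")[0].strip()
    if name not in WEIGHTED_METHODS:
        return False
    if name in PERCENTILE_METHODS:
        # Weights enter the percentile computation only with ORD_STAT.
        return pctlmtd.upper() == "ORD_STAT"
    return True
```

For example, `weight_is_used("MEDIAN", "ONEPASS")` is false, while `weight_is_used("STD", "ONEPASS")` is true because STD does not depend on the percentile method.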

PROC STDIZE uses the value of the WEIGHT variable to calculate the sample mean and sample variances:

$ \overline{x}_{w} = \sum_{i} w_{i} x_{i} / \sum_{i} w_{i} $    (sample mean)

$ us_{w}^{2} = \sum_{i} w_{i} x_{i}^{2} / d $    (uncorrected sample variances)

$ s_{w}^{2} = \sum_{i} w_{i} (x_{i} - \overline{x}_{w})^{2} / d $    (sample variances)

where $w_{i}$ is the weight value of the $i$th observation, $x_{i}$ is the value of the $i$th observation, and $d$ is the divisor controlled by the VARDEF= option (see the VARDEF= option for details).
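As a worked illustration of these three formulas, the following Python sketch computes them for a small made-up data set. It is not SAS code and does not reproduce PROC STDIZE internals. The function name `weighted_moments` is hypothetical, and the divisor `d` is passed in by the caller rather than derived from VARDEF=; the value `d=2` used below is an assumed choice (the number of used observations minus one).

```python
from math import fsum


def weighted_moments(x, w, d):
    """Return (weighted mean, uncorrected variance, variance) for divisor d."""
    # Only observations with a weight greater than zero are used.
    pairs = [(xi, wi) for xi, wi in zip(x, w) if wi > 0]
    sum_w = fsum(wi for _, wi in pairs)
    mean = fsum(wi * xi for xi, wi in pairs) / sum_w
    # Uncorrected: weighted sum of squares about zero, divided by d.
    us2 = fsum(wi * xi * xi for xi, wi in pairs) / d
    # Corrected: weighted sum of squares about the weighted mean, divided by d.
    s2 = fsum(wi * (xi - mean) ** 2 for xi, wi in pairs) / d
    return mean, us2, s2


x = [1.0, 2.0, 4.0, 10.0]
w = [1.0, 2.0, 1.0, 0.0]  # the last observation has weight 0 and is excluded
mean, us2, s2 = weighted_moments(x, w, d=2)
print(mean, us2, s2)  # 2.25 12.5 2.375
```

Here the weighted mean is (1 + 4 + 4) / 4 = 2.25, the uncorrected variance is (1 + 8 + 16) / 2 = 12.5, and the variance is (1.5625 + 0.125 + 3.0625) / 2 = 2.375.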

The following weighted statistics are defined accordingly:

MEAN

the weighted mean, $\overline{x}_{w}$

SUM

the weighted sum, $\sum _{i}{w_{i}{x_{i}}}$

USTD

the weighted uncorrected standard deviation, $\sqrt{us_{w}^{2}}$

STD

the weighted standard deviation, $\sqrt{s_{w}^{2}}$

EUCLEN

the weighted Euclidean length, computed as the square root of the weighted uncorrected sum of squares:

\[ \sqrt{\sum_{i} w_{i} x_{i}^{2}} \]

MEDIAN

the weighted median. See the section Weighted Percentiles for the formulas and descriptions.

MAD

the weighted median absolute deviation from the weighted median. See the section Weighted Percentiles for the formulas and descriptions.

IQR

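the weighted median, the weighted 25th percentile, and the weighted 75th percentile. The interquartile range is the difference between the 75th and 25th percentiles. See the section Weighted Percentiles for the formulas and descriptions.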

AGK

the AGK estimate. This estimate is documented further in the ACECLUS procedure as the METHOD=COUNT option. See the discussion of the WEIGHT statement in Chapter 24: The ACECLUS Procedure, for information about how the WEIGHT variable is applied to the AGK estimate.

L

the $L_{p}$ estimate. This estimate is documented further in the FASTCLUS procedure as the LEAST= option. See the discussion of the WEIGHT statement in Chapter 38: The FASTCLUS Procedure, for information about how the WEIGHT variable is used to compute weighted cluster means. In PROC STDIZE, the number of clusters is always 1.
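For the moment-based entries in the list above (MEAN, SUM, USTD, STD, and EUCLEN), the definitions can be sketched together in Python. This is an illustration under stated assumptions, not the SAS implementation: the function name `weighted_stats` is hypothetical, the divisor `d` is supplied by the caller instead of being derived from VARDEF=, and the percentile-based and cluster-based entries (MEDIAN, MAD, IQR, AGK, L) are deliberately left out because their formulas are given in other sections.

```python
from math import fsum, sqrt


def weighted_stats(x, w, d):
    """Return the moment-based weighted statistics as a dict (hypothetical helper)."""
    # Only observations with a weight greater than zero are used.
    pairs = [(xi, wi) for xi, wi in zip(x, w) if wi > 0]
    sum_w = fsum(wi for _, wi in pairs)
    wsum = fsum(wi * xi for xi, wi in pairs)       # weighted sum
    mean = wsum / sum_w                            # weighted mean
    uss = fsum(wi * xi * xi for xi, wi in pairs)   # weighted uncorrected sum of squares
    css = fsum(wi * (xi - mean) ** 2 for xi, wi in pairs)  # corrected sum of squares
    return {
        "MEAN": mean,
        "SUM": wsum,
        "USTD": sqrt(uss / d),
        "STD": sqrt(css / d),
        "EUCLEN": sqrt(uss),
    }


stats = weighted_stats([1.0, 2.0, 4.0], [1.0, 2.0, 1.0], d=2)
```

Because EUCLEN and USTD share the same weighted uncorrected sum of squares, the two differ only by the divisor: EUCLEN equals $\sqrt{d}$ times USTD. In this example the weighted uncorrected sum of squares is 25, so EUCLEN is 5 and USTD is $\sqrt{12.5}$.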