XSCHART Statement: SHEWHART Procedure

Methods for Estimating the Standard Deviation

When control limits are determined from the input data, four methods (referred to as default, MVLUE, MVGRANGE, and RMSDF) are available for estimating $\sigma $.

Default Method

The default estimate for $\sigma $ is

\[ \hat{\sigma } = \frac{s_{1}/c_{4}(n_{1})+ \cdots + s_{N}/c_{4}(n_{N})}{N} \]

where N is the number of subgroups for which $n_{i} \geq 2$, $s_{i}$ is the sample standard deviation of the ith subgroup

\[ s_{i} = \sqrt { \frac{1}{n_{i} - 1} \sum ^{n_ i}_{j=1}(x_{ij}-\bar{X}_{i})^{2}} \]

and

\[ c_{4}(n_{i}) = \frac{\Gamma (n_{i}/2)\sqrt {2/(n_{i}-1)}}{\Gamma ((n_{i}-1)/2)} \]

Here, $\Gamma (\cdot )$ denotes the gamma function, and $\bar{X}_{i}$ denotes the ith subgroup mean. A subgroup standard deviation $s_{i}$ is included in the calculation only if $n_{i} \geq 2$. If the observations are normally distributed, then the expected value of $s_{i}$ is $c_{4}(n_{i})\sigma $, so each ratio $s_{i}/c_{4}(n_{i})$ is an unbiased estimate of $\sigma $. Thus, $\hat{\sigma }$ is the unweighted average of N unbiased estimates of $\sigma $. This method is described by the American Society for Testing and Materials (1976).
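The default estimate can be illustrated outside SAS with a short Python sketch. It is not the procedure's implementation. The function names `c4` and `sigma_default` and the sample data are assumptions made for illustration, and subgroups are assumed to be plain lists of numbers.

```python
import math
import statistics

def c4(n):
    """Unbiasing factor c4(n) = Gamma(n/2) * sqrt(2/(n-1)) / Gamma((n-1)/2)."""
    # lgamma avoids overflow in the gamma ratio for large n
    return math.exp(math.lgamma(n / 2) - math.lgamma((n - 1) / 2)) * math.sqrt(2 / (n - 1))

def sigma_default(subgroups):
    """Unweighted average of s_i / c4(n_i) over subgroups with n_i >= 2."""
    estimates = [statistics.stdev(g) / c4(len(g)) for g in subgroups if len(g) >= 2]
    return sum(estimates) / len(estimates)

# Hypothetical measurements; the single-observation subgroup is skipped
subgroups = [
    [10.1, 9.8, 10.3, 10.0],
    [9.9, 10.2, 10.4],
    [10.0],
    [10.5, 9.7, 10.1, 10.2, 9.9],
]
sigma_hat = sigma_default(subgroups)
```

The sketch raises a division error if no subgroup has at least two observations, which mirrors the fact that the estimate is undefined in that case.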

MVLUE Method

If you specify SMETHOD=MVLUE, a minimum variance linear unbiased estimate (MVLUE) is computed for $\sigma $. See Burr (1969, 1976) and Nelson (1989, 1994). This estimate is a weighted average of N unbiased estimates of $\sigma $ of the form $s_{i}/c_{4}(n_{i})$, and it is computed as

\[ \hat{\sigma } = \frac{h_{1}s_{1}/c_{4}(n_{1})+ \cdots + h_{N}s_{N}/c_{4}(n_{N})}{h_1 + \cdots + h_ N} \]

where

\[ h_ i = \frac{[c_4(n_ i)]^{2}}{1 - [c_4(n_ i)]^{2}} \]

A subgroup standard deviation $s_ i$ is included in the calculation only if $n_ i \geq 2$, and N is the number of subgroups for which $n_{i} \geq 2$. The MVLUE assigns greater weight to estimates of $\sigma $ from subgroups with larger sample sizes, and it is intended for situations where the subgroup sample sizes vary. If the subgroup sample sizes are constant, the MVLUE reduces to the default estimate.
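The weighting can be checked numerically with a short Python sketch. It is an illustration of the formulas above, not the procedure's implementation. The names `c4`, `mvlue_weight`, and `sigma_mvlue` are assumptions, and subgroups are assumed to be plain lists of numbers.

```python
import math
import statistics

def c4(n):
    """Unbiasing factor c4(n) = Gamma(n/2) * sqrt(2/(n-1)) / Gamma((n-1)/2)."""
    return math.exp(math.lgamma(n / 2) - math.lgamma((n - 1) / 2)) * math.sqrt(2 / (n - 1))

def mvlue_weight(n):
    """Weight h = c4(n)^2 / (1 - c4(n)^2); it grows with the subgroup size n."""
    c2 = c4(n) ** 2
    return c2 / (1 - c2)

def sigma_mvlue(subgroups):
    """Weighted average of s_i / c4(n_i) with weights h_i, over subgroups with n_i >= 2."""
    num = 0.0
    den = 0.0
    for g in subgroups:
        n = len(g)
        if n < 2:
            continue  # a single observation carries no information about sigma
        h = mvlue_weight(n)
        num += h * statistics.stdev(g) / c4(n)
        den += h
    return num / den
```

When every subgroup has the same size, the weights cancel and `sigma_mvlue` returns the unweighted average of the $s_{i}/c_{4}(n_{i})$ values.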

MVGRANGE Method

If you specify SMETHOD=MVGRANGE, $\sigma $ is estimated using a moving range of subgroup averages. This is appropriate for constructing control charts for means when the jth measurement in the ith subgroup can be modeled as $x_{ij} = \sigma _{B}\omega _{i} + \sigma _{W}\epsilon _{ij}$, where $\sigma _{B}^{2}$ is the between-subgroup variance, $\sigma _{W}^{2}$ is the within-subgroup variance, the $\omega _{i}$ are independent with zero mean and unit variance, and the $\omega _{i}$ are independent of the $\epsilon _{ij}$.

The estimate for $\sigma $ is

\[ \hat{\sigma } = \bar{R}/d_{2}(n) \]

where $\bar{R}$ is the average of the moving ranges, n is the number of consecutive subgroup averages used to compute each moving range, and the unbiasing factor $d_{2}(n)$ is defined so that if the subgroup averages are normally distributed, the expected value of a moving range $R_{i}$ is

\[ E(R_{i}) = d_{2}(n)\sigma \]
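The moving-range estimate can be sketched in Python as follows. This is an illustration under stated assumptions, not the procedure's implementation. The name `sigma_mvgrange`, the `span` parameter and its value of 2, and the small lookup table `D2` are all hypothetical. The table holds the commonly tabulated $d_{2}$ constants for spans 2 through 5, rounded to three decimals. The sketch also assumes that every nonempty subgroup contributes its mean.

```python
import statistics

# Commonly tabulated d2 constants, rounded to three decimals (assumed table)
D2 = {2: 1.128, 3: 1.693, 4: 2.059, 5: 2.326}

def sigma_mvgrange(subgroups, span=2):
    """Average moving range of the subgroup means, divided by d2(span)."""
    means = [statistics.fmean(g) for g in subgroups]
    # One moving range per window of `span` consecutive subgroup means
    ranges = [
        max(means[i:i + span]) - min(means[i:i + span])
        for i in range(len(means) - span + 1)
    ]
    return statistics.fmean(ranges) / D2[span]

# Hypothetical data with subgroup means 10, 12, 11, 14
data = [[9.0, 11.0], [12.0, 12.0], [10.0, 12.0], [14.0, 14.0]]
sigma_hat = sigma_mvgrange(data)  # moving ranges are 2, 1, 3
```

Because only the subgroup means enter the calculation, the estimate reflects between-subgroup variation as well as within-subgroup variation.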

This method is appropriate for constructing a variation on the three-way control chart that Wheeler (1995) advocates for this situation. A three-way control chart is useful when sampling, or within-group, variation is not the only source of variation, as discussed in Multiple Components of Variation. Wheeler’s three-way control chart comprises a chart of subgroup means, a moving range chart of the subgroup means, and a chart of subgroup ranges. The variation described here substitutes a chart of subgroup standard deviations for the chart of subgroup ranges. When you specify the SMETHOD=MVGRANGE option, the XSCHART statement produces the appropriate charts of subgroup means and subgroup standard deviations.

RMSDF Method

If you specify SMETHOD=RMSDF, a weighted root-mean-square estimate is computed for $\sigma $:

\[ \hat{\sigma } = \frac{\sqrt {(n_{1} - 1)s_1^{2} + \cdots + (n_{N} - 1)s_{N}^{2}}}{c_{4}(n)\sqrt {n_{1} + \cdots + n_{N} - N}} \]

where $n = n_1 + \cdots + n_ N - (N - 1)$. The weights are the degrees of freedom $n_{i} - 1$. A subgroup standard deviation $s_{i}$ is included in the calculation only if $n_{i} \geq 2$, and N is the number of subgroups for which $n_{i} \geq 2$.
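The root-mean-square estimate can be sketched in Python as follows. This is an illustration of the formula above, not the procedure's implementation. The names `c4` and `sigma_rmsdf` are assumptions, and subgroups are assumed to be plain lists of numbers.

```python
import math
import statistics

def c4(n):
    """Unbiasing factor c4(n) = Gamma(n/2) * sqrt(2/(n-1)) / Gamma((n-1)/2)."""
    return math.exp(math.lgamma(n / 2) - math.lgamma((n - 1) / 2)) * math.sqrt(2 / (n - 1))

def sigma_rmsdf(subgroups):
    """Pooled standard deviation weighted by degrees of freedom, divided by c4(n)."""
    used = [g for g in subgroups if len(g) >= 2]
    df = sum(len(g) - 1 for g in used)  # n_1 + ... + n_N - N
    ss = sum((len(g) - 1) * statistics.variance(g) for g in used)
    n = df + 1  # n_1 + ... + n_N - (N - 1)
    return math.sqrt(ss / df) / c4(n)
```

With a single subgroup, the pooled value is that subgroup's $s_{1}$ and $n = n_{1}$, so the result is $s_{1}/c_{4}(n_{1})$.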

If the unknown standard deviation $\sigma $ is constant across subgroups, the root-mean-square estimate is more efficient than the minimum variance linear unbiased estimate. However, in process control applications it is generally not assumed that $\sigma $ is constant, and if $\sigma $ varies across subgroups, the root-mean-square estimate tends to be more inflated than the MVLUE.