BOXCHART Statement: SHEWHART Procedure

Methods for Estimating the Standard Deviation

When control limits are computed from the input data, three methods (referred to as default, MVLUE, and RMSDF) are available for estimating the process standard deviation $\sigma$. The statistic on which the estimate is based depends on whether you specify the RANGES option. If you specify this option, $\sigma$ is estimated using subgroup ranges; otherwise, $\sigma$ is estimated using subgroup standard deviations.

Default Method Based on Subgroup Standard Deviations

If you do not specify the RANGES option, the default estimate for $\sigma $ is

\[ \hat{\sigma } = \frac{s_{1}/c_{4}(n_{1})+ \cdots + s_{N}/c_{4}(n_{N})}{N} \]

where N is the number of subgroups for which $n_{i} \geq 2$, $s_{i}$ is the sample standard deviation of the ith subgroup

\[ s_{i} = \sqrt { \frac{1}{n_{i} - 1} \sum ^{n_ i}_{j=1}(x_{ij}-\bar{X}_{i})^{2}} \]

and

\[ c_{4}(n_{i}) = \frac{\Gamma (n_{i}/2)\sqrt {2/(n_{i}-1)}}{\Gamma ((n_{i}-1)/2)} \]

Here $\Gamma(\cdot)$ denotes the gamma function, and $\bar{X}_{i}$ denotes the ith subgroup mean. A subgroup standard deviation $s_{i}$ is included in the calculation only if $n_{i} \geq 2$. If the observations are normally distributed, the expected value of $s_{i}$ is $c_{4}(n_{i})\sigma$, so each ratio $s_{i}/c_{4}(n_{i})$ is an unbiased estimate of $\sigma$. Thus, $\hat{\sigma}$ is the unweighted average of N unbiased estimates of $\sigma$. This method is described by the American Society for Testing and Materials (1976).
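The default estimate above can be sketched in a few lines of Python. This is only an illustration of the formula, not the procedure's own implementation; the function names `c4` and `sigma_default_stddev` and the list-of-lists input layout are assumptions made for the example.

```python
import math
from statistics import stdev


def c4(n: int) -> float:
    """Unbiasing factor c4(n) = Gamma(n/2) * sqrt(2/(n-1)) / Gamma((n-1)/2)."""
    # lgamma avoids overflow of the gamma function for large n.
    ratio = math.exp(math.lgamma(n / 2) - math.lgamma((n - 1) / 2))
    return ratio * math.sqrt(2 / (n - 1))


def sigma_default_stddev(subgroups):
    """Unweighted average of s_i / c4(n_i) over subgroups with n_i >= 2."""
    terms = [stdev(g) / c4(len(g)) for g in subgroups if len(g) >= 2]
    if not terms:
        raise ValueError("no subgroup has at least two observations")
    return sum(terms) / len(terms)


if __name__ == "__main__":
    # Hypothetical data: the single-observation subgroup is skipped.
    data = [[1, 2, 3], [2, 4, 6], [5]]
    print(sigma_default_stddev(data))
```

Subgroups with fewer than two observations are dropped before averaging, which mirrors the rule that N counts only subgroups with $n_{i} \geq 2$.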

Default Method Based on Subgroup Ranges

If you specify the RANGES option, the default estimate for $\sigma $ is

\[ \hat{\sigma } = \frac{R_{1}/d_{2}(n_{1})+ \cdots + R_{N}/d_{2}(n_{N})}{N} \]

where N is the number of subgroups for which $n_{i} \geq 2$, and $R_{i}$ is the sample range of the observations $x_{i1}, \ldots, x_{in_{i}}$ in the ith subgroup:

\[ R_{i} = \max _{1 \leq j \leq n_{i}}(x_{ij}) - \min _{1 \leq j \leq n_{i}}(x_{ij}) \]

A subgroup range $R_{i}$ is included in the calculation only if $n_{i} \geq 2$. The unbiasing factor $d_{2}(n_{i})$ is defined so that, if the observations are normally distributed, the expected value of $R_{i}$ is $d_{2}(n_{i})\sigma$, so each ratio $R_{i}/d_{2}(n_{i})$ is an unbiased estimate of $\sigma$. Thus, $\hat{\sigma}$ is the unweighted average of N unbiased estimates of $\sigma$. This method is described by the American Society for Testing and Materials (1976).
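As a rough illustration of the range-based default estimate, the sketch below computes $d_{2}(n)$ by numerically integrating the standard identity $d_{2}(n) = \int_{-\infty}^{\infty} [1 - \Phi(x)^{n} - (1 - \Phi(x))^{n}]\,dx$ for normal samples, then averages $R_{i}/d_{2}(n_{i})$. The function names, the integration bounds, and the step count are assumptions chosen for the example; a table of published $d_{2}$ constants would serve equally well.

```python
import math
from functools import lru_cache


@lru_cache(maxsize=None)
def d2(n: int, half_width: float = 8.0, steps: int = 4000) -> float:
    """Expected range of n standard normal observations (trapezoid rule)."""
    h = 2 * half_width / steps
    total = 0.0
    for k in range(steps + 1):
        x = -half_width + k * h
        cdf = 0.5 * math.erfc(-x / math.sqrt(2))  # Phi(x)
        sf = 0.5 * math.erfc(x / math.sqrt(2))    # 1 - Phi(x)
        weight = 0.5 if k in (0, steps) else 1.0  # trapezoid end points
        total += weight * (1.0 - cdf ** n - sf ** n)
    return total * h


def sigma_default_ranges(subgroups):
    """Unweighted average of R_i / d2(n_i) over subgroups with n_i >= 2."""
    terms = [(max(g) - min(g)) / d2(len(g)) for g in subgroups if len(g) >= 2]
    if not terms:
        raise ValueError("no subgroup has at least two observations")
    return sum(terms) / len(terms)


if __name__ == "__main__":
    # Hypothetical data: the single-observation subgroup is skipped.
    print(sigma_default_ranges([[1, 2, 3], [10, 14], [7]]))
```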

MVLUE Method Based on Subgroup Standard Deviations

If you do not specify the RANGES option and you specify SMETHOD=MVLUE, a minimum variance linear unbiased estimate (MVLUE) is computed for $\sigma$. Refer to Burr (1969, 1976) and Nelson (1989, 1994). This estimate is a weighted average of N unbiased estimates of $\sigma$ of the form $s_{i}/c_{4}(n_{i})$, and it is computed as

\[ \hat{\sigma } = \frac{h_{1}s_{1}/c_{4}(n_{1})+ \cdots + h_{N}s_{N}/c_{4}(n_{N})}{h_1 + \cdots + h_ N} \]

where

\[ h_ i = \frac{[c_4(n_ i)]^{2}}{1 - [c_4(n_ i)]^{2}} \]

A subgroup standard deviation $s_ i$ is included in the calculation only if $n_ i \geq 2$, and N is the number of subgroups for which $n_{i} \geq 2$. The MVLUE assigns greater weight to estimates of $\sigma $ from subgroups with larger sample sizes, and it is intended for situations where the subgroup sample sizes vary. If the subgroup sample sizes are constant, the MVLUE reduces to the default estimate.
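The weighting described above can be sketched as follows. This is an illustrative reading of the formulas, not the procedure's own code; `c4` and `sigma_mvlue_stddev` are hypothetical names, and subgroups are assumed to be plain lists of numbers.

```python
import math
from statistics import stdev


def c4(n: int) -> float:
    """Unbiasing factor c4(n) = Gamma(n/2) * sqrt(2/(n-1)) / Gamma((n-1)/2)."""
    ratio = math.exp(math.lgamma(n / 2) - math.lgamma((n - 1) / 2))
    return ratio * math.sqrt(2 / (n - 1))


def sigma_mvlue_stddev(subgroups):
    """Weighted average of s_i / c4(n_i) with weights h_i = c4^2 / (1 - c4^2)."""
    numerator = 0.0
    denominator = 0.0
    for g in subgroups:
        n = len(g)
        if n < 2:
            continue  # subgroups with one observation carry no information
        c = c4(n)
        h = c * c / (1.0 - c * c)
        numerator += h * stdev(g) / c
        denominator += h
    if denominator == 0.0:
        raise ValueError("no subgroup has at least two observations")
    return numerator / denominator


if __name__ == "__main__":
    # Hypothetical data with unequal subgroup sizes.
    print(sigma_mvlue_stddev([[1, 3], [1, 2, 3, 4, 5]]))
```

Because $c_{4}(n)$ approaches 1 as $n$ grows, the weight $h_{i}$ increases with subgroup size, so the result is pulled toward the estimates from the larger subgroups. With equal subgroup sizes all weights coincide and the function returns the plain average.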

MVLUE Method Based on Subgroup Ranges

If you specify the RANGES option and SMETHOD=MVLUE, a minimum variance linear unbiased estimate (MVLUE) is computed for $\sigma $. Refer to Burr (1969, 1976) and Nelson (1989, 1994). The MVLUE is a weighted average of N unbiased estimates of $\sigma $ of the form $R_ i/d_2(n_ i)$, and it is computed as

\[ \hat{\sigma } = \frac{f_{1}R_{1}/d_{2}(n_{1})+ \cdots + f_{N}R_{N}/d_{2}(n_{N})}{f_1 + \cdots + f_ N} \]

where

\[ f_ i = \frac{[d_2(n_ i)]^{2}}{[d_3(n_ i)]^{2}} \]

A subgroup range $R_{i}$ is included in the calculation only if $n_{i} \geq 2$, and N is the number of subgroups for which $n_{i} \geq 2$. The factor $d_{3}(n_{i})$ is defined so that, if the observations are normally distributed, the standard deviation of $R_{i}$, denoted $\sigma_{R_{i}}$, is $d_{3}(n_{i})\sigma$. The MVLUE assigns greater weight to estimates of $\sigma$ from subgroups with larger sample sizes, and it is intended for situations where the subgroup sample sizes vary. If the subgroup sample sizes are constant, the MVLUE reduces to the default estimate.
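A minimal sketch of the range-based MVLUE follows. To keep it short, it assumes subgroup sizes between 2 and 10 and uses the commonly tabulated control chart constants $d_{2}$ and $d_{3}$ rounded to three decimals; the dictionary names and the function name `sigma_mvlue_ranges` are hypothetical, and the rounding means results agree with an exact computation only to about three significant digits.

```python
# Commonly tabulated constants for normal data, rounded to three decimals.
D2 = {2: 1.128, 3: 1.693, 4: 2.059, 5: 2.326, 6: 2.534,
      7: 2.704, 8: 2.847, 9: 2.970, 10: 3.078}
D3 = {2: 0.853, 3: 0.888, 4: 0.880, 5: 0.864, 6: 0.848,
      7: 0.833, 8: 0.820, 9: 0.808, 10: 0.797}


def sigma_mvlue_ranges(subgroups):
    """Weighted average of R_i / d2(n_i) with weights f_i = (d2 / d3)^2."""
    numerator = 0.0
    denominator = 0.0
    for g in subgroups:
        n = len(g)
        if n < 2:
            continue  # a range needs at least two observations
        if n not in D2:
            raise ValueError(f"subgroup size {n} is outside the table (2 to 10)")
        f = (D2[n] / D3[n]) ** 2
        numerator += f * (max(g) - min(g)) / D2[n]
        denominator += f
    if denominator == 0.0:
        raise ValueError("no subgroup has at least two observations")
    return numerator / denominator


if __name__ == "__main__":
    # Hypothetical data with unequal subgroup sizes.
    print(sigma_mvlue_ranges([[10, 14], [1, 2, 3, 4, 5]]))
```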

RMSDF Method Based on Subgroup Standard Deviations

If you do not specify the RANGES option and specify SMETHOD=RMSDF, a weighted root-mean-square estimate is computed for $\sigma $:

\[ \hat{\sigma } = \frac{\sqrt {(n_{1} - 1)s_1^{2} + \cdots + (n_{N} - 1)s_{N}^{2}}}{c_{4}(n)\sqrt {n_{1} + \cdots + n_{N} - N}} \]

where $n = n_1 + \cdots + n_ N - (N - 1)$. The weights are the degrees of freedom $n_{i} - 1$. A subgroup standard deviation $s_{i}$ is included in the calculation only if $n_{i} \geq 2$, and N is the number of subgroups for which $n_{i} \geq 2$.

If the unknown standard deviation $\sigma $ is constant across subgroups, the root-mean-square estimate is more efficient than the minimum variance linear unbiased estimate. However, in process control applications, it is generally not assumed that $\sigma $ is constant, and if $\sigma $ varies across subgroups, the root-mean-square estimate tends to be more inflated than the MVLUE.
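The RMSDF formula can be sketched as a pooled standard deviation divided by $c_{4}(n)$, where $n$ is one more than the pooled degrees of freedom. As with any such sketch, the names `c4` and `sigma_rmsdf` and the list-of-lists input are assumptions for illustration rather than the procedure's own implementation.

```python
import math
from statistics import variance


def c4(n: int) -> float:
    """Unbiasing factor c4(n) = Gamma(n/2) * sqrt(2/(n-1)) / Gamma((n-1)/2)."""
    ratio = math.exp(math.lgamma(n / 2) - math.lgamma((n - 1) / 2))
    return ratio * math.sqrt(2 / (n - 1))


def sigma_rmsdf(subgroups):
    """Root-mean-square estimate weighted by degrees of freedom n_i - 1."""
    used = [g for g in subgroups if len(g) >= 2]
    if not used:
        raise ValueError("no subgroup has at least two observations")
    # Pooled degrees of freedom: n_1 + ... + n_N - N.
    df = sum(len(g) - 1 for g in used)
    # Weighted sum of subgroup variances: sum of (n_i - 1) * s_i^2.
    weighted_ss = sum((len(g) - 1) * variance(g) for g in used)
    # n = n_1 + ... + n_N - (N - 1), which equals df + 1.
    n = df + 1
    return math.sqrt(weighted_ss / df) / c4(n)


if __name__ == "__main__":
    # Hypothetical data: the single-observation subgroup is skipped.
    print(sigma_rmsdf([[1, 2, 3], [2, 4, 6], [9]]))
```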