BOXCHART Statement: SHEWHART Procedure

Output Data Sets

OUTLIMITS= Data Set

The OUTLIMITS= data set saves control limits and control limit parameters. The following variables can be saved:

Table 17.7: OUTLIMITS= Data Set

Variable    Description
_ALPHA_     probability ($\alpha$) of exceeding limits
_CP_        capability index $C_{p}$
_CPK_       capability index $C_{pk}$
_CPL_       capability index $CPL$
_CPM_       capability index $C_{pm}$
_CPU_       capability index $CPU$
_INDEX_     optional identifier for the control limits specified with the OUTINDEX= option
_LCLM_      lower control limit for subgroup median
_LCLR_      lower control limit for subgroup range
_LCLS_      lower control limit for subgroup standard deviation
_LCLX_      lower control limit for subgroup mean
_LIMITN_    nominal sample size associated with the control limits
_LSL_       lower specification limit
_MEAN_      process mean (value of central line on box chart)
_R_         value of central line on R chart
_S_         value of central line on s chart
_SIGMAS_    multiple (k) of standard error of $\bar{X}_{i}$ or $M_{i}$
_STDDEV_    process standard deviation ($\hat{\sigma}$ or $\sigma_{0}$)
_SUBGRP_    subgroup-variable specified in the BOXCHART statement
_TARGET_    target value
_TYPE_      type (estimate or standard value) of _MEAN_ and _STDDEV_
_UCLM_      upper control limit for subgroup median
_UCLR_      upper control limit for subgroup range
_UCLS_      upper control limit for subgroup standard deviation
_UCLX_      upper control limit for subgroup mean
_USL_       upper specification limit
_VAR_       process specified in the BOXCHART statement


Notes:

  1. The variables _LCLM_ and _UCLM_ are included if you specify CONTROLSTAT=MEDIAN; otherwise, the variables _LCLX_ and _UCLX_ are included.

  2. The variables _LCLR_, _R_, and _UCLR_ are included if you specify the RANGES option; otherwise, the variables _LCLS_, _S_, and _UCLS_ are included. These variables are not used to create box charts, but they enable the OUTLIMITS= data set to be used as a LIMITS= data set with the XRCHART, XSCHART, MRCHART, SCHART, and RCHART statements.

  3. If the control limits vary with subgroup sample size, the special missing value V is assigned to the variables _LIMITN_, _LCLX_, _UCLX_, _LCLM_, _UCLM_, _LCLR_, _R_, _UCLR_, _LCLS_, _S_, and _UCLS_.

  4. If the limits are defined in terms of a multiple k of the standard error of $\bar{X}_{i}$, the value of _ALPHA_ is computed as $\alpha =2(1-\Phi (k))$, where $\Phi (\cdot )$ is the standard normal distribution function. If the limits are defined in terms of a multiple k of the standard error of $M_{i}$, the value of _ALPHA_ is computed as $\alpha =2(1-F_{med}(k,n))$, where $F_{med}(\cdot ,n)$ is the cumulative distribution function of the median of a random sample of n standard normally distributed observations, and n is the value of _LIMITN_. If _LIMITN_ has the special missing value V, this value is assigned to _ALPHA_.

  5. If the limits for means are probability limits, the value of _SIGMAS_ is computed as $k=\Phi ^{-1}(1-\alpha /2)$, where $\Phi ^{-1}$ is the inverse standard normal distribution function. If the limits for medians are probability limits, the value of _SIGMAS_ is computed as $k=F_{med}^{-1}(1-\alpha /2,n)$, where $F_{med}^{-1}(\cdot ,n)$ is the inverse distribution function of the median of a random sample of n standard normally distributed observations, and n is the value of _LIMITN_. If _LIMITN_ has the special missing value V, this value is assigned to _SIGMAS_.

  6. The variables _CP_, _CPK_, _CPL_, _CPU_, _LSL_, and _USL_ are included only if you provide specification limits with the LSL= and USL= options. The variables _CPM_ and _TARGET_ are included if, in addition, you provide a target value with the TARGET= option. See Capability Indices for computational details.

  7. Optional BY variables are saved in the OUTLIMITS= data set.
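
As a quick check of the conversions in Notes 4 and 5 for limits on subgroup means, the following DATA step is a minimal sketch that uses the Base SAS functions PROBNORM and PROBIT. The variable names k, alpha, and k_back are illustrative only. The median case is not covered here because it requires the distribution function of the sample median.

```sas
data _null_;
   /* multiple of the standard error used for the limits on means */
   k = 3;

   /* Note 4: alpha = 2(1 - Phi(k)) */
   alpha = 2 * (1 - probnorm(k));

   /* Note 5: k = inverse Phi(1 - alpha/2) */
   k_back = probit(1 - alpha / 2);

   put alpha= k_back=;
run;
```

With k = 3, alpha is approximately 0.0027, and the inverse transformation recovers k = 3 up to rounding.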

The OUTLIMITS= data set contains one observation for each process specified in the BOXCHART statement. For an example, see Saving Control Limits.
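
The following statements sketch how the reuse described in Note 2 might look. They assume a data set Steel with a process variable Width and a subgroup variable Lot (the names used in the OUTHISTORY= example in this section) and a constant subgroup sample size. The data set name WidthLimits is hypothetical.

```sas
/* Save control limits from the box chart.                     */
/* RANGES causes _LCLR_, _R_, and _UCLR_ to be saved (Note 2). */
proc shewhart data=Steel;
   boxchart Width*Lot / outlimits=WidthLimits
                        ranges
                        nochart;
run;

/* Read the saved limits as a LIMITS= data set for an X-bar and R chart */
proc shewhart data=Steel limits=WidthLimits;
   xrchart Width*Lot;
run;
```

Because the limits were saved with the RANGES option, the limits data set holds range-based variables, which is why the sketch pairs it with XRCHART rather than XSCHART.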

OUTBOX= Data Set

The OUTBOX= data set saves subgroup summary statistics, control limits, and outlier values. The following variables can be saved:

  • the subgroup-variable

  • the variable _VAR_, containing the process variable name

  • the variable _TYPE_, identifying features of box-and-whisker plots

  • the variable _VALUE_, containing values of box-and-whisker plot features

  • the variable _ID_, containing labels for outliers

  • the variable _HTML_, containing links associated with box-and-whisker plot features

_ID_ is included in the OUTBOX= data set only if one of the keywords SCHEMATICID or SCHEMATICIDFAR is specified with the BOXSTYLE= option. _HTML_ is present only if one or more of the HTML=, OUTHIGHHTML=, OUTLOWHTML=, or POINTSHTML= options are specified.

Each observation in an OUTBOX= data set records the value of a single feature of one subgroup’s box-and-whisker plot, such as its mean. The _TYPE_ variable identifies the feature whose value is recorded in _VALUE_. The following table lists valid _TYPE_ variable values:

Table 17.8: Valid _TYPE_ Values

Value       Description
N           subgroup size
SIGMAS      multiple (k) of standard error of $\bar{X}_{i}$ or $M_{i}$
ALPHA       probability ($\alpha$) of exceeding limits
LIMITN      nominal sample size associated with control limits
LCLM        lower control limit for subgroup median
LCLX        lower control limit for subgroup mean
UCLM        upper control limit for subgroup median
UCLX        upper control limit for subgroup mean
PROCMED     process median
PROCMEAN    process mean
EXLIM       control limit exceeded on box chart
TREND       trend variable value
MIN         minimum subgroup value
Q1          subgroup first quartile
MEDIAN      subgroup median
MEAN        subgroup mean
Q3          subgroup third quartile
MAX         subgroup maximum value
LOW         low outlier value
HIGH        high outlier value
LOWHISKR    low whisker value, if different from MIN
HIWHISKR    high whisker value, if different from MAX
FARLOW      low far outlier value
FARHIGH     high far outlier value


Additionally, the following variables, if specified, are included:

  • block-variables

  • symbol-variable

  • BY variables

  • ID variables

OUTHISTORY= Data Set

The OUTHISTORY= data set saves subgroup summary statistics. The following variables can be saved:

  • the subgroup-variable

  • a subgroup minimum variable named by the prefix process suffixed with L

  • a subgroup first-quartile variable named by the prefix process suffixed with 1

  • a subgroup mean variable named by the prefix process suffixed with X

  • a subgroup median variable named by the prefix process suffixed with M

  • a subgroup third-quartile variable named by the prefix process suffixed with 3

  • a subgroup maximum variable named by the prefix process suffixed with H

  • a subgroup sample size variable named by the prefix process suffixed with N

  • a subgroup range variable named by the prefix process suffixed with R or a subgroup standard deviation variable named by the prefix process suffixed with S

A subgroup range variable is included if you specify the RANGES option; otherwise, a subgroup standard deviation variable is included.

Given a process name that contains 32 characters, the procedure first shortens the name to its first 16 characters and its last 15 characters, and then it adds the suffix.
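
The shortening rule can be mimicked with ordinary string functions. The following DATA step is only an illustration of the rule as stated above, not the procedure's internal code. The 32-character process name is hypothetical, and the suffix X (subgroup mean) is chosen arbitrarily.

```sas
data _null_;
   length process $32 short $31 meanvar $32;

   /* hypothetical process name of exactly 32 characters */
   process = 'MeasuredOutsideDiameterOfShaftAB';

   /* keep the first 16 and the last 15 characters (positions 18-32) */
   short = cats(substr(process, 1, 16), substr(process, 18, 15));

   /* append the suffix for the subgroup mean variable */
   meanvar = cats(short, 'X');

   put meanvar=;
run;
```

The effect is that the 17th character of the process name is dropped so that the suffixed name still fits in 32 characters.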

Subgroup summary variables are created for each process specified in the BOXCHART statement. For example, consider the following statements:

proc shewhart data=steel;
   boxchart (Width Diameter)*Lot / outhistory=Summary;
run;

The data set Summary contains variables named Lot, WidthL, Width1, WidthM, WidthX, Width3, WidthH, WidthS, WidthN, DiameterL, Diameter1, DiameterM, DiameterX, Diameter3, DiameterH, DiameterS, and DiameterN.

The variables WidthS and DiameterS are included because the RANGES option is not specified. If you specified the RANGES option, the data set Summary would contain the variables WidthR and DiameterR rather than WidthS and DiameterS.
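
For comparison, the RANGES variant might be written as follows. The data set name SummaryR is hypothetical, and NOCHART is added here only to suppress the chart. The PROC CONTENTS step is one way to confirm which summary variables were created.

```sas
/* Same request as above, but with subgroup ranges instead of standard deviations */
proc shewhart data=steel;
   boxchart (Width Diameter)*Lot / outhistory=SummaryR
                                   ranges
                                   nochart;
run;

/* List the variables in creation order; expect WidthR and DiameterR */
proc contents data=SummaryR varnum;
run;
```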

Additionally, the following variables, if specified, are included:

  • BY variables

  • block-variables

  • symbol-variable

  • ID variables

  • _PHASE_ (if the OUTPHASE= option is specified)

For an example of an OUTHISTORY= data set, see Saving Summary Statistics.

OUTTABLE= Data Set

The OUTTABLE= data set saves subgroup summary statistics, control limits, and related information. The following variables can be saved:

Variable    Description
_ALPHA_     probability ($\alpha$) of exceeding control limits
_EXLIM_     control limit exceeded on box chart
_LCLM_      lower control limit for median
_LCLX_      lower control limit for mean
_LIMITN_    nominal sample size associated with the control limits
_MEAN_      process mean
_SIGMAS_    multiple (k) of the standard error associated with control limits
subgroup    values of the subgroup variable
_SUBMAX_    subgroup maximum
_SUBMED_    subgroup median
_SUBMIN_    subgroup minimum
_SUBN_      subgroup sample size
_SUBQ1_     subgroup first quartile (25th percentile)
_SUBQ3_     subgroup third quartile (75th percentile)
_SUBX_      subgroup mean
_TESTS_     tests for special causes signaled on box chart
_UCLM_      upper control limit for median
_UCLX_      upper control limit for mean
_VAR_       process specified in the BOXCHART statement

The variables _LCLM_ and _UCLM_ are included if you specify CONTROLSTAT=MEDIAN; otherwise, the variables _LCLX_ and _UCLX_ are included. In addition, the following variables, if specified, are included:

  • BY variables

  • block-variables

  • symbol-variable

  • ID variables

  • _PHASE_ (if the READPHASES= option is specified)

  • _TREND_ (if the TRENDVAR= option is specified)

Notes:

  1. Either the variable _ALPHA_ or the variable _SIGMAS_ is saved depending on how the control limits are defined (with the ALPHA= or SIGMAS= options, respectively, or with the corresponding variables in a LIMITS= data set).

  2. The variable _TESTS_ is saved if you specify the TESTS= option. The kth character of a value of _TESTS_ is k if Test k is positive at that subgroup. For example, if you request all eight tests and Tests 2 and 8 are positive for a given subgroup, the value of _TESTS_ has a 2 for the second character, an 8 for the eighth character, and blanks for the other six characters.

  3. The variables _EXLIM_ and _TESTS_ are character variables of length 8. The variable _PHASE_ is a character variable of length 48. The variable _VAR_ is a character variable whose length is no greater than 32. All other variables are numeric.
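
The encoding of _EXLIM_ and _TESTS_ described in these notes lends itself to simple filtering. The following statements are a sketch that assumes a data set Steel with a process variable Width and a subgroup variable Lot. The data set name WidthTable and the choice of Tests 1 to 4 are illustrative only.

```sas
/* Save the chart table, requesting Tests 1 through 4 */
proc shewhart data=Steel;
   boxchart Width*Lot / outtable=WidthTable
                        tests=1 to 4
                        nochart;
run;

/* List subgroups that exceed a control limit or signal Test 2.   */
/* Per Note 2, the second character of _TESTS_ is '2' for Test 2. */
proc print data=WidthTable noobs;
   where _EXLIM_ ne ' ' or substr(_TESTS_, 2, 1) = '2';
   var Lot _SUBX_ _LCLX_ _UCLX_ _EXLIM_ _TESTS_;
run;
```

The VAR statement lists _LCLX_ and _UCLX_ because CONTROLSTAT=MEDIAN is not specified. With CONTROLSTAT=MEDIAN, the corresponding variables would be _LCLM_ and _UCLM_.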

For an example, see Saving Control Limits.