| BOXCHART Statement |
You can read summary statistics, decision limits, and outlier values from a BOX= data set specified in the PROC ANOM statement. This enables you to reuse an OUTBOX= data set created in a previous run of the ANOM procedure to display a box chart.
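The round trip described above can be sketched as follows. This is a minimal illustration, not a prescribed workflow: the data set Info and the variables Weight and Batch are borrowed from the LIMITS= example later in this section, and the data set name BoxSummary is hypothetical.

```sas
/* First run: read raw data and save the box chart summary.      */
/* BoxSummary is a hypothetical name chosen for this sketch.     */
proc anom data=Info;
   boxchart Weight*Batch / outbox=BoxSummary;
run;

/* Later run: redisplay the chart from the saved summary alone.  */
proc anom box=BoxSummary;
   boxchart Weight*Batch;
run;
```

In the second step the procedure reads the saved statistics, decision limits, and outlier values from BoxSummary instead of the raw measurements.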
A BOX= data set must contain the following variables:
Each observation in a BOX= data set records the value of a single feature of one group's box-and-whisker plot, such as its mean. The _TYPE_ variable identifies the feature whose value is recorded in a given observation. The following table lists valid _TYPE_ variable values:
Table 5.23: Valid _TYPE_ Values in a BOX= Data Set
| _TYPE_ Value | Description |
| N | group size |
| ALPHA | significance level |
| LIMITN | nominal sample size associated with decision limits |
| LDLX | lower decision limit for group mean |
| UDLX | upper decision limit for group mean |
| RESPMEAN | overall response variable mean |
| MIN | group minimum value |
| Q1 | group first quartile |
| MEDIAN | group median |
| MEAN | group mean |
| Q3 | group third quartile |
| MAX | group maximum value |
| LOW | low outlier value |
| HIGH | high outlier value |
| LOWHISKR | low whisker value, if different from MIN |
| HIWHISKR | high whisker value, if different from MAX |
| FARLOW | low far outlier value |
| FARHIGH | high far outlier value |
The features identified by _TYPE_ values N, LDLX, UDLX, RESPMEAN, MIN, Q1, MEDIAN, MEAN, Q3, and MAX are required for each group.
Other variables that can be read from a BOX= data set include:
When you specify one of the keywords SCHEMATICID or SCHEMATICIDFAR with the BOXSTYLE= option, values of _ID_ are used as outlier labels. If _ID_ does not exist in the BOX= data set, the values of the first variable listed in the ID statement are used.
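As a hedged sketch of the labeling rule above, the following step requests labeled outliers and names an ID variable as a fallback. The data set BoxSummary and the variable Operator are hypothetical, and the sketch assumes that Operator is present in BoxSummary.

```sas
proc anom box=BoxSummary;
   /* SCHEMATICID labels outliers with the values of _ID_ */
   boxchart Weight*Batch / boxstyle=schematicid;
   /* Operator (hypothetical) supplies the labels only if */
   /* _ID_ is absent from BoxSummary                      */
   id Operator;
run;
```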
You can read raw data (response values) from a DATA= data set specified in the PROC ANOM statement. Each response specified in the BOXCHART statement must be a SAS variable in the DATA= data set. This variable provides measurements that must be grouped into group samples indexed by the group-variable. The group-variable, which is specified in the BOXCHART statement, must also be a SAS variable in the DATA= data set. Each observation in a DATA= data set must contain a value for each response and a value for the group-variable. If the $i$th group contains $n_i$ items, there should be $n_i$ consecutive observations for which the value of the group-variable is the index of the $i$th group. For example, if each group contains five items and there are 10 groups, the DATA= data set should contain 50 observations.
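The layout described above might look like the following sketch, which uses three groups of four items so that the data set has 12 observations. The measurement values are made up purely for illustration; the names Info, Weight, and Batch follow the example used later in this section.

```sas
/* One observation per item; the observations for each */
/* Batch value are consecutive. Values are illustrative. */
data Info;
   input Batch Weight @@;
   datalines;
1 25.1  1 24.8  1 25.3  1 25.0
2 24.9  2 25.0  2 25.4  2 24.7
3 25.2  3 25.1  3 24.6  3 25.3
;
run;

proc anom data=Info;
   boxchart Weight*Batch;
run;
```

If the observations for a group were scattered through the data set, sorting by the group-variable with PROC SORT before calling PROC ANOM would restore the consecutive arrangement.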
Other variables that can be read from a DATA= data set include
By default, the ANOM procedure reads all of the observations in a DATA= data set. However, if the data set includes the variable _PHASE_, you can read selected groups of observations (referred to as phases) with the READPHASES= option.
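A minimal sketch of phase selection follows. It assumes that Info contains a character variable _PHASE_ and that 'Pilot' and 'Production' are among its values; both phase labels are hypothetical.

```sas
proc anom data=Info;
   /* Read only the groups whose _PHASE_ value is listed. */
   /* The phase labels are hypothetical.                  */
   boxchart Weight*Batch / readphases=('Pilot' 'Production');
run;
```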
For an example of a DATA= data set, see "Creating ANOM Boxcharts from Response Values".
You can read preestablished decision limits (or parameters from which the decision limits can be calculated) from a LIMITS= data set specified in the PROC ANOM statement. For example, the following statements read decision limit information from the data set Conlims:
proc anom data=Info limits=Conlims;
   boxchart Weight*Batch;
run;
The LIMITS= data set can be an OUTLIMITS= data set that was created in a previous run of the ANOM procedure. Such data sets always contain the variables required for a LIMITS= data set; see Table 5.20. The LIMITS= data set can also be created directly using a DATA step. When you create a LIMITS= data set, you must provide one of the following:
In addition, note the following:
You can read group summary statistics from a SUMMARY= data set specified in the PROC ANOM statement. This enables you to reuse OUTSUMMARY= data sets that have been created in previous runs of the ANOM procedure or to read output data sets created with SAS summarization procedures, such as PROC MEANS.
A SUMMARY= data set used with the BOXCHART statement must contain the following:
The names of the group summary statistics variables must be the response name concatenated with the following special suffix characters:
| Group Summary Statistic | Suffix Character |
| group minimum | L |
| group first-quartile | 1 |
| group median | M |
| group mean | X |
| group third-quartile | 3 |
| group maximum | H |
| group standard deviation | S |
| group sample size | N |
For example, consider the following statements:
proc anom summary=Summary;
   boxchart (Weight Yieldstrength)*Batch;
run;
The data set Summary must include the variables Batch, WeightL, Weight1, WeightX, WeightM, Weight3, WeightH, WeightS, WeightN, YieldstrengthL, Yieldstrength1, YieldstrengthX, YieldstrengthM, Yieldstrength3, YieldstrengthH, YieldstrengthS, and YieldstrengthN. Note that if you specify a response name that contains 32 characters, the names of the summary variables must be formed from the first 16 characters and the last 15 characters of the response name, suffixed with the appropriate character.
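The rule for 32-character response names can be sketched in a DATA step. The response name below is hypothetical and was chosen only because it is exactly 32 characters long. Taking the first 16 and the last 15 characters amounts to dropping the 17th character before the suffix is appended.

```sas
data _null_;
   length response $32 prefix $31 meanvar $32;
   /* Hypothetical response name, exactly 32 characters long */
   response = 'Tensile_Yield_Strength_Of_Sample';
   if length(response) = 32 then
      /* characters 1-16 followed by characters 18-32 */
      prefix = cats(substr(response, 1, 16), substr(response, 18, 15));
   else
      prefix = response;        /* shorter names are used as is */
   meanvar = cats(prefix, 'X'); /* X is the suffix for the group mean */
   put response= / meanvar=;
run;
```

The same prefix would be combined with each of the other suffix characters (L, 1, M, 3, H, S, and N) to form the remaining summary variable names.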
Other variables that can be read from a SUMMARY= data set include
By default, the ANOM procedure reads all of the observations in a SUMMARY= data set. However, if the data set includes the variable _PHASE_, you can read selected groups of observations (referred to as phases) by specifying the READPHASES= option.
For an example of a SUMMARY= data set, see "Creating ANOM Boxcharts from Group Summary Data".
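As a sketch of the PROC MEANS route mentioned earlier in this section, the following steps compute the eight group summary statistics for a single response and name them with the required suffixes. The names Info, Weight, and Batch follow the other examples in this section; the intermediate data set Sorted is hypothetical.

```sas
/* BY-group processing requires the data to be sorted by Batch. */
proc sort data=Info out=Sorted;
   by Batch;
run;

proc means data=Sorted noprint;
   by Batch;
   var Weight;
   /* Drop the automatic _TYPE_ and _FREQ_ variables and name   */
   /* each statistic as the response name plus its suffix.      */
   output out=Summary(drop=_type_ _freq_)
      min=WeightL  q1=Weight1   median=WeightM  mean=WeightX
      q3=Weight3   max=WeightH  std=WeightS     n=WeightN;
run;

proc anom summary=Summary;
   boxchart Weight*Batch;
run;
```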
You can read summary statistics and decision limits from a TABLE= data set specified in the PROC ANOM statement. This enables you to reuse an OUTTABLE= data set created in a previous run of the ANOM procedure. Because the ANOM procedure simply displays the information in a TABLE= data set, you can use TABLE= data sets to create specialized ANOM charts.
The following table lists the variables required in a TABLE= data set used with the BOXCHART statement:
Table 5.24: Variables Required in a TABLE= Data Set
| Variable | Description |
| group-variable | values of the group-variable |
| _LDLX_ | lower decision limit for mean |
| _LIMITN_ | nominal sample size associated with the decision limits |
| _MEAN_ | central line |
| _SUBMAX_ | group maximum |
| _SUBMED_ | group median |
| _SUBMIN_ | group minimum |
| _SUBN_ | group sample size |
| _SUBQ1_ | group first quartile |
| _SUBQ3_ | group third quartile |
| _SUBX_ | group mean |
| _UDLX_ | upper decision limit for mean |
Other variables that can be read from a TABLE= data set include
For an example of a TABLE= data set, see "Saving Decision Limits".
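A minimal sketch of reusing an OUTTABLE= data set as a TABLE= data set follows. The data set name BoxTable is hypothetical; Info, Weight, and Batch follow the other examples in this section.

```sas
/* First run: save the chart information in BoxTable (hypothetical name). */
proc anom data=Info;
   boxchart Weight*Batch / outtable=BoxTable;
run;

/* Later run: display the chart directly from the saved table. */
proc anom table=BoxTable;
   boxchart Weight*Batch;
run;
```

Because the second step only displays what the table contains, a DATA step placed between the two runs could adjust the saved values to produce a specialized chart.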
Copyright © 2008 by SAS Institute Inc., Cary, NC, USA. All rights reserved.