The BOXPLOT Procedure

Output Data Sets

The OUTBOX= data set saves group summary statistics and outlier values. The following variables can be saved:

the group variable

the variable _VAR_, containing the analysis variable name

the variable _TYPE_, identifying features of box-and-whiskers plots

the variable _VALUE_, containing values of box-and-whiskers plot features

the variable _ID_, containing labels for outliers

the variable _HTML_, containing URLs associated with plot features

_ID_ is included in the OUTBOX= data set only if the keyword SCHEMATICID or SCHEMATICIDFAR is specified with the BOXSTYLE= option. _HTML_ is present only if one or more of the HTML=, OUTHIGHHTML=, and OUTLOWHTML= options are specified.

Each observation in an OUTBOX= data set records the value of a single feature of one group’s box-and-whiskers plot, such as its mean. The _TYPE_ variable identifies the feature whose value is recorded in _VALUE_. Table 24.7 lists valid _TYPE_ variable values.

_TYPE_ | Description |
---|---|
N | group size |
MIN | minimum group value |
Q1 | group first quartile |
MEDIAN | group median |
MEAN | group mean |
Q3 | group third quartile |
MAX | group maximum value |
STDDEV | group standard deviation |
LOW | low outlier value |
HIGH | high outlier value |
LOWHISKR | low whisker value, if different from MIN |
HIWHISKR | high whisker value, if different from MAX |
FARLOW | low far outlier value |
FARHIGH | high far outlier value |
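Because each observation holds one feature of one group's box, a single statistic can be pulled out by subsetting on _TYPE_. The following is a minimal sketch, assuming a data set Steel with analysis variable Width and group variable Lot, sorted by Lot. The data set names BoxSummary and Medians are hypothetical and chosen only for illustration.

```sas
/* Save box-and-whiskers plot features, one feature per observation. */
/* Assumes Steel is sorted by the group variable Lot.                */
proc boxplot data=Steel;
   plot Width*Lot / boxstyle=schematic outbox=BoxSummary;
run;

/* Keep only the observations that record group medians. */
data Medians;
   set BoxSummary;
   where _TYPE_ = 'MEDIAN';
   keep Lot _VAR_ _VALUE_;
run;

proc print data=Medians noobs;
run;
```

The same pattern applies to any _TYPE_ value in the table. Outlier types such as LOW and HIGH can contribute several observations per group.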

Additionally, the following variables, if specified, are included:

block variables

symbol variable

BY variables

ID variables

The OUTHISTORY= data set saves group summary statistics. The following variables are saved:

the group variable

group minimum variables named by *analysis-variable* suffixed with *L*

group first-quartile variables named by *analysis-variable* suffixed with *1*

group mean variables named by *analysis-variable* suffixed with *X*

group median variables named by *analysis-variable* suffixed with *M*

group third-quartile variables named by *analysis-variable* suffixed with *3*

group maximum variables named by *analysis-variable* suffixed with *H*

group standard deviation variables named by *analysis-variable* suffixed with *S*

group size variables named by *analysis-variable* suffixed with *N*

If an analysis variable name has the maximum length of 32 characters, PROC BOXPLOT forms summary statistic names from its first 16 characters, its last 15 characters, and the appropriate suffix.
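The naming rule can be illustrated with a short DATA step. This is only a sketch of the rule as stated above, not the procedure's internal logic. The 32-character name ThicknessOfTheOuterCoatingLayerA is a made-up example, and the handling of shorter names (appending the suffix directly) is an assumption based on names such as WidthL.

```sas
/* Illustrative sketch: build a summary statistic name from an */
/* analysis variable name and a one-character suffix.          */
data _null_;
   length varname $32 suffix $1 statname $32;
   varname = 'ThicknessOfTheOuterCoatingLayerA';  /* hypothetical 32-character name */
   suffix  = 'X';                                 /* suffix for the group mean */

   if length(varname) = 32 then
      /* first 16 characters, last 15 characters, then the suffix */
      statname = cats(substr(varname, 1, 16), substr(varname, 18, 15), suffix);
   else
      /* assumed behavior for shorter names: append the suffix */
      statname = cats(varname, suffix);

   put statname=;
run;
```

The seventeenth character of the original name is the one dropped, which keeps the result within the 32-character limit.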

Subgroup summary variables are created for each analysis variable specified in the PLOT statement. For example, consider the following statements:

    proc boxplot data=Steel;
       plot (Width Diameter)*Lot / outhistory=Summary;
    run;

The data set Summary contains variables named Lot, WidthL, Width1, WidthM, WidthX, Width3, WidthH, WidthS, WidthN, DiameterL, Diameter1, DiameterM, DiameterX, Diameter3, DiameterH, DiameterS, and DiameterN.
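Because each statistic is stored in its own variable, derived quantities can be computed directly from one observation per group. As a sketch, the following step computes the interquartile range for each analysis variable. The data set name SummaryIQR and the variables WidthIQR and DiameterIQR are hypothetical names introduced here.

```sas
/* Create the OUTHISTORY= data set as in the example above. */
proc boxplot data=Steel;
   plot (Width Diameter)*Lot / outhistory=Summary;
run;

/* Interquartile range per group: third quartile minus first quartile. */
data SummaryIQR;
   set Summary;
   WidthIQR    = Width3 - Width1;
   DiameterIQR = Diameter3 - Diameter1;
run;
```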

Additionally, the following variables, if specified, are included:

BY variables

block variables

symbol variable

ID variables

Note that an OUTHISTORY= data set does not contain outlier values, and therefore cannot be used, in general, to save a schematic box plot. You can use an OUTBOX= data set to save a schematic box plot summary.
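As a sketch of that workflow, the following statements save a schematic box plot summary and then redraw the plot from it. The example assumes the Steel data set described earlier and uses the BOX= option in the PROC BOXPLOT statement to read the saved summary. BoxSummary is a hypothetical data set name.

```sas
/* Save group summary statistics and outlier values. */
proc boxplot data=Steel;
   plot Width*Lot / boxstyle=schematic outbox=BoxSummary;
run;

/* Redraw the schematic box plot from the saved summary. */
proc boxplot box=BoxSummary;
   plot Width*Lot / boxstyle=schematic;
run;
```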

Copyright © SAS Institute, Inc. All Rights Reserved.