The PARETO Procedure

Creating Output Data Sets

The OUT= data set saves the information that is displayed on a Pareto chart. If you specify CLASS= variables, the OUT= data set contains one block of observations for each combination of levels of the CLASS= variables, and each block contains an observation for each Pareto category. The observations are sorted in the order in which the categories are displayed on the chart. The following variables from a DATA= data set are saved in an OUT= data set:

In addition, the OUT= data set contains the following variables that are created during the analysis:

  • _COUNT_, which saves the frequency count for each Pareto category

  • _WCOUNT_, which saves the weighted count for each category. This variable is created only when you specify the WEIGHT= option.

  • _PCT_, which saves the percentage of the total count for each category. If you specify the WEIGHT= option, the variable _PCT_ saves the percentage of the total weighted count.

  • _CMPCT_, which saves the cumulative percentage for each category

See Output 15.8.2 for an example of an OUT= data set.
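
To make the relationship between these variables concrete, the following Python sketch computes the same three quantities for an unweighted list of category values. It is an illustration only and not PROC PARETO itself: the function name pareto_summary and the sample values are hypothetical, and it assumes categories are displayed in descending order of frequency. Weighted counts are left out because their definition is not given in this section.

```python
from collections import Counter


def pareto_summary(values):
    """Return one row per category with _COUNT_, _PCT_, and _CMPCT_.

    Hypothetical helper for illustration; it mimics the layout of an
    OUT= data set for a single block (no CLASS= variables, no weights).
    """
    counts = Counter(values)
    total = sum(counts.values())
    rows = []
    cumulative = 0
    # Assumption: categories appear in descending order of count.
    for category, count in counts.most_common():
        cumulative += count
        rows.append({
            "category": category,
            "_COUNT_": count,
            "_PCT_": 100.0 * count / total,
            # Derived from the running count to avoid accumulating
            # floating-point error across categories.
            "_CMPCT_": 100.0 * cumulative / total,
        })
    return rows


if __name__ == "__main__":
    # Made-up sample data, used only to show the shape of the result.
    causes = ["Contamination"] * 5 + ["Oxide Defect"] * 3 + ["Corrosion"] * 2
    for row in pareto_summary(causes):
        print(row)
```

In this sketch the last row always has a cumulative percentage of 100, which matches the endpoint of the cumulative curve on a chart that displays every category.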

If you specify the MAXNCAT=, MAXCMPCT=, or MINPCT= option, the OUT= data set saves only the categories that are displayed on the chart. If you create an OTHER= category that merges the remaining categories, an additional observation is saved with the new category. Because the OTHER= value is defined as a formatted value of the process variable, you should also specify a corresponding internal value, as follows:

  • If the process variable is a character variable, specify the internal value in the OTHERCVAL= option. If you do not specify this value, the OTHER= value is saved as the internal value.

  • If the process variable is a numeric variable, specify the internal value in the OTHERNVAL= option. If you do not specify this value, an internal missing value is saved.
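
The two rules above can be summarized in a short Python sketch. It is a model of the described behavior and not the procedure's implementation: the function other_row and its parameters are hypothetical names that mirror the OTHER=, OTHERCVAL=, and OTHERNVAL= options, and None stands in for a SAS missing value.

```python
def other_row(other, remaining_counts, numeric=False,
              othercval=None, othernval=None):
    """Build the extra observation for a merged OTHER= category.

    Hypothetical helper for illustration only.
    other            -- formatted value of the merged category
    remaining_counts -- counts of the categories being merged
    numeric          -- True if the process variable is numeric
    """
    if numeric:
        # Numeric process variable: use OTHERNVAL= if given,
        # otherwise the internal value is missing (None here).
        internal = othernval
    else:
        # Character process variable: use OTHERCVAL= if given,
        # otherwise fall back to the OTHER= value itself.
        internal = othercval if othercval is not None else other
    return {
        "formatted": other,
        "internal": internal,
        "_COUNT_": sum(remaining_counts),
    }


if __name__ == "__main__":
    # Made-up counts for the categories that were not displayed.
    print(other_row("Others", [2, 1]))
    print(other_row("Others", [2, 1], othercval="OTH"))
    print(other_row("Others", [2, 1], numeric=True))
    print(other_row("Others", [2, 1], numeric=True, othernval=99))
```

Specifying the internal value explicitly matters mostly when the OUT= data set is processed further, for example when it is merged with data keyed on the unformatted values of the process variable.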