PROC UNIVARIATE creates an OUT= data set for each OUTPUT statement. This data set contains an observation for each combination of levels of the variables in the BY and CLASS statements, or a single observation if you do not specify a BY or CLASS statement. Thus the number of observations in the new data set equals the number of groups for which statistics are calculated.
Without a BY or CLASS statement, the procedure computes statistics and percentiles by using all the observations in the input data set. With a BY statement, the procedure computes statistics and percentiles by using the observations within each BY group. With a CLASS statement, the procedure computes statistics and percentiles by using the observations that correspond to each level of the CLASS variables within each BY group.
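As a minimal sketch of the grouping behavior described above, the following step requests three statistics with a CLASS statement. The data set names (work.scores, work.stats), the variable names (group, score), and the output variable names are hypothetical and chosen only for illustration.

```sas
/* Hypothetical input data: one CLASS variable with two levels */
data work.scores;
   input group $ score @@;
   datalines;
A 10 A 12 A 15 B 20 B 22 B 27 B 30
;
run;

/* NOPRINT suppresses the displayed output; only the OUT= data set is created */
proc univariate data=work.scores noprint;
   class group;
   var score;
   output out=work.stats n=NScore mean=MeanScore std=StdScore;
run;

/* Expect one observation per level of group */
proc print data=work.stats;
run;
```

Because group has two levels and there is no BY statement, work.stats should contain two observations. Removing the CLASS statement should reduce it to a single observation that summarizes all the input data.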
The variables in the OUT= data set are as follows:
BY statement variables. The values of these variables match the values in the corresponding BY group in the DATA= data set and indicate which BY group each observation summarizes.
CLASS statement variables. The values of these variables match the CLASS levels within a BY group that each observation summarizes.
variables created by selecting statistics in the OUTPUT statement. Without a BY or CLASS statement, each statistic is computed from all the nonmissing values of the analysis variable. With BY or CLASS statements, each statistic is computed separately for each CLASS level within each BY group.
variables created by requesting new percentiles with the PCTLPTS= option. The names of these new variables depend on the values of the PCTLPRE= and PCTLNAME= options.
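The naming of percentile variables can be illustrated with a short sketch. The data set and variable names here are hypothetical, and the prefix P_ and the suffixes Low, Mid, and High are arbitrary choices, not defaults.

```sas
/* Hypothetical input data */
data work.scores;
   input score @@;
   datalines;
10 12 15 20 22 27 30
;
run;

proc univariate data=work.scores noprint;
   var score;
   /* PCTLPRE= supplies the prefix; PCTLNAME= supplies one suffix per percentile */
   output out=work.pctls
          pctlpts=2.5 50 97.5
          pctlpre=P_
          pctlname=Low Mid High;
run;

proc print data=work.pctls;
run;
```

With these options the new variables should be named P_Low, P_Mid, and P_High. If PCTLNAME= is omitted, the procedure builds each suffix from the percentile value itself, so it is worth checking the resulting names with PROC PRINT or PROC CONTENTS.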
If the output data set contains a percentile variable or a quartile variable, the percentile definition assigned with the PCTLDEF= option in the PROC UNIVARIATE statement is recorded in the output data set label. See Example 4.8.
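One way to confirm which percentile definition was used is to inspect the output data set's label, for example with PROC CONTENTS. The following is a sketch only: the data set and variable names are hypothetical, and PCTLDEF=4 is an arbitrary non-default choice.

```sas
/* Hypothetical input data */
data work.scores;
   input score @@;
   datalines;
10 12 15 20 22 27 30
;
run;

/* PCTLDEF= is specified in the PROC UNIVARIATE statement */
proc univariate data=work.scores pctldef=4 noprint;
   var score;
   output out=work.quartiles q1=Q1Score median=MedScore q3=Q3Score;
run;

/* The data set label appears in the attributes section of the PROC CONTENTS output */
proc contents data=work.quartiles;
run;
```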
The following table lists variables available in the OUT= data set.
Table 4.36: Variables Available in the OUT= Data Set