The UNIVARIATE Procedure

OUTKERNEL= Output Data Set

You can create an OUTKERNEL= data set with the HISTOGRAM statement. This data set contains information about the kernel density estimates requested with the KERNEL option. Because you can specify multiple HISTOGRAM statements with the UNIVARIATE procedure, you can create multiple OUTKERNEL= data sets.

An OUTKERNEL= data set contains a group of observations for each kernel density estimate requested with the HISTOGRAM statement. These observations span a range of analysis variable values recorded in the _VALUE_ variable. The procedure determines the increment between values, and therefore the number of observations in the group. The variable _DENSITY_ contains the kernel density calculated for the corresponding analysis variable value.
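As a minimal sketch of how such a data set might be produced, the following step requests two kernel density estimates and saves them. The input data set Sashelp.Heart, the variable Cholesterol, the bandwidth values, and the output name Work.Kern are illustrative assumptions, not part of this section.

```sas
/* Illustrative only: data set, variable, and bandwidths are assumed. */
proc univariate data=sashelp.heart noprint;
   var Cholesterol;
   /* Two C= values request two kernel estimates, so the      */
   /* OUTKERNEL= data set should hold two groups of rows.     */
   histogram Cholesterol / kernel(c=0.5 1.0)
                           outkernel=work.kern;
run;

/* Inspect the first few observations of the saved estimates. */
proc print data=work.kern(obs=10);
run;
```

Each group of observations in Work.Kern should correspond to one requested estimate, with _VALUE_ stepping across the range of Cholesterol.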

When a density curve is overlaid on a histogram, the curve is scaled so that the area under the curve equals the total area of the histogram bars. The scaled density values are saved in the variable _COUNT_, _PERCENT_, or _PROPORTION_, depending on the histogram’s vertical axis scale, which is set by the VSCALE= option. Only one of these variables appears in a given OUTKERNEL= data set.
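One way to see the scaling is to compare the scaled variable with _DENSITY_. The sketch below assumes VSCALE=PERCENT and the same illustrative input (Sashelp.Heart, Cholesterol) and work data set names, all of which are assumptions. If the curve is scaled to match the bar area as described, the ratio _PERCENT_ / _DENSITY_ should be essentially constant within each estimate, and it is expected to equal 100 times the histogram bin width.

```sas
/* Illustrative only: data set, variable, and names are assumed. */
proc univariate data=sashelp.heart noprint;
   var Cholesterol;
   histogram Cholesterol / vscale=percent
                           kernel(c=0.5 1.0)
                           outkernel=work.kern;
run;

/* Ratio of scaled density to unscaled density at each grid point. */
data work.kern_ratio;
   set work.kern;
   /* Skip points where the density is zero to avoid division by zero. */
   if _density_ > 0 then ratio = _percent_ / _density_;
run;

/* Min and max of the ratio per bandwidth; they should nearly coincide. */
proc means data=work.kern_ratio min max;
   class _c_;
   var ratio;
run;
```

With VSCALE=COUNT or VSCALE=PROPORTION, the same check would use _COUNT_ or _PROPORTION_ in place of _PERCENT_, because only one scaled variable is written.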

Table 4.38 lists the variables in an OUTKERNEL= data set.

Table 4.38: Variables in the OUTKERNEL= Data Set

Variable        Description

_C_             standardized bandwidth parameter
_COUNT_         kernel density scaled for VSCALE=COUNT
_DENSITY_       kernel density
_PERCENT_       kernel density scaled for VSCALE=PERCENT (default)
_PROPORTION_    kernel density scaled for VSCALE=PROPORTION
_TYPE_          kernel function
_VALUE_         variable value at which kernel function is calculated
_VAR_           variable name
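
The variables listed above are enough to redraw the saved estimates outside the HISTOGRAM statement. The following is a hedged sketch, again assuming Sashelp.Heart, Cholesterol, and Work.Kern as illustrative names. It plots _DENSITY_ against _VALUE_ and uses _C_ to separate the curves for the two requested bandwidths.

```sas
/* Illustrative only: data set, variable, and names are assumed. */
proc univariate data=sashelp.heart noprint;
   var Cholesterol;
   histogram Cholesterol / kernel(c=0.5 1.0)
                           outkernel=work.kern;
run;

/* One series per standardized bandwidth value stored in _C_. */
proc sgplot data=work.kern;
   series x=_value_ y=_density_ / group=_c_;
   xaxis label="Cholesterol";
   yaxis label="Kernel density";
run;
```

If the HISTOGRAM statement lists several analysis variables, a WHERE clause on _VAR_ could restrict the plot to one of them.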