The KDE Procedure

ODS Graphics

This section describes the use of ODS for creating graphics with the KDE procedure. To request these graphs, you must specify the ODS GRAPHICS statement in addition to the following options. For more information about the ODS GRAPHICS statement, see Chapter 21, Statistical Graphics Using ODS.

Bivariate Plots

You can specify the PLOTS= option in the BIVAR statement to request graphical displays of bivariate kernel density estimates.

PLOTS= option1 <option2 ...>

requests one or more plots of the bivariate kernel density estimate. The following table shows the available plot options.

Option          Description
ALL             all available displays
CONTOUR         contour plot of bivariate density estimate
CONTOURSCATTER  contour plot of bivariate density estimate overlaid with scatter plot of data
HISTOGRAM       bivariate histogram of data
HISTSURFACE     bivariate histogram overlaid with bivariate kernel density estimate
NONE            suppresses all plots
SCATTER         scatter plot of data
SURFACE         surface plot of bivariate kernel density estimate

By default, if you enable ODS Graphics and you do not specify the PLOTS= option, then the BIVAR statement creates a contour plot. If you specify the PLOTS= option, you get only the requested plots.
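As a minimal sketch of this syntax, the following statements request a contour plot and a surface plot of a bivariate density estimate. The Sashelp.Heart data set and its Height and Weight variables are assumptions chosen only for illustration; substitute your own data set and variables.

```sas
/* Enable ODS Graphics so that PROC KDE can produce plots */
ods graphics on;

/* Illustrative data only: Sashelp.Heart with Height and Weight */
proc kde data=sashelp.heart;
   /* Request two of the bivariate displays listed above */
   bivar Height Weight / plots=contour surface;
run;

ods graphics off;
```

Because the PLOTS= option is specified, only the two requested plots are produced; omitting it would yield the default contour plot.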

Univariate Plots

You can specify the PLOTS= option in the UNIVAR statement to request graphical displays of univariate kernel density estimates.

PLOTS= option1 <option2 ...>

requests one or more plots of the univariate kernel density estimate. The following table shows the available plot options.

Option          Description
ALL             all available displays
DENSITY         univariate kernel density estimate curve
DENSITYOVERLAY  overlaid univariate kernel density estimate curves
HISTDENSITY     univariate histogram of data overlaid with kernel density estimate curve
HISTOGRAM       univariate histogram of data
NONE            suppresses all plots

By default, if you enable ODS Graphics and you do not specify the PLOTS= option, then the UNIVAR statement creates a histogram overlaid with a kernel density estimate. If you specify the PLOTS= option, you get only the requested plots.
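The univariate case can be sketched the same way. In this example the Sashelp.Heart data set and its Systolic and Diastolic variables are assumptions used only for illustration. Listing two variables in one UNIVAR statement gives the DENSITYOVERLAY option two curves to overlay.

```sas
/* Enable ODS Graphics so that PROC KDE can produce plots */
ods graphics on;

/* Illustrative data only: Sashelp.Heart with Systolic and Diastolic */
proc kde data=sashelp.heart;
   /* One density curve per variable, plus both curves overlaid */
   univar Systolic Diastolic / plots=density densityoverlay;
run;

ods graphics off;
```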

ODS Graph Names

PROC KDE assigns a name to each graph it creates using the Output Delivery System (ODS). You can use these names to reference the graphs when using ODS. The names are listed in Table 45.2.

To request these graphs you must specify the ODS GRAPHICS statement in addition to the options indicated in Table 45.2. For more information about the ODS GRAPHICS statement, see Chapter 21, Statistical Graphics Using ODS.

Table 45.2 ODS Graphics Produced by PROC KDE

ODS Graph Name      Plot Description                                                                     Statement  PLOTS= Option
BivariateHistogram  Bivariate histogram of data                                                          BIVAR      HISTOGRAM
ContourPlot         Contour plot of bivariate kernel density estimate                                    BIVAR      CONTOUR
ContourScatterPlot  Contour plot of bivariate kernel density estimate overlaid with scatter plot         BIVAR      CONTOURSCATTER
DensityPlot         Univariate kernel density estimate curve                                             UNIVAR     DENSITY
DensityOverlayPlot  Overlaid univariate kernel density estimate curves                                   UNIVAR     DENSITYOVERLAY
HistogramDensity    Univariate histogram overlaid with kernel density estimate curve                     UNIVAR     HISTDENSITY
Histogram           Univariate histogram of data                                                         UNIVAR     HISTOGRAM
HistogramSurface    Bivariate histogram overlaid with surface plot of bivariate kernel density estimate  BIVAR      HISTSURFACE
ScatterPlot         Scatter plot of data                                                                 BIVAR      SCATTER
SurfacePlot         Surface plot of bivariate kernel density estimate                                    BIVAR      SURFACE
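One common way to use a graph name is in an ODS SELECT statement, which restricts the displayed output to the named objects. The following is a sketch under that assumption. The data set and variables (Sashelp.Heart, Height, Weight) are illustrative choices and not part of the original text.

```sas
ods graphics on;

/* Display only the graph named ContourPlot from the next procedure */
ods select ContourPlot;

/* Illustrative data only: Sashelp.Heart with Height and Weight */
proc kde data=sashelp.heart;
   /* Both plots are requested, but ODS SELECT keeps only the contour plot */
   bivar Height Weight / plots=contour surface;
run;

ods graphics off;
```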

Binning of Bivariate Histogram

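Let $(X_i, Y_i)$, $i = 1, 2, \ldots, n$, be a sample of size $n$ drawn from a bivariate distribution. For the marginal distribution of $X_i$, $i = 1, 2, \ldots, n$, the number of bins ($\mathrm{Nbins}_X$) in the bivariate histogram is calculated according to the formula

     $$\mathrm{Nbins}_X = \mathrm{ceil}\left( \mathrm{range}_X / \hat{h}_X \right)$$

where $\mathrm{ceil}(x)$ denotes the smallest integer greater than or equal to $x$,

     $$\mathrm{range}_X = \max_{1 \le i \le n}(X_i) - \min_{1 \le i \le n}(X_i)$$

and the optimal bin width is obtained, following Scott (1992, p. 84), as

     $$\hat{h}_X = 3.504 \, \hat{\sigma}_X \left( 1 - \hat{\rho}^2 \right)^{3/8} n^{-1/4}$$

Here, $\hat{\sigma}_X^2$ and $\hat{\rho}$ are the sample variance and the sample correlation coefficient, respectively. When you specify a WEIGHT variable, PROC KDE uses weighted versions of $\hat{\sigma}_X^2$ and $\hat{\rho}$ in the preceding expressions.

Similar formulas are used to compute the number of bins for the marginal distribution of $Y_i$, $i = 1, 2, \ldots, n$. Further details can be found in Scott (1992).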

Notice that if $|\hat{\rho}| > 0.99$, then $\mathrm{Nbins}_X$ is calculated as in the univariate case (see Terrell and Scott 1985). In this case $\mathrm{Nbins}_Y = \mathrm{Nbins}_X$.
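To make the general-case calculation concrete, the following DATA step sketches it for the X margin. It assumes the bin-width rule has the form of Scott's normal reference rule for a bivariate histogram, that is, 3.504 times the sample standard deviation, times (1 - rho**2) raised to the power 3/8, times n raised to the power -1/4. The summary statistics are hypothetical values chosen for illustration, and the step is not PROC KDE's internal code.

```sas
/* Hypothetical summary statistics, chosen only for illustration */
data _null_;
   n       = 100;   /* sample size */
   sigma_x = 2;     /* sample standard deviation of X */
   rho     = 0.5;   /* sample correlation between X and Y */
   range_x = 11;    /* max(X) - min(X) */

   /* Assumed bin-width rule (Scott's normal reference form) */
   h_x = 3.504 * sigma_x * (1 - rho**2)**(3/8) * n**(-1/4);

   /* Number of bins: smallest integer >= range / bin width */
   nbins_x = ceil(range_x / h_x);

   /* Write both values to the SAS log */
   put h_x= nbins_x=;
run;
```

With these inputs the bin width is about 1.99, so a range of 11 is covered by 6 bins. Repeating the step with the standard deviation and range of Y gives the bin count for the other margin.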