HISTOGRAM <variables> < / options>;
The HISTOGRAM statement creates histograms and optionally superimposes estimated parametric and nonparametric probability density curves. You cannot use the WEIGHT statement with the HISTOGRAM statement. You can use any number of HISTOGRAM statements after a PROC UNIVARIATE statement. The components of the HISTOGRAM statement are as follows.
Table 4.5 lists primary options that display parametric density estimates on the histogram. You can specify each primary option once in a given HISTOGRAM statement, and each primary option can display multiple curves from its family on the histogram.
Table 4.5: Primary Options for Parametric Fitted Distribution
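As a minimal sketch of combining primary options, the following step requests a normal curve and a lognormal curve in a single HISTOGRAM statement. The data set name Sample, the variable x, and the simulated values are assumptions made purely for illustration. Because no secondary options are given, both curves rely on the default parameter estimates.

```sas
/* Hypothetical data: 200 positive values centered near 10 */
data Sample;
   call streaminit(12345);
   do i = 1 to 200;
      x = rand('normal', 10, 0.5);
      output;
   end;
   drop i;
run;

proc univariate data=Sample noprint;
   /* Two primary options, each specified once in the statement */
   histogram x / normal lognormal;
run;
```

The NOPRINT option in the PROC UNIVARIATE statement suppresses the tabular output so that only the histogram is produced.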
Table 4.6 lists secondary options that specify parameters for fitted parametric distributions and that control the display of fitted curves. Specify these secondary options in parentheses after the primary distribution option. For example, you can fit a normal curve by specifying the NORMAL option as follows:
    proc univariate;
       histogram / normal(color=red mu=10 sigma=0.5);
    run;

The COLOR= normal-option draws the curve in red, and the MU= and SIGMA= normal-options specify the parameters $\mu = 10$ and $\sigma = 0.5$ for the curve. When the MU= and SIGMA= normal-options are not specified, the sample mean and sample standard deviation are used to estimate $\mu$ and $\sigma$, respectively.
You can specify lists of values for secondary options to display more than one fitted curve from the same distribution family on a histogram. Option values are matched by list position. You can specify the value EST in a list of distribution parameter values to use an estimate of the parameter.
For example, the following code displays two normal curves on a histogram:
    proc univariate;
       histogram / normal(color=(red blue) mu=10 est sigma=0.5 est);
    run;

The first curve is red, with $\mu = 10$ and $\sigma = 0.5$. The second curve is blue, with $\mu$ equal to the sample mean and $\sigma$ equal to the sample standard deviation.
See the section Formulas for Fitted Continuous Distributions for detailed information about the families of parametric distributions that you can fit with the HISTOGRAM statement.
Table 4.6: Secondary Options for Parametric Distributions
Use the option KERNEL(kernel-options) to compute kernel density estimates. Specify the following secondary options in parentheses after the KERNEL option to control features of density estimates requested with the KERNEL option.
Table 4.7: Kernel-Options
Option | Description |
---|---|
C= | specifies standardized bandwidth parameter c |
COLOR= | specifies color of the kernel density curve |
FILL | fills area under kernel density curve |
K= | specifies type of kernel function |
L= | specifies line type used for kernel density curve |
LOWER= | specifies lower bound for kernel density curve |
UPPER= | specifies upper bound for kernel density curve |
W= | specifies line width for kernel density curve |
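A short sketch of the KERNEL option with secondary options follows. It assumes a hypothetical data set Sample with a numeric variable x, and the bandwidth values are arbitrary. Listing two values for the C= kernel-option (the standardized bandwidth) requests two kernel density curves, and K=NORMAL selects the normal kernel function for both.

```sas
/* Hypothetical data: 300 simulated values */
data Sample;
   call streaminit(2024);
   do i = 1 to 300;
      x = rand('normal', 10, 0.5);
      output;
   end;
   drop i;
run;

proc univariate data=Sample noprint;
   /* Two bandwidths, one kernel type shared by both estimates */
   histogram x / kernel(c=0.5 1.0 k=normal);
run;
```

A smaller bandwidth generally produces a rougher curve that follows the data more closely, so plotting two values side by side is a quick way to judge the amount of smoothing.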
Table 4.8 summarizes options for enhancing histograms.
Table 4.8: General Graphics Options
Option | Description |
---|---|
General Graphics Options | |
BARLABEL= | produces labels above histogram bars |
CLIPCURVES | scales vertical axis without considering fitted curves |
ENDPOINTS= | lists endpoints for histogram intervals |
GRID | creates a grid |
HANGING | constructs hanging histogram |
HREF= | specifies reference lines perpendicular to the horizontal axis |
HREFLABELS= | specifies labels for HREF= lines |
HREFLABPOS= | specifies vertical position of labels for HREF= lines |
MIDPOINTS= | specifies midpoints for histogram intervals |
NENDPOINTS= | specifies number of histogram interval endpoints |
NMIDPOINTS= | specifies number of histogram interval midpoints |
NOBARS | suppresses histogram bars |
NOHLABEL | suppresses label for horizontal axis |
NOPLOT | suppresses plot |
NOVLABEL | suppresses label for vertical axis |
NOVTICK | suppresses tick marks and tick mark labels for vertical axis |
RTINCLUDE | includes right endpoint in interval |
STATREF= | specifies reference lines at values of summary statistics |
STATREFLABELS= | specifies labels for STATREF= lines |
STATREFSUBCHAR= | specifies substitution character for displaying statistic values in STATREFLABELS= labels |
VAXISLABEL= | specifies label for vertical axis |
VREF= | specifies reference lines perpendicular to the vertical axis |
VREFLABELS= | specifies labels for VREF= lines |
VREFLABPOS= | specifies horizontal position of labels for VREF= lines |
VSCALE= | specifies scale for vertical axis |
Options for Traditional Graphics Output | |
ANNOTATE= | specifies annotate data set |
BARWIDTH= | specifies width for the bars |
CAXIS= | specifies color for axis |
CBARLINE= | specifies color for outlines of histogram bars |
CFILL= | specifies color for filling under curve |
CFRAME= | specifies color for frame |
CGRID= | specifies color for grid lines |
CHREF= | specifies colors for HREF= lines |
CLIPREF | draws reference lines behind histogram bars |
CSTATREF= | specifies colors for STATREF= lines |
CTEXT= | specifies color for text |
CVREF= | specifies colors for VREF= lines |
DESCRIPTION= | specifies description for plot in graphics catalog |
FONT= | specifies software font for text |
FRONTREF | draws reference lines in front of histogram bars |
HAXIS= | specifies AXIS statement for horizontal axis |
HEIGHT= | specifies height of text used outside framed areas |
HMINOR= | specifies number of horizontal minor tick marks |
HOFFSET= | specifies offset for horizontal axis |
INFONT= | specifies software font for text inside framed areas |
INHEIGHT= | specifies height of text inside framed areas |
INTERBAR= | specifies space between histogram bars |
LGRID= | specifies a line type for grid lines |
LHREF= | specifies line types for HREF= lines |
LSTATREF= | specifies line types for STATREF= lines |
LVREF= | specifies line types for VREF= lines |
NAME= | specifies name for plot in graphics catalog |
NOFRAME | suppresses frame around plotting area |
PFILL= | specifies pattern for filling under curve |
TURNVLABELS | turns and vertically strings out characters in labels for vertical axis |
VAXIS= | specifies AXIS statement or values for vertical axis |
VMINOR= | specifies number of vertical minor tick marks |
VOFFSET= | specifies length of offset at upper end of vertical axis |
WAXIS= | specifies line thickness for axes and frame |
WBARLINE= | specifies line thickness for bar outlines |
WGRID= | specifies line thickness for grid |
Options for ODS Graphics Output | |
NOCURVELEGEND | suppresses legend for curves |
ODSFOOTNOTE= | specifies footnote displayed on histogram |
ODSFOOTNOTE2= | specifies secondary footnote displayed on histogram |
ODSTITLE= | specifies title displayed on histogram |
ODSTITLE2= | specifies secondary title displayed on histogram |
OVERLAY | overlays histograms for different class levels |
Options for Comparative Plots | |
ANNOKEY | applies annotation requested in ANNOTATE= data set to key cell only |
CFRAMESIDE= | specifies color for filling frame for row labels |
CFRAMETOP= | specifies color for filling frame for column labels |
CPROP= | specifies color for proportion of frequency bar |
CTEXTSIDE= | specifies color for row labels of comparative histograms |
CTEXTTOP= | specifies color for column labels of comparative histograms |
INTERTILE= | specifies distance between tiles |
MAXNBIN= | specifies maximum number of bins to display |
MAXSIGMAS= | limits the number of bins that display to within a specified number of standard deviations above and below mean of data in key cell |
NCOLS= | specifies number of columns in comparative histogram |
NROWS= | specifies number of rows in comparative histogram |
Miscellaneous Options | |
CONTENTS= | specifies table of contents entry for histogram grouping |
MIDPERCENTS | creates table of histogram intervals |
NOTABCONTENTS | suppresses table of contents entries for tables produced by HISTOGRAM statement |
OUTHISTOGRAM= | creates a data set containing information about histogram intervals |
OUTKERNEL= | creates a data set containing kernel density estimates |
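To illustrate how several of these general options can be combined, here is a hedged sketch that sets interval midpoints, scales the vertical axis by count, adds a labeled reference line, and saves the interval information. The data set names Sample and HistBins, the variable x, the label text, and all numeric values are hypothetical choices for the example.

```sas
/* Hypothetical data: 300 simulated values near 10 */
data Sample;
   call streaminit(777);
   do i = 1 to 300;
      x = rand('normal', 10, 0.5);
      output;
   end;
   drop i;
run;

proc univariate data=Sample noprint;
   histogram x / midpoints=8 to 12 by 0.5   /* interval midpoints            */
                 vscale=count               /* vertical axis shows counts    */
                 href=10                    /* reference line at x = 10      */
                 hreflabels='Target'        /* label for the reference line  */
                 odstitle='Distribution of x'
                 outhistogram=HistBins;     /* data set of interval details  */
run;

/* Inspect the saved histogram intervals */
proc print data=HistBins;
run;
```

The ODSTITLE= option applies only when ODS Graphics output is produced, so the title is an assumption about the graphics environment in use.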
The following entries provide detailed descriptions of options in the HISTOGRAM statement. Options marked with † are applicable only when traditional graphics are produced. See the section Dictionary of Common Options for detailed descriptions of options common to all plot statements.