What’s New in the Base SAS 9.3 Statistical Procedures

Enhancements

The following are enhancements to the Base SAS statistical procedures for SAS 9.3.

CORR Procedure

The POLYSERIAL option has been added to the PROC CORR statement. The POLYSERIAL option requests a table of polyserial correlation coefficients. Polyserial correlation measures the correlation between two continuous variables that have a bivariate normal distribution, where only one variable is observed directly. Information about the unobserved variable is obtained through an observed ordinal variable that is derived from the unobserved variable by classifying its values into a finite set of discrete, ordered values.
In the second maintenance release of SAS 9.3, the POLYCHORIC option has been added to the PROC CORR statement. The POLYCHORIC option requests a table of polychoric correlation coefficients. Polychoric correlation measures the correlation between two unobserved, continuous variables that have a bivariate normal distribution. Information about each unobserved variable is obtained through an observed ordinal variable that is derived from the unobserved variable by classifying its values into a finite set of discrete, ordered values (Olsson, 1979; Drasgow, 1986).
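As a minimal sketch of how these two options might be requested, the following steps simulate a small data set and run PROC CORR twice. The data set WORK.RATINGS and all variable names are hypothetical. The sketch assumes the default role assignment for the polyserial request, with the ordinal variable in the WITH statement and the continuous variable in the VAR statement.

```sas
/* Hypothetical data: two latent normal variables, each observed only
   through a three-level ordinal rating, plus one continuous score. */
data work.ratings;
   call streaminit(12345);
   do id = 1 to 200;
      latent1 = rand('normal');
      latent2 = 0.5*latent1 + sqrt(0.75)*rand('normal');
      score   = 0.6*latent1 + 0.8*rand('normal');
      /* Classify each latent value into ordered levels 1, 2, 3 */
      rating1 = 1 + (latent1 > -0.5) + (latent1 > 0.5);
      rating2 = 1 + (latent2 > -0.5) + (latent2 > 0.5);
      output;
   end;
   keep id score rating1 rating2;
run;

/* Polyserial: ordinal variable in WITH, continuous variable in VAR
   (assumed default role assignment) */
proc corr data=work.ratings polyserial;
   with rating1;
   var score;
run;

/* Polychoric: both variables are ordinal
   (requires the second maintenance release of SAS 9.3) */
proc corr data=work.ratings polychoric;
   var rating1 rating2;
run;
```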

FREQ Procedure

The FREQ procedure now produces agreement plots when the AGREE option is specified and ODS Graphics is enabled. It also offers alternative confidence limit types for the risk (proportion) difference, and it provides exact unconditional confidence limits for the risk difference and relative risk.
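The following sketch shows one way to request an agreement plot. The data set WORK.RATERS and its cell counts are made up for illustration; the only requirements taken from the text above are the AGREE option and enabled ODS Graphics.

```sas
/* Hypothetical square table: two raters classify the same subjects */
data work.raters;
   input rater1 $ rater2 $ count;
   datalines;
Low  Low  20
Low  Med   5
Low  High  1
Med  Low   4
Med  Med  18
Med  High  6
High Low   2
High Med   5
High High 22
;
run;

ods graphics on;
proc freq data=work.raters;
   weight count;                   /* cell counts, not raw observations */
   tables rater1*rater2 / agree;   /* agreement statistics and plot */
run;
ods graphics off;
```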
The following enhancements have been made to the FREQ procedure for the second maintenance release of SAS 9.3.
The new MAXLEVELS= option in the TABLES statement specifies the maximum number of variable levels to display in one-way frequency tables and one-way frequency plots.
The CROSSLIST(STDRES) option displays standardized residuals in the CROSSLIST table for two-way crossclassifications.
PROC FREQ now provides two additional confidence limit types for the risk (proportion) difference for 2 × 2 tables. The CL=AGRESTICAFFO option provides Agresti-Caffo confidence limits for the risk difference. The CL=MN option provides Miettinen-Nurminen confidence limits for the risk difference.
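A possible way to request both limit types is sketched below. It assumes that CL= is given as a suboption of RISKDIFF in the TABLES statement and that it accepts a parenthesized list of types; verify the exact form against the PROC FREQ documentation for your release. The data set WORK.TRIAL and its counts are hypothetical.

```sas
/* Hypothetical 2 x 2 table of treatment by response */
data work.trial;
   input treatment $ response $ count;
   datalines;
Active  Yes 28
Active  No  12
Placebo Yes 16
Placebo No  24
;
run;

proc freq data=work.trial order=data;
   weight count;
   /* Assumed syntax: CL= as a RISKDIFF suboption with a list of types */
   tables treatment*response / riskdiff(cl=(agresticaffo mn));
run;
```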
The BARNARD option in the EXACT statement produces Barnard’s exact unconditional test for the risk difference.
Continuity-corrected Wilson confidence limits are now available for the binomial proportion.
The new DF= option specifies or adjusts the degrees of freedom for chi-square tests. The TESTF= option now enables you to provide null frequencies for a one-way chi-square test by using a secondary input data set. Similarly, the TESTP= option now enables you to provide null proportions by using a secondary input data set.
The LRCHISQ option in the TABLES statement produces a likelihood ratio chi-square test for one-way tables. This test can be based on a null hypothesis of equal proportions, specified proportions, or specified frequencies. The LRCHISQ option in the EXACT statement produces an exact likelihood ratio chi-square test for one-way tables.
The PLCORR option in the TEST statement provides Wald and likelihood ratio tests for the polychoric correlation coefficient.
The PLOTS=MOSAICPLOT option provides mosaic plots for two-way tables when ODS Graphics is enabled.
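For instance, a mosaic plot could be requested as in the following sketch, which assumes that the SASHELP.CARS sample data set is available in your installation.

```sas
ods graphics on;
proc freq data=sashelp.cars;
   /* Mosaic plot of the two-way table of Origin by Type */
   tables Origin*Type / plots=mosaicplot;
run;
ods graphics off;
```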
You can now specify the confidence limit type to display in risk difference plots (PLOTS=RISKDIFFPLOT). In addition to Wald and exact unconditional confidence limits, the available types include Agresti-Caffo, Hauck-Anderson, Miettinen-Nurminen, and Newcombe confidence limits. Continuity-corrected Wald and Newcombe confidence limits are also available.
By default, the following plots now display the common (overall) statistic in addition to the stratum (two-way table) statistics: odds ratio plot, relative risk plot, kappa plot, and weighted kappa plot. The COMMON=NO plot option suppresses display of the common value.
The new TWOWAY=CLUSTER plot option provides a cluster layout for frequency plots that are displayed as bar charts (TYPE=BAR). The cluster layout first groups bars (table cells) by column variable level and then displays row variable levels as adjacent bars within each column-level group.
The new GROUPBY= plot option specifies the primary grouping of graph cells for two-way frequency plots. The default is GROUPBY=COLUMN, which first groups graph cells by column variable level and then displays row variable levels within column variable levels. You can specify the GROUPBY=ROW plot option to group first by row variable.
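The two layout options can be compared side by side, as in this sketch. It assumes the SASHELP.CARS sample data set and relies on bar charts being the default frequency plot type.

```sas
ods graphics on;
proc freq data=sashelp.cars;
   /* Cluster layout: row levels shown as adjacent bars within each column level */
   tables Origin*DriveTrain / plots=freqplot(twoway=cluster);
   /* Primary grouping by row variable instead of the default GROUPBY=COLUMN */
   tables Origin*DriveTrain / plots=freqplot(groupby=row);
run;
ods graphics off;
```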

UNIVARIATE Procedure

The UNIVARIATE procedure supports five new fitted distributions for SAS 9.3:
  • Gumbel distribution
  • inverse Gaussian distribution
  • generalized Pareto distribution
  • power function distribution
  • Rayleigh distribution
These new distributions are available in the CDFPLOT, HISTOGRAM, PROBPLOT, PPPLOT, and QQPLOT statements.
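As an illustration, the sketch below overlays two of the new fitted distributions on a histogram. It assumes that the corresponding HISTOGRAM option keywords are GUMBEL and IGAUSS and that the SASHELP.CARS sample data set is available; the choice of variable is arbitrary.

```sas
ods graphics on;
proc univariate data=sashelp.cars;
   var MPG_City;
   /* Assumed keywords: GUMBEL (Gumbel) and IGAUSS (inverse Gaussian) */
   histogram MPG_City / gumbel igauss;
run;
ods graphics off;
```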
The following enhancements have been made to the UNIVARIATE procedure for the second maintenance release of SAS 9.3.
The PLOTS option in the PROC UNIVARIATE statement now produces ODS Graphics output when ODS Graphics is enabled.
The UNIVARIATE procedure supports several new options. The CDFPLOT, HISTOGRAM, PPPLOT, PROBPLOT, and QQPLOT statements support the following new options for specifying titles and footnotes in graphs produced by using ODS Graphics:
  • ODSFOOTNOTE= adds a footnote to the graph.
  • ODSFOOTNOTE2= adds a secondary footnote to the graph.
  • ODSTITLE= specifies the graph title.
  • ODSTITLE2= specifies a secondary graph title.
You can use these options to specify your own graph titles and footnotes without modifying ODS graph templates or using the ODS Graphics Editor.
The CDFPLOT, HISTOGRAM, PROBPLOT, and QQPLOT statements support the following new options for displaying reference lines at the values of computed statistics:
  • STATREF= specifies keywords that identify the statistics.
  • CSTATREF= specifies the colors of the reference lines.
  • LSTATREF= specifies the line types of the reference lines.
  • STATREFLABELS= specifies labels for the reference lines.
  • STATREFSUBCHAR= specifies a substitution character for incorporating statistic values into reference line labels.
For example, specifying STATREF=MEAN in a HISTOGRAM statement produces a histogram that has a vertical reference line at the mean of the data.
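A minimal sketch of that request, combined with a custom graph title, might look as follows. The SASHELP.CARS data set and the title text are assumptions made for illustration.

```sas
ods graphics on;
proc univariate data=sashelp.cars;
   var MPG_City;
   histogram MPG_City / normal
                        statref=mean   /* vertical reference line at the mean */
                        odstitle="City mileage with a reference line at the mean";
run;
ods graphics off;
```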
The HISTOGRAM statement supports the new CLIPCURVES option, which clips fitted distribution curves that extend above the highest histogram bar. This eliminates compression of the histogram bars caused by extremely high fitted curve peaks.
The OUTPUT statement supports the following new options:
  • CIPCTLDF= computes distribution-free confidence limits for percentiles that you request by specifying the PCTLPTS= option.
  • CIPCTLNORMAL= computes confidence limits that assume normality for percentiles that you request by specifying the PCTLPTS= option.
  • PCTLGROUP= controls how variables that you request by specifying the PCTLPTS= option are grouped in the OUTPUT data set.
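The following sketch shows how distribution-free confidence limits for requested percentiles might be written to an output data set. It assumes that CIPCTLDF= takes a parenthesized list of suboptions and that LOWERPRE= and UPPERPRE= name the prefixes of the limit variables; the output data set name and all prefixes are hypothetical.

```sas
proc univariate data=sashelp.cars noprint;
   var MPG_City;
   /* Assumed suboptions: LOWERPRE= and UPPERPRE= set variable name prefixes */
   output out=work.pctls
          pctlpts=10 50 90
          pctlpre=p_
          cipctldf=(lowerpre=lcl_ upperpre=ucl_);
run;

proc print data=work.pctls;
run;
```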
In addition, the CHREF=, CVREF=, LHREF=, and LVREF= options have been enhanced. These options now accept lists of values so that different reference lines in a single graph can be displayed using different colors and line types. They are available in the CDFPLOT, HISTOGRAM, PPPLOT, PROBPLOT, and QQPLOT statements.

What’s Changed

The following are changes in software behavior from SAS 9.2 to SAS 9.3.

FREQ Procedure

Frequency plots and cumulative frequency plots are no longer produced by default when ODS Graphics is enabled. You can request these plots by using the PLOTS=FREQPLOT and PLOTS=CUMFREQPLOT options, respectively, in the TABLES statement.
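For example, both plots could be requested explicitly for a one-way table, as in this sketch, which assumes the SASHELP.CARS sample data set.

```sas
ods graphics on;
proc freq data=sashelp.cars;
   /* These plots are no longer produced by default in SAS 9.3 */
   tables Origin / plots=(freqplot cumfreqplot);
run;
ods graphics off;
```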

References

  • Drasgow, F. (1986), “Polychoric and Polyserial Correlations,” in S. Kotz and N. L. Johnson, eds., Encyclopedia of Statistical Sciences, volume 7, 68–74, New York: John Wiley & Sons.
  • Olsson, U. (1979), “Maximum Likelihood Estimation of the Polychoric Correlation Coefficient,” Psychometrika, 44, 443–460.