The HPBIN Procedure

PROC HPBIN Statement

PROC HPBIN <options> ;

The PROC HPBIN statement invokes the procedure. Table 4.1 summarizes important options in the PROC HPBIN statement by function.

Table 4.1: PROC HPBIN Statement Options

Option                 Description

Basic Options
DATA=                  Specifies the input data set
OUTPUT=                Specifies the output data set
NOPRINT                Suppresses the ODS output

Binning Level Options
NUMBIN=                Specifies the global number of bins for all binning variables

Binning Method Options
BUCKET                 Specifies the bucket binning method
WINSOR WINSORRATE=     Specifies the Winsorized binning method and the rate that it uses
PSEUDO_QUANTILE        Specifies the pseudo-quantile binning method

Statistics Options
COMPUTESTATS           Computes the basic statistics of the binning variables
COMPUTEQUANTILE        Computes the quantiles of the binning variables

Weight-of-Evidence Options
BINS_META=             Specifies the BINS_META input data set, which contains the binning results
WOE                    Computes the weight of evidence and information values
WOEADJUST=             Specifies the adjustment factor for the weight-of-evidence calculation

You can specify the following optional arguments:


BINS_META=SAS-data-set
specifies the BINS_META input data set, which contains the binning results. The BINS_META data set contains six variables: variable name, binned variable name, lower bound, upper bound, bin, and range. You can use the mapping table that PROC HPBIN generates as the BINS_META data set.


BUCKET | WINSOR WINSORRATE=number | PSEUDO_QUANTILE
specifies which binning method to use. If you specify BUCKET, PROC HPBIN uses equal-length binning. If you specify WINSOR, PROC HPBIN uses Winsorized binning, and you must also specify the WINSORRATE= option, where number is greater than 0.0 and less than 0.5. If you specify PSEUDO_QUANTILE, PROC HPBIN generates a result that approximates quantile binning. You can specify only one binning method. The default is BUCKET.

However, when a BINS_META data set is specified, PROC HPBIN does not do binning and ignores the binning method options, binning level options, and INPUT statement. Instead, PROC HPBIN takes the binning results from the BINS_META data set and calculates the weight of evidence and information value.
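The three binning methods can be compared with a minimal sketch such as the following. The SASHELP.CARS data set and the variables MSRP and Horsepower are assumptions chosen for illustration, and the INPUT statement is used here only to name the binning variables.

```sas
/* Illustrative only: SASHELP.CARS, MSRP, and Horsepower are assumed
   sample inputs and are not part of the option descriptions above. */

/* Equal-length (bucket) binning, which is also the default method */
proc hpbin data=sashelp.cars bucket;
   input msrp horsepower;
run;

/* Winsorized binning; the rate must be greater than 0.0 and less than 0.5 */
proc hpbin data=sashelp.cars winsor winsorrate=0.05;
   input msrp horsepower;
run;

/* Pseudo-quantile binning, which approximates quantile binning */
proc hpbin data=sashelp.cars pseudo_quantile;
   input msrp horsepower;
run;
```

Because only one method can be specified per PROC HPBIN statement, each method is run in its own step.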


COMPUTEQUANTILE
computes the quantiles of the binning variables. If you specify COMPUTEQUANTILE, PROC HPBIN generates the quantiles and extremes table, which contains the following percentiles: 0% (Min), 1%, 5%, 10%, 25% (Q1), 50% (Median), 75% (Q3), 90%, 95%, 99%, and 100% (Max).


COMPUTESTATS
computes basic statistics of the binning variables. If you specify COMPUTESTATS, PROC HPBIN computes basic statistical information and can provide it as ODS output. The output table contains six variables: the mean, median, standard deviation, minimum, maximum, and number of bins for each binning variable.
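As a hedged illustration of the two statistics options, the following sketch requests each table in a separate step. The data set SASHELP.CARS and the variable names are assumptions made for the example.

```sas
/* Illustrative only: the data set and variables are assumed sample inputs. */

/* Request the basic statistics table for the binning variables */
proc hpbin data=sashelp.cars computestats;
   input msrp horsepower;
run;

/* Request the quantiles and extremes table for the same variables */
proc hpbin data=sashelp.cars computequantile;
   input msrp horsepower;
run;
```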


DATA=SAS-data-set
specifies the input SAS data set or database table to be used by PROC HPBIN.

If the procedure executes in distributed mode, the input data are distributed to memory on the appliance nodes and analyzed in parallel, unless the data are already distributed in the appliance database. In this case, PROC HPBIN reads the data alongside the distributed database.

For single-machine mode, the input must be a SAS data set.


NOPRINT
suppresses the generation of ODS output.


NUMBIN=integer
specifies the global number of binning levels for all binning variables. The value of integer can be any integer from 2 to 1,000, inclusive. The default number of binning levels is 16.

The resulting number of binning levels might be less than the specified integer if the sample size is small or if the data are not normalized. In this case, PROC HPBIN provides a warning message.

You can specify a different number of binning levels for each variable in an INPUT statement. The number of binning levels that you specify in an INPUT statement overrides the global number of binning levels.
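The interaction between the global setting and a per-variable setting can be sketched as follows. The NUMBIN= option after the slash in the INPUT statement is the per-variable form that the paragraph above refers to; the data set and variable names are assumptions for illustration.

```sas
/* Illustrative only: SASHELP.CARS and its variables are assumed inputs. */
proc hpbin data=sashelp.cars numbin=8;   /* global setting: 8 bins */
   input msrp / numbin=4;                /* MSRP alone is binned into 4 levels */
   input horsepower weight;              /* these use the global 8 levels */
run;
```

If the data cannot support the requested number of levels, the resulting number of levels can be smaller, as noted above.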


OUTPUT=SAS-data-set
creates an output SAS data set in single-machine mode or a database table that is saved alongside the distributed database in distributed mode. The output data set or table contains the binning variables. To avoid data duplication for large data sets, the variables in the input data set are not included in the output data set.
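A minimal single-machine sketch of writing the binned results to a data set might look like this. The output data set name WORK.CARS_BINNED and the input data set are hypothetical choices, and the names of the binned variables in the output are determined by the procedure.

```sas
/* Illustrative only: the data set names are assumptions for this sketch. */
proc hpbin data=sashelp.cars numbin=5 output=work.cars_binned;
   input msrp horsepower;
run;

/* Inspect the first few rows; the original input variables are not
   carried over into the output data set. */
proc print data=work.cars_binned(obs=5);
run;
```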


WOE
computes the weight of evidence (WOE) and information value (IV).


WOEADJUST=number
specifies the adjustment factor for the weight-of-evidence calculation. You can specify any value from 0.0 to 1.0, inclusive, for number. The default is 0.5.
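The weight-of-evidence options are typically used in a two-step workflow: first bin the variables and save the mapping table, then pass that table back through BINS_META=. The sketch below assumes the SAMPSIO.HMEQ sample data set with a binary variable BAD, an ODS table named Mapping, and a TARGET statement with LEVEL= and ORDER= options; none of these are described in this section, so treat them as assumptions to verify against the rest of the procedure documentation.

```sas
/* Step 1: bin the variables and capture the mapping table.
   The ODS table name Mapping is an assumption of this sketch. */
proc hpbin data=sampsio.hmeq numbin=5;
   input loan clno;
   ods output Mapping=work.mapping;
run;

/* Step 2: reuse the binning results to compute WOE and IV.
   Binning method options, binning level options, and the INPUT
   statement are ignored when BINS_META= is specified. */
proc hpbin data=sampsio.hmeq woe woeadjust=0.5 bins_meta=work.mapping;
   target bad / level=nominal order=desc;   /* assumed target specification */
run;
```

Here WOEADJUST=0.5 restates the default explicitly, which makes the adjustment factor visible in the code.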