The HPSPLIT Procedure

INPUT Statement

  • INPUT variables </ option>;

The INPUT statement specifies the input variables for the decision tree. The variables argument can include a range such as "g_1-g_1000" or the special value "_ALL_", which includes all variables in the data set.
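
As a minimal sketch of the range form, assuming the input data set contains a numbered series of numeric variables named g_1 through g_1000 (hypothetical names), a single statement could list all of them. The range is typed with an ordinary hyphen:

input g_1-g_1000;   /* numbered range list: g_1, g_2, ..., g_1000 */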

Use the LEVEL=NOM option to request that PROC HPSPLIT treat a numeric variable as a nominal input.

Use multiple INPUT statements if you have one set of numeric variables that you want treated as interval inputs and a second set of numeric variables that you want treated as nominal inputs. For example, the following INPUT statements cause NUMVAR1 to be treated as an interval input and NUMVAR2, CHARVAR1, and CHARVAR2 to be treated as nominal inputs:

input numvar1 charvar1;
input numvar2 charvar2 / level=nom;

The following two statements are equivalent to the previous two statements:

input numvar1 charvar1 / level=int;
input numvar2 charvar2 / level=nom;

PROC HPSPLIT treats CHARVAR1 as a nominal input despite the LEVEL=INT option because CHARVAR1 is a character variable.
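
To place these statements in context, here is a minimal sketch of a complete procedure step. The data set name WORK.MYDATA and the target variable Y are hypothetical, and the TARGET statement is assumed from the procedure's general syntax rather than described in this section:

proc hpsplit data=work.mydata;            /* hypothetical data set */
   target y;                              /* hypothetical target; TARGET statement is assumed */
   input numvar1 charvar1;                /* numvar1: interval; charvar1: nominal (character) */
   input numvar2 charvar2 / level=nom;    /* both treated as nominal */
run;

Splitting the variables across two INPUT statements is what allows NUMVAR1 and NUMVAR2, both numeric, to receive different levels.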

You can specify the following option:

LEVEL=INT | NOM

specifies whether the input variables listed in the statement are treated as interval or nominal.

INT

treats all numeric variables as interval inputs.

NOM

treats all variables as nominal inputs.

Unless the LEVEL= option is specified, numeric variables are treated as interval inputs and character variables are treated as nominal inputs. Specifying LEVEL=NOM forces all variables in that statement to be treated as nominal. PROC HPSPLIT ignores the LEVEL=INT option for character variables.