The PARTITION statement specifies how observations in the input data set are logically partitioned into disjoint subsets for
model training, validation, and testing. Either you can designate a variable in the input data set and a set of formatted
values of that variable to determine the role of each observation, or you can specify proportions to use for random assignment
of observations for each role.
The following mutually exclusive partition-options are available:
ROLEVAR | ROLE=variable(<TEST='value'> <TRAIN='value'> <VALIDATE='value'>)
names the variable in the input data set whose values are used to assign a role to each observation. The formatted values of
this variable that correspond to each role are specified in the TEST=, TRAIN=, and VALIDATE= suboptions. If you do not
specify the TRAIN= suboption, then all observations whose role is not determined by the TEST= or VALIDATE= suboptions
are assigned to training.
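As a minimal sketch of the role-variable form, the following fragment shows the statement with and without the TRAIN= suboption. The variable name Group and the formatted values 'TR', 'VA', and 'TE' are hypothetical, and the enclosing procedure is omitted because it is not named here.

```sas
/* Fragment only: place one PARTITION statement inside the procedure step. */
/* Group, 'TR', 'VA', and 'TE' are hypothetical names chosen for illustration. */

/* All three roles are given explicitly. */
partition rolevar=Group(train='TR' validate='VA' test='TE');

/* Alternative: TRAIN= is omitted, so every observation whose formatted */
/* value of Group is neither 'VA' nor 'TE' is assigned to training.     */
partition rolevar=Group(validate='VA' test='TE');
```

The two statements are alternatives; use only one of them in a given step. The values are compared against the formatted values of the variable, so they are quoted even when the underlying variable is numeric.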
FRACTION(<TEST=fraction> <VALIDATE=fraction>)
requests that specified proportions of the observations in the input data set be randomly assigned testing and validation
roles. You specify the proportions for testing and validation by using the TEST= and VALIDATE= suboptions. If you specify
both the TEST= and the VALIDATE= suboptions, then the sum of the specified fractions must be less than 1, and the remaining
fraction of the observations is assigned to the training role.
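As a minimal sketch of the random-assignment form, the following fragment requests a test fraction of 0.2 and a validation fraction of 0.3. The fractions are illustrative values, and the enclosing procedure is again omitted.

```sas
/* Fragment only: place inside the procedure step. */
/* 0.2 + 0.3 = 0.5 is less than 1, so the request is valid;            */
/* the remaining fraction, 1 - 0.2 - 0.3 = 0.5, is used for training.  */
partition fraction(test=0.2 validate=0.3);
```

Because FRACTION and the role-variable form are mutually exclusive, a statement such as this one cannot also name a role variable.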