Example: Partitioning the SASHELP.CLASSFIT Data Set

  1. In the Tasks section, expand the Data folder, and then double-click Partition Data. The user interface for the Partition Data task opens.
  2. On the Data tab, select SASHELP.CLASSFIT as the input data set.
    Tip
    If the data set is not available from the drop-down list, click the Select a table icon. In the Choose a Table window, expand the library that contains the data set that you want to use. Select the data set for the example and click OK. The selected data set should now appear in the drop-down list.
  3. In the Number of partitions box, enter 2.
  4. In the Proportion of cases for partition 1 box, enter .5, which specifies that 50% of the observations should be in partition 1.
  5. In the Proportion of cases for partition 2 box, enter .3, which specifies that 30% of the observations should be in partition 2.
  6. From the Partition data sets drop-down list, select All partitions in one data set.
  7. In the ID value for partition 1 data role, enter Test.
  8. In the ID value for partition 2 data role, enter Train.
  9. Under the Output Data Set heading, select Show output data to view the output data set in the results.
  10. To run the task, click Submit SAS Code.
Here is a subset of the results:
Figure: Example of Partitioned Data Set
The new _Partition_ variable in the output data set specifies the partition (either Train or Test) for each observation. For example, the data for Joyce and the data for Janet are both in the Test partition. The task randomly assigns 50% of the observations to the Test partition and 30% of the observations to the Train partition. Because this example does not specify a random seed, you might see different results if you run the example again.
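The effect of the proportions and of the random seed can be illustrated with a short conceptual sketch. It is written in Python and is not the SAS code that the task generates. The function name partition_rows, the count of 19 rows, the seed value, and the choice to leave the remaining rows unlabeled are all assumptions made for the illustration. The sketch also assigns exact counts by shuffling, which might differ from how the task draws its random assignment.

```python
import random

def partition_rows(n_rows, proportions, labels, seed=None):
    """Return one partition label per row; rows left over get None."""
    # With seed=None the shuffle differs on every run; a fixed seed repeats it.
    rng = random.Random(seed)
    order = list(range(n_rows))
    rng.shuffle(order)

    assignment = [None] * n_rows
    start = 0
    for proportion, label in zip(proportions, labels):
        # Number of rows for this partition, rounded to a whole row count.
        count = round(n_rows * proportion)
        for row in order[start:start + count]:
            assignment[row] = label
        start += count
    return assignment

# Hypothetical stand-in for the example: 19 rows, 50% Test, 30% Train.
parts = partition_rows(19, [0.5, 0.3], ["Test", "Train"], seed=42)
for row, label in enumerate(parts):
    print(row, label)
```

Calling partition_rows with the same seed returns the same assignment each time. Omitting the seed, as this example does, can place different rows in each partition on each run.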