- In the Tasks section, expand the Data folder, and then double-click Partition Data. The user interface for the Partition Data task opens.
- On the Data tab, select SASHELP.CLASSFIT as the input data set.
  Tip: If the data set is not available from the drop-down list, click . In the Choose a Table window, expand the library that contains the data set that you want to use. Select the data set for the example and click OK. The selected data set should now appear in the drop-down list.
- In the Number of partitions box, enter 2.
- In the Proportion of cases for partition 1 box, enter .5, which specifies that 50% of the observations should be in partition 1.
- In the Proportion of cases for partition 2 box, enter .3, which specifies that 30% of the observations should be in partition 2.
- From the Partition data sets drop-down list, select All partitions in one data set.
- In the ID value for partition 1 data role, enter Test.
- In the ID value for partition 2 data role, enter Train.
- Under the Output Data Set heading, select Show output data to view the output data set in the results.
- To run the task, click .
The new _Partition_ variable in the output data set specifies the partition (either Train or Test) for each observation. For example, the data for Joyce is in the Test partition. The data for Janet is in the Test partition. This example does not specify a random seed. As a result, the task randomly assigns 50% of the observations to the Test partition and 30% of the observations to the Train partition. If you run this example again, you might see slightly different results.
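
To make the effect of the proportions and the missing seed concrete, here is a minimal Python sketch of this kind of random partitioning. It is not the code that the Partition Data task generates. The `partition` function, the stand-in row names, and the choice to leave the remaining 20% of rows unlabeled (`None`) are all assumptions made for illustration, and the task may use a different sampling method.

```python
import random


def partition(rows, proportions, labels, seed=None):
    """Return one label per row; rows outside every partition get None.

    Hypothetical helper for illustration only. With seed=None the shuffle
    differs from run to run, which mirrors running the task without a seed.
    """
    rng = random.Random(seed)
    order = list(range(len(rows)))
    rng.shuffle(order)  # random order decides which rows land in each partition

    assigned = [None] * len(rows)
    start = 0
    for proportion, label in zip(proportions, labels):
        count = round(proportion * len(rows))  # rows for this partition
        for index in order[start:start + count]:
            assigned[index] = label
        start += count
    return assigned


# Hypothetical stand-in data: 20 rows instead of the real input data set.
names = [f"student_{i}" for i in range(20)]

# Same settings as the steps above: partition 1 = Test (.5), partition 2 = Train (.3).
result = partition(names, [0.5, 0.3], ["Test", "Train"], seed=42)
```

Passing a fixed `seed` makes the assignment repeatable. Omitting it gives a different split on each run, which is the behavior described for this example.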