Usage Note 23091: Randomly split data into two parts by saving both selected and unselected units from PROC SURVEYSELECT
Beginning with SAS/STAT® 12.3 in SAS® 9.4 TS1M0, use the GROUPS= option in the PROC SURVEYSELECT statement as discussed and illustrated in this note.
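The following statements are a minimal sketch of the GROUPS= approach, assuming SAS/STAT 12.3 or later. The data set names All and Split and the seed value are illustrative only. GROUPS=2 randomly assigns each observation to one of two groups, and the OUT= data set is expected to contain a variable named GroupID that identifies the group of each observation.

```sas
/* Illustrative input data: ten observations (same layout as the example later in this note) */
data all;
  do id=1 to 10;
    output;
  end;
run;

/* Randomly assign the observations to two groups.                 */
/* Split is a hypothetical output data set name; the seed value is */
/* arbitrary and is specified only so the split can be reproduced. */
proc surveyselect data=all groups=2 seed=49201 out=Split noprint;
run;

/* Display the group assignment (GroupID) for each observation */
proc print data=Split;
  id id;
  var GroupID;
run;
```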
Prior to SAS/STAT 12.3, use the OUTALL option in the PROC SURVEYSELECT statement to output all the original observations to the OUT= data set. The variable SELECTED in the OUT= data set equals 1 for the randomly selected units and 0 for the unselected units.
Example (for releases prior to SAS 9.4 TS1M0)
The following DATA step creates a data set with ten observations.
data all;
  do id=1 to 10;
    output;
  end;
run;
Specifying SAMPRATE=0.5 in PROC SURVEYSELECT randomly selects half of the observations. Because the OUTALL option is specified, the OUT= data set, Sample, contains all ten original observations, and the SELECTED variable indicates whether each observation was selected (SELECTED=1) or not (SELECTED=0). To reproduce the results of this example, specify the same value in the SEED= option.
proc surveyselect data=all samprate=0.50 seed=49201 out=Sample outall
                  method=srs noprint;
run;
These statements display the Sample data set. Note that there are five observations in each of the two groups identified by the SELECTED variable.
proc print data=Sample;
  id id;
  var selected;
run;
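If the two parts are needed as separate data sets, a DATA step can route each observation according to the value of SELECTED. The following is a minimal sketch that assumes the Sample data set created by the PROC SURVEYSELECT step above. The output data set names Part1 and Part2 are hypothetical.

```sas
/* Split Sample into two data sets based on the SELECTED indicator. */
/* Part1 and Part2 are illustrative names, not part of the original */
/* example.                                                         */
data Part1 Part2;
  set Sample;
  if selected = 1 then output Part1;   /* randomly selected units */
  else output Part2;                   /* unselected units        */
run;
```

Because Sample is read only once, the original order of the observations is preserved within each part.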
Type: Usage Note
Priority: low
Topic: Analytics ==> Survey Sampling and Analysis; SAS Reference ==> Procedures ==> SURVEYSELECT
Date Modified: 2016-06-01 17:11:48
Date Created: 2002-12-16 10:56:38