Sample 25431: Selecting a sample of K observations from a data set
These sample files and code examples are provided by SAS Institute
Inc. "as is" without warranty of any kind, either express or implied, including
but not limited to the implied warranties of merchantability and fitness for a
particular purpose. Recipients acknowledge and agree that SAS Institute shall
not be liable for any damages whatsoever arising out of their use of this material.
In addition, SAS Institute will provide no support for the materials contained herein.
This sample is from the SAS Sample Library. For additional information refer to SAS Help and Online Documentation.
/****************************************************************/
/* S A S   S A M P L E   L I B R A R Y */
/* */
/* NAME: SAMPLE */
/* TITLE: Selecting a Sample of K Observations from a Data Set*/
/* PRODUCT: SAS */
/* SYSTEM: all */
/* KEYS: DATMAN STATAPP DATASTEP FUNCTION RANNOR RANUNI */
/* PRINT BY SET LAST. */
/* PROCS: PRINT */
/* DATA: */
/* */
/* SUPPORT: UPDATE: */
/* REF: */
/* MISC: */
/* */
/****************************************************************/
/* Random sample */
/* This code demonstrates the logic for taking a random sample */
/* of K observations from each BY group. */
OPTIONS LS=72;
DATA DS; /* Generate the data for this example */
RETAIN SEED1 12345 SEED2 12345;
ID=0; /* ID defines groups (valued 0-9), */
/* with 1-50 obs in each (the loop at L */
/* always outputs at least one obs). */
CALL RANUNI(SEED1,K);
K=K*50;
L: CALL RANNOR(SEED2,Y);
OUTPUT;
K=K-1;
IF K>0 THEN GOTO L;
ID+1;
CALL RANUNI(SEED1,K);
K=K*50;
IF ID<10 THEN GOTO L;
KEEP ID Y;
DROP SEED1 SEED2;
RUN;
PROC PRINT DATA=DS;
BY ID;
TITLE 'THE ORIGINAL DATA SET';
RUN;
DATA SAMPLE; /* Take a random sample of 10 from each ID */
/* group. First count the obs in this group. */
/* If there are 10 or fewer, PROB=K/N is at */
/* least 1, so all of them are selected. */
SCAN: SET DS;
BY ID;
N+1;
IF NOT LAST.ID THEN GOTO SCAN;
K=10; /* K is the number to randomly select from */
/* this group. It may be a function of N, */
/* e.g.: K=CEIL(.05*N) for a 5 percent sample */
/* (an integer K gives an exact sample size). */
LOOP: SET DS;
PROB=K/N; /* PROB is the current select probability */
IF RANUNI(12345)>PROB THEN GOTO NEXT;
OUTPUT; /* The observation is selected */
K=K-1;
NEXT: N=N-1;
IF N>0 THEN GOTO LOOP;
RUN;
PROC PRINT DATA=SAMPLE;
BY ID;
TITLE 'THE SAMPLE FROM THE ORIGINAL DATA SET';
RUN;
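The selection step above can also be written without GOTO. What follows is a minimal sketch, not part of the original sample: it assumes the data set DS created above, writes to a hypothetical output data set named SAMPLE2, and caps K at the group size so that PROB never exceeds 1. The two SET statements read DS through independent pointers, as in the original step.

```sas
/* Sketch only: same selection logic as DATA SAMPLE, with DO loops. */
/* SAMPLE2 is a hypothetical name chosen for this illustration.     */
DATA SAMPLE2;
   /* First pass: count the observations in the current ID group. */
   N = 0;
   DO UNTIL (LAST.ID);
      SET DS;
      BY ID;
      N + 1;
   END;

   /* Number to select from this group, capped at the group size  */
   /* (an addition to the original, which leaves K=10 throughout). */
   K = MIN(10, N);

   /* Second pass: reread the same group and select sequentially. */
   DO UNTIL (N = 0);
      SET DS;
      PROB = K / N;            /* current selection probability   */
      IF RANUNI(12345) <= PROB THEN DO;
         OUTPUT;               /* the observation is selected      */
         K = K - 1;
      END;
      N = N - 1;
   END;
RUN;
```

Because the cap changes only K and PROB for groups with fewer than 10 observations, the selected observations should match those of the original step when the same seed is used; this has not been verified against a SAS session.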
THE ORIGINAL DATA SET
ID=0
OBS Y
1 -0.04298
2 -0.09999
3 -0.24349
4 -0.22226
5 0.07353
6 0.49937
7 -1.52119
8 0.79180
9 0.57221
10 0.17571
11 -1.44361
12 0.44887
13 -0.64419
14 1.69914
15 0.03142
16 0.96398
17 0.60484
18 0.30401
19 0.85256
ID=1
OBS Y
20 0.72810
21 0.18137
22 0.62187
23 -1.03079
24 0.49056
25 1.47289
26 1.84239
27 -1.13082
28 -1.55640
29 -1.37761
30 -1.11293
31 0.36636
32 -1.11567
33 0.50600
34 -0.03473
35 -0.45415
36 -0.77319
37 -0.78785
38 -0.51671
39 -0.52238
40 -0.54947
41 -1.03773
42 0.57771
43 -0.77680
44 1.61567
45 0.40223
46 -1.43182
47 -0.35979
48 2.04519
49 0.29577
50 -1.50831
51 -1.68968
52 0.61034
53 -0.48024
54 -2.43170
55 0.30556
56 0.19337
57 -1.03129
ID=2
OBS Y
58 -1.18237
59 -0.18794
60 1.41937
61 -0.22017
62 0.39716
63 1.30340
64 0.80306
65 -0.84595
66 0.24197
67 1.04111
68 -0.79731
69 1.53661
70 -0.54922
71 -0.07136
72 -0.47312
73 -0.08258
74 1.05537
75 0.32932
76 0.69760
77 -1.75574
78 -0.90481
79 0.73914
80 0.06795
81 -0.40015
82 -0.64213
83 0.58588
84 -0.05007
85 -0.57406
86 -1.75401
87 -1.09562
88 -0.37296
89 0.64407
90 1.06368
91 0.62897
92 0.20284
93 -1.27817
94 0.35541
95 -0.23264
96 -0.08819
97 1.01026
98 -0.39166
99 1.89210
ID=3
OBS Y
100 1.44692
101 0.50543
102 -0.31590
103 -0.59481
104 -0.46179
105 0.64560
106 2.29568
107 1.05388
108 -0.64543
109 -0.76476
110 1.22160
111 -1.66032
112 -1.64458
113 -0.84033
ID=4
OBS Y
114 -0.92060
115 -0.71083
116 0.77366
117 0.63364
118 1.69748
119 0.35330
120 0.33375
121 -0.28322
122 1.21632
123 0.94123
ID=5
OBS Y
124 0.04430
125 0.68092
126 0.65207
127 -0.10412
128 1.31174
129 0.35382
130 -1.02247
131 -0.43255
132 -0.71871
133 -0.66003
134 -1.51920
135 0.23616
136 -0.07064
137 0.12275
138 0.37231
139 1.01600
140 1.23229
141 0.84475
142 -0.04564
143 -0.44523
144 1.00205
145 -0.10688
146 -0.09297
147 0.34426
148 -0.68471
149 -1.19954
150 -1.24912
151 -1.83780
152 -2.92843
153 -1.95062
154 0.71249
155 -0.71101
156 0.52271
157 0.31401
158 1.38917
159 -2.02083
160 -1.09453
ID=6
OBS Y
161 -1.22608
162 -1.35549
163 -2.90749
164 -0.48246
ID=7
OBS Y
165 1.58283
166 -0.54956
167 0.54916
168 -0.41504
169 0.06839
170 -2.18023
171 -1.11801
172 -0.17716
173 -0.35128
174 0.93176
175 -0.81017
176 -0.17025
177 -0.98307
178 0.63747
179 -0.48010
180 -1.07973
181 1.03507
182 1.10783
183 0.26936
184 0.76464
185 0.01736
186 0.36615
187 -0.37553
188 0.91624
189 0.92953
190 -0.47725
191 -0.45826
192 1.60434
193 -1.75591
194 1.04359
195 -0.74449
196 0.54067
197 -0.62822
198 1.36228
199 -0.03420
200 0.13569
201 -0.25792
ID=8
OBS Y
202 0.10837
203 -0.94440
204 -0.52336
205 -0.79481
206 -0.79491
207 0.20130
208 -0.42985
209 0.31430
210 -1.24146
211 -0.89851
212 1.15808
213 0.38177
214 0.47140
215 -2.11844
216 0.95942
217 -0.07391
218 0.05033
219 -0.48828
220 -0.44363
221 0.22591
222 -1.72475
223 0.58405
224 0.15506
225 -0.22388
226 -0.36666
227 1.63594
228 -1.12804
229 0.15312
230 0.19330
231 -0.00324
232 -0.91588
233 0.54857
234 1.36404
235 0.54981
236 -1.37281
237 -0.74566
ID=9
OBS Y
238 -0.07178
239 1.91303
240 0.69410
241 1.76330
242 -0.22409
243 1.00801
244 0.12493
245 -0.61721
246 1.80041
247 0.34919
248 -2.85407
249 -2.57655
250 0.81381
251 0.61024
252 -0.94201
253 0.09191
254 0.81538
255 -0.43872
256 -0.47206
257 0.71451
258 0.92196
259 0.36260
260 -0.76397
261 0.54613
262 -0.38955
263 -0.33226
264 -1.15956
265 1.03150
266 0.56480
267 -0.46263
268 2.06405
269 1.34594
270 -0.12991
271 0.90268
272 -0.57690
273 0.20573
274 -0.49554
275 -0.32756
276 0.39208
THE SAMPLE FROM THE ORIGINAL DATA SET
ID=0
OBS Y N K PROB
1 -0.04298 19 10 0.52632
2 -0.22226 16 9 0.56250
3 0.07353 15 8 0.53333
4 -1.52119 13 7 0.53846
5 -1.44361 9 6 0.66667
6 0.44887 8 5 0.62500
7 -0.64419 7 4 0.57143
8 1.69914 6 3 0.50000
9 0.60484 3 2 0.66667
10 0.30401 2 1 0.50000
ID=1
OBS Y N K PROB
11 0.72810 38 10 0.26316
12 -1.13082 31 9 0.29032
13 -1.55640 30 8 0.26667
14 -1.11293 28 7 0.25000
15 -1.11567 26 6 0.23077
16 -0.77319 22 5 0.22727
17 -0.51671 20 4 0.20000
18 -0.54947 18 3 0.16667
19 0.29577 9 2 0.22222
20 -1.68968 7 1 0.14286
ID=2
OBS Y N K PROB
21 -0.84595 35 10 0.28571
22 -0.05007 16 9 0.56250
23 -1.75401 14 8 0.57143
24 -1.09562 13 7 0.53846
25 1.06368 10 6 0.60000
26 0.62897 9 5 0.55556
27 0.35541 6 4 0.66667
28 -0.23264 5 3 0.60000
29 -0.39166 2 2 1.00000
30 1.89210 1 1 1.00000
ID=3
OBS Y N K PROB
31 1.44692 14 10 0.71429
32 0.50543 13 9 0.69231
33 -0.31590 12 8 0.66667
34 -0.59481 11 7 0.63636
35 2.29568 8 6 0.75000
36 1.05388 7 5 0.71429
37 -0.64543 6 4 0.66667
38 -0.76476 5 3 0.60000
39 1.22160 4 2 0.50000
40 -0.84033 1 1 1.00000
ID=4
OBS Y N K PROB
41 -0.92060 10 10 1
42 -0.71083 9 9 1
43 0.77366 8 8 1
44 0.63364 7 7 1
45 1.69748 6 6 1
46 0.35330 5 5 1
47 0.33375 4 4 1
48 -0.28322 3 3 1
49 1.21632 2 2 1
50 0.94123 1 1 1
ID=5
OBS Y N K PROB
51 0.04430 37 10 0.27027
52 0.23616 26 9 0.34615
53 0.12275 24 8 0.33333
54 0.37231 23 7 0.30435
55 1.00205 17 6 0.35294
56 -0.68471 13 5 0.38462
57 -1.19954 12 4 0.33333
58 -1.24912 11 3 0.27273
59 -1.83780 10 2 0.20000
60 -1.09453 1 1 1.00000
ID=6
OBS Y N K PROB
61 -1.22608 4 10 2.5
62 -1.35549 3 9 3.0
63 -2.90749 2 8 4.0
64 -0.48246 1 7 7.0
ID=7
OBS Y N K PROB
65 -0.54956 36 10 0.27778
66 -1.11801 31 9 0.29032
67 -0.17025 26 8 0.30769
68 0.63747 24 7 0.29167
69 1.10783 20 6 0.30000
70 0.91624 14 5 0.35714
71 -0.47725 12 4 0.33333
72 1.60434 10 3 0.30000
73 -0.62822 5 2 0.40000
74 -0.03420 3 1 0.33333
ID=8
OBS Y N K PROB
75 0.20130 31 10 0.32258
76 -1.24146 28 9 0.32143
77 -0.89851 27 8 0.29630
78 1.15808 26 7 0.26923
79 0.38177 25 6 0.24000
80 0.47140 24 5 0.20833
81 -0.44363 18 4 0.22222
82 0.22591 17 3 0.17647
83 -0.00324 7 2 0.28571
84 0.54981 3 1 0.33333
ID=9
OBS Y N K PROB
85 -0.22409 35 10 0.28571
86 0.12493 33 9 0.27273
87 1.80041 31 8 0.25806
88 0.71451 20 7 0.35000
89 0.54613 16 6 0.37500
90 -0.46263 10 5 0.50000
91 -0.12991 7 4 0.57143
92 0.20573 4 3 0.75000
93 -0.32756 2 2 1.00000
94 0.39208 1 1 1.00000
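The N, K, and PROB columns in the listing above record the state of the DATA SAMPLE step at the moment each observation was selected. As a brief worked check (a sketch of the reasoning, assuming K is an integer with 1 <= K <= N), the first two observations of a group each have overall selection probability K/N, even though the second one is tested against an updated ratio:

```latex
\begin{align*}
P(\text{obs 1 selected}) &= \frac{K}{N}, \\
P(\text{obs 2 selected}) &= \frac{K}{N}\cdot\frac{K-1}{N-1}
   + \Bigl(1-\frac{K}{N}\Bigr)\cdot\frac{K}{N-1} \\
 &= \frac{K(K-1) + (N-K)K}{N(N-1)}
  = \frac{K(N-1)}{N(N-1)} = \frac{K}{N}.
\end{align*}
```

The printed values agree with PROB=K/N: for ID=0 the first selected row has N=19 and K=10, and 10/19 is approximately 0.52632; the second has N=16 and K=9, and 9/16 = 0.56250. For ID=6 the group has only 4 observations, so PROB starts at 10/4 = 2.5 and every observation is selected.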
This example shows how to select a sample of K observations from a data set.
| Type: | Sample |
| Topic: | SAS Reference ==> DATA Step; SAS Reference ==> CALL routines |
| Date Modified: | 2006-01-11 03:03:02 |
| Date Created: | 2005-05-23 13:53:22 |
Operating System and Release Information
| SAS System | Base SAS | All | n/a | n/a |