The Sashelp.LeuTrain and Sashelp.LeuTest data sets provide microarray data from Golub et al. (1999) and Zou and Hastie (2005). The Sashelp.LeuTrain data set consists of 7,129 genes and 38 training samples, and the Sashelp.LeuTest data set consists of the same 7,129 genes and 34 testing samples. Among the 38 training samples, 27 are type 1 leukemia (acute lymphoblastic leukemia, coded in the data as 1) and 11 are type 2 leukemia (acute myeloid leukemia, coded in the data as -1).
The following steps display information about the Sashelp.LeuTrain data set and create Figure B.12:

    title 'Leukemia Training Data';
    proc contents data=sashelp.LeuTrain varnum;
       ods select position;
    run;

    title 'The First Five Observations and 11 Variables';
    proc print data=sashelp.LeuTrain(obs=5);
       var y x1-x10;
    run;

    title 'Leukemia Type Variable';
    proc freq data=sashelp.LeuTrain;
       tables y;
    run;

Figure B.12: Leukemia Training Data
The First Five Observations and 11 Variables

| Obs | y | x1 | x2 | x3 | x4 | x5 | x6 | x7 | x8 | x9 | x10 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 1 | -1.46240 | -0.64514 | -0.83593 | -1.47040 | -0.91997 | -1.58430 | 0.71239 | -0.54229 | 1.05090 | 0.23649 |
| 2 | 1 | -0.66480 | 0.20615 | -0.36857 | 0.25822 | -0.47567 | -0.35497 | -1.11940 | -0.29251 | -0.37542 | -0.38760 |
| 3 | 1 | -0.20049 | 0.37994 | -2.38280 | 0.43960 | -1.22700 | -1.76220 | 0.10464 | -1.80750 | 0.49292 | -1.67000 |
| 4 | 1 | -0.25776 | 0.27994 | 1.83920 | -1.62950 | -1.28750 | -1.26510 | 0.76334 | -0.61645 | -0.31578 | -0.32193 |
| 5 | 1 | -0.56457 | -0.39588 | -0.98372 | -0.83741 | -0.41477 | 0.14834 | -0.03550 | -0.10022 | -0.75753 | 0.37068 |
The results of the PROC CONTENTS step are not displayed. They show that Sashelp.LeuTrain contains 7,130 variables: y and x1-x7129.
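Because the PROC CONTENTS listing for more than 7,000 variables is long, the variable count can also be checked directly. The following is a minimal sketch, not part of the original steps, that queries the DICTIONARY.COLUMNS metadata table in PROC SQL. The column alias n_variables is an illustrative name. If the description above holds, the query should report 7,130.

```sas
/* Count the variables in Sashelp.LeuTrain without printing the full listing. */
/* DICTIONARY.COLUMNS stores libref and member names in uppercase.            */
title 'Number of Variables in Sashelp.LeuTrain';
proc sql;
   select count(*) as n_variables   /* n_variables is an illustrative alias */
      from dictionary.columns
      where libname = 'SASHELP' and memname = 'LEUTRAIN';
quit;
title;
```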
The following steps display information about the Sashelp.LeuTest data set and create Figure B.13:

    title 'Leukemia Test Data';
    proc contents data=sashelp.LeuTest varnum;
       ods select position;
    run;

    title 'The First Five Observations and 11 Variables';
    proc print data=sashelp.LeuTest(obs=5);
       var y x1-x10;
    run;

    title 'Leukemia Type Variable';
    proc freq data=sashelp.LeuTest;
       tables y;
    run;

Figure B.13: Leukemia Test Data
The First Five Observations and 11 Variables

| Obs | y | x1 | x2 | x3 | x4 | x5 | x6 | x7 | x8 | x9 | x10 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 1 | -1.38240 | 0.06288 | 0.62252 | 1.61210 | 0.52179 | 0.11516 | -1.85270 | -0.39956 | 0.88007 | -0.86565 |
| 2 | 1 | 0.65192 | -0.35476 | 2.29630 | 1.64980 | 0.50211 | -0.37315 | 1.76820 | -1.74270 | 1.63080 | 0.60171 |
| 3 | 1 | 0.65409 | 1.41340 | 0.22593 | -0.06719 | 0.30015 | 0.76964 | -0.26212 | 0.94481 | -0.51884 | -0.60999 |
| 4 | 1 | 1.07220 | 0.01959 | 0.16875 | 0.84779 | 0.24533 | 0.79682 | 0.41442 | 0.35122 | -0.70177 | 1.85410 |
| 5 | 1 | 2.12480 | 1.66370 | -0.35986 | 1.15850 | 0.89379 | 0.56310 | -0.92476 | 0.56790 | -0.56039 | -2.12400 |
The results of the PROC CONTENTS step are not displayed. They show that Sashelp.LeuTest contains 7,130 variables: y and x1-x7129.
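Because the two data sets are described as sharing the same 7,129 genes, one way to confirm that they can be used together as training and test data is to compare their variable names. The following is a minimal sketch, not part of the original steps, that uses the DICTIONARY.COLUMNS metadata table and the EXCEPT set operator in PROC SQL. Each query is expected to return no rows if the variable lists match.

```sas
/* Variables in the training data that are missing from the test data. */
title 'Variables in LeuTrain but Not in LeuTest';
proc sql;
   select name
      from dictionary.columns
      where libname = 'SASHELP' and memname = 'LEUTRAIN'
   except
   select name
      from dictionary.columns
      where libname = 'SASHELP' and memname = 'LEUTEST';
quit;

/* Variables in the test data that are missing from the training data. */
title 'Variables in LeuTest but Not in LeuTrain';
proc sql;
   select name
      from dictionary.columns
      where libname = 'SASHELP' and memname = 'LEUTEST'
   except
   select name
      from dictionary.columns
      where libname = 'SASHELP' and memname = 'LEUTRAIN';
quit;
title;
```

This check compares names only. It does not compare variable types, lengths, or values.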