This example uses the same data set as Example 9.1 and demonstrates how to use PROC HPSAMPLE to perform oversampling.
When the input data set resides on the SAS appliance, the appliance performs all of the sampling, writes the sample to the
appliance, and reports the frequency results back to the client. In the following statements, the input data set resides in
the MyLib library (which is a distributed data source), and the output data set is a distributed data set in the same library.
Because the output table might already exist on the SAS appliance, the statements first delete it with PROC DELETE; you can also
use PROC DATASETS for this purpose. The ods output FreqTable=Freqtab; statement saves the frequency table to a SAS data set
called Freqtab on the client.
/* MyLib is a libref for a distributed data source.
   In this case, the computation is automatically done on the SAS appliance. */
option set=GRIDHOST="&GRIDHOST";
option set=GRIDINSTALLLOC="&GRIDINSTALLLOC";
option set=GRIDMODE="&GRIDMODE";
libname MyLib &LIBTYPE
   server="&GRIDDATASERVER"
   user=&USER
   password=&PASSWORD
   database=&DATABASE;
proc delete data=MyLib.out_smp_ex3; run;
proc hpsample data=MyLib.hmeq out=MyLib.out_smp_ex3 seed=13579
   partition samppctevt=80 eventprop=.2 event="SALES";
   var loan value delinq derog;
   class job;
   target job;
   ods output FreqTable=Freqtab;
run;
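The delete step above uses PROC DELETE. As a sketch of the PROC DATASETS alternative that the text mentions, and assuming that the engine behind the MyLib libref supports member deletion through PROC DATASETS (which this example does not confirm), the same cleanup might look like this:

```sas
/* Sketch only: delete the output table with PROC DATASETS instead of PROC DELETE. */
/* Assumes the MyLib libref is already assigned as in the statements above.        */
proc datasets library=MyLib nolist nowarn;
   /* NOLIST suppresses the directory listing;                    */
   /* NOWARN suppresses the error if out_smp_ex3 does not exist.  */
   delete out_smp_ex3;
quit;
```

Either form should run before PROC HPSAMPLE so that the OUT= table can be created fresh.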
Output 9.3.1 shows the performance environment information.
Output 9.3.1: Performance Information
Performance Information | |
---|---|
Host Node | greenarrow.unx.sas.com |
Execution Mode | Distributed |
Grid Mode | Symmetric |
Number of Compute Nodes | 16 |
Number of Threads per Node | 1 |
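Output 9.3.2 shows the number of observations in each level of the target variable JOB in the data set MyLib.Hmeq
and in the sample. The row with no target level label (279 observations) appears to correspond to observations for which JOB
is missing. After oversampling, the proportion of the SALES level rises from about 1.8% in the population (109 of 5,960
observations) to about 20% in the sample (87 of 439 observations).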
Output 9.3.2: Frequency Table
Oversampling Frequency Table
| Target Level | Number of Obs | Number of Samples |
|---|---|---|
| | 279 | 15 |
| MGR | 767 | 47 |
| OFFICE | 948 | 56 |
| OTHER | 2388 | 142 |
| PROFEXE | 1276 | 77 |
| SALES | 109 | 87 |
| SELF | 193 | 15 |
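To check the proportions quoted for the SALES level, the counts in Output 9.3.2 can be summed directly. The following DATA step is a minimal sketch: the counts are typed in from the table rather than read from Freqtab (whose column names are not shown in this example), and all variable names are illustrative.

```sas
/* Sketch only: recompute the SALES proportions from the counts in Output 9.3.2. */
/* All counts are copied by hand from the table; variable names are hypothetical. */
data _null_;
   pop_total  = 279 + 767 + 948 + 2388 + 1276 + 109 + 193;  /* Number of Obs column     */
   samp_total = 15 + 47 + 56 + 142 + 77 + 87 + 15;          /* Number of Samples column */
   pop_event  = 109;                                        /* SALES in the population  */
   samp_event = 87;                                         /* SALES in the sample      */

   pop_prop   = pop_event  / pop_total;   /* event proportion before oversampling */
   samp_prop  = samp_event / samp_total;  /* event proportion after oversampling  */
   event_rate = samp_event / pop_event;   /* share of SALES observations sampled  */

   put pop_total= samp_total=;
   put pop_prop= percent8.2 samp_prop= percent8.2 event_rate= percent8.2;
run;
```

The totals are 5,960 observations in the population and 439 in the sample, so SALES accounts for 109/5960 (about 1.8%) of the population and 87/439 (about 19.8%) of the sample. The 87 sampled events are about 80% of the 109 SALES observations, which is consistent with samppctevt=80, and the sample proportion is close to the eventprop=.2 request.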