Sample Code for Hadoop

Code Snippets

The code snippets in this section resemble those for most other SAS/ACCESS interfaces.
This snippet shows a list of available Hive tables.
proc datasets lib=hdp; quit;
Here is the metadata for the mytab Hive table.
proc contents data=hdp.mytab; quit;
This snippet extracts mytab data into SAS.
data work.a;
set hdp.mytab;
run;
This extracts a subset of the mytab rows and columns into SAS. Subsetting the rows (with a WHERE statement, for example) can help avoid extracting too much data into SAS.
data work.a;
set hdp.mytab (keep=col1 col2);
where col2=10;
run;

Use DBSASTYPE= to Load Hadoop Data into SAS

This example uses the DBSASTYPE= data set option to load Hadoop textual dates, timestamps, and times into the corresponding SAS DATE, DATETIME, and TIME formats. The first step reads in a SAS character string to display the data and make clear what occurs in successive steps.
data; set hdp.testHiveDate; put dt; run;
2011-10-17
2009-07-30 12:58:59
11:30:01
data; set hdp.testHiveDate(dbsastype=(dt='date')); put dt; run;
17OCT2011
30JUL2009
.
data; set hdp.testHiveDate(dbsastype=(dt='datetime')); put dt; run;
17OCT2011:00:00:00
30JUL2009:12:58:59
.
data; set hdp.testHiveDate(dbsastype=(dt='time')); put dt; run;
.
12:58:59
11:30:01
This code uses SAS SQL to access a Hadoop table.
proc sql;
create table work.a as select * from hdp.newtab;
quit;
SAS data is then loaded into Hadoop.
data hdp.newtab2;
set work.a;
run;
Use implicit pass-through SQL to extract only 10 rows from the newtab table and load the work SAS data set with the results.
proc sql;
connect to hadoop (server=hxpduped);
create table work.a as
   select * from connection to hadoop (select * from newtab limit 10);