High-Performance Features of the OPTGRAPH Procedure


Distribute Data to Hadoop in SASHDAT Format

The following example uses a DATA step to copy a data set to a Hadoop appliance and store it in SASHDAT format. The PARTITION= data set option distributes the table by the column from_node:

   libname linkdata 'C:\mydata';
   libname hdat sashdat hdfs_path='user/data';
   
   proc datasets nolist lib=hdat;
      delete links_data_123;
   quit;
   
   data hdat.links_data_123 (partition=(from_node));
      set linkdata.links_data_123;
   run;

If the output table links_data_123 already exists on the Hadoop appliance, the PROC DATASETS step deletes it before the DATA step runs. The deletion is needed because a DBMS usually does not support replacing an existing table.
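
As an alternative, the delete step can sometimes be folded into the DATA step itself. The following is a minimal sketch, assuming that the SASHDAT engine in your release supports the REPLACE= data set option; that option does not appear in the example above, so verify it against your engine documentation before relying on it. The librefs and table names are the same as in the preceding example:

   /* Assumption: REPLACE=YES overwrites an existing SASHDAT table, */
   /* so the separate PROC DATASETS step is not needed.             */
   data hdat.links_data_123 (partition=(from_node) replace=yes);
      set linkdata.links_data_123;
   run;

The explicit PROC DATASETS step remains the safer choice when you are unsure whether the engine honors replacement, because it does not depend on an engine-specific option.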