The following example shows how to use the HPDS2 procedure to copy a data set to a Greenplum database; the table is distributed
by a column called from_node
:
libname linkdata 'C:\mydata'; libname gplib greenplm server = "grid001.example.com" schema = public user = dbuser password = dbpass database = hps; proc datasets nolist lib=gplib; delete links_data_123; quit; proc hpds2 data = linkdata.links_data_123 out = gplib.links_data_123 (distributed_by='distributed by (from_node)'); performance host = "grid001.example.com" install = "/opt/TKGrid" commit = 10000000; data DS2GTF.out; method run(); set DS2GTF.in; end; enddata; run;
If the output table links_data_123
already exists in the Greenplum database, the call to the DATASETS procedure removes the existing table from the database,
because a DBMS usually does not support replacement operations on tables.
The OUT= option in the PROC HPDS2 statement specifies a table that uses the library gplib
, which is the Greenplum library that is assigned in the second LIBNAME statement. This option also requests that PROC HPDS2
distribute the records by from_node
among the data segments of the computing appliance. The statements that follow the PERFORMANCE statement are the DS2 program
that copies the input data to the output data without further transformations.
For more information about the HPDS2 procedure, see Base SAS Procedures Guide: High-Performance Procedures.