Requirements for DATA Step Processing

To run a DATA step program in Hadoop, the following requirements must be met:
  • The DSACCEL= system option is set to ANY.
    For more information about the DSACCEL= system option, see SAS System Options: Reference.
  • The code must contain a LIBNAME statement using the SAS/ACCESS HADOOP engine.
    For more information about the Hadoop LIBNAME statement, see SAS/ACCESS for Relational Databases: Reference.
  • The input and output files must use the same libref for the HADOOP engine.
  • The DATA statement must be followed immediately by the SET statement.
  • The SAS Embedded Process must be running on the cluster where the input and output files exist.

This example demonstrates these requirements:

    options dsaccel=any;

    libname hdone hadoop;
    data hdone.out;
      set hdone.in;
      /* DATA step code */
    run;
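
As a slightly fuller sketch of the same pattern, the following adds connection options to the LIBNAME statement and sets MSGLEVEL=I so that the log carries informational notes, which can help confirm where the step ran. The server name, user ID, and password are placeholders rather than values from this document, and the connection options that a given site needs may differ.

```sas
/* Allow the DATA step to run in-database.                  */
/* MSGLEVEL=I requests informational notes in the SAS log.  */
options dsaccel=any msglevel=i;

/* Hypothetical connection values: replace with your own.   */
libname hdone hadoop server="hadoop.example.com"
                     user=myuser password="XXXXXX";

/* Input and output share the same libref, and the SET      */
/* statement immediately follows the DATA statement.        */
data hdone.out;
  set hdone.in;
  /* DATA step code */
run;
```

After the step runs, review the SAS log to check where the DATA step was processed.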