Requirements for DATA Step Processing

To run a DATA step program in Hadoop, the following requirements must be met:
  • The DSACCEL= system option is set to ANY.
    For more information about the DSACCEL= system option, see SAS System Options: Reference.
  • The code must contain a LIBNAME statement using the SAS/ACCESS HADOOP engine.
    For more information about the Hadoop LIBNAME statement, see SAS/ACCESS for Relational Databases: Reference.
  • The input and output files must use the same libref for the HADOOP engine.
  • The DATA statement must be followed immediately by the SET statement.
    The following example meets the preceding requirements:
    options dsaccel=any;

    libname hdone hadoop;
    data hdone.out;
       set hdone.in;
       /* DATA step code */
    run;
  • The SAS Embedded Process must be running on the cluster where the input and output files exist.
    For more information, see "Determining the Status of the SAS Embedded Process" and "Starting the SAS Embedded Process" in SAS In-Database Products: Administrator's Guide.
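
The requirements above can be combined into a slightly fuller sketch. The connection values shown here (the server name, user ID, and password) are hypothetical placeholders and are not taken from this document; substitute the values for your own cluster. The sketch also sets MSGLEVEL=I, on the assumption that the additional informational notes in the SAS log help confirm whether the DATA step ran in Hadoop. The exact wording of those notes can vary by release.

```sas
/* Enable in-database DATA step processing and request      */
/* informational notes in the SAS log.                      */
options dsaccel=any msglevel=i;

/* Hypothetical connection values: replace SERVER=, USER=,  */
/* and PASSWORD= with the values for your own cluster.      */
libname hdone hadoop server="hive.example.com"
                     user=myuser password="XXXXXXXX";

/* Input and output use the same libref, and the SET        */
/* statement immediately follows the DATA statement.        */
data hdone.out;
   set hdone.in;
   /* DATA step code */
run;
```

If the log does not indicate that the step ran in Hadoop, check each requirement in the list above, starting with the status of the SAS Embedded Process on the cluster.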