Procedure Considerations and Limitations

Overview

The considerations and limitations in the following sections apply to both Base SAS and SAS/STAT in-database procedures.
Note: Each in-database procedure has its own specific considerations and limitations. For more information, see the documentation for the procedure.

User-Defined Formats

If you use in-database procedures with user-defined formats that were published in the data source, you must have a local copy of the user-defined formats. Without the local copy, the procedure fails.
Note: The local copy of the user-defined format must be identical in both name and function to the format that is published to the data source. If they are not identical, the following actions occur.
  • A “check sum ERROR” warning is produced. The warning indicates that the local and published formats differ.
  • The local format is used, and the query is processed by SAS instead of inside the data source.
If this occurs, you can redefine the local format to match the published version and rerun the procedure inside the data source.
For more information about publishing user-defined formats, see the section on deploying and using formats for your data source in Part 3, “Format Publishing and the SAS_PUT( ) Function.”
Note: Format publishing of user-defined formats is not available for Hadoop, Oracle, and SAP HANA.

Row Order

  • DBMS tables have no inherent order for the rows. Therefore, the BY statement with the OBS option and the FIRSTOBS option prevents in-database processing.
  • If you specify the ORDER=DATA option for input data, the procedure might produce different results for separate runs of the same analysis.
  • The order of rows written to a data source table from a SAS procedure is not likely to be preserved. For example, the SORT procedure can write a SAS data set that contains ordered observations. If the results are written to a data source table, the order of rows within that table might not be preserved because the DBMS has no obligation to maintain row order.
  • You can print a table using the SQL procedure with an ORDER BY clause to get consistent row order. Another option is to use the SORT procedure to create an ordinary SAS data set and use the PRINT procedure on that SAS data set.

BY-Groups

BY-group processing is handled by SAS for Base SAS procedures. Raw results are returned from the DBMS, and SAS BY-group processing applies formats as necessary to create the BY group.
For SAS/STAT procedures, formats can be applied, and BY-group processing can occur inside the DBMS if the SAS_PUT( ) function and formats are published to the DBMS. For more information, see the SAS Analytics Accelerator for Teradata: Guide.
These BY statement option settings apply to the in-database procedures:
  • The DESCENDING option is supported.
  • The NOTSORTED option is ignored because the data is always returned in sorted order.
When SAS/ACCESS creates a data source table, SAS/ACCESS by default uses the SAS formats that are assigned to variables to decide which DBMS data types to assign to the DBMS columns. If you specify the DBFMTIGNORE system option for numeric formats, SAS/ACCESS creates DBMS columns with a DOUBLE PRECISION data type. For more information, see the LIBNAME Statement for Relational Databases, “LIBNAME Statement Data Conversions,” and the DBFMTIGNORE system option in SAS/ACCESS for Relational Databases: Reference.

LIBNAME Statement

  • These LIBNAME statement options and settings prevent in-database processing:
    • CONNECTION
    • CONNECTION_GROUP
    • DBCREATE_TABLE_OPTS
    • DBMSTEMP=YES
    • DBCONINIT
    • DBCONTERM
    • DBGEN_NAME=SAS
    • HDFS_METADIR
    • MODE=TERADATA
  • LIBNAME concatenation prevents in-database processing.

Data Set-Related Options

These data set options and settings prevent in-database processing:
  • DBCONDITION
  • DBFORCE
  • DBLINK (Oracle only)
  • DBNULL
  • DBTYPE
  • NULLCHAR
  • NULLCHARVAL
  • OBS= and FIRSTOBS= on DATA= data set
  • OUT= data set on DBMS and DATA= data set not on DBMS
    For example, if data=work.foo and out=tera.fooout where WORK is the Base SAS engine, in-database processing does not occur.
  • RENAME= on a data set
  • SCHEMA

Column Names in Netezza

Column names that start with an underscore are not allowed in Netezza.
An error occurs if you try to create an output table in Netezza that contains a column whose name starts with an underscore. The workaround for this is to send the output table to the SAS Work directory.

Table Names in Hive

Table names that start with an underscore are not allowed in Hive.
An error occurs if you try to create an output table in Hive whose name starts with an underscore. The workaround for this is to send the output table to the SAS Work directory.

Additional Limitations That Can Prevent In-Database Processing

These items prevent in-database processing:
  • DBMSs do not support SAS passwords.
  • SAS encryption requires passwords that are not supported.
  • Teradata does not support generation options that are explicitly specified in the procedure step, and the procedure does not know whether a generation number is explicit or implicit.
  • When the data source resolves function references. the data source searches in this order:
    1. fully qualified object name
    2. current data source
    3. SYSLIB
    If you need to reference functions that are published in a nonsystem, nondefault data source, you must use one of these methods:
    • Use explicit SQL.
    • Use the DATABASE= LIBNAME option.
    • Map the fully qualified name (schema.sas_put) in the external mapping.
  • Oracle Version 10g supports only 4000 characters per input data item. If you are transcoding input data that has special characters, be aware that these characters might need more than one byte per character.