SPD Server SQL Extensions

SPD Server SQL provides several extensions to the SQL language. These extensions are not part of standardized industry SQL, but they are an integral part of the SPD Server system. These extensions enable systemic data management unique to the SPD Server. The SPD Server SQL uses a special pass-through facility that uses these extensions for data manipulation and extraction operations.

BEGIN and END ASYNC OPERATION Statements

Overview of BEGIN and END ASYNC OPERATION Statements

You can use asynchronous statements to maximize the performance of statements by allowing them to execute in parallel. Use the BEGIN ASYNC OPERATION and END ASYNC OPERATION statements to delimit one or more statements for asynchronous, parallel execution. Because the statements execute in parallel, they must not depend on another statement because it is impossible to guarantee which statement will finish before another statement executes. SPD Server software initiates thread execution according to the order of the statements in the block.
Note: When the END ASYNC statement is processed, all execute statements within a BEGIN and END ASYNC block are written to the SAS log along with the results. The execute statements’ output is in clear text including any passwords or keys that might be present in the query.
Usage:
execute ([ BEGIN | END ] ASYNCH OPERATION); 

Invalid ASYNC Block Statements

The statements in this invalid ASYNC block example have invalid interdependencies and can produce unexpected results:
     /* Example of Illegal ASYNC Block Code   */

           PROC SQL;
              connect to sasspds
                 (dbq="my-domain"
                  server=host.port
                  user='user-name'
                  password='user-password'
                  other connection options);

              execute(begin async operation)
                 by sasspds;

              execute(create table T1 as
                 select *
                 from SRC1)
                 by sasspds;

              execute(create unique index I1 on
                 T1(a,b))
                 by sasspds;

              execute(end async operation)
                 by sasspds;

              disconnect from sasspds;
           quit;
The example violates the interdependency rule. The CREATE INDEX statement assumes that table T1 exists and is complete. However, table T1 is created from table SRC1 and might not be complete before the asynchronous CREATE INDEX statement executes. Therefore, index I1 is dependent on a complete table T1. The resulting data would not be reliable.

Legal ASYNC Block Statements

The statements in this legal ASYNC block example do not have interdependencies.
      /*  Example of Legal ASYNC Block Code            */
      /* Creates some tables in the first ASYNC block  */
      /*                                               */

         PROC SQL;
            connect to sasspds
              (dbq="path1"
               server=host.port
               user='anonymous');

            execute(begin async operation)
               by sasspds;

            execute(create table state_al as
              select *
              from allstates
              where state='AL')
              by sasspds;

            execute(create table state_az as
              select *
              from allstates
              where state='AZ')
              by sasspds;
              ...

            execute(create table state_wy as
               select *
               from allstates
               where state='WY')
               by sasspds;

            execute(end async operation)
              by sasspds;

      /*                                                */
      /* Create some indexes in the second ASYNC block  */
      /*                                                */

            execute(begin async operation)
              by sasspds;

            execute(create index county on
                    state_al(county))
              by sasspds;

            execute(create index county on
                    state_az(county))
              by sasspds;
              ...

            execute(create index county on
                    state_wy(county))
              by sasspds;

            execute(end async operation)
              by sasspds;

           disconnect from sasspds;
          quit;
This example functions correctly because each table is created independently. There is a synchronization point: the first END ASYNC operation. The synchronization point ensures that all the tables are created before the second ASYNC statement block begins. (You can also achieve similar results by using the LOAD statement.) For more information about the LOAD statement, see LOAD Statement.

Using Librefs in an ASYNC Block Statement

To refer to a two-part table name inside an ASYNC block, you must re-execute the libref statement that you issued before you can enter the block. Conversely, if you issue a libref statement inside the ASYNC block, it does not extend outside the ASYNC block. An ASYNC block creates a distinct scope for the libref. To function correctly, you must place a libref statement inside the ASYNC block, and the libref statement must precede the first SQL statement that references it.
      /*  Example of Legal Code using LIBREFs in an ASYNC Block  */
      /*  Create some tables in the first ASYNC block            */

         PROC SQL;
            connect to sasspds
              (dbq="path1"
               server=host.port
               user='anonymous');

            execute(begin async operation)
               by sasspds;

            execute(libref path1 engopt='dbq="path1"
               server=host.port
               user="anonymous"')
               by sasspds;

            execute(libref path2 engopt='dbq="path2"
               server=host.port
               user="anonymous"')
               by sasspds;

            execute(create table path1.southeast as
               select a.customer_id,
                      a.region,
                      b.sales
               from   path1.customer a,
                      path2.orders b
               where  a.customer_id = b.customer_id
               and    a.region='SE')
               by sasspds;

                     ....

             execute(create table path1.northeast as
                 select a.customer_id,
                        a.region,
                        b.sales
                 from   path1.customer a,
                        path2.orders b
                 where  a.customer_id = b.customer_id
                 and    a.region='NE')
                 by sasspds;

            execute(end async operation)
               by sasspds;

            disconnect from sasspds;
         quit;

Using SQL Options in an ASYNC Block Statement

You must set SPD Server SQL options globally for all execute statements in the ASYNC block. You must set these options by using an execute statement before the BEGIN ASYNC operation. This example uses code blocks from the example Using Librefs in an ASYNC Block Statement to show how to print a method tree without executing the SQL.
      /*                                               */
      /*  Example of Legal SQL Options in ASYNC Block  */
      /*                                               */

			LIBNAME Path1 sasspds ... LIBGEN=YES;
			LIBNAME Path2 sasspds ... LIBGEN=YES;

         PROC SQL;
             connect to sasspds
                (dbq="path1"
                 server=host.port
                 user='anonymous');

             execute(reset noexec _method)
               by sasspds;

             execute(begin async operation)
               by sasspds;

             execute(libref path1
                engopt='dbq="path1"
                server=host.port
                user="anonymous"')
                by sasspds;

             execute(libref path2
                engopt='dbq="path2"
                server=host.port
                user="anonymous"')
                by sasspds;

             execute(create table path1.southeast as
                select a.customer_id,
                       a.region,
                       b.sales
                from   path1.customer a,
                       path2.orders b
                where  a.customer_id = b.customer_id
                and    a.region='SE')
                by sasspds;

                     ....

             execute(create table path1.northeast as
                select a.customer_id,
                       a.region,
                       b.sales
                from   path1.customer a,
                         path2.orders b
                where  a.customer_id = b.customer_id
                and    a.region='NE')
                by sasspds;

             execute(end async operation)
               by sasspds;

             disconnect from sasspds;
         quit;

LOAD Statement

Use the LOAD statement to create tables (with one or more indexes) by using a single statement. The data source for the statement is a SELECT clause. The SELECT list in the clause defines the columns for the new table. All characteristics of the columns (variables) in the SELECT list are preserved and become permanent attributes of the new table's column definitions. The target table for the LOAD TABLE statement must be on the local machine.
You cannot create a table with the LOAD statement if the source table and the result table have different domain backup credentials. The source table and result table must both be either BACKUP=yes or BACKUP=no domains. For more information about BACKUP= options and LIBNAME domains, see Creating the LIBNAME Domain in SAS Scalable Performance Data Server: Administrator's Guide. If you cannot use the LOAD statement, you must use PROC COPY to copy the tables, or use the SQL CREATE TABLE <tablename> AS SELECT statement.
In general, the LOAD statement is faster than a corresponding CREATE TABLE and CREATE INDEX statement pair, because it builds the table and one or more associated indexes asynchronously by using parallel processing.
Usage:
execute (LOAD TABLE table spec
    < WITH index spec
    < WITH index spec>>
  by sasspds;
In the following example, each execute statement creates a table for one U.S. state using a global table called STATE that contains many states. The first execute statement uses LOAD to create table STATE_AL (Alabama), and creates an index on the COUNTY column. The structure of the state table STATE_AL and the data in the state table both come from the global table STATE. The data in STATE_AL is the subset of all records from the STATE table in which the column variable equals 'AL'. The LOAD statement creates a table for all states (Alabama through Wyoming). The table for each state is indexed by county and mirrors the structure of the parent table STATE.
    execute(load table state_al
        with index county
        on (county) as
        select *
        from state
        where state='AL')
      by sasspds;

    execute(load table state_az
        with index county
        on (county) as
        select *
        from state
        where state='AZ')
      by sasspds;

        ...

    execute(load table state_wy
        with index county
        on (county) as
        select *
        from state
        where state='WY')
      by sasspds;

COPY Statement

The COPY statement creates a copy of an SPD Server table with or without the table indexes. In order to use the COPY table statement, the source and target tables must be on the same machine that the client is connected to. By default, the software creates one or more indexes. The COPY statement is faster than each of the following CREATE and LOAD statements:
create table ...
as select ...
create index ...
load table ...
with index...
as select ...
The COPY statement is faster than these statements because it uses a more direct access path than the SQL SELECT clause when it accesses the data.
You cannot use the COPY TABLE statement if the source table and the result table have different domain backup credentials. The source table and result table must both be either BACKUP=yes or BACKUP=no domains. If you cannot use the COPY statement, you must use PROC COPY to copy the tables, or use the SQL CREATE TABLE <tablename> AS SELECT statement.
The following example creates two new tables: T_NEWand T2_NEW. The first table, T_NEW, is created with index structures identical to table T_NEW. The second table, T2_NEW, is unindexed, regardless of the structure of table T2_OLD.
     execute(copy table t_new
             from t_old)
        by sasspds;

     execute(copy table t2_new
             from t2_old
             without indexes)
        by sasspds;
The COPY statement also supports an ORDER BY clause that you can use to create a new table with a sort order on one or more columns. COPY TABLE does not support all of the options of PROC SORT. However, you can achieve substantial performance gains when you create ordered tables by using the COPY TABLE command with an ORDER BY clause when appropriate.
The next example copies the table T_OLD to T_NEW using the order by clause. The data is ordered by columns: x in ascending order, y in descending order, and z in ascending order. The results are the same if you run PROC SORT on the columns using the same BY clause. The syntax of the COPY ORDER BY follows the typical SQL ORDER BY clause, but the column identifiers that you can specify are restricted. You can specify only actual table columns when you use the COPY ORDER BY clause.
execute(copy table t_new
       from t_old
       order by x, y desc, z asc)
  by sasspds;