SPD Server SQL Extensions

SPD Server SQL furnishes several extensions to the SQL language. These extensions are not a part of standardized industry SQL, but they are an integral part of the SPD Server system. These extensions enable systemic data management unique to the SPD Server. The SPD Server SQL uses a special pass-through facility that uses these extensions for data manipulation and extraction operations. The following section discusses the roles of the following extensions, which enable SPD Server's SQL pass-through facility. In addition to the extensions in this section, users should know Libref Statements and Libref Clauses.

BEGIN and END ASYNC OPERATION Statements

Overview of BEGIN and END ASYNC Operation Statements

Asynchronous statements are a useful technique you can use to harness the multi-processor power of SPD Server. Asynchronous statements enable execution of multiple, independent threads at the same time. The BEGIN ASYNC OPERATION and END ASYNC OPERATION statements enable you to delimit one or more statements for asynchronous, parallel execution. Since the statements execute in parallel, they must not depend on another, because there is no way to guarantee which statement will finish before another statement executes. SPD Server software initiates thread execution according to the order of the statements in the block.
Usage:
execute ([ BEGIN | END ] ASYNCH OPERATION); 

Illegal ASYNC Block Statements

The statements in this Illegal ASYNC Block example have illegal interdependencies and cannot be expected to work correctly:
     /* Example of Illegal ASYNC Block Code   */

           PROC SQL;
              connect to sasspds
                 (dbq="my-domain"
                  server=host.port
                  user='user-name'
                  password='user-password'
                  other connection options);

              execute(begin async operation)
                 by sasspds;

              execute(create table T1 as
                 select *
                 from SRC1)
                 by sasspds;

              execute(create unique index I1 on
                 T1(a,b))
                 by sasspds;

              execute(end async operation)
                 by sasspds;

              disconnect from sasspds;
           quit;
The example violates the interdependency rule. The create index statement assumes table T1 exists and is complete. However, table T1 is created from table SRC1, and might not be complete before the asynchronous create index statement executes. Hence, index I1 is dependent on a complete table T1. The resultant data would not be reliable. The purpose of this example is to illustrate the concept of interdependency, and how not to write an ASYNC block.

Legal ASYNC Block Statements

The statements in this Legal ASYNC Block example have no interdependencies.
      /*  Example of Legal ASYNC Block Code            */
      /* Creates some tables in the first ASYNC block  */
      /*                                               */

         PROC SQL;
            connect to sasspds
              (dbq="path1"
               server=host.port
               user='anonymous');

            execute(begin async operation)
               by sasspds;

            execute(create table state_al as
              select *
              from allstates
              where state='AL')
              by sasspds;

            execute(create table state_az as
              select *
              from allstates
              where state='AZ')
              by sasspds;
              ...

            execute(create table state_wy as
               select *
               from allstates
               where state='WY')
               by sasspds;

            execute(end async operation)
              by sasspds;

      /*                                                */
      /* Create some indexes in the second ASYNC block  */
      /*                                                */

            execute(begin async operation)
              by sasspds;

            execute(create index county on
                    state_al(county))
              by sasspds;

            execute(create index county on
                    state_az(county))
              by sasspds;
              ...

            execute(create index county on
                    state_wy(county))
              by sasspds;

            execute(end async operation)
              by sasspds;

           disconnect from sasspds;
          quit;
Why does the second example work correctly? First, each table is created independently. Second, there is a 'synchronization point': the first END ASYNC operation. This point ensures that all the tables are created before the second ASYNC statement block begins. (You can also achieve results that are similar to this example by using the LOAD statement).

Using Librefs in an ASYNC Block Statement

To refer to a two-part table name inside an ASYNC block, you must re-execute the libref statement issued before entering the block. Conversely, if you issue a libref statement inside the ASYNC block, it does not extend outside the ASYNC block. An ASYNC block creates a distinct scope for the libref. To work correctly, a libref statement must be located inside the ASYNC block, and the libref statement must precede the first SQL statement that references it.
      /*  Example of Legal Code using LIBREFs in an ASYNC Block  */
      /*  Create some tables in the first ASYNC block            */

         PROC SQL;
            connect to sasspds
              (dbq="path1"
               server=host.port
               user='anonymous');

            execute(begin async operation)
               by sasspds;

            execute(libref path1 engopt='dbq="path1"
               server=host.port
               user="anonymous"')
               by sasspds;

            execute(libref path2 engopt='dbq="path1"
               server=host.port
               user="anonymous"')
               by sasspds;

            execute(create table path1.southeast as
               select a.customer_id,
                      a.region,
                      b.sales
               from   path1.customer a,
                      path2.orders b
               where  a.customer_id = b.customer_id
               and    a.region='SE')
               by sasspds;

                     ....

             execute(create table path1.northeast as
                 select a.customer_id,
                        a.region,
                        b.sales
                 from   path1.customer a,
                        path2.orders b
                 where  a.customer_id = b.customer_id
                 and    a.region='NE')
                 by sasspds;

            execute(end async operation)
               by sasspds;

            disconnect from sasspds;
         quit;

Using SQL Options in an ASYNC Block Statement

SPD Server SQL options must be set globally for all execute statements in the ASYNC block. These options must be set using an execute statement before the BEGIN ASYNC operation. This example uses code blocks from the preceding example to show how to print a method tree without executing the SQL.
      /*                                               */
      /*  Example of Legal SQL Options in ASYNC Block  */
      /*                                               */

         PROC SQL;
             connect to sasspds
                (dbq="path1"
                 server=host.port
                 user='anonymous');

             execute(reset noexec _method)
               by sasspds;

             execute(begin async operation)
               by sasspds;

             execute(libref path1
                engopt='dbq="path1"
                server=host.port
                user="anonymous"')
                by sasspds;

             execute(libref path2
                engopt='dbq="path1"
                server=host.port
                user="anonymous"')
                by sasspds;

             execute(create table path1.southeast as
                select a.customer_id,
                       a.region,
                       b.sales
                from   path1.customer a,
                       path2.orders b
                where  a.customer_id = b.customer_id
                and    a.region='SE')
                by sasspds;

                     ....

             execute(create table path1.northeast as
                select a.customer_id,
                       a.region,
                       b.sales
                from   path1.customer a,
                         path2.orders b
                where  a.customer_id = b.customer_id
                and    a.region='NE')
                by sasspds;

             execute(end async operation)
               by sasspds;

             disconnect from sasspds;
         quit;

LOAD Statement

The LOAD statement enables table creation (with one or more indexes) with a single statement. The data source for the statement is a SELECT clause. The SELECT list in the clause defines the columns for the new table. All characteristics of the columns (variables) in the select list are preserved, becoming permanent attributes of the new table's column definitions. The target table for the LOAD TABLE statement must be on the local machine.
The LOAD table is not allowed if the source table and the result table have different domain backup credentials. The source table and result table must both be BACKUP=yes or both be BACKUP=no domains. If the LOAD is not allowed, you must use PROC COPY to copy the tables, or use the SQL CREATE TABLE <tablename> AS SELECT statement.
Usage:
execute (LOAD TABLE table spec
    < WITH index spec
    < WITH index spec>>
  by sasspds;   
In the following example, each execute statement creates a table for one U.S. state using a global table called STATE that contains many states. The first execute statement uses LOAD to create table STATE_AL (Alabama), and creates an index on the COUNTY column. The structure of the state table STATE_AL and the data in the state table both come from the global table STATE. The data in STATE_AL is the subset of all records from STATE where the STATE column variable equals 'AL'. LOAD creates a table for states from Alabama to Wyoming, with each state's table indexed by county and mirroring the structure of the parent table STATE.
    execute(load table state_al
        with index county
        on (county) as
        select *
        from state
        where state='AL')
      by sasspds;

    execute(load table state_az
        with index county
        on (county) as
        select *
        from state
        where state='AZ')
      by sasspds;

        ...

    execute(load table state_wy
        with index county
        on (county) as
        select *
        from state
        where state='WY')
      by sasspds;
In general, the LOAD statement is faster than a corresponding create table / create index statement pair, because it builds the table and associated index(es) asynchronously using parallel processing.

COPY Statement

The COPY table statement creates a copy of an SPD Server table with or without the table index(es). For the COPY table statement to work, the source and target tables must be on the local machine. By default, the software creates an index(es). The COPY table statement is faster than either of the following CREATE and LOAD statements:
create table ...
as select ...
create index ...
or
load table ...
with index...
as select ...
The COPY statement is faster than the two above statements because it uses a more direct access path than the SQL SELECT clause when accessing the data.
The COPY table is not allowed if the source table and the result table have different domain backup credentials. The source table and result table must both be BACKUP=yes or both be BACKUP=no domains. If the COPY is not allowed, you must use PROC COPY to copy the tables, or use the SQL CREATE TABLE <tablename> AS SELECT statement.
In the example that follows, two new tables are created: T_NEWand T2_NEW. The first table, T_NEW, is created with index structures identical to table T_NEW. The second table, T2_NEW, is unindexed regardless of the structure of table T2_OLD.
     execute(copy table t_new
             from t_old)
        by sasspds;

     execute(copy table t2_new
             from t2_old
             without indexes)
        by sasspds;
The COPY statement also supports an ORDER BY clause that you use to create a new table with a sort order on one or more columns of the new table. While COPY TABLE does not support all of the options of PROC SORT, you can achieve substantial performance gains when creating this ordered table by using COPY with an ORDER BY clause when appropriate.
The next example copies the table T_OLD to T_NEW using the order by clause. The data is ordered by columns: x in ascending order, y in descending order, and z in ascending order. The results are the same if you run PROC SORT on the columns using the same BY clause. The syntax of the COPY ORDER BY follows the normal SQL ORDER BY clause, but the column identifiers that you can specify are restricted. You can specify only actual table columns when using the COPY ORDER BY clause.
execute(copy table t_new
       from t_old
       order by x, y desc, z asc)
  by sasspds;