SAS Scalable Performance Data (SPD) Server Table Options

Contents


Introduction

All SAS users who use LIBNAME access to SAS Scalable Performance Data (SPD) Server should read this chapter.   Most table options also work in SQL pass-through statements.

This chapter presents reference information for the SAS Scalable Performance Data (SPD) Server table options. To specify a table option with LIBNAME access,  place the option value in parentheses after the table name. The option value then specifies processing that applies only to that table. To specify a table option with pass-through access, place the option value in brackets after the table name.  The option value then specifies processing that applies only to that table.  The SAS Scalable Performance Data (SPD) Server table options that follow are grouped by the function of their default value.

When using the options in this chapter, remember that if a table option is used subsequent to a LIBNAME option or macro variable, the value of the table option takes precedence. 


Option for Compatibility with Base SAS Software

SYNCADD=


Specifies when appending to a table whether to apply a single or multiple rows at a time.

Syntax

SYNCADD=YES|NO

Default

NO

Corresponding Macro Variable

SPDSSADD

Related Table Option

UNIQUESAVE=

Use the following arguments:

YES
imitates the behavior of the base SAS engine, applying a single row at a time (synchronously).
 
NO
appends multiple rows at a time (asynchronously).

Description

When SYNCADD= is YES, processing performance is slower. Use this setting only to force the server's append processing to be compatible with Base SAS software processing. That is, when the server encounters a row with a non-unique value, to abort the append operation, back out the transactions just added, and leave the original table on disk.

Example

In this example, when executing the first INSERT statement, PROC SQL permits insertion of the values 'rollback1' and 'rollback2' because the row additions to table A are performed asynchronously. PROC SQL does not get the true completion status at the time it adds a row.

When executing the second INSERT statement, PROC SQL performs a rollback on the INSERT, upon encountering the Add error on 'nonunique', and deletes the rows 'rollback3' and 'rollback4'.

data a;
  input z $ 1-20 x y;
  list;

  datalines;

one                  1 10
two                  2 20
three                3 30
four                 4 40
five                 5 50
;

PROC SQL sortseq=ascii exec noerrorstop;
create unique index comp on a (x, y);
insert into a
  values('rollback1', -80, -80)
  values('rollback2',-90, -90)
  values('nonunique', 2, 20);

insert into a(syncadd=yes)
  set z='rollback3', x=-60, y=-60
  set z='rollback4', x=-70, y=-70
  set z='nonunique', x=2, y=20;
quit;

Options That Affect Disk Space

ASYNCINDEX=


Specifies when creating multiple indexes on a SAS Scalable Performance Data (SPD) Server table whether to create the indexes in parallel.

Syntax

ASYNCINDEX=YES|NO

Default

NO

Corresponding Macro Variable

SPDSIASY

Use the following arguments:

YES
creates the indexes in parallel.
 
NO
creates a single index at a time.

Description

The SAS Scalable Performance Data (SPD) Server can create multiple indexes for a table at the same time. To do this, it launches a single thread for each index created, then processes the threads simultaneously. Although creating indexes in parallel is much faster, the default for this option is NO. The reason is because parallel creation requires additional sort work space which may not be available.

For a complete description of the benefits and tradeoffs of creating multiple indexes in parallel, see the SAS Scalable Performance Data (SPD) Server User's Guide Help section on SPDSIASY=.

Example

Since the disk work space required for parallel index creation is available, specify for SAS Scalable Performance Data (SPD) Server to create, in parallel, the X, Y, and COMP indexes for table A.

PROC DATASETS lib=mydatalib;
   modify a(asyncindex=yes);
   index create x;
   index create y;
   index create comp=(x y);
   quit;

COMPRESS=


Compresses SAS Scalable Performance Data (SPD) Server tables on disk.

Syntax

COMPRESS=YES|NO

Default

NO

Use in Conjunction with Table Option

IOBLOCKSIZE=

Corresponding Macro Variable

SPDSDCMP

Use the following arguments:

YES
performs the run-length compression algorithm SPDSRLLC.
 
NO
performs no table compression.

Description

When COMPRESS= is assigned YES, SAS Scalable Performance Data (SPD) Server compresses newly created tables by 'blocks' based on the algorithm specified. To control the amount of compression, use the table option IOBLOCKSIZE=. This option specifies the number of rows that you want to store in the block. 

Note: Once a compressed table is created, you cannot change its block size. To resize the block, you must PROC COPY the table to a new table, setting IOBLOCKSIZE= to the block size desired for the output table.


PARTSIZE=


Specifies the size of a SAS Scalable Performance Data (SPD) Server table partition.

Syntax

PARTSIZE=n

Default

16 Megabytes

Corresponding Macro Variable

SPDSSIZE=

Affected by LIBNAME option

DATAPATH=

Use the following argument:

n
is the size of the partition in megabytes.

Description

Specifying PARTSIZE= forces the software to partition (split) SAS Scalable Performance Data (SPD) Server tables at the given size. The actual size is computed to accommodate the largest number of rows that will fit in the specified size of n Mbytes.

Use this option to improve performance of WHERE Clause evaluation on non-indexed table columns and on SQL GROUP_BY processing.  By splitting the data portion of a Scalable Platform Data Server table at fixed-sized intervals, the software can introduce a high degree of scalability for these operations. The reason:  it can launch threads in parallel to perform the evaluation on different partitions of the table, without the threat of file access contention between the threads. There is, however, a price for the table splits: an increased number of files, which are required to store the rows of the table.

Ultimately, scalability limits using PARTSIZE= depend on how you structure DATAPATH=, a LIBNAME option discussed in the documentation on Scalable Performance Data Server LIBNAME Options. Specifically, the limits depend on how you configure and spread the DATAPATH= file systems across striped volumes. You should spread each individual volume's striping configuration across multiple disk controllers/SCSI channels in the disk storage array. The goal for the configuration is, at the hardware level, to maximize parallelism during data retrieval.

The PARTSIZE= specification is limited by MINPARTSIZE=, a SAS Scalable Performance Data (SPD) Server parameter maintained by the SAS Scalable Performance Data (SPD) Server administrator. MINPARTSIZE= ensures that an over-zealous SAS user does not create arbitrarily small partitions, thereby generating a large number of files. The default for MINPARTSIZE= is 16 Mbytes and probably should not be lowered much beyond this value.

Note: The PARTSIZE value for a table cannot be changed after a table is created. To change the PARTSIZE, you must PROC COPY the table and use a different PARTSIZE option setting on the new (output) table.

Example

Using PROC SQL, extract a set of rows from an existing table to create a non-indexed table with a partition size of 32 Mbytes in a SAS job:

PROC SQL;
create table SPDSCEN.HR80SPDS(partsize=32)
  as select
    state,
    age,
    sex,
    hour89,
    industry,
    occup
   from SPDSCEN.PRECS
   where hour89 > 40;
quit;

Options to Enhance Performance

BYNOEQUALS=

Specifies the output order of table rows with identical values for the BY column.

Syntax

BYNOEQUALS=YES | NO

Use the following arguments:

YES
does not guarantee the output order of table rows with identical values in a BY clause.
 
NO
guarantees the output order of table rows with identical values in a BY clause will be the relative table position of the rows from the input table. This is the default.

Example

Specify for SAS Scalable Performance Data (SPD) Server in the ensuing BY-column operation to output rows with identical values in the key column randomly.

data sport.racquets(index=(string));
   input
     raqname $20.
     @22 weight
     @28 balance $2.
     @32 flex
     @36 gripsize
     @42 string $3.
     @47 price
     @55 instock;

   datalines;

Solo Junior          10.1   N   2  3.75  syn   50.00   6
Solo Lobber          11.3   N  10  5.5   syn  160.00   1
Solo Queensize       10.9  HH   6  5.0   syn  130.00   3
Solo Kingsize        13.1  HH   5  5.6   syn  140.00   3
;

data sport.racqbal(bynoequal=yes);
  set sport.racquets;
  by balance;
run;

IOBLOCKSIZE=

Specifies the number of rows in a block to be stored in or read from a SAS Scalable Performance Data (SPD) Server table.

Syntax

IOBLOCKSIZE=n

Default

4096

Use in Conjunction with 

Macro Variable  SPDSDCMP=   or Table Options  COMPRESS=  or ENCRYPT= .

Use the following argument:

n
is the size of the block.

Description

The software reads and stores a server table in blocks. IOBLOCKSIZE= is useful on compressed or encrypted tables.  SAS Scalable Performance Data (SPD) Server software does not use IOBLOCKSIZE= on noncompressed or nonencrypted tables.

For tables that you compress or encrypt, using either the option COMPRESS= or the macro variable SPDSDCMP=, the IOBLOCKSIZE= specification determines the number of rows to include in the block. The specification applies to block compression as well as data I/O to and from disk. The IOBLOCKSIZE= value affects the table's organization on disk.

When using SAS Scalable Performance Data (SPD) Server table compression or encryption, specify an IOBLOCKSIZE= value that complements how the data is to be accessed, sequentially or randomly. Sequential access or operations requiring full table scans favor a large block size, for example 64K. In contrast, random access favors a smaller block size, for example 8K. 

Example

A huge company mailing list is processed sequentially. Specify a block size for compression that is optimal for sequential access.

 /* IOblocksize set to 64K */

data sport.maillist(ioblocksize=65536 compress=yes);
   input
     name $ 1-20
     address $ 21-57
     phoneno $ 58-69
     sex $71;

   datalines;

Douglas, Mike       3256 Main St., Cary, NC 27511        919-444-5555 M
Walters, Ann Marie  256 Evans Dr., Durham, NC 27707      919-324-6786 F
Turner, Julia       709 Cedar Rd., Cary, NC 27513        919-555-9045 F
Cashwell, Jack      567 Scott Ln., Chapel Hill, NC 27514 919-533-3845 M
Clark, John         9 Church St.,  Durham, NC 27705       919-324-0390 M
;
run;

NETPACKSIZE=

Specifies the size of the SAS Scalable Performance Data (SPD) Server network data packet.

Syntax

NETPACKSIZE=size-of-packet

Use the following argument:

size-of-packet
is the size of the network packet in bytes.

Description

This option controls the size of the buffer used for data transfer between SAS Scalable Performance Data (SPD) Server and a SAS client. The default is 32K bytes. The buffer size is relative to the size of a table row. It cannot be less than the size of a single row. Packet size must be equal to some multiple of the table rows. If it is not, SAS Scalable Performance Data (SPD) Server rounds up the size specified. For example, if the packet buffer size is 4096 bytes and the row size is 3072, the software rounds up the buffer size to 6144.

Select a packet size to complement the bandwidth of the network it must travel through. An optimum size will flow the data continuously without significant pauses between packets.

Example

Create a 12K buffer in the memory of the server to send three rows from MYTABLE in each network packet. (The row size in MYTABLE is 4K.)

data mylib.mytable (netpacksize=12288);

SEGSIZE=

Specifies the size of the segment for an index file associated with a SAS Scalable Performance Data (SPD) Server table.

Syntax

SEGSIZE=number

Use the following argument:

number
is the number of table rows to include in the index segment.

Description

The minimum SEGSIZE= value is 1024 table rows. The default value is 8192 table rows.   The size of the index segment corresponds to the structure of the table and cannot be changed after the table is created.

Example

Specify a segment size of 64K for MYLIB.MYTABLE.

data mylib.mytable (segsize=65536);

Note: Tests show that increasing the size of the segment does not significantly increase performance.


Option to Test Performance

NOINDEX=

Specifies whether to use the table's indexes when processing WHERE Clauses.

Syntax

NOINDEX=YES|NO

Default

NO

Use the following arguments:

YES
ignores indexes when processing WHERE Clauses.
 
NO
uses indexes when processing WHERE Clauses.

Description

Set NOINDEX= to YES to test the effect of indexes on performance or for specific processing. Do not use YES routinely for normal processing.

Example

We created an index for the SEX column but decide to test whether it is necessary for our PROC PRINT processing. Specify for the server not to use the index.

data sport.maillist;
   input
     name $ 1-20
     address $ 21-57
     phoneno $ 58-69
     sex $71;

   datalines;

Douglas, Mike       3256 Main St., Cary, NC 27511        919-444-5555 M
Walters, Ann Marie  256 Evans Dr., Durham, NC 27707      919-324-6786 F
Turner, Julia       709 Cedar Rd., Cary, NC 27513        919-555-9045 F
Cashwell, Jack      567 Scott Ln., Chapel Hill, NC 27514 919-533-3845 M
Clark, John         9 Church St.,  Durham, NC 27705       919-324-0390 M
;

PROC DATASETS lib=sport nolist;
  modify maillist;
  index create sex;
quit;

/*Turn on the macro variable SPDSWDEB */
/* to show that the index is not used */
/* used during the table processing.  */

%let spdswdeb=YES;

title "All Females from Current Mailing List";
PROC PRINT data=sport.maillist(noindex=yes);
where sex="F";
run;

Options for WHERE Clause Evaluations

MINMAXVARLIST=


Creates an index that documents the minimum and maximum values of specified variables.  SAS Scalable Performance Data (SPD) Server WHERE Clause evaluations use MINMAXVARLIST= indexes to include or eliminate member tables in a SAS Scalable Performance Data (SPD) Server dynamic cluster table from SQL evaluation scans..

Syntax

MINMAXVARLIST=(varname1, varname2, ... , varnameN) 

Use the following argument:

varname1, varname2, ... , varname N
are SAS Scalable Performance Data (SPD) Server table variable names.

Description

The primary purpose of the MIINMAXVARLIST= table option is for use with SAS Scalable Performance Data (SPD) Server dynamic cluster tables where specific members in the dynamic cluster contain a set or range of values, such as sales data for a given month. When a SAS Scalable Performance Data (SPD) Server SQL sub setting where- clause specifies specific months from a range of sales data, the WHERE planner checks the min/max indexes. Based on the min/max index information, the SAS Scalable Performance Data (SPD) Server WHERE planner includes or eliminates member tables in the dynamic cluster for evaluation.

MINMAXVARLIST= uses the list of columns you submit to build an index.  The MINMAXVARLIST= index contains only the minimum and maximum values for each column. The WHERE Clause planner uses the index to filter SQL predicates quickly, and to include or eliminate member tables belonging to the cluster table from the evaluation.

Although the MINMAXVARLIST= table option is primarily intended for use with dynamic clusters, it also works on standard SAS Scalable Performance Data (SPD) Server tables. MINMAXVARLIST=  can help reduce the need to create many indexes on a table, which can save  valuable resources and space.

Example

%let domain=path3 ;
%let host=kaboom ;
%let port=5201 ;

libname &domain sasspds "&domain"
   server=&host..&port
   user='anonymous' ;

/* Create three tables called */
/* xy1, xy2, and xy3.         */

data &domain..xy1(minmaxvarlist=(x y));
  do x = 1 to 10;
  do y = 1 to 3;
  output;
  end;
end;
run;

data &domain..xy2(minmaxvarlist=(x y));
  do x = 11 to 20;
  do y = 4 to 6 ;
  output;
  end;
end;
run;

data &domain..xy3(minmaxvarlist=(x y));
  do x = 21 to 30;
  do y = 7 to 9 ;
  output;
  end;
end;
run;


/* Create a dynamic cluster table */
/* called cluster_table out of    */
/* new tables xy1, xy2, and xy3   */

PROC SPDO library=&domain ;
   cluster create cluster_table
      mem=xy1
      mem=xy2
      mem=xy3
      maxslot=10;
quit;


/* Enable WHERE evaluation to see  */
/* how the SQL planner selects     */
/* members from the cluster. Each  */
/* member is evaluated using the   */
/* min-max index.                  */

%let SPDSWDEB=YES;


/* The first member has true rows  */

PROC PRINT data=&domain..cluster_table ;
   where x eq 3
   and y eq 3;
run;


/* Examine the other tables */

PROC PRINT data=&domain..cluster_table ;
   where x eq 3
   and y eq 3 ;
run;

PROC PRINT data=&domain..cluster_table ;
   where x eq 3
   and y eq 3;
run;

PROC PRINT data=&domain..cluster_table ;
   where x between 1 and 10
   and y eq 3;
run;

PROC PRINT data=&domain..cluster_table ;
   where x between 11 and 30
   and y eq 8 ;
run;


/* Delete the dynamic cluster table. */

PROC DATASETS lib=&domain nolist;
   delete cluster_table ;
quit ; 

THREADNUM=


Specifies the number of threads to be used for WHERE Clause evaluations.

Syntax

THREADNUM=n

Default

THREADNUM= is set equal to the value of the MAXWHTHREADS server parameter.

Used in Conjunction with SAS Scalable Performance Data (SPD) Server Parameter

MAXWHTHREADS

Corresponding Macro Variable

 SPDSTCNT=

Use the following argument:

n
is the number of threads.

Description

THREADNUM= allows you to specify the thread count the SAS Scalable Performance Data (SPD) Server should use when performing a parallel WHERE Clause evaluation.

Use this option to explore scalability for WHERE Clause and GROUP_BY evaluations in non-production jobs.  If you use this option for production jobs, you are likely to lower the level of parallelism that is applied to those clause evaluations.

THREADNUM= works in conjunction with MAXWHTHREADS, a configurable system parameter. MAXWHTHREADS imposes an upper limit on the consumption of system resources. The default value of MAXWHTHREADS is dependent on your operating system.  Your SAS Scalable Performance Data (SPD) Server administrator can change the default value for MAXWHTHREADS.

If you do not use THREADNUM=, the software provides a default thread number, up to the value of MAXWHTHREADS as required. If you use THREADNUM=, the value that you specify is also constrained by the MAXWHTHREADS value.

The THREADNUM= value applies both to parallel table scans (EVAL2 strategy), parallel indexed evaluations (EVAL1 strategy), parallel BY-clause processing, and parallel GROUP_BY evaluations. The SAS Scalable Performance Data (SPD) Server User's Guide Help section on Optimizing SAS Scalable Performance Data (SPD) Server Performance, contains more information on WHERE Clause evaluation.

Example

The SAS Scalable Performance Data (SPD) Server administrator set MAXWHTHREADS=128 in the SAS Scalable Performance Data (SPD) Server's parameter file. Explore the effects of parallelism on a given query by using the following SAS macro:

%macro dotest(maxthr);
%do nthr=1 %to &maxthr;
   data _null_;
     set SPDSCEN.PRECS(threadnum=&nthr);
     WHERE
       occup='022'
       and state in('37','03','06','36');
   run;
%mend dotest;

WHERENOINDEX=

Specifies a list of indexes to exclude when making WHERE Clause evaluations.

Syntax

WHERENOINDEX=(name1 name2...)

Use the following arguments:

(name1 name2...)
a list of index names that you wish to exclude from the WHERE planner. 

Example

We have a table PRECS with indexes defined as follows:

PROC DATASETS lib=spdscen;
modify precs(bitindex=(hour89));
index create
  stser=(state serialno)
  occind=(occup industry)
  hour89;
quit;

When evaluating the next query, we want the SAS Scalable Performance Data (SPD) Server to exclude from consideration indexes for both the STATE and HOUR89 columns.

In this case, we know that our AND combination of the predicates for the OCCUP and INDUSTRY columns will produce a very small yield. Few rows satisfy the respective predicates. To avoid the extra index I/O (machine time) that the query requires for a full-indexed evaluation, use the following SAS code:

PROC SQL;
create table hr80spds
  as select
    state,
    age,
    sex,
    hour89,
    industry,
    occup
  from spdscen.precs(wherenoindex=(stser hour89))
    where occup='022'
    and state in('37','03','06','36')
    and industry='012'
    and hour89 > 40;
quit;

Note: Specify index names in the WHERENOINDEX list, not the column names. The example excludes both the composite index for the STATE column STSER and the simple index HOUR89 from consideration by the  WHINIT WHERE planner.


Options for Other Functions

BYSORT=

Perform an implicit automatic sort when SAS Scalable Performance Data (SPD) Server encounters a BY clause for a given table.

Syntax

BYSORT=YES | NO

Use the following arguments:

YES
sorts the data based on the BY columns and returns the sorted data to the SAS client. This powerful capability means the user does not have to sort data using a PROC SORT statement before using a BY clause.
 
NO
does not sort the data based on the BY columns. This may be desirable if a DATA step BY clause has a GROUPFORMAT option or if a PROC step reports grouped and formatted data.

Description

The default is YES. The NO argument means the table must have been previously sorted by the requested BY columns. The NO argument allows grouped data to maintain their precise order in the table. A YES argument groups the data correctly but possibly in a different order from the order in the table.

Example 1 - Group Formatting with BYSORT=

libname sport sasspds 'mylib'
   host='samson'
   user='user19'
   passwd='dummy2';

PROC FORMAT;
   value dollars
     0-99.99="low"
     100-199.99="medium"
     200-1000="high";
run;

data sport.racquets;
   input
     raqname $20.
     @22 weight
     @28 balance $2.
     @32 flex
     @36 gripsize
     @42 string $3.
     @47 price
     @55 instock;

   datalines;
Solo Junior          10.1   N   2  3.75  syn   50.00   6
Solo Lobber          11.3   N  10  5.5   syn  160.00   1
Solo Queensize       10.9  HH   6  5.0   syn  130.00   3
Solo Kingsize        13.1  HH   5  5.6   syn  140.00   3
;
PROC PRINT data=sport.racquets (bysort=yes);
   var raqname instock;
   by price;
   format price dollars.;
title 'Solo Brand Racquets by Price Level';
run;

Output 4. 1 Report Output with BYSORT=

Solo Brand Racquets by Price Level

---------------------------- Price=low ---------------------------

OBS                       RAQNAME                INSTOCK

1                             Solo Junior                            6

-------------------------- Price=medium ------------------------

OBS                       RAQNAME                INSTOCK

  3                        Solo Queensize                         3  

4                        Solo Kingsize                            3

2                        Solo Lobber                             1

 

Example 2 - Group Formatting without BYSORT=

PROC PRINT data=sport.racquets (bysort=no);
   var raqname instock;
   by price;
   format price dollars.;
title 'Solo Brand Racquets by Price Level';
run;

Output 4. 2 Report Output without BYSORT=

Solo Brand Racquets by Price Level

---------------------------- Price=low ---------------------------

OBS                       RAQNAME                INSTOCK

1                             Solo Junior                            6

-------------------------- Price=medium ------------------------

OBS                       RAQNAME                INSTOCK

2                        Solo Lobber                             1

  3                        Solo Queensize                         3  

4                        Solo Kingsize                            3


ENDOBS=

Specifies the end row (observation) number in a user-defined range for the processing of a given table.

Syntax

ENDOBS=n

Use the following argument:

n
is the number of the end row.

Description

By default, SAS Scalable Performance Data (SPD) Server processes the entire table unless the user specifies a range of rows with the STARTOBS= and ENDOBS= options. If the STARTOBS= option is used without the ENDOBS= option, the implied value of ENDOBS= is the end of the table. When both options are used together, the value of ENDOBS= must be greater than STARTOBS=.

In contrast to the Base SAS software options FIRSTOBS= and OBS=, the STARTOBS= and ENDOBS= SAS Scalable Performance Data (SPD) Server options can be used for WHERE Clause processing in addition to table input operations.

Example

Print only rows 2-4 of the table INVENTORY.OLD_AUTOS.

libname inventory sasspds 'conversion_area'
   server=husky.5105
   user='siteusr1'
   prompt=yes;

data inventory.old_autos;
   input
     year $4.
     @6 manufacturer $12.
     model $10.
     body_style $5.
     engine_liters
     @39 transmission_type $1.
     @41 exterior_color $10.
     options $10.
     mileage conditon;

   datalines;

1971 Buick       Skylark   conv  5.8  A  yellow    00000001 143000 2
1982 Ford        Fiesta    hatch 1.2  M  silver    00000001  70000 3
1975 Lancia      Beta      2door 1.8  M  dk blue   00000010  80000 4
1966 Oldsmobile  Toronado  2door 7.0  A  black     11000010 110000 3
1969 Ford        Mustang   sptrf 7.1  M  red       00000111 125000 3
;

PROC PRINT data=inventory.old_autos (startobs=2 endobs=4);
run;

Output 4. 3 Data in the Printed Output

1982   Ford               Fiesta           hatch    1.2    M   silver         00000001     70000     3

1975   Lancia            Beta             2door   1.3    M   dk blue      00000010     80000    4

1966   Oldsmobile   Toronado      2door   7.0    A    black        11000010    110000    3


STARTOBS=

Specifies the start row (observation) number in a user-defined range for the processing of a given table.

Syntax

STARTOBS=n

Use the following argument:

n
is the number of the start row.

Description

By default, SAS Scalable Performance Data (SPD) Server processes the entire table unless the user specifies a range of rows with the STARTOBS= and ENDOBS= options. If the ENDOBS= option is used without the STARTOBS= option, the implied value of STARTOBS= is 1. When both options are used together, the value of STARTOBS= must be less than ENDOBS=.

In contrast to the Base SAS software options FIRSTOBS= and OBS=, the STARTOBS= and ENDOBS= SAS Scalable Performance Data (SPD) Server options can be used for WHERE Clause processing in addition to table input operations.

Example

Print only rows 2-4 of the table INVENTORY.OLD_AUTOS.

libname inventory sasspds 'conversion_area'
   server=husky.5105
   user='siteusr1'
   prompt=yes;

data inventory.old_autos;
   input
     year $4.
     @6 manufacturer $12.
     model $10.
     body_style $5.
     engine_liters
     @39 transmission_type $1.
     @41 exterior_color $10.
     options $10.
     mileage conditon;

   datalines;
1971 Buick       Skylark   conv  5.8  A  yellow    00000001 143000 2
1982 Ford        Fiesta    hatch 1.2  M  silver    00000001  70000 3
1975 Lancia      Beta      2door 1.8  M  dk blue   00000010  80000 4
1966 Oldsmobile  Toronado  2door 7.0  A  black     11000010 110000 3
1969 Ford        Mustang   sptrf 7.1  M  red       00000111 125000 3
;

proc print data=inventory.old_autos (startobs=2 endobs=4);
run;

UNIQUESAVE=


Specifies to save rows with non-unique key values (the rejected rows) to a separate table when appending data to tables with unique indexes.

Syntax

UNIQUESAVE=YES|NO|REP

Default

  NO

Complements the Table Option

SYNCADD=

Used in Conjunction with Macro Variable

SPDSUSDS=

Corresponding Macro Variable:  

SPDSUSAV=

Use the following arguments:

YES
writes rejected rows to a separate, system-created table file which can be accessed by a reference to the macro variable SPDSUSDS=.
NO
does not write rejected rows to a separate table, that is, ignores non-unique key values. 

REP
when updating a master table from a transaction table, where the two tables share identical variable structures, the UNIQUESAVE=REP option replaces the row updated row in the master table instead of appending a row to the master table. The REP option only functions in the presence of a /UNIQUE index on the MASTER table. Otherwise the REP setting is ignored..

Description

SYNCADD= is defaulted to NO. When NO, table appends are 'pipelined', meaning that the server data is sent in a stream a block at a time (see table option NETPACKSIZE=). While pipelining is faster than a synchronous append, SAS reports the results of the append operation differently for these two modes.

When applying only a single row (SYNCADD=NO), SAS returns a status code for each ADD operation. The application can determine the next action based upon the status value. If a row is rejected due to containing a non-unique value for a unique index, the user receives a status message. In contrast, when data is pipelined (SYNCADD=YES), SAS returns a status code only after all the rows are applied to a table. As a consequence, the user does not know which rows have been rejected.

To enjoy the performance of data pipelining but still retain the rejected rows, use the UNIQUESAVE= option. When set to YES, SAS Scalable Performance Data (SPD) Server will save any rows that are rejected to a hidden SAS table.

When using this option, SAS returns the name of the hidden table containing the rejected rows in the macro variable SPDSUSDS.  If you want to report the contents of the table, reference SPDSUSDS= .

Note: If SYNCADD= YES is set, data pipelining is overridden and the data is processed synchronously. In this situation, the UNIQUESAVE= option is not relevant and, if set, is ignored.

Example 1
We want to append two tables, NAMES2 and NAMES3, which contain employees' names, to the NAMES1 table. Before performing our append, we create an index on the NAME column in NAMES1, declaring the index unique.

Specify for SAS Scalable Performance Data (SPD) Server, during the append operation, to store rows found with duplicate employee names to a separate table file generated by the macro variable SPDSUSDS=.

Use a %PUT statement to display the table name for SPDSUSDS=. Then request a printout of the duplicate rows to review later.

data employee.names1;
input name $ exten;
datalines;
Jill 4344
Jack 5589
Jim  8888
Sam  3334
;
run;

data employee.names2;
input name $ exten;
datalines;
Jack  4443
Ann   8438
Sam   3334
Susan 5321
Donna 3332
;
run;

data employee.names3;
input name $ exten;
datalines;
Donna 3332
Jerry 3268
Mike  2213
;
run;

PROC DATASETS lib=employee nolist;
  modify names1;
  index create name/unique;
quit;

PROC APPEND data=employee.names2
            out=employee.names1(uniquesave=yes); run;

title 'The NAMES1 table with unique names
       from NAMES2';

PROC PRINT data=employee.names1;
run;

%put Set the macro variable spdsusds to &spdsusds;

title 'Duplicate (non-unique) name rows found in
       NAMES2';

PROC PRINT data=&spdsusds;
run;

PROC APPEND data=employee.names3
   out=employee.names1(uniquesave=yes);
run;

The SAS log provides the messages:

WARNING: Duplicate values not allowed on index NAME for
         file EMPLOYEE.NAMES1. (Occurred 2 times.)
NOTE: Duplicate records have been stored in file
      EMPLOYEE._30E3FD5.

And, an extract from our PROC PRINT shows:

The NAMES1 table with unique names from NAMES2

         OBS    NAME     EXTENs

          1     Jill      4344
          2     Jack      5589
          3     Jim       8888
          4     Sam       3334
          5     Ann       8438
          6     Susan     5321
          7     Donna     3332

Duplicate (non-unique) name rows found in NAMES2

        OBS    NAME     EXTEN    XXX00000

          1     Jack      4443      NAME
          2     Sam       3334      NAME

 

Example 2

Use the UNIQUESAVE=REP option to perform an update / append case using PROC APPEND instead of a DATA step:

/* A MASTER table to update. ID */
/* will get a UNIQUE index      */

  DATA SPDS.MASTER;
    INPUT ID VALUE $;
    CARDS;
     1 one
     2 two
     3 three
  ;

  PROC DATASETS LIB=SPDS;
    MODIFY MASTER;
    INDEX CREATE ID/UNIQUE;
  QUIT;

  /* A transaction table TRANS to use to   */
  /* drive update/appends to MASTER        */

  DATA SPDS.TRANS;
    INPUT ID VALUE $;
     1 ONE
     3 THREE
     4 FOUR
     4 FOUR*
  ;

  /* Use of UNIQUESAVE=REP to update/append  */
  /* TRANS rows to MASTER based on whether   */
  /* TRANS records have an ID column that    */
  /* matches an existing row from the MASTER */
  /* table. Update MASTER rows with a match, */
  /* otherwise append TRANS row to MASTER    */

  PROC APPEND DATA=SPDS.TRANS
    OUT=SPDS.MASTER(UNIQUESAVE=REP);
  run;
Output of the resulting MASTER table would look like:
   Obs       ID      VALUE

    1         1      ONE
    2         2      two
    3         2      THREE
    4         4      FOUR*

VERBOSE=

Provides details of all indexes and ACL information associated with a SAS Scalable Performance Data (SPD) Server table.

Syntax

VERBOSE= YES | NO

Use the following arguments:

YES
requests detail information for the indexes, ACLs, and other SAS Scalable Performance Data (SPD) Server table values. This argument must be used with the CONTENTS procedure.
NO
suppresses detail information for the indexes, ACLs, and other SAS Scalable Performance Data (SPD) Server table values. This is the default.

Example

Request details of all the indexes for the table TEMP1 in the domain SPDS44.

PROC CONTENTS data=SPDS44 (verbose=yes);
run; 
 
                          The CONTENTS Procedure

Data Set Name           SPDS44.TEMP1                            Observations            1000
Member Type             DATA                                    Variables               2
Engine                  SASSPDS                                 Indexes                 2
Created                 Tuesday, May 10, 2005 10:00:02 AM       Observation Length      16
Last Modified           Tuesday, May 10, 2005 11:01:36 AM       Deleted Observations    0
Protection                                                      Compressed              NO
Data Set Type                                                   Sorted                  NO
Label
Data Representation     Default
Encoding                Default


                        Engine / Host Dependent Information

                        Blocking Factor (obs/block)     2047
                        ACL Entry                       NO
                        ACL User Access(R,W,A,C)        (Y,Y,Y,Y)
                        ACL User Name                   ANONYMOU
                        ACL Owner Name                  ANONYMOU
                        Data Set is Ranged              NO
                        Alphabetic List of Index Info   .
                        Bitmap index (No Global Index ) i
                        Keyvalue (Min)                  1
                        Keyvalue (Max)                  100
                        # of Discrete values            100
                        Bitmap index (No Global Index ) j
                        Keyvalue (Min)                  1
                        Keyvalue (Max)                  10
                        # of Discrete values            10
                        Data Partsize                   16777216


                        Alphabetic List of Variables and Attributes

                        *       Variable        Type    Len

                        1       i               Num     8
                        2       j               Num     9


                        Alphabetic List of Indexes and Attributes

                                                # of
                                                Unique
                        *       Index           Values

                        1       i               100
                        2       j               10

Options for Security  

ENCRYPT=

Encrypts SAS Scalable Performance Data (SPD) Server tables on disk.  Encryption is a security mechanism that protects table contents from users who have system access to raw SAS Scalable Performance Data (SPD) Server tables.  Access to tables is normally controlled by SAS Scalable Performance Data (SPD) Server ACLs. The SAS Scalable Performance Data (SPD) Server Administrator's Guide contains detailed information about using SAS Scalable Performance Data (SPD) Server ACLs to control access to tables.

When the ENCRYPT= option setting is set to YES, SAS Scalable Performance Data (SPD) Server encrypts newly created tables by blocks. To control the amount of encryption per block, use the table option IOBLOCKSIZE=. The IOBLOCKSIZE= option specifies the number of rows to be encrypted in each block.

Syntax

ENCRYPT= YES | NO

Arguments

YES
encrypts the data set. The encryption method uses passwords. At a minimum, you must specify the READ= or the PW= data set option at the same time that you specify an ENCRYPT=YES option setting.
NO
no table encryption is performed. NO is the default setting for the ENCRYPT= option.

Usage Notes

  1. Depending on your query patterns, increasing or decreasing the block size can affect performance.
     
  2. SAS Scalable Performance Data (SPD) Server does not encrypt table indexes or metadata. Only table row data are encrypted.
     
  3. To encrypt SPD tables with pass-through SQL, use only the READ= or PW= table option. With pass-through SQL, ENCRYPT=YES is implied with these options.
     
  4. To access an encrypted table, the user must have appropriate ACL permissions to the table and must provide the encryption key via the READ= or PW= table option.
     
  5. Encrypting a SAS Scalable Performance Data (SPD) Server table provides security from users that have system access to dump raw SAS Scalable Performance Data (SPD) Server tables. The section on Security in the SAS Scalable Performance Data (SPD) Server Administrator's Guide contains more information about how to controll system access to SAS Scalable Performance Data (SPD) Server tables.