All SAS users who use LIBNAME access to SAS Scalable Performance Data (SPD) Server should read this chapter. Most table options also work in SQL pass-through statements.
This chapter presents reference information for the SAS Scalable Performance Data (SPD) Server table options. To specify a table option with LIBNAME access, place the option value in parentheses after the table name. The option value then specifies processing that applies only to that table. To specify a table option with pass-through access, place the option value in brackets after the table name. The option value then specifies processing that applies only to that table. The SAS Scalable Performance Data (SPD) Server table options that follow are grouped by the function of their default value.
When using the options in this chapter, remember that if a table option is used subsequent to a LIBNAME option or macro variable, the value of the table option takes precedence.
Specifies when appending to a table whether to apply a single or multiple rows at a time.Syntax
SYNCADD=YES|NODefault
NO
Corresponding Macro Variable
SPDSSADD
Related Table Option
UNIQUESAVE=
Use the following arguments:
- YES
- imitates the behavior of the base SAS engine, applying a single row at a time (synchronously).
- NO
- appends multiple rows at a time (asynchronously).
Description
When SYNCADD= is YES, processing performance is slower. Use this setting only to force the server's append processing to be compatible with Base SAS software processing. That is, when the server encounters a row with a non-unique value, to abort the append operation, back out the transactions just added, and leave the original table on disk.
Example
In this example, when executing the first INSERT statement, PROC SQL permits insertion of the values 'rollback1' and 'rollback2' because the row additions to table A are performed asynchronously. PROC SQL does not get the true completion status at the time it adds a row.
When executing the second INSERT statement, PROC SQL performs a rollback on the INSERT, upon encountering the Add error on 'nonunique', and deletes the rows 'rollback3' and 'rollback4'.
data a; input z $ 1-20 x y; list; datalines; one 1 10 two 2 20 three 3 30 four 4 40 five 5 50 ; PROC SQL sortseq=ascii exec noerrorstop; create unique index comp on a (x, y); insert into a values('rollback1', -80, -80) values('rollback2',-90, -90) values('nonunique', 2, 20); insert into a(syncadd=yes) set z='rollback3', x=-60, y=-60 set z='rollback4', x=-70, y=-70 set z='nonunique', x=2, y=20; quit;
Specifies when creating multiple indexes on a SAS Scalable Performance Data (SPD) Server table whether to create the indexes in parallel.Syntax
ASYNCINDEX=YES|NO
Default
NO
Corresponding Macro Variable
SPDSIASY
Use the following arguments:
- YES
- creates the indexes in parallel.
- NO
- creates a single index at a time.
Description
The SAS Scalable Performance Data (SPD) Server can create multiple indexes for a table at the same time. To do this, it launches a single thread for each index created, then processes the threads simultaneously. Although creating indexes in parallel is much faster, the default for this option is NO. The reason is because parallel creation requires additional sort work space which may not be available.
For a complete description of the benefits and tradeoffs of creating multiple indexes in parallel, see the SAS Scalable Performance Data (SPD) Server User's Guide Help section on SPDSIASY=.
Example
Since the disk work space required for parallel index creation is available, specify for SAS Scalable Performance Data (SPD) Server to create, in parallel, the X, Y, and COMP indexes for table A.
PROC DATASETS lib=mydatalib; modify a(asyncindex=yes); index create x; index create y; index create comp=(x y); quit;
Compresses SAS Scalable Performance Data (SPD) Server tables on disk.Syntax
COMPRESS=YES|NO
Default
NO
Use in Conjunction with Table Option
IOBLOCKSIZE=
Corresponding Macro Variable
SPDSDCMP
Use the following arguments:
- YES
- performs the run-length compression algorithm SPDSRLLC.
- NO
- performs no table compression.
Description
When COMPRESS= is assigned YES, SAS Scalable Performance Data (SPD) Server compresses newly created tables by 'blocks' based on the algorithm specified. To control the amount of compression, use the table option IOBLOCKSIZE=. This option specifies the number of rows that you want to store in the block.
Note: Once a compressed table is created, you cannot change its block size. To resize the block, you must PROC COPY the table to a new table, setting IOBLOCKSIZE= to the block size desired for the output table.
Specifies the size of a SAS Scalable Performance Data (SPD) Server table partition.Syntax
PARTSIZE=n
Default
16 Megabytes
Corresponding Macro Variable
SPDSSIZE=
Affected by LIBNAME option
DATAPATH=
Use the following argument:
- n
- is the size of the partition in megabytes.
Description
Specifying PARTSIZE= forces the software to partition (split) SAS Scalable Performance Data (SPD) Server tables at the given size. The actual size is computed to accommodate the largest number of rows that will fit in the specified size of n Mbytes.
Use this option to improve performance of WHERE Clause evaluation on non-indexed table columns and on SQL GROUP_BY processing. By splitting the data portion of a Scalable Platform Data Server table at fixed-sized intervals, the software can introduce a high degree of scalability for these operations. The reason: it can launch threads in parallel to perform the evaluation on different partitions of the table, without the threat of file access contention between the threads. There is, however, a price for the table splits: an increased number of files, which are required to store the rows of the table.
Ultimately, scalability limits using PARTSIZE= depend on how you structure DATAPATH=, a LIBNAME option discussed in the documentation on Scalable Performance Data Server LIBNAME Options. Specifically, the limits depend on how you configure and spread the DATAPATH= file systems across striped volumes. You should spread each individual volume's striping configuration across multiple disk controllers/SCSI channels in the disk storage array. The goal for the configuration is, at the hardware level, to maximize parallelism during data retrieval.
The PARTSIZE= specification is limited by MINPARTSIZE=, a SAS Scalable Performance Data (SPD) Server parameter maintained by the SAS Scalable Performance Data (SPD) Server administrator. MINPARTSIZE= ensures that an over-zealous SAS user does not create arbitrarily small partitions, thereby generating a large number of files. The default for MINPARTSIZE= is 16 Mbytes and probably should not be lowered much beyond this value.
Note: The PARTSIZE value for a table cannot be changed after a table is created. To change the PARTSIZE, you must PROC COPY the table and use a different PARTSIZE option setting on the new (output) table.
Example
Using PROC SQL, extract a set of rows from an existing table to create a non-indexed table with a partition size of 32 Mbytes in a SAS job:
PROC SQL; create table SPDSCEN.HR80SPDS(partsize=32) as select state, age, sex, hour89, industry, occup from SPDSCEN.PRECS where hour89 > 40; quit;
Specifies the output order of table rows with identical values for the BY column.
Syntax
BYNOEQUALS=YES | NO
Use the following arguments:
- YES
- does not guarantee the output order of table rows with identical values in a BY clause.
- NO
- guarantees the output order of table rows with identical values in a BY clause will be the relative table position of the rows from the input table. This is the default.
Example
Specify for SAS Scalable Performance Data (SPD) Server in the ensuing BY-column operation to output rows with identical values in the key column randomly.
data sport.racquets(index=(string)); input raqname $20. @22 weight @28 balance $2. @32 flex @36 gripsize @42 string $3. @47 price @55 instock; datalines; Solo Junior 10.1 N 2 3.75 syn 50.00 6 Solo Lobber 11.3 N 10 5.5 syn 160.00 1 Solo Queensize 10.9 HH 6 5.0 syn 130.00 3 Solo Kingsize 13.1 HH 5 5.6 syn 140.00 3 ; data sport.racqbal(bynoequal=yes); set sport.racquets; by balance; run;
Specifies the number of rows in a block to be stored in or read from a SAS Scalable Performance Data (SPD) Server table.
Syntax
IOBLOCKSIZE=n
Default
4096
Use in Conjunction with
Macro Variable SPDSDCMP= or Table Options COMPRESS= or ENCRYPT= .
Use the following argument:
- n
- is the size of the block.
Description
The software reads and stores a server table in blocks. IOBLOCKSIZE= is useful on compressed or encrypted tables. SAS Scalable Performance Data (SPD) Server software does not use IOBLOCKSIZE= on noncompressed or nonencrypted tables.
For tables that you compress or encrypt, using either the option COMPRESS= or the macro variable SPDSDCMP=, the IOBLOCKSIZE= specification determines the number of rows to include in the block. The specification applies to block compression as well as data I/O to and from disk. The IOBLOCKSIZE= value affects the table's organization on disk.
When using SAS Scalable Performance Data (SPD) Server table compression or encryption, specify an IOBLOCKSIZE= value that complements how the data is to be accessed, sequentially or randomly. Sequential access or operations requiring full table scans favor a large block size, for example 64K. In contrast, random access favors a smaller block size, for example 8K.
Example
A huge company mailing list is processed sequentially. Specify a block size for compression that is optimal for sequential access.
/* IOblocksize set to 64K */ data sport.maillist(ioblocksize=65536 compress=yes); input name $ 1-20 address $ 21-57 phoneno $ 58-69 sex $71; datalines; Douglas, Mike 3256 Main St., Cary, NC 27511 919-444-5555 M Walters, Ann Marie 256 Evans Dr., Durham, NC 27707 919-324-6786 F Turner, Julia 709 Cedar Rd., Cary, NC 27513 919-555-9045 F Cashwell, Jack 567 Scott Ln., Chapel Hill, NC 27514 919-533-3845 M Clark, John 9 Church St., Durham, NC 27705 919-324-0390 M ; run;
Specifies the size of the SAS Scalable Performance Data (SPD) Server network data packet.
Syntax
NETPACKSIZE=size-of-packet
Use the following argument:
- size-of-packet
- is the size of the network packet in bytes.
Description
This option controls the size of the buffer used for data transfer between SAS Scalable Performance Data (SPD) Server and a SAS client. The default is 32K bytes. The buffer size is relative to the size of a table row. It cannot be less than the size of a single row. Packet size must be equal to some multiple of the table rows. If it is not, SAS Scalable Performance Data (SPD) Server rounds up the size specified. For example, if the packet buffer size is 4096 bytes and the row size is 3072, the software rounds up the buffer size to 6144.
Select a packet size to complement the bandwidth of the network it must travel through. An optimum size will flow the data continuously without significant pauses between packets.
Example
Create a 12K buffer in the memory of the server to send three rows from MYTABLE in each network packet. (The row size in MYTABLE is 4K.)
data mylib.mytable (netpacksize=12288);
Specifies the size of the segment for an index file associated with a SAS Scalable Performance Data (SPD) Server table.
Syntax
SEGSIZE=number
Use the following argument:
- number
- is the number of table rows to include in the index segment.
Description
The minimum SEGSIZE= value is 1024 table rows. The default value is 8192 table rows. The size of the index segment corresponds to the structure of the table and cannot be changed after the table is created.
Example
Specify a segment size of 64K for MYLIB.MYTABLE.
data mylib.mytable (segsize=65536);Note: Tests show that increasing the size of the segment does not significantly increase performance.
Specifies whether to use the table's indexes when processing WHERE Clauses.
Syntax
NOINDEX=YES|NO
Default
NO
Use the following arguments:
- YES
- ignores indexes when processing WHERE Clauses.
- NO
- uses indexes when processing WHERE Clauses.
Description
Set NOINDEX= to YES to test the effect of indexes on performance or for specific processing. Do not use YES routinely for normal processing.
Example
We created an index for the SEX column but decide to test whether it is necessary for our PROC PRINT processing. Specify for the server not to use the index.
data sport.maillist; input name $ 1-20 address $ 21-57 phoneno $ 58-69 sex $71; datalines; Douglas, Mike 3256 Main St., Cary, NC 27511 919-444-5555 M Walters, Ann Marie 256 Evans Dr., Durham, NC 27707 919-324-6786 F Turner, Julia 709 Cedar Rd., Cary, NC 27513 919-555-9045 F Cashwell, Jack 567 Scott Ln., Chapel Hill, NC 27514 919-533-3845 M Clark, John 9 Church St., Durham, NC 27705 919-324-0390 M ; PROC DATASETS lib=sport nolist; modify maillist; index create sex; quit; /*Turn on the macro variable SPDSWDEB */ /* to show that the index is not used */ /* used during the table processing. */ %let spdswdeb=YES; title "All Females from Current Mailing List"; PROC PRINT data=sport.maillist(noindex=yes); where sex="F"; run;
Creates an index that documents the minimum and maximum values of specified variables. SAS Scalable Performance Data (SPD) Server WHERE Clause evaluations use MINMAXVARLIST= indexes to include or eliminate member tables in a SAS Scalable Performance Data (SPD) Server dynamic cluster table from SQL evaluation scans..Syntax
MINMAXVARLIST=(varname1, varname2, ... , varnameN)
Use the following argument:
- varname1, varname2, ... , varname N
- are SAS Scalable Performance Data (SPD) Server table variable names.
Description
The primary purpose of the MIINMAXVARLIST= table option is for use with SAS Scalable Performance Data (SPD) Server dynamic cluster tables where specific members in the dynamic cluster contain a set or range of values, such as sales data for a given month. When a SAS Scalable Performance Data (SPD) Server SQL sub setting where- clause specifies specific months from a range of sales data, the WHERE planner checks the min/max indexes. Based on the min/max index information, the SAS Scalable Performance Data (SPD) Server WHERE planner includes or eliminates member tables in the dynamic cluster for evaluation.
MINMAXVARLIST= uses the list of columns you submit to build an index. The MINMAXVARLIST= index contains only the minimum and maximum values for each column. The WHERE Clause planner uses the index to filter SQL predicates quickly, and to include or eliminate member tables belonging to the cluster table from the evaluation.
Although the MINMAXVARLIST= table option is primarily intended for use with dynamic clusters, it also works on standard SAS Scalable Performance Data (SPD) Server tables. MINMAXVARLIST= can help reduce the need to create many indexes on a table, which can save valuable resources and space.Example
%let domain=path3 ; %let host=kaboom ; %let port=5201 ; libname &domain sasspds "&domain" server=&host..&port user='anonymous' ; /* Create three tables called */ /* xy1, xy2, and xy3. */ data &domain..xy1(minmaxvarlist=(x y)); do x = 1 to 10; do y = 1 to 3; output; end; end; run; data &domain..xy2(minmaxvarlist=(x y)); do x = 11 to 20; do y = 4 to 6 ; output; end; end; run; data &domain..xy3(minmaxvarlist=(x y)); do x = 21 to 30; do y = 7 to 9 ; output; end; end; run; /* Create a dynamic cluster table */ /* called cluster_table out of */ /* new tables xy1, xy2, and xy3 */ PROC SPDO library=&domain ; cluster create cluster_table mem=xy1 mem=xy2 mem=xy3 maxslot=10; quit; /* Enable WHERE evaluation to see */ /* how the SQL planner selects */ /* members from the cluster. Each */ /* member is evaluated using the */ /* min-max index. */ %let SPDSWDEB=YES; /* The first member has true rows */ PROC PRINT data=&domain..cluster_table ; where x eq 3 and y eq 3; run; /* Examine the other tables */ PROC PRINT data=&domain..cluster_table ; where x eq 3 and y eq 3 ; run; PROC PRINT data=&domain..cluster_table ; where x eq 3 and y eq 3; run; PROC PRINT data=&domain..cluster_table ; where x between 1 and 10 and y eq 3; run; PROC PRINT data=&domain..cluster_table ; where x between 11 and 30 and y eq 8 ; run; /* Delete the dynamic cluster table. */ PROC DATASETS lib=&domain nolist; delete cluster_table ; quit ;
THREADNUM=
Specifies the number of threads to be used for WHERE Clause evaluations.Syntax
THREADNUM=n
Default
THREADNUM= is set equal to the value of the MAXWHTHREADS server parameter.
Used in Conjunction with SAS Scalable Performance Data (SPD) Server Parameter
MAXWHTHREADS
Corresponding Macro Variable
SPDSTCNT=
Use the following argument:
- n
- is the number of threads.
Description
THREADNUM= allows you to specify the thread count the SAS Scalable Performance Data (SPD) Server should use when performing a parallel WHERE Clause evaluation.
Use this option to explore scalability for WHERE Clause and GROUP_BY evaluations in non-production jobs. If you use this option for production jobs, you are likely to lower the level of parallelism that is applied to those clause evaluations.
THREADNUM= works in conjunction with MAXWHTHREADS, a configurable system parameter. MAXWHTHREADS imposes an upper limit on the consumption of system resources. The default value of MAXWHTHREADS is dependent on your operating system. Your SAS Scalable Performance Data (SPD) Server administrator can change the default value for MAXWHTHREADS.
If you do not use THREADNUM=, the software provides a default thread number, up to the value of MAXWHTHREADS as required. If you use THREADNUM=, the value that you specify is also constrained by the MAXWHTHREADS value.
The THREADNUM= value applies both to parallel table scans (EVAL2 strategy), parallel indexed evaluations (EVAL1 strategy), parallel BY-clause processing, and parallel GROUP_BY evaluations. The SAS Scalable Performance Data (SPD) Server User's Guide Help section on Optimizing SAS Scalable Performance Data (SPD) Server Performance, contains more information on WHERE Clause evaluation.
Example
The SAS Scalable Performance Data (SPD) Server administrator set MAXWHTHREADS=128 in the SAS Scalable Performance Data (SPD) Server's parameter file. Explore the effects of parallelism on a given query by using the following SAS macro:
%macro dotest(maxthr); %do nthr=1 %to &maxthr; data _null_; set SPDSCEN.PRECS(threadnum=&nthr); WHERE occup='022' and state in('37','03','06','36'); run; %mend dotest;
WHERENOINDEX=
Specifies a list of indexes to exclude when making WHERE Clause evaluations.
Syntax
WHERENOINDEX=(name1 name2...)
Use the following arguments:
- (name1 name2...)
- a list of index names that you wish to exclude from the WHERE planner.
Example
We have a table PRECS with indexes defined as follows:
PROC DATASETS lib=spdscen; modify precs(bitindex=(hour89)); index create stser=(state serialno) occind=(occup industry) hour89; quit;When evaluating the next query, we want the SAS Scalable Performance Data (SPD) Server to exclude from consideration indexes for both the STATE and HOUR89 columns.
In this case, we know that our AND combination of the predicates for the OCCUP and INDUSTRY columns will produce a very small yield. Few rows satisfy the respective predicates. To avoid the extra index I/O (machine time) that the query requires for a full-indexed evaluation, use the following SAS code:
PROC SQL; create table hr80spds as select state, age, sex, hour89, industry, occup from spdscen.precs(wherenoindex=(stser hour89)) where occup='022' and state in('37','03','06','36') and industry='012' and hour89 > 40; quit;Note: Specify index names in the WHERENOINDEX list, not the column names. The example excludes both the composite index for the STATE column STSER and the simple index HOUR89 from consideration by the WHINIT WHERE planner.
Options for Other Functions
BYSORT=
Perform an implicit automatic sort when SAS Scalable Performance Data (SPD) Server encounters a BY clause for a given table.
Syntax
BYSORT=YES | NOUse the following arguments:
- YES
- sorts the data based on the BY columns and returns the sorted data to the SAS client. This powerful capability means the user does not have to sort data using a PROC SORT statement before using a BY clause.
- NO
- does not sort the data based on the BY columns. This may be desirable if a DATA step BY clause has a GROUPFORMAT option or if a PROC step reports grouped and formatted data.
Description
The default is YES. The NO argument means the table must have been previously sorted by the requested BY columns. The NO argument allows grouped data to maintain their precise order in the table. A YES argument groups the data correctly but possibly in a different order from the order in the table.
Example 1 - Group Formatting with BYSORT=
libname sport sasspds 'mylib' host='samson' user='user19' passwd='dummy2'; PROC FORMAT; value dollars 0-99.99="low" 100-199.99="medium" 200-1000="high"; run; data sport.racquets; input raqname $20. @22 weight @28 balance $2. @32 flex @36 gripsize @42 string $3. @47 price @55 instock; datalines; Solo Junior 10.1 N 2 3.75 syn 50.00 6 Solo Lobber 11.3 N 10 5.5 syn 160.00 1 Solo Queensize 10.9 HH 6 5.0 syn 130.00 3 Solo Kingsize 13.1 HH 5 5.6 syn 140.00 3 ;PROC PRINT data=sport.racquets (bysort=yes); var raqname instock; by price; format price dollars.; title 'Solo Brand Racquets by Price Level'; run;Output 4. 1 Report Output with BYSORT=
Solo Brand Racquets by Price Level
---------------------------- Price=low ---------------------------
OBS RAQNAME INSTOCK
1 Solo Junior 6
-------------------------- Price=medium ------------------------
OBS RAQNAME INSTOCK
3 Solo Queensize 3
4 Solo Kingsize 3
2 Solo Lobber 1
Example 2 - Group Formatting without BYSORT=
PROC PRINT data=sport.racquets (bysort=no); var raqname instock; by price; format price dollars.; title 'Solo Brand Racquets by Price Level'; run;Output 4. 2 Report Output without BYSORT=
Solo Brand Racquets by Price Level
---------------------------- Price=low ---------------------------
OBS RAQNAME INSTOCK
1 Solo Junior 6
-------------------------- Price=medium ------------------------
OBS RAQNAME INSTOCK
2 Solo Lobber 1
3 Solo Queensize 3
4 Solo Kingsize 3
ENDOBS=
Specifies the end row (observation) number in a user-defined range for the processing of a given table.
Syntax
ENDOBS=n
Use the following argument:
- n
- is the number of the end row.
Description
By default, SAS Scalable Performance Data (SPD) Server processes the entire table unless the user specifies a range of rows with the STARTOBS= and ENDOBS= options. If the STARTOBS= option is used without the ENDOBS= option, the implied value of ENDOBS= is the end of the table. When both options are used together, the value of ENDOBS= must be greater than STARTOBS=.
In contrast to the Base SAS software options FIRSTOBS= and OBS=, the STARTOBS= and ENDOBS= SAS Scalable Performance Data (SPD) Server options can be used for WHERE Clause processing in addition to table input operations.
Example
Print only rows 2-4 of the table INVENTORY.OLD_AUTOS.
libname inventory sasspds 'conversion_area' server=husky.5105 user='siteusr1' prompt=yes; data inventory.old_autos; input year $4. @6 manufacturer $12. model $10. body_style $5. engine_liters @39 transmission_type $1. @41 exterior_color $10. options $10. mileage conditon; datalines; 1971 Buick Skylark conv 5.8 A yellow 00000001 143000 2 1982 Ford Fiesta hatch 1.2 M silver 00000001 70000 3 1975 Lancia Beta 2door 1.8 M dk blue 00000010 80000 4 1966 Oldsmobile Toronado 2door 7.0 A black 11000010 110000 3 1969 Ford Mustang sptrf 7.1 M red 00000111 125000 3 ; PROC PRINT data=inventory.old_autos (startobs=2 endobs=4); run;Output 4. 3 Data in the Printed Output
1982 Ford Fiesta hatch 1.2 M silver 00000001 70000 3
1975 Lancia Beta 2door 1.3 M dk blue 00000010 80000 4
1966 Oldsmobile Toronado 2door 7.0 A black 11000010 110000 3
STARTOBS=
Specifies the start row (observation) number in a user-defined range for the processing of a given table.
Syntax
STARTOBS=n
Use the following argument:
- n
- is the number of the start row.
Description
By default, SAS Scalable Performance Data (SPD) Server processes the entire table unless the user specifies a range of rows with the STARTOBS= and ENDOBS= options. If the ENDOBS= option is used without the STARTOBS= option, the implied value of STARTOBS= is 1. When both options are used together, the value of STARTOBS= must be less than ENDOBS=.
In contrast to the Base SAS software options FIRSTOBS= and OBS=, the STARTOBS= and ENDOBS= SAS Scalable Performance Data (SPD) Server options can be used for WHERE Clause processing in addition to table input operations.
Example
Print only rows 2-4 of the table INVENTORY.OLD_AUTOS.
libname inventory sasspds 'conversion_area' server=husky.5105 user='siteusr1' prompt=yes; data inventory.old_autos; input year $4. @6 manufacturer $12. model $10. body_style $5. engine_liters @39 transmission_type $1. @41 exterior_color $10. options $10. mileage conditon; datalines;1971 Buick Skylark conv 5.8 A yellow 00000001 143000 2 1982 Ford Fiesta hatch 1.2 M silver 00000001 70000 3 1975 Lancia Beta 2door 1.8 M dk blue 00000010 80000 4 1966 Oldsmobile Toronado 2door 7.0 A black 11000010 110000 3 1969 Ford Mustang sptrf 7.1 M red 00000111 125000 3 ; proc print data=inventory.old_autos (startobs=2 endobs=4); run;
UNIQUESAVE=
Specifies to save rows with non-unique key values (the rejected rows) to a separate table when appending data to tables with unique indexes.Syntax
UNIQUESAVE=YES|NO|REP
Default
NO
Complements the Table Option
SYNCADD=
Used in Conjunction with Macro Variable
SPDSUSDS=
Corresponding Macro Variable:
SPDSUSAV=
Use the following arguments:
- YES
- writes rejected rows to a separate, system-created table file which can be accessed by a reference to the macro variable SPDSUSDS=.
- NO
- does not write rejected rows to a separate table, that is, ignores non-unique key values.
REP- when updating a master table from a transaction table, where the two tables share identical variable structures, the UNIQUESAVE=REP option replaces the row updated row in the master table instead of appending a row to the master table. The REP option only functions in the presence of a /UNIQUE index on the MASTER table. Otherwise the REP setting is ignored..
Description
SYNCADD= is defaulted to NO. When NO, table appends are 'pipelined', meaning that the server data is sent in a stream a block at a time (see table option NETPACKSIZE=). While pipelining is faster than a synchronous append, SAS reports the results of the append operation differently for these two modes.
When applying only a single row (SYNCADD=NO), SAS returns a status code for each ADD operation. The application can determine the next action based upon the status value. If a row is rejected due to containing a non-unique value for a unique index, the user receives a status message. In contrast, when data is pipelined (SYNCADD=YES), SAS returns a status code only after all the rows are applied to a table. As a consequence, the user does not know which rows have been rejected.
To enjoy the performance of data pipelining but still retain the rejected rows, use the UNIQUESAVE= option. When set to YES, SAS Scalable Performance Data (SPD) Server will save any rows that are rejected to a hidden SAS table.
When using this option, SAS returns the name of the hidden table containing the rejected rows in the macro variable SPDSUSDS. If you want to report the contents of the table, reference SPDSUSDS= .
Note: If SYNCADD= YES is set, data pipelining is overridden and the data is processed synchronously. In this situation, the UNIQUESAVE= option is not relevant and, if set, is ignored.
Example 1
We want to append two tables, NAMES2 and NAMES3, which contain employees' names, to the NAMES1 table. Before performing our append, we create an index on the NAME column in NAMES1, declaring the index unique.Specify for SAS Scalable Performance Data (SPD) Server, during the append operation, to store rows found with duplicate employee names to a separate table file generated by the macro variable SPDSUSDS=.
Use a %PUT statement to display the table name for SPDSUSDS=. Then request a printout of the duplicate rows to review later.
data employee.names1; input name $ exten; datalines; Jill 4344 Jack 5589 Jim 8888 Sam 3334 ; run; data employee.names2; input name $ exten; datalines; Jack 4443 Ann 8438 Sam 3334 Susan 5321 Donna 3332 ; run; data employee.names3; input name $ exten; datalines; Donna 3332 Jerry 3268 Mike 2213 ; run; PROC DATASETS lib=employee nolist; modify names1; index create name/unique; quit; PROC APPEND data=employee.names2 out=employee.names1(uniquesave=yes); run; title 'The NAMES1 table with unique names from NAMES2'; PROC PRINT data=employee.names1; run; %put Set the macro variable spdsusds to &spdsusds; title 'Duplicate (non-unique) name rows found in NAMES2'; PROC PRINT data=&spdsusds; run; PROC APPEND data=employee.names3 out=employee.names1(uniquesave=yes); run;The SAS log provides the messages:
WARNING: Duplicate values not allowed on index NAME for file EMPLOYEE.NAMES1. (Occurred 2 times.) NOTE: Duplicate records have been stored in file EMPLOYEE._30E3FD5.And, an extract from our PROC PRINT shows:
The NAMES1 table with unique names from NAMES2 OBS NAME EXTENs 1 Jill 4344 2 Jack 5589 3 Jim 8888 4 Sam 3334 5 Ann 8438 6 Susan 5321 7 Donna 3332 Duplicate (non-unique) name rows found in NAMES2 OBS NAME EXTEN XXX00000 1 Jack 4443 NAME 2 Sam 3334 NAME
Example 2
Use the UNIQUESAVE=REP option to perform an update / append case using PROC APPEND instead of a DATA step:
/* A MASTER table to update. ID */ /* will get a UNIQUE index */ DATA SPDS.MASTER; INPUT ID VALUE $; CARDS; 1 one 2 two 3 three ; PROC DATASETS LIB=SPDS; MODIFY MASTER; INDEX CREATE ID/UNIQUE; QUIT; /* A transaction table TRANS to use to */ /* drive update/appends to MASTER */ DATA SPDS.TRANS; INPUT ID VALUE $; 1 ONE 3 THREE 4 FOUR 4 FOUR* ; /* Use of UNIQUESAVE=REP to update/append */ /* TRANS rows to MASTER based on whether */ /* TRANS records have an ID column that */ /* matches an existing row from the MASTER */ /* table. Update MASTER rows with a match, */ /* otherwise append TRANS row to MASTER */ PROC APPEND DATA=SPDS.TRANS OUT=SPDS.MASTER(UNIQUESAVE=REP); run;Output of the resulting MASTER table would look like:Obs ID VALUE 1 1 ONE 2 2 two 3 2 THREE 4 4 FOUR*
VERBOSE=
Provides details of all indexes and ACL information associated with a SAS Scalable Performance Data (SPD) Server table.
Syntax
VERBOSE= YES | NO
Use the following arguments:
- YES
- requests detail information for the indexes, ACLs, and other SAS Scalable Performance Data (SPD) Server table values. This argument must be used with the CONTENTS procedure.
- NO
- suppresses detail information for the indexes, ACLs, and other SAS Scalable Performance Data (SPD) Server table values. This is the default.
Example
Request details of all the indexes for the table TEMP1 in the domain SPDS44.
PROC CONTENTS data=SPDS44 (verbose=yes); run;The CONTENTS Procedure Data Set Name SPDS44.TEMP1 Observations 1000 Member Type DATA Variables 2 Engine SASSPDS Indexes 2 Created Tuesday, May 10, 2005 10:00:02 AM Observation Length 16 Last Modified Tuesday, May 10, 2005 11:01:36 AM Deleted Observations 0 Protection Compressed NO Data Set Type Sorted NO Label Data Representation Default Encoding Default Engine / Host Dependent Information Blocking Factor (obs/block) 2047 ACL Entry NO ACL User Access(R,W,A,C) (Y,Y,Y,Y) ACL User Name ANONYMOU ACL Owner Name ANONYMOU Data Set is Ranged NO Alphabetic List of Index Info . Bitmap index (No Global Index ) i Keyvalue (Min) 1 Keyvalue (Max) 100 # of Discrete values 100 Bitmap index (No Global Index ) j Keyvalue (Min) 1 Keyvalue (Max) 10 # of Discrete values 10 Data Partsize 16777216 Alphabetic List of Variables and Attributes * Variable Type Len 1 i Num 8 2 j Num 9 Alphabetic List of Indexes and Attributes # of Unique * Index Values 1 i 100 2 j 10
Options for Security
ENCRYPT=
Encrypts SAS Scalable Performance Data (SPD) Server tables on disk. Encryption is a security mechanism that protects table contents from users who have system access to raw SAS Scalable Performance Data (SPD) Server tables. Access to tables is normally controlled by SAS Scalable Performance Data (SPD) Server ACLs. The SAS Scalable Performance Data (SPD) Server Administrator's Guide contains detailed information about using SAS Scalable Performance Data (SPD) Server ACLs to control access to tables.
When the ENCRYPT= option setting is set to YES, SAS Scalable Performance Data (SPD) Server encrypts newly created tables by blocks. To control the amount of encryption per block, use the table option IOBLOCKSIZE=. The IOBLOCKSIZE= option specifies the number of rows to be encrypted in each block.
Syntax
ENCRYPT= YES | NO
Arguments
- YES
- encrypts the data set. The encryption method uses passwords. At a minimum, you must specify the READ= or the PW= data set option at the same time that you specify an ENCRYPT=YES option setting.
- NO
- no table encryption is performed. NO is the default setting for the ENCRYPT= option.
Usage Notes
- Depending on your query patterns, increasing or decreasing the block size can affect performance.
- SAS Scalable Performance Data (SPD) Server does not encrypt table indexes or metadata. Only table row data are encrypted.
- To encrypt SPD tables with pass-through SQL, use only the READ= or PW= table option. With pass-through SQL, ENCRYPT=YES is implied with these options.
- To access an encrypted table, the user must have appropriate ACL permissions to the table and must provide the encryption key via the READ= or PW= table option.
- Encrypting a SAS Scalable Performance Data (SPD) Server table provides security from users that have system access to dump raw SAS Scalable Performance Data (SPD) Server tables. The section on Security in the SAS Scalable Performance Data (SPD) Server Administrator's Guide contains more information about how to controll system access to SAS Scalable Performance Data (SPD) Server tables.