Comparing the Default Base SAS Engine and the SPD Engine

Overview of Comparisons

Default Base SAS engine data sets and SPD Engine data sets have many similarities. They both store data in a SAS library, which is a collection of files that reside in one or more directories. However, because SPD Engine data libraries can span devices and file systems, the SPD Engine is ideal for use with very large data sets. Also, the SPD Engine enables you to specify separate directories, or devices, for each component file type (data, metadata, and index) in the LIBNAME statement. For details about designing and setting up SPD Engine data libraries, see Creating and Loading SPD Engine Files.
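As a minimal sketch of placing each component in its own location, the following LIBNAME statement uses the SPD Engine options DATAPATH= and INDEXPATH=. The libref SALES and every directory path are hypothetical and must be replaced with locations that exist at your site.

```sas
/* Hypothetical libref and paths. The primary path holds the metadata   */
/* files; DATAPATH= and INDEXPATH= place the data partitions and the    */
/* index files on other devices.                                        */
libname sales spde '/spde/meta'
   datapath=('/disk1/spde/data' '/disk2/spde/data')
   indexpath=('/disk3/spde/index');
```

Listing more than one directory in DATAPATH= is what allows the data partitions of a single data set to be spread across devices.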

The SPD Engine Libraries and File Systems

An SPD Engine library can contain data files, metadata files, and index files. The SPD Engine does not support catalogs, SAS views, MDDBs, or other utility (byte) files.
The SPD Engine uses the zFS file system for z/OS and the ODS-5 file system for OpenVMS on HP Integrity Servers. This means that some functionality might be slightly different on these platforms. For example, for z/OS, the user must have a home directory on zFS.

Utility File Workspace

Utility files are generated during SPD Engine operations that need extra space (for example, when creating parallel indexes or when sorting very large files). Default locations exist for all platforms, but if you have large amounts of data to process, the default location might not be large enough. The SPD Engine system option SPDEUTILLOC= lets you specify a set of file locations in which to store utility scratch files. For more information, see SPDEUTILLOC= System Option.
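Before you run a large sort or index build, it can help to confirm where utility files will be written. The following sketch only displays the current value; it assumes that SPDEUTILLOC= itself is set at SAS invocation or in the configuration file, so check SPDEUTILLOC= System Option for the syntax that applies to your release and host.

```sas
/* Write the current SPDEUTILLOC= setting to the SAS log. */
proc options option=spdeutilloc;
run;
```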

Temporary Storage of Interim Data Sets

To create a library to store interim data sets, specify the SPD Engine option TEMP= in the LIBNAME statement. If you want current applications to refer to these interim files using one-level names, specify the library on the USER= system option.
The following example code creates a user libref for interim data sets. The data sets in the library are deleted at the end of the session.
libname user spde '/mydata' temp=yes;   /* interim SPD Engine library */
data a;                                 /* one-level name is stored as USER.A */
   x=1;
run;
proc print data=a;
run;
The USER= option can be set in the configuration file so that applications that reference interim data sets with one-level names can run in the SPD Engine.
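If you prefer not to use USER as the libref, a similar effect can be sketched with the USER= system option in an OPTIONS statement. The libref SCRATCH is hypothetical; the path is the same one used in the example above.

```sas
/* Hypothetical libref SCRATCH for interim data sets. */
libname scratch spde '/mydata' temp=yes;

/* Route one-level data set names to the SCRATCH library. */
options user=scratch;

data a;               /* stored as SCRATCH.A */
   x=1;
run;

proc print data=a;    /* reads SCRATCH.A */
run;
```

Existing programs that use one-level names then run unchanged, which is the same goal as setting USER= in the configuration file.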

Differences between the Default Base SAS Engine Data Sets and the SPD Engine Data Sets

The following chart compares the SPD Engine capabilities to default Base SAS engine capabilities.
Comparing the Default Base SAS Engine Data Sets and the SPD Engine Data Sets

| Feature | SPD Engine | Default Base SAS Engine |
|---|---|---|
| Partitioned data sets | yes | no |
| Parallel WHERE optimization | yes | no |
| Lowest locking level | member | record |
| Concurrent access from multiple SAS sessions on a given data set | READ (INPUT open mode) | READ and WRITE (all open modes) |
| Remote computing via SAS/CONNECT | no | yes |
| Data transfer via SAS/CONNECT | no | yes |
| RLS (Remote Library Services) via SAS/CONNECT | no | yes |
| Available via SAS/CONNECT | no | yes |
| Support in SAS/SHARE | no | yes |
| Automatic sort for SAS BY processing (sort a temporary copy of the data to support BY processing) | yes | no |
| User-defined formats and informats | yes, except in WHERE (footnote 1) | yes |
| Catalogs | no | yes |
| Views | no | yes |
| MDDBs | no | yes |
| Integrity constraints | no | yes |
| Data set generations | no | yes |
| CEDA | no | yes |
| Audit trail | no | yes |
| NLS transcoding | no | yes |
| Number of observations that can be counted | 2^63-1 (on all hosts) | 2^31-1 (on 32-bit hosts); 2^63-1 (on 64-bit hosts) |
| COMPRESS= | YES, NO, CHAR, or BINARY (only if the file is not encrypted) | YES, NO, CHAR, or BINARY |
| DLCREATEDIR | no | yes |
| ENCRYPT= | cannot be used with COMPRESS= | can be used with COMPRESS= |
| Encryption | data files only | yes (all files) |
| FIRSTOBS= system option and data set option | no | yes |
| OBS= system option and data set option | yes, if used without ENDOBS= or STARTOBS= SPD Engine options | yes |
| Functions and call routines | yes, with some exceptions (footnote 2) | yes |
| Move table via OS utilities to a different directory or folder | no | yes |
| Observations returned in physical order | no, if BY or WHERE is present | yes |
| DLDMGACTION= system option and data set option | see footnote 3 | yes |
FOOTNOTE 1: In WHERE processing, user-defined formats and informats are passed to the supervisor for handling. Therefore, they are not processed in parallel.
FOOTNOTE 2: In WHERE processing, functions and call routines introduced in SAS 9 or later are passed to the supervisor for handling. Therefore, they are not processed in parallel.
FOOTNOTE 3: Damage to partition data files or metadata files is not detected. The files cannot be repaired. Damage to index files is detected. Indexes might be repaired with the REPAIR statement in PROC DATASETS.
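
Two rows of the chart can be illustrated with a short sketch: the automatic sort for BY processing, and the index repair described in footnote 3. The libref SALES, the path, and the data set ORDERS are hypothetical, and the REPAIR step is shown only for its form; whether an index can actually be repaired depends on the damage.

```sas
/* Hypothetical libref and path. */
libname sales spde '/spde/meta';

/* Create a small data set whose rows are not sorted by REGION. */
data sales.orders;
   input region $ amount;
   datalines;
west 10
east 20
west 30
east 40
;
run;

/* No PROC SORT step is needed first: the SPD Engine sorts a temporary */
/* copy of the data to satisfy the BY statement.                       */
proc print data=sales.orders;
   by region;
run;

/* If an index file is reported as damaged, a repair can be attempted. */
proc datasets library=sales nolist;
   repair orders;
run;
quit;
```

With the default Base SAS engine, the same PROC PRINT step would require the data to be sorted or indexed by REGION beforehand.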