Understanding SAS

SAS Data Sets

A SAS data set is a file structured in a format that SAS can access. The physical object contains the following elements:

data values that are stored in tabular form
a descriptor portion that defines the types of data to SAS

The physical locations of the data values and the descriptor are not necessarily contiguous.

SAS data sets have two forms: data files and data views. SAS data files are relational tables with columns (or variables) and rows (or observations). The SAS data file structure can have many of the characteristics of a DBMS, including indexing, compression, and password protection.

SAS data views are definitions or descriptions of data that resides elsewhere. SAS data views enable you to use SAS to access many different data sources, including flat files, VSAM files, and DBMS structures, as well as native SAS data files. A data view eliminates the need to know anything about the structure of the data or the software that created it. Data views need very little storage because they contain no data. They access the most current data from their defined sources because they collect the actual data values only when they are called.

You can use data views to define subsets of larger structures, or supersets of data that have been enhanced with calculated values. You can create SAS data views that combine views of dissimilar data sources. For example, you can combine a view of a relational DB2 table with a view of a SAS data file, a view of hierarchical IMS data, or even a view from a PC-based dBASE file.
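As a loose illustration of why a data view needs almost no storage and always returns current values, the following sketch uses Python's built-in sqlite3 module as a stand-in for SAS. It is not SAS code. The table name sales, the view name big_sales, and the sample values are hypothetical. A real SAS data view would be created with the DATA step, the SQL procedure, or SAS/ACCESS software.

```python
import sqlite3

# In-memory database used only as a stand-in for a data source.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("East", 500.0), ("West", 1500.0)],
)

# The view stores only its defining query (a subset of sales),
# not the rows that the query describes.
conn.execute(
    "CREATE VIEW big_sales AS "
    "SELECT region, amount FROM sales WHERE amount > 1000"
)

before = conn.execute("SELECT region FROM big_sales ORDER BY region").fetchall()

# Change the underlying table; the view definition is untouched.
conn.execute("INSERT INTO sales VALUES ('North', 2000.0)")

# The view collects values only when it is called, so it sees the new row.
after = conn.execute("SELECT region FROM big_sales ORDER BY region").fetchall()

print(before)
print(after)
```

The second query returns the added row even though the view itself was never modified, which mirrors the behavior described above for data views.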

You create SAS data views in three ways:

with the DATA step (DATA step views)
with the SQL procedure (PROC SQL views)
with SAS/ACCESS software (SAS/ACCESS views)

By using the SAS/ACCESS LIBNAME statement, you can use SAS/ACCESS software to work directly with tables in a DBMS, such as DB2 or Oracle, as if they were SAS data sets and data views. For more information, see SAS/ACCESS for Relational Databases: Reference.

The three types of data views differ as follows:

DATA step views

describe data from one or more sources, including flat files, VSAM files, and SAS data sets (either data files or other data views). You cannot use a DATA step view to update the view's underlying data because DATA step views only read other files.

For more information about how to create and use DATA step views, see “DATA Step Views,” in SAS Language Reference: Concepts.

PROC SQL views

define either a subset or a superset of data from one or more SAS data sets. These data sets can be data files or data views, and can include data sets composed of DBMS data that are created with the SAS/ACCESS LIBNAME statement, and data views that are created with the SQL Pass-Through Facility to access DBMS data. You can also create PROC SQL views of DBMS data by using an embedded LIBNAME statement.

For example, an SQL procedure can combine data from PROC SQL views, DATA step views, and SAS/ACCESS views with data in a SAS data file. You cannot use a PROC SQL view to update the view's underlying files or tables. However, with some restrictions, you can use the UPDATE, DELETE, and INSERT statements in the SQL procedure to update data that is described by SAS/ACCESS views.

For more information about the SQL Pass-Through Facility, see “Overview: SQL Procedure,” in Base SAS Procedures Guide.

Unicode UTF-8 Support

The SAS ODBC Driver supports data sets with UTF-8 encoding.

SAS Libraries

SAS data sets are contained in libraries. Each SAS library has two names: a physical name and a logical name (libref). The physical name of the library fully identifies the directory or operating system data structure that contains the data sets. Therefore, the physical name must conform to the rules for naming files within your operating system.

You use the libref to identify a group of data sets (files or views) within SAS. The libref is a temporary name that you associate with the physical name of the SAS library during each SAS job or session. After the libref is assigned, you can read, create, or update files in the library. A libref is valid only for the current SAS job or session, and you can reference it repeatedly within that job or session.

You can use SAS/ACCESS software to associate a SAS libref with a DBMS database, schema, server, or group of tables and views, such as a DB2 database or group of Oracle tables and views. For more information about using SAS/ACCESS software, see SAS/ACCESS for Relational Databases: Reference.

For more information about SAS libraries, see “SAS Libraries,” in SAS Language Reference: Concepts.

SAS Servers

To access your SAS data sources, the SAS ODBC Driver uses a SAS server. A SAS server is a SAS procedure (either PROC SERVER or PROC ODBCSERV) that runs in its own SAS session. It accepts input and output requests from other SAS sessions and from the SAS ODBC Driver on behalf of ODBC-compliant applications. While the server is running, the SAS session does not accept input from the keyboard.

The type of server that the driver uses depends on whether you are accessing local data or remote data:

local data

The SAS ODBC Driver uses a SAS ODBC server to access your data. If you do not already have a SAS session running on your computer, then when you connect to your local data source, the driver starts a SAS session and executes PROC ODBCSERV, which automatically starts the SAS ODBC server. If you have a SAS session running on your computer (but not a SAS ODBC server), then you must either start the SAS ODBC server manually or end the SAS session before you connect to your local data sources. For more information, see Understanding Access to Local Data Sources.

remote data

The SAS ODBC Driver uses a SAS/SHARE server or a SAS Scalable Performance Data (SPD) Server to access remote data. SAS/SHARE software or SPD Server must be licensed on the remote host. The driver requires TCP/IP software that is included with your operating system to communicate with either type of SAS server. For a SAS/SHARE server, your server administrator uses PROC SERVER to start the server on the remote host. For more information, see Understanding Access to SAS/SHARE Data Sources.

The SAS ODBC Driver can communicate with a SAS/SHARE server or an SPD Server. You can interchange SAS data and SPD Server data by using the LIBNAME statement engine option in either SAS or SPD Server.
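The server choices described above can be summarized as a small decision function. This is only a sketch of the rules as stated in this section, not part of the SAS ODBC Driver. The function name server_plan and its parameters are hypothetical, and the case in which a SAS ODBC server is already running is inferred rather than stated.

```python
def server_plan(location, sas_session_running=False, odbc_server_running=False):
    """Return a short description of how a data source is reached.

    location is "local" or "remote" (hypothetical labels for this sketch).
    """
    if location == "remote":
        # Remote data: SAS/SHARE server (started with PROC SERVER) or SPD Server.
        return "connect over TCP/IP to a SAS/SHARE server or an SPD Server"
    if location != "local":
        raise ValueError("location must be 'local' or 'remote'")
    if odbc_server_running:
        # Inferred case: a SAS ODBC server is already available.
        return "use the running SAS ODBC server"
    if sas_session_running:
        # A SAS session without a SAS ODBC server blocks the automatic start.
        return "start the SAS ODBC server manually or end the SAS session"
    # No SAS session: the driver starts one and executes PROC ODBCSERV.
    return "driver starts a SAS session and executes PROC ODBCSERV"


print(server_plan("local"))
print(server_plan("local", sas_session_running=True))
print(server_plan("remote"))
```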

SAS Scalable Performance Data (SPD) Server

The SPD Server uses the latest parallel processing methods and data server capabilities to efficiently access large volumes of data and to serve large numbers of concurrent users. The SPD Server provides efficient data access for hundreds of users across multiple processors. The SAS ODBC Driver can be configured for a direct connection to an SPD Server. ODBC connections to SPD SNET are not supported with SAS 9 and later versions of the SAS ODBC Driver.

The SPD Server allows access to SAS data for intensive processing (queries and sorts) on the host server machine. It organizes and processes SAS data to take advantage of parallel processors on specific host servers.

You must have the SPD Server licensed on your client machine. Then, you can interchange SAS data and SPD Server data by using the LIBNAME statement engine option either in SAS or on the SPD Server. For more information, see SAS Scalable Performance Data Server: User's Guide.

SAS Terminology

Software products often include similar components or constructs that are known by different names. For the ODBC standard and SAS, the following correspondences exist:

ODBC term	SAS term
owner	library name (libref)
table	data set
qualifier	SAS data does not use a qualifier

Therefore, if your ODBC-compliant application asks you to specify the owner for a SAS library, you should specify the libref. If the application asks for a table name, you should specify the name of the SAS data set. If a qualifier is requested, you can usually leave the field blank.
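The correspondences above can be sketched as a small lookup and helper. The names ODBC_TO_SAS and sas_table_reference are hypothetical and are not part of any SAS or ODBC API. The helper assumes that the application accepts the usual SAS two-level form libref.dataset, which might not hold for every application that is ODBC compliant.

```python
# ODBC term -> SAS term, following the table above.
ODBC_TO_SAS = {
    "owner": "library name (libref)",
    "table": "data set",
    "qualifier": None,  # SAS data does not use a qualifier
}


def sas_table_reference(owner, table, qualifier=""):
    """Build a two-level libref.dataset name from ODBC-style fields.

    The qualifier argument is accepted so that callers can pass whatever an
    ODBC dialog collected, but it is ignored because SAS data has no qualifier.
    """
    if not owner or not table:
        raise ValueError("owner (libref) and table (data set) are required")
    return f"{owner}.{table}"


print(sas_table_reference("sashelp", "class"))  # prints "sashelp.class"
```

Here sashelp and class serve only as sample values for the owner and table fields.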