Organizing SAS Data

SPD Server Tables

SPD Server software alters SAS tables to enable high-performance processing. SPD Server tables are physically different from Base SAS tables. You can use tables in either SAS or native SPD Server format. Accessing and Creating SAS Scalable Performance Data (SPD) Server Tables describes how a simple PROC COPY statement migrates tables between SAS and SPD Server.
A SAS table is stored as a single file that contains both the data descriptors and the table data. The data are the column values. The descriptors are metadata that describe the columns and the data formatting that the table uses.
SPD Server tables do not reuse space. When an SQL statement deletes one or more rows from a table, each row is marked as deleted, but the space that it occupied is not reused. To recapture the space, you must copy the table.
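The delete-then-copy behavior can be pictured with a small Python model. This is only an illustrative sketch: the `Table` class, its `delete_where`, `append`, and `copy` methods, and the row layout are hypothetical and do not reflect how SPD Server stores rows internally.

```python
class Table:
    """Toy model of a table whose deletes only mark rows.

    Hypothetical illustration; not SPD Server internals.
    """

    def __init__(self, rows=()):
        # Each slot is [row, deleted_flag]; a slot is never reused after a delete.
        self._slots = [[row, False] for row in rows]

    def delete_where(self, predicate):
        """Mark matching rows as deleted; their slots stay allocated."""
        for slot in self._slots:
            if not slot[1] and predicate(slot[0]):
                slot[1] = True

    def append(self, row):
        """New rows always go into a new slot, never into a deleted one."""
        self._slots.append([row, False])

    def rows(self):
        """Return only the live (not deleted) rows."""
        return [row for row, deleted in self._slots if not deleted]

    def allocated_slots(self):
        """Return the number of slots held, live or deleted."""
        return len(self._slots)

    def copy(self):
        """Copy only the live rows, which is what recaptures the space."""
        return Table(self.rows())


table = Table([{"id": i} for i in range(5)])
table.delete_where(lambda r: r["id"] % 2 == 0)  # marks ids 0, 2, and 4 as deleted
table.append({"id": 5})
compacted = table.copy()
print(table.allocated_slots(), compacted.allocated_slots())  # 6 3
```

In this model the original table still holds six slots after the deletes, while the copy holds only the three live rows.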
SPD Server Component Files shows differences in the architecture between SPD Server tables and SAS tables. SPD Server uses component files to store tables. One component file stores the stream of data values. Another component file stores the column and data descriptors. If you create an index for a column or a composite of columns, SPD Server creates component files for each index.

SPD Server Component Files

SPD Server uses four types of component files to store SPD Server tables. SPD Server Component Files shows the components of SPD Server tables. Two component files store table information: the *.dpf component file stores a stream of the table's data values, and the *.mdf component file stores the table's metadata. SPD Server creates two more component files to manage index data: the *.hbx component files are unique global B-tree indexes, and the *.idx component files are segmented views of the indexed column data. The *.idx component files are especially useful in evaluating parallel WHERE clauses.
SPD Server Component Files
SPD Server partitions component files when they are created to prevent them from growing too large. Each partitioned component file is stored as one or more disk files. Partitioning the component files has several advantages:
  • Very Large Tables: SPD Server bypasses file size limits imposed by many applications and operating systems. By using partitioned component files, SPD Server can support any file system transparently.
  • Multiple Directory Paths: SPD Server can access data libraries that span numerous directory paths and storage devices. SPD Server software partitions massive data libraries into component files. The component architecture enables rapid threaded data access, while circumventing device capacity and file size limitation issues. Storage lists transparently track component file locations so users can access multiple storage devices as a single volume, even if file partitions exist in different locations.
  • Flexibility in Storage: With SPD Server component files, you do not need to store data tables and their associated indexes in the same location. Data files and their indexes can be stored in different directory structures or on different devices. When deciding where to store the component files of SPD Server tables, you need to consider only the cost, performance, and availability of the disk space.
  • Improved Table Scan Performance: Data component partitions that are created at fixed-size intervals improve the performance of parallelized full-table scans. SPD Server Table Options contains information about how to use the PARTSIZE= option to control partition size.
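
As a rough illustration of fixed-size partitioning, the following Python sketch splits a data component of a given size into partition files. The function name `partition_sizes`, the megabyte units, and the sample numbers are assumptions made for illustration only; in SPD Server the partition size is controlled by the PARTSIZE= option described in SPD Server Table Options.

```python
def partition_sizes(total_mb, partsize_mb):
    """Return the size of each partition file for one data component.

    Every partition holds partsize_mb except possibly the last one.
    Hypothetical helper; units and names are assumptions.
    """
    if total_mb < 0 or partsize_mb <= 0:
        raise ValueError("total_mb must be >= 0 and partsize_mb must be > 0")
    full, remainder = divmod(total_mb, partsize_mb)
    sizes = [partsize_mb] * full
    if remainder:
        sizes.append(remainder)  # the final, partially filled partition
    return sizes


sizes = partition_sizes(total_mb=1000, partsize_mb=256)
print(len(sizes), sizes)  # 4 [256, 256, 256, 232]
```

No single file in this sketch exceeds the chosen partition size, which is how partitioning keeps individual files under a file-size limit.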

SPD Server Table Indexes

SPD Server allows you to create indexes on table columns. Although SPD Server can thread WHERE clause evaluations for tables that are not indexed, indexes enable rapid WHERE clause evaluations. In particular, large tables should be indexed to exploit SPD Server performance. For more information about SPD Server indexes, see Indexing a Table.
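
To make the idea of a threaded WHERE clause evaluation on an unindexed table more concrete, here is a minimal Python sketch that filters a table partition by partition using a thread pool. It is a conceptual model, not SPD Server's implementation: the functions `scan_partition` and `parallel_where`, the list-of-dictionaries row format, and the partition layout are all hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def scan_partition(partition, predicate):
    """Evaluate the WHERE predicate against every row in one partition."""
    return [row for row in partition if predicate(row)]


def parallel_where(partitions, predicate, max_workers=4):
    """Scan each partition as its own task; return matches in partition order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields results in the same order as the input partitions.
        results = pool.map(lambda part: scan_partition(part, predicate), partitions)
        return [row for matches in results for row in matches]


rows = [{"id": i, "amount": i * 10} for i in range(12)]
partitions = [rows[i:i + 4] for i in range(0, len(rows), 4)]  # three partitions of four rows
matches = parallel_where(partitions, lambda r: r["amount"] >= 90)
print([r["id"] for r in matches])  # [9, 10, 11]
```

In this sketch every row in every partition is still examined; threading only spreads that work across workers.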