Organizing SAS Data

SPD Server Tables

SPD Server software alters SAS tables to enable high-performance processing. SPD Server tables are physically different from Base SAS tables. You can use tables in either SAS or native SPD Server format. For more information about how to migrate tables between SAS and SPD Server, see Migrating Tables between SAS and SPD Server.
A SAS table is stored as a single file that contains both the data descriptors and the table data. The data are the column values. The descriptors are metadata that describe the columns and the data formatting that the table uses.
SPD Server tables do not reuse space. When an SQL command deletes one or more rows from a table, each deleted row is marked as deleted, and its space is not reused. You must copy the table in order to recapture the space.
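The following Python sketch is a toy model of this behavior. It is not SPD Server code. The ToyTable class and its methods are hypothetical names introduced only for illustration. It shows rows being marked as deleted, new rows being appended rather than reusing freed slots, and a copy that carries over only the live rows.

```python
class ToyTable:
    """Toy model (hypothetical) of a table whose deleted rows keep their slots."""

    def __init__(self):
        # Each slot is a two-item list: [row, deleted_flag].
        self.slots = []

    def insert(self, row):
        # New rows are always appended; slots of deleted rows are never reused.
        self.slots.append([row, False])

    def delete_where(self, predicate):
        # Mark matching live rows as deleted instead of removing their slots.
        for slot in self.slots:
            if not slot[1] and predicate(slot[0]):
                slot[1] = True

    def rows(self):
        # Only rows that are not marked as deleted are visible.
        return [row for row, deleted in self.slots if not deleted]

    def copy(self):
        # Copying carries over only the live rows, which recaptures the space.
        fresh = ToyTable()
        for row in self.rows():
            fresh.insert(row)
        return fresh


table = ToyTable()
for i in range(5):
    table.insert({"id": i})

table.delete_where(lambda row: row["id"] % 2 == 0)  # marks ids 0, 2, and 4
table.insert({"id": 5})                             # appended as a sixth slot

compact = table.copy()
print(len(table.slots), len(table.rows()), len(compact.slots))
```

In this model, the original table still occupies six slots after the deletions, while the copy occupies only the three slots that hold live rows.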
SPD Server Component Files shows differences in the architecture between SPD Server tables and SAS tables. SPD Server uses component files to store tables. One component file stores the stream of data values. Another component file stores the column and data descriptors. If you create an index for a column or a composite of columns, SPD Server creates component files for each index.

SPD Server Component Files

SPD Server Component Files shows the components of SPD Server tables.
SPD Server Component Files
SPD Server uses four types of component files to store SPD Server tables.
These two component files store table information:
*.dpf
stores a stream of the table's data values.
*.mdf
stores the table's metadata.
These two component files manage index data:
*.hbx
stores unique global B-tree indexes.
*.idx
stores segmented views of the indexed column data. The *.idx components are useful when you are evaluating parallel WHERE clauses.
SPD Server partitions component files when they are created to prevent the files from growing too large. SPD Server stores each partitioned component file as one or more disk files. The partitioning provides the following advantages:
  • Support for very large tables: SPD Server bypasses the file size limits that are imposed by many applications and operating systems. By using partitioned component files, SPD Server can support any file system transparently.
  • Access via multiple directory paths: SPD Server can access data libraries that span numerous directory paths and storage devices. SPD Server software partitions massive data libraries into component files. The component architecture enables rapid, threaded data access, and circumvents device capacity and file size limitation issues. Storage lists transparently track component file locations, so users can access multiple storage devices as a single volume, even if file partitions exist in different locations.
  • Flexibility in storage: You do not need to store data tables and associated indexes in the same location when you use SPD Server component files. You can store data files and associated indexes in different directory structures or on different devices. When you are deciding where to store the component files of SPD Server tables, you need to consider only the cost, performance, and availability of disk space.
  • Improved table scan performance: Data component partitions that are created using fixed-size intervals perform well during parallelized full-table scans.
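As a rough illustration of the partitioning described above, the following Python sketch plans how a component file of a given size could be split into fixed-size partitions spread over several directory paths. It is a simplified model under stated assumptions. The plan_partitions function, the round-robin placement, and the example paths are all hypothetical and do not describe how SPD Server actually selects paths.

```python
def plan_partitions(total_bytes, partition_bytes, paths):
    """Return a list of (path, partition_number, size_in_bytes) tuples.

    Assumption for illustration only: partitions are placed on the
    available paths in round-robin order.
    """
    if partition_bytes <= 0:
        raise ValueError("partition_bytes must be positive")
    if not paths:
        raise ValueError("at least one path is required")

    plan = []
    offset = 0
    number = 0
    while offset < total_bytes:
        # Every partition has the fixed size except possibly the last one.
        size = min(partition_bytes, total_bytes - offset)
        plan.append((paths[number % len(paths)], number, size))
        offset += size
        number += 1
    return plan


# Hypothetical example: 10 bytes of data, 4-byte partitions, two paths.
plan = plan_partitions(10, 4, ["/disk1", "/disk2"])
for path, number, size in plan:
    print(path, number, size)
```

Because no single partition exceeds the fixed size, no individual disk file approaches a file system size limit, and partitions on different paths can be read by separate threads.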

SPD Server Table Indexes

SPD Server enables you to create indexes on table columns. Although SPD Server can thread WHERE clause evaluations for tables that are not indexed, indexes enable rapid WHERE clause evaluations. You should index large tables to optimize SPD Server performance. For more information about SPD Server indexes, see Indexing Tables.
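The difference between a threaded scan and an index lookup can be sketched in Python. This is a conceptual model only, not SPD Server's implementation. The sample rows, the threaded_scan and build_index functions, and the segment size are hypothetical. The scan splits the table into segments and tests every row in parallel, whereas the index maps each column value directly to the matching rows.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

# Hypothetical sample table: every third row is in the EAST region.
rows = [{"id": i, "region": "EAST" if i % 3 == 0 else "WEST"} for i in range(12)]


def threaded_scan(table, predicate, segment_size=4):
    """Evaluate a WHERE-style predicate on every row, one segment per task."""
    segments = [table[i:i + segment_size] for i in range(0, len(table), segment_size)]

    def scan(segment):
        return [row["id"] for row in segment if predicate(row)]

    with ThreadPoolExecutor() as pool:
        # map() returns results in segment order, so row order is preserved.
        parts = list(pool.map(scan, segments))
    return [row_id for part in parts for row_id in part]


def build_index(table, column):
    """Build a simple value-to-row-ids lookup for one column."""
    index = defaultdict(list)
    for row in table:
        index[row[column]].append(row["id"])
    return index


scan_result = threaded_scan(rows, lambda row: row["region"] == "EAST")
region_index = build_index(rows, "region")
index_result = region_index["EAST"]

print(scan_result)
print(index_result)
```

Both approaches return the same rows. The scan must still read every row, even though the work is shared among threads, while the index lookup goes directly to the matching rows.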