Organizing Domains for Scalability

Overview of Organizing Domains

SPD Server performance is based on scalable I/O. To exploit scalable I/O, you can use the libnames.parm parameter file to optimize how SPD Server stores files. Domain Path Options describes how to specify named paths for the data components of server tables (data tables, index tables, and metadata tables), and how to specify paths for temporary intermediate calculation tables. Domains can specify the system paths that are associated with each tablespace component. However, you must allocate the correct amount of disk space and I/O redundancy to the various paths.
This section provides functional information about the tablespaces that are defined by the DATAPATH=, INDEXPATH=, WORKPATH=, and METAPATH= options. Use this information to determine the best sizing, I/O, and redundancy requirements to optimize performance and scalability for domain paths.

Data Tablespace

When you define a domain, data tables are stored in the space that is defined in the PATHNAME= specification, unless you specify the DATAPATH= option. The PATHNAME= space contains metadata tables for a domain, but it can also contain data tables. As the size and complexity of a domain increase, so do the benefits of organizing data tables into their own DATAPATH= space.
Organizing your data tablespace significantly impacts I/O scalability. The disk space that is allocated to data tables stores permanent warehouse tables that users will access. This disk space should support scalable I/O because it facilitates both parallel processing and real-time multi-user access to the data. In a large warehouse, this disk space probably has the greatest proportion of Read and Write I/O.
Typically, you load and refresh tables in the data tablespace using batch processes during evenings or off-peak hours. You can restrict access to the data tablespace to Read-Only access for all users except the administrators who perform the load and refresh processes.
To ensure reliability, organize the data tablespace into RAID 1+0 or RAID 5 disk configurations. For large warehouses, consider a RAID 5 configuration with a second storage array to mirror the data.
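When you allocate disk space for the data tablespace, the choice between RAID 1+0 and RAID 5 changes how much of the raw disk capacity is usable. The following Python sketch illustrates the standard capacity arithmetic for a single array of identical disks. The function name and the disk counts and sizes are hypothetical and are not part of SPD Server.

```python
def usable_capacity_tb(level: str, disks: int, disk_tb: float) -> float:
    """Return the usable capacity, in TB, of one array of identical disks.

    Hypothetical helper for planning only; it is not an SPD Server utility.
    """
    if level == "raid10":
        # RAID 1+0 mirrors every disk, so half of the raw capacity is usable.
        if disks < 4 or disks % 2 != 0:
            raise ValueError("RAID 1+0 needs an even number of disks, at least 4")
        return disks / 2 * disk_tb
    if level == "raid5":
        # RAID 5 spends the equivalent of one disk on parity.
        if disks < 3:
            raise ValueError("RAID 5 needs at least 3 disks")
        return (disks - 1) * disk_tb
    raise ValueError(f"unsupported RAID level: {level}")


if __name__ == "__main__":
    # Assumed example: eight 4 TB disks in each layout.
    for level in ("raid10", "raid5"):
        print(level, usable_capacity_tb(level, disks=8, disk_tb=4.0), "TB usable")
```

RAID 5 yields more usable space from the same disks, while RAID 1+0 gives up capacity in exchange for full mirroring.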

Index Tablespace

When you define a domain, index tables are stored in the space that is defined in the PATHNAME= specification, unless you specify the INDEXPATH= option. The PATHNAME= space contains metadata tables for a domain, but it can also contain index tables. As the size and complexity of a domain increase, so do the benefits of organizing index tables into their own INDEXPATH= space.
Index space typically does not require the high level of I/O scalability that data space, temporary tablespace, or workspace needs. When a process uses an index, its Read access pattern differs both from the parallel I/O Read access pattern that is used for data and from the access pattern of multiple users reading data.
Typically, you configure index space as a large striped file system across a large number of disks and I/O channels. A typical configuration such as RAID 1+0 or RAID 5 supports some redundancy to ensure the availability of index space.

Metadata Tablespace

When you define a domain, metadata tables are stored in the space that is specified in the PATHNAME= parameter. If that space is full and the METAPATH= option is specified, SPD Server stores overflow metadata for existing tables in the METAPATH= space. Together, the PATHNAME= and METAPATH= spaces contain the metadata tables for a domain.
Compared to the other space categories, metadata space is relatively small and usually does not require scalability. If compressed data in a given warehouse uses 10 terabytes of disk space, then there are approximately 10 gigabytes of metadata. To leave headroom above that estimate, plan to allot 20 gigabytes of metadata space for every 10 terabytes of physical data disk space. When new data paths are added to expand a server, add more metadata space within the primary path of the server. Even though the metadata requires only a small amount of space, the disk space must be expandable and mirrored. You also need to back up the metadata.
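The sizing guidance above reduces to simple arithmetic. The following Python sketch applies the two ratios from the text (approximately 10 gigabytes of metadata, and 20 gigabytes of planned metadata space, for every 10 terabytes of data) to a warehouse size. The function names and the 35-terabyte example are hypothetical, and the ratios are planning estimates rather than guarantees.

```python
# Ratios taken from the text, expressed per terabyte of physical data disk space.
EXPECTED_METADATA_GB_PER_TB = 10 / 10   # about 10 GB of metadata per 10 TB of data
PLANNED_METADATA_GB_PER_TB = 20 / 10    # allot 20 GB of metadata space per 10 TB of data


def expected_metadata_gb(data_tb: float) -> float:
    """Estimate how much metadata a warehouse of data_tb terabytes produces."""
    if data_tb < 0:
        raise ValueError("data_tb must not be negative")
    return data_tb * EXPECTED_METADATA_GB_PER_TB


def planned_metadata_gb(data_tb: float) -> float:
    """Return the metadata space to allot for data_tb terabytes of data."""
    if data_tb < 0:
        raise ValueError("data_tb must not be negative")
    return data_tb * PLANNED_METADATA_GB_PER_TB


if __name__ == "__main__":
    data_tb = 35.0  # assumed warehouse size, for illustration only
    print("expected metadata (GB):", expected_metadata_gb(data_tb))
    print("metadata space to allot (GB):", planned_metadata_gb(data_tb))
```

Rerun the calculation whenever new data paths are added, because the metadata allotment should grow with the data space.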
The metadata for a table becomes larger when rows in the table are marked as deleted, because the metadata stores the bitmaps that are used to filter out the deleted rows. The space required depends on the number of rows that were deleted and on their distribution within the table.

SPD Server Workspaces

You reserve a space for intermediate calculations and temporary files in statements that are in the body of the spdsserv.parm parameter file. The workspace that you configure in spdsserv.parm is shared by all SPD Server users.
Some users have data needs that might be constrained by using the common intermediate calculation and file space that is reserved for all users. Use the libnames.parm parameter file to create and reserve a workspace that is specifically associated with a single domain and its approved users. Doing so can improve both security and performance. As the size and complexity of a domain increase, so do the benefits of organizing temporary and intermediate tables into their own workspace, defined by the WORKPATH= option.
A workspace is an area on disk that SPD Server uses to store required files when the available memory cannot hold the entire set of calculations. In that case, some utility files are written to disk. Workspaces are important to scalability. Tasks such as large sorts, index creation, parallel group-by operations, and SQL joins can require dedicated workspace to store temporary utility files.
You typically configure a workspace as part of a large striped file system that spans as many disks and I/O channels as possible. Workspace I/O can critically impact the performance behavior of an SPD Server host.
Workspace on disk is typically a RAID 0 configuration or a hardware-redundant RAID design. RAID 0 configurations are risky because they provide no redundancy: if a disk in the RAID 0 set goes down, the workspace becomes unavailable, the system is affected, and any process that was running at the time of the failure is likely to fail as well.
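The risk of a RAID 0 workspace grows with the number of disks in the stripe set, because the failure of any one disk takes the whole set down. The following Python sketch illustrates this with a simplified model that assumes each disk fails independently with the same probability. The function name and the 2% failure rate are hypothetical values chosen for illustration; real disk failures are often correlated.

```python
def raid0_failure_probability(disks: int, per_disk_failure: float) -> float:
    """Probability that a RAID 0 set loses at least one disk.

    Simplified model: every disk fails independently with the same
    probability over the period of interest. Illustrative only.
    """
    if disks < 1:
        raise ValueError("a stripe set needs at least one disk")
    if not 0.0 <= per_disk_failure <= 1.0:
        raise ValueError("per_disk_failure must be between 0 and 1")
    # The set survives only if every disk survives.
    return 1.0 - (1.0 - per_disk_failure) ** disks


if __name__ == "__main__":
    # Assumed 2% chance that a single disk fails during the period.
    for disks in (1, 4, 8, 16):
        risk = raid0_failure_probability(disks, 0.02)
        print(f"{disks:2d} disks: {risk:.1%} chance of losing the workspace")
```

Under these assumptions, striping the workspace across more disks improves I/O throughput but raises the chance of an outage, which is the trade-off to weigh against a hardware-redundant design.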
Last updated: February 3, 2017