Glossary
- block
- a group of observations in a data set. By using
blocks, thread-enabled applications can read, write, and process
observations faster than if the observations were delivered individually.
- compound WHERE expression
- a WHERE expression that contains more than one
operator, as in WHERE X=1 and Y>3. See also WHERE expression.
- controller
- a computer component that manages the interaction
between the computer and a peripheral device such as a disk or a RAID.
For example, a controller manages data I/O between a CPU and a disk
drive. A computer can contain many controllers. A single CPU can command
more than one controller, and a single controller can command multiple
disks.
- CPU-bound application
- an application whose performance is constrained
by the speed at which computations can be performed on the data. Multiple
CPUs and threading technology can alleviate this problem.
- data partition
- a physical file that contains data and that is
part of a collection of physical files that make up the data component
of a SAS Scalable Performance Data Engine data set. See also partition.
- I/O-bound application
- an application whose performance is constrained
by the speed at which data can be delivered for processing. Multiple
CPUs, partitioned I/O, threading technology, RAID (redundant array
of independent disks) technology, or a combination of these can alleviate
this problem.
- light-weight process thread
- a single-threaded subprocess that is created and
controlled independently, usually with operating system calls. Multiple
light-weight process threads can be active at one time on symmetric
multiprocessing (SMP) hardware or in thread-enabled operating systems.
- parallel I/O
- a method of input and output that takes advantage
of multiple CPUs and multiple controllers, with multiple disks per
controller, to read or write data in independent threads.
- parallel processing
- a method of processing that divides a large job
into multiple smaller jobs that can be executed simultaneously on
multiple CPUs. See also threading.
- partition
- part or all of a logical file that spans devices
or directories. In the SPD Engine, a partition is one physical file.
Data files, index files, and metadata files can all be partitioned,
resulting in data partitions, index partitions, and metadata partitions,
respectively. Partitioning a file can improve performance for very
large data sets. See also data partition.
- primary path
- the location in which SPD Engine metadata files
are stored. The other SPD Engine component files (data files and index
files) are stored in separate storage paths in order to take advantage
of the performance boost of multiple CPUs.
- RAID
- See redundant array of independent disks.
- redundancy
- a characteristic of computing systems in which
multiple interchangeable components are provided in order to minimize
the effects of failures, errors, or both. For example, if data is
stored redundantly (in a RAID, for example), then if one disk is lost,
the data is still available on another disk. See also redundant array of independent disks.
- redundant array of independent disks (RAID)
- a type of interleaved storage system that comprises
multiple disks to store large amounts of data inexpensively. RAIDs
can have several levels. For example, a level-0 RAID combines two
or more hard drives into one logical disk drive. Various RAID levels
provide different amounts of redundancy and storage capability. Also,
because the same data is stored in different places, I/O operations
can overlap, which can result in improved performance. See also redundancy.
- SAS Scalable Performance Data Engine (SPD Engine)
- a SAS engine that organizes data into a streamlined
file format, enabling rapid delivery of data to applications. See also parallel I/O, parallel processing.
- sasroot
- a representation of the name for the directory
or folder in which SAS is installed at a site or a computer.
- scalability
- the ability of a software application to function
well and with minimal loss of performance, despite changing computing
environments, and despite changes in the volume of computations, users,
or data. Scalable software is able to take full advantage of increases
in computing capability such as those that are provided by the use
of SMP hardware and threaded processing. See also scalable software.
- scalable software
- software that responds to increased computing
capability on SMP hardware in the expected way. For example, if the
number of CPUs is increased, the time to solution for a CPU-bound
problem decreases by a proportionate amount. Similarly, if the throughput
of the I/O system is increased, the time to solution for an I/O-bound
problem decreases by a proportionate amount.
- server scalability
- the ability of a server to take advantage of SMP
hardware and threaded processing in order to process multiple client
requests simultaneously. That is, the increase in computing capacity
that SMP hardware provides increases proportionately the number of
transactions that can be processed per unit of time. See also symmetric multiprocessing.
- SMP
- See symmetric multiprocessing.
- sort indicator
- an attribute of a data file that indicates whether
a data set is sorted, how it was sorted, and whether the sort was
validated. Specifically, the sort indicator attribute indicates the
following information: 1) the BY variable(s) that were used in the
sort; 2) the character set that was used for the character variables;
3) the collating sequence of character variables that was used; 4)
whether the sort information has been validated. This attribute is
stored in the data file descriptor information. Any SAS procedure
that requires data to be sorted as a part of its process uses the
sort indicator.
- spawn
- to start a process or a process thread such as
a light-weight process thread (LWPT). See also thread.
- SPD Engine
- See SAS Scalable Performance Data Engine.
- SPD Engine data file
- the data component of an SPD Engine data set.
In contrast to SAS data files, SPD Engine data files contain only
data; they do not contain metadata. The SPD Engine does not support
data views. See also SPD Engine data set.
- SPD Engine data set
- a data set created by the SPD Engine that has
up to four component files: one for data, one for metadata, and two
for any indexes. The minimum number of component files is two: data
and metadata. Data is separated from the metadata for SPD Engine file
organization.
- symmetric multiprocessing (SMP)
- a type of hardware and software architecture that
can improve the speed of I/O and processing. An SMP machine has multiple
CPUs and a thread-enabled operating system. An SMP machine is usually
configured with multiple controllers and with multiple disk drives
per controller.
- thread
- the smallest unit of processing that can be scheduled
by an operating system.
- thread-enabled operating system
- an operating system that can coordinate symmetric
access by multiple CPUs to a shared main memory space. This coordinated
access enables threads from the same process to share data very efficiently.
- thread-enabled procedure
- a SAS procedure that supports threaded I/O or
threaded processing.
- threaded I/O
- I/O that is performed by multiple threads in order
to increase its speed. In order for threaded I/O to improve performance
significantly, the application that is performing the I/O must be
capable of processing the data rapidly as well. See also I/O-bound application, thread.
- threaded processing
- processing that is performed in multiple threads
in order to improve the speed of CPU-bound applications. See also CPU-bound application, symmetric multiprocessing.
- threading
- a high-performance technology for either data
processing or data I/O in which a task is divided into threads that
are executed concurrently on multiple cores on one or more CPUs.
- time to solution
- the elapsed time that is required for completing
a task. Time-to-solution measurements are used to compare the performance
of software applications in different computing environments. In other
words, they can be used to measure scalability. See also scalability.
- WHERE expression
- a syntax string within a WHERE clause that
defines the criteria for selecting observations. For example, in a
membership database, the expression "WHERE member_type='Senior'"
returns all senior members. See also compound WHERE expression.
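The proportional relationship described under scalable software and time to solution can be illustrated with a little arithmetic. The following is a minimal Python sketch, not part of SAS or the SPD Engine. The function names and the sample timings are hypothetical, and it assumes the ideal case in which time to solution shrinks in exact proportion to the added CPUs or I/O throughput.

```python
def ideal_time_to_solution(baseline_seconds: float,
                           baseline_units: int,
                           new_units: int) -> float:
    """Return the time to solution expected under ideal proportional scaling.

    "Units" stands for whatever resource bounds the problem: CPUs for a
    CPU-bound problem, or I/O throughput for an I/O-bound problem.
    """
    if baseline_units <= 0 or new_units <= 0:
        raise ValueError("unit counts must be positive")
    return baseline_seconds * baseline_units / new_units


def scaling_efficiency(baseline_seconds: float,
                       baseline_units: int,
                       measured_seconds: float,
                       new_units: int) -> float:
    """Return the ratio of ideal to measured time to solution (1.0 is ideal)."""
    if measured_seconds <= 0:
        raise ValueError("measured time must be positive")
    ideal = ideal_time_to_solution(baseline_seconds, baseline_units, new_units)
    return ideal / measured_seconds


# Hypothetical measurements: 120 seconds on 2 CPUs, then 40 seconds on 8 CPUs.
ideal = ideal_time_to_solution(120.0, 2, 8)
efficiency = scaling_efficiency(120.0, 2, 40.0, 8)
print(f"ideal: {ideal} s, efficiency: {efficiency}")
```

An efficiency close to 1.0 would indicate that the software responds to the added capacity in the expected way. Lower values indicate that some other resource has become the constraint.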
Copyright © SAS Institute Inc. All Rights Reserved.